Re: [basex-talk] large number of xml files

2012-08-21 Thread Michael Seiferle
Hi Sateesh, 

I saw that you sent Dirk an XQuery file; is it the same one that takes that much
memory?
If so, we will see if we can help with that :)
Kind Regards
Michael
On 20.08.2012 at 14:34, sateesh sate...@intense.in wrote:

 Hi Michael,
 
 I created the collection of 2k XMLs as per your previous mail and tried
 executing the query. Even after creating the collection, the memory
 consumption is still high (700 MB of heap memory) and processing takes
 about 3 minutes.
 
 Thanks & Regards
 Sateesh.A
 
 -Original Message-
 From: Michael Seiferle [mailto:m...@basex.org] 
 Sent: Monday, August 20, 2012 2:33 PM
 To: sateesh
 Cc: basex-talk@mailman.uni-konstanz.de
 Subject: Re: [basex-talk] large number of xml files
 
 Sateesh, 
 
 sorry I totally overlooked your last email.
 I'll reply inline:
 On 18.08.2012 at 08:58, sateesh sate...@intense.in wrote:
 
 
 Hi Michael,
 
 I have tried to implement your suggested changes, but I got stuck because
 the 10k XMLs I have to query come from different folders. One more question:
 how do I create collections programmatically before running the query?
 XQuery at the moment has no way to create a collection on the fly, so you
 would have to use our Java API [1] or the command-line API [2].
 
 To create a collection from different folders you would do the following:
 
 CREATE DB myDB path/to/files
  ... creates the database myDB with all documents found in the input
 directory.
 
 ADD TO target/ xmldir
  ... adds all files from the xmldir directory to the database under the
 target/ path.
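 
 For reference, a rough Java sketch of the same steps using the Java API from
 [1]; the database name and paths are placeholders, and the exact CreateDB/Add
 constructor signatures may differ slightly between BaseX versions:
 
 import org.basex.core.Context;
 import org.basex.core.cmd.Add;
 import org.basex.core.cmd.CreateDB;
 import org.basex.core.cmd.XQuery;
 
 public final class CreateCollectionFromFolders {
   public static void main(final String[] args) throws Exception {
     final Context ctx = new Context();
     try {
       // CREATE DB myDB path/to/files
       new CreateDB("myDB", "path/to/files").execute(ctx);
       // ADD TO target/ xmldir -- adds a second folder under the target/ path
       new Add("target/", "xmldir").execute(ctx);
       // Query the whole collection once instead of opening files one by one.
       System.out.println(new XQuery("count(collection('myDB'))").execute(ctx));
     } finally {
       ctx.close();
     }
   }
 }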
 
 
 I hope this helps :-)
 
 Kind Regards
 Michael
 
 
 Thanks & Regards
 Sateesh.A
 
 
 [1] https://github.com/BaseXdb/basex-examples/blob/master/src/main/java/org/basex/examples/query/CreateCollection.java
 [2] http://docs.basex.org/wiki/Commands
 
 
 



Re: [basex-talk] large number of xml files

2012-08-21 Thread sateesh
Hi Michael,

What I sent Dirk was for a separate issue (grouping of records), and I am
facing the memory issue there as well. In our case of querying the 10k XMLs,
even after creating the collections it is still taking huge memory, as
mentioned in my previous mail.

Waiting for your suggestions; it would really help me close the issue, as I am
at a crucial stage of the project.

Thanks & Regards
Sateesh.A



Re: [basex-talk] Basex editing performance test

2012-08-21 Thread Christian Grün
Hi Yoann,

my initial assumption would be that the culprit for the performance
drop is the file system in use (NTFS? ext3?). If 80,000 databases are
created, your db directory will contain 80,000 directories, which is
quite a lot for typical file systems. Some alternatives (e.g. XFS, maybe
ReiserFS, or ReFS in Windows 8) may give you better results here.

Another, more general, approach is to cluster your databases and find
a good tradeoff between the number and the size of your databases. As
your results have already shown, there's hardly any difference if 1 or
1,000 databases are created - but it will hardly be possible to get
satisfying results with 1M databases.
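
A minimal Java sketch of one way to do such clustering (not from the original
reply; the pool size, database names, and the updated node path are
hypothetical placeholders):

import org.basex.core.Context;
import org.basex.core.cmd.Open;
import org.basex.core.cmd.XQuery;

public final class DbPool {
  // Size of the database pool; tune this to trade off count vs. size.
  static final int POOL_SIZE = 1000;

  // Map a user or document key to one of the pooled databases.
  static String dbFor(final String key) {
    return "db" + Math.floorMod(key.hashCode(), POOL_SIZE);
  }

  public static void main(final String[] args) throws Exception {
    final Context ctx = new Context();
    try {
      new Open(dbFor("user42")).execute(ctx);
      // Hypothetical update of one node's value in the selected database.
      new XQuery("replace value of node /app/user[@id = 'user42']/name with 'Alice'")
          .execute(ctx);
    } finally {
      ctx.close();
    }
  }
}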

Hope this helps,
Christian
___

 Continuing on my recent question about using multiple databases, we've been
 running some performance tests on BaseX.
 I don't have the details of the computer that hosted these tests, but they
 were all done on the same machine.
 The test was to edit a node's value from a PHP script a million times, with
 the database chosen randomly from the X databases created (this simulates
 multiple users accessing our app).
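
 A rough Java equivalent of the timing loop (the actual test used a PHP script
 that is not included here; database names, the edited node path, and counts
 are placeholders):

 import java.util.Random;
 import org.basex.core.Context;
 import org.basex.core.cmd.Open;
 import org.basex.core.cmd.XQuery;

 public final class EditBenchmark {
   public static void main(final String[] args) throws Exception {
     final int databases = 2;            // 2, 10 000, 20 000, 40 000, 80 000
     final int iterations = 1_000_000;
     final Random rnd = new Random();
     final Context ctx = new Context();
     double total = 0, min = Double.MAX_VALUE, max = 0;
     try {
       for (int i = 0; i < iterations; i++) {
         final long start = System.nanoTime();
         // Pick a database at random and edit one node's value in it.
         new Open("db" + rnd.nextInt(databases)).execute(ctx);
         new XQuery("replace value of node /doc/value with '" + i + "'").execute(ctx);
         final double ms = (System.nanoTime() - start) / 1e6;
         total += ms;
         min = Math.min(min, ms);
         max = Math.max(max, ms);
       }
     } finally {
       ctx.close();
     }
     System.out.printf("Total: %.2f s, min: %.2f ms, max: %.2f ms, mean: %.2f ms%n",
         total / 1000, min, max, total / iterations);
   }
 }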

 2 DB
 Total time : 1517.95 seconds for 1 000 000 iterations with 2 DB
 Min time : 1.17 ms with database 1
 Max time : 704.72 ms with database 0
 Mean time : 1.52 ms

 10 000 DB
 Total time : 1515.75 seconds for 1 000 000 iterations with 10 000 DB
 Min time : 1.21 ms with database 8879
 Max time : 645.16 ms with database 3822
 Mean time : 1.52 ms

 20 000 DB
 Total time : 1680.29 seconds for 1 000 000 iterations with 20 000 DB
 Min time : 1.18 ms with database 3749
 Max time : 285.49 ms with database 6518
 Mean time : 1.68 ms

 40 000 DB
 Total time : 1813.53 seconds for 1 000 000 iterations with 40 000 DB
 Min time : 1.04 ms with database 786
 Max time : 212.2 ms with database 6949
 Mean time : 1.81 ms

 80 000 db - test 1
 Total time : 24728.94 seconds for 1 000 000 iterations with 80 000 DB
 Min time : 1.16 ms with database 25693
 Max time : 2433.44 ms with database 22021
 Mean time : 24.73 ms

 80 000 db - test 2
 Total time : 18661.74 seconds for 1 000 000 iterations with 80 000 DB
 Min time : 1.68 ms with database 5979
 Max time : 1936.4 ms with database 30239
 Mean time : 18.66 ms

 We can see that there is a significant difference from 40k to 80k databases.
 We haven't checked other averaging methods to see whether this was due to a
 few slow edit actions. Has anyone tried to have many databases and reached a
 certain limit at some point? In order to do server sizing, what is key for
 these actions: processor? RAM?

 Thanks for your help!

 --
 Yoann Maingon
 mydatalinx
 0664324966
___
BaseX-Talk mailing list
BaseX-Talk@mailman.uni-konstanz.de
https://mailman.uni-konstanz.de/mailman/listinfo/basex-talk