Dear Shahin,

thanks for your email. Your queries use a range comparison (<) rather than an equality test, so the existing indexes cannot be used to speed them up. Instead, your data is scanned sequentially, which explains why your query times increase linearly with the size of the database. A future version of BaseX will include a range index [1], possibly sponsored by some of our users; if you are interested in participating, your feedback is welcome.
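To make the difference concrete, here is a minimal Python sketch contrasting a sequential scan with a toy range index. It is only an illustration of the general idea (values kept sorted, cut-off found by binary search); it is not how BaseX stores or plans anything, and the names scan_less_than, RangeIndex and the sample orbit numbers are made up for this example.

```python
from bisect import bisect_left


def scan_less_than(values, bound):
    """Sequential scan: every value is inspected, so cost grows with len(values)."""
    return [pos for pos, value in enumerate(values) if value < bound]


class RangeIndex:
    """Toy range index (hypothetical): (value, position) pairs sorted by value."""

    def __init__(self, values):
        self._entries = sorted((value, pos) for pos, value in enumerate(values))
        self._keys = [value for value, _ in self._entries]

    def less_than(self, bound):
        # Binary search locates the cut-off in O(log n) steps;
        # only the matching entries are touched afterwards.
        cut = bisect_left(self._keys, bound)
        return sorted(pos for _, pos in self._entries[:cut])


# Made-up sample data standing in for orbit_number values.
orbit_numbers = [512, 150, 903, 42, 199, 200, 7]
index = RangeIndex(orbit_numbers)

# Both strategies return the same positions; only the amount of work differs.
print(scan_less_than(orbit_numbers, 200))
print(index.less_than(200))
```

With a structure of this kind, the cost of a query such as orbit_number < 200 depends mostly on the number of hits rather than on the total number of values, which is why query times would no longer need to grow linearly with the database.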
Best regards,
Christian

[1] https://github.com/BaseXdb/basex/issues/236
___________________________

On Wed, Jan 25, 2012 at 12:43 AM, Shahin Roboubi <srobo...@mdacorporation.com> wrote:
> I’m trying to see if I can use baseX for a project we have. We need to store
> a large number of small documents (about 5,000,000 where each document is 1
> to 10K). I had some performance issues and searched the mailing list and
> found some answers like this:
>
> https://mailman.uni-konstanz.de/pipermail/basex-talk/2012-January/002478.html
>
> This suggests I should be able to get good performance (query times that are
> around ~100ms or so). I’m running this on a linux server with fast disks and
> 24 GB of RAM (4 GB for JVM). By the way, I’m doing the queries through the
> baseX GUI… not sure if that makes any difference.
>
> I created 3 test databases, small, medium and large. The results are shown
> below. All databases have full text search disabled (because I don’t need
> it) and “Path Summary”, “Text Index”, “Attribute index” enabled. It seems
> like the indexes are not doing anything or just not working, because the
> query times are going up linearly (up to 5 seconds for the large database!!)
> with the size of the database… can someone explain what is happening/why,
> and how I can fix it?
>
> Thanks a lot,
>
> Shahin Roboubi
> Software Engineer
> MDA
>
> Embedded Attachment:
>
> -----------------------------------------------------------------------------------
>
> Database Properties
> Name: radarsat2small
> Size: 97 MB
> Nodes: 3891930
> Resources: 92665
> Timestamp: 05.01.2012 15:07:37
>
> Query: /metadata/Radarsat2Signal[Acquisition[orbit_number<200]]
> Compiling:
> - adding text() step
> - rewriting orbit_number/text() < 200
> Result: root()/metadata/Radarsat2Signal[Acquisition[orbit_number/text() < 200.0]]
> Timing:
> - Parsing: 0.25 ms
> - Compiling: 0.37 ms
> - Evaluating: 530.21 ms
> - Printing: 5.09 ms
> - Total Time: 535.94 ms
> Result:
> - Results: 165 Items
> - Updated: 0 Items
> - Printed: 145 KB
> Query plan:
> <IterPath>
>   <Root/>
>   <IterStep axis="child" test="metadata"/>
>   <IterStep axis="child" test="Radarsat2Signal">
>     <AxisPath>
>       <IterStep axis="child" test="Acquisition">
>         <CmpR min="-INF" max="200">
>           <AxisPath>
>             <IterStep axis="child" test="orbit_number"/>
>             <IterStep axis="child" test="text()"/>
>           </AxisPath>
>         </CmpR>
>       </IterStep>
>     </AxisPath>
>   </IterStep>
> </IterPath>
>
> -----------------------------------------------------------------------------------
>
> Database Properties
> Name: radarsat2medium
> Size: 194 MB
> Nodes: 7777056
> Resources: 185168
> Timestamp: 05.01.2012 15:26:07
>
> Query: /metadata/Radarsat2Signal[Acquisition[orbit_number<100]]
> Compiling:
> - adding text() step
> - rewriting orbit_number/text() < 100
> Result: root()/metadata/Radarsat2Signal[Acquisition[orbit_number/text() < 100.0]]
> Timing:
> - Parsing: 0.25 ms
> - Compiling: 0.56 ms
> - Evaluating: 1079.27 ms
> - Printing: 5.99 ms
> - Total Time: 1086.08 ms
> Result:
> - Results: 185 Items
> - Updated: 0 Items
> - Printed: 163 KB
> Query plan:
> <IterPath>
>   <Root/>
>   <IterStep axis="child" test="metadata"/>
>   <IterStep axis="child" test="Radarsat2Signal">
>     <AxisPath>
>       <IterStep axis="child" test="Acquisition">
>         <CmpR min="-INF" max="100">
>           <AxisPath>
>             <IterStep axis="child" test="orbit_number"/>
>             <IterStep axis="child" test="text()"/>
>           </AxisPath>
>         </CmpR>
>       </IterStep>
>     </AxisPath>
>   </IterStep>
> </IterPath>
>
> -----------------------------------------------------------------------------------
>
> Database Properties
> Name: radarsat2large
> Size: 873 MB
> Nodes: 34999986
> Resources: 833333
> Timestamp: 05.01.2012 16:32:29
>
> Query: /metadata/Radarsat2Signal[Acquisition[orbit_number<20]]
> Compiling:
> - adding text() step
> - rewriting orbit_number/text() < 20
> Result: root()/metadata/Radarsat2Signal[Acquisition[orbit_number/text() < 20.0]]
> Timing:
> - Parsing: 0.28 ms
> - Compiling: 2.16 ms
> - Evaluating: 5296.87 ms
> - Printing: 5.71 ms
> - Total Time: 5305.04 ms
> Result:
> - Results: 174 Items
> - Updated: 0 Items
> - Printed: 153 KB
> Query plan:
> <IterPath>
>   <Root/>
>   <IterStep axis="child" test="metadata"/>
>   <IterStep axis="child" test="Radarsat2Signal">
>     <AxisPath>
>       <IterStep axis="child" test="Acquisition">
>         <CmpR min="-INF" max="20">
>           <AxisPath>
>             <IterStep axis="child" test="orbit_number"/>
>             <IterStep axis="child" test="text()"/>
>           </AxisPath>
>         </CmpR>
>       </IterStep>
>     </AxisPath>
>   </IterStep>
> </IterPath>
>
> _______________________________________________
> BaseX-Talk mailing list
> BaseX-Talk@mailman.uni-konstanz.de
> https://mailman.uni-konstanz.de/mailman/listinfo/basex-talk