Dear Shahin,

thanks for your email. As your queries include a range comparison (<)
rather than an equality test, the existing indexes won't speed them up.
Instead, your data is scanned sequentially, which explains the linear
increase of your query times. A future version of BaseX will include a
range index [1] -- possibly sponsored by some of our users, so if you
are interested in participating, your feedback is welcome.
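
Until then, one possible workaround is to turn the range into a set of
equality tests on strings, which is the kind of comparison the text
index is built for. The following is only a sketch under assumptions I
cannot check from here: it presumes that orbit numbers are stored as
plain positive integers (no leading zeros, whitespace or decimals), and
the variable name $orbits is made up for illustration. Whether the
optimizer really chooses the index depends on your version, so please
compare the resulting query plan with the ones you sent.

```xquery
(: Assumption: every orbit_number is a plain positive integer such as "17". :)
(: Build the strings "1" .. "199", i.e. the values matched by orbit_number < 200. :)
let $orbits := for $n in 1 to 199 return string($n)
(: String equality against a sequence: true if the text equals any of the items. :)
return /metadata/Radarsat2Signal[Acquisition[orbit_number/text() = $orbits]]
```

If the values are not plain integers, this query will silently miss
results, so the original comparison remains the safe choice.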

Best regards,
Christian

[1] https://github.com/BaseXdb/basex/issues/236
___________________________

On Wed, Jan 25, 2012 at 12:43 AM, Shahin Roboubi
<srobo...@mdacorporation.com> wrote:
> I’m trying to see if I can use BaseX for a project we have. We need to store
> a large number of small documents (about 5,000,000 where each document is 1
> to 10K). I had some performance issues and searched the mailing list and
> found some answers like this:
>
>
>
> https://mailman.uni-konstanz.de/pipermail/basex-talk/2012-January/002478.html
>
>
>
> This suggests I should be able to get good performance (query times that are
> around 100 ms or so). I’m running this on a Linux server with fast disks and
> 24 GB of RAM (4 GB for JVM). By the way, I’m doing the queries through the
> BaseX GUI… not sure if that makes any difference.
>
> I created 3 test databases, small, medium and large. The results are shown
> below. All databases have full text search disabled (because I don’t need
> it) and “Path Summary”, “Text Index”, “Attribute index” enabled. It seems
> like the indexes are not doing anything or just not working, because the
> query times are going up linearly (up to 5 seconds for the large database!!)
> with the size of the database… can someone explain what is happening/why,
> and how I can fix it?
>
>
>
> Thanks a lot,
>
> Shahin Roboubi
> Software Engineer
> MDA
>
> Embedded Attachment:
>
>
>
> -----------------------------------------------------------------------------------
>
>
>
> Database Properties
>
> Name: radarsat2small
>
> Size: 97 MB
>
> Nodes: 3891930
>
> Resources: 92665
>
> Timestamp: 05.01.2012 15:07:37
>
>
>
> Query: /metadata/Radarsat2Signal[Acquisition[orbit_number<200]]
>
> Compiling:
>
> - adding text() step
>
> - rewriting orbit_number/text() < 200
>
> Result: root()/metadata/Radarsat2Signal[Acquisition[orbit_number/text() <
> 200.0]]
>
> Timing:
>
> - Parsing:  0.25 ms
>
> - Compiling:  0.37 ms
>
> - Evaluating:  530.21 ms
>
> - Printing:  5.09 ms
>
> - Total Time:  535.94 ms
>
> Result:
>
> - Results: 165 Items
>
> - Updated: 0 Items
>
> - Printed: 145 KB
>
> Query plan:
>
> <IterPath>
>
>   <Root/>
>
>   <IterStep axis="child" test="metadata"/>
>
>   <IterStep axis="child" test="Radarsat2Signal">
>
>     <AxisPath>
>
>       <IterStep axis="child" test="Acquisition">
>
>         <CmpR min="-INF" max="200">
>
>           <AxisPath>
>
>             <IterStep axis="child" test="orbit_number"/>
>
>             <IterStep axis="child" test="text()"/>
>
>           </AxisPath>
>
>         </CmpR>
>
>       </IterStep>
>
>     </AxisPath>
>
>   </IterStep>
>
> </IterPath>
>
>
>
> -----------------------------------------------------------------------------------
>
>
>
> Database Properties
>
> Name: radarsat2medium
>
> Size: 194 MB
>
> Nodes: 7777056
>
> Resources: 185168
>
> Timestamp: 05.01.2012 15:26:07
>
>
>
> Query: /metadata/Radarsat2Signal[Acquisition[orbit_number<100]]
>
> Compiling:
>
> - adding text() step
>
> - rewriting orbit_number/text() < 100
>
> Result: root()/metadata/Radarsat2Signal[Acquisition[orbit_number/text() <
> 100.0]]
>
> Timing:
>
> - Parsing:  0.25 ms
>
> - Compiling:  0.56 ms
>
> - Evaluating:  1079.27 ms
>
> - Printing:  5.99 ms
>
> - Total Time:  1086.08 ms
>
> Result:
>
> - Results: 185 Items
>
> - Updated: 0 Items
>
> - Printed: 163 KB
>
> Query plan:
>
> <IterPath>
>
>   <Root/>
>
>   <IterStep axis="child" test="metadata"/>
>
>   <IterStep axis="child" test="Radarsat2Signal">
>
>     <AxisPath>
>
>       <IterStep axis="child" test="Acquisition">
>
>         <CmpR min="-INF" max="100">
>
>           <AxisPath>
>
>             <IterStep axis="child" test="orbit_number"/>
>
>             <IterStep axis="child" test="text()"/>
>
>           </AxisPath>
>
>         </CmpR>
>
>       </IterStep>
>
>     </AxisPath>
>
>   </IterStep>
>
> </IterPath>
>
>
>
> -----------------------------------------------------------------------------------
>
>
>
> Database Properties
>
> Name: radarsat2large
>
> Size: 873 MB
>
> Nodes: 34999986
>
> Resources: 833333
>
> Timestamp: 05.01.2012 16:32:29
>
>
>
> Query: /metadata/Radarsat2Signal[Acquisition[orbit_number<20]]
>
> Compiling:
>
> - adding text() step
>
> - rewriting orbit_number/text() < 20
>
> Result: root()/metadata/Radarsat2Signal[Acquisition[orbit_number/text() <
> 20.0]]
>
> Timing:
>
> - Parsing:  0.28 ms
>
> - Compiling:  2.16 ms
>
> - Evaluating:  5296.87 ms
>
> - Printing:  5.71 ms
>
> - Total Time:  5305.04 ms
>
> Result:
>
> - Results: 174 Items
>
> - Updated: 0 Items
>
> - Printed: 153 KB
>
> Query plan:
>
> <IterPath>
>
>   <Root/>
>
>   <IterStep axis="child" test="metadata"/>
>
>   <IterStep axis="child" test="Radarsat2Signal">
>
>     <AxisPath>
>
>       <IterStep axis="child" test="Acquisition">
>
>         <CmpR min="-INF" max="20">
>
>           <AxisPath>
>
>             <IterStep axis="child" test="orbit_number"/>
>
>             <IterStep axis="child" test="text()"/>
>
>           </AxisPath>
>
>         </CmpR>
>
>       </IterStep>
>
>     </AxisPath>
>
>   </IterStep>
>
> </IterPath>
>
>
> _______________________________________________
> BaseX-Talk mailing list
> BaseX-Talk@mailman.uni-konstanz.de
> https://mailman.uni-konstanz.de/mailman/listinfo/basex-talk
>