I have 2 XML docs, each about 1 GB with about 2 million fragments ("rows")
each; in fact the elements are called "row".

Each "row" element is about 500 bytes, but I don't yet have a better
way to fragment them.

(Yes, it's been suggested to split these into separate docs, and I may
experiment with that.)

 

Here's a case where I've found MarkLogic (ML) refuses to optimize XPath
expressions.

 

First off, this expression takes about 5 seconds, which I find a little
slow; it returns 8 rows.

 

 

declare variable $id := '2483417'; 

for $r in doc("/RxNorm/rxnsat.xml")/rxnsat/row[RXAUI eq $id]

return $r

 

 

Now, to complicate things, I actually need $id from a previous query, so
the real query looks like this:

 

 

declare variable $id := '2483417'; 

declare variable $c :=
    doc("/RxNorm/rxnconso.xml")/rxnconso/row[RXAUI eq $id];

declare variable $id2 as xs:string := $c/RXAUI/string();

 

for $r in doc("/RxNorm/rxnsat.xml")/rxnsat/row[RXAUI eq $id2]

return $r

 

This takes about 1 minute! Checking the profile, I find the
expression row[RXAUI eq $id] is evaluated a million times,
indicating it's not using the index.
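
To compare what the optimizer makes of the two forms without actually
fetching the rows, both paths can be passed to xdmp:plan, which returns
an XML description of the search each path would be compiled into. This
is only a sketch, assuming the same document URIs and variables as in my
queries; the wrapper element names (plans, literal, derived) are made up
just to label the output.

```xquery
xquery version "1.0-ml";

declare variable $id := '2483417';
declare variable $c :=
    doc("/RxNorm/rxnconso.xml")/rxnconso/row[RXAUI eq $id];
declare variable $id2 as xs:string := $c/RXAUI/string();

(: <plans>, <literal> and <derived> are hypothetical wrapper names,
   used only to tell the two plans apart in the result :)
<plans>
  <literal>{
    (: the fast form: predicate compares against the literal-valued $id :)
    xdmp:plan(doc("/RxNorm/rxnsat.xml")/rxnsat/row[RXAUI eq $id])
  }</literal>
  <derived>{
    (: the slow form: predicate compares against the derived $id2 :)
    xdmp:plan(doc("/RxNorm/rxnsat.xml")/rxnsat/row[RXAUI eq $id2])
  }</derived>
</plans>
```

If the second plan lacks the RXAUI term that the first one contains, that
would confirm the predicate is being left to per-row evaluation rather
than resolved from the index.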

 

I've tried all sorts of variations on this, such as:

 

doc("/RxNorm/rxnsat.xml")/rxnsat/row[xs:string(RXAUI) eq $id2]

doc("/RxNorm/rxnsat.xml")/rxnsat/row[RXAUI eq $c/RXAUI]

doc("/RxNorm/rxnsat.xml")/rxnsat/row/RXAUI[. eq $id2]/ancestor::row

 

 

All to no avail: no indexing!

 

But of course this brings things back up to speed:

 

---------

for $r in cts:search(doc("/RxNorm/rxnsat.xml")/rxnsat/row, 

cts:element-query( xs:QName("RXAUI") , $id2 )) 

return $r

 

------------

 

 

It still takes too long (about 5 sec), but it's back to near real time at least.
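
Since RXAUI holds a whole identifier rather than free text, a value query
compares the entire element content and so may be a closer match to what
"eq" does than cts:element-query. This is a sketch I have not timed,
assuming the same documents and variables as above; whether it is any
faster on this data is an open question.

```xquery
xquery version "1.0-ml";

declare variable $id := '2483417';
declare variable $c :=
    doc("/RxNorm/rxnconso.xml")/rxnconso/row[RXAUI eq $id];
declare variable $id2 as xs:string := $c/RXAUI/string();

(: Match rows whose RXAUI element content equals $id2.
   "exact" asks for a case-, diacritic-, punctuation- and
   whitespace-sensitive comparison. :)
for $r in cts:search(
    doc("/RxNorm/rxnsat.xml")/rxnsat/row,
    cts:element-value-query(xs:QName("RXAUI"), $id2, "exact"))
return $r
```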

 

I'm experimenting now with fields ... 

 

But I find it strange that I can't get the XPath expression to use the
indexes in one case when it does in another that seems almost identical
to me.

 

This expression

declare variable $id2 as xs:string := $c/RXAUI/string();

 

should tell the system that $id2 is a single string, so why won't it use
it in XPath-based index queries?

 

 

 

 

----------------------------------------

David A. Lee

Senior Principal Software Engineer

Epocrates, Inc.

[email protected] <mailto:[email protected]> 

812-482-5224

 

 
