Re: [basex-talk] improving query performance

2020-08-23 Thread Liam R. E. Quin
On Sun, 2020-08-23 at 14:05 -0700, Bill Osmond wrote:
> Indeed I have, with no positive results unfortunately. I'm now
> testing to
> see if having multiple return statements (as in Liam's queries)
> helps,
> although the results so far are basically the same.

I tried to make clear what was going on, using explit returns, The
syntax of a FLWOR expression allows either form:
  for $sock in /drawer/socks, $shoe in /tray/shoes
is the same as
  for $sock in /drawer/socks
for $shoe in /tray/shoes
and constructs every possible (sock, shoe) pair.

However,
  for $sock in /drawer/socks
  return
for $shoe in /tray/shoes
return
   some_expression($sock, $shoe)
maybe makes clearer that some_exoression() will be called
count(/drawer/socks) * count(/tray/shoes) times.

the way to speed this up is likely to construct many fewer tuples,
using grouping or windowing to process the inner part of your query.

Liam

-- 
Liam Quin, https://www.delightfulcomputing.com/
Available for XML/Document/Information Architecture/XSLT/
XSL/XQuery/Web/Text Processing/A11Y training, work & consulting.
Barefoot Web-slave, antique illustrations:  http://www.fromoldbooks.org



Re: [basex-talk] improving query performance

2020-08-23 Thread Bill Osmond
Indeed I have, with no positive results unfortunately. I'm now testing to
see if having multiple return statements (as in Liam's queries) helps,
although the results so far are basically the same.

On Sat, Aug 22, 2020 at 9:59 AM Christian Grün 
wrote:

> Yes, I see now why my query returns much more hits than yours
> (including the first). As Liam already expressed, it’s not really a
> nested query what you are wanting to achieve.
>
> Oe thing you can always try is to change the order of your for clauses
> and see what happens. Maybe you did that already? In all cases, no
> index will be applied at the moments with these kinds of query
> patterns. I might get back to you later or tomorrow once I have
> another idea what could be done.
>
>
> On Sat, Aug 22, 2020 at 6:42 PM Bill Osmond  wrote:
> >
> > This is vexing - it seems as though the mechanism that provides the
> necessary "filtering" is the very thing that slows the execution down so
> much. This wouldn't have been obvious from the single example document I
> sent earlier, but each document stands alone: all of the searching and
> reference linking done for each TrackRelease in a NewReleaseMessage should
> only refer to other nodes in that same NewReleaseMessage.
> >
> > In my query, I started out with "for $r in /ernm:NewReleaseMessage" and
> I used $r on the right hand side of the subsequent for statements. It seems
> like without that, the execution is quick, but all the results from every
> document are getting matched to each other. With it, the results are
> correct, but the execution time shoots way up. In case any of you still
> have any patience for this question (and thanks again for everything so
> far!), I've attached a small sample set of 6 documents. The desired number
> of results from the query is 70 (which is the number of TrackReleases from
> all the documents combined), and the query that I've adapted from
> Christian's ddex2.xq which returns the right number of results is the
> following:
> >
> > declare namespace ernm = 'http://ddex.net/xml/ern/411';
> > (: declare context item := db:open('ddex'); :)
> >
> > for $r in /ernm:NewReleaseMessage
> >
> > for $party in $r/PartyList/Party[
> >   PartyReference/text() =
> >   $r/ReleaseList/TrackRelease/ReleaseLabelReference
> > ]
> > for $track_release in $r/ReleaseList/TrackRelease[
> >   ReleaseLabelReference/text() =
> >   $r/PartyList/Party/PartyReference
> > ]
> > for $sound_recording in $r/ResourceList/SoundRecording[
> >   ResourceReference/text() =
> >   $track_release/ReleaseResourceReference
> > ]
> > for $release in $r/ReleaseList/Release[
> >
>  
> ResourceGroup/ResourceGroup/ResourceGroupContentItem/ReleaseResourceReference/text()
> =
> >   $track_release/ReleaseResourceReference
> > ]
> > return 
> >   { $track_release/ReleaseId/ISRC/text() }
> >   { fn:string-join($sound_recording/DisplayArtistName, '/')
> }
> >   { $sound_recording/DisplayTitleText/text() }
> >   { $release/DisplayTitleText/text() }
> >   { $release/ReleaseId/ICPN/text() }
> >   { $party/PartyName/FullName/text() }
> > 
>


-- 



*William Osmond*
Manager, System Operations & IT

3760 Motor Ave, Los Angeles, CA 90034
310-853-8737 | www.with.in