> But as a side note: Do you have any insight in how expensive child, parent,
> ancestor and descendant operations are? Should one avoid those if possible?

What you ask is really entirely processor and system configuration dependent.
I suggest asking on a mailing list specific to that processor/vendor.
(Who knows, I might see you on that list too!)

----------------------------------------
David A. Lee
[email protected]
http://www.xmlsh.org

> -----Original Message-----
> From: Robby Pelssers [mailto:[email protected]]
> Sent: Monday, August 13, 2012 4:43 AM
> To: David Lee; Adam Retter
> Cc: [email protected]
> Subject: RE: [xquery-talk] how to optimaly denormalize 1-to-many
> relationships with XQuery
>
> In fact that query returns results under 1 second over a collection of 20k
> documents. The documents can be 50k - 100k in size but the data I need to
> extract is not deeply nested in the document hierarchy.
>
> So performance is not (yet) a real focus point but it was more of a 'Can I
> improve on this or learn a new trick' question ;-)
>
> But as a side note: Do you have any insight in how expensive child, parent,
> ancestor and descendant operations are? Should one avoid those if possible?
>
> Robby
>
> -----Original Message-----
> From: David Lee [mailto:[email protected]]
> Sent: Sunday, August 12, 2012 9:23 PM
> To: Adam Retter
> Cc: Robby Pelssers; [email protected]
> Subject: Re: [xquery-talk] how to optimaly denormalize 1-to-many
> relationships with XQuery
>
> I wouldn't necessarily be worried about iteration efficiency until you try it.
> Often iteration is cheap compared to element creation. It all depends on the
> processor, data, and other issues.
> Remember "iteration" doesn't necessarily imply temporal iteration, unlike in
> procedural languages.
>
>
> Sent from my iPad (excuse the terseness)
> David A Lee
> [email protected]
>
>
> On Aug 12, 2012, at 7:02 AM, "Adam Retter" <[email protected]>
> wrote:
>
> > I don't think you can do this in one pass.
> > However, depending on the
> > implementation it's impossible to know how many passes the processor
> > will actually make over the source to fulfil the query.
> >
> > However, if we were to assume each FLWOR expression is a pass over the
> > source data, then I think the following implementation could be more
> > efficient. It really depends on how the implementation handles
> > descendant-or-self and ancestor selection.
> >
> >
> > let $basicTypes := /BasicType
> > return
> > <Result>
> >   <BasicTypes>
> >   {
> >     $basicTypes
> >   }
> >   </BasicTypes>
> >   <ProductTypes>
> >   {
> >     for $productType in $basicTypes//ProductType return
> >       <ProductType>
> >       {
> >         $productType/*,
> >         <BasicType ref-id="{$productType/ancestor::BasicType/@id}"/>
> >       }
> >       </ProductType>
> >   }
> >   </ProductTypes>
> >   <SalesItems>
> >   {
> >     for $salesItem in $basicTypes//SalesItem return
> >       <SalesItem>
> >       {
> >         $salesItem/*,
> >         <ProductType ref-id="{$salesItem/ancestor::ProductType/@id}"/>
> >       }
> >       </SalesItem>
> >   }
> >   </SalesItems>
> > </Result>
> >
> >
> >
> > On 10 August 2012 15:32, Robby Pelssers <[email protected]> wrote:
> >> Hi all,
> >>
> >> Suppose I have a collection of XML documents looking like this:
> >>
> >> Basictype has 1 to many Product types.
> >> Producttype has 1 to many Sales items.
> >>
> >> Example snippet:
> >> ---------------------------------
> >> <BasicType id="PH3330L">
> >>   <Status>End of life</Status>
> >>   ...
> >>   <ProductTypes>
> >>     <ProductType id="xxx">
> >>       <Status>Deprecated</Status>
> >>       ...
> >>       <SalesItems>
> >>         <SalesItem id="yyy">
> >>           <Owner>abcde</Owner>
> >>         </SalesItem>
> >>       </SalesItems>
> >>     </ProductType>
> >>   </ProductTypes>
> >> </BasicType>
> >>
> >> Now I want to generate some data looking like this:
> >>
> >> <Result>
> >>   <BasicTypes>
> >>     <BasicType id="PH3330L">
> >>       <Status>End of life</Status>
> >>     </BasicType>
> >>     ...
> >>   </BasicTypes>
> >>   <ProductTypes>
> >>     <ProductType id="xxx">
> >>       <Status>Deprecated</Status>
> >>       <BasicType ref-id="PH3330L"/>
> >>     </ProductType>
> >>     ...
> >>   </ProductTypes>
> >>   <SalesItems>
> >>     <SalesItem id="yyy">
> >>       <Owner>abcde</Owner>
> >>       <ProductType ref-id="xxx"/>
> >>     </SalesItem>
> >>     ...
> >>   </SalesItems>
> >> </Result>
> >>
> >> -------------
> >> I have written a query which returns just this but it iterates
> >> - three times over the basictypes
> >> - 2 times over the producttypes
> >> - 1 time over the salesitems
> >>
> >> Is there a better way to get this accomplished in 1 iteration?
> >>
> >>
> >> Pseudo-code:
> >>
> >> let $basictypes := collection("basictypes")
> >> return
> >> <Result>
> >>   <BasicTypes>
> >>   {
> >>     for $basictype in $basictypes
> >>     ...do some stuff
> >>
> >>   }
> >>   </BasicTypes>
> >>   <ProductTypes>
> >>   {
> >>     for $basictype in $basictypes
> >>     for $producttype in $basictype/ProductTypes/ProductType
> >>     ...do some stuff
> >>   }
> >>   </ProductTypes>
> >>   <SalesItems>
> >>   {
> >>     for $basictype in $basictypes
> >>     for $producttype in $basictype/ProductTypes/ProductType
> >>     for $salesitem in $producttype/SalesItems/SalesItem
> >>     ...do some stuff
> >>   }
> >>   </SalesItems>
> >> </Result>
> >>
> >> Robby Pelssers
> >>
> >>
> >> _______________________________________________
> >> [email protected]
> >> http://x-query.com/mailman/listinfo/talk
> >
> >
> >
> > --
> > Adam Retter
> >
> > skype: adam.retter
> > tweet: adamretter
> > http://www.adamretter.org.uk
> > _______________________________________________
> > [email protected]
> > http://x-query.com/mailman/listinfo/talk
>
>
_______________________________________________
[email protected]
http://x-query.com/mailman/listinfo/talk
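
For readers following the axis question in this thread, here is a minimal
sketch of a variant that uses only child steps: the enclosing "for" variable
already holds the parent, so its @id can be read directly and no ancestor or
descendant step is needed. Whether this is actually faster is, as noted in
the thread, processor dependent. The sample data is inlined purely so the
sketch runs on its own (the thread reads from collection("basictypes")), and
the use of "except" to leave out the nested container elements, as well as
copying @id onto the output elements, are assumptions based on the desired
output in the original post.

```xquery
xquery version "1.0";

(: Sample data inlined for illustration only; in the thread the input
   comes from collection("basictypes"). :)
let $basicTypes :=
  <BasicType id="PH3330L">
    <Status>End of life</Status>
    <ProductTypes>
      <ProductType id="xxx">
        <Status>Deprecated</Status>
        <SalesItems>
          <SalesItem id="yyy">
            <Owner>abcde</Owner>
          </SalesItem>
        </SalesItems>
      </ProductType>
    </ProductTypes>
  </BasicType>
return
<Result>
  <BasicTypes>{
    for $basicType in $basicTypes
    return
      (: Assumption: copy everything except the nested ProductTypes. :)
      <BasicType>{
        $basicType/@id,
        $basicType/(* except ProductTypes)
      }</BasicType>
  }</BasicTypes>
  <ProductTypes>{
    for $basicType in $basicTypes,
        $productType in $basicType/ProductTypes/ProductType
    return
      (: $basicType is still in scope, so no ancestor:: step is needed. :)
      <ProductType>{
        $productType/@id,
        $productType/(* except SalesItems),
        <BasicType ref-id="{$basicType/@id}"/>
      }</ProductType>
  }</ProductTypes>
  <SalesItems>{
    for $productType in $basicTypes/ProductTypes/ProductType,
        $salesItem in $productType/SalesItems/SalesItem
    return
      (: Same idea one level down: the parent id comes from $productType. :)
      <SalesItem>{
        $salesItem/@id,
        $salesItem/*,
        <ProductType ref-id="{$productType/@id}"/>
      }</SalesItem>
  }</SalesItems>
</Result>
```

This still contains three FLWOR expressions, so it does not answer the
"one iteration" question; it only swaps the ancestor and descendant steps
for explicit child paths, which some processors may or may not evaluate
more cheaply.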
