Bugs item #1881181, was opened at 2008-01-28 15:36
Message generated for change (Comment added) made by stmane
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=482468&aid=1881181&group_id=56967

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: PF/runtime
Group: Pathfinder CVS Head
>Status: Closed
>Resolution: Invalid
Priority: 5
Private: No
Submitted By: Jan Flokstra (jflokstra)
>Assigned to: Stefan Manegold (stmane)
Summary: XQ: following uses too much memory

Initial Comment:
Next to the difficult-to-reproduce preceding/following bug #1850772 
([preceding and following]-sibling:: bug), there is another problem with the 
following:: axis step.

On the small 1M xmark document, the queries listed below all result in a memory 
allocation failure for BATs sized from 200M to several G. I think this example is 
so simple that it should run on any machine. 

count(doc("xmark1.xml")//street[./following::keyword[./following::current]])
count(doc("xmark1.xml")//*[./following::incategory[./following::interest]])
count(doc("xmark1.xml")//*[./following::category]/following::interest)
count(doc("xmark1.xml")//*[./following::incategory[./following::interest]])
count(doc("xmark1.xml")//itemref[./following::phone[./following::africa]])

For instance, the first query, run with CVS HEAD (28/1, 14:00) on SuSE 9.3 (1G 
RAM):

floks...@ewi581:~/scripts/DATA> mclient -lx
xquery>count(doc("xmark1.xml")//street[./following::keyword[./following::current]])
more>MAPI  = mone...@localhost:50000
QUERY = 
count(doc("xmark1.xml")//street[./following::keyword[./following::current]])
ERROR = !ERROR: GDKload failed: name=06/646, ext=head.priv
        !ERROR: PFll_following: could not allocate a result BAT of size 555611000.
        !ERROR: PFll_following: operation failed.
xquery>          

I attached the xmark document used (bzip2-compressed) to this report.

Jan.                                 

----------------------------------------------------------------------

>Comment By: Stefan Manegold (stmane)
Date: 2009-02-15 22:28

Message:
turned the request for name-test push-down in preceding & following steps
into a feature request:
[ 2603533 ] XQ: name-test push down for preceding and following step

https://sourceforge.net/tracker/index.php?func=detail&aid=2603533&group_id=56967&atid=482471



----------------------------------------------------------------------

Comment By: Stefan Manegold (stmane)
Date: 2008-02-10 14:14

Message:
Logged In: YES 
user_id=572415
Originator: NO

tagged as "XQ" in subject


----------------------------------------------------------------------

Comment By: Stefan Manegold (stmane)
Date: 2008-01-29 10:50

Message:
Logged In: YES 
user_id=572415
Originator: NO

Hoi Jan,

thanks for the info!
The push-downs are actually relevant for both MPS & ALG (as long as we still
have staircase joins); hence, they should eventually be done in any case, though
most probably not "this week" ...

Stefan


----------------------------------------------------------------------

Comment By: Jan Flokstra (jflokstra)
Date: 2008-01-29 10:33

Message:
Logged In: YES 
user_id=1054297
Originator: YES

Jan and Stefan, thanks for the effort. These following:: queries were
(again :-) created by a student of ours, Gerben Broenink, who is doing a
statistical analysis project on XML docs for Riham and Maurice. He told me
he will just skip these queries, so it is not important to do the push-down
optimizations in the MPS version.

JanF.

----------------------------------------------------------------------

Comment By: Stefan Manegold (stmane)
Date: 2008-01-28 21:03

Message:
Logged In: YES 
user_id=572415
Originator: NO

Jan (R.), thanks for the analysis!

The problem is "particularly bad", since the following- & preceding-step
(unlike descendant & child) do not (yet?) perform a "name-test push-down",
i.e., they effectively first need to materialize following::* before
applying the name-test.
I'll try to check how difficult it will be to add the "name-test
push-down" also for following & preceding ...

(Obviously, even with the "name-test push-down", Jan R.'s analysis/comment
still holds.)
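
To make the effect of a name-test push-down concrete, here is a minimal
Python sketch. It is not Pathfinder's staircase join: it uses a naive
pairwise scan over a hypothetical toy document with pre/post ranks, and
all names in it (Node, encode, step_then_test, step_with_pushdown) are
made up for illustration. It only shows how much smaller the intermediate
result gets when the name test is applied before the following step.

```python
from typing import NamedTuple

class Node(NamedTuple):
    pre: int    # preorder rank
    post: int   # postorder rank
    name: str

def encode(tree):
    """Flatten a nested (name, children) tuple into nodes in document order."""
    nodes = []
    post_counter = 0

    def visit(subtree):
        nonlocal post_counter
        name, children = subtree
        slot = len(nodes)          # preorder rank == position in the list
        nodes.append(None)
        for child in children:
            visit(child)
        nodes[slot] = Node(slot, post_counter, name)
        post_counter += 1

    visit(tree)
    return nodes

def is_following(ctx, cand):
    # cand is on the following axis of ctx iff it starts and ends after ctx
    return cand.pre > ctx.pre and cand.post > ctx.post

def step_then_test(doc, context, name):
    """Materialize following::* first, apply the name test afterwards."""
    intermediate = [(c, n) for c in context for n in doc if is_following(c, n)]
    result = [(c, n) for c, n in intermediate if n.name == name]
    return result, len(intermediate)

def step_with_pushdown(doc, context, name):
    """Apply the name test first, so only matching nodes are ever paired."""
    candidates = [n for n in doc if n.name == name]
    result = [(c, n) for c in context for n in candidates if is_following(c, n)]
    return result, len(result)

# Hypothetical toy document, not taken from xmark.
doc = encode(("site", [("a", [("x", []), ("y", [])]),
                       ("b", [("keyword", [])]),
                       ("c", [("x", []), ("keyword", [])])]))
context = doc  # corresponds to //*

r1, n1 = step_then_test(doc, context, "keyword")
r2, n2 = step_with_pushdown(doc, context, "keyword")
print(n1, n2)  # intermediate pairs without and with push-down
```

Both variants return the same pairs; only the size of what has to be
materialized on the way differs.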


----------------------------------------------------------------------

Comment By: Jan Rittinger (tsheyar)
Date: 2008-01-28 18:25

Message:
Logged In: YES 
user_id=993208
Originator: NO

I just tried the fourth query
('count(doc("xmark1.xml")//*[./following::incategory[./following::interest]])')
on the 580k auctionG.xml from the testset using the algebra variant (pf -A)
and added a count before and after the following steps:

before following::incategory:
[ 8517 ]
after following::incategory:
[ 614915 ]
before following::interest:
[ 614915 ]
during following::interest:
!ERROR: PFll_following: could not allocate a result BAT of size 3069254744.

I think this is perfectly fine, as you asked for (almost) a Cartesian product
of all descendants of the document (8517), the incategory nodes (436), and
the interest nodes (211), which gives you around 783 million nodes. The
guess of the system (a BAT of size 3.069 billion) thus was not far off.
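
The arithmetic can be checked with a few lines of Python. This is only a
back-of-the-envelope sketch using the three counts quoted above and the
size from the error message; it makes no assumption about what unit the
reported BAT size is in.

```python
# Counts quoted in the comment above.
context_nodes = 8517      # nodes selected by //*
incategory_nodes = 436
interest_nodes = 211

# Upper bound if every combination qualified (a full Cartesian product).
upper_bound = context_nodes * incategory_nodes * interest_nodes

# Size reported by PFll_following in the error message.
requested_size = 3069254744

print(upper_bound)                    # roughly 783 million
print(requested_size / upper_bound)   # ratio of requested size to the bound
```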

I assume the query writer was looking for the following-sibling axis
instead of the 'global' following axis.
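
The difference between the two axes can be sketched in Python with the
standard xml.etree.ElementTree module. ElementTree's own XPath subset
does not offer these axes, so the two helper functions below (following
and following_sibling, both hypothetical names) compute them by hand. The
small document is made up for illustration and is not the real xmark
structure.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-item document, for illustration only.
root = ET.fromstring(
    "<site>"
    "<item><incategory/><interest/></item>"
    "<item><incategory/><interest/></item>"
    "</site>"
)

order = list(root.iter())                          # all elements in document order
parent = {child: p for p in order for child in p}  # child -> parent lookup

def following(node):
    """All elements after node in document order, excluding its descendants."""
    own_subtree = set(node.iter())                 # node plus its descendants
    start = order.index(node)
    return [n for n in order[start + 1:] if n not in own_subtree]

def following_sibling(node):
    """Only the later children of node's parent."""
    p = parent.get(node)
    if p is None:
        return []
    siblings = list(p)
    return siblings[siblings.index(node) + 1:]

first_incategory = root.find("item/incategory")

all_following = following(first_incategory)
interests_following = [n for n in all_following if n.tag == "interest"]
interests_sibling = [n for n in following_sibling(first_incategory)
                     if n.tag == "interest"]

print(len(all_following), len(interests_following), len(interests_sibling))
```

Even in this tiny document, following:: reaches into every later item,
while following-sibling:: stays within the same item.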

----------------------------------------------------------------------

_______________________________________________
Monetdb-bugs mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/monetdb-bugs