Bugs item #2722174 was opened at 2009-03-30 18:38
Message generated for change (Comment added) made by vzzzbx
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=482468&aid=2722174&group_id=56967
Please note that this message contains a full copy of the comment thread
for this request, including the initial issue submission,
not just the latest update.
Category: PF/loader
Group: Pathfinder "stable"
>Status: Closed
Resolution: Fixed
Priority: 5
Private: No
Submitted By: Wouter Alink (vzzzbx)
Assigned to: Peter Boncz (boncz)
Summary: XQ: shredding 70GB XML fails.
Initial Comment:
Shredding 1000000 (1 million) documents (~70GB) using MonetDB/XQuery fails,
while shredding only the first 700000 documents of this collection succeeds.
See the error message below. More investigation is needed. To be continued.
MAPI = mone...@localhost:52009
QUERY = for $i in doc("1M_docs.xml")//doc return
        pf:add-doc($i, concat("1M_docs_", $i), "1M_collection.xml")
ERROR = !ERROR: BBPdecref: 1001729024_rid_size does not have pointer fixes.
!ERROR: BBPdecref: 1001729024_rid_level does not have pointer fixes.
!ERROR: BBPdecref: 1001729024_rid_prop does not have pointer fixes.
!ERROR: BBPdecref: 1001729024_prop_text does not have pointer fixes.
!ERROR: BBPdecref: 1001729024_prop_val does not have pointer fixes.
real 240m55.752s
user 0m0.004s
sys 0m0.004s
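
For reference, a sketch of how such a run could be reproduced at a smaller
scale (everything here is a placeholder: paths, file names, document count,
and the "small_" prefix; only the query shape and the mclient -lxq
invocation mirror the run above):

# Hypothetical small-scale reproduction: generate N tiny documents plus a
# doc-list, then shred them into one collection with the same query shape.
N=1000
mkdir -p /tmp/docs
echo '<docs>' > /tmp/docs/doc_list.xml
for i in $(seq 1 $N)
do
  echo "<item>$i</item>" > "/tmp/docs/$i.xml"
  echo "<doc>/tmp/docs/$i.xml</doc>" >> /tmp/docs/doc_list.xml
done
echo '</docs>' >> /tmp/docs/doc_list.xml
time echo 'for $i in doc("/tmp/docs/doc_list.xml")//doc return
  pf:add-doc($i, concat("small_",$i), "small_collection.xml")' | mclient -lxq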
----------------------------------------------------------------------
>Comment By: Wouter Alink (vzzzbx)
Date: 2009-04-10 13:36
Message:
There is only good news to report:
- the initial bug is gone, and
- the database was not corrupted anyway, because the bug manifested itself
only after the shredding of the documents. This means that databases
created with the MonetDB version from before the fix (which gave an error
at the end) can still be used, simply by running the recompiled Mserver on
the same dbfarm.
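
In practice this amounts to something like the following sketch, assuming
the recompiled Mserver has been started against the unchanged dbfarm in the
same way as before; the sanity query merely recounts the doc-list that was
added with pf:add-doc earlier in this thread:

# Hypothetical sanity check after swapping in the fixed binary: with the
# recompiled Mserver running on the old dbfarm, re-query a document that
# was shredded before the fix and expect the original count.
echo 'count(doc("1M_docs.xml")//doc)' | mclient -lxq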
----------------------------------------------------------------------
Comment By: Wouter Alink (vzzzbx)
Date: 2009-04-10 12:05
Message:
The bug should have been fixed by Peter's checkin yesterday. A test for the
testweb cannot be added easily, as the data is just too large. I will test
the new code with the data, and will close the bug if everything works
fine.
----------------------------------------------------------------------
Comment By: Wouter Alink (vzzzbx)
Date: 2009-04-07 12:50
Message:
Cheering was too early... querying the resulting collection (which was
shredded in batches) still does not work (assertion in decref).
The problem seems to be in __runtime_index() (which is invoked from
open_coll(), which in turn is invoked in ws_collection_root()). This
function both uses massive amounts of memory (>150GB) and calls decref
once too often (lowering the refs value to zero).
Debugging is harder than usual, as setting watchpoints on an IA64 system
is tricky. To be continued.
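
For the record, one way around the watchpoint trouble might be to break on
the assertion itself rather than watching the refs field; a sketch only
(the Mserver path is a placeholder, and __assert_fail is the glibc routine
behind the "Assertion `0' failed" message):

# Hypothetical gdb session: stop when the decref assertion fires and grab
# a backtrace, instead of fighting IA64 hardware watchpoints. Replay the
# failing query from a second shell with mclient -lxq once the server is up.
gdb /path/to/Mserver \
  -ex 'break __assert_fail' \
  -ex run \
  -ex backtrace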
----------------------------------------------------------------------
Comment By: Wouter Alink (vzzzbx)
Date: 2009-04-03 12:20
Message:
The following code runs to completion in a few hours (no errors or crashes
are generated):

for line in 1 100001 200001 300001 400001 500001 600001 700001 800001 900001 1000001
do
  echo "# Shredding from $line to "$(($line + 100000))
  time echo 'for $i in subsequence(doc("'$PWD/1M_doc_list.xml'")//doc,'$line',100000)
    return pf:add-doc($i, concat("all_",$i), "all")' | mclient -lxq
done
echo "# Done!"
However, this doesn't say much about the actual problem. What it perhaps
does say is that the total amount of data in one collection is not the
problem.
----------------------------------------------------------------------
Comment By: Wouter Alink (vzzzbx)
Date: 2009-04-02 16:22
Message:
In the initial posting of this bug, the '1001729024_rid_size' bat belonged
to the /some/path/1M_docs.xml document (which contained the names of the
documents). This document was shredded on the fly (cached, and not made
persistent). The actual bug query read:

for $i in doc("/some/path/1M_docs.xml")//doc return
pf:add-doc($i, concat("1M_docs_", $i), "1M_collection.xml")
A possible cause could have been that the 'temporary' document was thrown
away too early, but this doesn't seem to be the case. Using an explicit
add-doc for the XML document that contained the names of the million
documents, i.e. the following two queries:

pf:add-doc("/some/path/1M_docs.xml", "1M_docs.xml")

followed by

for $i in doc("1M_docs.xml")//doc return
pf:add-doc($i, concat("1M_docs_", $i), "1M_collection.xml")

results in:

Mserver: gdk_bbp.mx:1705: decref: Assertion `0' failed.

Investigation showed that this time a 'ws' bat had too few refs. So the
thought that the 'temporary' document was thrown away too early may not be
the problem after all.
----------------------------------------------------------------------