Bugs item #2693776, was opened at 2009-03-19 09:10
Message generated for change (Comment added) made by tsheyar
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=482468&aid=2693776&group_id=56967

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: PF/runtime
Group: MonetDB4 "stable"
Status: Open
>Resolution: Fixed
Priority: 5
Private: No
Submitted By: Hans (hans_0_)
Assigned to: Jan Rittinger (tsheyar)
Summary: xquery fails: memory allocation

Initial Comment:
The following xquery fails, using Monet4 Nov 2008 SP2.

$ cat error.xq
let $result := (let $items := doc("consumer04.xml")//properties/..
let $values := 
distinct-values((doc("consumer04.xml")//properties/..)/properties/content)
for $i in $items[properties/content]
where some $x in $values satisfies $x=$i/properties/content
return $i)
let $attrs := distinct-values(
  for $j in $result/properties/*
    return name($j)
  )
let $props := doc("elements.xml")//proper...@name]
let $subprops := doc("elements.xml")//subproper...@name]
return element result {
  element matches { count($result) },
  element attributes {
    let $usedprops := for $attr in $attrs return $pro...@name=$attr]
    for $prop in $usedprops/.
    return element { string($prop/@name) } {
      attribute format { $prop/format },
      attribute unit { $prop/unit },
      attribute description { $prop/description },
      for $subprop in $prop/subproperty
      let $sub := $subpro...@name=$subprop]
      return element { string(exactly-one($sub/@name)) } {
        attribute format { $sub/format },
        attribute unit { $sub/unit },
        attribute description { $sub/description }
      }
    }
  }
}
<>
$ mclient -p55555 -lx -t error.xq > error.xml
MAPI  = mone...@localhost:55555
QUERY = let $result := (let $items := doc("consumer04.xml")//properties/..
ERROR = !ERROR: GDKmallocmax: failed for 238795816 bytes
        !ERROR: GDKload: failed name=61/6151, ext=head
        !ERROR: CMDleftjoin: operation failed.
        !ERROR: BBPdecref: 1000000017_rid_level does not have pointer fixes.
        !ERROR: BBPdecref: 1000000017_rid_prop does not have pointer fixes.
        !ERROR: BBPdecref: 1000000017_prop_text does not have pointer fixes.
        !ERROR: BBPdecref: 1000000017_prop_val does not have pointer fixes.
        !ERROR: BBPdecref: 1000000017_attr_qn does not have pointer fixes.
        !ERROR: BBPdecref: 1000000017_attr_prop does not have pointer fixes.
$ cat error.xml
#GDKmalloc(180627422248) fail => BBPtrim(enter) usage[mem=148149976,vm=902317056]
#GDKmalloc(180627422248) fail => BBPtrim(ready) usage[mem=36756832,vm=583467008]
Timer   34246.875 msec
$


The used documents were sent earlier in March 2009 to Jan (Tuebingen, PF WIKI) 
and to Sjoerd (CWI, testset).

----------------------------------------------------------------------

>Comment By: Jan Rittinger (tsheyar)
Date: 2009-04-07 15:27

Message:
Added an optimized translation for thetajoins that takes semijoins into
account. (Query now runs in the HEAD.)

----------------------------------------------------------------------

Comment By: Peter Boncz (boncz)
Date: 2009-04-07 14:53

Message:
for $i in $items[properties/content]
where some $x in $values satisfies $x=$i/properties/content
return $i)

is a join of 86K nodes with itself, and all of them have the same value: ""
(the empty string).
Therefore, the join has 9G result tuples, and a runtime error occurs.

So, the direct problem is that this query (on this data, with all values
equal) produces a very large intermediate result. It should be rewritten or
avoided by the application.

However, this "where some satisfies" currently uses a normal join
strategy, whereas a semijoin strategy would allow pushing duplicate
elimination on the outer side below the join. That would turn this join
into an 86K x 1 join and solve the problem.
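
As an illustration only (this is not MonetDB/Pathfinder code), the difference
between the two strategies can be sketched in a few lines of Python. The
function names, the list-based "tables" and the row count n are assumptions
made up for this sketch, and it does not try to mirror which side the actual
algebra plan treats as the outer one.

```python
from collections import defaultdict


def equi_join(probe_values, match_values):
    """Plain equi-join: emit one (probe_index, match_index) pair per match."""
    by_value = defaultdict(list)
    for j, v in enumerate(match_values):
        by_value[v].append(j)
    return [(i, j)
            for i, v in enumerate(probe_values)
            for j in by_value.get(v, [])]


def semi_join(probe_values, match_values):
    """Semijoin: keep a probe row at most once if any match row has its value."""
    distinct = set(match_values)  # duplicate elimination below the join
    return [i for i, v in enumerate(probe_values) if v in distinct]


n = 300                 # hypothetical small stand-in for the 86K nodes
values = [""] * n       # every node carries the same value, the empty string

pairs = equi_join(values, values)   # n * n intermediate tuples
kept = semi_join(values, values)    # n rows, each probed against 1 value

print(len(pairs), len(kept))
```

With all values equal, the plain join materializes n * n pairs before the
existential "some ... satisfies" can discard the duplicates, while the
semijoin variant never builds more than n rows.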

The algebra plan has this opportunity pretty openly exposed, so Jan
figured it would be possible to put an optimization in.

Turning this over to Jan, then.




----------------------------------------------------------------------

Comment By: Peter Boncz (boncz)
Date: 2009-03-27 19:01

Message:
Jan, it seems that an equi-join is not detected here. It seems a pretty
normal pattern, though. Could you check what goes wrong?



----------------------------------------------------------------------

Comment By: Hans (hans_0_)
Date: 2009-03-19 15:33

Message:
Reply to question from stmane Date: 2009-03-19 10:09
consumer04-small.xml is a very limited subset of the bigger
consumer04.xml. A lot of 'directories' from the original system were left
out in order to create a small document that fits in the nightly test (less
than 10 MB). As a result, no 'content' type properties are present there,
hence the zero-size result set.

Reply to question from stmane Date: 2009-03-19 09:59
Most of our xqueries are generated by putting a lot of small pieces
together. These xqueries are the result of a filter description and a
required result set. As a result, most xqueries we send to Mserver are not
optimized. We realize there could be a performance improvement if we
always sent the optimal xquery. The drawback would be a lot of
development costs.




----------------------------------------------------------------------

Comment By: Stefan Manegold (stmane)
Date: 2009-03-19 11:11

Message:
Just for completeness, as can be read from my Mserver welcome messages, I'm
working on 64-bit Linux.


----------------------------------------------------------------------

Comment By: Stefan Manegold (stmane)
Date: 2009-03-19 11:09

Message:
Using your "consumer04-small.xml", both your original query and both of my
proposed alternatives produce the same ("empty"?) result, with both Feb2009
& Nov2008 and both ALG & MPS:

<result><matches>0</matches><attributes/></result>


----------------------------------------------------------------------

Comment By: Stefan Manegold (stmane)
Date: 2009-03-19 10:59

Message:
BTW:
What is the intended and/or factual difference between your

let $result := (
let $items := doc("consumer04.xml")//properties/..
let $values :=
distinct-values((doc("consumer04.xml")//properties/..)/properties/content)
for $i in $items[properties/content]
where some $x in $values satisfies $x=$i/properties/content
return $i
)
and a simple

xquery>let $result := doc("consumer04.xml")//*[properties/content]
more>return count($result)
more><>
146457

or

xquery>let $result := doc("consumer04.xml")//properties/content/../..
more>return count($result)
more><>
146457

?

The latter two work fine with Nov2008 & Feb2009 and make your original
query work fine with both, too, (at least with my below patch applied),
returning

<result><matches>146457</matches><attributes><name description="Removed
content" unit="" format="xs:string"/><path description="Removed content"
unit="" format="xiraf:path"/><content description="Removed content" unit=""
format="xiraf:complex"><utf16le description="Removed content" unit=""
format="xs:string"/><utf8 description="Removed content" unit=""
format="xs:string"/><int32le description="Removed content" unit=""
format="xs:integer"/><int64le description="Removed content" unit=""
format="xs:integer"/></content><stream description="Removed content"
unit="" format=""/><size description="Removed content" unit="bytes"
format="xs:integer"/></attributes></result>


----------------------------------------------------------------------

Comment By: Stefan Manegold (stmane)
Date: 2009-03-19 10:44

Message:
ps: in both cases (Nov2008 & Feb2009) I locally applied the following patch
to avoid
[ 2668437 ] PF runtime: parent step produces not/wrongly sorted result
https://sourceforge.net/tracker/index.php?func=detail&aid=2668437&group_id=56967&atid=482468

$ cvs diff pathfinder/runtime/ll_upwards.mx
Index: pathfinder/runtime/ll_upwards.mx
===================================================================
RCS file: /cvsroot/monetdb/pathfinder/runtime/ll_upwards.mx,v
retrieving revision 1.32
diff -u -r1.32 ll_upwards.mx
--- pathfinder/runtime/ll_upwards.mx    8 Jan 2009 16:54:18 -0000       1.32
+++ pathfinder/runtime/ll_upwards.mx    19 Mar 2009 09:43:04 -0000
@@ -571,7 +571,7 @@
         }
         bn->tsorted = 0;
         if (niters == 1) {
-            bn->tsorted = GDK_SORTED;
+            /*bn->tsorted = GDK_SORTED;*/
             BATkey(BATmirror(bn), 1);
         }
         *ret = bn;


----------------------------------------------------------------------

Comment By: Stefan Manegold (stmane)
Date: 2009-03-19 10:42

Message:
With Feb2009 release branch (post Feb2009 release)

$ Mserver --dbinit='module(pathfinder);'
# MonetDB Server v4.28.0
# based on GDK   v1.28.1
# Copyright (c) 1993-July 2008, CWI. All rights reserved.
# Copyright (c) August 2008-, MonetDB B.V.. All rights reserved.
# Compiled for x86_64-unknown-linux-gnu/64bit with 64bit OIDs; dynamically linked.
# Visit http://monetdb.cwi.nl/ for further information.
# PF/Tijah module v0.9.0 loaded. http://dbappl.cs.utwente.nl/pftijah
# MonetDB/XQuery module v0.28.1 loaded (default back-end is 'algebra')
# XRPC administrative console at http://127.0.0.1:50001/admin

$ mclient -lx
xquery>let $items := doc("consumer04.xml")//properties/..
more>return count($items)
more><>
1156921
xquery>let $values :=
distinct-values((doc("consumer04.xml")//properties/..)/properties/content)
more>return count($values)
more><>
1
xquery>let $items := doc("consumer04.xml")//properties/..
more>let $values :=
distinct-values((doc("consumer04.xml")//properties/..)/properties/content)
more>return count($items[properties/content])
more><>
146457
xquery>let $items := doc("consumer04.xml")//properties/..
more>let $values :=
distinct-values((doc("consumer04.xml")//properties/..)/properties/content)
more>let $result := (
more>for $i in $items[properties/content]
more>where some $x in $values satisfies $x=$i/properties/content
more>return $i
more>)
more>return count($result)
more><>

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
26353 manegold  20   0  337g 7.1g 7.1g R 32.1 91.3   2:09.11 Mserver
                        ^^^^ ^^^^ ^^^^

... stopped (I killed client & server) after 10 min without result ...



----------------------------------------------------------------------

Comment By: Stefan Manegold (stmane)
Date: 2009-03-19 10:31

Message:
With Nov2008 release branch (post Nov2008-SP2 release):

$ Mserver --dbinit='module(pathfinder);'
# MonetDB Server v4.26.5
# based on GDK   v1.26.5
# Copyright (c) 1993-July 2008, CWI. All rights reserved.
# Copyright (c) August 2008-, MonetDB B.V.. All rights reserved.
# Compiled for x86_64-unknown-linux-gnu/64bit with 64bit OIDs; dynamically linked.
# Visit http://monetdb.cwi.nl/ for further information.
# PF/Tijah module v0.9.0 loaded. http://dbappl.cs.utwente.nl/pftijah
# MonetDB/XQuery module v0.26.5 loaded (default back-end is 'algebra')
# XRPC administrative console at http://127.0.0.1:50001/admin

$ mclient -lx 
xquery>let $items := doc("consumer04.xml")//properties/..
more>return count($items)
more><>
1156921
xquery>let $values :=
distinct-values((doc("consumer04.xml")//properties/..)/properties/content)
more>return count($values)
more><>
1
xquery>let $items := doc("consumer04.xml")//properties/..
more>let $values :=
distinct-values((doc("consumer04.xml")//properties/..)/properties/content)
more>return count($items[properties/content])
more><>
146457
xquery>let $items := doc("consumer04.xml")//properties/..
more>let $values :=
distinct-values((doc("consumer04.xml")//properties/..)/properties/content)
more>let $result := (
more>for $i in $items[properties/content]
more>where some $x in $values satisfies $x=$i/properties/content
more>return $i
more>)
more>return count($result)
more><>
MAPI  = mone...@localhost:50000
ACTION= read_line
QUERY = let $items := doc("consumer04.xml")//properties/..
ERROR = Connection terminated

MonetDB>Mserver:
/ufs/manegold/_/scratch0/Monet/Testing/Stable_Nov2008/source/MonetDB/src/gdk/gdk_bbp.mx:1593:
decref: Assertion `0' failed.
Aborted


----------------------------------------------------------------------
