Bugs item #2144639, was opened at 2008-10-03 18:25
Message generated for change (Comment added) made by tsheyar
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=482468&aid=2144639&group_id=56967

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: PF general
Group: Pathfinder 0.24
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Loredana Afanasiev (lafanasi)
Assigned to: Lefteris Sidirourgos (lsidir)
Summary: XQ: fn:collection() in algebra version

Initial Comment:
Hi Lefteris,

as discussed on Thu, I add this as a bug, so that you can close it soon :)

[EMAIL PROTECTED] xq]$ mclient -lx -s 'fn:collection("MotiesTweedeKamer")'
MAPI  = [EMAIL PROTECTED]:50000
QUERY = fn:collection("MotiesTweedeKamer")
ERROR = !fatal error: Algebra implementation for function `fn:collection' unknown.

Thanks,
l.

----------------------------------------------------------------------

>Comment By: Jan Rittinger (tsheyar)
Date: 2008-10-08 23:41

Message:
Answer to 'why does count(pf:collection(...)/*) return 0':

I don't know the exact behavior of pf:collection, but I assume it returns a
magical node that sits on top of all root nodes in the collection. If the
collection consists only of documents, as in Loredana's case, the children
of the magical node are naturally all document nodes. Thus the element
name test * does not return any results.
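
The explanation above can be played through with a toy model. This is only a sketch under assumptions: the Node class, the node kinds, and the three-document collection are made up for illustration and say nothing about how pf:collection is implemented. It shows why a child step with the name test * finds nothing when all children are document nodes, while node() finds them all.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    kind: str                 # "collection", "document" or "element" (toy model)
    name: str = ""
    children: list = field(default_factory=list)


def child_step(context, node_test):
    """Apply one child-axis step to every node in the context sequence."""
    return [c for n in context for c in n.children if node_test(c)]


def star(n):
    """Models the name test `*`: matches element nodes only."""
    return n.kind == "element"


def any_node(n):
    """Models the kind test `node()`: matches every node."""
    return True


def named(name):
    """Models a name test such as `document`."""
    return lambda n: n.kind == "element" and n.name == name


# Hypothetical collection: three document nodes, each with a <document> root.
docs = [Node("document", children=[Node("element", "document")]) for _ in range(3)]
collection = Node("collection", children=docs)

via_star = child_step([collection], star)          # models pf:collection(...)/*
via_node = child_step([collection], any_node)      # models pf:collection(...)/node()
roots = child_step(via_node, named("document"))    # models .../node()/document

print(len(via_star), len(via_node), len(roots))
```

In this model the pattern matches the counts reported in the thread: nothing for /*, one hit per document for /node(), and one root element per document for /node()/document.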


Answer to 'why is there a performance difference between algebra and MPS
version':

As always, a big difference means that a join is not detected in one of the
two variants. In the stable branch we had to avoid an error message
(about running out of column names) and are thus not able to apply all
optimizations necessary to detect the value-based join. If you use pf to
compile the query, you'll see a warning that some optimizations could not be
applied.
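
To give a feeling for why a missed join matters, here is a generic sketch, not Pathfinder's optimizer or its algebra. It compares the work of evaluating an equality predicate pair by pair (what remains when the join is not recognized) with a hash-based join over the same inputs. The data and the function names are invented for illustration.

```python
def nested_loop_join(left, right):
    """Compare every left value with every right value."""
    work = 0
    pairs = []
    for l in left:
        for r in right:
            work += 1                      # one comparison per pair
            if l == r:
                pairs.append((l, r))
    return pairs, work


def hash_join(left, right):
    """Build a hash table on `right`, then probe it once per `left` value."""
    table = {}
    work = 0
    for r in right:
        work += 1                          # one insert per build value
        table.setdefault(r, []).append(r)
    pairs = []
    for l in left:
        work += 1                          # one lookup per probe value
        pairs.extend((l, r) for r in table.get(l, []))
    return pairs, work


# Made-up data: 1000 "year" strings and the 8 distinct years to look up.
years = [str(2000 + i % 8) for i in range(1000)]
wanted = [str(2000 + i) for i in range(8)]

nl_pairs, nl_work = nested_loop_join(wanted, years)
hj_pairs, hj_work = hash_join(wanted, years)

print(nl_work, hj_work)
```

Both functions return the same pairs; only the amount of work differs, and the gap widens as the inputs grow.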


----------------------------------------------------------------------

Comment By: Stefan Manegold (stmane)
Date: 2008-10-08 16:31

Message:
and here the "proof" that pf:collection() does not cause any performance
degradation --- the problem is indeed the ALG translation ...

$ cat q9-fn.xq
let $col := fn:collection("MotiesTweedeKamer")
for $y in distinct-values( for $y in $col//hiddendatum
                           return substring-before(fn:string($y),'.') )
        let $thisyear :=
            $col//document[substring-before(fn:string(.//hiddendatum[1]),'.')=$y]
        let $partij := distinct-values($thisyear//partij)
        for $p in $partij
                let $aantalingediendemoties :=
                    count($thisyear[.//indienergnlod//partij=$p])
                let $aantalmedeingediendemoties :=
                    count($thisyear[.//medeindienergnlod//partij=$p])
                order by $y descending,
                    $aantalingediendemoties descending,
                    $aantalmedeingediendemoties descending
        return
                <aantal jaar='{$y}'
                        partij='{$p}'
                        aantalingediendemoties='{$aantalingediendemoties}'
                        aantalmedeingediendemoties='{$aantalmedeingediendemoties}'
                />

$ diff q9-{fn,pf}.xq
1c1
< let $col := fn:collection("MotiesTweedeKamer")
---
> let $col := pf:collection("MotiesTweedeKamer")

$ mclient -t -lx -g q9-fn.xq | grep -v '^<aantal .*/>'
Trans      32.156 msec
Shred       0.000 msec
Query    3272.690 msec
Print       1.598 msec
Timer    3339.119 msec 

$ mclient -t -lx -g q9-pf.xq | grep -v '^<aantal .*/>'
Trans      27.816 msec
Shred       0.000 msec
Query    3247.053 msec
Print       1.594 msec
Timer    3308.713 msec 

$ mclient -t -lx q9-pf.xq | grep -v '^<aantal .*/>'
[makes Mserver grow >> 13 GB and (hence) runs very long ...]


----------------------------------------------------------------------

Comment By: Stefan Manegold (stmane)
Date: 2008-10-08 16:26

Message:
$ mclient -lx -s'count(pf:collection("MotiesTweedeKamer"))'
1

and

$ mclient -lx -g -s'count(pf:collection("MotiesTweedeKamer"))'
1

are indeed correct. no point.
But

$ mclient -lx -s'count(pf:collection("MotiesTweedeKamer")/*)'
0

and

$ mclient -lx -g -s'count(pf:collection("MotiesTweedeKamer")/*)'
0

are (at least) "unexpected".
And since

$ mclient -lx -s'count(pf:collection("MotiesTweedeKamer")/node())'
27946

and

$ mclient -lx -g -s'count(pf:collection("MotiesTweedeKamer")/node())'
27946

(mind the "/node()" instead of "/*"!)
appear to work correctly, too,
I tried to find out why the "/*" yields "unexpected" results;
hence, my "layman's" attempt to check the type returned by
"pf:collection(<colname>)/*":

$ mclient -lx -s'doc(pf:collection("MotiesTweedeKamer")/*)'
MAPI  = [EMAIL PROTECTED]:50000
QUERY = doc(pf:collection("MotiesTweedeKamer")/*)
ERROR = !type error: no variant of function fn:doc accepts the given argument type(s): string?
        !type error: maybe you meant:
        !type error:   fn:doc (string?) as document { node }?
        !type error: illegal arguments for function fn:doc

and

$ mclient -lx -g -s'doc(pf:collection("MotiesTweedeKamer")/*)'
MAPI  = [EMAIL PROTECTED]:50000
QUERY = doc(pf:collection("MotiesTweedeKamer")/*)
ERROR = !type error: no variant of function fn:doc accepts the given argument type(s): string?
        !type error: maybe you meant:
        !type error:   fn:doc (string?) as document { node }?
        !type error: illegal arguments for function fn:doc

...


----------------------------------------------------------------------

Comment By: Lefteris Sidirourgos (lsidir)
Date: 2008-10-08 16:11

Message:
pf:collection correctly returns 1 node. The (string?) error message is
misleading here: it says that fn:doc wants a string?, not that pf:collection
is returning one. I think what it does not like is the cardinality, since you
are giving it a *.
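
For readers less familiar with the notation: in XQuery sequence types, "?" means zero or one item, "*" zero or more, "+" one or more, and no indicator exactly one. The small sketch below only models these occurrence indicators on item counts. The function name is made up, and it does not reproduce Pathfinder's static type check, so it cannot confirm whether cardinality is really what triggers the error above.

```python
def matches_occurrence(item_count, indicator):
    """Return True if a sequence with `item_count` items fits the indicator."""
    if indicator == "":        # no indicator: exactly one item
        return item_count == 1
    if indicator == "?":       # zero or one item
        return item_count <= 1
    if indicator == "*":       # zero or more items
        return True
    if indicator == "+":       # one or more items
        return item_count >= 1
    raise ValueError(f"unknown occurrence indicator: {indicator!r}")


# A parameter declared as string? accepts the empty sequence or one item ...
print(matches_occurrence(0, "?"), matches_occurrence(1, "?"))
# ... but not a sequence that may hold many items, which a path step can return.
print(matches_occurrence(27946, "?"), matches_occurrence(27946, "*"))
```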

----------------------------------------------------------------------

Comment By: Stefan Manegold (stmane)
Date: 2008-10-08 15:54

Message:
It seems as if the return type of pf:collection() is not quite correct:
string? instead of node:
========
15:50:43 [EMAIL PROTECTED]:/tmp $ mclient -lx -s'pf:collections()'
<collection updatable="false" size="120 MiB" numDocs="27946">MotiesTweedeKamer</collection>,
<collection updatable="true" size="237 MiB" numDocs="27946">MotiesTweedeKamer_Updatable</collection>
15:50:46 [EMAIL PROTECTED]:/tmp $ mclient -lx -g -s'pf:collections()'
<collection updatable="false" size="120 MiB" numDocs="27946">MotiesTweedeKamer</collection>,
<collection updatable="true" size="237 MiB" numDocs="27946">MotiesTweedeKamer_Updatable</collection>
15:50:47 [EMAIL PROTECTED]:/tmp $ mclient -lx -s'count(pf:collection("MotiesTweedeKamer"))'
1
15:51:01 [EMAIL PROTECTED]:/tmp $ mclient -lx -g -s'count(pf:collection("MotiesTweedeKamer"))'
1
15:51:06 [EMAIL PROTECTED]:/tmp $ mclient -lx -s'count(pf:collection("MotiesTweedeKamer")//document)'
27946
15:51:20 [EMAIL PROTECTED]:/tmp $ mclient -lx -g -s'count(pf:collection("MotiesTweedeKamer")//document)'
27946
15:51:26 [EMAIL PROTECTED]:/tmp $ mclient -lx -s'count(pf:collection("MotiesTweedeKamer")/*)'
0
15:51:34 [EMAIL PROTECTED]:/tmp $ mclient -lx -g -s'count(pf:collection("MotiesTweedeKamer")/*)'
0
15:51:39 [EMAIL PROTECTED]:/tmp $ mclient -lx -s'count(pf:collection("MotiesTweedeKamer")/*/*)'
0
15:51:44 [EMAIL PROTECTED]:/tmp $ mclient -lx -g -s'count(pf:collection("MotiesTweedeKamer")/*/*)'
0
15:51:47 [EMAIL PROTECTED]:/tmp $ mclient -lx -s'doc(pf:collection("MotiesTweedeKamer")/*)'
MAPI  = [EMAIL PROTECTED]:50000
QUERY = doc(pf:collection("MotiesTweedeKamer")/*)
ERROR = !type error: no variant of function fn:doc accepts the given argument type(s): string?
        !type error: maybe you meant:
        !type error:   fn:doc (string?) as document { node }?
        !type error: illegal arguments for function fn:doc
15:51:57 [EMAIL PROTECTED]:/tmp $ mclient -lx -g -s'doc(pf:collection("MotiesTweedeKamer")/*)'
MAPI  = [EMAIL PROTECTED]:50000
QUERY = doc(pf:collection("MotiesTweedeKamer")/*)
ERROR = !type error: no variant of function fn:doc accepts the given argument type(s): string?
        !type error: maybe you meant:
        !type error:   fn:doc (string?) as document { node }?
        !type error: illegal arguments for function fn:doc
15:52:04 [EMAIL PROTECTED]:/tmp $ mclient -lx -s'count(pf:collection("MotiesTweedeKamer")/node())'
27946
15:52:13 [EMAIL PROTECTED]:/tmp $ mclient -lx -g -s'count(pf:collection("MotiesTweedeKamer")/node())'
27946
15:52:18 [EMAIL PROTECTED]:/tmp $ mclient -lx -s'count(pf:collection("MotiesTweedeKamer")/node()/document)'
27946
15:52:25 [EMAIL PROTECTED]:/tmp $ mclient -lx -g -s'count(pf:collection("MotiesTweedeKamer")/node()/document)'
27946
========

!??


----------------------------------------------------------------------

Comment By: Stefan Manegold (stmane)
Date: 2008-10-08 15:31

Message:
Loredana,

(1)
The performance differences you experience are most probably not caused by
fn:collection() vs. pf:collection() but by ALG vs. MPS (see also (2)) ---
just try MPS with pf:collection() and it should show the same performance
as MPS with fn:collection().

(2)
Since MPS and ALG use quite different ways & techniques to translate and
optimize queries, it is expected that they show performance differences
(sometimes severe ones) on the same query; we (plus the Pathfinder folks in
Tübingen) have to check, what goes wrong with ALG in case of your query
...
Thanks for reporting it.

(3)
We also have to check the unexpected(?) behaviour of pf:collection().
Thanks for reporting it.


----------------------------------------------------------------------

Comment By: Loredana Afanasiev (lafanasi)
Date: 2008-10-08 15:13

Message:
Subject:    fn:collection() vs pf:collection()

Hi Lefteris and all,

I think this is relevant to this bug report.

I get a huge difference in execution times when switching from
fn:collection() to pf:collection(). Can you please advise me how to avoid
the long running times on the algebra version?

thanks in advance,
l.


[EMAIL PROTECTED] xq]$ more q9-fn.xq 
                let $col := fn:collection("MotiesTweedeKamer")
                for $y in distinct-values(for $y in $col//hiddendatum
                    return substring-before(fn:string($y),'.'))
                let $thisyear := $col//document
                    [substring-before(fn:string(.//hiddendatum[1]),'.')=$y]
                let $partij := distinct-values($thisyear//partij)
                for $p in $partij
                let $aantalingediendemoties :=
                    count($thisyear[.//indienergnlod//partij=$p])
                let $aantalmedeingediendemoties := 
                    count($thisyear[.//medeindienergnlod//partij=$p])
                order by $y descending,
                    $aantalingediendemoties descending,
                    $aantalmedeingediendemoties descending
                return 
                <aantal jaar='{$y}'
                  partij='{$p}' 
                  aantalingediendemoties='{$aantalingediendemoties}'
                 
aantalmedeingediendemoties='{$aantalmedeingediendemoties}'
                />

[EMAIL PROTECTED] xq]$ mclient -lx -g -t q9-fn.xq

Trans      35.174 msec
Shred       0.000 msec
Query    4401.659 msec
Print       2.680 msec
Timer    4487.610 msec 


[EMAIL PROTECTED] xq]$ more q9-pf.xq 
                let $col := pf:collection("MotiesTweedeKamer")
                for $y in distinct-values(for $y in $col//hiddendatum
                    return substring-before(fn:string($y),'.'))
                let $thisyear := $col//document
                    [substring-before(fn:string(.//hiddendatum[1]),'.')=$y]
                let $partij := distinct-values($thisyear//partij)
                for $p in $partij
                let $aantalingediendemoties :=
                    count($thisyear[.//indienergnlod//partij=$p])
                let $aantalmedeingediendemoties := 
                    count($thisyear[.//medeindienergnlod//partij=$p])
                order by $y descending,
                    $aantalingediendemoties descending,
                    $aantalmedeingediendemoties descending
                return 
                <aantal jaar='{$y}'
                  partij='{$p}' 
                  aantalingediendemoties='{$aantalingediendemoties}'
                 
aantalmedeingediendemoties='{$aantalmedeingediendemoties}'
                />

[EMAIL PROTECTED] xq]$ mclient -lx -t q9-pf.xq

Timer  584185.070 msec

[EMAIL PROTECTED] xq]$ diff q9-fn.xq q9-pf.xq 
1c1
<                 let $col := fn:collection("MotiesTweedeKamer")
---
>                 let $col := pf:collection("MotiesTweedeKamer")
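
To put the two Timer values above side by side, here is a small calculation. It uses only the numbers reported in this message; the variable names are made up.

```python
# Timer values copied from the two runs above (milliseconds).
timer_with_g_ms = 4487.610       # mclient -lx -g -t q9-fn.xq
timer_without_g_ms = 584185.070  # mclient -lx -t q9-pf.xq

slowdown = timer_without_g_ms / timer_with_g_ms
minutes_without_g = timer_without_g_ms / 1000 / 60

print(f"slowdown factor: {slowdown:.0f}x")
print(f"run without -g: {minutes_without_g:.1f} minutes")
```

So the run without -g takes close to ten minutes, roughly 130 times longer than the run with -g.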


Besides this there is something strange happening with pf:collection().
The website says:
"
pf:collection() returns a single special collection node, whose immediate
children are the document nodes. Therefore, fn:collection("my-collection")
is roughly equivalent to pf:collection("my-collection")/*. 
"

While I get:

[EMAIL PROTECTED] xq]$ mclient -lx -g -s 'count(fn:collection("MotiesTweedeKamer"))'
27946
[EMAIL PROTECTED] xq]$ mclient -lx -g -s 'count(pf:collection("MotiesTweedeKamer"))'
1
[EMAIL PROTECTED] xq]$ mclient -lx -g -s 'count(pf:collection("MotiesTweedeKamer")/*)'
0
[EMAIL PROTECTED] xq]$ mclient -lx -s 'count(pf:collection("MotiesTweedeKamer")/*)'
0
[EMAIL PROTECTED] xq]$ mclient -lx -s 'count(pf:collection("MotiesTweedeKamer")/*/*)'
0
[EMAIL PROTECTED] xq]$ mclient -lx -s 'count(pf:collection("MotiesTweedeKamer")/*/document)'
0
[EMAIL PROTECTED] xq]$ mclient -lx -s 'count(pf:collection("MotiesTweedeKamer")//document)'
27946



----------------------------------------------------------------------

_______________________________________________
Monetdb-bugs mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/monetdb-bugs
