Bugs item #1636588, was opened at 2007-01-16 10:03
Message generated for change (Comment added) made by stmane
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=482468&aid=1636588&group_id=56967
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: PF general
Group: Pathfinder 0.24
Status: Open
Resolution: Fixed
Priority: 6
Private: No
Submitted By: Arthur van Bunningen (arthurvb)
Assigned to: Jan Rittinger (tsheyar)
>Summary: XQ: Predicate selects too few nodes
Initial Comment:
Introduction:
The attached file p.xml contains different "possible worlds" about movies
together with their probability. The goal is to return the movies that are
thriller movies together with the probability that they are thriller movies. We
do this by summing, for each movie that is a thriller movie (in some world), the
probabilities of all the worlds in which it actually is a thriller movie.
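As an illustration of the intended computation only, here is a minimal Python sketch. The three worlds and their probabilities are made up for the example and are not taken from p.xml; the function name thriller_probabilities is likewise hypothetical.

```python
from collections import defaultdict

# Hypothetical data (NOT from p.xml): each world is a pair of
# (probability, set of titles that are thrillers in that world).
worlds = [
    (0.5, {"King Kong", "Die Hard"}),
    (0.25, {"King Kong"}),
    (0.25, {"Die Hard 2"}),
]


def thriller_probabilities(worlds):
    """For each title, sum the probabilities of the worlds in which it is a thriller."""
    totals = defaultdict(float)
    for prob, thrillers in worlds:
        for title in thrillers:
            totals[title] += prob
    return dict(totals)


ranks = thriller_probabilities(worlds)
for title, rank in sorted(ranks.items()):
    print(title, rank)
```

A title that is a thriller in every world would get a rank equal to the sum of all world probabilities, which is the behaviour expected from the query below.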
The problem:
We run the following query (attached as sfquery.xq) in Pathfinder 0.15.0
(HEAD version, checked out last Friday):
let $worlds :=
exactly-one(doc("/home/db/bunninge/phd/implementation/xquery/maurice/p.xml")/worlds)
let $ws := for $w in $worlds/world
return
<r>{$w/@prob}{$w//genre[.="Thriller"]/ancestor::movie/title}</r>
for $v in distinct-values($ws/title)
let $rank := sum($ws[//title=$v]/@prob)
return
<res rank="{$rank}">{$v}</res>
This produces the following results (attached as sfresults_pf.xml):
<res rank="0.012197519800210664">King Kong</res>,
<res rank="0.012197519800210664">Die Hard: With a Vengeance</res>,
<res rank="0.0067562687537343277">Die Hard</res>,
<res rank="0.0054412510464763482">Die Hard 2</res>
These results seem to be wrong: if we just return the nodes in $ws (using
sfquery_ws.xq as attached), we can see that there are "worlds" in which King
Kong is a thriller with a probability higher than 0.35.
Testing in Qizx/open studio seems to give more reasonable results for
sfquery.xq (see sfresults_qizx.xml):
<?xml version='1.0' encoding='UTF-8'?>
<res rank="1.000000000000013">King Kong</res>
<res rank="1.000000000000013">Die Hard: With a Vengeance</res>
<res rank="0.7941176470588303">Die Hard 2</res>
<res rank="0.7941176470588311">Die Hard</res>
I did some tests, and it seems that even using only a name-test predicate on
title in the sum does not give all results, whereas it does work when the test
is done in the where-clause. However, since I am not able to test this
extensively, I will just report the original problem.
Thank you in advance,
Arthur
----------------------------------------------------------------------
>Comment By: Stefan Manegold (stmane)
Date: 2008-06-02 00:02
Message:
Logged In: YES
user_id=572415
Originator: NO
See also
[ 1751684 ] XQ: merged_union error with fn:sum()
https://sourceforge.net/tracker/index.php?func=detail&aid=1751684&group_id=56967&atid=482468
----------------------------------------------------------------------
Comment By: Stefan Manegold (stmane)
Date: 2008-06-01 23:39
Message:
Logged In: YES
user_id=572415
Originator: NO
Re-opening:
While working fine with the MPS back-end, the test fails with the ALG
back-end, mainly as it seems to calculate different ranks (via fn:sum());
cf.
http://monetdb.cwi.nl/testing/projects/monetdb/Stable/pathfinder/.mTestsG103/GNU.64.64.d-Fedora8/tests_BugTracker/predicate_selects_too_few_nodes.SF-1636588.out.00.html
----------------------------------------------------------------------
Comment By: Stefan Manegold (stmane)
Date: 2007-01-17 18:30
Message:
Logged In: YES
user_id=572415
Originator: NO
Arthur,
see `Mtest.py --help`, pathfinder/HowToStart-PF, and/or
http://monetdb.cwi.nl/Development/TestWeb/Background/index.html &
http://monetdb.cwi.nl/monet/src/testing/README for details on how to run
tests.
Further, simply looking at
pathfinder/tests/BugTracker/Tests/predicate_selects_too_few_nodes.SF-1636588*.stable.out
probably tells you whether this is the output that you expect.
Given that it seems to work for you, I feel free to close this bug
report.
Stefan
----------------------------------------------------------------------
Comment By: Arthur van Bunningen (arthurvb)
Date: 2007-01-16 22:32
Message:
Logged In: YES
user_id=1045637
Originator: YES
I do not know how to test the test scripts, but at least I can confirm
that the result of my original query is correct in the latest stable
build.
Thank you.
Regards,
Arthur
----------------------------------------------------------------------
Comment By: Stefan Manegold (stmane)
Date: 2007-01-16 17:30
Message:
Logged In: YES
user_id=572415
Originator: NO
added tests in
pathfinder/tests/BugTracker/Tests/predicate_selects_too_few_nodes.SF-1636588.*
pathfinder/tests/BugTracker/Tests/predicate_selects_too_few_nodes.SF-1636588_ws.*
seem to work fine.
Peter, Arthur,
could you please check/verify whether the provided stable outputs are indeed
correct / as expected?
(XQuery_0-14 branch; propagation to development trunk will follow as soon
as the recently introduced bugs/problems in the Stable branch are fixed)
----------------------------------------------------------------------
Comment By: Peter Boncz (boncz)
Date: 2007-01-16 12:32
Message:
Logged In: YES
user_id=591107
Originator: NO
Hi Arthur,
The query gives the expected result if you use
sum($ws[.//title=$v]/@prob)
instead of
sum($ws[//title=$v]/@prob)
That is, the context node for //title is not the $ws element, but
fn:root($ws).
However, Qizx is right about the answer, because in this case, each $ws is
in fact the root of a newly created fragment.
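The difference between the two predicates can be modelled with a short Python sketch. This is only an approximation of the XQuery semantics, not Pathfinder or Qizx code: the elements, probabilities, and helper names (make_r, root_of, rank) are assumptions made for the example. Searching from the context node stands in for .//title, and searching from the root of the tree that holds the context node stands in for //title.

```python
import xml.etree.ElementTree as ET


def make_r(prob, titles):
    """Build a hypothetical <r prob="..."> element with one <title> per entry."""
    r = ET.Element("r", {"prob": str(prob)})
    for t in titles:
        ET.SubElement(r, "title").text = t
    return r


def has_title(start, value):
    """True if some <title> at or below `start` has the text `value`."""
    return any(t.text == value for t in start.iter("title"))


def root_of(node, parent):
    """Follow the child -> parent map upwards until the tree root is reached."""
    while node in parent:
        node = parent[node]
    return node


def rank(rs, value, start_of):
    """Sum @prob over the <r> elements whose search start node contains `value`."""
    return sum(float(r.get("prob")) for r in rs if has_title(start_of(r), value))


rs = [make_r(0.5, ["A", "B"]), make_r(0.25, ["A"]), make_r(0.25, ["C"])]

# .//title: search below the context node itself.
relative = rank(rs, "A", lambda r: r)

# //title when each <r> is the root of its own fragment: same start node.
no_parents = {}
from_own_root = rank(rs, "A", lambda r: root_of(r, no_parents))

# //title when all <r> elements hang under one shared root: every <r> matches.
shared = ET.Element("doc")
shared.extend(rs)
parents = {child: p for p in shared.iter() for child in p}
from_shared_root = rank(rs, "A", lambda r: root_of(r, parents))

print(relative, from_own_root, from_shared_root)
```

In this model the two forms agree as long as every constructed element is the root of its own fragment, and they diverge once the elements share a root.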
There was a bug in the element construction of mps (and algebra!).
FRAG_ROOT contains:
head => the OPEN_DOCID, which allows to find documents within a
collection.
tail => the root NID
In case of the temporary doc collection, the heads are all oid_nil (we
cannot find tmp fragments with the doc() function) and the tails are PREs
(PRE=NID=RID for temporary docs).
So basically, the format of FRAG_ROOT had been reversed in 4.14, and we
forgot to update the node construction code.
fixed now
----------------------------------------------------------------------
Comment By: Arthur van Bunningen (arthurvb)
Date: 2007-01-16 12:07
Message:
Logged In: YES
user_id=1045637
Originator: YES
I added the group Pathfinder CVS Head (checked out last Friday).
----------------------------------------------------------------------
Comment By: Stefan Manegold (stmane)
Date: 2007-01-16 10:26
Message:
Logged In: YES
user_id=572415
Originator: NO
out of "pure curiosity":
which version of MonetDB/XQuery are you using / referring to?
(Please see / set / use the "Group:" in the header.)
Thanks!
----------------------------------------------------------------------
Comment By: Arthur van Bunningen (arthurvb)
Date: 2007-01-16 10:06
Message:
Logged In: YES
user_id=1045637
Originator: YES
File Added: sfquery_ws.xq
----------------------------------------------------------------------
Comment By: Arthur van Bunningen (arthurvb)
Date: 2007-01-16 10:04
Message:
Logged In: YES
user_id=1045637
Originator: YES
File Added: sfquery.xq
----------------------------------------------------------------------