Bugs item #1636588, was opened at 2007-01-16 10:03
Message generated for change (Comment added) made by boncz
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=482468&aid=1636588&group_id=56967
Category: PF general
Group: Pathfinder CVS Head
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Arthur van Bunningen (arthurvb)
Assigned to: Nobody/Anonymous (nobody)
Summary: Predicate selects too few nodes
Initial Comment:
Introduction:
The attached file p.xml contains different "possible worlds" about movies
together with their probability. The goal is to return the movies that are
thriller movies together with the probability that they are thriller movies. We
do this by summing, for each movie that is a thriller in some world, the
probabilities of all the worlds in which it actually is a thriller movie.
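As an illustration of the summation described above, here is a minimal Python
sketch. The worlds data and the function name thriller_probabilities are made
up for this example and are not taken from p.xml.

```python
from collections import defaultdict

# Hypothetical worlds: (probability, titles that are thrillers in that world).
# These values are invented for illustration only.
worlds = [
    (0.5, {"King Kong", "Die Hard"}),
    (0.25, {"King Kong"}),
    (0.25, set()),
]


def thriller_probabilities(worlds):
    """Sum, per title, the probabilities of the worlds where it is a thriller."""
    totals = defaultdict(float)
    for prob, titles in worlds:
        for title in titles:
            totals[title] += prob
    return dict(totals)


result = thriller_probabilities(worlds)
```

A title that is a thriller in every world would end up with a total equal to
the sum of all world probabilities.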
The problem:
We run the following query (attached as sfquery.xq) in Pathfinder 0.15.0
(HEAD version, checked out last Friday):
let $worlds :=
  exactly-one(doc("/home/db/bunninge/phd/implementation/xquery/maurice/p.xml")/worlds)
let $ws := for $w in $worlds/world
           return
             <r>{$w/@prob}{$w//genre[.="Thriller"]/ancestor::movie/title}</r>
for $v in distinct-values($ws/title)
let $rank := sum($ws[//title=$v]/@prob)
return
  <res rank="{$rank}">{$v}</res>
This produces the following results (attached as sfresults_pf.xml):
<res rank="0.012197519800210664">King Kong</res>,
<res rank="0.012197519800210664">Die Hard: With a Vengeance</res>,
<res rank="0.0067562687537343277">Die Hard</res>,
<res rank="0.0054412510464763482">Die Hard 2</res>
These results seem to be wrong: if we just return the nodes in $ws (using
sfquery_ws.xq, as attached), we can see that there are "worlds" in which King
Kong is a thriller with a probability higher than 0.35.
Testing in Qizx/open studio seems to give more reasonable results for
sfquery.xq (see sfresults_qizx.xml):
<?xml version='1.0' encoding='UTF-8'?>
<res rank="1.000000000000013">King Kong</res>
<res rank="1.000000000000013">Die Hard: With a Vengeance</res>
<res rank="0.7941176470588303">Die Hard 2</res>
<res rank="0.7941176470588311">Die Hard</res>
I did some tests, and it seemed that even using only a name-test predicate on
title inside the sum does not give all results, whereas it does work when the
test is done in the where clause. However, since I am not able to test this
extensively, I will just report the original problem.
Thank you in advance,
Arthur
----------------------------------------------------------------------
>Comment By: Peter Boncz (boncz)
Date: 2007-01-16 12:32
Message:
Logged In: YES
user_id=591107
Originator: NO
Hi Arthur,
The query gives the expected result if you use
sum($ws[.//title=$v]/@prob)
instead of
sum($ws[//title=$v]/@prob)
That is, the context node for //title is not the $ws element, but
fn:root($ws).
However, Qizx is right about the answer, because in this case, each $ws is
in fact the root of a newly created fragment.
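To illustrate the difference between the two predicates, here is a minimal
Python sketch using xml.etree.ElementTree. It assumes, hypothetically, that all
<r> elements share one root; the sample document, the probabilities and the
helper names are invented. rooted_match models //title=$v (search from the root
of the tree that holds the context node) and relative_match models
.//title=$v (search below the context node only).

```python
import xml.etree.ElementTree as ET

# Invented sample: two <r> elements living under one shared root.
doc = ET.fromstring(
    "<results>"
    "<r prob='0.25'><title>King Kong</title></r>"
    "<r prob='0.75'><title>Die Hard</title></r>"
    "</results>"
)


def rooted_match(root, value):
    # Models $ws[//title = $v]: titles are searched from the tree root,
    # so the outcome is the same for every context node in that tree.
    return any(t.text == value for t in root.iter("title"))


def relative_match(item, value):
    # Models $ws[.//title = $v]: titles are searched below the item itself.
    return any(t.text == value for t in item.iter("title"))


items = doc.findall("r")
rooted_sum = sum(
    float(r.get("prob")) for r in items if rooted_match(doc, "King Kong")
)
relative_sum = sum(
    float(r.get("prob")) for r in items if relative_match(r, "King Kong")
)
```

With a shared root the two sums differ, because the rooted predicate accepts
every <r>. When each <r> is the root of its own fragment, as in the reported
query, both predicates search the same subtree and the sums should coincide.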
There was a bug in the element construction of mps (and algebra!).
FRAG_ROOT contains:
head => the OPEN_DOCID, which allows to find documents within a
collection.
tail => the root NID
In case of the temporary doc collection, the heads are all oid_nil (we
cannot find tmp fragments with the doc() function) and the tails are PREs
(PRE=NID=RID for temporary docs).
So basically, the format of FRAG_ROOT had been reversed in 4.14, and we
forgot to update the node construction code.
Fixed now.
----------------------------------------------------------------------
Comment By: Arthur van Bunningen (arthurvb)
Date: 2007-01-16 12:07
Message:
Logged In: YES
user_id=1045637
Originator: YES
I added the group Pathfinder CVS Head (checked out last Friday).
----------------------------------------------------------------------
Comment By: Stefan Manegold (stmane)
Date: 2007-01-16 10:26
Message:
Logged In: YES
user_id=572415
Originator: NO
out of "pure curiosity":
which version of MonetDB/XQuery are you using / referring to?
(Please see / set / use the "Group:" in the header.)
Thanks!
----------------------------------------------------------------------
Comment By: Arthur van Bunningen (arthurvb)
Date: 2007-01-16 10:06
Message:
Logged In: YES
user_id=1045637
Originator: YES
File Added: sfquery_ws.xq
----------------------------------------------------------------------
Comment By: Arthur van Bunningen (arthurvb)
Date: 2007-01-16 10:04
Message:
Logged In: YES
user_id=1045637
Originator: YES
File Added: sfquery.xq
----------------------------------------------------------------------