Bugs item #2728133, was opened at 2009-04-03 13:52
Message generated for change (Comment added) made by stmane
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=482468&aid=2728133&group_id=56967
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Core
Group: MonetDB4 "stable"
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Wouter Alink (vzzzbx)
Assigned to: Nobody/Anonymous (nobody)
Summary: M4: count() returns int
Initial Comment:
To scale PF/TIJAH (and also the rest of pathfinder/M4) beyond the magical
31-bit boundary, the count() function needs to return a wrd instead of an int.
PF/TIJAH hits the 2G limit earlier because it counts each word in
an XML element, whereas pathfinder counts a text element as a single node.
In M5 this apparently was designed correctly from the start. In M4 it seems
to be a legacy issue.
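For illustration only, the boundary described above can be sketched in plain C. This is not MonetDB or pathfinder code: fits_in_int32 and total_words are hypothetical helper names, and the per-element word counts are made-up inputs. The sketch assumes a count stored in a 32-bit signed int, which can hold at most 2^31 - 1, and shows that a total accumulated in a 64-bit type can exceed that limit.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true if a count is representable in a
 * 32-bit signed int, i.e. it lies in [0, 2^31 - 1]. */
bool fits_in_int32(int64_t count)
{
    return count >= 0 && count <= INT32_MAX;
}

/* Hypothetical helper: sum per-element word counts in a 64-bit
 * accumulator, so the total does not wrap at the 31-bit boundary. */
int64_t total_words(const int32_t *words_per_element, size_t n)
{
    int64_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += words_per_element[i];
    return total;
}
```

Counting every word rather than every text node makes the total grow faster, so a wider return type is needed sooner in that case.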
A simple analysis showed that the pathfinder code contains roughly 584
occurrences of the 'count()' function (C and MIL invocations). PF/TIJAH
contains roughly 155. Most of them seem easy to replace. Stefan estimated
the change to be a weekend's work. He came up with a simple replacement
function:
PROC count_wrd(BAT b) : wrd {
return wrd(count(int(b)));
}
It is debatable whether to actually implement this fix, as the M4 code is
basically end-of-life, and M5 does not exhibit this bug. A plan needs to be
made. To be continued.
----------------------------------------------------------------------
>Comment By: Stefan Manegold (stmane)
Date: 2009-04-03 14:28
Message:
- GDK C function BATcount() does return a BUN, not an int.
- in MIL, we have
COMMAND: count(BAT[any,any]) : int
MODULE: algebra
COMPILED: by adm on Wed Apr 1 23:29:46 2009
Returns the number of elements currently in a BAT.
COMMAND: count(int) : lng
MODULE: bat
COMPILED: by adm on Wed Apr 1 23:29:46 2009
Returns the current size (in number of elements) of a BAT.
- my correct quote was "If you have a weekend, you're welcome to do it
[i.e., change count(BAT[any,any]) to return a wrd instead of an int,
fix all MIL code that uses count(BAT[any,any]) accordingly, and
for consistency change all MIL functions that return or expect a count or
index of BUN(s) (think of grouped counts for aggregation, slice, etc.)
likewise]". I don't know, though, whether a weekend is sufficient to do it
all correctly & consistently ...
----------------------------------------------------------------------
------------------------------------------------------------------------------
_______________________________________________
Monetdb-bugs mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/monetdb-bugs