Bugs item #2728133, was opened at 2009-04-03 13:52
Message generated for change (Comment added) made by stmane
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=482468&aid=2728133&group_id=56967

Category: Core
Group: MonetDB4 "stable"
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Wouter Alink (vzzzbx)
Assigned to: Nobody/Anonymous (nobody)
Summary: M4: count() returns int

Initial Comment:
To scale PF/TIJAH (and also the rest of pathfinder/M4) beyond the magical 
31-bit boundary, the count() function needs to return a wrd instead of an int. 
PF/TIJAH hits the 2G border earlier because it counts each word in an XML 
element, whereas pathfinder counts a text element as a single node. 
In M5 this was apparently designed correctly from the start. In M4 it seems to 
be a legacy issue. 
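
The 31-bit boundary can be illustrated with a small C sketch. It is not taken 
from the MonetDB sources: the helper names are hypothetical, and int32_t and 
int64_t merely stand in for MIL's int and for a wider count type such as wrd 
(assuming a 64-bit platform).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: does an element count fit in a signed 32-bit int? */
bool count_fits_int32(int64_t count)
{
    return count >= 0 && count <= INT32_MAX;   /* INT32_MAX == 2^31 - 1 */
}

/* Hypothetical helper: sum per-element word counts in a 64-bit accumulator,
 * so the total cannot wrap at the 2^31 boundary. */
int64_t total_words(const int32_t *words_per_element, size_t n)
{
    int64_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += words_per_element[i];
    return total;
}
```

A count of 2^31 - 1 still fits in an int. One more element does not, which is 
why counting every word reaches the limit sooner than counting nodes.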

A simple analysis showed that the pathfinder code contains roughly 584 
occurrences of the 'count()' function (C and MIL invocations). PF/TIJAH 
contains roughly 155. Most of them seem easy to replace. Stefan estimated 
the change to be a weekend's work. He came up with a simple replacement 
function:

    PROC count_wrd(BAT b) : wrd {
        return wrd(count(int(b)));
    }

It is debatable whether to actually implement this fix, as the M4 code is 
basically end-of-life, and M5 does not exhibit this bug. A plan needs to be 
made. To be continued.

----------------------------------------------------------------------

>Comment By: Stefan Manegold (stmane)
Date: 2009-04-03 14:28

Message:
- GDK C function BATcount() does return a BUN, not an int.

- in MIL, we have
COMMAND:   count(BAT[any,any]) : int
MODULE:    algebra
COMPILED:  by adm on Wed Apr  1 23:29:46 2009
Returns the number of elements currently in a BAT.

COMMAND:   count(int) : lng
MODULE:    bat
COMPILED:  by adm on Wed Apr  1 23:29:46 2009
Returns the current size (in number of elements) of a BAT.

- my correct quote was "If you have a weekend, you're welcome to do it
[i.e., change the count(BAT[any,any]) to return a wrd instead of an int
--- and fix all MIL code that uses count(BAT[any,any]), accordingly --- and
for consistency change all MIL functions that return or expect a count or
index of BUN(s) (think of grouped counts for aggregation, slice, etc.)
likewise]" --- don't know, though, whether a weekend is sufficient to do it
all correctly & consistently ...


----------------------------------------------------------------------
