Re: [datatable-help] integer64 group by doesn't find all groups

Matt Dowle Sun, 02 Mar 2014 04:27:07 -0800


On 14/02/14 15:07, Yike Lu wrote:

Thanks for the info guys! Wondering if there's any way I can help?

Thanks for your offer. The function iradix in forder.c needs copying and tweaking to become i64radix (8 passes instead of 4), or making general so that 4 or 8 can be passed in. Should also check first how the bit64 package sorts integer64. Then in bmerge.c add a case to the switch for integer64 to cast to long long, add tests to tests.Rraw for grouping and joining, update documentation (.Rd) files and add checks to init.c.
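
To make the "8 passes instead of 4" point concrete, here is a minimal sketch of a least-significant-byte-first radix sort over 64-bit integers. It is not the iradix code from forder.c: the name i64_radix_sort, the in-place sorting of values (rather than returning an order vector) and the sign-bit flip are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper, NOT data.table's iradix: sorts x[0..n-1] ascending
 * using 8 counting-sort passes of 8 bits each (a 32-bit integer would
 * need only 4 such passes). Returns 0 on success, -1 if malloc fails. */
int i64_radix_sort(int64_t *x, size_t n)
{
    const uint64_t sign_bit = UINT64_C(1) << 63;
    assert(x != NULL || n == 0);
    if (n < 2) return 0;

    uint64_t *a = malloc(n * sizeof *a);
    uint64_t *b = malloc(n * sizeof *b);
    if (a == NULL || b == NULL) { free(a); free(b); return -1; }

    /* Flip the sign bit so that unsigned order matches signed order. */
    for (size_t i = 0; i < n; i++)
        a[i] = (uint64_t)x[i] ^ sign_bit;

    for (int pass = 0; pass < 8; pass++) {
        size_t count[256] = {0};
        int shift = pass * 8;

        /* Histogram of the current byte. */
        for (size_t i = 0; i < n; i++)
            count[(a[i] >> shift) & 0xFF]++;

        /* Turn counts into starting offsets (exclusive prefix sum). */
        size_t sum = 0;
        for (int k = 0; k < 256; k++) {
            size_t c = count[k];
            count[k] = sum;
            sum += c;
        }

        /* Stable scatter into the other buffer, then swap buffers. */
        for (size_t i = 0; i < n; i++)
            b[count[(a[i] >> shift) & 0xFF]++] = a[i];
        uint64_t *tmp = a; a = b; b = tmp;
    }

    /* After each swap the latest result is in a; undo the sign flip.
     * The conversion back to int64_t wraps on mainstream compilers. */
    for (size_t i = 0; i < n; i++)
        x[i] = (int64_t)(a[i] ^ sign_bit);

    free(a);
    free(b);
    return 0;
}
```

Making the pass count a parameter (4 or 8), as suggested above, would amount to replacing the literal 8 in the loop bound and adjusting the element type accordingly.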

Is that something you could do? If you are rusty on C I don't mind guiding you through.


Matt

On Wed, Feb 12, 2014 at 11:17 AM, [email protected] <mailto:[email protected]> <[email protected] <mailto:[email protected]>> wrote:


    Yes this isn't a data.table criticism, just a bit64 one in general.


    On Wed Feb 12 2014 at 11:39:47 AM, Matt Dowle
    <[email protected] <mailto:[email protected]>> wrote:


        Sometimes we take the hard road in data.table, to get to a
        better place.  Once bit64::integer64 is fully supported, it'll
        be much easier.   All the recent radix work for double applies
        almost automatically to integer64 for example,  but that radix
        work had to be done first.


        On 12/02/14 16:26, [email protected] <mailto:[email protected]>
        wrote:

        FYI (and this is a long outstanding argument) this is why I
        don't like the bit64 package.  These sorts of errors happen
        silently.  I understand that data.table can't use the other
        integer64 package, but at least there it is obvious when
        things are being coerced.

        In my situations, if I am grouping by a int64, it is usually
        either an ID so I can just make it a character vector
        instead, or it is something where I don't mind lost precision
        so I just make it numeric.
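
The "lost precision" trade-off mentioned above can be sketched as follows. This is only an illustration of converting a 64-bit integer to a double and back; the function name round_trips_through_double is hypothetical, and the sketch assumes IEEE 754 doubles with a 53-bit significand.

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 if v survives int64 -> double -> int64
 * unchanged, 0 otherwise. Assumes IEEE 754 binary64 doubles. */
int round_trips_through_double(int64_t v)
{
    double d = (double)v;

    /* 2^63 is not representable in int64_t; converting it back would be
     * undefined behaviour, so treat it as a failed round trip. */
    if (d >= 9223372036854775808.0)
        return 0;

    return (int64_t)d == v;
}
```

Values up to 2^53 in magnitude convert exactly; beyond that, neighbouring integers can collapse onto the same double, which is why this route only suits cases where the low digits do not matter.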

        On Wed Feb 12 2014 at 11:22:40 AM, Matt Dowle
        <[email protected] <mailto:[email protected]>> wrote:


            Hi,

            You're doing nothing wrong.  Although you can load integer64
            using fread and create them directly, data.table's grouping
            and keys don't work on them yet.  Sorry, just not yet
            implemented.  Because integer64 values are internally stored
            as type double (a good idea by package bit64), data.table
            sees them internally as double and doesn't catch that the
            type isn't supported yet (hence no error message such as you
            get for type 'complex').  The particular integer64 numbers in
            this example are quite small so will use only the lower bits.
            Read as a double, those are the least significant bits of the
            significand, which would explain why only one group comes out
            here, since data.table groups and joins floating point data
            within tolerance.
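
The effect described above can be reproduced outside R with a small sketch. It reinterprets the bit pattern of a small 64-bit integer as a double and compares the results with an absolute tolerance. The helper names and the tolerance passed in are assumptions for illustration only; they are not data.table's actual comparison rule.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: view the 8 bytes of an int64 as a double, which is
 * how the values appear to code that only sees the storage type. */
double bits_as_double(int64_t v)
{
    double d;
    memcpy(&d, &v, sizeof d);   /* well-defined type punning */
    return d;
}

/* Hypothetical helper: 1 if a and b differ by at most tol, else 0.
 * The tolerance is an illustrative assumption, not data.table's rule. */
int same_group_within(double a, double b, double tol)
{
    double diff = a - b;
    if (diff < 0) diff = -diff;
    return diff <= tol;
}
```

The integers 0, 1 and 2 map to zero and two subnormal doubles of the order of 1e-324, so any realistic tolerance treats them as a single value, even though the underlying integers are distinct.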

            Matt

            On 06/02/14 23:38, Yike Lu wrote:
            > After a long hiatus, I am back to using data.table.
            > Unfortunately, I've encountered a problem. Am I doing
            > something wrong here?
            >
            > require(data.table)
            >
            > dt = data.table(idx = 1:100 %% 3, 1:100)
            > dt[, list(sum(V2)), by = idx]
            > # normal
            >
            > require(bit64)
            >
            > dt2 = data.table(idx = integer64(100) + 1:100 %% 3, 1:100)
            > dt2[, list(sum(V2)), by = idx]
            > # only has one group:
            > #   idx   V1
            > #1:   1 5050
            >

            _______________________________________________
            datatable-help mailing list
            [email protected]
            <mailto:[email protected]>
            
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
