Re: [datatable-help] Error in coercing matrices within j expressions

Nathaniel Graham Tue, 17 Sep 2013 15:15:39 -0700

It hadn't occurred to me to use CJ(), so I'll tinker with that this evening
and see if there are any gains to be made there.  In theory it's highly
parallelizable, and one of the posts Matthew points to in his comments (in
the post you reference) shows a way that it can be done (using the old
multicore library, so I'm not exactly sure how it maps to the parallel
library).  In my case, though, the whole process appears to be memory bound
rather than CPU bound.  Since my machine is already fairly well equipped
(i7-4770 with 4x8GB DDR3-1600), I just don't think it's going to get
dramatically faster.  That doesn't mean I won't try...
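
For reference, here is a minimal sketch of the CJ() approach Frank suggests in the quoted message below. The inputs iset and jset and the test criterion() are hypothetical placeholders, not the actual data or comparison from this thread, and criterion() is assumed to be vectorized over whole columns.

```r
library(data.table)

# Hypothetical inputs: stand-ins for the real index sets.
iset <- 1:4
jset <- 1:4

# Hypothetical, vectorized test of whether a pair is "interesting".
criterion <- function(i, j) (i + j) %% 3L == 0L

# CJ() builds the cross join: one row per (i, j) combination.
pairs <- CJ(i = iset, j = jset)

# Evaluate the criterion once over the whole columns and keep matching rows.
interesting <- pairs[criterion(i, j)]
print(interesting)
```

Note that CJ() materializes all length(iset) * length(jset) rows before filtering, so on a memory-bound job this may not help for large sets.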


-------
Nathaniel Graham
[email protected]
[email protected]


On Tue, Sep 17, 2013 at 5:52 PM, Frank Erickson <[email protected]> wrote:

> Maybe not ultrafast, but with nice syntax:
>
> CJ(i=iset,j=jset)[criterion(i,j)]
>
> I guess it should be parallelizable, but that wouldn't be with data.table,
> if I understand this correctly:
> http://stackoverflow.com/questions/14759905/data-table-and-parallel-computing
>
>
> On Tue, Sep 17, 2013 at 5:42 PM, Nathaniel Graham <[email protected]> wrote:
>
>> Oops; I meant to reply to all, and then forgot after I discarded and
>> rewrote my message a few times.  I suspect (although I'm not absolutely
>> certain) that if NULL or similar did the same thing as returning a 0-row
>> data.table with the appropriate number of columns, some operations could
>> be sped up a bit.  In those cases, the data.table code wouldn't need to
>> check the number and type of the columns returned.
>>
>> I suspect that unless someone knows a secret, ultrafast way to iterate
>> through a list of all combinations of a set of items and return the subset
>> of those that match some criteria, I'm as close to optimal as I'm likely
>> to get right now.
>>
>>
>> -------
>> Nathaniel Graham
>> [email protected]
>> [email protected]
>>
>>
>> On Tue, Sep 17, 2013 at 5:22 PM, Frank Erickson <[email protected]> wrote:
>>
>>> Well, rbindlist(list()) says "Null data.table" (though it doesn't pass
>>> the is.null() test). Maybe someone else has an idea how to deal with the
>>> no-results case. By the way, it's best to use "reply to all" to make sure
>>> you reply to the mailing list, too; they should be able to see your message
>>> quoted below, though.
>>>
>>> --Frank
>>>
>>>
>>> On Tue, Sep 17, 2013 at 5:03 PM, Nathaniel Graham <[email protected]> wrote:
>>>
>>>> Frank,
>>>>
>>>> Thanks.  This seems to have done the trick, so long as I'm careful to
>>>> check for zero-length lists and return
>>>> data.table(i = integer(), j = integer()) in those cases.  Essentially,
>>>> I have to test every combination of i and j to see if it's
>>>> "interesting" or not, and some groups have a lot of rows.  At the
>>>> moment I'm attacking some other low-hanging fruit, like speeding up the
>>>> comparisons I have to do.
>>>>
>>>> As a side note, it would be kind of nice if there were a simple way to
>>>> clue data.table in to the fact that there are no rows to return, like
>>>> returning NULL or NA or similar.
>>>>
>>>> -------
>>>> Nathaniel Graham
>>>> [email protected]
>>>> [email protected]
>>>>
>>>
>
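
The zero-row workaround described in the quoted messages above can be sketched roughly as follows. This is only an illustration under stated assumptions: the table DT, its columns g and x, the helper find_pairs(), and the "differ by exactly 1" test are all hypothetical stand-ins for the real data and comparison, which are not shown in this thread.

```r
library(data.table)

# Hypothetical data: group "a" has matching pairs, group "b" has none.
DT <- data.table(g = c("a", "a", "a", "b", "b"),
                 x = c(1, 2, 3, 10, 20))

# Hypothetical helper: collect (i, j) index pairs whose values differ by 1.
find_pairs <- function(vals) {
  n <- length(vals)
  hits <- list()
  for (a in seq_len(n)) {
    for (b in seq_len(n)) {
      if (a < b && abs(vals[a] - vals[b]) == 1) {
        hits[[length(hits) + 1L]] <- list(i = a, j = b)
      }
    }
  }
  # rbindlist(list()) would give a "Null data.table" with no columns,
  # so return an explicitly typed zero-row table for the no-results case.
  if (length(hits) == 0L) {
    return(data.table(i = integer(), j = integer()))
  }
  rbindlist(hits)
}

# Groups with no hits simply contribute no rows to the result.
res <- DT[, find_pairs(x), by = g]
print(res)
```

The explicit integer() columns give the empty result the same column names and types as the non-empty groups, which is what the grouped j expression needs in order to combine them.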

_______________________________________________
datatable-help mailing list
[email protected]
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
