On Tue, Sep 13, 2011 at 7:01 AM, Laurent Gautier <lgaut...@gmail.com> wrote:
> On 2011-09-12 21:16, Artur Wroblewski wrote:
>>
>> On Mon, Sep 12, 2011 at 12:26 PM, Laurent Gautier<lgaut...@gmail.com>
>>  wrote:
>>>
>>> Probably not.
>>>
>>> R is doing a lot of things under the hood. Sometimes it is good,
>>> sometimes it is bad.
>>> The code snippet given to you has a quadratic time-complexity ( O(nm) ).
>>> It can be made linearithmic ( O(n log(m) ) ) simply:
>>>
>>> from rpy2.robjects.vector import BoolVector
>>> ref = set(differential)
>>> select_b = BoolVector(tuple(x in ref for x in source.rx2('gene')))
>>> mysubset = source.rx(select_b, True)
>>
>> If I understand correctly, BoolVector(...genexpr...) is not possible here
>> due to an R API limitation: we need the length of the iterable, don't we?
>
> You might have missed the call to tuple(). This will make it work
> independently of having a generator with a length.
> >>> tuple(x for x in [1,2,3])
> (1, 2, 3)

In the case of Luca's data (please correct me if I am wrong):
1. A tuple with 2 million items is created.
2. The tuple is iterated to create a BoolVector with 2 million items.
3. The vector and (hopefully) the data frame are iterated to filter the data.
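
The three steps above can be sketched in plain Python, without rpy2, to see where the intermediate objects come from. This is only an illustration: `build_mask`, `filter_rows`, `genes` and `rows` are made-up names, and a list of dicts stands in for the data frame.

```python
def build_mask(genes, ref):
    """Steps 1 and 2: one boolean per gene, True when the gene is in ref."""
    ref = set(ref)  # hash-based set: membership test is O(1) on average
    # tuple() consumes the generator, so a full n-item tuple exists in memory
    return tuple(g in ref for g in genes)


def filter_rows(rows, mask):
    """Step 3: walk rows and mask together and keep the selected rows."""
    return [row for row, keep in zip(rows, mask) if keep]


# Toy data standing in for the 2 million row data frame (hypothetical values)
genes = ["a", "b", "c", "d"]
rows = [{"gene": g, "value": i} for i, g in enumerate(genes)]

mask = build_mask(genes, ["b", "d"])
subset = filter_rows(rows, mask)
```

Since Python sets are hash-based, building the mask costs roughly O(n + m) on average, but the n-item tuple is still created before anything is filtered.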

>> It seems like copying R at the Python level is not always nice
>> and can be quite inefficient.
>>
>> The above could probably be rewritten in a more Pythonic way (it would
>> be more efficient as well, I believe)
>>
>>    mysubset = source.rx((x in ref for x in source.rx2('gene')), True)
>>
>> or
>>
>>    mysubset = DataFrame(row for row in source if row['gene'] in ref)
>>
>> but of course this is not supported by rpy2.
>>
>> Is there a chance to make rpy2 a bit more integrated with Python? :)
>
> As Luca wrote earlier, what is missing for it to work is a generator
> that has a length.

Let's say

    select_b = BoolVector(...gen..., n)

Will the above create a bool vector with 2 million items?

On the other hand

    mysubset = source.rx(...gen..., True)

could allow us to avoid generating large vectors/tuples, couldn't it?
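
To make the "generator with a length" idea concrete, here is a minimal sketch in plain Python. Nothing in it is rpy2 API: `SizedIterable` and `fill_preallocated` are hypothetical names, and `fill_preallocated` only imitates a vector constructor that has to allocate its storage before it iterates.

```python
class SizedIterable:
    """Pair a zero-argument iterator factory with a length known in advance."""

    def __init__(self, make_iter, length):
        self._make_iter = make_iter
        self._length = length

    def __len__(self):
        return self._length

    def __iter__(self):
        # A fresh generator on every call, so the object can be iterated again
        return self._make_iter()


def fill_preallocated(sized):
    """Hypothetical stand-in for a constructor that needs len() up front."""
    out = [False] * len(sized)  # the destination still holds n items
    for i, value in enumerate(sized):
        out[i] = value
    return out


genes = ["a", "b", "c", "d"]
ref = {"b", "d"}
sized = SizedIterable(lambda: (g in ref for g in genes), len(genes))
result = fill_preallocated(sized)
```

In this sketch the destination still has n items; what goes away is only the intermediate tuple, because the values are produced one at a time while the destination is filled.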

> This can probably be addressed by adding a custom iterator for R vectors,
> and should I appear a little slow to have it implemented you are welcome to
> submit a patch. ;-)

That depends on the amount of knowledge of R internals required. ;) ;P

Best regards,

w

_______________________________________________
rpy-list mailing list
rpy-list@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rpy-list