On Tue, Sep 13, 2011 at 7:01 AM, Laurent Gautier <lgaut...@gmail.com> wrote:
> On 2011-09-12 21:16, Artur Wroblewski wrote:
>>
>> On Mon, Sep 12, 2011 at 12:26 PM, Laurent Gautier<lgaut...@gmail.com>
>> wrote:
>>>
>>> Probably not.
>>>
>>> R is doing a lot of things behind the hood. Sometimes it is good,
>>> sometimes it is bad.
>>> The code snippet given to you has a quadratic time-complexity ( O(nm) ).
>>> It can be make linearithmic ( O(n log(m) ) ) simply:
>>>
>>> from rpy2.robjects.vector import BoolVector
>>> ref = set(differential)
>>> select_b = BoolVector(tuple(x in ref for x in source.rx2('gene')))
>>> mysubset = source.rx(select_b, True)
>>
>> If I reckon well BoolVector(...genexpr...) is not possible here due to
>> R API limitation - we need length of iterable, isn't it?
>
> You might have missed the call to tuple(). This will make it work
> independently of having a generator with a length.
>
> >>> tuple(x for x in [1,2,3])
> (1, 2, 3)

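To make the tuple() point concrete, here is a minimal pure-Python sketch (no rpy2 involved). The names `genes`, `differential` and `SizedIterable` are made up for illustration: the first two stand in for source.rx2('gene') and the reference list, and the wrapper only shows what "a generator with a length" could look like. Whether rpy2 would accept such an object is an assumption and is not tested here. As a side note, lookups in a hash-based set are O(1) on average, so the filter itself is closer to O(n + m).

```python
# Hypothetical stand-ins for source.rx2('gene') and `differential`.
genes = ["g1", "g2", "g3", "g4", "g2"]
differential = ["g2", "g4"]

ref = set(differential)  # hash set: membership test is O(1) on average

# A generator expression is lazy and has no length.
mask_gen = (g in ref for g in genes)
try:
    len(mask_gen)
    gen_has_len = True
except TypeError:
    gen_has_len = False

# tuple() consumes the generator and materialises every item in memory;
# that is what gives the result a length.
mask = tuple(mask_gen)


class SizedIterable:
    """Hypothetical wrapper: a lazy iterable that also reports a length."""

    def __init__(self, make_iter, length):
        self._make_iter = make_iter  # zero-argument callable returning a fresh iterator
        self._length = length

    def __iter__(self):
        return self._make_iter()

    def __len__(self):
        return self._length


# Items are produced on demand, but len() works without materialising them.
lazy_mask = SizedIterable(lambda: (g in ref for g in genes), len(genes))
```

The wrapper never holds all the booleans at once; anything that consumes it still has to allocate its own storage for the result.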
In the case of Luca's data (please correct me if I am wrong):

1. A tuple with 2 million items is created.
2. The tuple is iterated to create a BoolVector with 2 million items.
3. The vector and (hopefully) the data frame are iterated to filter the data.

>> It seems like copying R on Python level is not always nice
>> and can be quite inefficient.
>>
>> Above probably could be rewritten in more Pythonic way (it would
>> be more efficient I believe, as well)
>>
>>     mysubset = source.rx((x in ref for x in source.rx2('gene')), True)
>>
>> or
>>
>>     mysubset = DataFrame(row for row in source if row['gene'] in ref)
>>
>> but of course is not supported by rpy2.
>>
>> Is there a chance to make rpy2 bit more Python integrated? :)
>
> As Luca wrote it earlier, what is missing for it to work is that the
> generator had a length.

Let's say

    select_b = BoolVector(...gen..., n)

Will the above still create a bool vector with 2 million items?

On the other hand,

    mysubset = source.rx(...gen..., True)

could avoid generating large intermediate vectors/tuples, couldn't it?

> This can probably be addressed by adding a custom iterator for R vectors,
> and should I appear a little slow to have it implemented you are welcome to
> submit a patch. ;-)

That depends on the amount of knowledge of R internals required. ;) ;P

Best regards,

w

_______________________________________________
rpy-list mailing list
rpy-list@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rpy-list