On 2/3/10 3:00 PM, B.A.D.C.M.D Santos wrote:
> The main reason I was trying to move to the low level interface is
> speed. So what I am doing is calculating several times the p-value for
> the same python object.

?! phyper is deterministic, I think. Calling it with the same parameters 
will give the same results.
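
Because the result depends only on the arguments, repeated calls with identical parameters can be answered from a cache. A minimal sketch, assuming a pure-Python stand-in (hyper_cdf, a hypothetical name) for R's phyper(q, m, n, k) with the default lower tail, so that it runs without R; with rpy2 the cache would wrap the bound R function instead:

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def hyper_cdf(q, m, n, k):
    """P(X <= q) when drawing k balls without replacement from an urn
    holding m white and n black balls (argument order as in R's phyper)."""
    total = comb(m + n, k)
    lo = max(0, k - n)   # fewest white balls that can be drawn
    hi = min(q, m, k)    # largest count included in the lower tail
    return sum(comb(m, i) * comb(n, k - i) for i in range(lo, hi + 1)) / total


first = hyper_cdf(1, 5, 5, 3)    # computed
second = hyper_cdf(1, 5, 5, 3)   # same arguments: served from the cache
info = hyper_cdf.cache_info()
print(first, second, info)
```

Whether this helps depends on how often the same (q, m, n, k) combination actually recurs in the data.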


> So I bind phyper to the object and then just
> perform the test several times. This already improved the speed a lot,
> compared to simply calling rinterface.phyper every time.

There is no such thing as rinterface.phyper by default, but I think that 
I understand what you mean ( rinterface.baseenv.get("phyper") ).
Early binding definitely improves things. It may even make rpy2 
faster than the same code in R, in some cases.

# -- for rpy2-2.1.0 (written without actually running it)

import rpy2.robjects as robjects
from rpy2.robjects.packages import importr

# this step performs an early binding of all objects
# in the R package "stats"
stats = importr("stats")

# save the Python lookup in stats.
phyper = stats.phyper

# cast to low-level (to cut the automagic conversion of Python objects)
lowlevel_phyper = robjects.conversion.py2ri(phyper)

my_q = robjects.IntVector((0, ))
my_m = robjects.IntVector((0, ))
my_n = robjects.FloatVector((0, ))
my_k = robjects.FloatVector((0, ))

# note that high-level and low-level objects
# can be used interchangeably.

results = lowlevel_phyper(my_q, my_m, my_n, my_k)

# low-overhead change of a parameter
my_q[0] = 123

results = lowlevel_phyper(my_q, my_m, my_n, my_k)
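
The pattern of early binding can be measured without R. A minimal sketch, assuming a hypothetical Env class whose get() stands in for a by-name lookup in an R environment; it does not reproduce rpy2's actual costs, only the difference between looking a function up once and looking it up on every call:

```python
import timeit


class Env:
    """Hypothetical stand-in for an environment searched by name."""

    def __init__(self):
        self._symbols = {"phyper": lambda q: q + 1}

    def get(self, name):
        return self._symbols[name]


env = Env()


def late_binding(n):
    # the function is looked up by name on every iteration
    return [env.get("phyper")(i) for i in range(n)]


def early_binding(n):
    # the function is looked up once and reused
    f = env.get("phyper")
    return [f(i) for i in range(n)]


t_late = timeit.timeit(lambda: late_binding(1000), number=200)
t_early = timeit.timeit(lambda: early_binding(1000), number=200)
print("late: %.4fs  early: %.4fs" % (t_late, t_early))
```

Comparing the two timings gives a rough idea of what the repeated lookup costs; the numbers depend on the machine.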


At this point, you should profile your code. The largest part of the 
time should be spent in the call to lowlevel_phyper(), and then there is 
not much left to optimize (well... in fact there is probably the option of 
the Rmath library, which could still push things a little faster).
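
A minimal profiling sketch using the standard library's cProfile, assuming hypothetical stand-ins (significance and run) for the real p-value call and the loop around it:

```python
import cProfile
import io
import pstats


def significance(q):
    # hypothetical stand-in for the call to lowlevel_phyper()
    return sum(i * i for i in range(q % 50))


def run(n_calls):
    # hypothetical stand-in for the loop over the dataset
    return [significance(q) for q in range(n_calls)]


profiler = cProfile.Profile()
profiler.enable()
results = run(2000)
profiler.disable()

# sort by cumulative time and keep the ten most expensive entries
buffer = io.StringIO()
profile_stats = pstats.Stats(profiler, stream=buffer)
profile_stats.sort_stats("cumulative").print_stats(10)
report = buffer.getvalue()
print(report)
```

If the entry for the p-value call dominates the cumulative time, the remaining Python overhead is small and further gains have to come from the computation itself.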



> But the problem
> is that I am using this in huge sequence datasets, taking hours for a
> single run. And the bottleneck is on the significance calculation which
> is taking 3x more time than all the rest.
>
> On a very small dataset that I use for testing, the low-level
> interface runs faster, dropping from ~0.9s to ~0.8s, so
> I expect that using it on a large dataset will save even more.
>
> I will take a look at the code and try to make sense of it. I just
> thought there was a better solution since the documentation kind of
> points to that.

Every code optimization problem can be different. I agree that the 
documentation could give hints (it is planned, and there is already a 
section on performance), but it will not replace knowing the API inside out.



L.




> Thank you very much once more,
> Bruno
> 2010/2/3 Laurent Gautier <lgaut...@gmail.com>
> On 2/3/10 12:32 PM, B.A.D.C.M.D Santos wrote:
> Hello again,
>
> Today I was trying to port my phyper from the high-level interface to the
> low-level interface. My problem is again how I map the arguments with dot.
> According to the documentation I should be able to use the special **Kwargs
> again. But I have no idea how to do this. I tried directly but it obviously
> didn't work because the function was expecting Sexp_Type object. Can
> someone give some light on this.
>
> Also is there any easy way to pass four parameters to phyper without
> needing to create four Sexp_Type vectors individually?
>
>
> Unless you have good reasons to move to the lower-level interface, I
> would advise you to stay with the higher-level interface.
>
> The higher level interface is mostly abstracting those details from you.
>
> If you really really want to do it at the lower level, you can study how
> the higher level interface is doing it itself (it is implemented in
> Python). An answer to your question would just be that code ;-) .
>
>
> L.
>
>
>
>
>
> Thanks in advance,
> Bruno
>
> 2010/2/2 B.A.D.C.M.D Santos<bac...@cam.ac.uk>
> Thank you Laurent. I had completely forgotten that solution. It seems to be
> working now, although I am getting some weird values. But I think the
> problem is in my script.
>
> Thank you very much,
> Bruno
>
>
> 2010/2/2 Laurent Gautier<lgaut...@gmail.com>
> On 2/2/10 12:40 PM, B.A.D.C.M.D Santos wrote:
> Hello everyone,
>
> I am currently using rpy2 in order to use phyper function from R into my
> Python script. Nevertheless I have a problem because I want to use the
> argument log.p=TRUE. The last time I checked there was a problem with the
> mapping of the arguments with dots in the middle. Is this still a bug? If
> it is, is there any workaround to solve this?
>
> Did the link given earlier work for you ?
> http://www.mail-archive.com/rpy-list@lists.sourceforge.net/msg02313.html
>
>
>
> Or does the new rpy2 alpha
> solve this problem?
>
> It does provide a syntactic sugar for this. You can read more at:
> http://rpy.sourceforge.net/rpy2/doc-2.1/html/robjects.html#functions
> (the doc also explains how to write a function that blindly turns "." into "_")
>
>
> Hoping this helps,
>
>
> L.
>
>
>
>
> I am using:
> Python 2.6.4
> R version 2.10.1
> rpy2 2.0.4
>
> Thanks in advance,
> Bruno
>
> ------------------------------------------------------------------------------
> The Planet: dedicated and managed hosting, cloud storage, colocation
> Stay online with enterprise data centers and the best network in the
> business Choose flexible plans and management services without long-term
> contracts Personal 24x7 support from experience hosting pros just a
> phone call away. http://p.sf.net/sfu/theplanet-com
> _______________________________________________ rpy-list mailing list
> rpy-list@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rpy-list
>
>
>
>
>


