On Thu, Jul 23, 2009 at 9:50 AM, Skipper Seabold<jsseab...@gmail.com> wrote: > On Thu, Jul 23, 2009 at 7:12 AM, Peter<rpy-l...@maubp.freeserve.co.uk> wrote: >> On Thu, Jul 23, 2009 at 12:02 PM, Laurent Gautier<lgaut...@gmail.com> wrote: >>>> >>>> Since x is a NumPy array these can come in different flavours, and >>>> perhaps something here is breaking rpy2 or R, e.g. >>> >>> The bug report if for rpy-1.0.x, I think. >> > > In [2]: rpy.__version__ > Out[2]: '1.4.0.dev7055' > > Is that expected behavior? That is the numpy release that I currently > have installed. Note that I tested this on two machines however. But > yes it's rpy-1.0.3 I believe. > >> Sorry, you are right. I would still try making a copy of the NumPy array >> which should give you a continuous C ordered array which owns its >> own memory. >> >> Peter >> > > > Thanks, but I think I checked this. When I said exactly the same, as > far as I can tell they are exactly the same. > > Say exog is my x that I have imported and x is built as above from rpy > and numpy. > > In [16]: x.flags > Out[16]: > C_CONTIGUOUS : False > F_CONTIGUOUS : True > OWNDATA : False > WRITEABLE : True > ALIGNED : True > UPDATEIFCOPY : False > > In [17]: exog.flags > Out[17]: > C_CONTIGUOUS : False > F_CONTIGUOUS : True > OWNDATA : False > WRITEABLE : True > ALIGNED : True > UPDATEIFCOPY : False > > In [18]: x.flags == exog.flags > Out[18]: True > > In [19]: x.dtype > Out[19]: dtype('float64') > > In [20]: exog.dtype > Out[20]: dtype('float64') > > In [21]: x.shape > Out[21]: (21, 4) > > In [22]: exog.shape > Out[22]: (21, 4) > > I also tried making a copy of exog, so I had a C contiguous array that > owns its own data, but it was the same result. The weird thing is > that using exog works for psi.bisquare and psi.huber, but when I try > psi.hampel I get an error about the conformability of x and the wts, > so it appears to be some sort of a dimensions issue in the iterated > weighted least squares steps for RLM. Anyway, I've update my code to > use the array as built up from rpy for now and it works. When I have > time I will try to dig deeper. I might also try to reinstall rpy > tonight, though again this happened on two machines. > > Cheers, > > Skipper >
For posterity's sake, I will post my solution. The problem was in how Hampel's norm works and there are two workarounds. The first is to add an axis to the endogenous variable (as the initial exception said. I just looked in the wrong place.). So in my example above for my variables not from R/Rpy, I would have to add an axis to the response variable so y = y[:,numpy.newaxis] or y = y[:,None]. This needs to be done because Hampel's norm is a hard redescender for large outliers (big ones get a weight of zero rather than just downweighted), so the starting values are chosen differently and it was here that my approach was failing. The second and probably "correct" way to do it (the above lead to some odd behavior in how the results were returned) is to specify how you want the initial values to be calculated. So my command is then results3 = r.rlm(formula, data=frame, psi='psi.hampel', init='lts') # lts = Least Trimmed Squares and everything works as usual. Skipper ------------------------------------------------------------------------------ Let Crystal Reports handle the reporting - Free Crystal Reports 2008 30-Day trial. Simplify your report design, integration and deployment - and focus on what you do best, core application coding. Discover what's new with Crystal Reports now. http://p.sf.net/sfu/bobj-july _______________________________________________ rpy-list mailing list rpy-list@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/rpy-list