I'd like to do two (actually three) things:

1) Using a grep-like operator, delete rows in a dataframe that match a
   particular pattern in a particular column (in my case, every row that
   has a '#' as the first character in column 'a')
2) Set elements in a dataframe based on the characteristics of other
   elements, across all rows (in my case, if an element in column 'c'
   is NA, set it to 2*that row's value in column 'b')
2a) Only do this in rows where column 'd' has a particular value (in
    my case, the character 'J')
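
To make the intended result concrete, here is a minimal sketch of the three
steps in plain Python, independent of rpy2. It assumes the data frame is
represented as a list of dicts and that None stands in for NA; the helper
names drop_matching_rows and fill_missing are made up for illustration and
are not part of rpy2 or R.

```python
import re

NA = None  # stand-in for R's NA in this sketch (an assumption, not rpy2's NA)


def drop_matching_rows(rows, column, pattern):
    """Return the rows whose value in `column` does not match `pattern`."""
    regex = re.compile(pattern)
    return [row for row in rows if not regex.search(row[column])]


def fill_missing(rows, target, source, factor=2, where=None):
    """Fill NA values in `target` with factor * the row's `source` value.

    `where` is an optional (column, value) pair that restricts the fill
    to rows in which that column equals that value.
    """
    filled = []
    for row in rows:
        row = dict(row)  # copy so the input rows are left untouched
        applies = where is None or row[where[0]] == where[1]
        if row[target] is NA and applies:
            row[target] = factor * row[source]
        filled.append(row)
    return filled


# Same toy data as in the script further down.
rows = [
    {'a': '# x', 'b': 4, 'c': 8,  'd': 'I'},
    {'a': 'y',   'b': 5, 'c': NA, 'd': 'J'},
    {'a': 'z',   'b': 6, 'c': 10, 'd': 'K'},
]

kept = drop_matching_rows(rows, 'a', r'^#')               # step 1
result = fill_missing(kept, 'c', 'b', where=('d', 'J'))   # steps 2 and 2a
```

With this data, step 1 drops the '# x' row, and steps 2/2a fill the NA in
the 'y' row because its 'd' value is 'J'. What I'm after is the equivalent
on an rpy2 data frame.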

I'm trying to do this by calling the R code directly through ro.r, but
(a) that's not satisfying because I'd rather do it in Python (how?),
(b) rpy/R doesn't seem to like code such as df = ro.r("function(df)"),
and (c) it doesn't work anyway.

I'm having some coredump problems when instantiating the dataframe below
with NAs in it, so forgive any errors in the code since I can't run it.
Thanks for any help!

JDO

==========================================================

#!/usr/bin/env python2.6
import rpy2.robjects as ro

df = ro.DataFrame({'a': ro.StrVector(('# x','y','z')), 
                   'b': ro.IntVector((4,5,6)),
                   'c': ro.IntVector((8,ro.NA_integer,10)),
                   'd': ro.StrVector(('I','J','K')), 
                   })

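# would like to delete all rows whose name in column 'a' begins with a '#'
# (R code run through ro.r only sees objects in R's global environment,
# so bind the data frame there first)
ro.globalenv['df'] = df
col = tuple(df.colnames).index('a') + 1  # R column indices are 1-based
df = ro.r("df[grep('^#', df[,%d], invert=TRUE),]" % col)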

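# would like to set all NAs in 'c' to 2*value in 'b'
# (ifelse() returns a vector, so assign it to df$c within R instead of
# replacing the whole data frame with it)
ro.globalenv['df'] = df
ro.r("df$c <- ifelse(is.na(df$c), 2*df$b, df$c)")
df = ro.globalenv['df']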

# would really like to do this only if column 'd' is 'J' - not sure how




------------------------------------------------------------------------------

_______________________________________________
rpy-list mailing list
rpy-list@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rpy-list
