Hi Luca,

Unfortunately, this does not seem to be caused by your installation.
The problem exists for IntVector, and FactorVector inherits from it. A few
features are probably missing from FactorVector, but the good news is that
they can be implemented simply.

Let's take an example:

    import rpy2.robjects as ro
    fcr = ro.r('factor(c("a", "b", NA, "a", NA))')

'fcr' is now a FactorVector, that is, an IntVector with levels.

    >>> list(fcr)
    [1, 2, -2147483648, 1, -2147483648]

That large negative integer is the one R uses to encode missing "integer"
values:

    >>> ro.NA_integer[0]
    -2147483648

What happens with 'list(fcr)' is that fcr is iterated through and each
element is stored in a resulting Python list. The issue is that Python does
not have a "missing integer" value, but that should not stop us from writing
a simple function to deal with it as needed.

    def as_character_list(factor):
        na_val = ro.NA_integer[0]
        res = [None, ] * len(factor)
        for i, elt in enumerate(factor):
            if elt != na_val:
                # NOTE: R uses 1-based indices
                res[i] = factor.levels[elt-1]
        return res

    >>> as_character_list(fcr)
    ['a', 'b', None, 'a', None]

What we have implemented is a variant of the R base function
"as.character.factor", which returns the string 'NA' rather than None for
missing values:

    from rpy2.robjects.packages import importr
    base = importr("base")

    >>> list(base.as_character(fcr))
    ['a', 'b', 'NA', 'a', 'NA']

L.

On 1/15/10 2:36 PM, Luca Beltrame wrote:
> Hello,
>
> in my code, I need to convert the columns from a robjects.DataFrame to other
> data types (list, for example). However, I've found a problem when dealing with
> data that contains NAs. In particular, I'm referring to non-numeric columns,
> that are represented as FactorVectors.
>
> Example code:
>
> import rpy2.robjects as robjects
>
> data = robjects.DataFrame.from_csvfile("file_with_NAs_in_columns", sep="\t")
>
> column_with_na = data.rx2("Column")
>
> print column_with_na
>
> [1] <NA> <NA> <NA> some_value
> Levels: some_value
>
> and if I issue
>
> print column_with_na[0]
>
> I get:
> -2147483648
>
> And of course, accessing the levels I only get some_value.
> Converting to other types of Vector doesn't seem to help.
>
> Notice that this works if I do
>
> base = importr("base")
> column_value = base.as_vector(column_with_na)
> column_value = list(column_value)
> print column_value
> ['NA', 'NA', 'NA', 'some_value']
>
> Is there a way to translate the column *including* the NAs, into a Python list
> without doing the hackish way described above?
>
> This is with RPy 2.1 alpha 2. I admit that there may be a problem with my
> installation as I'm running a local copy of rpy2 2.1 as I still have a
> system-wide 2.0.x needed for some projects.
>
> ------------------------------------------------------------------------------
> Throughout its 18-year history, RSA Conference consistently attracts the
> world's best and brightest in the field, creating opportunities for Conference
> attendees to learn about information security's most important issues through
> interactions with peers, luminaries and emerging and established companies.
> http://p.sf.net/sfu/rsaconf-dev2dev
>
> _______________________________________________
> rpy-list mailing list
> rpy-list@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rpy-list
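
A footnote to the reply above: the decoding step itself does not need rpy2, so it can be tried out on plain Python lists. The sketch below is only an illustration; the names decode_factor_codes and R_NA_INTEGER are hypothetical (they are not part of rpy2), and the sentinel is simply the value that ro.NA_integer[0] printed in the reply.

```python
# Sketch only: decode_factor_codes and R_NA_INTEGER are hypothetical names,
# not part of rpy2.

# Value printed by ro.NA_integer[0] in the reply above.
R_NA_INTEGER = -2147483648


def decode_factor_codes(codes, levels, na_code=R_NA_INTEGER):
    """Map 1-based factor codes to their level labels.

    Codes equal to na_code become None, since Python has no
    "missing integer" value of its own.
    """
    # R indices start at 1, hence the "code - 1".
    return [None if code == na_code else levels[code - 1] for code in codes]


# The integer codes that list(fcr) returned in the reply, with the levels of fcr.
codes = [1, 2, R_NA_INTEGER, 1, R_NA_INTEGER]
levels = ["a", "b"]
print(decode_factor_codes(codes, levels))  # ['a', 'b', None, 'a', None]
```

With rpy2 at hand, the same function could presumably be fed list(fcr) and list(fcr.levels), which would give the same result as as_character_list(fcr).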