I appreciate all the responses and apologize for not being more detailed. An R data frame is a tightly grouped collection of vectors of the same length. Each vector holds a single datatype, I believe, but different vectors can hold different types, so you can read all kinds of data into the same variable. The benefit is being able to quickly subset, stack and otherwise reshape the data ('melt' and 'cast' in R vernacular) according to any of your qualitative variables ('factors'). As someone pretty familiar with R and quite a newbie to Python, I'm wary of insulting anybody's intelligence by describing what is, to me, effectively the default data format in my most familiar language. The following is some brief R code if you're curious about how it works.

    d <- read.csv(filename, header = TRUE, sep = ',')  # reads the table; '<-' is the assignment operator
    d[ , 'column.name']  # references a column by name

The same syntax can be used to reference rows (the index goes left of the comma) and columns in any order. The data frame then allows you to quickly declare new fields as functions of other fields.

    newVar <- d[ , 'column.name'] + d[ , 'another.column']
    d$newVar <- newVar  # attaches newVar as the rightmost column of 'd'

At any rate, I finally got pydataframe to work, but had to go from Python 2.6 to 2.5. pydataframe has a bug on Windows that the author points out. Line 127 in 'parsers.py' should be changed from:

    columns = list(itertools.izip_longest(*split_lines ,fillvalue = na_text))

to:

    columns = list(itertools.izip_longest(list(*split_lines),fillvalue = na_text))

I don't know exactly why it helped, but the module would not load until I made that change. As far as I know, itertools.izip_longest accepts any number of iterables before fillvalue, so I can only guess at what the change fixed. It's a handy way to handle alpha-numeric data. My problem with the csv module was that it interpreted all numbers as strings.

Thanks again.

On Thu, Mar 31, 2011 at 8:17 AM, James Reynolds <eire1...@gmail.com> wrote:

> On Thu, Mar 31, 2011 at 11:10 AM, Blockheads Oi Oi <breamore...@yahoo.co.uk> wrote:
>
>> On 31/03/2011 09:38, Ben Hunter wrote:
>>
>>> Is anybody out there familiar with data frame modules for python that
>>> will allow me to read a CSV in a similar way that R does? pydataframe
>>> and DataFrame have both befuddled me. One requires a special stripe of R
>>> that I don't think is available on windows and the other is either very
>>> buggy or I've put it in the wrong directory / installed incorrectly.
>>> Sorry for the vague question - just taking the pulse. I haven't seen any
>>> chatter about this on this mailing list.
>>
>> What are you trying to achieve? Can you simply read the data with the
>> standard library csv module and manipulate it to your needs? What makes
>> you say that the code is buggy, have you examples of what you tried and
>> where it was wrong? Did you install with easy_install or run setup.py?
>>
>>> _______________________________________________
>>> Tutor maillist - Tutor@python.org
>>> To unsubscribe or change subscription options:
>>> http://mail.python.org/mailman/listinfo/tutor
>>
>> Regards.
>>
>> Mark L.
>
> I'm not familiar with it, but what about http://rpy.sourceforge.net/
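
For anyone who would rather stay with the standard library: the csv module's habit of returning every number as a string, mentioned above, can be worked around with a small conversion step. What follows is only a minimal sketch of that idea, plus a rough plain-Python analogue of the R column arithmetic. The helper names to_number and read_columns, the column names and the sample data are all made up for illustration, and none of this reflects how pydataframe works internally.

```python
import csv
import io


def to_number(text):
    """Return an int or float if the text parses as one, else the original string."""
    for convert in (int, float):
        try:
            return convert(text)
        except ValueError:
            pass
    return text


def read_columns(handle):
    """Read CSV text with a header row into a dict of column name -> list of values."""
    reader = csv.DictReader(handle)
    columns = dict((name, []) for name in reader.fieldnames)
    for row in reader:
        for name in reader.fieldnames:
            columns[name].append(to_number(row[name]))
    return columns


# Made-up sample data standing in for a real CSV file (hypothetical columns).
sample = io.StringIO("site,count,weight\na,1,2.5\nb,3,4.0\n")
d = read_columns(sample)

# Rough analogue of: d$newVar <- d[ , 'count'] + d[ , 'weight']
d["newVar"] = [c + w for c, w in zip(d["count"], d["weight"])]
```

Here d["count"] holds integers and d["weight"] holds floats, so the addition works without any manual casting, while a text column such as d["site"] is left as strings. With a real file you would pass an open file object to read_columns in place of the StringIO stand-in.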