Re: [Tutor] Data frame packages

Ben Hunter Thu, 31 Mar 2011 11:28:45 -0700

I appreciate all the responses and apologize for not being more detailed. An
R data frame is a tightly grouped collection of vectors of the same length.
Each vector holds a single datatype, I believe, but different vectors can
hold different types, so you can read all types of data into the same
variable. The benefit is being able to quickly subset, stack and such (or
'melt' and 'cast' in R vernacular) according to any of your qualitative
variables (or 'factors'). As someone pretty familiar with R and quite a
newbie to Python, I'm wary of insulting anybody's intelligence by describing
what to me is effectively the default data format of my most familiar
language. The following is some brief R code if you're curious about how it
works.

# This reads the table. '<-' is the assignment operator.
d <- read.csv(filename, header = TRUE, sep = ',')
# This references a column by name. The same syntax can be used to reference
# rows (the index is put left of the comma) and columns in any order.
d[ , 'column.name']

The data frame then allows you to quickly define new fields as functions of
other fields.
newVar <- d[ ,'column.name'] + d[ ,'another.column']
d$newVar <- newVar # appends newVar to 'd' as a new rightmost column
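
For the Python folks, here is a rough sketch of the same idea using only
built-in types. It is not how pydataframe or any other package does it; the
dict-of-lists layout, the sample values and the names below are all made up
for illustration.

```python
# Hypothetical stand-in for a data frame: a dict mapping column name -> list.
# All columns must have the same length, as in an R data frame.
d = {
    "column.name": [1.0, 2.0, 3.0],
    "another.column": [10.0, 20.0, 30.0],
}

# Rough equivalent of d[ , 'column.name']: pull out a whole column.
column = d["column.name"]

# Rough equivalent of newVar <- d[ ,'column.name'] + d[ ,'another.column']
new_var = [a + b for a, b in zip(d["column.name"], d["another.column"])]

# Rough equivalent of d$newVar <- newVar. Dicts keep insertion order in
# Python 3.7+, so the new column ends up "rightmost".
d["newVar"] = new_var

# Rough equivalent of d[2, ]: one row (R counts from 1, Python from 0).
row = {name: values[1] for name, values in d.items()}
print(row)
```

A dict of lists keeps each column together, which is what makes the
column-wise arithmetic above a one-liner.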

At any rate, I finally got pydataframe to work, but had to go from Python
2.6 to 2.5. pydataframe has a bug on Windows that the author points out.
Line 127 in 'parsers.py' should be changed from:
columns = list(itertools.izip_longest(*split_lines, fillvalue = na_text))

to:
columns = list(itertools.izip_longest(list(*split_lines), fillvalue = na_text))

I don't know exactly why that helped, but the module would not load until I
made the change. My guess is that it comes down to what izip_longest
receives ahead of fillvalue.
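
For anyone curious what that kind of line does, here is a minimal sketch of
the row-to-column transposition, assuming Python 3, where izip_longest is
named zip_longest. The contents of split_lines and the value of na_text are
invented for the example and are not taken from pydataframe.

```python
from itertools import zip_longest  # called izip_longest in Python 2

# Hypothetical sample: rows already split into fields, the last one short.
split_lines = [
    ["name", "height", "weight"],
    ["ann", "1.6", "55"],
    ["bob", "1.8"],
]
na_text = "NA"  # assumed placeholder for missing values

# The * unpacking passes each row as a separate iterable (zip_longest takes
# any number of them). It then yields one tuple per column and pads the
# short row with fillvalue.
columns = list(zip_longest(*split_lines, fillvalue=na_text))
print(columns)
```

Each tuple in columns then holds one field across all rows, with the missing
weight filled in by na_text.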

It's a handy way to handle alpha-numeric data. My problem with the csv
module was that it interpreted all numbers as strings.
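
In case it helps someone else who wants to stay with the standard library:
the csv module does hand back every field as a string, but the conversion
can be done by hand. This is only a sketch under my own assumptions; the
convert helper, the sample data and the column names are hypothetical.

```python
import csv
import io


def convert(value):
    """Return value as an int or float where possible, else unchanged."""
    for cast in (int, float):
        try:
            return cast(value)
        except ValueError:
            pass
    return value


# Hypothetical sample; io.StringIO stands in for open(filename, newline='').
sample = io.StringIO("site,count,depth\nA,3,1.5\nB,7,2.25\n")

reader = csv.reader(sample)
header = next(reader)  # first line holds the column names
rows = [[convert(field) for field in row] for row in reader]

# Regroup the row-wise data into named columns, data-frame style.
columns = {name: list(values) for name, values in zip(header, zip(*rows))}
print(columns)
```

Trying int before float keeps whole numbers as integers; anything that is
not numeric stays a string.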

Thanks again.

On Thu, Mar 31, 2011 at 8:17 AM, James Reynolds <[email protected]> wrote:

>
>
> On Thu, Mar 31, 2011 at 11:10 AM, Blockheads Oi Oi <
> [email protected]> wrote:
>
>> On 31/03/2011 09:38, Ben Hunter wrote:
>>
>>> Is anybody out there familiar with data frame modules for Python that
>>> will allow me to read a CSV the way R does? pydataframe
>>> and DataFrame have both befuddled me. One requires a special stripe of R
>>> that I don't think is available on Windows and the other is either very
>>> buggy or I've put it in the wrong directory / installed it incorrectly.
>>> Sorry for the vague question - just taking the pulse. I haven't seen any
>>> chatter about this on this mailing list.
>>>
>>>
>>>
>> What are you trying to achieve? Can you simply read the data with the
>> standard library csv module and manipulate it to your needs? What makes
>> you say that the code is buggy? Do you have examples of what you tried and
>> where it went wrong? Did you install with easy_install or run setup.py?
>>
>>
>>
>>> _______________________________________________
>>> Tutor maillist  -  [email protected]
>>> To unsubscribe or change subscription options:
>>> http://mail.python.org/mailman/listinfo/tutor
>>>
>>
>> Regards.
>>
>> Mark L.
>>
>
> I'm not familiar with it, but what about http://rpy.sourceforge.net/
