Dear Python Tutor,

I'm doing econometric work and am a new user of Python. I have read several of the tutorials, but haven't found them useful for a newbie problem I've encountered.

I've used a module (StataTools) from http://presbrey.mit.edu/PyDTA to get a Stata ".dta" file into Python. In Stata the data set is an N x K matrix, where N is the number of observations (households) and K is the number of variables. I gather it's now a list in which each element is an observation (a vector) for one household. The name of my list is "data"; I gather Python recognizes the first observation by: data[1]. For example, data = [X_1, X_2, X_3, ..., X_N], where each X_i is a vector of household characteristics, e.g. X_1 = (age_1, wage_1, ..., residence_1).
I also have a list of variable names called "varname", although I'm not sure the module I used to extract the ".dta" into Python also created a correspondence between the varname list and the data list. The Python interpreter won't print anything when I type one of the variable names; I was hoping it would print out a vector of ages or the like.

In any case, I'd like to make a scatter plot in pylab, but don't know how to identify a variable in "data" (i.e. I'd like a vector listing the ages and another vector listing the wages of households). Perhaps I need to run a subroutine to collect each relevant data point and create a new list, which I then define as my variable of interest? From the above example, I'd like to create a list such as age = [age_1, age_2, ..., age_N], and likewise for wages.

Any help you could offer would be very much appreciated. Also, this is my first time using the Python Tutor list, so let me know if I've used it appropriately or if I should change or narrow the structure of my question.

Thanks,
Steve

--
Steven Buck
Ph.D. Student
Department of Agricultural and Resource Economics
University of California, Berkeley
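
The extraction described above can be sketched roughly as follows. This is only a minimal illustration, assuming "data" really is a plain list of equal-length rows and "varname" lists the column names in the same order as the values in each row. The toy values and the helper name column() are hypothetical, and nothing here uses or assumes anything about the StataTools/PyDTA API. Note also that Python lists are zero-indexed, so the first observation would be data[0] rather than data[1].

```python
# Hypothetical toy data standing in for whatever the .dta reader returned:
# one row per household, values in the same order as the names in `varname`.
varname = ["age", "wage", "residence"]
data = [
    [34, 15.5, "urban"],
    [51, 22.0, "rural"],
    [27, 11.25, "urban"],
]


def column(rows, names, name):
    """Collect the values of variable `name` across all rows (hypothetical helper)."""
    idx = names.index(name)  # position of the variable within each row
    return [row[idx] for row in rows]


age = column(data, varname, "age")
wage = column(data, varname, "wage")

# Alternative: transpose once and map every variable name to its column.
columns = dict(zip(varname, zip(*data)))

print(data[0])  # the first observation (index 0, not 1)
print(age)
print(wage)
```

With two such lists in hand, a call like pylab.scatter(age, wage) followed by pylab.show() should produce the scatter plot, provided the values are numeric.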
_______________________________________________
Tutor maillist - Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor