[julia-users] Re: DataFrame melt question

Dan Wed, 16 Dec 2015 00:43:06 -0800

The following re-reads the header line of the file and builds a dictionary 
that maps each converted column name to its original name, in a one-liner-ish:


df = readtable("/the/file.csv")

h = Dict(zip(keys(df.colindex.lookup),
             split(open("/the/file.csv") do f
                       chomp(readline(f))
                   end, ",")[collect(values(df.colindex.lookup))]))


Now, aside from using `h` in other ways, you can do:

melteddf[:Region] = [h[r] for r in melteddf[:Region]]


to fix the `melteddf`.
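
For readers who want to see the idea without the DataFrame internals, here is a minimal sketch in plain Julia. It assumes you already have the raw header line and the converted column names in column order; `header_line`, `converted` and `region` are hypothetical stand-ins for illustration, not part of any DataFrames API. It also trims whitespace around each header field, which the one-liner above does not do.

```julia
# Hypothetical inputs: the raw header line of the CSV file and the
# converted column names, listed in the same order as the columns.
header_line = "Year,Name of country 1, Name of country 2"
converted = [:Year, :Name_of_country_1, :Name_of_country_2]

# Split the header on commas and trim whitespace around each field.
originals = [String(strip(s)) for s in split(header_line, ",")]

# Pair each converted name with the original header text by position.
h = Dict(zip(converted, originals))

# Stand-in for the melted column that holds the converted names.
region = [:Name_of_country_1, :Name_of_country_2, :Name_of_country_1]

# Look up the original text for every entry.
relabelled = [h[r] for r in region]
```

Matching by position rather than re-deriving the conversion rule means the lookup does not depend on how the names were rewritten, only on the column order being the same.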

On Wednesday, December 16, 2015 at 2:39:57 AM UTC+2, David Anthoff wrote:
>
> Hi,
>
> I have a csv file that roughly looks like this:
>
> Year,Name of country 1, Name of country 2
> 1950, 5., 6.
> 1951, 6., 8.
>
> The real file has more columns and rows.
>
> I want to bring this into tidy format, so that I have a DataFrame that 
> looks like this:
>
> Year, Region, Value
> 1950, Name of country 1, 5.
> 1950, Name of country 2, 6.
> 1951, Name of country 1, 6.
> 1951, Name of country 2, 8.
>
> Right now I read the file with readtable into a DataFrame and then use
>
> melt(df, :Year)
>
> This gives me the right structure, but now all the country names are 
> messed up, e.g. they look like “Name_of_country_1” instead of “Name of 
> country 1”.
>
> I understand why that is the case, i.e. readtable converts strings into 
> symbols and has to insert these underscores, but I’m wondering whether the 
> original string value is preserved somewhere, and could be used in the melt 
> operation in some way?
>
> Thanks,
> David
>
> --
> David Anthoff
> University of California, Berkeley
> http://www.david-anthoff.com
>
