If you replace
cols = [dict[k] for k in di_keys]
with
cols = {dict[k] for k in di_keys}
things work properly. This is actually correct behavior, although it
demonstrates why we deprecated the awful DataFrame(::Any…) constructor. Your
current code doesn’t match the correct signature of a DataFrame constructor.
It only works by accident in your first example because the column types
aren’t homogeneous. Once they are, type inference produces the correct,
tighter type for cols, and you no longer get the output you expect.
FWIW, I think you’ll find Julia easier to use if you avoid list comprehensions
without explicit types.
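On current Julia versions the {...} literal no longer exists; Any[...] is its
replacement. The following is only a minimal sketch in Base Julia, without
DataFrames, of how the element type of a comprehension depends on whether a
type is given. The names d_mixed, d_same, cols_same, cols_any_same and
cols_any_mixed are made up for illustration and do not appear in the original
code.

```julia
# Hypothetical data mirroring the two dictionaries from the original post.
d_mixed = Dict("a" => [1, 3], "b" => [0.0, 1.0])  # one Int column, one Float64 column
d_same  = Dict("a" => [1, 3], "b" => [0, 1])      # two Int columns

# Untyped comprehension: the element type is derived from the values,
# so it becomes tighter when every column has the same type.
cols_same = [d_same[k] for k in keys(d_same)]

# Typed comprehension: the element type is fixed at Any no matter what
# the columns contain. Any[...] replaces the old {...} form.
cols_any_same  = Any[d_same[k] for k in keys(d_same)]
cols_any_mixed = Any[d_mixed[k] for k in keys(d_mixed)]

println(typeof(cols_same))
println(typeof(cols_any_same))
println(typeof(cols_any_mixed))
```

With the explicit Any, both dictionaries yield a container of the same type, so
whatever consumes cols sees the same thing regardless of the column types.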
— John
On Jun 12, 2014, at 8:31 AM, Florian Oswald wrote:
> Sure - any idea where I should look for this? I pretty much copied this line
>
> https://github.com/JuliaStats/DataFrames.jl/blob/master/src/deprecated.jl#L34
>
> for my collectFields function.
>
>
>
>
> On 12 June 2014 16:18, John Myles White wrote:
> Certainly seems like a bug. A PR fixing this would be very helpful.
>
> At some point I’d like to move the functions for converting Dicts to
> DataFrames out of the DataFrames package, since there are so many ways to do
> it that it’s hard for me to keep track of them.
>
> — John
>
> On Jun 12, 2014, at 8:15 AM, Florian Oswald wrote:
>
>> Hi all,
>>
>> I found some strange behaviour and am trying to find out where I'm going
>> wrong. I have a dict that stores several vectors of equal length, and I want
>> to make a DataFrame from it, where the columns should have the names of the
>> dict keys:
>>
>> using DataFrames
>>
>> function collectFields(dict::Dict)
>>     di_keys = collect(keys(dict))
>>     cols = [ dict[k] for k in di_keys ]
>>     cnames = Array(Symbol,length(dict))
>>     for i in 1:length(di_keys)
>>         cnames[i] = symbol(di_keys[i])
>>     end
>>     return DataFrame(cols, cnames)
>> end
>>
>> di = ["a"=>[1,3],"b"=>[0.0,1.0]]
>> collectFields(di)
>>
>> This works as expected:
>>
>> julia> collectFields(di)
>> 2x2 DataFrame
>> |-------|---|-----|
>> | Row # | a | b   |
>> | 1     | 1 | 0.0 |
>> | 2     | 3 | 1.0 |
>>
>> However, changing the type of the vectors in the dict gives a different result:
>>
>> di2 = ["a"=>[1,3],"b"=>[0,1]]
>>
>>
>> julia> collectFields(di2)
>> 2x2 DataFrame
>> |-------|-------|----|
>> | Row # | x1    | x2 |
>> | 1     | [1,3] | a  |
>> | 2     | [0,1] | b  |
>>
>>
>> Any ideas? thanks!
>>
>
>