No, the problem is that you're doing 

[dict[k] for k in di_keys]

If you did

Any[dict[k] for k in di_keys]

you'd have a type-stable variable. As is, your code has some "type thrashing".
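
A minimal sketch of the difference, not taken from DataFrames itself. It assumes a later Julia release, where dicts are built with Dict(...) rather than the literal syntax used in this thread. The names untyped_* and typed_* are made up for illustration.

```julia
# Sketch only: uses the Dict(...) constructor from later Julia releases.
# `di` and `di2` mirror the dicts in this thread; other names are hypothetical.
di  = Dict("a" => [1, 3], "b" => [0.0, 1.0])   # columns with different element types
di2 = Dict("a" => [1, 3], "b" => [0, 1])       # both columns are Vector{Int}

# Untyped comprehension: the element type is derived from the values,
# so it can differ from one dict to the next.
untyped_mixed = [di[k] for k in keys(di)]
untyped_same  = [di2[k] for k in keys(di2)]

# Typed comprehension: the element type is fixed up front as Any.
typed_mixed = Any[di[k] for k in keys(di)]
typed_same  = Any[di2[k] for k in keys(di2)]

println(eltype(untyped_mixed))   # an abstract type covering both columns
println(eltype(untyped_same))    # the tighter Vector{Int}
println(eltype(typed_mixed), " ", eltype(typed_same))
```

With the Any[...] form, cols has the same type whichever dict comes in, so the code that receives it always sees the same kind of argument.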

 -- John

On Jun 12, 2014, at 8:57 AM, Florian Oswald wrote:

> ok, I see. so ideally I would do this
> 
> di = (ASCIIString=>Array{Real,1})["a"=>[1,3],"b"=>[0.0,1.0]]
> 
> ( not quite sure why the curly brackets solve this problem, since that 
> creates Dict{Any,Any}, and you said that's what's causing trouble? )
> 
> thanks!
> 
> 
> On 12 June 2014 16:46, John Myles White wrote:
> If you replace
> 
> cols = [dict[k] for k in di_keys]
> 
> with
> 
> cols = {dict[k] for k in di_keys}
> 
> things work properly. This is actually correct behavior, although it 
> demonstrates why we deprecated the awful DataFrame(::Any…) constructor. Your 
> current code doesn’t match the correct signature of a DataFrame. It only 
> works by accident in your first example because the column types aren’t 
> homogeneous. Once they are, type inference produces the correct tighter type 
> for cols and then you’re not producing the right output.
> 
> FWIW, I think you’ll find Julia easier to use if you avoid list 
> comprehensions without explicit types.
> 
>  — John
> 
> On Jun 12, 2014, at 8:31 AM, Florian Oswald wrote:
> 
>> sure - any idea where should I look for this? I pretty much copied this line
>> 
>> https://github.com/JuliaStats/DataFrames.jl/blob/master/src/deprecated.jl#L34
>> 
>> for my collectFields function.
>> 
>> 
>> 
>> 
>> On 12 June 2014 16:18, John Myles White wrote:
>> Certainly seems like a bug. A PR fixing this would be very helpful.
>> 
>> At some point I’d like to move the functions for converting Dicts to 
>> DataFrames out of the DataFrames package, since there are so many ways to do 
>> it that it’s hard for me to keep track of them.
>> 
>>  — John
>> 
>> On Jun 12, 2014, at 8:15 AM, Florian Oswald wrote:
>> 
>>> Hi all,
>>> 
>>> I found some strange behaviour and am trying to find out where I'm going 
>>> wrong. I have a dict that stores several vectors of equal length, and I 
>>> want to make a DataFrame from it, where the columns should have the names 
>>> of the dict keys:
>>> 
>>> using DataFrames
>>> 
>>> function collectFields(dict::Dict)
>>>     di_keys = collect(keys(dict))
>>>     cols = [ dict[k] for k in di_keys ]
>>>     cnames = Array(Symbol,length(dict))
>>>     for i in 1:length(di_keys)
>>>         cnames[i] = symbol(di_keys[i])
>>>     end
>>>     return DataFrame(cols, cnames)
>>> end
>>> 
>>> di = ["a"=>[1,3],"b"=>[0.0,1.0]]
>>> collectFields(di)
>>> 
>>> This works as expected:
>>> 
>>> collectFields(di)
>>> 2x2 DataFrame
>>> |-------|---|-----|
>>> | Row # | a | b   |
>>> | 1     | 1 | 0.0 |
>>> | 2     | 3 | 1.0 |
>>> 
>>> however, changing the type of the vectors in dict:
>>> 
>>> di2 = ["a"=>[1,3],"b"=>[0,1]]
>>> 
>>> 
>>> julia> collectFields(di2)
>>> 2x2 DataFrame
>>> |-------|-------|----|
>>> | Row # | x1    | x2 |
>>> | 1     | [1,3] | a  |
>>> | 2     | [0,1] | b  |
>>> 
>>> 
>>> Any ideas? thanks!
>>> 
>> 
>> 
> 
> 
