Thanks for the kind words. I'll put together a pull request.
On Tuesday, June 10, 2014 10:01:36 AM UTC-4, Gustavo Lacerda wrote:
>
> hey Keith,
>
> Your solution is elegant because it delegates conversion to the column
> push!, i.e. push!{S,T}(dv::DataArray{S,1},v::T)
>
> I have tested it, and it works for me too. This is your code, so I
> think you should get all the credit.
>
> Gustavo
> --
> Gustavo Lacerda
> http://www.optimizelife.com
>
>
> On Tue, Jun 10, 2014 at 7:35 AM, Keith Campbell <[email protected]> wrote:
> > Hey Gustavo,
> >
> > Below is a crack at a version that handles tuples and deals with some of the
> > issues John raised. You can see some simple tests at
> > http://nbviewer.ipython.org/gist/catawbasam/003743259cf0a6ec968d.
> >
> > If you're interested in working it over for a pull request, please feel
> > free. If you'd like me to do it, I'd be happy to. And if this seems like
> > the wrong approach, that's fine too.
> > cheers,
> > Keith
> >
> > import Base.push!
> > function push!(df::DataFrame, iterable)
> >     K = length(iterable)
> >     assert(size(df,2)==K)
> >     i=1
> >     for t in iterable
> >         try
> >             #println(i,t, typeof(t))
> >             push!(df.columns[i], t)
> >         catch
> >             #clean up partial row
> >             for j in 1:(i-1)
> >                 pop!(df.columns[j])
> >             end
> >             msg = "Error adding $t to column $i."
> >             throw(ArgumentError(msg))
> >         end
> >         i=i+1
> >     end
> > end
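
For anyone who wants to try the rollback idea without DataFrames installed, here is a minimal sketch. It assumes a plain vector of typed column vectors as a stand-in for the DataFrame; `push_row!` and `cols` are hypothetical names used only for illustration, and the syntax targets a current Julia release rather than 0.3.

```julia
# Hypothetical stand-in for a DataFrame: a vector of typed column vectors.
# push_row! is an illustrative name, not part of DataFrames.
function push_row!(columns::Vector, row)
    length(row) == length(columns) ||
        throw(ArgumentError("expected $(length(columns)) values, got $(length(row))"))
    for (i, value) in enumerate(row)
        try
            # Each column's own push! converts the value to its element type.
            push!(columns[i], value)
        catch
            # Undo the values already appended so all columns keep the same length.
            for j in 1:(i - 1)
                pop!(columns[j])
            end
            throw(ArgumentError("Error adding $(repr(value)) to column $i."))
        end
    end
    return columns
end

cols = Any[Float64[1.5, 2.5], String["a", "b"]]
push_row!(cols, (3, "c"))   # the Int 3 is converted to Float64 by push!
```

The point of the design is that a failed conversion in a later column leaves no half-written row behind: every earlier column is popped back to its previous length before the error is rethrown.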
> >
> >
> > On Monday, June 9, 2014 11:14:24 PM UTC-4, Gustavo Lacerda wrote:
> >>
> >> OK, but first I want to make it work for heterogeneous lists (tuples),
> >> which is mysteriously failing.
> >>
> >> Gustavo
> >>
> >>
> >> On Monday, June 9, 2014, John Myles White <[email protected]> wrote:
> >> > Would be good to clean this up by removing some of the slow parts (map
> >> > usage, anonymous function usage) and have it submitted as a PR.
> >> > — John
> >> >
> >> > On Jun 9, 2014, at 1:17 PM, Keith Campbell <[email protected]> wrote:
> >> >
> >> > Thanks for putting this together.
> >> > Under 0.3 pre from yesterday, I get a deprecation warning in the Array
> >> > version where df2 is assigned. The tweak below appears to resolve that
> >> > warning:
> >> > function push!(df::DataFrame, arr::Array)
> >> >     K = length(arr)
> >> >     assert(size(df,2)==K)
> >> >     col_types = map(eltype, eachcol(df))
> >> >     converted = map(i -> convert(col_types[i][1], arr[i]), 1:K)
> >> >     ## To do: throw error if convert fails
> >> >     df2 = convert( DataFrame, reshape(converted, 1, K) ) # <==tweaked
> >> >     names!(df2, names(df))
> >> >     append!(df,df2)
> >> > end
> >> >
> >> > On Monday, June 9, 2014 3:44:28 PM UTC-4, Gustavo Lacerda wrote:
> >> >
> >> > I've implemented this:
> >> >
> >> > function push!(df::DataFrame, arr::Array)
> >> >     K = length(arr)
> >> >     assert(size(df,2)==K)
> >> >     col_types = map(eltype, eachcol(df))
> >> >     converted = map(i -> convert(col_types[i][1], arr[i]), 1:K)
> >> >     ## To do: throw error if convert fails
> >> >     df2 = DataFrame(reshape(converted, 1, K))
> >> >     names!(df2, names(df))
> >> >     append!(df,df2)
> >> > end
> >> > X1 = rand(Normal(0,1), 10)
> >> > X2 = rand(Normal(0,1), 10)
> >> > X3 = rand(Normal(0,1), 10)
> >> > Y = X1 - X2 + rand(Normal(0,1), 10)
> >> > df = DataFrame(Y=Y, X1=X1, X2=X2, X3=X3)
> >> > push!(df, [1,2,3,4])
> >> >
> >> > I tried to generalize it by replacing Array with Tuple.
> >> >
> >> > function push!(df::DataFrame, tup::Tuple)
> >> >     K = length(tup)
> >> >     assert(size(df,2)==K)
> >> >     col_types = map(eltype, eachcol(df))
> >> >     converted = map(i -> convert(col_types[i][1], tup[i]), 1:K)
> >> >     ## To do: throw error if convert fails
> >> >     df2 = DataFrame(reshape(converted, 1, K))
> >> >     names!(df2, names(df))
> >> >     append!(df,df2)
> >> > end
> >> > julia> df[:greeting] = "hello"
> >> > "hello"
> >> > julia> df
> >> > 11x5 DataFrame
> >> > |-------|-----------|-------------|-----------|------------|----------|
> >> > | Row # | Y         | X1          | X2        | X3         | greeting |
> >> > | 1     | 0.39624   | 0.163897    | -0.146526 | 0.592489   | "hello"  |
> >> > | 2     | -0.236239 | -1.81627    | -0.726978 | 0.638524   | "hello"  |
> >> > | 3     | -0.801656 | 0.000801096 | 0.543645  | -0.997613  | "hello"  |
> >> > | 4     | -0.30888  | -0.166953   | 0.640827  | 1.53217    | "hello"  |
> >> > | 5     | -0.662719 | -1.38129    | -0.194937 | 0.928446   | "hello"  |
> >> > | 6     | 4.37102   | 2.22107     | -2.15648  | -0.703392  | "hello"  |
> >> > | 7     | 0.0866397 | -0.633333   | -0.745456 | -0.0144429 | "hello"  |
> >> > | 8     | 0.581942  | 1.24061     | -0.867256 | 0.283671   | "hello"  |
> >> > | 9     | -3.15614  | -1.39045    | 1.34395   | 0.343224   | "hello"  |
> >> > | 10    | -1.67029  | 0.634846    | 2.08062   | -0.845479  | "hello"  |
> >> > | 11    | 1.0       | 2.0         | 3.0       | 4.0        | "hello"  |
> >> >
> >> > But then this happens:
> >> > julia> push!(df, (1,2,3,4, "hi"))
> >> > ERROR: no method convert(Type{Float64}, ASCIIString)
> >> > in setindex! at array.jl:305
> >> > in map_range_to! at range.jl:523
> >> > in map at range.jl:534
> >> > in push! at none:5
> >> >
> >> > It apparently tries to convert "hi" to Float64, even though the 5th type
> >> > is ASCIIString:
> >> > julia> col_types
> >> > 1x5 DataFrame
> >> > |-------|---------|---------|---------|---------|-------------|
> >> > | Row # | Y       | X1      | X2      | X3      | label       |
> >> > | 1     | Float64 | Float64 | Float64 | Float64 | ASCIIString |
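
A note on the traceback above: the failing call is `setindex!` inside `map_range_to!`, which suggests (this is a reading of the trace, not something confirmed in this thread) that `map` over `1:K` stored its results in an array typed from the first converted value, Float64, so storing "hi" triggered a second conversion. One way to sidestep that is to convert into a `Vector{Any}`. The sketch below assumes a current Julia release, where `String` stands in for `ASCIIString`; the tuple `col_types` is a simplified stand-in for the one-row DataFrame shown above.

```julia
# Column element types and the row, mirroring the example above.
# String is used here in place of ASCIIString (an assumption for current Julia).
col_types = (Float64, Float64, Float64, Float64, String)
row = (1, 2, 3, 4, "hi")

# Convert each value to its own column's type and store the results in a
# Vector{Any}, so no single element type is imposed on the converted row.
converted = Vector{Any}(undef, length(row))
for i in eachindex(converted)
    converted[i] = convert(col_types[i], row[i])
end
```

Pushing each value straight into its column, as in the iterable version near the top of this thread, avoids building the intermediate row at all.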
> >> >
> >> > Gustavo
> >> > P.S. Should the code go here?
> >>
> >> --
> >> Gustavo Lacerda
> >> http://www.optimizelife.com
>