So, the other argument is that, if the types fit, why not make it easy to append data to a DataFrame via any iterable? Constructing a DataFrame just to append it to another DataFrame and throw it away seems wasteful, especially since a new array is allocated for each column, and (I think) each array allocates space for 16 elements. That means we're allocating and throwing away, e.g., 128 bytes per Float64 column, just so we can append one number to the column.
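For comparison, here is a minimal sketch of the two routes. It assumes a recent DataFrames.jl release (1.x), not the 2014 version discussed in this thread; in those releases push! accepts a tuple or NamedTuple as a row. The column names are borrowed from the original question, and the 16-slot figure is only the guess stated above, not something this code checks.

```julia
using DataFrames

# Empty frame with typed columns, mirroring the columns in the original question.
measures = DataFrame(Mean = Float64[], StdDev = Float64[], Category = Symbol[])

# Route 1: build a throwaway one-row DataFrame, then append it.
append!(measures, DataFrame(Mean = [1.0], StdDev = [0.1], Category = [:Fake]))

# Route 2: push the row directly as a tuple (matched by position) ...
push!(measures, (2.0, 0.2, :Other))

# ... or as a NamedTuple (matched by column name).
push!(measures, (Mean = 3.0, StdDev = 0.3, Category = :Third))

# The arithmetic from the paragraph above: 16 Float64 slots take 128 bytes.
# That each new array reserves 16 slots is a guess, not verified here.
bytes_per_column = 16 * sizeof(Float64)
```

Route 2 avoids constructing the intermediate DataFrame, which is the cost the paragraph above objects to.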
If we had a separate type for DataFrame rows, on the other hand...

Cheers,
   Kevin

On Monday, May 26, 2014, John Myles White <[email protected]> wrote:

> I’m not really opposed to it, but I’m also not super excited about it.
> It’s a redundant and non-obvious interface: I’ve seen people try to use
> both vectors and 1-row matrices to do this. That suggests to me there’s no
> clear right answer, so picking one way arbitrarily (appending only
> DataFrames to DataFrames) is pretty reasonable.
>
> -- John
>
> On May 26, 2014, at 8:14 PM, Kevin Squire <[email protected]> wrote:
>
> It shouldn't be that hard to make the array version work. I might give it
> a shot, unless that isn't desired.
>
> Kevin
>
> On Monday, May 26, 2014, Jason Solack <[email protected]> wrote:
>
>> this works for me:
>>
>> dfA = DataFrame(A=[1:10], B=[11:20])
>> dfB = DataFrame(A=11, B=21)
>> append!(dfA, dfB)
>>
>> On Monday, May 26, 2014 11:59:28 AM UTC-4, Tomas Lycken wrote:
>>>
>>> I'm probably just being incredibly daft, but I can't figure out how to
>>> add a new row to a DataFrame.
>>>
>>> Basically, I have a bunch of data sets for which I want to perform some
>>> calculations - let's say the mean and standard deviation of something -
>>> each dataset corresponding to some named category of data. So I do the
>>> following to construct my new DataFrame
>>>
>>> julia> measures = DataFrame()
>>> julia> measures[:Mean] = Float64[]
>>> julia> measures[:StdDev] = Float64[]
>>> julia> measures[:Category] = Symbol[]
>>>
>>> Now, I want to add some values that are the results of a calculation on
>>> a different data set, and I try this:
>>>
>>> julia> push!(psispread, [1.0,0.1,:Fake])
>>> ERROR: no method push!(DataFrame, Array{Any,1})
>>> julia> append!(psispread, [1.0,0.1,:Fake])
>>> ERROR: no method append!(DataFrame, Array{Any,1})
>>> julia> psispread[1,:] = [1.0,0.1,:Fake]
>>> ERROR: BoundsError()
>>>  in setindex! at /home/tlycken/.julia/v0.3/DataArrays/src/dataarray.jl:764
>>>  in insert_single_entry! at /home/tlycken/.julia/v0.3/DataFrames/src/dataframe/dataframe.jl:410
>>>  in setindex! at /home/tlycken/.julia/v0.3/DataFrames/src/dataframe/dataframe.jl:521
>>>
>>> Is there a nice and simple way to add a row to a DataFrame without
>>> having to do it one value at a time?
>>>
>>> // T
>>>
>>
>
