Re: [julia-users] Dataframe readtable change?

Stefan Karpinski Thu, 22 May 2014 13:03:49 -0700

For what it's worth, I was much happier when dataframes showed their
contents rather than a summary. I must have missed the discussion where
that decision was made (ditto for all the extra ASCII chrome when
displaying data frames these days).



On Thu, May 22, 2014 at 3:01 PM, John Myles White
<[email protected]> wrote:

> Nobody had time to integrate it anywhere. A pull request would help move
> things forward.
>
>  -- John
>
> On May 22, 2014, at 11:57 AM, Bob Nnamtrop <[email protected]> wrote:
>
> OK. Thanks. That is helpful.
>
> Any reason why that page is not shown in the documentation linked from
> the front page?
>
>
> On Thu, May 22, 2014 at 11:46 AM, John Myles White <
> [email protected]> wrote:
>
>> head and tail don't actually print anything: they just give you a subset
>> of a DataFrame. So you're seeing the usual show method's output, which can
>> be overridden by explicitly requesting that you see the whole DataFrame. See
>>
>> https://github.com/JuliaStats/DataFrames.jl/blob/master/spec/show.md
>>
>>  -- John
>>
>> On May 22, 2014, at 10:44 AM, Bob Nnamtrop <[email protected]>
>> wrote:
>>
>>  An issue I noticed with Dataframes recently is that head(df) and
>> tail(df) both list the show(df) summary (like those above) instead of
>> listing the top and bottom of the dataframe. I just started using
>> dataframes so I have no idea what they did in the past but it seems they
>> should list the df and not the summary.
>>
>> Also, are there any other handy ways to list the df in the repl?
>>
>> Bob
>>
>>
>> On Thu, May 22, 2014 at 11:39 AM, Rob J. Goedman <[email protected]> wrote:
>>
>>> Thanks John.
>>>
>>> I should have filed it as an issue on DataFrames.jl but initially
>>> thought it could be deeper than that.
>>>
>>> For now in Stan.jl I've included a 'small' cleanup step. Small for say
>>> 1000 samples, a bit bigger for 100000 samples.
>>>
>>> Like you mentioned earlier, for years I've been using
>>> file-out-file-in-communication for Jags and other programs (Finite
>>> Elements) and was quite ok with it because sampling and FE iterations
>>> dominated the time to complete.
>>>
>>> FOFI really only became an issue when I had to adjust values in between
>>> each of hundreds of runs (e.g. a stiffness matrix in FEM when dealing with
>>> buckling).
>>>
>>>  Rob J. Goedman
>>> [email protected]
>>>
>>>
>>>
>>>
>>> On May 22, 2014, at 10:16 AM, John Myles White <[email protected]>
>>> wrote:
>>>
>>> I need to find time to look into this, but could someone try a git
>>> bisect and see if some of the metaprogramming changes we made to readtable
>>> caused this? It might be that this file would have never worked, but if it
>>> once did, it would be good to point out the problematic code.
>>>
>>>  — John
>>>
>>> On May 20, 2014, at 7:53 PM, Rob J. Goedman <[email protected]> wrote:
>>>
>>> Actually, another way to make it work is removing the blank line. The
>>> little program below shows that readtable() accepts test_df1 and test_df2,
>>> but fails on test_df3.
>>>
>>> Also, the fact that it started to happen today had nothing to do with
>>> Julia or DataFrame updates. The file is created by Stan and the latest
>>> version inserts that blank line.
>>>
>>> Of course I could clean up the file, but maybe this is an issue in
>>> DataFrame's readtable function?
>>>
>>> Apologies for the earlier incomplete report.
>>>
>>>  Rob J. Goedman
>>> [email protected]
>>>
>>>
>>>  <test_df.jl><test_df1.csv>
>>> <test_df2.csv>
>>> <test_df3.csv>
>>>
>>>
>>> julia> include("/Users/rob/.julia/v0.3/MCMCExampleRepository/test/test_df.jl")
>>> 4x10 DataFrame
>>> |-------|---------------|---------|---------|
>>> | Col # | Name          | Eltype  | Missing |
>>> | 1     | lp__          | Float64 | 0       |
>>> | 2     | accept_stat__ | Float64 | 0       |
>>> | 3     | stepsize__    | Float64 | 0       |
>>> | 4     | treedepth__   | Int64   | 0       |
>>> | 5     | n_leapfrog__  | Int64   | 0       |
>>> | 6     | n_divergent__ | Int64   | 0       |
>>> | 7     | beta_1        | Float64 | 0       |
>>> | 8     | beta_2        | Float64 | 0       |
>>> | 9     | beta_3        | Float64 | 0       |
>>> | 10    | sigma         | Float64 | 0       |
>>>
>>> 4x10 DataFrame
>>> |-------|---------------|---------|---------|
>>> | Col # | Name          | Eltype  | Missing |
>>> | 1     | lp__          | Float64 | 0       |
>>> | 2     | accept_stat__ | Float64 | 0       |
>>> | 3     | stepsize__    | Float64 | 0       |
>>> | 4     | treedepth__   | Int64   | 0       |
>>> | 5     | n_leapfrog__  | Int64   | 0       |
>>> | 6     | n_divergent__ | Int64   | 0       |
>>> | 7     | beta_1        | Float64 | 0       |
>>> | 8     | beta_2        | Float64 | 0       |
>>> | 9     | beta_3        | Float64 | 0       |
>>> | 10    | sigma         | Float64 | 0       |
>>>
>>> ERROR: BoundsError()
>>>  in findcorruption at /Users/rob/.julia/v0.3/DataFrames/src/dataframe/io.jl:663
>>>  in readtable! at /Users/rob/.julia/v0.3/DataFrames/src/dataframe/io.jl:731
>>>  in readtable at /Users/rob/.julia/v0.3/DataFrames/src/dataframe/io.jl:812
>>>  in readtable at /Users/rob/.julia/v0.3/DataFrames/src/dataframe/io.jl:879
>>>  in include at boot.jl:244
>>> while loading /Users/rob/.julia/v0.3/MCMCExampleRepository/test/test_df.jl, in expression starting on line 11
>>>
>>> julia>
>>>
>>>
>>> On May 20, 2014, at 6:36 PM, Rob J. Goedman <[email protected]> wrote:
>>>
>>> Hi,
>>>
>>> Using a freshly updated Version 0.3.0-prerelease+3251 (2014-05-20 23:18
>>> UTC) of Julia I think I noticed a different behavior of readtable(),
>>> which I hope is not intended.
>>>
>>> I have a small test file with data as shown below (and attached as a
>>> file at the end of the email):
>>>
>>> lp__,accept_stat__,stepsize__,treedepth__,n_leapfrog__,n_divergent__,mu
>>> # Adaptation terminated
>>>
>>> -19.8871,0.975123,0.303529,4,15,0,4.25051
>>> -22.1208,0.971631,0.303529,3,7,0,8.55276
>>> -23.8336,0.857954,0.303529,4,15,0,4.41087
>>>
>>> If I remove the commented line ("# Adaptation terminated"), readtable()
>>> has no problem, but if it's there readtable() seems to ignore the
>>> 'allowcomments=true'.
>>>
>>> I didn't update DataFrames as far as I am aware, but once or twice today
>>> I did pull Julia's master from github.
>>>
>>> I wonder if someone could try this example. Thanks a lot.
>>>
>>> Rob J. Goedman
>>> [email protected]
>>>
>>>
>>> julia> df = readtable("schools8_samples.csv", allowcomments=true)
>>> ERROR: Saw 4 rows, 5 columns and 22 fields
>>>  * Line 1 has 3 columns
>>>
>>>  in error at error.jl:21
>>>  in findcorruption at /Users/rob/.julia/v0.3/DataFrames/src/dataframe/io.jl:680
>>>  in readtable! at /Users/rob/.julia/v0.3/DataFrames/src/dataframe/io.jl:731
>>>  in readtable at /Users/rob/.julia/v0.3/DataFrames/src/dataframe/io.jl:812
>>>  in readtable at /Users/rob/.julia/v0.3/DataFrames/src/dataframe/io.jl:879
>>>
>>> julia> df = readtable("schools8_samples.csv", allowcomments=true)
>>> 3x7 DataFrame
>>> |-------|---------------|---------|---------|
>>> | Col # | Name          | Eltype  | Missing |
>>> | 1     | lp__          | Float64 | 0       |
>>> | 2     | accept_stat__ | Float64 | 0       |
>>> | 3     | stepsize__    | Float64 | 0       |
>>> | 4     | treedepth__   | Int64   | 0       |
>>> | 5     | n_leapfrog__  | Int64   | 0       |
>>> | 6     | n_divergent__ | Int64   | 0       |
>>> | 7     | mu            | Float64 | 0       |
>>>
>>>
>>> <schools8_samples.csv>
>>>
>>>
>>>
>>>
>>>
>>
>>
>
>
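
For reference, the kind of cleanup step mentioned in the thread (dropping
the blank line and the "# Adaptation terminated" line before the file
reaches readtable()) might look roughly like the sketch below. It is not
the code used in Stan.jl; the function name and its arguments are
hypothetical, and it uses only Base Julia in current syntax rather than the
0.3-era syntax of the thread. It assumes comments occupy whole lines and
does not touch trailing comments or quoted fields.

```julia
# Hypothetical helper (not part of DataFrames.jl or Stan.jl): copy a CSV
# file to a new path, skipping blank lines and whole-line comments.
function strip_blank_and_comment_lines(src::AbstractString, dest::AbstractString;
                                       commentmark::AbstractString = "#")
    open(dest, "w") do out
        for line in eachline(src)
            text = strip(line)
            # Skip empty lines and lines that start with the comment mark.
            if isempty(text) || startswith(text, commentmark)
                continue
            end
            println(out, line)
        end
    end
    return dest
end
```

The cleaned copy can then be passed to readtable() in place of the original
file. Writing a second file costs one extra pass over the data, which
matches the "small for 1000 samples, a bit bigger for 100000" trade-off
described in the thread.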
