[julia-users] Re: Some DataFrames questions

David Gold Wed, 20 May 2015 08:33:26 -0700

Re #1: Have you looked into the experimental DataFramesMeta.jl package?
https://github.com/JuliaStats/DataFramesMeta.jl

I'm not sure it covers your case, but it may help. See issue #13 in
particular: https://github.com/JuliaStats/DataFramesMeta.jl/issues/13
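
On the mask itself: the attempts in the quoted question likely come down to
operator precedence rather than anything in DataFrames. In Julia, `&` has the
precedence of `*`, and `+` binds tighter than `.==`, so `1 & df[:B]` and
`1 + df[:B]` are evaluated before either comparison. That would explain why
the sum version reports false for the first row, since it ends up comparing
df[:A] against 1 + df[:B]. Parenthesizing each comparison avoids this. Below
is a minimal sketch on plain vectors, assuming a current Julia (1.x) where the
elementwise AND is spelled `.&`; on the Julia version used in this thread,
plain `&` between the parenthesized arrays should be the equivalent. The names
`a`, `b`, `mask` and `mask_by_sum` are illustrative only.

```julia
# Illustrative stand-ins for the columns df[:A] and df[:B] from the question.
a = [1, 2, 3]
b = [1, 2, 3]

# Parenthesize each comparison so it is evaluated before the AND.
# `.&` is the elementwise AND in Julia 1.x.
mask = (a .== 1) .& (b .== 1)

# Counting the matches per row also works once the comparisons are
# parenthesized: each Bool is promoted to 0 or 1 before the sum.
mask_by_sum = ((a .== 1) .+ (b .== 1)) .== 2

println(mask)
println(mask_by_sum)
```

With a DataFrame, the resulting mask goes in the row position, e.g.
df[mask, :].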

On Wednesday, May 20, 2015 at 11:17:28 AM UTC-4, Nils Gudat wrote:
>
> I have two questions regarding the usage of DataFrame:
>
> 1. How can I subset a DataFrame based on multiple criteria (similar to 
> using np.logical_and with pandas)?
> Consider:
>
> df = DataFrame(A = 1:3, B = 1:3)
>
> How do I get the subset of the DataFrame for which (for simplicity) A and 
> B are 1? df[:A].==1 and df[:B].==1 give me boolean arrays, but I can't find 
> any way of combining them to give me a single boolean mask - things like 
> df[df[:A].==1 & df[:B].==1] won't work, and my first idea of a workaround 
> df[ (df[:A].==1 + df[:B].==1)==2 ] fails as well, as for some reason adding 
> the two boolean arrays gives me false even for the first entry (which 
> should be true+true).
>
> 2. How do I deal with NAs when indexing? Consider:
>
> df = DataFrame(A = 1:3, B = 1:3, C = @data([1,2,NA]))
>
> Here, df[df[:C].==1, :] fails with NAException("cannot index an array with 
> a DataArray containing NA values"). One way around this would be 
> df[array(df[:C].==1, false), :] - is this the "correct" way of doing it or 
> are there other indexing methods that automatically deal with NAs?
>
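
On question 2, the general pattern is to turn the three-valued comparison
result into a plain Bool mask before indexing, which is what
array(df[:C].==1, false) does. Below is a rough sketch of the same idea,
assuming a current Julia (1.x) in which `missing` plays the role that `NA`
plays in this thread. The vector `c` is an illustrative stand-in for the
column df[:C], not the DataArray from the question.

```julia
# Illustrative stand-in for df[:C]; `missing` is used in place of NA.
c = [1, 2, missing]

# The comparison propagates the missing value, so the last entry is missing.
raw = c .== 1

# Replace missing entries with false to get a mask that is safe for indexing.
mask = coalesce.(raw, false)

# isequal never returns missing, so it yields a Bool mask directly.
mask_isequal = isequal.(c, 1)

println(mask)
println(mask_isequal)
```

Either mask can then be used in the row position, e.g. df[mask, :]. Whether
missing rows should count as false is a choice that depends on the analysis.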
