On Tuesday, 24 November 2015 at 19:32 -0800, Arin Basu wrote:
> Thanks a million Milan and Dan. I have learned hugely from the codes 
> you shared and the packages you discussed. There is a need for 
> dedicated biostatistics packages in Julia. For instance, I could not 
> find a dedicated package on regression diagnostics
I don't think this should live in a dedicated biostatistics package.
I'm a social scientist and I would find these tools useful too. Better
to share our work.

> (I tried RegTools but it did not compile for some reason in my 
> machine Mac OSX El Capitan, Julia 0.4.1).
That code is fairly new, and it doesn't look like it's set up to be used as
a package yet. You could file an issue on GitHub against this package,
as it seems to be actively maintained, to tell the author that people are
interested in testing it. Adding a src/RegTools.jl file containing
these lines:

include("diagnostics.jl")
include("misc.jl")
include("modsel.jl")

and including the functions in a module should be enough.
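
As a rough sketch only (not the author's actual layout), such a file
might look like the following. It assumes the three files sit in src/
next to RegTools.jl. The export line is left as a comment because the
names of the functions the package defines are not given here.

```julia
# src/RegTools.jl (hypothetical file; the include names come from the lines above)
module RegTools

# Each file is assumed to live in the same src/ directory as this one.
include("diagnostics.jl")
include("misc.jl")
include("modsel.jl")

# An `export ...` list would go here once the public function names are known.

end # module
```

With that in place, `using RegTools` should load the module, provided the
package directory is somewhere Julia looks for packages.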


Regards

> 
> Best,
> Arin
> 
> On Monday, 23 November 2015 04:53:46 UTC+13, Milan Bouchet-Valat
> wrote:
> > As I noted just a few days ago, I have written a small package to 
> > compute frequency tables from arbitrary arrays, with an optimized 
> > method for pooled data arrays: 
> > https://github.com/nalimilan/FreqTables.jl 
> > 
> > I've just pushed a fix so it should now work on 0.4 (but not with
> > 0.3). 
> > 
> > We could easily add a method taking a DataFrame and symbol names
> > for columns to save some typing. 
> > 
> > 
> > Regards 
> > 
> > On Sunday, 22 November 2015 at 03:26 -0800, Dan wrote: 
> > > Hi Arin, 
> > > It would be helpful to have more details about the input (a 
> > > dataframe?) and output (a two-by-two table or a table indexed by 
> > > categories?). Some code to give context to the question would be
> > > even more helpful (possibly in another language, such as R). 
> > > 
> > > Having said this, here is a starting point for some code: 
> > > 
> > > If these packages are missing, Pkg.add will install them: 
> > > 
> > > using NamedArrays 
> > > using DataFrames 
> > > using RDatasets 
> > > 
> > > Gets the dataset and makes some categorical variables in
> > > DataFrames style: 
> > > 
> > > iris = dataset("datasets","iris") 
> > > iris[:PetalWidth] = PooledDataArray(iris[:PetalWidth]) 
> > > iris[:Species] = PooledDataArray(iris[:Species]) 
> > > 
> > > Define function for a `twobytwo` and a general categorical table 
> > > `crosstable`: 
> > > 
> > > function twobytwo(data::DataFrame,cond1,cond2) 
> > >        nres = NamedArray(zeros(Int,2,2),
> > >                          Any[[false,true],[false,true]],
> > >                          ["cond1","cond2"]) 
> > >        for i=1:nrow(data) 
> > >            nres[Int(cond1(data[i,:]))+1,Int(cond2(data[i,:]))+1] += 1 
> > >        end 
> > >        nres 
> > > end 
> > > 
> > > function crosstable(data::DataFrame,col1,col2) 
> > >        @assert isa(data[col1],PooledDataArray) 
> > >        @assert isa(data[col2],PooledDataArray) 
> > >        nres = NamedArray(zeros(Int,length(data[col1].pool),length(data[col2].pool)),
> > >                          Any[data[col1].pool,data[col2].pool],
> > >                          [col1,col2]) 
> > >        for i=1:nrow(data) 
> > >            nres[data[col1].refs[i],data[col2].refs[i]] += 1 
> > >        end 
> > >        nres 
> > > end 
> > > 
> > > Finally, using the functions, make some tables: 
> > > 
> > > tbt = twobytwo(iris,r->r[1,:Species]=="setosa",r->r[1,:PetalWidth]>=1.5) 
> > > ct = crosstable(iris,:PetalWidth,:Species) 
> > > 
> > > My summary and conclusions: 
> > > 1) Julia is general purpose, and with a little familiarity any data
> > > handling is possible. 
> > > 2) This is a basic data exploration operation and there must be some
> > > easy way to do this. 
> > > 
> > > Waiting for more opinions/solutions on this question, as it is also
> > > basic for my needs. 
> > > 
> > > Thanks for the question. 
> > > 
> > > On Sunday, November 22, 2015 at 3:34:56 AM UTC+2, Arin Basu wrote: 
> > > > Hi All, 
> > > > 
> > > > Can you kindly advise on a simple way to make two-by-two 
> > > > tables in Julia with two categorical variables? I have tried
> > > > split-apply-combine (the by function) and it works with single
> > > > variables, but with two or more variables, I cannot get the
> > > > table I want. 
> > > > 
> > > > This is really an issue if we need to do statistical data analysis
> > > > in Epidemiology. 
> > > > 
> > > > Any help or advice will be greatly appreciated. 
> > > > 
> > > > Arin Basu 
> > > > 
