Thanks a million, Milan and Dan. I have learned a great deal from the code you
shared and the packages you discussed. There is a need for dedicated
biostatistics packages in Julia. For instance, I could not find a dedicated
package for regression diagnostics (I tried RegTools, but for some reason it did
not compile on my machine: Mac OS X El Capitan, Julia 0.4.1).

Best,
Arin

On Monday, 23 November 2015 04:53:46 UTC+13, Milan Bouchet-Valat wrote:
>
> As I noted just a few days ago, I have written a small package to 
> compute frequency tables from arbitrary arrays, with an optimized 
> method for pooled data arrays:
> https://github.com/nalimilan/FreqTables.jl 
>
> I've just pushed a fix so it should now work on 0.4 (but not with 0.3).
>
> We could easily add a method taking a DataFrame and symbol names for 
> columns to save some typing. 
>
>
> Regards 
>
> On Sunday, 22 November 2015 at 03:26 -0800, Dan wrote:
> > Hi Arin, 
> > It would be helpful to have more details about the input (a 
> > dataframe?) and output (a two-by-two table or a table indexed by 
> > categories?). Some code to give context to the question would be even 
> > more helpful (possibly in another language, such as R).
> > 
> > Having said this, here is a starting point for some code: 
> > 
> > If any of these packages are missing, install them with Pkg.add:
> > 
> > using NamedArrays 
> > using DataFrames 
> > using RDatasets 
> > 
> > Get the dataset and make some categorical variables in DataFrames
> > style:
> > 
> > iris = dataset("datasets","iris") 
> > iris[:PetalWidth] = PooledDataArray(iris[:PetalWidth]) 
> > iris[:Species] = PooledDataArray(iris[:Species]) 
> > 
> > Define a function `twobytwo` for a two-by-two table and a function
> > `crosstable` for a general categorical table:
> > 
> > function twobytwo(data::DataFrame,cond1,cond2) 
> >        nres = NamedArray(zeros(Int,2,2),
> >                          Any[[false,true],[false,true]],
> >                          ["cond1","cond2"])
> >        for i=1:nrow(data) 
> >            nres[Int(cond1(data[i,:]))+1,Int(cond2(data[i,:]))+1] += 1 
> >        end 
> >        nres 
> > end 
> > 
> > function crosstable(data::DataFrame,col1,col2) 
> >        @assert isa(data[col1],PooledDataArray) 
> >        @assert isa(data[col2],PooledDataArray) 
> >        nres = NamedArray(zeros(Int,length(data[col1].pool),length(data[col2].pool)),
> >                          Any[data[col1].pool,data[col2].pool],
> >                          [col1,col2])
> >        for i=1:nrow(data) 
> >            nres[data[col1].refs[i],data[col2].refs[i]] += 1 
> >        end 
> >        nres 
> > end 
> > 
> > Finally, using the functions, make some tables: 
> > 
> > tbt = twobytwo(iris,r->r[1,:Species]=="setosa",
> >                r->r[1,:PetalWidth]>=1.5)
> > ct = crosstable(iris,:PetalWidth,:Species) 
> > 
> > My summary and conclusions: 
> > 1) Julia is general purpose and with a little familiarity any data 
> > handling is possible. 
> > 2) This is a basic data exploration operation and there must be some 
> > easy way to do this. 
> > 
> > Waiting for more opinions/solutions on this question, as it is also 
> > basic for my needs. 
> > 
> > Thanks for the question. 
> > 
> > On Sunday, November 22, 2015 at 3:34:56 AM UTC+2, Arin Basu wrote: 
> > > Hi All, 
> > > 
> > > Can you kindly advise on a simple way to build two-by-two tables
> > > in Julia from two categorical variables? I have tried
> > > split-apply-combine (the `by` function) and it works with single
> > > variables, but with two or more variables I cannot get the table I want.
> > > 
> > > This is really an issue if we need to do statistical data analysis 
> > > in Epidemiology. 
> > > 
> > > Any help or advice will be greatly appreciated. 
> > > 
> > > Arin Basu 
> > > 
>
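
For reference, the counting idea behind the `crosstable` function quoted above can be sketched without any packages. This is only a minimal illustration under my own assumptions: the function name `paircounts` and the toy vectors `exposure` and `outcome` are hypothetical and do not come from FreqTables, NamedArrays, or DataFrames.

```julia
# Hypothetical helper (not part of any package): count how often each
# (x, y) pair occurs in two equally long categorical vectors.
function paircounts(x, y)
    @assert length(x) == length(y)
    counts = Dict{Any,Int}()
    for (a, b) in zip(x, y)
        key = (a, b)
        counts[key] = get(counts, key, 0) + 1
    end
    counts
end

# Toy data, made up purely for illustration.
exposure = ["yes", "yes", "no", "no", "yes"]
outcome  = ["case", "control", "case", "control", "case"]

tab = paircounts(exposure, outcome)

# Print the counts in a two-by-two layout (rows: exposure, columns: outcome).
for e in ["yes", "no"]
    println(e, "\t", get(tab, (e, "case"), 0), "\t", get(tab, (e, "control"), 0))
end
```

A dictionary keyed by pairs avoids needing the category pools up front, at the cost of not giving a labelled array the way NamedArrays does.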
