I think the best way to make this happen would be to implement a DDataArray 
type and then build a DDataFrame type on top of that.

 — John

On May 1, 2014, at 5:15 PM, Jason Rudy <[email protected]> wrote:

> Hi Julia community,
> 
> I'm curious about Julia and am also a serious user of multivariate adaptive 
> regression splines (MARS) in both R and Python.  I'd very much like to have a 
> multiprocessing implementation of MARS, and I'm looking into using Julia to 
> build one.  Does anyone have advice on which basic packages I should look 
> into as dependencies?  It seems that for large parallel computations I 
> would need to use a distributed array.  However, there are 
> some extra features (such as categorical variables and missing values) that 
> may already be implemented in a standard way in the DataFrames package.  As I 
> see it, I can't use DataFrames if I want multiprocessing (unless I start by 
> copying the data into a distributed array), and therefore should just build 
> on distributed arrays and write an adapter later.  Is my impression accurate, 
> and do you have any other advice for someone attempting this in Julia?
> 
> Best,
> 
> Jason
