Hi Julia community,

I'm curious about Julia, and I'm a serious user of multivariate adaptive regression splines (MARS) in both R and Python. I'd very much like to have a multiprocessing implementation of MARS, and I'm looking into using Julia to build one. Does anyone have advice on which basic packages I should consider as dependencies?

It seems that for large parallel computations I would need to use a distributed array. However, some extra features, such as categorical variables and missing values, may already be implemented in a standard way in the DataFrames package. As I see it, I can't use DataFrames if I want multiprocessing (unless I start by copying the data into a distributed array), so I should build on distributed arrays and write an adapter later.

Is my impression accurate, and is there any other advice you might have for someone attempting this with Julia?
Best, Jason
