Thanks for the response.

Since you kindly asked, here are the two main questions in our assessment 
of the general direction of the Julia ecosystem:

1. Will the roadmap remove some of the bottlenecks in a normal day-to-day 
exploratory workflow? These are basic things that R and Python have and 
whose absence hampers any use of Julia for regular analysis: things like 
robust dataframes with data I/O for different formats, web scraping, 
settled nullable semantics that are integrated with the ecosystem, robust 
data cleaning and tidy data, modeling with basic diagnostic tests, etc.

2. Will the roadmap leapfrog into areas and capabilities that are 
currently not covered by other stats and data science ecosystems?

There are many here, but we are specifically looking at the ability to 
do modeling on medium-sized out-of-core databases. This would include an 
abstract dataframe-like interface to such databases (MySQL and SQLite), 
and some sort of modeling capability on top of it. My dream would be a 
separation of model specification, as a DAG / probabilistic programming 
framework, from model fitting. The same model could then be fit with 
different sorts of data and optimizers. Streaming black-box variational 
inference could be a means to extend this to out-of-core (OOC) work.
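
To make the separation concrete, here is a minimal sketch, written in Python rather than Julia purely for illustration. Every name in it (`LinearModel`, `fit_batch`, `fit_streaming`, `read_chunks`) is hypothetical and not an existing API. It also swaps in plain gradient descent on a squared-error loss for variational inference, and an in-memory list read in chunks for a real database cursor. The only point is that one model specification can be handed unchanged to an in-memory fitter and to a chunk-at-a-time fitter.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Sequence, Tuple

Row = Tuple[float, float]  # one observation: (x, y)


@dataclass(frozen=True)
class LinearModel:
    """Hypothetical model specification: describes the model, not how to fit it."""

    n_params: int = 2  # slope and intercept

    def predict(self, params: Sequence[float], x: float) -> float:
        return params[0] * x + params[1]

    def grad(self, params: Sequence[float], row: Row) -> List[float]:
        # Gradient of the squared error (predict(x) - y)**2 for a single row.
        x, y = row
        err = self.predict(params, x) - y
        return [2.0 * err * x, 2.0 * err]


def mean_grad(model: LinearModel, params: Sequence[float],
              rows: Sequence[Row]) -> List[float]:
    # Average the per-row gradients over whatever rows are currently in memory.
    total = [0.0] * model.n_params
    for row in rows:
        for j, g in enumerate(model.grad(params, row)):
            total[j] += g
    return [t / len(rows) for t in total]


def fit_batch(model: LinearModel, rows: Sequence[Row],
              steps: int = 2000, lr: float = 0.1) -> List[float]:
    # In-memory fitter: full-gradient descent over all rows at once.
    params = [0.0] * model.n_params
    for _ in range(steps):
        g = mean_grad(model, params, rows)
        params = [p - lr * gj for p, gj in zip(params, g)]
    return params


def fit_streaming(model: LinearModel,
                  read_chunks: Callable[[], Iterable[Sequence[Row]]],
                  epochs: int = 2000, lr: float = 0.1) -> List[float]:
    # Out-of-core style fitter: only one chunk is needed in memory per update.
    params = [0.0] * model.n_params
    for _ in range(epochs):
        for chunk in read_chunks():
            g = mean_grad(model, params, chunk)
            params = [p - lr * gj for p, gj in zip(params, g)]
    return params


# Toy, noise-free data generated from y = 2x + 1 (an assumption for this demo).
rows: List[Row] = [(i / 4, 2.0 * (i / 4) + 1.0) for i in range(5)]


def read_chunks(size: int = 2) -> Iterable[Sequence[Row]]:
    # Stand-in for paging through a database table a few rows at a time.
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


model = LinearModel()
batch_params = fit_batch(model, rows)
stream_params = fit_streaming(model, read_chunks)
print(batch_params, stream_params)
```

Both fitters receive the same `model` object and differ only in how they see the data, which is the property I am after. A real framework would replace the hand-written gradient with something derived from the model graph.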

I realize Julia won't have all the statistical tests and random models of 
Python, much less R, for a while. However, a general yet powerful and 
scalable data querying and probabilistic programming framework could 
arguably suffice for most Python and R use cases in data science while 
providing a comparative advantage over other frameworks where it counts. 
To my knowledge, SAS and Stata are currently the only packages that offer 
general modeling with on-disk data sets, but the sort of capability I 
outlined would seem to exceed what they offer.

A bonus would be filling out Gadfly towards ggplot and ggvis capability.
 


On Thursday, December 24, 2015 at 11:50:42 AM UTC-5, Viral Shah wrote:
>
> What would be helpful is to know what kind of decisions you are thinking 
> of and what are the factors. 
>
> I suspect within 2 weeks for sure - but it's really for the Julia stats 
> folks to say. The idea is to get feedback and chart a course.
>
> -viral
> On 24 Dec 2015 10:07 p.m., "Lampkld" <[email protected]> wrote:
>
>> Sorry to bug you, but can we expect something  this or next week?  Would 
>> be helpful in knowing until when to push some stuff off. 
>>
>> On Thursday, December 17, 2015 at 6:20:45 PM UTC-5, Viral Shah wrote:
>>>
>>>
>>> The JuliaStats team will be publishing a general plan on stats+df in a 
>>> few days. I doubt we will have settled on all the df issues by then, but at 
>>> least there will be something to start with. 
>>>
>>>
>>> -viral 
>>>
>>>
>>>
>>> > On 17-Dec-2015, at 10:15 PM, Lampkld <[email protected]> wrote: 
>>> > 
>>> > Hi Viral, 
>>> > 
>>> > Any update on this (stats + df) by chance or idea when we can get one? 
>>> > Even a roadmap or some sort of vision or other details would help with 
>>> > decision making regarding infrastructure. 
>>> > 
>>> > Thanks! 
>>> > 
>>> > On Wednesday, November 11, 2015 at 3:00:50 AM UTC-5, Viral Shah wrote: 
>>> > Yes, we are really excited. This grant is to focus on core Julia 
>>> > compiler infrastructure and key math libraries. Much of the libraries focus 
>>> > will be on statistical Computing. 
>>> > -viral 
>>> > 
>>>
>>>
