Re: Question about Data Sources API
> My question wrt Java/Scala was related to extending the classes to support
> new custom data sources, so was wondering if those could be written in
> Java, since our company is a Java shop.

Yes, you should be able to extend the required interfaces using Java.

> The additional push downs I am looking for are aggregations with grouping
> and sorting. Essentially, I am trying to evaluate if this API can give me
> much of what is possible with the Apache MetaModel project.

We don't push those down today, as our initial focus is on getting data into
Spark so that you can join with other sources and then do such processing.
It's possible we will extend the pushdown API in the future, though.
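Michael confirms above that the required interfaces can be extended from Java. As a rough sketch of what that looks like against the Spark 1.3 `org.apache.spark.sql.sources` API (the class names `DefaultSource` and `MyRelation`, the two-column schema, and the option handling here are all invented for illustration, not taken from the thread):

```java
// A minimal custom data source written in Java against the Spark 1.3
// org.apache.spark.sql.sources API. Requires spark-sql 1.3 on the
// classpath; names and schema are illustrative only.
import org.apache.spark.rdd.RDD;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SQLContext;
import org.apache.spark.sql.sources.BaseRelation;
import org.apache.spark.sql.sources.Filter;
import org.apache.spark.sql.sources.PrunedFilteredScan;
import org.apache.spark.sql.sources.RelationProvider;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructField;
import org.apache.spark.sql.types.StructType;

public class DefaultSource implements RelationProvider {
  // One Java wrinkle: the options arrive as a Scala immutable Map,
  // not a java.util.Map, since the trait is defined in Scala.
  @Override
  public BaseRelation createRelation(
      SQLContext sqlContext,
      scala.collection.immutable.Map<String, String> parameters) {
    return new MyRelation(sqlContext);
  }
}

class MyRelation extends BaseRelation implements PrunedFilteredScan {
  private final SQLContext ctx;

  MyRelation(SQLContext ctx) { this.ctx = ctx; }

  @Override
  public SQLContext sqlContext() { return ctx; }

  @Override
  public StructType schema() {
    return DataTypes.createStructType(new StructField[] {
        DataTypes.createStructField("name", DataTypes.StringType, true),
        DataTypes.createStructField("age", DataTypes.IntegerType, true)
    });
  }

  // Spark hands us the projected columns and the filters it can express.
  // A real implementation would fetch only requiredColumns and apply the
  // filters at the source before returning rows.
  @Override
  public RDD<Row> buildScan(String[] requiredColumns, Filter[] filters) {
    throw new UnsupportedOperationException("sketch only");
  }
}
```

A source like this would then be registered by its package name via `sqlContext.load(...)` or a `CREATE TEMPORARY TABLE ... USING` clause.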
Re: Question about Data Sources API
Hello Michael,

Thanks for your quick reply. My question wrt Java/Scala was related to
extending the classes to support new custom data sources, so was wondering
if those could be written in Java, since our company is a Java shop.

The additional push downs I am looking for are aggregations with grouping
and sorting. Essentially, I am trying to evaluate if this API can give me
much of what is possible with the Apache MetaModel project.

Regards,
Ashish

On Tue, Mar 24, 2015 at 1:57 PM, Michael Armbrust wrote:

> On Tue, Mar 24, 2015 at 12:57 AM, Ashish Mukherjee
> <ashish.mukher...@gmail.com> wrote:
>
>> 1. Is the Data Source API stable as of Spark 1.3.0?
>
> It is marked DeveloperApi, but in general we do not plan to change even
> these APIs unless there is a very compelling reason to.
>
>> 2. The Data Source API seems to be available only in Scala. Is there any
>> plan to make it available for Java too?
>
> We tried to make all the suggested interfaces (other than CatalystScan,
> which exposes internals and is only for experimentation) usable from Java.
> Is there something in particular you are having trouble with?
>
>> 3. Are only filters and projections pushed down to the data source, and
>> all the data pulled into Spark for other processing?
>
> For now, this is all that is provided by the public stable API. We left a
> hook for more powerful push downs
> (sqlContext.experimental.extraStrategies), and would be interested in
> feedback on other operations we should push down as we expand the API.
Re: Question about Data Sources API
On Tue, Mar 24, 2015 at 12:57 AM, Ashish Mukherjee
<ashish.mukher...@gmail.com> wrote:

> 1. Is the Data Source API stable as of Spark 1.3.0?

It is marked DeveloperApi, but in general we do not plan to change even
these APIs unless there is a very compelling reason to.

> 2. The Data Source API seems to be available only in Scala. Is there any
> plan to make it available for Java too?

We tried to make all the suggested interfaces (other than CatalystScan,
which exposes internals and is only for experimentation) usable from Java.
Is there something in particular you are having trouble with?

> 3. Are only filters and projections pushed down to the data source, and
> all the data pulled into Spark for other processing?

For now, this is all that is provided by the public stable API. We left a
hook for more powerful push downs (sqlContext.experimental.extraStrategies),
and would be interested in feedback on other operations we should push down
as we expand the API.
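To make the filter pushdown mentioned in answer 3 concrete: a data source typically translates the filters Spark hands to `buildScan` into its own query language, for example a SQL WHERE fragment for a JDBC-style source. The sketch below shows that translation step as self-contained Java; the `Filter`, `EqualTo`, and `GreaterThan` classes here are simplified stand-ins that only mirror the shape of Spark's real `org.apache.spark.sql.sources` filter hierarchy, so the logic is visible without a Spark dependency.

```java
// Simplified stand-ins mirroring the shape of Spark's
// org.apache.spark.sql.sources filter classes (not the real ones).
class Filter {}

class EqualTo extends Filter {
  final String attribute; final Object value;
  EqualTo(String attribute, Object value) { this.attribute = attribute; this.value = value; }
}

class GreaterThan extends Filter {
  final String attribute; final Object value;
  GreaterThan(String attribute, Object value) { this.attribute = attribute; this.value = value; }
}

public class FilterPushdown {

  // Translate pushed-down filters into a SQL WHERE fragment.
  // Unsupported filter types are simply skipped: Spark still applies
  // every filter to the rows the source returns, so skipping stays correct.
  public static String toWhereClause(Filter[] filters) {
    StringBuilder sb = new StringBuilder();
    for (Filter f : filters) {
      String fragment = null;
      if (f instanceof EqualTo) {
        EqualTo e = (EqualTo) f;
        fragment = e.attribute + " = " + quote(e.value);
      } else if (f instanceof GreaterThan) {
        GreaterThan g = (GreaterThan) f;
        fragment = g.attribute + " > " + quote(g.value);
      }
      if (fragment != null) {
        if (sb.length() > 0) sb.append(" AND ");
        sb.append(fragment);
      }
    }
    return sb.length() == 0 ? "" : "WHERE " + sb;
  }

  // Quote string literals; leave numbers bare.
  private static String quote(Object v) {
    return (v instanceof String) ? "'" + v + "'" : String.valueOf(v);
  }

  public static void main(String[] args) {
    Filter[] filters = { new EqualTo("country", "IN"), new GreaterThan("age", 21) };
    System.out.println(toWhereClause(filters)); // WHERE country = 'IN' AND age > 21
  }
}
```

The "skip what you can't handle" design is what makes the API easy to implement incrementally: a source can start by handling no filters at all and add translations one type at a time without ever producing wrong results.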