If you need Java code, you can have a look at:
https://github.com/jgperrin/net.jgp.labs.spark.datasources

and:
https://databricks.com/session/extending-apache-sparks-ingestion-building-your-own-java-data-source

> On Dec 24, 2017, at 2:56 AM, Subarna Bhattacharyya 
> <suba...@climformatics.com> wrote:
> 
> Hi Sourav,
> Looks like this would be a good utility for developing large-scale,
> data-driven products based on data services.
> 
> We are an early-stage startup called Climformatics, and we are building a
> customized high-resolution climate prediction tool. This effort requires
> synthesizing large-scale data input from multiple data sources. Your tool
> can help in getting large volumes of data from multiple data services
> through API calls, which are somewhat limited for bulk use.
> 
> One feature that would help us further is a way to set a limit on how many
> data points are fetched at once, since the data sources that we access
> often limit the number of service calls that one can make in a given time
> (say, per minute).
> 
> Also, we need a way to pass the parameter inputs (for multiple calls)
> through the URL path itself. Many of the data sources we use need the
> parameters to be included in the URI path itself instead of being passed
> as key/value parameters. An example is
> https://www.wunderground.com/weather/api/d/docs.
> 
> We will try to take a closer look at the GitHub link you provided and get
> back to you with feedback.
> 
> Thanks,
> Sincerely,
> Subarna
> 
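For the two requests above (capping how many service calls go out per minute, and passing parameters in the URI path rather than as key/value pairs), here is a minimal Java sketch. It is not code from the repository linked above. The class name ThrottledPathCalls, the Throttle helper, and the api.example.com endpoint are all hypothetical, and URLEncoder.encode(String, Charset) assumes Java 10 or later.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.time.Duration;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, for illustration only.
class ThrottledPathCalls {

  // Appends each parameter as an encoded path segment: base/p1/p2/...
  static URI buildPathUri(String base, String... segments) {
    StringBuilder sb = new StringBuilder(base.replaceAll("/+$", ""));
    for (String segment : segments) {
      // URLEncoder does form encoding (space -> "+"); a path needs "%20".
      String encoded =
          URLEncoder.encode(segment, StandardCharsets.UTF_8).replace("+", "%20");
      sb.append('/').append(encoded);
    }
    return URI.create(sb.toString());
  }

  // Spaces calls evenly so that roughly maxCalls fit in each window.
  static final class Throttle {
    private final long minIntervalNanos;
    private long nextAllowed = System.nanoTime();

    Throttle(int maxCalls, Duration window) {
      this.minIntervalNanos = window.toNanos() / maxCalls;
    }

    // Blocks until the next call slot is free.
    synchronized void acquire() {
      long now = System.nanoTime();
      long wait = nextAllowed - now;
      if (wait > 0) {
        try {
          TimeUnit.NANOSECONDS.sleep(wait);
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          throw new IllegalStateException("Interrupted while throttling", e);
        }
      }
      // Next slot starts one interval after this call's slot.
      nextAllowed = (wait > 0 ? nextAllowed : now) + minIntervalNanos;
    }
  }

  public static void main(String[] args) {
    // Assumed limit: 120 calls per minute, i.e. one call every 500 ms.
    Throttle throttle = new Throttle(120, Duration.ofMinutes(1));
    String base = "https://api.example.com/v1/conditions/"; // placeholder URL
    String[][] calls = {{"CA", "San Francisco"}, {"NY", "New York"}, {"TX", "Austin"}};
    for (String[] params : calls) {
      throttle.acquire();
      // A real HTTP request would go here; the sketch only prints the URI.
      System.out.println(buildPathUri(base, params));
    }
  }
}
```

The throttle spaces calls evenly instead of allowing bursts, which is the simplest way to stay under a per-minute quota. A service that tolerates bursts might be better served by a token bucket.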
