If you need Java code, you can have a look at: https://github.com/jgperrin/net.jgp.labs.spark.datasources
and: https://databricks.com/session/extending-apache-sparks-ingestion-building-your-own-java-data-source

> On Dec 24, 2017, at 2:56 AM, Subarna Bhattacharyya
> <suba...@climformatics.com> wrote:
>
> Hi Sourav,
> Looks like this would be a good utility for developing large-scale,
> data-driven products based on data services.
>
> We are an early-stage startup called Climformatics, and we are building a
> customized high-resolution climate prediction tool. This effort requires
> synthesizing large-scale data input from multiple data sources. This tool
> can help in getting large volumes of data from multiple data services through
> API calls, which are somewhat limited for bulk use.
>
> One feature that would help us further is a handle for setting limits on
> how many data points can be grabbed at once, since the data sources that we
> access are often limited by the number of service calls that one can make
> at a time (say, per minute).
>
> We also need a way to pass the parameter inputs (for multiple calls) through
> the URL path itself. Many of the data sources we use need the parameters to
> be included in the URI path itself instead of being passed as key/value
> parameters. An example is https://www.wunderground.com/weather/api/d/docs.
>
> We will try to take a closer look at the GitHub link you provided and get
> back to you with feedback.
>
> Thanks,
> Sincerely,
> Subarna
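
For the two requests in the quoted message (a cap on calls per minute, and parameters carried in the URI path rather than as key/value pairs), here is a minimal Java sketch of the idea. It is only an illustration under stated assumptions: the class `PathParamRequests`, the methods `buildUrl` and `delayMillis`, and the `api.example.com` template are all hypothetical, are not part of the data source discussed in this thread, and do not reflect any real service's URL layout. The values are assumed to be URL-safe already, and the actual HTTP call is left out.

```java
import java.util.regex.Matcher;

public class PathParamRequests {

    /**
     * Fills {placeholder} segments of a URI template with the given values,
     * left to right. Hypothetical helper; values are assumed to be URL-safe.
     */
    public static String buildUrl(String template, String... values) {
        String url = template;
        for (String value : values) {
            // Replace the first remaining {placeholder} with the next value.
            url = url.replaceFirst("\\{[^}]+\\}", Matcher.quoteReplacement(value));
        }
        return url;
    }

    /** Minimum spacing between calls, in milliseconds, for a per-minute limit. */
    public static long delayMillis(int maxCallsPerMinute) {
        if (maxCallsPerMinute <= 0) {
            throw new IllegalArgumentException("maxCallsPerMinute must be positive");
        }
        return 60_000L / maxCallsPerMinute;
    }

    public static void main(String[] args) throws InterruptedException {
        // Hypothetical template: the parameters live in the path itself.
        String template = "https://api.example.com/{key}/conditions/q/{state}/{city}.json";
        String[][] inputs = {
            {"MY_KEY", "CA", "San_Francisco"},
            {"MY_KEY", "NY", "New_York"},
            {"MY_KEY", "NC", "Chapel_Hill"}
        };

        // Assumed limit for the demo; a real service would dictate this value.
        long pause = delayMillis(600);

        for (int i = 0; i < inputs.length; i++) {
            String url = buildUrl(template, inputs[i]);
            System.out.println(url); // a real client would issue the request here
            if (i < inputs.length - 1) {
                Thread.sleep(pause); // space the calls to stay under the limit
            }
        }
    }
}
```

A fixed pause between calls is the simplest way to stay under a per-minute quota; a token bucket would allow short bursts, at the cost of more code.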