Thanks for the quick response. I am curious whether the 100+ HTTP requests would pull data in parallel, or whether they would only run on the driver node. The post body would be part of the DataFrame. Suppose I have a DataFrame of employee_id and employee_name: an HTTP GET call has to be made for each employee_id, and the DataFrame is dynamic for each Spark job run.
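
To make the question concrete, here is a minimal sketch of the per-row call pattern. It assumes Java 11+ for java.net.http.HttpClient. BaseUrl, the /employees/<id> path, and the EmployeeResponse shape are hypothetical placeholders, not a real service. It uses mapPartitions so that the requests would be issued inside executor tasks rather than on the driver, with one client created per partition.

```scala
import java.net.URI
import java.net.http.{HttpClient, HttpRequest, HttpResponse}

import org.apache.spark.sql.{Dataset, SparkSession}

object EmployeeLookup {
  // Hypothetical endpoint; replace with the real service URL.
  val BaseUrl = "https://api.example.com"

  // Row shape taken from the question.
  case class Employee(employee_id: Long, employee_name: String)

  // Hypothetical result shape: the id, the HTTP status code, and the raw body.
  case class EmployeeResponse(employee_id: Long, status: Int, body: String)

  // Builds the GET URL for one employee (the path layout is an assumption).
  def employeeUrl(baseUrl: String, employeeId: Long): String =
    s"$baseUrl/employees/$employeeId"

  def fetchAll(spark: SparkSession,
               employees: Dataset[Employee]): Dataset[EmployeeResponse] = {
    import spark.implicits._

    employees.mapPartitions { rows =>
      // Created inside the task, once per partition, so the client is
      // never serialized and shipped from the driver.
      val client = HttpClient.newHttpClient()

      rows.map { e =>
        val request = HttpRequest
          .newBuilder(URI.create(employeeUrl(BaseUrl, e.employee_id)))
          .GET()
          .build()
        // Blocking call: one request at a time within each partition.
        val response = client.send(request, HttpResponse.BodyHandlers.ofString())
        EmployeeResponse(e.employee_id, response.statusCode(), response.body())
      }
    }
  }
}
```

With this shape, the number of concurrent requests would follow the number of partitions and the available executor cores, so calling employees.repartition(n) beforehand is one way to tune it. Error handling, timeouts, and retries are left out of the sketch.
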
Does it make sense?

Thanks

On Thu, May 14, 2020 at 5:12 PM Jerry Vinokurov <grapesmo...@gmail.com> wrote:

> Hi Chetan,
>
> You can pretty much use any client to do this. When I was using Spark at a
> previous job, we used OkHttp, but I'm sure there are plenty of others. In
> our case, we had a startup phase in which we gathered metadata via a REST
> API and then broadcast it to the workers. I think if you need all the
> workers to have access to whatever you're getting from the API, that's the
> way to do it.
>
> Jerry
>
> On Thu, May 14, 2020 at 5:03 PM Chetan Khatri <chetan.opensou...@gmail.com>
> wrote:
>
>> Hi Spark Users,
>>
>> How can I invoke the Rest API call from Spark Code which is not only
>> running on Spark Driver but distributed / parallel?
>>
>> Spark with Scala is my tech stack.
>>
>> Thanks
>
> --
> http://www.google.com/profiles/grapesmoker