Thanks for the quick response.

I am curious whether the data for 100+ HTTP requests would be pulled in
parallel, or whether the calls would only run on the driver node. The request
body would be part of the DataFrame. Say I have a DataFrame with the columns
employee_id and employee_name: an HTTP GET call has to be made for each
employee_id, and the DataFrame is different for each Spark job run.
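
To make the question concrete, here is a minimal sketch of what I have in
mind. It is only an illustration under assumptions: the endpoint URL, the
object name EmployeeFetch and the method fetchAll are hypothetical, it uses
the JDK's built-in java.net.http client (Java 11+) rather than OkHttp, and it
assumes employee_id is non-null and fits in a Long.

```scala
import java.net.URI
import java.net.http.{HttpClient, HttpRequest, HttpResponse}

import org.apache.spark.sql.{DataFrame, Dataset, SparkSession}

object EmployeeFetch {
  // Hypothetical endpoint, used for illustration only.
  val baseUrl = "https://api.example.com/employees/"

  // Returns (employee_id, HTTP status code, response body) for every row.
  def fetchAll(spark: SparkSession, employees: DataFrame): Dataset[(Long, Int, String)] = {
    import spark.implicits._

    employees
      .select($"employee_id".cast("long"))
      .as[Long]
      .mapPartitions { ids =>
        // Created inside the closure, so one client is built per partition
        // on whichever executor runs that partition's task.
        val client = HttpClient.newHttpClient()
        ids.map { id =>
          val request = HttpRequest.newBuilder(URI.create(baseUrl + id)).GET().build()
          val response = client.send(request, HttpResponse.BodyHandlers.ofString())
          (id, response.statusCode(), response.body())
        }
      }
  }
}
```

My understanding is that the function passed to mapPartitions runs inside the
tasks on the executors, one task per partition, so the number of partitions
(for example after employees.repartition(n)) would bound how many calls are in
flight at once. Please correct me if that is wrong.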

Does it make sense?

Thanks


On Thu, May 14, 2020 at 5:12 PM Jerry Vinokurov <grapesmo...@gmail.com>
wrote:

> Hi Chetan,
>
> You can pretty much use any client to do this. When I was using Spark at a
> previous job, we used OkHttp, but I'm sure there are plenty of others. In
> our case, we had a startup phase in which we gathered metadata via a REST
> API and then broadcast it to the workers. I think if you need all the
> workers to have access to whatever you're getting from the API, that's the
> way to do it.
>
> Jerry
>
> On Thu, May 14, 2020 at 5:03 PM Chetan Khatri <chetan.opensou...@gmail.com>
> wrote:
>
>> Hi Spark Users,
>>
>> How can I invoke a REST API call from Spark code so that it runs not only
>> on the Spark driver but distributed / in parallel?
>>
>> Spark with Scala is my tech stack.
>>
>> Thanks
>>
>>
>>
>
> --
> http://www.google.com/profiles/grapesmoker
>
