1. If you use a custom API client library, you may run into serialization
errors. A plain HTTP REST API should work fine, but expect some added
latency per lookup, and such APIs may limit the number of requests you
can make.

2. I would go with the library approach: either broadcast the IP data, or
cache it in a normal RDD and then join it with the stream data.
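
A rough sketch of the broadcast option, in case it helps. The range table,
the build_index and lookup helpers, and the "ip" field name are all made up
for illustration (a real table would be loaded from your geo db file on the
driver), and it assumes IPv4 ranges that do not overlap. The lookup itself is
plain Python; the Spark wiring is only indicated in the trailing comment.

```python
import bisect
import ipaddress

# Hypothetical IP-range table: (first_ip, last_ip, label). In practice this
# would be read from the local geo db file on the driver.
RANGES = [
    ("10.0.0.0", "10.255.255.255", "private"),
    ("81.2.69.0", "81.2.69.255", "GB"),
    ("216.160.83.0", "216.160.83.255", "US"),
]


def build_index(ranges):
    """Convert ranges to sorted integer tuples so lookups can binary-search."""
    rows = sorted(
        (int(ipaddress.ip_address(lo)), int(ipaddress.ip_address(hi)), label)
        for lo, hi, label in ranges
    )
    starts = [row[0] for row in rows]
    return starts, rows


def lookup(index, ip):
    """Return the label of the range containing ip, or None if none matches."""
    starts, rows = index
    value = int(ipaddress.ip_address(ip))
    # Find the last range whose start is <= value, then check its upper bound.
    pos = bisect.bisect_right(starts, value) - 1
    if pos >= 0 and rows[pos][0] <= value <= rows[pos][1]:
        return rows[pos][2]
    return None


INDEX = build_index(RANGES)

# In a Spark job the index would be shipped to the executors once, roughly:
#   geo = sc.broadcast(INDEX)
#   enriched = stream.map(lambda e: (e, lookup(geo.value, e["ip"])))
# `stream` and the "ip" field are assumptions about the event format.
```

Broadcasting suits a table that fits comfortably in executor memory; if it
does not, the cached RDD plus join is the other option mentioned above.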

Thanks
Best Regards

On Tue, Dec 2, 2014 at 8:44 PM, Noam Kfir <[email protected]> wrote:

>  Hi
>
>
>  I'm new to spark streaming.
>
> I'm currently writing spark streaming application to standardize events
> coming from Kinesis.
>
> As part of the logic, I want to use IP to geo information
> library or service.
>
> My questions:
>
> 1) If I would use some REST service for this task, do U think it would
> cause performance penalty (over using library based solution)
>
> 2) If I would use a library based solution, I will have to use some local
> db file.
> What mechanism should I use in order to transfer such db file? a broadcast
> variable?
>
> Tx, Noam.
>
