Hi Ashish,

Do you have custom code that connects Spark to Geode?  I know there was a
geode-spark connector at one point and that it was forked:
https://github.com/Pivotal-Field-Engineering/geode-spark-connector   (but
it looks like it hasn't been updated in a while).  Just curious whether
there is some code we could look at.

On Mon, Jun 24, 2019 at 11:53 AM Anilkumar Gingade <aging...@pivotal.io>
wrote:

> Hi Ashish,
>
> How many threads are executing putAll operations at a time in a single
> client (the Spark job)?
> Do you see a read timeout exception in the client logs? If so, can you try
> increasing the read timeout value, or reducing the putAll size?
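>
> Something along these lines, for example (a minimal sketch assuming the
> pool is configured programmatically; the locator address, region name,
> timeout, and batch size below are all illustrative):
>
>   import org.apache.geode.cache.client.ClientCacheFactory
>   import org.apache.geode.cache.client.ClientRegionShortcut
>   import scala.collection.JavaConverters._
>
>   val cache = new ClientCacheFactory()
>     .addPoolLocator("locator-host", 10334)
>     .setPoolReadTimeout(60000)   // raise read timeout (ms) from the 10s default
>     .create()
>
>   val region = cache
>     .createClientRegionFactory[String, String](ClientRegionShortcut.PROXY)
>     .create("myRegion")
>
>   // Send the entries in smaller putAll batches instead of one large map.
>   val entries = (1 to 100000).map(i => s"key-$i" -> s"value-$i")
>   entries.grouped(1000).foreach { batch =>
>     region.putAll(batch.toMap.asJava)
>   }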
>
> In the case of putAll on a partitioned region, the putAll map of entries
> is broken down and sent to the respective servers based on data affinity;
> that is the reason it works with a partitioned region.
>
> You can find more detail on how client-server connections are managed at:
>
> https://geode.apache.org/docs/guide/14/topologies_and_comm/topology_concepts/how_the_pool_manages_connections.html
>
> -Anil.
>
> On Mon, Jun 24, 2019 at 10:04 AM aashish choudhary <
> aashish.choudha...@gmail.com> wrote:
>
>> Hi,
>>
>> We have been experiencing issues while connecting to Geode from Spark
>> using the putAll API. The issue is specific to one particular Spark job,
>> which tries to load data into a replicated region. The exception we see on
>> the server side is that the default connection limit of 800 (the server's
>> max-connections) gets maxed out, and on the client side we see a retry
>> attempt against each server that also fails; yet when we re-run the same
>> job it completes without any issue.
>>
>> In the code, the problem I can see is that we are creating a Geode client
>> cache inside foreachPartition, which I think could be the issue: for each
>> partition we are making a new connection to Geode. In the stats file we
>> can see connections timing out, and there are also thread bursts,
>> sometimes exceeding 4000 threads. Roughly, the job does something like the
>> sketch below.
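>>
>> (Simplified; the locator address, region name, and column layout are
>> illustrative, not our actual code.)
>>
>>   import org.apache.geode.cache.client.ClientCacheFactory
>>   import org.apache.geode.cache.client.ClientRegionShortcut
>>   import scala.collection.JavaConverters._
>>
>>   // df is a DataFrame with two string columns: (key, value)
>>   df.rdd.foreachPartition { rows =>
>>     // A new ClientCache, with its own pool and server connections,
>>     // is created for every partition and then torn down again.
>>     val cache = new ClientCacheFactory()
>>       .addPoolLocator("locator-host", 10334)
>>       .create()
>>     val region = cache
>>       .createClientRegionFactory[String, String](ClientRegionShortcut.PROXY)
>>       .create("myRegion")
>>     region.putAll(rows.map(r => r.getString(0) -> r.getString(1)).toMap.asJava)
>>     cache.close()
>>   }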
>>
>> What is the recommended way to connect to Geode from Spark?
>>
>> But this one specific job, which writes to a replicated region, fails most
>> of the time. When we change the region type to partitioned, the job
>> completes. We have enabled disk persistence for both types of region.
>>
>> Thoughts?
>>
>>
>>
>> With best regards,
>> Ashish
>>
>
