Try small batches as a starting point - say 100.
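[A minimal sketch of the batching approach suggested above, assuming a standard Geode Java client; `putAllInBatches` and the batch size are illustrative helpers, not part of the Geode API:]

```java
import java.util.LinkedHashMap;
import java.util.Map;

import org.apache.geode.cache.Region;

public final class BatchedPutAll {

  // Split a large map into fixed-size chunks and call putAll once per chunk.
  public static <K, V> void putAllInBatches(Region<K, V> region, Map<K, V> entries, int batchSize) {
    Map<K, V> batch = new LinkedHashMap<>();
    for (Map.Entry<K, V> entry : entries.entrySet()) {
      batch.put(entry.getKey(), entry.getValue());
      if (batch.size() >= batchSize) {
        region.putAll(batch);  // one bounded round trip per batch
        batch.clear();
      }
    }
    if (!batch.isEmpty()) {
      region.putAll(batch);    // flush the remainder
    }
  }
}
```

[Keeping each putAll small keeps each round trip short, so individual calls are less likely to hit the pool's read timeout discussed later in the thread.]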
On Wed, Jun 26, 2019 at 10:33 AM aashish choudhary <aashish.choudha...@gmail.com> wrote:

> Yes, we see the exceeded max-connections error on the server side.
>
> So I was trying to see how the putAll API works in general, and from a standard Java client I tried to simulate the behaviour we see on our server. I tried to put 600k records using putAll on my local machine with 1 locator and 2 servers. The region type is replicate persistent, and the local ClientCache API crashed with a "pool unexpected" error. We see this error in our Spark code as well; it then retries and fails. Surprisingly, the data still gets inserted into the region even though the ClientCache Java API crashed.
>
> I tried to run it in batches, but those also failed and it was too slow.
>
> The only way I was able to make it work was by increasing the read timeout to 60 seconds.
>
> Can someone share some tips on the putAll API? How do we use it effectively?
>
> With best regards,
> Ashish
>
> On Wed, Jun 26, 2019, 6:20 AM Anilkumar Gingade <aging...@pivotal.io> wrote:
>
>> Ashish,
>>
>> Do you see an "exceeded max-connections" error?
>>
>> The operation/job completing on the second attempt indicates that the server where the operation executed the first time may have issues; you may want to check the load on that server and whether there are any memory problems.
>>
>> > What is the recommended way to connect to Geode using Spark?
>> It is more about how Geode is used in this context: are the Spark processors acting as Geode clients or as peer nodes? If they are Geode clients, then it is mostly about tuning client connections based on how and what operations are performed.
>>
>> Anil
>>
>> On Tue, Jun 25, 2019 at 10:54 AM aashish choudhary <aashish.choudha...@gmail.com> wrote:
>>
>>> We could also see the following in the server-side logs:
>>>
>>> Rejected connection from Server connection from [client host address=x.yx.x.x; client port=abc] because incoming request was rejected by pool possibly due to thread exhaustion
>>>
>>> On Tue, Jun 25, 2019, 7:27 AM aashish choudhary <aashish.choudha...@gmail.com> wrote:
>>>
>>>> As I mentioned earlier, the thread count can go up to 4000, and we have seen the read timeout crossing the default 10 seconds. We tried increasing the read timeout to 30 seconds, but that didn't work either. The record count is not more than 600k.
>>>>
>>>> The job succeeds on the second attempt without changing anything, which is a bit weird.
>>>>
>>>> With best regards,
>>>> Ashish
>>>>
>>>> On Tue, Jun 25, 2019, 12:23 AM Anilkumar Gingade <aging...@pivotal.io> wrote:
>>>>
>>>>> Hi Ashish,
>>>>>
>>>>> How many threads are executing putAll jobs at a time in a single client (the Spark job)?
>>>>> Do you see a read timeout exception in the client logs? If so, can you try increasing the read timeout value, or reducing the putAll size?
>>>>>
>>>>> In the case of putAll on a partitioned region, the putAll (entries) size is broken down and sent to the respective servers based on data affinity; that is why it works with a partitioned region.
>>>>>
>>>>> You can find more detail on how client-server connections are managed at:
>>>>> https://geode.apache.org/docs/guide/14/topologies_and_comm/topology_concepts/how_the_pool_manages_connections.html
>>>>>
>>>>> -Anil.
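[A minimal sketch of the client-side pool tuning discussed above (raising the read timeout beyond its 10-second default), assuming a Geode Java client; the locator address, timeout, and retry values are placeholders to tune for your environment:]

```java
import org.apache.geode.cache.client.ClientCache;
import org.apache.geode.cache.client.ClientCacheFactory;

public final class TunedClient {

  public static void main(String[] args) {
    // Locator address and timeout values are placeholders; tune for your environment.
    ClientCache cache = new ClientCacheFactory()
        .addPoolLocator("locator-host", 10334)
        .setPoolReadTimeout(60000)   // ms; the default is 10000, which large putAll calls can exceed
        .setPoolRetryAttempts(1)     // limit client-side retries while diagnosing failures
        .create();

    // ... region operations go here ...

    cache.close();
  }
}
```

[Note that the limit of 800 mentioned in the thread is the cache server's max-connections setting, configured on the servers rather than through the client pool.]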
>>>>> On Mon, Jun 24, 2019 at 10:04 AM aashish choudhary <aashish.choudha...@gmail.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> We have been experiencing issues while connecting to Geode with the putAll API from Spark. The issue is specific to one particular Spark job that tries to load data into a replicated region. The exception we see on the server side is that the default limit of 800 gets maxed out, and on the client side we see a retry attempt against each server that also fails, even though when we re-ran the same job it completed without any issue.
>>>>>>
>>>>>> The problem I can see in the code is that we are connecting to Geode using a ClientCache inside forEachPartition, which I think could be the issue: for each partition we are making a new connection to Geode. In the stats file we can see connections timing out, and there are also thread bursts, sometimes >4000.
>>>>>>
>>>>>> What is the recommended way to connect to Geode using Spark?
>>>>>>
>>>>>> This one specific job fails most of the time, and it targets a replicated region. When we change the region type to partitioned, the job completes. We have enabled disk persistence for both types of regions.
>>>>>>
>>>>>> Thoughts?
>>>>>>
>>>>>> With best regards,
>>>>>> Ashish

--
Charlie Black | cbl...@pivotal.io
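[A minimal sketch of reusing one ClientCache per executor JVM instead of creating a connection inside every forEachPartition invocation, as described in the original message above, combined with batched putAll. This is an illustrative pattern, not a Geode- or Spark-provided API; the class name, locator address, key/value types, and batch size are placeholders:]

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

import org.apache.geode.cache.Region;
import org.apache.geode.cache.client.ClientCache;
import org.apache.geode.cache.client.ClientCacheFactory;
import org.apache.geode.cache.client.ClientRegionShortcut;
import org.apache.spark.api.java.JavaPairRDD;

import scala.Tuple2;

public final class GeodeSparkWriter {

  // One ClientCache per executor JVM, created lazily and shared by every partition task.
  private static volatile ClientCache cache;

  private static synchronized Region<String, String> getRegion(String name) {
    if (cache == null || cache.isClosed()) {
      cache = new ClientCacheFactory()
          .addPoolLocator("locator-host", 10334)  // placeholder locator
          .setPoolReadTimeout(60000)              // ms
          .create();
    }
    Region<String, String> region = cache.getRegion(name);
    if (region == null) {
      region = cache.<String, String>createClientRegionFactory(ClientRegionShortcut.PROXY)
          .create(name);
    }
    return region;
  }

  public static void write(JavaPairRDD<String, String> rdd, String regionName, int batchSize) {
    rdd.foreachPartition((Iterator<Tuple2<String, String>> rows) -> {
      // The cache and region are reused across partitions on the same executor.
      Region<String, String> region = getRegion(regionName);
      Map<String, String> batch = new LinkedHashMap<>();
      while (rows.hasNext()) {
        Tuple2<String, String> row = rows.next();
        batch.put(row._1(), row._2());
        if (batch.size() >= batchSize) {
          region.putAll(batch);
          batch.clear();
        }
      }
      if (!batch.isEmpty()) {
        region.putAll(batch);
      }
    });
  }
}
```

[Because the cache is created lazily and shared, each executor holds a single connection pool to the servers instead of opening a new one per partition, which is one way to keep the server-side connection count and thread usage down.]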