I've launched the cluster again and was able to reproduce the error. In the proxy I got the same error that I mentioned in one of my previous messages, about a failure in a tablet server. I checked the log of that tablet server and found:

2014-02-03 18:02:24,065 [thrift.ProcessFunction] ERROR: Internal error processing update
org.apache.accumulo.server.tabletserver.HoldTimeoutException: Commits are held

This appears a lot of times. Full log, if someone wants to have a look:

http://www.vhgroup.net/diegows/tserver_matrix-slave-07.accumulo-ec2-test.com.debug.log

Regards,
  Diego

On Mon, Feb 3, 2014 at 12:11 PM, Josh Elser <[email protected]> wrote:
> I would assume that that proxy service would become a bottleneck fairly
> quickly and your throughput would benefit from running multiple proxies,
> but I don't have substantive numbers to back up that assertion.
>
> I'll put this on my list and see if I can reproduce something.
>
>
> On 2/3/14, 7:42 AM, Diego Woitasen wrote:
>>
>> I have to run the tests again because they were ec2 instances and I've
>> destroyed them. It's easy to reproduce BTW.
>>
>> My question is, does it make sense to run multiple proxies? Is there
>> a limit? Right now I'm trying with 10 nodes and 10 proxies (running on
>> every node). Maybe that doesn't make sense or it's a buggy
>> configuration.
>>
>>
>>
>> On Fri, Jan 31, 2014 at 7:29 PM, Josh Elser <[email protected]> wrote:
>>>
>>> When you had multiple proxies, what were the failures on that tablet
>>> server (10.202.6.46:9997)?
>>>
>>> I'm curious why using one proxy didn't cause errors but multiple did.
>>>
>>>
>>> On 1/31/14, 4:44 PM, Diego Woitasen wrote:
>>>>
>>>>
>>>> I've reproduced the error and I've found this in the proxy logs:
>>>>
>>>> 2014-01-31 19:47:50,430 [server.THsHaServer] WARN : Got an
>>>> IOException in internalRead!
>>>> java.io.IOException: Connection reset by peer
>>>>         at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
>>>>         at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
>>>>         at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
>>>>         at sun.nio.ch.IOUtil.read(IOUtil.java:197)
>>>>         at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
>>>>         at org.apache.thrift.transport.TNonblockingSocket.read(TNonblockingSocket.java:141)
>>>>         at org.apache.thrift.server.AbstractNonblockingServer$FrameBuffer.internalRead(AbstractNonblockingServer.java:515)
>>>>         at org.apache.thrift.server.AbstractNonblockingServer$FrameBuffer.read(AbstractNonblockingServer.java:305)
>>>>         at org.apache.thrift.server.AbstractNonblockingServer$AbstractSelectThread.handleRead(AbstractNonblockingServer.java:202)
>>>>         at org.apache.thrift.server.TNonblockingServer$SelectAcceptThread.select(TNonblockingServer.java:198)
>>>>         at org.apache.thrift.server.TNonblockingServer$SelectAcceptThread.run(TNonblockingServer.java:154)
>>>> 2014-01-31 19:51:13,185 [impl.ThriftTransportPool] WARN : Server
>>>> 10.202.6.46:9997:9997 (30000) had 20 failures in a short time period,
>>>> will not complain anymore
>>>>
>>>> A lot of these messages appear in all the proxies.
>>>>
>>>> I tried the same stress tests against one proxy and I was able to
>>>> increase the load without getting any error.
>>>>
>>>> Regards,
>>>>   Diego
>>>>
>>>> On Thu, Jan 30, 2014 at 2:47 PM, Keith Turner <[email protected]> wrote:
>>>>>
>>>>>
>>>>> Do you see more information in the proxy logs? "# exceptions 1"
>>>>> indicates an unexpected exception occurred in the batch writer client
>>>>> code. The proxy uses this client code, so maybe there will be a more
>>>>> detailed stack trace in its logs.
>>>>>
>>>>>
>>>>> On Thu, Jan 30, 2014 at 9:46 AM, Diego Woitasen
>>>>> <[email protected]> wrote:
>>>>>>
>>>>>>
>>>>>> Hi,
>>>>>>   I'm testing with a ten node cluster with the proxy enabled in all
>>>>>> the nodes. I'm doing a stress test balancing the connections between
>>>>>> the proxies using round robin. When I increase the load (400 workers
>>>>>> writing) I get this error:
>>>>>>
>>>>>> AccumuloSecurityException:
>>>>>>
>>>>>> AccumuloSecurityException(msg='org.apache.accumulo.core.client.MutationsRejectedException:
>>>>>> # constraint violations : 0 security codes: [] # server errors 0 #
>>>>>> exceptions 1')
>>>>>>
>>>>>> The complete message is:
>>>>>>
>>>>>> AccumuloSecurityException:
>>>>>>
>>>>>> AccumuloSecurityException(msg='org.apache.accumulo.core.client.MutationsRejectedException:
>>>>>> # constraint violations : 0 security codes: [] # server errors 0 #
>>>>>> exceptions 1')
>>>>>> kvlayer-test client failed!
>>>>>> Traceback (most recent call last):
>>>>>>   File "tests/kvlayer/test_accumulo_throughput.py", line 64, in __call__
>>>>>>     self.client.put('t1', ((u,), self.one_mb))
>>>>>>   File "/home/ubuntu/kvlayer-env/local/lib/python2.7/site-packages/kvlayer-0.2.7-py2.7.egg/kvlayer/_decorators.py", line 26, in wrapper
>>>>>>     return method(*args, **kwargs)
>>>>>>   File "/home/ubuntu/kvlayer-env/local/lib/python2.7/site-packages/kvlayer-0.2.7-py2.7.egg/kvlayer/_accumulo.py", line 154, in put
>>>>>>     batch_writer.close()
>>>>>>   File "/home/ubuntu/kvlayer-env/local/lib/python2.7/site-packages/pyaccumulo_dev-1.5.0.2-py2.7.egg/pyaccumulo/__init__.py", line 126, in close
>>>>>>     self._conn.client.closeWriter(self._writer)
>>>>>>   File "/home/ubuntu/kvlayer-env/local/lib/python2.7/site-packages/pyaccumulo_dev-1.5.0.2-py2.7.egg/pyaccumulo/proxy/AccumuloProxy.py", line 3149, in closeWriter
>>>>>>     self.recv_closeWriter()
>>>>>>   File "/home/ubuntu/kvlayer-env/local/lib/python2.7/site-packages/pyaccumulo_dev-1.5.0.2-py2.7.egg/pyaccumulo/proxy/AccumuloProxy.py", line 3172, in recv_closeWriter
>>>>>>     raise result.ouch2
>>>>>>
>>>>>> I'm not sure if the error is produced by the way I'm using the
>>>>>> cluster with multiple proxies; maybe I should use one.
>>>>>>
>>>>>> Ideas are welcome.
>>>>>>
>>>>>> Regards,
>>>>>>   Diego
>>>>>>
>>>>>> --
>>>>>> Diego Woitasen
>>>>>> VHGroup - Linux and Open Source solutions architect
>>>>>> www.vhgroup.net
>>>>>
>>>>
>>>
>>
>

-- 
Diego Woitasen
VHGroup - Linux and Open Source solutions architect
www.vhgroup.net
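
For reference, the round-robin balancing mentioned in the first message of the thread (400 writers spread over 10 proxies) could look roughly like the sketch below. This is only an illustration, not the actual test code: the function name `assign_round_robin` and the `proxy-NN.example.com` host names are hypothetical, and no Accumulo or pyaccumulo calls are made.

```python
import itertools
from collections import Counter


def assign_round_robin(num_workers, proxies):
    """Return a list mapping worker index -> proxy endpoint, in round-robin order."""
    if not proxies:
        raise ValueError("at least one proxy endpoint is required")
    rotation = itertools.cycle(proxies)
    return [next(rotation) for _ in range(num_workers)]


# Hypothetical endpoints: the thread only says there is one proxy per node
# on a ten node cluster, it does not give host names.
proxies = ["proxy-%02d.example.com" % i for i in range(1, 11)]

# 400 writers, as in the stress test described in the thread.
assignment = assign_round_robin(400, proxies)

# Count how many workers ended up on each proxy.
per_proxy = Counter(assignment)
for endpoint in proxies:
    print(endpoint, per_proxy[endpoint])
```

With an even rotation each proxy receives the same share of the workers, so any imbalance seen on a single tablet server would have to come from somewhere other than the client-side distribution.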
