You shouldn't have to create a new BatchWriter -- have you tried reducing the amount of memory the BatchWriter will use? It keeps an internal cache of Mutations so it can amortize the cost of sending them to a given tabletserver.

To limit this memory, use the BatchWriterConfig#setMaxMemory(long) method. By default, maxMemory is set to 50MB. Reducing it means the client holds less data in its buffer, which should give you some more headroom.
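
Something like this (an untested sketch; "mytable", the Connector, and the 10MB cap are placeholders for your own table, client object, and limit):

    import org.apache.accumulo.core.client.BatchWriter;
    import org.apache.accumulo.core.client.BatchWriterConfig;
    import org.apache.accumulo.core.client.Connector;
    import org.apache.accumulo.core.client.MutationsRejectedException;
    import org.apache.accumulo.core.client.TableNotFoundException;
    import org.apache.accumulo.core.data.Mutation;
    import org.apache.accumulo.core.data.Value;
    import org.apache.hadoop.io.Text;

    void writeWithSmallBuffer(Connector conn)
        throws TableNotFoundException, MutationsRejectedException {
      BatchWriterConfig config = new BatchWriterConfig();
      // Cap the client-side buffer at ~10MB instead of the 50MB default.
      config.setMaxMemory(10 * 1024 * 1024L);

      BatchWriter writer = conn.createBatchWriter("mytable", config);
      try {
        Mutation m = new Mutation(new Text("row1"));
        m.put(new Text("cf"), new Text("cq"), new Value("val".getBytes()));
        writer.addMutation(m);
      } finally {
        writer.close(); // flushes anything still buffered
      }
    }

The one BatchWriter can then be reused for the whole load; there's no need to tear it down and recreate it every N rows.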

Alternatively, you could give your client JVM some more heap (e.g., by raising -Xmx) :)

Geoffry Roberts wrote:
I am trying to pump some data into Accumulo, but I keep encountering:

Exception in thread "Thrift Connection Pool Checker" java.lang.OutOfMemoryError: Java heap space
        at java.util.HashMap.newValueIterator(HashMap.java:971)
        at java.util.HashMap$Values.iterator(HashMap.java:1038)
        at org.apache.accumulo.core.client.impl.ThriftTransportPool$Closer.closeConnections(ThriftTransportPool.java:103)
        at org.apache.accumulo.core.client.impl.ThriftTransportPool$Closer.run(ThriftTransportPool.java:147)
        at java.lang.Thread.run(Thread.java:745)


As a workaround, I tried creating a new BatchWriter and closing the
old one every ten thousand rows, but to no avail: data gets written up
to about the 200,000th row, and then the error appears.

I have a table of 8M rows in an RDB that I am pumping into Accumulo via a
Groovy script. The rows are narrow: a short text field and four floats.

I googled, of course, but nothing was helpful. What can be done?

Thanks so much.

--
There are ways and there are ways,

Geoffry Roberts
