Re: OutOfMemoryError: Java heap space after data load

Terry P. Tue, 30 Apr 2013 14:44:06 -0700

Eric, I'm really disappointed.  Rather than actually writing anything
myself, I opted to run the RandomBatchWriter example program.


It wasn't 35x faster.

It was 52x faster.

After all the excellent posts I've seen from you, I really expected a more
precise guesstimate from you.  ;-)

Thanks for the gentle nudge to do better than Python and the Accumulo
shell.  At a million rows inserted in 13 seconds, I'm confident the Accumulo
cluster I've set up can handle the 2-5K records per second max we
expect to throw at it.

Thanks again!
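
P.S. For anyone who wants to sanity-check those numbers, here is a minimal
Java sketch of the arithmetic.  The class and method names are made up for
illustration, and it assumes the 52x figure is measured against the shell
load described further down the thread.  It does not touch Accumulo at all.

```java
public class IngestRateCheck {

    // Rows per second for a load of `rows` rows that took `seconds` seconds.
    static double rowsPerSecond(long rows, double seconds) {
        return rows / seconds;
    }

    // Rate of the slower method implied by a measured speedup factor.
    static double impliedBaseline(double fastRate, double speedup) {
        return fastRate / speedup;
    }

    public static void main(String[] args) {
        // Figures quoted in this message: 1,000,000 rows in 13 seconds, 52x faster.
        double javaRate = rowsPerSecond(1_000_000L, 13.0);
        double shellRate = impliedBaseline(javaRate, 52.0);

        // Headroom over the upper end of the expected 2-5K records per second.
        double headroom = javaRate / 5_000.0;

        System.out.printf("Java ingest rate:    %.0f rows/s%n", javaRate);
        System.out.printf("Implied shell rate:  %.0f rows/s%n", shellRate);
        System.out.printf("Headroom over 5K/s:  %.1fx%n", headroom);
    }
}
```

That works out to roughly 77,000 rows per second, about 15x the 5K per
second peak we expect, and an implied shell rate in the same range as the
1,400 per second I reported for the three-node shell load.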



On Tue, Apr 30, 2013 at 1:47 PM, Eric Newton <[email protected]> wrote:

> I've probably written more Python than Java, so I understand. :-)
>
> I've used Jython for scripting tests.  In unreleased versions (1.4.4 &
> 1.5.0) the Proxy will let you use the language of your choice.
>
> -Eric
>
>
>
> On Tue, Apr 30, 2013 at 2:43 PM, Terry P. <[email protected]> wrote:
>
>> Hi Eric,
>> Thanks for the info.  You've inspired me to dive into it in Java -- I had
>> been using the Accumulo shell because I had a Python data generation script
>> already in place and it was "faster" that way.  But if a small Java program
>> is going to be 35x "faster" than that, it makes no sense to bother with the
>> shell!
>>
>> Thanks,
>> Terry
>>
>>
>> On Tue, Apr 30, 2013 at 11:01 AM, Eric Newton <[email protected]> wrote:
>>
>>> There's no need to flush... the shell is flushing after every single
>>> line.
>>>
>>> The flush you are invoking causes a minor compaction.
>>>
>>> If you wrote a quick java program to ingest the data, the data would
>>> load about 35x faster.
>>>
>>> -Eric
>>>
>>>
>>> On Mon, Apr 29, 2013 at 6:40 PM, Terry P. <[email protected]> wrote:
>>>
>>>> Perhaps having a configuration item to limit the size of the
>>>> shell_history.txt file would help avoid this in future?
>>>>
>>>>
>>>> On Mon, Apr 29, 2013 at 5:37 PM, Terry P. <[email protected]> wrote:
>>>>
>>>>> You hit it John -- on the NameNode the shell_history.txt file is
>>>>> 128MB, and it's the same on the DataNode that 99% of the data went to
>>>>> due to the key structure.  On the other two datanodes it was tiny, and
>>>>> I could log in on both just fine (just my luck that the only datanode I
>>>>> tried after the load was the fat one).
>>>>>
>>>>> So is --disable-tab-completion supposed to skip reading the
>>>>> shell_history.txt file?  It appears that is not the case with 1.4.2 as it
>>>>> still dies with an OOM error.
>>>>>
>>>>> I now see that a better way to go would probably be to use the
>>>>> --execute-file switch to read the load file rather than pipe it to the
>>>>> shell.  Correct?
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Mon, Apr 29, 2013 at 5:04 PM, John Vines <[email protected]> wrote:
>>>>>
>>>>>> Depending on your answer to Eric's question, I wonder if your history
>>>>>> is enough to blow it up. You may also want to check the size of
>>>>>> ~/.accumulo/shell_history.txt and see if that is ginormous.
>>>>>>
>>>>>>
>>>>>> On Mon, Apr 29, 2013 at 5:07 PM, Terry P. <[email protected]> wrote:
>>>>>>
>>>>>>> Hi John,
>>>>>>> I attempted to start the shell with --disable-tab-completion but it
>>>>>>> still failed in an identical manner.  What is that feature/option?
>>>>>>>
>>>>>>> The ACCUMULO_OTHER_OPTS var was set to "-Xmx256m -Xms64m" via the
>>>>>>> 2GB example config script.  I upped the -Xmx256m to 512m and the shell
>>>>>>> started successfully, so thanks!
>>>>>>>
>>>>>>> What would cause the shell to need more than 256m of memory just to
>>>>>>> start?  I'd like to understand how to determine an appropriate value
>>>>>>> to set ACCUMULO_OTHER_OPTS to.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Terry
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Apr 29, 2013 at 2:21 PM, John Vines <[email protected]> wrote:
>>>>>>>
>>>>>>>> The shell gets its memory config from ACCUMULO_OTHER_OPTS in the
>>>>>>>> accumulo-env file. If, for some reason, the value was low or there
>>>>>>>> was a lot of data being loaded for the tab completion stuff in the
>>>>>>>> shell, it could die. You can try upping that value in the file or try
>>>>>>>> running the shell with "--disable-tab-completion" to see if that helps.
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Apr 29, 2013 at 3:02 PM, Terry P. <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> Greetings folks,
>>>>>>>>> I have stood up our 8-node Accumulo 1.4.2 cluster consisting of 3
>>>>>>>>> ZooKeepers, 1 NameNode (also runs Accumulo Master, Monitor, and GC),
>>>>>>>>> and 3 DataNodes / TabletServers (Secondary NameNode with Alternate
>>>>>>>>> Accumulo Master process will follow).  The initial config files were
>>>>>>>>> copied from the 2GB/native-standalone directory.
>>>>>>>>>
>>>>>>>>> For a quick test I have a text file I generated to load 500,000
>>>>>>>>> rows of sample data using the Accumulo shell.  For lack of a better
>>>>>>>>> place to run it this first time, I ran it on the NameNode.  The
>>>>>>>>> script performs flushes every 10,000 records (about 30,000 entries).
>>>>>>>>> After the load finished, when I attempt to log in to the Accumulo
>>>>>>>>> shell on the NameNode, I get the error:
>>>>>>>>>
>>>>>>>>> [root@edib-namenode ~]# /usr/lib/accumulo/bin/accumulo shell -u $AUSER -p $AUSERPWD
>>>>>>>>> #
>>>>>>>>> # java.lang.OutOfMemoryError: Java heap space
>>>>>>>>> # -XX:OnOutOfMemoryError="kill -9 %p"
>>>>>>>>> #   Executing /bin/sh -c "kill -9 24899"...
>>>>>>>>> Killed
>>>>>>>>>
>>>>>>>>> The performance of that test was pretty poor at about 160/second
>>>>>>>>> (somewhat expected, as it was just one thread) so to keep moving I
>>>>>>>>> generated 3 different load files and ran one on each of the 3
>>>>>>>>> DataNodes / TabletServers.  Performance was much better, sustaining
>>>>>>>>> 1,400 per second.  Again, the test data load files have flush
>>>>>>>>> commands every 10,000 records (30,000 entries), including at the end
>>>>>>>>> of the file.
>>>>>>>>>
>>>>>>>>> However, as with the NameNode, now I cannot log in to the Accumulo
>>>>>>>>> shell on any of the DataNodes either, as I get the same
>>>>>>>>> OutOfMemoryError.
>>>>>>>>>
>>>>>>>>> My /etc/security/limits.conf file is set with 64000 for nofile and
>>>>>>>>> 32000 for nproc for the hdfs user (which is also running Accumulo, I
>>>>>>>>> haven't split accumulo out yet).
>>>>>>>>>
>>>>>>>>> I don't see any errors in the tserver or logger logs (standard and
>>>>>>>>> debug) or any info related to the shell failing to load.  I'm at a
>>>>>>>>> loss with respect to where to look.  The servers have 16GB of memory,
>>>>>>>>> and each has about 14GB currently free.
>>>>>>>>>
>>>>>>>>> Any help would be greatly appreciated.
>>>>>>>>>
>>>>>>>>> Best regards,
>>>>>>>>> Terry
>>>>>>>>>
