Re: Tabletserver message "Running low on memory"

Josh Elser Tue, 12 Nov 2013 13:37:07 -0800


On 11/12/13, 1:25 PM, Terry P. wrote:

Hi Josh,
Thanks for your exhaustive reply. I am using Native maps, and it's set
to 1G in my accumulo-site.xml.  The data and index cache settings there
are still at the 3GB example config defaults as well (50M and 512M).  I
definitely didn't realize that and will increase their size given I have
plenty of memory sitting around idle (it was intended to be used for
caching too!).

Will increasing the tserver.memory.maps.max in accumulo-site.xml perhaps
help reduce these warning messages?  My only concern is that an operator
may be monitoring the Accumulo Monitor GUI and see the memory warnings
and think "Oh no, we're almost out of memory, I should page someone!"

Hahaha, yeah, I know what you mean. You can tell them that it's just a "warning" and not an "error" :P. Increasing the size of the memory maps won't make the warning go away. I believe that warning is purely over JVM heap. I don't believe there's any code (outside of the flush policy for when to close a native map and start a new one, to make sure that tserver.memory.maps.max is observed) to monitor the size of the native maps.

You would want to increase the JVM heap size to keep that warning from happening, or reduce the amount of heap you give to the index block or data block cache.

Based on what you've seen, is the warning innocuous and can just be ignored?

IMO, yes, with a strong recommendation that you confirm you're not spending any significant time in garbage collection.

Run `fgrep 'gc ParNew'` on your tserver.debug.log and check that you're not seeing "spiky" gc cycles.
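
If eyeballing the fgrep output gets tedious, the same check can be sketched in Python. This is only an illustration: the line shape it looks for ("gc ParNew=<total>(+<delta>) secs") is an assumption about the debug log format, the sample lines are made up, and the 1-second threshold is an arbitrary stand-in for "spiky".

```python
import re

# Assumed line shape: "... gc ParNew=<total>(+<delta>) secs ..."
# <delta> is taken to be the GC time added since the previous log line.
PARNEW = re.compile(r"gc ParNew=([\d,.]+)\(\+([\d,.]+)\) secs")


def spiky_gc_lines(lines, threshold_secs=1.0):
    """Return the log lines whose ParNew delta is at or above the threshold."""
    hits = []
    for line in lines:
        match = PARNEW.search(line)
        if match is None:
            continue  # not a gc line, or not in the assumed format
        delta = float(match.group(2).replace(",", ""))
        if delta >= threshold_secs:
            hits.append(line.rstrip("\n"))
    return hits


# Hypothetical sample lines, not copied from a real log.
sample = [
    "[tabletserver.TabletServer] DEBUG: gc ParNew=0.41(+0.02) secs",
    "[tabletserver.TabletServer] DEBUG: gc ParNew=3.95(+3.54) secs",
    "[tabletserver.TabletServer] WARN : Running low on memory",
]

for hit in spiky_gc_lines(sample):
    print(hit)
```

To run it against a real log, pass an open file handle instead of the sample list, and adjust the pattern if your lines look different.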



On Tue, Nov 12, 2013 at 3:03 PM, Josh Elser wrote:

    IMO, I see this at home on my computer no matter what memory
    settings I use. I've become pretty accustomed to flat out ignoring it...

    As for heap management, there are two big paths here: with "native
    maps" and without. When you write data to Accumulo, it goes to two
    places: 1) Write-ahead log and 2) Memory maps. The WAL ensures that
    if a server dies with writes still in memory, you don't lose data.
    The memory maps give you much faster ingest than trying to write
    directly into a sorted file.

    1) Native maps (aka C++ code over JNI)

    This memory allocation, controlled by tserver.memory.maps.max in
    accumulo-site.xml, is "off heap" memory. It is not limited by the
    JVM heap limits you specify in ACCUMULO_TSERVER_OPTS in
    accumulo-env.sh. As such, you need to make sure that you don't
    over-allocate memory usage on your node (tserver.memory.maps.max +
    JVM Xmx + fudge-factor < total available memory).

    2) Non-native (in JVM)

    This serves the same purpose as #1 but is in JVM heap as opposed to
    off heap. Ingest will be slower and JVM gc will likely be a bigger
    issue than with the native maps. This does make the JVM sizing a
    little more straightforward: JVM Xmx + fudge-factor < total
    available memory (but the math is pretty easy).
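
The two sizing rules above can be written out as a quick sanity check. This is a rough sketch, not anything Accumulo ships: the function names are made up, the 1G fudge factor is an arbitrary placeholder, and "total" should be whatever memory is really available to the tserver after HDFS and other processes take their share.

```python
def parse_size(text):
    """Convert a size such as '1G', '512M' or '50M' to bytes."""
    units = {"K": 1024, "M": 1024**2, "G": 1024**3}
    text = text.strip().upper()
    if text[-1] in units:
        return int(float(text[:-1]) * units[text[-1]])
    return int(text)


def node_headroom(total, jvm_xmx, native_maps_max=None, fudge="1G"):
    """Bytes left on the node after the tserver's share; negative means over-allocated.

    With native maps:    tserver.memory.maps.max + Xmx + fudge < total
    Without native maps: Xmx + fudge < total (the maps live inside the heap)
    """
    used = parse_size(jvm_xmx) + parse_size(fudge)
    if native_maps_max is not None:
        used += parse_size(native_maps_max)
    return parse_size(total) - used


# Numbers from this thread: 24GB node, 1GB tserver heap, 1G native maps.
headroom = node_headroom("24G", "1G", native_maps_max="1G")
print(headroom // 1024**2, "MB of headroom")  # 21504 MB of headroom
```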

    Assuming you use the native maps, let's break down what you see in
    JVM heap.

    1) Index block cache

    Each RFile (backing file for tablets in Accumulo) has a
    multi-level index structure which lets you efficiently find the data
    in that file. Accumulo provides the ability to cache this index
    information instead of reading and deserializing from disk every
    time. Controlled by tserver.cache.index.size.

    2) Data block cache

    Similar to #1 except it's for the actual blocks of data in that
    RFile (the key-value pairs) instead of just the index structure.
    Controlled by tserver.cache.data.size. This can give you some
    benefit over having to hit a (potentially, remote) datanode every
    time you perform a read in a read-heavy environment.

    3) "The rest"

    Consider this the rest of the things that the tabletserver does:
    "hosting" its tablets (each tablet has a collection of files in
    hdfs), scan sessions running against those hosted tablets (the
    iterator stack that is created to perform a "read"), compression
    (de)allocators for Hadoop (assuming you're using GZIP), various
    internal buffers for caching, and connection management information
    (thrift and hadoop connections). I'm probably missing more things, too.
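
To make the heap breakdown above concrete, here is a small sketch of how much of Xmx is left for "the rest" once both block caches are allowed to fill. The function name is hypothetical, and it treats the cache settings as upper bounds that may not be fully used at any given moment.

```python
def heap_left_for_rest(xmx_mb, index_cache_mb, data_cache_mb):
    """MB of JVM heap left once both block caches are at their configured maximum."""
    return xmx_mb - index_cache_mb - data_cache_mb


# Values mentioned in this thread: 1GB heap, 512M index cache, 50M data cache.
xmx_mb = 1024
rest_mb = heap_left_for_rest(xmx_mb, index_cache_mb=512, data_cache_mb=50)
cache_share = (512 + 50) / xmx_mb

print(rest_mb, "MB left for everything else")       # 462 MB left for everything else
print(round(cache_share * 100), "% of heap reserved for caches")  # 55 % of heap reserved for caches
```

With those numbers more than half of the heap can end up in the caches, which fits with seeing a low-memory warning even on a lightly loaded server.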


    On 11/12/13, 12:32 PM, Terry P. wrote:

        On an Accumulo 1.4.2 I've gotten "[tabletserver.TabletServer] WARN:
        Running low on memory" 5 times in the last two days on just one of
        my 6 datanodes. That datanode is hosting ~30% of the data, as 2
        datanodes had dropped from the cluster due to a network issue some
        time ago and the cluster hasn't entirely rebalanced.  Current volume
        is only 140 million.

        Ingest rate has been pretty constant at a light 200 per second.

        Not knowing how Accumulo uses its java heap space, I opted to start
        with a stock memory config and used the 3GB example config files,
        which I see allocate only 1GB to the TabletServer. The server has
        24GB RAM and currently is using only 10GB total between Accumulo and
        HDFS, so there's plenty of free memory to spare.

        Is ACCUMULO_TSERVER_OPTS in accumulo-env.sh the tunable I should
        target to alleviate these warnings?

        Unfortunately being ops-configured nodes, no JDKs are installed nor
        is it a possibility to do so in order to monitor the JVM itself for
        better information.

        Thanks in advance,
        Terry
