Hang on. I thought we were talking about offheap size; GC should not be 
relevant. Am I wrong?

D.

On Aug 4, 2017, 11:38 AM, at 11:38 AM, Sergey Chugunov 
<sergey.chugu...@gmail.com> wrote:
>Do you see an obvious way of implementing it?
>
>In Java there is a heap and a GC working on it, and it is possible, for
>instance, to make a decision to throw an OOM based on GC metrics.
>
>I may be wrong, but I don't see a mechanism in Ignite that can be used
>right away for such purposes.
>And implementing something without thorough planning brings a huge risk
>of false positives, with nodes stopping when they don't have to.
>
>That's why I think it must be implemented and intensively tested as
>part of
>a separate ticket.
>
>Thanks,
>Sergey.
>
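For context, the Java-side mechanism Sergey refers to is exposed through the standard JMX beans. A minimal sketch of detecting heap pressure from GC metrics (the 90% threshold and the class name are my own assumptions, nothing Ignite-specific):

import java.lang.management.*;
import javax.management.NotificationEmitter;

public class HeapPressureWatch {
    public static void main(String[] args) {
        // Arm the "collection usage" threshold on heap pools: it is evaluated right
        // after a GC, so crossing it means a collection failed to free enough memory.
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            long max = pool.getUsage().getMax();
            if (pool.getType() == MemoryType.HEAP
                && pool.isCollectionUsageThresholdSupported() && max > 0)
                pool.setCollectionUsageThreshold((long)(max * 0.9));
        }

        // The MemoryMXBean emits a notification whenever an armed threshold is crossed.
        NotificationEmitter emitter = (NotificationEmitter) ManagementFactory.getMemoryMXBean();
        emitter.addNotificationListener((n, handback) -> {
            if (MemoryNotificationInfo.MEMORY_COLLECTION_THRESHOLD_EXCEEDED.equals(n.getType()))
                System.err.println("Heap still >90% full after GC - effectively out of memory.");
        }, null, null);
    }
}

Nothing comparable exists out of the box for the offheap regions, which is Sergey's point about needing a separate ticket.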
>On Fri, Aug 4, 2017 at 12:18 PM, <dsetrak...@apache.org> wrote:
>
>> Without #3, #1 and #2 make little sense.
>>
>> Why is #3 so difficult?
>>
>> D.
>>
>> On Aug 4, 2017, 10:46 AM, at 10:46 AM, Sergey Chugunov <
>> sergey.chugu...@gmail.com> wrote:
>> >Dmitriy,
>> >
>> >The last item makes perfect sense to me; one may think of it as an
>> >"OutOfMemoryError" in Java.
>> >However, it looks like such a feature requires considerable effort to
>> >properly design and implement, so I would propose creating a separate
>> >ticket and agreeing upon a target version for it.
>> >
>> >Items #1 and #2 will be implemented under IGNITE-5717. Makes sense?
>> >
>> >Thanks,
>> >Sergey.
>> >
>> >On Thu, Aug 3, 2017 at 4:34 AM, Dmitriy Setrakyan
>> ><dsetrak...@apache.org>
>> >wrote:
>> >
>> >> Here is what we should do:
>> >>
>> >>    1. Pick an acceptable number. It does not matter if it is 10% or 50%.
>> >>    2. Print the allocated memory in *BOLD* letters into the log.
>> >>    3. Make sure that the Ignite server never hangs due to a low-memory
>> >>    issue. We should sense it and kick the node out automatically, again
>> >>    with a *BOLD* message in the log.
>> >>
>> >>  Is this possible?
>> >>
>> >> D.
>> >>
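A rough sketch of what #3 could look like from the outside - a watchdog on free physical RAM that shouts into the log and stops the local node instead of letting the box swap to death. The 256 MB threshold, the check interval and the class name are assumptions; a real implementation would live inside the node and rely on internal metrics:

import com.sun.management.OperatingSystemMXBean;
import java.lang.management.ManagementFactory;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;

public class LowMemoryWatchdog {
    public static void main(String[] args) {
        Ignite ignite = Ignition.start();

        OperatingSystemMXBean os =
            (OperatingSystemMXBean) ManagementFactory.getOperatingSystemMXBean();

        long minFree = 256L * 1024 * 1024; // assumed threshold: 256 MB of free physical RAM

        Executors.newSingleThreadScheduledExecutor().scheduleAtFixedRate(() -> {
            if (os.getFreePhysicalMemorySize() < minFree) {
                // Items #2/#3: loud log message, then take the node down rather than hang the machine.
                System.err.println(">>> PHYSICAL MEMORY CRITICALLY LOW - STOPPING NODE " + ignite.name() + " <<<");
                Ignition.stop(true); // cancel ongoing jobs and leave the topology
            }
        }, 10, 10, TimeUnit.SECONDS);
    }
}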
>> >> On Wed, Aug 2, 2017 at 6:09 PM, Vladimir Ozerov
>> ><voze...@gridgain.com>
>> >> wrote:
>> >>
>> >> > My proposal is 10% instead of 80%.
>> >> >
>> >> > Wed, Aug 2, 2017 at 18:54, Denis Magda <dma...@apache.org>:
>> >> >
>> >> > > Vladimir, Dmitriy P.,
>> >> > >
>> >> > > Please see inline
>> >> > >
>> >> > > > On Aug 2, 2017, at 7:20 AM, Vladimir Ozerov
>> ><voze...@gridgain.com>
>> >> > > wrote:
>> >> > > >
>> >> > > > Denis,
>> >> > > >
>> >> > > > The reason is that the product should not hang the user's computer. How
>> >> > > > else can this be explained? I am a developer. I start Ignite - 1 node, 2
>> >> > > > nodes, X nodes - and observe how they join the topology. I add one key,
>> >> > > > 10 keys, 1M keys. Then I make a bug in an example, accidentally load 100M
>> >> > > > keys - and restart the computer. The correct behavior is to have a small
>> >> > > > "maxMemory" by default to avoid that. The user should get an exception
>> >> > > > instead of a hang. E.g. Java's "-Xmx" is typically 25% of RAM - a more
>> >> > > > adequate value compared to Ignite's.
>> >> > > >
>> >> > >
>> >> > > Right, the developer was educated about the Java heap parameters and
>> >> > > limited the overall space, preferring an OOM to the laptop hanging. Who
>> >> > > knows how he arrived at the conclusion that 25% of RAM should be used -
>> >> > > it might have been deep knowledge of the JVM, or he might have faced
>> >> > > several hangs while testing the application.
>> >> > >
>> >> > > Anyway, the JVM creators did not predefine the Java heap to a static
>> >> > > value to avoid situations like the above, and neither should we as a
>> >> > > platform. Let's educate people about Ignite's memory behavior, like Sun
>> >> > > did for the Java heap, but not try to solve the lack of knowledge with a
>> >> > > static default memory size.
>> >> > >
>> >> > >
>> >> > > > It doesn't matter whether you use persistence or not. The persistent case
>> >> > > > just makes this flaw more obvious - you have a virtually unlimited disk,
>> >> > > > and yet you end up with swapping and a hang when using Ignite with the
>> >> > > > default configuration. As already explained, the problem is not about
>> >> > > > allocating "maxMemory" right away, but about the value of "maxMemory" -
>> >> > > > it is too big.
>> >> > > >
>> >> > >
>> >> > > How do you know what the default should be, then? Why 1 GB? For
>> >> > > instance, if I end up having only 1 GB of free memory left and try to
>> >> > > start 2 server nodes and an application, I will face the laptop hanging
>> >> > > again.
>> >> > >
>> >> > > —
>> >> > > Denis
>> >> > >
>> >> > > > "We had this behavior before" is never an argument. Previous
>> >offheap
>> >> > > > implementation had a lot of flaws, so let's just forget
>about
>> >it.
>> >> > > >
>> >> > > > On Wed, Aug 2, 2017 at 5:08 PM, Denis Magda
><dma...@apache.org>
>> >> wrote:
>> >> > > >
>> >> > > >> Sergey,
>> >> > > >>
>> >> > > >> That’s expected because, as we revealed in this discussion, the
>> >> > > >> allocation works differently depending on whether the persistence is
>> >> > > >> used or not:
>> >> > > >>
>> >> > > >> 1) In-memory mode (the persistence is disabled) - the space will be
>> >> > > >> allocated incrementally until the max threshold is reached. Good!
>> >> > > >>
>> >> > > >> 2) The persistence mode - the whole space (limited by the max
>> >> > > >> threshold) is allocated right away. It’s not surprising that your
>> >> > > >> laptop starts choking.
>> >> > > >>
>> >> > > >> So, in my previous response I tried to explain that I can’t find any
>> >> > > >> reason why we should adjust 1). Any reasons except for the massive
>> >> > > >> preloading?
>> >> > > >>
>> >> > > >> As for 2), it was a big surprise to discover this after the 2.1
>> >> > > >> release. We definitely have to fix this somehow.
>> >> > > >>
>> >> > > >> —
>> >> > > >> Denis
>> >> > > >>
>> >> > > >>> On Aug 2, 2017, at 6:59 AM, Sergey Chugunov <
>> >> > sergey.chugu...@gmail.com
>> >> > > >
>> >> > > >> wrote:
>> >> > > >>>
>> >> > > >>> Denis,
>> >> > > >>>
>> >> > > >>> Just a simple example from our own codebase: I tried to execute
>> >> > > >>> PersistentStoreExample with default settings and two server nodes, and
>> >> > > >>> the client node froze even on the initial load of data into the grid,
>> >> > > >>> although with one server node the example finishes pretty quickly.
>> >> > > >>>
>> >> > > >>> And my laptop isn't the weakest one and has 16 gigs of memory, but it
>> >> > > >>> cannot cope with this.
>> >> > > >>>
>> >> > > >>>
>> >> > > >>> On Wed, Aug 2, 2017 at 4:58 PM, Denis Magda
>> ><dma...@apache.org>
>> >> > wrote:
>> >> > > >>>
>> >> > > >>>>> As far as allocating 80% of available RAM - I was
>against
>> >this
>> >> even
>> >> > > for
>> >> > > >>>>> In-memory mode and still think that this is a wrong
>> >default.
>> >> > Looking
>> >> > > at
>> >> > > >>>>> free RAM is even worse because it gives you undefined
>> >behavior.
>> >> > > >>>>
>> >> > > >>>> Guys, I cannot understand how the high-level behavior of this dynamic
>> >> > > >>>> memory allocation (with the persistence DISABLED) is different from the
>> >> > > >>>> legacy off-heap memory we had in 1.x. Both off-heap memories allocate
>> >> > > >>>> the space on demand; the current one just does this more aggressively,
>> >> > > >>>> requesting big chunks.
>> >> > > >>>>
>> >> > > >>>> Next, the legacy one was unlimited by default, and the user could start
>> >> > > >>>> as many nodes as he wanted on a laptop and preload as much data as he
>> >> > > >>>> needed. Sure, he could bring down the laptop if too many entries were
>> >> > > >>>> injected into the local cluster. But that is about too-massive
>> >> > > >>>> preloading and is not caused by the ability of the legacy off-heap
>> >> > > >>>> memory to grow infinitely. The same preloading would cause a hang if
>> >> > > >>>> the Java heap memory mode were used.
>> >> > > >>>>
>> >> > > >>>> The upshot is that massive preloading of data on a local laptop should
>> >> > > >>>> not be fixed by repealing the dynamic memory allocation.
>> >> > > >>>> Is there any other reason why we have to use static memory allocation
>> >> > > >>>> for the case when the persistence is disabled? I think the case with
>> >> > > >>>> the persistence should be reviewed separately.
>> >> > > >>>>
>> >> > > >>>> —
>> >> > > >>>> Denis
>> >> > > >>>>
>> >> > > >>>>> On Aug 2, 2017, at 12:45 AM, Alexey Goncharuk <
>> >> > > >>>> alexey.goncha...@gmail.com> wrote:
>> >> > > >>>>>
>> >> > > >>>>> Dmitriy,
>> >> > > >>>>>
>> >> > > >>>>> The reason behind this is the need to be able to evict pages to disk
>> >> > > >>>>> and load them back, so we need to preserve a PageId->Pointer mapping
>> >> > > >>>>> in memory. In order to do this in the most efficient way, we need to
>> >> > > >>>>> know in advance all the address ranges we work with. We can add
>> >> > > >>>>> dynamic memory extension for a persistence-enabled config, but this
>> >> > > >>>>> will add yet another step of indirection when resolving every page
>> >> > > >>>>> address, which adds a noticeable performance penalty.
>> >> > > >>>>>
>> >> > > >>>>>
>> >> > > >>>>>
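To make the indirection point concrete, here is a toy illustration (not Ignite's actual page-memory code, just the shape of the two address-resolution schemes):

// Toy illustration: why a region allocated up front makes PageId -> address
// resolution cheaper than dynamically grown segments.
public class PageAddressing {
    static final int PAGE_SIZE = 4096;

    // Single region allocated up front: one base pointer, pure arithmetic per access.
    static long resolvePreallocated(long basePtr, long pageIdx) {
        return basePtr + pageIdx * PAGE_SIZE;
    }

    // Dynamically extended memory: pages live in many segments, so every access
    // first has to locate the right segment - an extra lookup on the hot path.
    static long resolveDynamic(long[] segmentBases, int pagesPerSegment, long pageIdx) {
        int seg = (int)(pageIdx / pagesPerSegment);           // which segment holds the page
        long off = (pageIdx % pagesPerSegment) * PAGE_SIZE;   // offset within that segment
        return segmentBases[seg] + off;
    }
}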
>> >> > > >>>>> 2017-08-02 10:37 GMT+03:00 Dmitriy Setrakyan <
>> >> > dsetrak...@apache.org
>> >> > > >:
>> >> > > >>>>>
>> >> > > >>>>>> On Wed, Aug 2, 2017 at 9:33 AM, Vladimir Ozerov <
>> >> > > voze...@gridgain.com
>> >> > > >>>
>> >> > > >>>>>> wrote:
>> >> > > >>>>>>
>> >> > > >>>>>>> Dima,
>> >> > > >>>>>>>
>> >> > > >>>>>>> Probably folks who worked closely with storage know
>why.
>> >> > > >>>>>>>
>> >> > > >>>>>>
>> >> > > >>>>>> Without knowing why, how can we make a decision?
>> >> > > >>>>>>
>> >> > > >>>>>> Alexey Goncharuk, was it you who made the decision about not using
>> >> > > >>>>>> increments? Do you remember what the reason was?
>> >> > > >>>>>>
>> >> > > >>>>>>
>> >> > > >>>>>>>
>> >> > > >>>>>>> The very problem is that before being started once in a production
>> >> > > >>>>>>> environment, Ignite will typically be started a hundred times in
>> >> > > >>>>>>> developers' environments. I think that the default should be ~10% of
>> >> > > >>>>>>> total RAM.
>> >> > > >>>>>>>
>> >> > > >>>>>>
>> >> > > >>>>>> Why not 80% of *free* RAM?
>> >> > > >>>>>>
>> >> > > >>>>>>
>> >> > > >>>>>>>
>> >> > > >>>>>>> On Wed, Aug 2, 2017 at 10:21 AM, Dmitriy Setrakyan <
>> >> > > >>>>>> dsetrak...@apache.org>
>> >> > > >>>>>>> wrote:
>> >> > > >>>>>>>
>> >> > > >>>>>>>> On Wed, Aug 2, 2017 at 7:27 AM, Vladimir Ozerov <
>> >> > > >> voze...@gridgain.com
>> >> > > >>>>>
>> >> > > >>>>>>>> wrote:
>> >> > > >>>>>>>>
>> >> > > >>>>>>>>> Please see Sergey's original message - when persistence is enabled,
>> >> > > >>>>>>>>> memory is not allocated incrementally; the full maxSize is allocated.
>> >> > > >>>>>>>>>
>> >> > > >>>>>>>>
>> >> > > >>>>>>>> Why?
>> >> > > >>>>>>>>
>> >> > > >>>>>>>>
>> >> > > >>>>>>>>> Default settings must allow for normal work in a developer's
>> >> > > >>>>>>>>> environment.
>> >> > > >>>>>>>>>
>> >> > > >>>>>>>>
>> >> > > >>>>>>>> Agree, but why not in increments?
>> >> > > >>>>>>>>
>> >> > > >>>>>>>>
>> >> > > >>>>>>>>>
>> >> > > >>>>>>>>> Wed, Aug 2, 2017 at 1:10, Denis Magda <dma...@apache.org>:
>> >> > > >>>>>>>>>
>> >> > > >>>>>>>>>>> Why not allocate in increments automatically?
>> >> > > >>>>>>>>>>
>> >> > > >>>>>>>>>> This is exactly how the allocation works right now.
>> >The
>> >> memory
>> >> > > >> will
>> >> > > >>>>>>>> grow
>> >> > > >>>>>>>>>> incrementally until the max size is reached (80% of
>> >RAM by
>> >> > > >>>>>> default).
>> >> > > >>>>>>>>>>
>> >> > > >>>>>>>>>> —
>> >> > > >>>>>>>>>> Denis
>> >> > > >>>>>>>>>>
>> >> > > >>>>>>>>>>> On Aug 1, 2017, at 3:03 PM, dsetrak...@apache.org
>> >wrote:
>> >> > > >>>>>>>>>>>
>> >> > > >>>>>>>>>>> Vova, 1GB seems a bit too small for me, and frankly I do not want
>> >> > > >>>>>>>>>>> to guess. Why not allocate in increments automatically?
>> >> > > >>>>>>>>>>>
>> >> > > >>>>>>>>>>> D.
>> >> > > >>>>>>>>>>>
>> >> > > >>>>>>>>>>> On Aug 1, 2017, 11:03 PM, at 11:03 PM, Vladimir
>> >Ozerov <
>> >> > > >>>>>>>>>> voze...@gridgain.com> wrote:
>> >> > > >>>>>>>>>>>> Denis,
>> >> > > >>>>>>>>>>>> No doubt you haven't heard about it - AI 2.1 with persistence,
>> >> > > >>>>>>>>>>>> where 80% of RAM is allocated right away, was released only
>> >> > > >>>>>>>>>>>> several days ago. How many users do you think have tried it
>> >> > > >>>>>>>>>>>> already?
>> >> > > >>>>>>>>>>>>
>> >> > > >>>>>>>>>>>> Guys,
>> >> > > >>>>>>>>>>>> Do you really think allocating 80% of available RAM is a normal
>> >> > > >>>>>>>>>>>> thing? Take your laptop and check how much available RAM you have
>> >> > > >>>>>>>>>>>> right now. Do you fit into the remaining 20%? If not, then running
>> >> > > >>>>>>>>>>>> AI with persistence with all defaults will bring your machine
>> >> > > >>>>>>>>>>>> down. This is insane. We should allocate no more than 1Gb, so that
>> >> > > >>>>>>>>>>>> a user can play with it without any problems.
>> >> > > >>>>>>>>>>>>
>> >> > > >>>>>>>>>>>> On Tue, Aug 1, 2017 at 10:26 PM, Denis Magda <
>> >> > > dma...@apache.org
>> >> > > >>>>>>>
>> >> > > >>>>>>>>> wrote:
>> >> > > >>>>>>>>>>>>
>> >> > > >>>>>>>>>>>>> My vote goes for option #1 too. I don’t think that 80% is so
>> >> > > >>>>>>>>>>>>> aggressive that we need to bring it down.
>> >> > > >>>>>>>>>>>>>
>> >> > > >>>>>>>>>>>>> IGNITE-5717 was created to fix the issue of the 80% RAM allocation
>> >> > > >>>>>>>>>>>>> on 64-bit systems when Ignite works on top of a 32-bit JVM. I’ve
>> >> > > >>>>>>>>>>>>> not heard of any other complaints in regard to the default
>> >> > > >>>>>>>>>>>>> allocation size.
>> >> > > >>>>>>>>>>>>>
>> >> > > >>>>>>>>>>>>> —
>> >> > > >>>>>>>>>>>>> Denis
>> >> > > >>>>>>>>>>>>>
>> >> > > >>>>>>>>>>>>>> On Aug 1, 2017, at 10:58 AM,
>dsetrak...@apache.org
>> >> wrote:
>> >> > > >>>>>>>>>>>>>>
>> >> > > >>>>>>>>>>>>>> I prefer option #1.
>> >> > > >>>>>>>>>>>>>>
>> >> > > >>>>>>>>>>>>>> D.
>> >> > > >>>>>>>>>>>>>>
>> >> > > >>>>>>>>>>>>>> On Aug 1, 2017, 11:20 AM, at 11:20 AM, Sergey
>> >Chugunov <
>> >> > > >>>>>>>>>>>>> sergey.chugu...@gmail.com> wrote:
>> >> > > >>>>>>>>>>>>>>> Folks,
>> >> > > >>>>>>>>>>>>>>>
>> >> > > >>>>>>>>>>>>>>> I would like to get back to the question about
>> >> > MemoryPolicy
>> >> > > >>>>>>>>>>>> maxMemory
>> >> > > >>>>>>>>>>>>>>> defaults.
>> >> > > >>>>>>>>>>>>>>>
>> >> > > >>>>>>>>>>>>>>> Although a MemoryPolicy may be configured with initial and maxMemory
>> >> > > >>>>>>>>>>>>>>> settings, when persistence is used the MemoryPolicy always allocates
>> >> > > >>>>>>>>>>>>>>> the full maxMemory size for performance reasons.
>> >> > > >>>>>>>>>>>>>>>
>> >> > > >>>>>>>>>>>>>>> As the default size of maxMemory is 80% of physical memory, it causes
>> >> > > >>>>>>>>>>>>>>> OOM errors on 32-bit platforms (either at the OS or the JVM level)
>> >> > > >>>>>>>>>>>>>>> and hurts performance in setups where multiple Ignite nodes are
>> >> > > >>>>>>>>>>>>>>> started on the same physical server.
>> >> > > >>>>>>>>>>>>>>>
>> >> > > >>>>>>>>>>>>>>> I suggest rethinking these defaults and switching to other options:
>> >> > > >>>>>>>>>>>>>>>
>> >> > > >>>>>>>>>>>>>>> - Check whether the platform is 32- or 64-bit and adapt the defaults.
>> >> > > >>>>>>>>>>>>>>> In this case we still need to address the issue with multiple nodes
>> >> > > >>>>>>>>>>>>>>> on one machine even on 64-bit systems.
>> >> > > >>>>>>>>>>>>>>>
>> >> > > >>>>>>>>>>>>>>> - Lower the default for maxMemory and allocate, for instance,
>> >> > > >>>>>>>>>>>>>>> max(0.3 * availableMemory, 1Gb).
>> >> > > >>>>>>>>>>>>>>> This option allows us to solve all issues with starting on 32-bit
>> >> > > >>>>>>>>>>>>>>> platforms and reduce instability with multiple nodes on the same
>> >> > > >>>>>>>>>>>>>>> machine.
>> >> > > >>>>>>>>>>>>>>>
>> >> > > >>>>>>>>>>>>>>>
>> >> > > >>>>>>>>>>>>>>> Thoughts and/or other options?
>> >> > > >>>>>>>>>>>>>>>
>> >> > > >>>>>>>>>>>>>>> Thanks,
>> >> > > >>>>>>>>>>>>>>> Sergey.
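Until the defaults change, the practical workaround is to cap maxMemory explicitly in the node configuration. A rough sketch combining Sergey's proposed default, max(0.3 * availableMemory, 1Gb), with an explicit cap - the policy name and sizes are arbitrary, and the 2.1 setter names are from memory, so please double-check them against the javadoc:

import java.lang.management.ManagementFactory;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.configuration.MemoryConfiguration;
import org.apache.ignite.configuration.MemoryPolicyConfiguration;

public class CappedMemoryStartup {
    public static void main(String[] args) {
        // Proposed default: max(0.3 * physical RAM, 1 GB).
        long physical = ((com.sun.management.OperatingSystemMXBean)
            ManagementFactory.getOperatingSystemMXBean()).getTotalPhysicalMemorySize();
        long maxSize = Math.max((long)(0.3 * physical), 1L << 30);

        // Explicitly capped memory policy instead of the 80%-of-RAM default.
        MemoryPolicyConfiguration plc = new MemoryPolicyConfiguration();
        plc.setName("capped");
        plc.setInitialSize(256L * 1024 * 1024);
        plc.setMaxSize(maxSize);

        MemoryConfiguration memCfg = new MemoryConfiguration();
        memCfg.setMemoryPolicies(plc);
        memCfg.setDefaultMemoryPolicyName("capped");

        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setMemoryConfiguration(memCfg);

        Ignition.start(cfg);
    }
}

With persistence enabled this still reserves the whole maxSize up front, but at least it is a size the machine can actually afford.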
>> >> > > >>>>>>>>>>>>>
>> >> > > >>>>>>>>>>>>>
>> >> > > >>>>>>>>>>
>> >> > > >>>>>>>>>>
>> >> > > >>>>>>>>>
>> >> > > >>>>>>>>
>> >> > > >>>>>>>
>> >> > > >>>>>>
>> >> > > >>>>
>> >> > > >>>>
>> >> > > >>
>> >> > > >>
>> >> > >
>> >> > >
>> >> >
>> >>
>>
