Bloom filters are off-heap.

To be honest, there may come a time when it makes sense to move compaction
into its own JVM, but it would be FAR less effort to just profile what
exists now and fix the problems.
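
If anyone wants a starting point on the profiling side, here is a minimal
sketch of the kind of GC logging you could turn on in conf/jvm.options (or
cassandra-env.sh on older lines). These are standard HotSpot JDK 8 flags;
the log path is only an example, adjust it for your install:

    # GC logging: pause lengths, tenuring, and total stopped time
    -XX:+PrintGCDetails
    -XX:+PrintGCDateStamps
    -XX:+PrintTenuringDistribution
    -XX:+PrintGCApplicationStoppedTime
    -Xloggc:/var/log/cassandra/gc.log
    # rotate so the log doesn't grow without bound
    -XX:+UseGCLogFileRotation
    -XX:NumberOfGCLogFiles=10
    -XX:GCLogFileSize=10M

Pair that with an allocation profiler (e.g. JFR or async-profiler) on a
canary node and the worst allocation sites should fall out fairly quickly.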



On Thu, Feb 22, 2018 at 2:52 PM, Carl Mueller <carl.muel...@smartthings.com>
wrote:

> Bloom filters... never mind
>
>
> On Thu, Feb 22, 2018 at 4:48 PM, Carl Mueller <carl.muel...@smartthings.com> wrote:
>
> > Is the current reason for a large starting heap due to the memtable?
> >
> > On Thu, Feb 22, 2018 at 4:44 PM, Carl Mueller <carl.muel...@smartthings.com> wrote:
> >
> >> ... compaction on its own JVM was also something I was thinking about,
> >> but then I realized even more JVM sharding could be done at the table
> >> level.
> >>
> >> On Thu, Feb 22, 2018 at 4:09 PM, Jon Haddad <j...@jonhaddad.com> wrote:
> >>
> >>> Yeah, I’m in the compaction-on-its-own-JVM camp, in an ideal world
> >>> where we’re isolating the crazy GC-churning parts of the DB.  It would
> >>> mean reworking how tasks are created and removing all shared state in
> >>> favor of messaging + a smarter manager, which imo would be a good idea
> >>> regardless.
> >>>
> >>> It might be a better use of time (especially for 4.0) to do some GC
> >>> performance profiling and cut down on the allocations, since that
> >>> doesn’t involve a massive effort.
> >>>
> >>> I’ve been meaning to do a little benchmarking and profiling for a
> >>> while now, and it seems like a few others have the same inclination as
> >>> well; maybe now is a good time to coordinate that.  A nice perf bump
> >>> for 4.0 would be very rewarding.
> >>>
> >>> Jon
> >>>
> >>> > On Feb 22, 2018, at 2:00 PM, Nate McCall <zznat...@gmail.com> wrote:
> >>> >
> >>> > I've heard a couple of folks pontificate on compaction in its own
> >>> > process as well, given it has such a high impact on GC. Not sure
> >>> > about the value of individual tables. Interesting idea though.
> >>> >
> >>> > On Fri, Feb 23, 2018 at 10:45 AM, Gary Dusbabek <gdusba...@gmail.com> wrote:
> >>> >> I've given it some thought in the past. In the end, I usually talk
> >>> >> myself out of it because I think it increases the surface area for
> >>> >> failure. That is, managing N processes is more difficult than
> >>> >> managing one process. But if the additional failure modes are
> >>> >> addressed, there are some interesting possibilities.
> >>> >>
> >>> >> For example, having gossip in its own process would decrease the
> >>> >> odds that a node is marked dead because STW GC is happening in the
> >>> >> storage JVM. On the flip side, you'd need checks so that the gossip
> >>> >> process can recognize when the storage process has died vs. just
> >>> >> running a long GC. [A sketch of one such check follows the quoted
> >>> >> thread.]
> >>> >>
> >>> >> I don't know that I'd go so far as to have separate processes for
> >>> >> keyspaces, etc.
> >>> >>
> >>> >> There is probably some interesting work that could be done to
> >>> >> support the orgs who run multiple Cassandra instances on the same
> >>> >> node (multiple gossipers in that case are at least a little
> >>> >> wasteful).
> >>> >>
> >>> >> I've also played around with using domain sockets for IPC inside
> >>> >> of Cassandra. I never ran a proper benchmark, but there were some
> >>> >> throughput advantages to this approach. [A sketch follows the
> >>> >> quoted thread.]
> >>> >>
> >>> >> Cheers,
> >>> >>
> >>> >> Gary.
> >>> >>
> >>> >>
> >>> >> On Thu, Feb 22, 2018 at 8:39 PM, Carl Mueller <carl.muel...@smartthings.com> wrote:
> >>> >>
> >>> >>> GC pauses may have been improved in newer releases, since we are
> >>> >>> on 2.1.x, but I was wondering why Cassandra uses one JVM for all
> >>> >>> tables and keyspaces, intermingling the heap for on-JVM objects.
> >>> >>>
> >>> >>> ... so why doesn't Cassandra spin off a JVM per table, so each
> >>> >>> JVM's GC can be tuned per table and GC impacts on one table don't
> >>> >>> affect other tables? It would probably increase the number of
> >>> >>> endpoints if we avoid having an overarching query router.
> >>> >>>
> >>> >
> >>>
> >>>
> >>>
> >>>
> >>
> >
>
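
Two quick sketches for the ideas above, both hypothetical and untested.

First, re: Gary's point about the gossip process telling "storage process
died" apart from "storage process is in a long STW GC": one way is to
combine an OS-level liveness check with a short-deadline heartbeat. The
class name, PID handoff, and heartbeat port below are all made up for
illustration (assume the storage process writes its PID somewhere the
gossip process can read); ProcessHandle needs Java 9+.

    import java.net.InetSocketAddress;
    import java.net.Socket;
    import java.util.Optional;

    /**
     * Hypothetical check for a split-process design: a process that is
     * alive per the OS but silent on its heartbeat socket is probably in
     * a stop-the-world GC, not dead.
     */
    public class StorageLiveness {

        enum State { ALIVE, UNRESPONSIVE_MAYBE_GC, DEAD }

        static State check(long storagePid, int heartbeatPort) {
            // ProcessHandle (Java 9+) asks the OS whether the process
            // exists at all; this works even while every Java thread in
            // it is paused by GC, which a socket ping alone can't see.
            Optional<ProcessHandle> handle = ProcessHandle.of(storagePid);
            if (!handle.isPresent() || !handle.get().isAlive()) {
                return State.DEAD;
            }
            // The process exists; does it answer within a short deadline?
            try (Socket s = new Socket()) {
                s.connect(new InetSocketAddress("127.0.0.1", heartbeatPort), 200);
                s.setSoTimeout(200);
                s.getOutputStream().write('p');          // ping
                if (s.getInputStream().read() != -1) {
                    return State.ALIVE;                  // got a byte back
                }
            } catch (Exception e) {
                // fall through: alive per the OS, but not answering
            }
            return State.UNRESPONSIVE_MAYBE_GC;          // don't mark dead yet
        }
    }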
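
Second, re: the domain-socket IPC note: in 2018 this meant a third-party
library (e.g. junixsocket) or JNI, but JDK 16 later added Unix domain
sockets to the NIO channel API, so a modern sketch looks like this (the
socket path is only an example):

    import java.io.IOException;
    import java.net.StandardProtocolFamily;
    import java.net.UnixDomainSocketAddress;
    import java.nio.ByteBuffer;
    import java.nio.channels.ServerSocketChannel;
    import java.nio.channels.SocketChannel;
    import java.nio.charset.StandardCharsets;
    import java.nio.file.Files;
    import java.nio.file.Path;

    /** Tiny echo round-trip over a Unix domain socket (JDK 16+). */
    public class DomainSocketDemo {
        public static void main(String[] args) throws Exception {
            Path path = Path.of("/tmp/ipc-demo.sock");    // illustrative
            Files.deleteIfExists(path);
            UnixDomainSocketAddress addr = UnixDomainSocketAddress.of(path);

            // Bind before starting the client so there is no race.
            ServerSocketChannel server = ServerSocketChannel.open(StandardProtocolFamily.UNIX);
            server.bind(addr);
            Thread echo = new Thread(() -> {
                try (SocketChannel ch = server.accept()) {
                    ByteBuffer buf = ByteBuffer.allocate(256);
                    ch.read(buf);
                    buf.flip();
                    ch.write(buf);                        // echo back
                } catch (IOException e) {
                    throw new RuntimeException(e);
                }
            });
            echo.start();

            // Client side: same address family, no TCP/IP stack involved.
            try (SocketChannel ch = SocketChannel.open(StandardProtocolFamily.UNIX)) {
                ch.connect(addr);
                ch.write(ByteBuffer.wrap("ping".getBytes(StandardCharsets.UTF_8)));
                ByteBuffer buf = ByteBuffer.allocate(256);
                ch.read(buf);
                System.out.println(new String(buf.array(), 0, buf.position(), StandardCharsets.UTF_8));
            }
            echo.join();
            server.close();
            Files.deleteIfExists(path);
        }
    }

Whether that beats loopback TCP for real workloads would still need the
proper benchmark Gary mentioned.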
