Re: penn state academic paper - "scalable" bloom filters

2018-02-22 Thread Jeff Jirsa
Potentially more interesting, range filters: https://issues.apache.org/jira/plugins/servlet/mobile#issue/CASSANDRA-9843 And RocksDB has a prefix bloom filter (https://github.com/facebook/rocksdb/wiki/Prefix-Seek-API-Changes), which we could potentially use to track partition:partial-clustering

Re: penn state academic paper - "scalable" bloom filters

2018-02-22 Thread Jay Zhuang
I think there's a similar idea here to dynamically resize the BF: https://issues.apache.org/jira/browse/CASSANDRA-6633, but I don't quite understand the idea there. On Thu, Feb 22, 2018 at 7:45 AM, Carl Mueller wrote: >

Re: Why isn't there a separate JVM per table?

2018-02-22 Thread Jonathan Haddad
There's an incredible amount of work that would need to be done in order to make any of this happen. Basically a full rewrite of the entire codebase. Years of effort. The codebase would have to move to a shared-nothing actor & message based communication mechanism before any of this is possible.

Re: Why isn't there a separate JVM per table?

2018-02-22 Thread J. D. Jordan
I would be careful with anything per table for memory sizing. We used to have many caches and things that could be tuned per table, but they have all since changed to being per node, as it was a real PITA to get them right. Having to do per table heap/gc/memtable/cache tuning just sounds like

Re: Why isn't there a separate JVM per table?

2018-02-22 Thread kurt greaves
> > ... compaction on its own jvm was also something I was thinking about, but > then I realized even more JVM sharding could be done at the table level. Compaction in its own JVM makes sense. At the table level I'm not so sure about. Gotta be some serious overheads from running that many

Re: Why isn't there a separate JVM per table?

2018-02-22 Thread Nate McCall
Agree that any first efforts per compaction should be on profiling. Probably some low-hanging fruit there. On Fri, Feb 23, 2018 at 11:55 AM, Jeff Jirsa wrote: > Bloom filters are offheap. > > To be honest, there may come a time when it makes sense to move compaction > into its

Re: Why isn't there a separate JVM per table?

2018-02-22 Thread Carl Mueller
Alternative: JVM per vnode. On Thu, Feb 22, 2018 at 4:52 PM, Carl Mueller wrote: > Bloom filters... nevermind > > > On Thu, Feb 22, 2018 at 4:48 PM, Carl Mueller < > carl.muel...@smartthings.com> wrote: > >> Is the current reason for a large starting heap due to

Re: Why isn't there a separate JVM per table?

2018-02-22 Thread Jeff Jirsa
Bloom filters are offheap. To be honest, there may come a time when it makes sense to move compaction into its own JVM, but it would be FAR less effort to just profile what exists now and fix the problems. On Thu, Feb 22, 2018 at 2:52 PM, Carl Mueller wrote: >

Re: Why isn't there a separate JVM per table?

2018-02-22 Thread Carl Mueller
Bloom filters... nevermind On Thu, Feb 22, 2018 at 4:48 PM, Carl Mueller wrote: > Is the current reason for a large starting heap due to the memtable? > > On Thu, Feb 22, 2018 at 4:44 PM, Carl Mueller < > carl.muel...@smartthings.com> wrote: > >> ... compaction

Re: Issues with Materialized-Views updates during a cluster change?

2018-02-22 Thread Paulo Motta
> Is this a realistic case when Cassandra (unless I'm missing something) is limited to adding or removing a single node at a time? I'm sure this can happen under some sort of generic range movement of some sort (how does one initiate such movement, and why), but will it happen under "normal"

Re: Why isn't there a separate JVM per table?

2018-02-22 Thread Carl Mueller
Is the current reason for a large starting heap due to the memtable? On Thu, Feb 22, 2018 at 4:44 PM, Carl Mueller wrote: > ... compaction on its own jvm was also something I was thinking about, > but then I realized even more JVM sharding could be done at the

Re: Why isn't there a separate JVM per table?

2018-02-22 Thread Carl Mueller
... compaction on its own jvm was also something I was thinking about, but then I realized even more JVM sharding could be done at the table level. On Thu, Feb 22, 2018 at 4:09 PM, Jon Haddad wrote: > Yeah, I’m in the compaction on its own JVM camp, in an ideal world where

Re: Why isn't there a separate JVM per table?

2018-02-22 Thread Jon Haddad
Yeah, I’m in the compaction on its own JVM camp, in an ideal world where we’re isolating crazy GC churning parts of the DB. It would mean reworking how tasks are created and removal of all shared state in favor of messaging + a smarter manager, which imo would be a good idea regardless. It

Re: Why isn't there a separate JVM per table?

2018-02-22 Thread Nate McCall
I've heard a couple of folks pontificate on compaction in its own process as well, given it has such a high impact on GC. Not sure about the value of individual tables. Interesting idea though. On Fri, Feb 23, 2018 at 10:45 AM, Gary Dusbabek wrote: > I've given it some

Re: Why isn't there a separate JVM per table?

2018-02-22 Thread Gary Dusbabek
I've given it some thought in the past. In the end, I usually talk myself out of it because I think it increases the surface area for failure. That is, managing N processes is more difficult than managing one process. But if the additional failure modes are addressed, there are some interesting

Re: Why isn't there a separate JVM per table?

2018-02-22 Thread Michael Kjellman
it's an interesting idea. i'd wonder how much overhead you'd end up with message parsing and negate any potential GC wins. rick branson had played around a bunch with running storage nodes and doubling down on the old "fat client" model. if you had 1 tables (yes, barely works but we don't

Why isn't there a separate JVM per table?

2018-02-22 Thread Carl Mueller
GC pauses may have been improved in newer releases, since we are in 2.1.x, but I was wondering why cassandra uses one jvm for all tables and keyspaces, intermingling the heap for on-JVM objects. ... so why doesn't cassandra spin off a jvm per table so each jvm can be tuned per table and gc tuned

Re: Expensive metrics?

2018-02-22 Thread Michael Burman
Hi, I was referring to this article by Shipilev (there are a few small issues forgotten in that url you pasted): https://shipilev.net/blog/2014/nanotrusting-nanotime/ And his lovely recommendation on it: "System.nanoTime is as bad as String.intern now: you can use it, but use it wisely."

Re: Expensive metrics?

2018-02-22 Thread Michael Burman
Hi, I've looked at the metrics' expense at a high level. It's around 4% of the total CPU time on my machine. But the problem with that higher level measurement is that it does not show waits. When I push writes to Cassandra (through CQL) I'm mostly getting stalls according to the kernel

Re: Expensive metrics?

2018-02-22 Thread Jonathan Haddad
Hey Micke, very cool you're looking to improve C*'s performance, we would absolutely benefit from it. Have you done any other benchmarks besides the micro one to determine the total effect of these metrics on the system overall? Microbenchmarks are a great way to tune small sections of code but

Re: Expensive metrics?

2018-02-22 Thread Jeremiah D Jordan
re: nanoTime vs currentTimeMillis there is a good blog post here about the timing of both and how your choice of Linux clock source can drastically affect the speed of the calls, and also showing that in general on Linux there is no perf improvement for one over the other.
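To get a feel for the clock-source point above on a given machine, here is a rough sketch in Python rather than on the JVM. It is only an analogy and rests on some assumptions: time.time() stands in for currentTimeMillis (wall clock) and time.monotonic_ns() for nanoTime (monotonic clock), the helper names current_clocksource and ns_per_call are made up for this example, and the numbers include Python's loop and call overhead, so only the relative difference between runs under different clock sources means anything.

```python
import time
from pathlib import Path
from typing import Callable, Optional

# Standard Linux sysfs location of the active clock source (absent on other OSes).
CLOCKSOURCE_PATH = Path(
    "/sys/devices/system/clocksource/clocksource0/current_clocksource"
)


def current_clocksource() -> Optional[str]:
    """Return the active Linux clock source (e.g. 'tsc', 'hpet'), or None."""
    try:
        return CLOCKSOURCE_PATH.read_text().strip()
    except OSError:
        return None


def ns_per_call(clock: Callable[[], object], calls: int = 200_000) -> float:
    """Average wall time per call of `clock`, in nanoseconds.

    Includes interpreter loop overhead, so treat it as a relative number.
    """
    start = time.perf_counter_ns()
    for _ in range(calls):
        clock()
    return (time.perf_counter_ns() - start) / calls


if __name__ == "__main__":
    print("clock source:", current_clocksource())
    print("wall clock      ns/call:", ns_per_call(time.time))
    print("monotonic clock ns/call:", ns_per_call(time.monotonic_ns))
```

Running it once per clock source (for example tsc and hpet, where the kernel offers both) should show whether the cost of a single timestamp read changes on that host. It says nothing about the JVM's own call overhead.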

Re: Expensive metrics?

2018-02-22 Thread Blake Eggleston
Hi Micke, This is really cool, thanks for taking the time to investigate this. I believe the metrics around memtable insert time come in handy in identifying high partition contention in the memtable. I know I've been involved in a situation over the past year where we got actionable info from

penn state academic paper - "scalable" bloom filters

2018-02-22 Thread Carl Mueller
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.62.7953&rep=rep1&type=pdf looks to be an adaptive approach where the "initial guess" bloom filters are enhanced with more layers of ones generated after usage stats are gained. Disclaimer: I suck at reading academic papers.
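For readers who want to see the layering idea concretely, here is a minimal sketch in Python. It is not taken from the paper or from Cassandra. The class names, the growth factor of 2 and the tightening ratio of 0.5 are all assumptions picked for illustration. The idea is: start with a small "initial guess" filter, add a bigger filter with a stricter error rate whenever the newest layer fills up, and answer lookups by checking every layer.

```python
import hashlib
import math


class BloomFilter:
    """Fixed-size Bloom filter sized for `capacity` items at `error_rate`."""

    def __init__(self, capacity: int, error_rate: float) -> None:
        self.capacity = capacity
        self.error_rate = error_rate
        # Standard sizing: m = -n ln(p) / (ln 2)^2 bits, k = (m / n) ln 2 hashes.
        self.num_bits = math.ceil(-capacity * math.log(error_rate) / math.log(2) ** 2)
        self.num_hashes = max(1, round(self.num_bits / capacity * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)
        self.count = 0

    def _positions(self, key: bytes):
        # Double hashing: derive k bit positions from two halves of one digest.
        digest = hashlib.sha256(key).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, key: bytes) -> None:
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)
        self.count += 1

    def __contains__(self, key: bytes) -> bool:
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key)
        )


class LayeredBloomFilter:
    """Grows by stacking filters; growth and tightening values are assumptions."""

    def __init__(self, initial_capacity: int = 64, error_rate: float = 0.01,
                 growth: int = 2, tightening: float = 0.5) -> None:
        self.growth = growth
        self.tightening = tightening
        # Layer errors form a geometric series whose sum stays <= error_rate.
        first_error = error_rate * (1 - tightening)
        self.layers = [BloomFilter(initial_capacity, first_error)]

    def add(self, key: bytes) -> None:
        newest = self.layers[-1]
        if newest.count >= newest.capacity:
            # Newest layer is full: stack a larger, stricter one on top.
            newest = BloomFilter(newest.capacity * self.growth,
                                 newest.error_rate * self.tightening)
            self.layers.append(newest)
        newest.add(key)

    def __contains__(self, key: bytes) -> bool:
        # A key may live in any layer, so every layer has to be consulted.
        return any(key in layer for layer in self.layers)
```

The trade-off is visible in __contains__: each extra layer adds lookup work and a bit more false-positive probability, which is why each new layer gets a stricter error rate than the one before it.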

Expensive metrics?

2018-02-22 Thread Michael Burman
Hi, I wanted to get some input from the mailing list before making a JIRA and potential fixes. I'll touch on performance more in the latter part, but there's one important question regarding the write latency metric recording place. Currently we measure the writeLatency (and metric write

Re: Cassandra Needs to Grow Up by Version Five!

2018-02-22 Thread Eric Plowe
Cassandra, hard to use? I disagree completely. With that said, there are definitely deficiencies in certain parts of the documentation, but nothing that is a show stopper. We’ve been using Cassandra since the sub 1.0 days and have had nothing but great things to say about it. With that said, its

Re: Issues with Materialized-Views updates during a cluster change?

2018-02-22 Thread Nadav Har'El
On Thu, Feb 22, 2018 at 12:54 AM, Paulo Motta wrote: > > Good catch! This indeed seems to be a regression caused by > CASSANDRA-13069, so I created CASSANDRA-14251 to restore the correct > behavior. > I have a question about your patch

Re: Cassandra Needs to Grow Up by Version Five!

2018-02-22 Thread Oleksandr Shulgin
On Thu, Feb 22, 2018 at 9:50 AM, Eric Plowe wrote: > Cassandra, hard to use? I disagree completely. With that said, there are > definitely deficiencies in certain parts of the documentation, but nothing > that is a show stopper. True, there are no show-stoppers from the

RE: Can API be an alternative for MBeans?

2018-02-22 Thread Jacques-Henri Berthemet
I didn't know about it, I'm now watching it, thank you! -- Jacques-Henri Berthemet -Original Message- From: Murukesh Mohanan [mailto:murukesh.moha...@gmail.com] Sent: Thursday, February 22, 2018 10:08 AM To: dev@cassandra.apache.org Subject: Re: Can API be an alternative for MBeans?

Re: Can API be an alternative for MBeans?

2018-02-22 Thread Murukesh Mohanan
You might want to keep an eye on https://issues.apache.org/jira/browse/CASSANDRA-7622 (I suspect you might already be doing so, but just in case...) On Thu, 22 Feb 2018 at 17:57 Jacques-Henri Berthemet < jacques-henri.berthe...@genesys.com> wrote: > Hi, > > What would be great would be to be

RE: Can API be an alternative for MBeans?

2018-02-22 Thread Jacques-Henri Berthemet
Hi, What would be great would be to be able to query stats using CQL on some "virtual" system tables. Exposing a REST API would be another endpoint to secure. -- Jacques-Henri Berthemet -Original Message- From: Nicolas Guyomar [mailto:nicolas.guyo...@gmail.com] Sent: Thursday,

RE: Cassandra Needs to Grow Up by Version Five!

2018-02-22 Thread Jacques-Henri Berthemet
Hi Kenneth, As a Cassandra user I value usability, but since it's a database I value consistency and performance even more. If you want usability and documentation you can use Datastax DSE, after all that's where they add value on top of Cassandra. Since Datastax actually paid dev to work

Re: Issues with Materialized-Views updates during a cluster change?

2018-02-22 Thread Nadav Har'El
On Thu, Feb 22, 2018 at 12:54 AM, Paulo Motta wrote: > > 1. It seems that for example when RF=3, each one of the three base > replicas will send a view update to the fourth "pending node". While this > is not wrong, it's also inefficient - why send three copies of the

Re: Can API be an alternative for MBeans?

2018-02-22 Thread Nicolas Guyomar
Hi, Jolokia (for instance) makes exposing MBeans over HTTP so easy (with "just" a simple jar addition) that I think this is not really needed within Cassandra. On 22 February 2018 at 09:10, Venkata Hari Krishna Nukala < n.v.harikrishna.apa...@gmail.com> wrote: > Hi, > > I saw lots of

Can API be an alternative for MBeans?

2018-02-22 Thread Venkata Hari Krishna Nukala
Hi, I saw lots of information exposed through MBeans (like status, cfstats etc...). I feel exposing them as an API has a few advantages: it is more open (different types of clients can use it) and more expressive for requests and responses. Does the option of exposing such functionality through