Re: CASSANDRA-13241 lower default chunk_length_in_kb

2018-11-03 Thread Carl Mueller
IMO slightly bigger memory requirements in exchange for substantial
improvements is a good trade, especially for a 4.0 release of the database.
Optane and lots of other memory options are coming down the hardware
pipeline, and risk-wise almost all Cassandra operators know to testbed major
versions, so major versions are a good time for significant default changes
(vnode count, this). I've read TLP blogs on this before, and the memory
impact only seems to get huge for nodes that are already beyond the ideal
size; if people want to run nodes that big, then fine, run big memory too.

But I don't actually write code for the project so I don't count :-)
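Carl's memory-impact point can be put in rough numbers. A back-of-envelope model (the ~8 bytes of in-memory offset per compressed chunk is an assumption drawn from this thread, not a measured Cassandra figure):

```python
# Back-of-envelope: Cassandra keeps roughly one 8-byte offset in memory per
# compressed chunk, so halving chunk_length_in_kb roughly doubles that cost.
CHUNK_OFFSET_BYTES = 8  # assumed per-chunk bookkeeping cost

def offset_memory_bytes(data_bytes: int, chunk_kb: int) -> int:
    """Approximate memory used by chunk offsets for a given data size."""
    chunk_bytes = chunk_kb * 1024
    num_chunks = (data_bytes + chunk_bytes - 1) // chunk_bytes
    return num_chunks * CHUNK_OFFSET_BYTES

TIB = 1 << 40
for chunk_kb in (64, 16, 4):
    mib = offset_memory_bytes(TIB, chunk_kb) / (1 << 20)
    print(f"{chunk_kb:>2} KiB chunks on 1 TiB -> ~{mib:,.0f} MiB of offsets")
```

Under this model a 1 TiB node goes from ~128 MiB of offsets at 64 KiB chunks to ~512 MiB at 16 KiB, which matches the framing above: noticeable, but only painful on oversized nodes.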

On Mon, Oct 29, 2018 at 2:42 PM Jonathan Haddad  wrote:

> Looks straightforward, I can review today.
>
> On Mon, Oct 29, 2018 at 12:25 PM Ariel Weisberg  wrote:
>
> > Hi,
> >
> > Seeing too many -'s for changing the representation and essentially no
> +1s
> > so I submitted a patch for just changing the default. I could use a
> > reviewer for https://issues.apache.org/jira/browse/CASSANDRA-13241
> >
> > I created https://issues.apache.org/jira/browse/CASSANDRA-14857  "Use a
> > more space efficient representation for compressed chunk offsets" for
> post
> > 4.0.
> >
> > Regards,
> > Ariel
> >
> > On Tue, Oct 23, 2018, at 11:46 AM, Ariel Weisberg wrote:
> > > Hi,
> > >
> > > To summarize who we have heard from so far
> > >
> > > WRT changing just the default:
> > >
> > > +1:
> > > Jon Haddad
> > > Ben Bromhead
> > > Alain Rodriguez
> > > Sankalp Kohli (not explicit)
> > >
> > > -0:
> > > Sylvain Lebresne
> > > Jeff Jirsa
> > >
> > > Not sure:
> > > Kurt Greaves
> > > Joshua McKenzie
> > > Benedict Elliott Smith
> > >
> > > WRT changing the representation:
> > >
> > > +1:
> > > There are only conditional +1s at this point
> > >
> > > -0:
> > > Sylvain Lebresne
> > >
> > > -.5:
> > > Jeff Jirsa
> > >
> > > This
> > > (
> >
> https://github.com/aweisberg/cassandra/commit/a9ae85daa3ede092b9a1cf84879fb1a9f25b9dce
> )
> >
> > > is a rough cut of the change for the representation. It needs better
> > > naming, unit tests, javadoc etc. but it does implement the change.
> > >
> > > Ariel
> > > On Fri, Oct 19, 2018, at 3:42 PM, Jonathan Haddad wrote:
> > > > Sorry, to be clear - I'm +1 on changing the configuration default,
> but
> > I
> > > > think changing the compression in memory representations warrants
> > further
> > > > discussion and investigation before making a case for or against it
> > yet.
> > > > An optimization that reduces in memory cost by over 50% sounds pretty
> > good
> > > > and we never were really explicit that those sort of optimizations
> > would be
> > > > excluded after our feature freeze.  I don't think they should
> > necessarily
> > > > be excluded at this time, but it depends on the size and risk of the
> > patch.
> > > >
> > > > On Sat, Oct 20, 2018 at 8:38 AM Jonathan Haddad 
> > wrote:
> > > >
> > > > > I think we should try to do the right thing for the most people
> that
> > we
> > > > > can.  The number of folks impacted by 64KB is huge.  I've worked on
> > a lot
> > > > > of clusters created by a lot of different teams, going from brand
> > new to
> > > > > pretty damn knowledgeable.  I can't think of a single time over the
> > last 2
> > > > > years that I've seen a cluster use non-default settings for
> > compression.
> > > > > With only a handful of exceptions, I've lowered the chunk size
> > considerably
> > > > > (usually to 4 or 8K) and the impact has always been very
> noticeable,
> > > > > frequently resulting in hardware reduction and cost savings.  Of
> all
> > the
> > > > > poorly chosen defaults we have, this is one of the biggest
> offenders
> > that I
> > > > > see.  There's a good reason ScyllaDB claims they're so much faster
> > than
> > > > > Cassandra - we ship a DB that performs poorly for 90+% of teams
> > because we
> > > > > ship for a specific use case, not a general one (time series on
> > memory
> > > > > constrained boxes being the specific use case)
> > > > >
> > > > > This doesn't impact existing tables, just new ones.  More and more
> > teams
> > > > > are using Cassandra as a general-purpose database; we should
> > acknowledge
> > > > > that and adjust our defaults accordingly.  Yes, we use a little bit
> > more
> > > > > memory on new tables if we just change this setting, and what we
> get
> > out of
> > > > > it is a massive performance win.
> > > > >
> > > > > I'm +1 on the change as well.
> > > > >
> > > > >
> > > > >
> > > > > On Sat, Oct 20, 2018 at 4:21 AM Sankalp Kohli <
> > kohlisank...@gmail.com>
> > > > > wrote:
> > > > >
> > > > >> (We should definitely harden the definition for freeze in a
> separate
> > > > >> thread)
> > > > >>
> > > > >> My thinking is that this is the best time to do this change as we
> > have
> > > > >> not even cut alpha or beta. All the people involved in the test
> will
> > > > >> definitely be testing it again when we have these releases.
> > > > >>
> > > > >> > On Oct 1
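The quoted argument for lowering the default comes down to read amplification on small reads. A toy model (the 200-byte row and single-chunk read are illustrative assumptions, not benchmarks):

```python
# A point read has to fetch and decompress at least one whole compressed
# chunk, so the chunk size bounds the cost of reading even a tiny row.
def read_amplification(row_bytes: int, chunk_kb: int) -> float:
    """Bytes decompressed per byte requested for a single-chunk read."""
    return (chunk_kb * 1024) / row_bytes

for chunk_kb in (64, 16, 4):
    amp = read_amplification(200, chunk_kb)
    print(f"{chunk_kb:>2} KiB chunk, 200 B row -> ~{amp:.0f}x decompressed")
```

Dropping from 64 KiB to 4 or 8 KiB chunks cuts the decompression work per small read by roughly an order of magnitude, which is consistent with the "very noticeable" impact reported above.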

Re: CASSANDRA-13241 lower default chunk_length_in_kb

2018-10-29 Thread Jonathan Haddad
Looks straightforward, I can review today.

On Mon, Oct 29, 2018 at 12:25 PM Ariel Weisberg  wrote:

> Hi,
>
> Seeing too many -'s for changing the representation and essentially no +1s
> so I submitted a patch for just changing the default. I could use a
> reviewer for https://issues.apache.org/jira/browse/CASSANDRA-13241
>
> I created https://issues.apache.org/jira/browse/CASSANDRA-14857  "Use a
> more space efficient representation for compressed chunk offsets" for post
> 4.0.
>
> Regards,
> Ariel
>
> > > >> > On Oct 19, 2018, at 8:00 AM, Michael Shuler <
> mich...@pbandjelly.org>
> > > >> wrote:
> > > >> >
> > > >> >> On 10/19/18 9:16 AM, Joshua McKenzie wrote:
> > > >> >>
> > > >> >> At the risk of hijacking this thread, when are we going to
> transition
> > > >> from
> > > >> >> "no new features, change whatever else you want including
> refactoring
> > > >> and
> > > >> >> changing years-old defaults" to "ok, we think we have something
> that's
> > > >> >> stable, time to start testing"?
> > > >> >
> > > >> > Creating a cassandra-4.0 branch would allow trunk to, for
> instance, get
> > > >> > a default config value change commit and get more testing. We
> might
> > > >> > forget again, from what I understand of Benedict's last comment :)
> > > >> >
> > > >> > --
> > > >> > Michael

Re: CASSANDRA-13241 lower default chunk_length_in_kb

2018-10-29 Thread Ariel Weisberg
Hi,

Seeing too many -'s for changing the representation and essentially no +1s so I 
submitted a patch for just changing the default. I could use a reviewer for 
https://issues.apache.org/jira/browse/CASSANDRA-13241

I created https://issues.apache.org/jira/browse/CASSANDRA-14857  "Use a more 
space efficient representation for compressed chunk offsets" for post 4.0.

Regards,
Ariel
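For CASSANDRA-14857, one hedged sketch of what a "more space efficient representation" for the offsets could look like (this is not the actual patch in the linked commit; the group size and 4-byte delta width are invented here for illustration):

```python
import array

GROUP = 64  # chunks per full base offset; arbitrary for this sketch

class PackedOffsets:
    """One 8-byte base per GROUP chunks plus a (typically 4-byte) delta per
    chunk, instead of a full 8-byte offset per chunk: roughly half the memory."""

    def __init__(self, offsets):
        self.bases = array.array("q")   # signed 8-byte base offsets
        self.deltas = array.array("I")  # unsigned deltas from the group base
        for i, off in enumerate(offsets):
            if i % GROUP == 0:
                self.bases.append(off)
            self.deltas.append(off - self.bases[-1])

    def get(self, i):
        return self.bases[i // GROUP] + self.deltas[i]

# Fake monotonically increasing chunk offsets for a single sstable
offsets = list(range(0, 10_000_000, 17_000))
packed = PackedOffsets(offsets)
assert all(packed.get(i) == off for i, off in enumerate(offsets))
```

This only works while each group's deltas fit the narrow type, which holds as long as compressed chunks have bounded size; that kind of invariant is exactly what thorough randomized testing would need to exercise.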


Re: CASSANDRA-13241 lower default chunk_length_in_kb

2018-10-24 Thread Joshua McKenzie
+1. I use the smiley to let you know I'm mostly just giving you shit. ;)

On Wed, Oct 24, 2018 at 11:43 AM Benedict Elliott Smith 
wrote:

> If you undertake sufficiently many low risk things, some will bite you, I
> think everyone understands that.  It’s still valuable to factor a risk
> assessment into the equation, I think?
>
> Either way, somebody asked who didn’t have the context to easily answer,
> so I did my best to offer them that information so they could make an
> informed decision.  I’m not campaigning for its inclusion, just trying to
> facilitate a collective decision.
>
>
>
>
>
>
> > On 24 Oct 2018, at 16:27, Joshua McKenzie  wrote:
> >
> > | The risk from such a patch is very low
> > If I had a nickel for every time I've heard that... ;)
> >
> > I'm neutral on the default change, -.5 (i.e. don't agree with it but
> won't
> > die on that hill) on the data structure change post-freeze. We put this
> in,
> > and that's a slippery slope as I'm sure we can find numerous other
> > seemingly low-risk trivial optimizations and rewrites that cumulatively
> > would make a "feature-freeze" effectively meaningless as a tool to start
> > stabilizing the contents of the release.
> >
> > In isolation many changes look innocuous. In the context of an
> organically
> > grown open-source code-base that's this old, I've learned that it pays to
> > be very, very cautious.
> >
> > On Tue, Oct 23, 2018 at 3:33 PM Jeff Jirsa  wrote:
> >
> >> My objection (-0.5) is based on freeze not in code complexity
> >>
> >>
> >>
> >> --
> >> Jeff Jirsa
> >>
> >>
> >>> On Oct 23, 2018, at 8:59 AM, Benedict Elliott Smith <
> bened...@apache.org>
> >> wrote:
> >>>
> >>> To discuss the concerns about the patch for a more efficient
> >> representation:
> >>>
> >>> The risk from such a patch is very low.  It’s a very simple in-memory
> >> data structure, that we can introduce thorough fuzz tests for.  The
> reason
> >> to exclude it would be for reasons of wanting to begin strictly
> enforcing
> >> the freeze only.  This is a good enough reason in my book, which is why
> I’m
> >> neutral on its addition.  I just wanted to provide some context for
> >> everyone else's voting intention.
> >>>
> >>>
>  On 23 Oct 2018, at 16:51, Ariel Weisberg  wrote:
> 
>  Hi,
> 
>  I just asked Jeff. He is -0 and -0.5 respectively.
> 
>  Ariel
> 
> > On Tue, Oct 23, 2018, at 11:50 AM, Benedict Elliott Smith wrote:
> > I’m +1 change of default.  I think Jeff was -1 on that though.
> >
> >
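Benedict's point that a simple in-memory structure "can introduce thorough fuzz tests" might look like the sketch below, where a stand-in packed encoding (not the real patch) is checked lookup-for-lookup against the plain list on random inputs:

```python
import array
import random

def pack(offsets, group=64):
    """Stand-in compact encoding: 8-byte base per group, 4-byte deltas."""
    bases, deltas = array.array("q"), array.array("I")
    for i, off in enumerate(offsets):
        if i % group == 0:
            bases.append(off)
        deltas.append(off - bases[-1])
    return bases, deltas

def get(bases, deltas, i, group=64):
    return bases[i // group] + deltas[i]

rng = random.Random(13241)  # seeded for repeatability
for _ in range(200):  # 200 randomly shaped sets of chunk offsets
    pos, offsets = 0, []
    for _ in range(rng.randrange(1, 400)):
        offsets.append(pos)
        pos += rng.randrange(1, 65536)  # bounded compressed-chunk size
    bases, deltas = pack(offsets)
    assert all(get(bases, deltas, i) == off for i, off in enumerate(offsets))
```

A real fuzz harness would also probe the invariant edges (maximum chunk sizes, group boundaries, empty sstables), but the shape is the same: random inputs, exhaustive comparison against the trivial representation.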

Re: CASSANDRA-13241 lower default chunk_length_in_kb

2018-10-24 Thread Benedict Elliott Smith
If you undertake sufficiently many low risk things, some will bite you, I think 
everyone understands that.  It’s still valuable to factor a risk assessment 
into the equation, I think?

Either way, somebody asked who didn’t have the context to easily answer, so I 
did my best to offer them that information so they could make an informed 
decision.  I’m not campaigning for its inclusion, just trying to facilitate a 
collective decision.







Re: CASSANDRA-13241 lower default chunk_length_in_kb

2018-10-24 Thread Joshua McKenzie
| The risk from such a patch is very low
If I had a nickel for every time I've heard that... ;)

I'm neutral on the default change, -.5 (i.e. don't agree with it but won't
die on that hill) on the data structure change post-freeze. If we put this in,
that's a slippery slope, as I'm sure we can find numerous other
seemingly low-risk trivial optimizations and rewrites that cumulatively
would make a "feature-freeze" effectively meaningless as a tool to start
stabilizing the contents of the release.

In isolation many changes look innocuous. In the context of an organically
grown open-source code-base that's this old, I've learned that it pays to
be very, very cautious.


Re: CASSANDRA-13241 lower default chunk_length_in_kb

2018-10-23 Thread Jeff Jirsa
My objection (-0.5) is based on the freeze, not on code complexity



-- 
Jeff Jirsa


> On Oct 23, 2018, at 8:59 AM, Benedict Elliott Smith  
> wrote:
> 
> To discuss the concerns about the patch for a more efficient representation:
> 
> The risk from such a patch is very low.  It’s a very simple in-memory data 
> structure, that we can introduce thorough fuzz tests for.  The reason to 
> exclude it would be for reasons of wanting to begin strictly enforcing the 
> freeze only.  This is a good enough reason in my book, which is why I’m 
> neutral on its addition.  I just wanted to provide some context for everyone 
> else's voting intention.
> 
> 
>> On 23 Oct 2018, at 16:51, Ariel Weisberg  wrote:
>> 
>> Hi,
>> 
>> I just asked Jeff. He is -0 and -0.5 respectively.
>> 
>> Ariel
>> 
>>> On Tue, Oct 23, 2018, at 11:50 AM, Benedict Elliott Smith wrote:
>>> I’m +1 change of default.  I think Jeff was -1 on that though.
>>> 
>>> 
 On 23 Oct 2018, at 16:46, Ariel Weisberg  wrote:
 
 Hi,
 
 To summarize who we have heard from so far
 
 WRT to changing just the default:
 
 +1:
 Jon Haddadd
 Ben Bromhead
 Alain Rodriguez
 Sankalp Kohli (not explicit)
 
 -0:
 Sylvaine Lebresne 
 Jeff Jirsa
 
 Not sure:
 Kurt Greaves
 Joshua Mckenzie
 Benedict Elliot Smith
 
 WRT to change the representation:
 
 +1:
 There are only conditional +1s at this point
 
 -0:
 Sylvaine Lebresne
 
 -.5:
 Jeff Jirsa
 
 This 
 (https://github.com/aweisberg/cassandra/commit/a9ae85daa3ede092b9a1cf84879fb1a9f25b9dce)
  is a rough cut of the change for the representation. It needs better 
 naming, unit tests, javadoc etc. but it does implement the change.
 
 Ariel
> On Fri, Oct 19, 2018, at 3:42 PM, Jonathan Haddad wrote:
> Sorry, to be clear - I'm +1 on changing the configuration default, but I
> think changing the compression in memory representations warrants further
> discussion and investigation before making a case for or against it yet.
> An optimization that reduces in memory cost by over 50% sounds pretty good
> and we never were really explicit that those sort of optimizations would 
> be
> excluded after our feature freeze.  I don't think they should necessarily
> be excluded at this time, but it depends on the size and risk of the 
> patch.
> 
>> On Sat, Oct 20, 2018 at 8:38 AM Jonathan Haddad  
>> wrote:
>> 
>> I think we should try to do the right thing for the most people that we
>> can.  The number of folks impacted by 64KB is huge.  I've worked on a lot
>> of clusters created by a lot of different teams, going from brand new to
>> pretty damn knowledgeable.  I can't think of a single time over the last 
>> 2
>> years that I've seen a cluster use non-default settings for compression.
>> With only a handful of exceptions, I've lowered the chunk size 
>> considerably
>> (usually to 4 or 8K) and the impact has always been very noticeable,
>> frequently resulting in hardware reduction and cost savings.  Of all the
>> poorly chosen defaults we have, this is one of the biggest offenders 
>> that I
>> see.  There's a good reason ScyllaDB  claims they're so much faster than
>> Cassandra - we ship a DB that performs poorly for 90+% of teams because 
>> we
>> ship for a specific use case, not a general one (time series on memory
>> constrained boxes being the specific use case)
>> 
>> This doesn't impact existing tables, just new ones.  More and more teams
>> are using Cassandra as a general purpose database; we should acknowledge
>> that and adjust our defaults accordingly.  Yes, we use a little bit more
>> memory on new tables if we just change this setting, and what we get out 
>> of
>> it is a massive performance win.
>> 
>> I'm +1 on the change as well.
>> 
>> 
>> 
>> On Sat, Oct 20, 2018 at 4:21 AM Sankalp Kohli 
>> wrote:
>> 
>>> (We should definitely harden the definition for freeze in a separate
>>> thread)
>>> 
>>> My thinking is that this is the best time to do this change as we have
>>> not even cut alpha or beta. All the people involved in the test will
>>> definitely be testing it again when we have these releases.
>>> 
 On Oct 19, 2018, at 8:00 AM, Michael Shuler 
>>> wrote:
 
> On 10/19/18 9:16 AM, Joshua McKenzie wrote:
> 
> At the risk of hijacking this thread, when are we going to transition
>>> from
> "no new features, change whatever else you want including refactoring
>>> and
> changing years-old defaults" to "ok, we think we have something that's
> stable, time to start testing"?
 
 Creating a cassandra-4.0 branch would allow trunk to, for instance, get
 a default config value change commit and get more testing. We might
 forget again, from what I understand of Benedict's last comment :)

Re: CASSANDRA-13241 lower default chunk_length_in_kb

2018-10-23 Thread Benedict Elliott Smith
To discuss the concerns about the patch for a more efficient representation:

The risk from such a patch is very low.  It’s a very simple in-memory data 
structure, that we can introduce thorough fuzz tests for.  The reason to 
exclude it would be for reasons of wanting to begin strictly enforcing the 
freeze only.  This is a good enough reason in my book, which is why I’m neutral 
on its addition.  I just wanted to provide some context for everyone else's 
voting intention.


> On 23 Oct 2018, at 16:51, Ariel Weisberg  wrote:
> 
> Hi,
> 
> I just asked Jeff. He is -0 and -0.5 respectively.
> 
> Ariel
> 
> On Tue, Oct 23, 2018, at 11:50 AM, Benedict Elliott Smith wrote:
>> I’m +1 change of default.  I think Jeff was -1 on that though.
>> 
>> 
>>> On 23 Oct 2018, at 16:46, Ariel Weisberg  wrote:
>>> 
>>> Hi,
>>> 
>>> To summarize who we have heard from so far
>>> 
>>> WRT to changing just the default:
>>> 
>>> +1:
>>> Jon Haddad
>>> Ben Bromhead
>>> Alain Rodriguez
>>> Sankalp Kohli (not explicit)
>>> 
>>> -0:
>>> Sylvain Lebresne 
>>> Jeff Jirsa
>>> 
>>> Not sure:
>>> Kurt Greaves
>>> Joshua McKenzie
>>> Benedict Elliott Smith
>>> 
>>> WRT to change the representation:
>>> 
>>> +1:
>>> There are only conditional +1s at this point
>>> 
>>> -0:
>>> Sylvain Lebresne
>>> 
>>> -.5:
>>> Jeff Jirsa
>>> 
>>> This 
>>> (https://github.com/aweisberg/cassandra/commit/a9ae85daa3ede092b9a1cf84879fb1a9f25b9dce)
>>>  is a rough cut of the change for the representation. It needs better 
>>> naming, unit tests, javadoc etc. but it does implement the change.
>>> 
>>> Ariel
>>> On Fri, Oct 19, 2018, at 3:42 PM, Jonathan Haddad wrote:
 Sorry, to be clear - I'm +1 on changing the configuration default, but I
 think changing the compression in memory representations warrants further
 discussion and investigation before making a case for or against it yet.
 An optimization that reduces in memory cost by over 50% sounds pretty good
 and we never were really explicit that those sort of optimizations would be
 excluded after our feature freeze.  I don't think they should necessarily
 be excluded at this time, but it depends on the size and risk of the patch.
 
 On Sat, Oct 20, 2018 at 8:38 AM Jonathan Haddad  wrote:
 
> I think we should try to do the right thing for the most people that we
> can.  The number of folks impacted by 64KB is huge.  I've worked on a lot
> of clusters created by a lot of different teams, going from brand new to
> pretty damn knowledgeable.  I can't think of a single time over the last 2
> years that I've seen a cluster use non-default settings for compression.
> With only a handful of exceptions, I've lowered the chunk size 
> considerably
> (usually to 4 or 8K) and the impact has always been very noticeable,
> frequently resulting in hardware reduction and cost savings.  Of all the
> poorly chosen defaults we have, this is one of the biggest offenders that 
> I
> see.  There's a good reason ScyllaDB  claims they're so much faster than
> Cassandra - we ship a DB that performs poorly for 90+% of teams because we
> ship for a specific use case, not a general one (time series on memory
> constrained boxes being the specific use case)
> 
> This doesn't impact existing tables, just new ones.  More and more teams
> are using Cassandra as a general purpose database; we should acknowledge
> that and adjust our defaults accordingly.  Yes, we use a little bit more
> memory on new tables if we just change this setting, and what we get out 
> of
> it is a massive performance win.
> 
> I'm +1 on the change as well.
> 
> 
> 
> On Sat, Oct 20, 2018 at 4:21 AM Sankalp Kohli 
> wrote:
> 
>> (We should definitely harden the definition for freeze in a separate
>> thread)
>> 
>> My thinking is that this is the best time to do this change as we have
>> not even cut alpha or beta. All the people involved in the test will
>> definitely be testing it again when we have these releases.
>> 
>>> On Oct 19, 2018, at 8:00 AM, Michael Shuler 
>> wrote:
>>> 
 On 10/19/18 9:16 AM, Joshua McKenzie wrote:
 
 At the risk of hijacking this thread, when are we going to transition
>> from
 "no new features, change whatever else you want including refactoring
>> and
 changing years-old defaults" to "ok, we think we have something that's
 stable, time to start testing"?
>>> 
>>> Creating a cassandra-4.0 branch would allow trunk to, for instance, get
>>> a default config value change commit and get more testing. We might
>>> forget again, from what I understand of Benedict's last comment :)
>>> 
>>> --
>>> Michael
>>> 
>>> -
>>> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org

Re: CASSANDRA-13241 lower default chunk_length_in_kb

2018-10-23 Thread Ariel Weisberg
Hi,

I just asked Jeff. He is -0 and -0.5 respectively.

Ariel

On Tue, Oct 23, 2018, at 11:50 AM, Benedict Elliott Smith wrote:
> I’m +1 change of default.  I think Jeff was -1 on that though.
> 
> 
> > On 23 Oct 2018, at 16:46, Ariel Weisberg  wrote:
> > 
> > Hi,
> > 
> > To summarize who we have heard from so far
> > 
> > WRT to changing just the default:
> > 
> > +1:
> > Jon Haddad
> > Ben Bromhead
> > Alain Rodriguez
> > Sankalp Kohli (not explicit)
> > 
> > -0:
> > Sylvain Lebresne 
> > Jeff Jirsa
> > 
> > Not sure:
> > Kurt Greaves
> > Joshua McKenzie
> > Benedict Elliott Smith
> > 
> > WRT to change the representation:
> > 
> > +1:
> > There are only conditional +1s at this point
> > 
> > -0:
> > Sylvain Lebresne
> > 
> > -.5:
> > Jeff Jirsa
> > 
> > This 
> > (https://github.com/aweisberg/cassandra/commit/a9ae85daa3ede092b9a1cf84879fb1a9f25b9dce)
> >  is a rough cut of the change for the representation. It needs better 
> > naming, unit tests, javadoc etc. but it does implement the change.
> > 
> > Ariel
> > On Fri, Oct 19, 2018, at 3:42 PM, Jonathan Haddad wrote:
> >> Sorry, to be clear - I'm +1 on changing the configuration default, but I
> >> think changing the compression in memory representations warrants further
> >> discussion and investigation before making a case for or against it yet.
> >> An optimization that reduces in memory cost by over 50% sounds pretty good
> >> and we never were really explicit that those sort of optimizations would be
> >> excluded after our feature freeze.  I don't think they should necessarily
> >> be excluded at this time, but it depends on the size and risk of the patch.
> >> 
> >> On Sat, Oct 20, 2018 at 8:38 AM Jonathan Haddad  wrote:
> >> 
> >>> I think we should try to do the right thing for the most people that we
> >>> can.  The number of folks impacted by 64KB is huge.  I've worked on a lot
> >>> of clusters created by a lot of different teams, going from brand new to
> >>> pretty damn knowledgeable.  I can't think of a single time over the last 2
> >>> years that I've seen a cluster use non-default settings for compression.
> >>> With only a handful of exceptions, I've lowered the chunk size 
> >>> considerably
> >>> (usually to 4 or 8K) and the impact has always been very noticeable,
> >>> frequently resulting in hardware reduction and cost savings.  Of all the
> >>> poorly chosen defaults we have, this is one of the biggest offenders that 
> >>> I
> >>> see.  There's a good reason ScyllaDB  claims they're so much faster than
> >>> Cassandra - we ship a DB that performs poorly for 90+% of teams because we
> >>> ship for a specific use case, not a general one (time series on memory
> >>> constrained boxes being the specific use case)
> >>> 
> >>> This doesn't impact existing tables, just new ones.  More and more teams
> >>> are using Cassandra as a general purpose database; we should acknowledge
> >>> that and adjust our defaults accordingly.  Yes, we use a little bit more
> >>> memory on new tables if we just change this setting, and what we get out 
> >>> of
> >>> it is a massive performance win.
> >>> 
> >>> I'm +1 on the change as well.
> >>> 
> >>> 
> >>> 
> >>> On Sat, Oct 20, 2018 at 4:21 AM Sankalp Kohli 
> >>> wrote:
> >>> 
>  (We should definitely harden the definition for freeze in a separate
>  thread)
>  
>  My thinking is that this is the best time to do this change as we have
>  not even cut alpha or beta. All the people involved in the test will
>  definitely be testing it again when we have these releases.
>  
> > On Oct 19, 2018, at 8:00 AM, Michael Shuler 
>  wrote:
> > 
> >> On 10/19/18 9:16 AM, Joshua McKenzie wrote:
> >> 
> >> At the risk of hijacking this thread, when are we going to transition
>  from
> >> "no new features, change whatever else you want including refactoring
>  and
> >> changing years-old defaults" to "ok, we think we have something that's
> >> stable, time to start testing"?
> > 
> > Creating a cassandra-4.0 branch would allow trunk to, for instance, get
> > a default config value change commit and get more testing. We might
> > forget again, from what I understand of Benedict's last comment :)
> > 
> > --
> > Michael
> > 
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> > For additional commands, e-mail: dev-h...@cassandra.apache.org
> > 
>  
>  -
>  To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
>  For additional commands, e-mail: dev-h...@cassandra.apache.org
>  
>  
> >>> 
> >>> --
> >>> Jon Haddad
> >>> http://www.rustyrazorblade.com
> >>> twitter: rustyrazorblade
> >>> 
> >> 
> >> 
> >> -- 
> >> Jon Haddad
> >> http://www.rustyrazorblade.com
> >> twitter: rustyrazorblade
> > 
> > ---

Re: CASSANDRA-13241 lower default chunk_length_in_kb

2018-10-23 Thread Benedict Elliott Smith
I’m +1 change of default.  I think Jeff was -1 on that though.


> On 23 Oct 2018, at 16:46, Ariel Weisberg  wrote:
> 
> Hi,
> 
> To summarize who we have heard from so far
> 
> WRT to changing just the default:
> 
> +1:
> Jon Haddad
> Ben Bromhead
> Alain Rodriguez
> Sankalp Kohli (not explicit)
> 
> -0:
> Sylvain Lebresne 
> Jeff Jirsa
> 
> Not sure:
> Kurt Greaves
> Joshua McKenzie
> Benedict Elliott Smith
> 
> WRT to change the representation:
> 
> +1:
> There are only conditional +1s at this point
> 
> -0:
> Sylvain Lebresne
> 
> -.5:
> Jeff Jirsa
> 
> This 
> (https://github.com/aweisberg/cassandra/commit/a9ae85daa3ede092b9a1cf84879fb1a9f25b9dce)
>  is a rough cut of the change for the representation. It needs better naming, 
> unit tests, javadoc etc. but it does implement the change.
> 
> Ariel
> On Fri, Oct 19, 2018, at 3:42 PM, Jonathan Haddad wrote:
>> Sorry, to be clear - I'm +1 on changing the configuration default, but I
>> think changing the compression in memory representations warrants further
>> discussion and investigation before making a case for or against it yet.
>> An optimization that reduces in memory cost by over 50% sounds pretty good
>> and we never were really explicit that those sort of optimizations would be
>> excluded after our feature freeze.  I don't think they should necessarily
>> be excluded at this time, but it depends on the size and risk of the patch.
>> 
>> On Sat, Oct 20, 2018 at 8:38 AM Jonathan Haddad  wrote:
>> 
>>> I think we should try to do the right thing for the most people that we
>>> can.  The number of folks impacted by 64KB is huge.  I've worked on a lot
>>> of clusters created by a lot of different teams, going from brand new to
>>> pretty damn knowledgeable.  I can't think of a single time over the last 2
>>> years that I've seen a cluster use non-default settings for compression.
>>> With only a handful of exceptions, I've lowered the chunk size considerably
>>> (usually to 4 or 8K) and the impact has always been very noticeable,
>>> frequently resulting in hardware reduction and cost savings.  Of all the
>>> poorly chosen defaults we have, this is one of the biggest offenders that I
>>> see.  There's a good reason ScyllaDB  claims they're so much faster than
>>> Cassandra - we ship a DB that performs poorly for 90+% of teams because we
>>> ship for a specific use case, not a general one (time series on memory
>>> constrained boxes being the specific use case)
>>> 
>>> This doesn't impact existing tables, just new ones.  More and more teams
>>> are using Cassandra as a general purpose database; we should acknowledge
>>> that and adjust our defaults accordingly.  Yes, we use a little bit more
>>> memory on new tables if we just change this setting, and what we get out of
>>> it is a massive performance win.
>>> 
>>> I'm +1 on the change as well.
>>> 
>>> 
>>> 
>>> On Sat, Oct 20, 2018 at 4:21 AM Sankalp Kohli 
>>> wrote:
>>> 
 (We should definitely harden the definition for freeze in a separate
 thread)
 
 My thinking is that this is the best time to do this change as we have
 not even cut alpha or beta. All the people involved in the test will
 definitely be testing it again when we have these releases.
 
> On Oct 19, 2018, at 8:00 AM, Michael Shuler 
 wrote:
> 
>> On 10/19/18 9:16 AM, Joshua McKenzie wrote:
>> 
>> At the risk of hijacking this thread, when are we going to transition
 from
>> "no new features, change whatever else you want including refactoring
 and
>> changing years-old defaults" to "ok, we think we have something that's
>> stable, time to start testing"?
> 
> Creating a cassandra-4.0 branch would allow trunk to, for instance, get
> a default config value change commit and get more testing. We might
> forget again, from what I understand of Benedict's last comment :)
> 
> --
> Michael
> 
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
> 
 
 -
 To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
 For additional commands, e-mail: dev-h...@cassandra.apache.org
 
 
>>> 
>>> --
>>> Jon Haddad
>>> http://www.rustyrazorblade.com
>>> twitter: rustyrazorblade
>>> 
>> 
>> 
>> -- 
>> Jon Haddad
>> http://www.rustyrazorblade.com
>> twitter: rustyrazorblade
> 
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
> 


-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: CASSANDRA-13241 lower default chunk_length_in_kb

2018-10-23 Thread Ariel Weisberg
Hi,

To summarize who we have heard from so far

WRT to changing just the default:

+1:
Jon Haddad
Ben Bromhead
Alain Rodriguez
Sankalp Kohli (not explicit)

-0:
Sylvain Lebresne 
Jeff Jirsa

Not sure:
Kurt Greaves
Joshua McKenzie
Benedict Elliott Smith

WRT to change the representation:

+1:
There are only conditional +1s at this point

-0:
Sylvain Lebresne

-.5:
Jeff Jirsa

This 
(https://github.com/aweisberg/cassandra/commit/a9ae85daa3ede092b9a1cf84879fb1a9f25b9dce)
 is a rough cut of the change for the representation. It needs better naming, 
unit tests, javadoc etc. but it does implement the change.

Ariel
On Fri, Oct 19, 2018, at 3:42 PM, Jonathan Haddad wrote:
> Sorry, to be clear - I'm +1 on changing the configuration default, but I
> think changing the compression in memory representations warrants further
> discussion and investigation before making a case for or against it yet.
> An optimization that reduces in memory cost by over 50% sounds pretty good
> and we never were really explicit that those sort of optimizations would be
> excluded after our feature freeze.  I don't think they should necessarily
> be excluded at this time, but it depends on the size and risk of the patch.
> 
> On Sat, Oct 20, 2018 at 8:38 AM Jonathan Haddad  wrote:
> 
> > I think we should try to do the right thing for the most people that we
> > can.  The number of folks impacted by 64KB is huge.  I've worked on a lot
> > of clusters created by a lot of different teams, going from brand new to
> > pretty damn knowledgeable.  I can't think of a single time over the last 2
> > years that I've seen a cluster use non-default settings for compression.
> > With only a handful of exceptions, I've lowered the chunk size considerably
> > (usually to 4 or 8K) and the impact has always been very noticeable,
> > frequently resulting in hardware reduction and cost savings.  Of all the
> > poorly chosen defaults we have, this is one of the biggest offenders that I
> > see.  There's a good reason ScyllaDB  claims they're so much faster than
> > Cassandra - we ship a DB that performs poorly for 90+% of teams because we
> > ship for a specific use case, not a general one (time series on memory
> > constrained boxes being the specific use case)
> >
> > This doesn't impact existing tables, just new ones.  More and more teams
> > are using Cassandra as a general purpose database; we should acknowledge
> > that and adjust our defaults accordingly.  Yes, we use a little bit more
> > memory on new tables if we just change this setting, and what we get out of
> > it is a massive performance win.
> >
> > I'm +1 on the change as well.
> >
> >
> >
> > On Sat, Oct 20, 2018 at 4:21 AM Sankalp Kohli 
> > wrote:
> >
> >> (We should definitely harden the definition for freeze in a separate
> >> thread)
> >>
> >> My thinking is that this is the best time to do this change as we have
> >> not even cut alpha or beta. All the people involved in the test will
> >> definitely be testing it again when we have these releases.
> >>
> >> > On Oct 19, 2018, at 8:00 AM, Michael Shuler 
> >> wrote:
> >> >
> >> >> On 10/19/18 9:16 AM, Joshua McKenzie wrote:
> >> >>
> >> >> At the risk of hijacking this thread, when are we going to transition
> >> from
> >> >> "no new features, change whatever else you want including refactoring
> >> and
> >> >> changing years-old defaults" to "ok, we think we have something that's
> >> >> stable, time to start testing"?
> >> >
> >> > Creating a cassandra-4.0 branch would allow trunk to, for instance, get
> >> > a default config value change commit and get more testing. We might
> >> > forget again, from what I understand of Benedict's last comment :)
> >> >
> >> > --
> >> > Michael
> >> >
> >> > -
> >> > To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> >> > For additional commands, e-mail: dev-h...@cassandra.apache.org
> >> >
> >>
> >> -
> >> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> >> For additional commands, e-mail: dev-h...@cassandra.apache.org
> >>
> >>
> >
> > --
> > Jon Haddad
> > http://www.rustyrazorblade.com
> > twitter: rustyrazorblade
> >
> 
> 
> -- 
> Jon Haddad
> http://www.rustyrazorblade.com
> twitter: rustyrazorblade

-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: CASSANDRA-13241 lower default chunk_length_in_kb

2018-10-19 Thread Jonathan Haddad
Sorry, to be clear - I'm +1 on changing the configuration default, but I
think changing the compression in memory representations warrants further
discussion and investigation before making a case for or against it yet.
An optimization that reduces in memory cost by over 50% sounds pretty good
and we never were really explicit that those sort of optimizations would be
excluded after our feature freeze.  I don't think they should necessarily
be excluded at this time, but it depends on the size and risk of the patch.

On Sat, Oct 20, 2018 at 8:38 AM Jonathan Haddad  wrote:

> I think we should try to do the right thing for the most people that we
> can.  The number of folks impacted by 64KB is huge.  I've worked on a lot
> of clusters created by a lot of different teams, going from brand new to
> pretty damn knowledgeable.  I can't think of a single time over the last 2
> years that I've seen a cluster use non-default settings for compression.
> With only a handful of exceptions, I've lowered the chunk size considerably
> (usually to 4 or 8K) and the impact has always been very noticeable,
> frequently resulting in hardware reduction and cost savings.  Of all the
> poorly chosen defaults we have, this is one of the biggest offenders that I
> see.  There's a good reason ScyllaDB  claims they're so much faster than
> Cassandra - we ship a DB that performs poorly for 90+% of teams because we
> ship for a specific use case, not a general one (time series on memory
> constrained boxes being the specific use case)
>
> This doesn't impact existing tables, just new ones.  More and more teams
> are using Cassandra as a general purpose database; we should acknowledge
> that and adjust our defaults accordingly.  Yes, we use a little bit more
> memory on new tables if we just change this setting, and what we get out of
> it is a massive performance win.
>
> I'm +1 on the change as well.
>
>
>
> On Sat, Oct 20, 2018 at 4:21 AM Sankalp Kohli 
> wrote:
>
>> (We should definitely harden the definition for freeze in a separate
>> thread)
>>
>> My thinking is that this is the best time to do this change as we have
>> not even cut alpha or beta. All the people involved in the test will
>> definitely be testing it again when we have these releases.
>>
>> > On Oct 19, 2018, at 8:00 AM, Michael Shuler 
>> wrote:
>> >
>> >> On 10/19/18 9:16 AM, Joshua McKenzie wrote:
>> >>
>> >> At the risk of hijacking this thread, when are we going to transition
>> from
>> >> "no new features, change whatever else you want including refactoring
>> and
>> >> changing years-old defaults" to "ok, we think we have something that's
>> >> stable, time to start testing"?
>> >
>> > Creating a cassandra-4.0 branch would allow trunk to, for instance, get
>> > a default config value change commit and get more testing. We might
>> > forget again, from what I understand of Benedict's last comment :)
>> >
>> > --
>> > Michael
>> >
>> > -
>> > To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
>> > For additional commands, e-mail: dev-h...@cassandra.apache.org
>> >
>>
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
>> For additional commands, e-mail: dev-h...@cassandra.apache.org
>>
>>
>
> --
> Jon Haddad
> http://www.rustyrazorblade.com
> twitter: rustyrazorblade
>


-- 
Jon Haddad
http://www.rustyrazorblade.com
twitter: rustyrazorblade


Re: CASSANDRA-13241 lower default chunk_length_in_kb

2018-10-19 Thread Jonathan Haddad
I think we should try to do the right thing for the most people that we
can.  The number of folks impacted by 64KB is huge.  I've worked on a lot
of clusters created by a lot of different teams, going from brand new to
pretty damn knowledgeable.  I can't think of a single time over the last 2
years that I've seen a cluster use non-default settings for compression.
With only a handful of exceptions, I've lowered the chunk size considerably
(usually to 4 or 8K) and the impact has always been very noticeable,
frequently resulting in hardware reduction and cost savings.  Of all the
poorly chosen defaults we have, this is one of the biggest offenders that I
see.  There's a good reason ScyllaDB claims they're so much faster than
Cassandra - we ship a DB that performs poorly for 90+% of teams because we
ship for a specific use case, not a general one (time series on memory
constrained boxes being the specific use case)

This doesn't impact existing tables, just new ones.  More and more teams
are using Cassandra as a general purpose database; we should acknowledge
that and adjust our defaults accordingly.  Yes, we use a little bit more
memory on new tables if we just change this setting, and what we get out of
it is a massive performance win.

I'm +1 on the change as well.
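
[Editorial sketch] To put rough numbers on the memory side of this trade-off: assuming Cassandra keeps one 8-byte offset entry per compressed chunk (an assumption for illustration; the exact in-memory representation is what the separate "more space efficient representation" patch targets), the off-heap offset overhead per node scales inversely with chunk size:

```java
public class ChunkOffsetOverhead {
    // Assumption: one 8-byte (long) offset entry per compressed chunk.
    static long offsetBytes(long dataBytes, long chunkBytes) {
        long chunks = (dataBytes + chunkBytes - 1) / chunkBytes; // ceiling division
        return chunks * 8L;
    }

    public static void main(String[] args) {
        long oneTiB = 1L << 40; // 1 TiB of compressed data on a node
        // 64 KiB -> 128 MiB, 16 KiB -> 512 MiB, 4 KiB -> 2048 MiB of offsets per TiB
        for (int kb : new int[] { 64, 16, 4 }) {
            long overhead = offsetBytes(oneTiB, kb * 1024L);
            System.out.printf("%2d KiB chunks -> %4d MiB of offsets per TiB%n",
                              kb, overhead >> 20);
        }
    }
}
```

Under this assumption, moving the default from 64 KiB to 16 KiB multiplies offset memory by 4x, which is "a little bit more memory" on small-to-medium nodes but grows with node density.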



On Sat, Oct 20, 2018 at 4:21 AM Sankalp Kohli 
wrote:

> (We should definitely harden the definition for freeze in a separate
> thread)
>
> My thinking is that this is the best time to do this change as we have not
> even cut alpha or beta. All the people involved in the test will definitely
> be testing it again when we have these releases.
>
> > On Oct 19, 2018, at 8:00 AM, Michael Shuler 
> wrote:
> >
> >> On 10/19/18 9:16 AM, Joshua McKenzie wrote:
> >>
> >> At the risk of hijacking this thread, when are we going to transition
> from
> >> "no new features, change whatever else you want including refactoring
> and
> >> changing years-old defaults" to "ok, we think we have something that's
> >> stable, time to start testing"?
> >
> > Creating a cassandra-4.0 branch would allow trunk to, for instance, get
> > a default config value change commit and get more testing. We might
> > forget again, from what I understand of Benedict's last comment :)
> >
> > --
> > Michael
> >
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> > For additional commands, e-mail: dev-h...@cassandra.apache.org
> >
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>
>

-- 
Jon Haddad
http://www.rustyrazorblade.com
twitter: rustyrazorblade


Re: CASSANDRA-13241 lower default chunk_length_in_kb

2018-10-19 Thread Sankalp Kohli
(We should definitely harden the definition for freeze in a separate thread)

My thinking is that this is the best time to do this change as we have not even 
cut alpha or beta. All the people involved in the test will definitely be 
testing it again when we have these releases. 

> On Oct 19, 2018, at 8:00 AM, Michael Shuler  wrote:
> 
>> On 10/19/18 9:16 AM, Joshua McKenzie wrote:
>> 
>> At the risk of hijacking this thread, when are we going to transition from
>> "no new features, change whatever else you want including refactoring and
>> changing years-old defaults" to "ok, we think we have something that's
>> stable, time to start testing"?
> 
> Creating a cassandra-4.0 branch would allow trunk to, for instance, get
> a default config value change commit and get more testing. We might
> forget again, from what I understand of Benedict's last comment :)
> 
> -- 
> Michael
> 
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
> 

-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: CASSANDRA-13241 lower default chunk_length_in_kb

2018-10-19 Thread Michael Shuler
On 10/19/18 9:16 AM, Joshua McKenzie wrote:
> 
> At the risk of hijacking this thread, when are we going to transition from
> "no new features, change whatever else you want including refactoring and
> changing years-old defaults" to "ok, we think we have something that's
> stable, time to start testing"?

Creating a cassandra-4.0 branch would allow trunk to, for instance, get
a default config value change commit and get more testing. We might
forget again, from what I understand of Benedict's last comment :)

-- 
Michael

-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: CASSANDRA-13241 lower default chunk_length_in_kb

2018-10-19 Thread Benedict Elliott Smith
Shall we move this discussion to a separate thread?  I agree it needs to be 
had, but this will definitely derail this discussion.

To respond only to the relevant portion for this thread:

> changing years-old defaults

I don’t see how age is relevant?  This isn’t some ‘battle hardened’ feature 
we’re changing - most users don’t even know to change this parameter, so we 
can’t claim its length of existence works in its favour.

The project had fewer resources to be as thorough when this ticket landed, so 
we can’t even claim we’re overturning careful work.  This default was defined 
in 2011 with no performance comparisons with other possible sizes, or 
justification for the selection made on ticket (CASSANDRA-47 - yes, they once 
went down to two digits!).  

That’s not to say this wasn’t a fine default - it was.  In this case, age has 
actively worked against it.  Since 2011, SSDs have become the norm, and most 
servers have more memory than we are presently able to utilise effectively.

This is a no brainer, and doesn’t have any impact on testing.  Tests run with 
64KiB are just as valid as those run with 16KiB.  Performance tests should 
anyway compare like-to-like, so this is completely testing neutral AFAICT.


> On 19 Oct 2018, at 15:16, Joshua McKenzie  wrote:
> 
> At the risk of hijacking this thread, when are we going to transition from
> "no new features, change whatever else you want including refactoring and
> changing years-old defaults" to "ok, we think we have something that's
> stable, time to start testing"?
> 
> Right now, if the community starts aggressively testing 4.0 with all the
> changes still in flight, there's likely going to be a lot of wasted effort.
> I think the root of the disconnect was that when we discussed "freeze" on
> the mailing list, it was in the context of getting everyone engaged in
> testing 4.0.



Re: CASSANDRA-13241 lower default chunk_length_in_kb

2018-10-19 Thread Joshua McKenzie
>
> The predominant phrase used in that thread was 'feature freeze'.

At the risk of hijacking this thread, when are we going to transition from
"no new features, change whatever else you want including refactoring and
changing years-old defaults" to "ok, we think we have something that's
stable, time to start testing"?

Right now, if the community starts aggressively testing 4.0 with all the
changes still in flight, there's likely going to be a lot of wasted effort.
I think the root of the disconnect was that when we discussed "freeze" on
the mailing list, it was in the context of getting everyone engaged in
testing 4.0.

On Fri, Oct 19, 2018 at 9:46 AM Ariel Weisberg  wrote:

> Hi,
>
> I ran some benchmarks on my laptop
>
> https://issues.apache.org/jira/browse/CASSANDRA-13241?focusedCommentId=16656821&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16656821
>
> For a random read workload, varying chunk size:
> Chunk size  Time
> 64k         25:20
> 64k         25:33
> 32k         20:01
> 16k         19:19
> 16k         19:14
>  8k         16:51
>  4k         15:39
>
> Ariel
> On Thu, Oct 18, 2018, at 2:55 PM, Ariel Weisberg wrote:
> > Hi,
> >
> > For those who were asking about the performance impact of block size on
> > compression I wrote a microbenchmark.
> >
> > https://pastebin.com/RHDNLGdC
> >
> >  [java] Benchmark                                               Mode  Cnt          Score          Error  Units
> >  [java] CompactIntegerSequenceBench.benchCompressLZ4Fast16k    thrpt   15  331190055.685 ±  8079758.044  ops/s
> >  [java] CompactIntegerSequenceBench.benchCompressLZ4Fast32k    thrpt   15  353024925.655 ±  7980400.003  ops/s
> >  [java] CompactIntegerSequenceBench.benchCompressLZ4Fast64k    thrpt   15  365664477.654 ± 10083336.038  ops/s
> >  [java] CompactIntegerSequenceBench.benchCompressLZ4Fast8k     thrpt   15  305518114.172 ± 11043705.883  ops/s
> >  [java] CompactIntegerSequenceBench.benchDecompressLZ4Fast16k  thrpt   15  688369529.911 ± 25620873.933  ops/s
> >  [java] CompactIntegerSequenceBench.benchDecompressLZ4Fast32k  thrpt   15  703635848.895 ±  5296941.704  ops/s
> >  [java] CompactIntegerSequenceBench.benchDecompressLZ4Fast64k  thrpt   15  695537044.676 ± 17400763.731  ops/s
> >  [java] CompactIntegerSequenceBench.benchDecompressLZ4Fast8k   thrpt   15  727725713.128 ±  4252436.331  ops/s
> >
> > To summarize, compression is 8.5% slower and decompression is 1% faster.
> > This is measuring the impact on compression/decompression not the huge
> > impact that would occur if we decompressed data we don't need less
> > often.
> >
> > I didn't test decompression of Snappy and LZ4 high, but I did test
> compression.
> >
> > Snappy:
> >  [java] CompactIntegerSequenceBench.benchCompressSnappy16k  thrpt    2  196574766.116  ops/s
> >  [java] CompactIntegerSequenceBench.benchCompressSnappy32k  thrpt    2  198538643.844  ops/s
> >  [java] CompactIntegerSequenceBench.benchCompressSnappy64k  thrpt    2  194600497.613  ops/s
> >  [java] CompactIntegerSequenceBench.benchCompressSnappy8k   thrpt    2  186040175.059  ops/s
> >
> > LZ4 high compressor:
> >  [java] CompactIntegerSequenceBench.bench16k  thrpt    2  20822947.578  ops/s
> >  [java] CompactIntegerSequenceBench.bench32k  thrpt    2  12037342.253  ops/s
> >  [java] CompactIntegerSequenceBench.bench64k  thrpt    2   6782534.469  ops/s
> >  [java] CompactIntegerSequenceBench.bench8k   thrpt    2  32254619.594  ops/s
> >
> > LZ4 high is the one instance where block size mattered a lot. It's a bit
> > suspicious really when you look at the ratio of performance to block
> > size being close to 1:1. I couldn't spot a bug in the benchmark though.
> >
> > Compression ratios with LZ4 fast for the text of Alice in Wonderland was:
> >
> > Chunk size 8192, ratio 0.709473
> > Chunk size 16384, ratio 0.667236
> > Chunk size 32768, ratio 0.634735
> > Chunk size 65536, ratio 0.607208
> >
> > By way of comparison I also ran deflate with maximum compression:
> >
> > Chunk size 8192, ratio 0.426434
> > Chunk size 16384, ratio 0.402423
> > Chunk size 32768, ratio 0.381627
> > Chunk size 65536, ratio 0.364865
> >
> > Ariel
> >
> > On Thu, Oct 18, 2018, at 5:32 AM, Benedict Elliott Smith wrote:
> > > FWIW, I’m not -0, just think that long after the freeze date a change
> > > like this needs a strong mandate from the community.  I think the
> change
> > > is a good one.
> > >
> > >
> > >
> > >
> > >
> > > > On 17 Oct 2018, at 22:09, Ariel Weisberg  wrote:
> > > >
> > > > Hi,
> > > >
> > > > It's really not appreciably slower compared to the decompression we
> are going to do which is going to take several microseconds. Decompression
> is also going to be faster because we are going to do less unnecessary
> decompression and the decompression itself may be faster since it

Re: CASSANDRA-13241 lower default chunk_length_in_kb

2018-10-19 Thread Ariel Weisberg
Hi,

I ran some benchmarks on my laptop:
https://issues.apache.org/jira/browse/CASSANDRA-13241?focusedCommentId=16656821&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16656821

For a random read workload, varying chunk size:
Chunk size  Time
       64k  25:20
       64k  25:33
       32k  20:01
       16k  19:19
       16k  19:14
        8k  16:51
        4k  15:39
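
The flip side of this latency win is index memory: Cassandra keeps roughly one 8-byte offset in memory per compressed chunk, so shrinking the chunk size grows the offset index proportionally. A hedged back-of-envelope sketch (the one-long-per-chunk constant is my assumption, per the representation discussed elsewhere in this thread):

```java
public class OffsetIndexMemory {
    // Approximate in-memory chunk-offset index size: one 8-byte long per chunk.
    static long indexBytes(long dataBytes, int chunkKb) {
        long chunks = dataBytes / (chunkKb * 1024L);
        return chunks * Long.BYTES;
    }

    public static void main(String[] args) {
        long oneTiB = 1L << 40;
        for (int kb : new int[] {64, 16, 4}) {
            System.out.printf("%2dk chunks: %4d MiB of offsets per TiB%n",
                    kb, OffsetIndexMemory.indexBytes(oneTiB, kb) >> 20);
        }
        // -> 64k: 128 MiB, 16k: 512 MiB, 4k: 2048 MiB per TiB of data
    }
}
```

So moving the default from 64k to 16k quadruples the index (128 MiB to 512 MiB per TiB), while 4k would grow it 16x, which is why 16k plus a more compact representation is the middle ground under discussion.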

Ariel
On Thu, Oct 18, 2018, at 2:55 PM, Ariel Weisberg wrote:
> Hi,
> 
> For those who were asking about the performance impact of block size on 
> compression I wrote a microbenchmark.
> 
> https://pastebin.com/RHDNLGdC
> 
>  [java] Benchmark   Mode  
> Cnt  Score  Error  Units
>  [java] CompactIntegerSequenceBench.benchCompressLZ4Fast16kthrpt   
> 15  331190055.685 ±  8079758.044  ops/s
>  [java] CompactIntegerSequenceBench.benchCompressLZ4Fast32kthrpt   
> 15  353024925.655 ±  7980400.003  ops/s
>  [java] CompactIntegerSequenceBench.benchCompressLZ4Fast64kthrpt   
> 15  365664477.654 ± 10083336.038  ops/s
>  [java] CompactIntegerSequenceBench.benchCompressLZ4Fast8k thrpt   
> 15  305518114.172 ± 11043705.883  ops/s
>  [java] CompactIntegerSequenceBench.benchDecompressLZ4Fast16k  thrpt   
> 15  688369529.911 ± 25620873.933  ops/s
>  [java] CompactIntegerSequenceBench.benchDecompressLZ4Fast32k  thrpt   
> 15  703635848.895 ±  5296941.704  ops/s
>  [java] CompactIntegerSequenceBench.benchDecompressLZ4Fast64k  thrpt   
> 15  695537044.676 ± 17400763.731  ops/s
>  [java] CompactIntegerSequenceBench.benchDecompressLZ4Fast8k   thrpt   
> 15  727725713.128 ±  4252436.331  ops/s
> 
> To summarize, compression is 8.5% slower and decompression is 1% faster. 
> This is measuring the impact on compression/decompression not the huge 
> impact that would occur if we decompressed data we don't need less 
> often.
> 
> I didn't test decompression of Snappy and LZ4 high, but I did test 
> compression.
> 
> Snappy:
>  [java] CompactIntegerSequenceBench.benchCompressSnappy16k   thrpt
> 2  196574766.116  ops/s
>  [java] CompactIntegerSequenceBench.benchCompressSnappy32k   thrpt
> 2  198538643.844  ops/s
>  [java] CompactIntegerSequenceBench.benchCompressSnappy64k   thrpt
> 2  194600497.613  ops/s
>  [java] CompactIntegerSequenceBench.benchCompressSnappy8kthrpt
> 2  186040175.059  ops/s
> 
> LZ4 high compressor:
>  [java] CompactIntegerSequenceBench.bench16k  thrpt2  
> 20822947.578  ops/s
>  [java] CompactIntegerSequenceBench.bench32k  thrpt2  
> 12037342.253  ops/s
>  [java] CompactIntegerSequenceBench.bench64k  thrpt2   
> 6782534.469  ops/s
>  [java] CompactIntegerSequenceBench.bench8k   thrpt2  
> 32254619.594  ops/s
> 
> LZ4 high is the one instance where block size mattered a lot. It's a bit 
> suspicious really when you look at the ratio of performance to block 
> size being close to 1:1. I couldn't spot a bug in the benchmark though.
> 
> Compression ratios with LZ4 fast for the text of Alice in Wonderland was:
> 
> Chunk size 8192, ratio 0.709473
> Chunk size 16384, ratio 0.667236
> Chunk size 32768, ratio 0.634735
> Chunk size 65536, ratio 0.607208
> 
> By way of comparison I also ran deflate with maximum compression:
> 
> Chunk size 8192, ratio 0.426434
> Chunk size 16384, ratio 0.402423
> Chunk size 32768, ratio 0.381627
> Chunk size 65536, ratio 0.364865
> 
> Ariel
>  
> On Thu, Oct 18, 2018, at 5:32 AM, Benedict Elliott Smith wrote:
> > FWIW, I’m not -0, just think that long after the freeze date a change 
> > like this needs a strong mandate from the community.  I think the change 
> > is a good one.
> > 
> > 
> > 
> > 
> > 
> > > On 17 Oct 2018, at 22:09, Ariel Weisberg  wrote:
> > > 
> > > Hi,
> > > 
> > > It's really not appreciably slower compared to the decompression we are 
> > > going to do which is going to take several microseconds. Decompression is 
> > > also going to be faster because we are going to do less unnecessary 
> > > decompression and the decompression itself may be faster since it may fit 
> > > in a higher level cache better. I ran a microbenchmark comparing them.
> > > 
> > > https://issues.apache.org/jira/browse/CASSANDRA-13241?focusedCommentId=16653988&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16653988
> > > 
> > > Fetching a long from memory:   56 nanoseconds
> > > Compact integer sequence   :   80 nanoseconds
> > > Summing integer sequence   :  165 nanoseconds
> > > 
> > > Currently we have one +1 from Kurt to change the representation and 
> > > possibly a -0 from Benedict. That's not really enough to make an 
> > > exception to the code freeze. If you want it to happen (or not) you need 
> > > to speak up otherwise only the default will change.
> > > 
> > > Regards,
> > > A

Re: CASSANDRA-13241 lower default chunk_length_in_kb

2018-10-19 Thread Benedict Elliott Smith
The change of default property doesn’t seem to violate the freeze?  The 
predominant phrase used in that thread was 'feature freeze'.  A lot of people 
are now interpreting it more broadly, so perhaps we need to revisit, but that’s 
probably a separate discussion?

The current default is really bad for most users, so I’m +1 changing it.  
Especially as the last time this topic was raised was (iirc) around the 3.0 
freeze.  We decided not to change anything for similar reasons, and haven't 
revisited it since.


> On 19 Oct 2018, at 09:25, Jeff Jirsa  wrote:
> 
> Agree with Sylvain (and I think Benedict) - there’s no compelling reason to 
> violate the freeze here. We’ve had the wrong default for years - add a note 
> to the docs that we’ll be changing it in the future, but let’s not violate 
> the freeze now.
> 
> -- 
> Jeff Jirsa
> 
> 
>> On Oct 19, 2018, at 10:06 AM, Sylvain Lebresne  wrote:
>> 
>> Fwiw, as much as I agree this is a change worth doing in general, I do am
>> -0 for 4.0. Both the "compact sequencing" and the change of default really.
>> We're closing on 2 months within the freeze, and for me a freeze do include
>> not changing defaults, because changing default ideally imply a decent
>> amount of analysis/benchmark of the consequence of that change[1] and that
>> doesn't enter in my definition of a freeze.
>> 
>> [1]: to be extra clear, I'm not saying we've always done this, far from it.
>> But I hope we can all agree we were wrong to no do it when we didn't and
>> should strive to improve, not repeat past mistakes.
>> --
>> Sylvain
>> 
>> 
>>> On Thu, Oct 18, 2018 at 8:55 PM Ariel Weisberg  wrote:
>>> 
>>> Hi,
>>> 
>>> For those who were asking about the performance impact of block size on
>>> compression I wrote a microbenchmark.
>>> 
>>> https://pastebin.com/RHDNLGdC
>>> 
>>>[java] Benchmark   Mode
>>> Cnt  Score  Error  Units
>>>[java] CompactIntegerSequenceBench.benchCompressLZ4Fast16kthrpt
>>> 15  331190055.685 ±  8079758.044  ops/s
>>>[java] CompactIntegerSequenceBench.benchCompressLZ4Fast32kthrpt
>>> 15  353024925.655 ±  7980400.003  ops/s
>>>[java] CompactIntegerSequenceBench.benchCompressLZ4Fast64kthrpt
>>> 15  365664477.654 ± 10083336.038  ops/s
>>>[java] CompactIntegerSequenceBench.benchCompressLZ4Fast8k thrpt
>>> 15  305518114.172 ± 11043705.883  ops/s
>>>[java] CompactIntegerSequenceBench.benchDecompressLZ4Fast16k  thrpt
>>> 15  688369529.911 ± 25620873.933  ops/s
>>>[java] CompactIntegerSequenceBench.benchDecompressLZ4Fast32k  thrpt
>>> 15  703635848.895 ±  5296941.704  ops/s
>>>[java] CompactIntegerSequenceBench.benchDecompressLZ4Fast64k  thrpt
>>> 15  695537044.676 ± 17400763.731  ops/s
>>>[java] CompactIntegerSequenceBench.benchDecompressLZ4Fast8k   thrpt
>>> 15  727725713.128 ±  4252436.331  ops/s
>>> 
>>> To summarize, compression is 8.5% slower and decompression is 1% faster.
>>> This is measuring the impact on compression/decompression not the huge
>>> impact that would occur if we decompressed data we don't need less often.
>>> 
>>> I didn't test decompression of Snappy and LZ4 high, but I did test
>>> compression.
>>> 
>>> Snappy:
>>>[java] CompactIntegerSequenceBench.benchCompressSnappy16k   thrpt
>>> 2  196574766.116  ops/s
>>>[java] CompactIntegerSequenceBench.benchCompressSnappy32k   thrpt
>>> 2  198538643.844  ops/s
>>>[java] CompactIntegerSequenceBench.benchCompressSnappy64k   thrpt
>>> 2  194600497.613  ops/s
>>>[java] CompactIntegerSequenceBench.benchCompressSnappy8kthrpt
>>> 2  186040175.059  ops/s
>>> 
>>> LZ4 high compressor:
>>>[java] CompactIntegerSequenceBench.bench16k thrpt2
>>> 20822947.578  ops/s
>>>[java] CompactIntegerSequenceBench.bench32k thrpt2
>>> 12037342.253  ops/s
>>>[java] CompactIntegerSequenceBench.bench64k  thrpt2
>>> 6782534.469  ops/s
>>>[java] CompactIntegerSequenceBench.bench8k   thrpt2
>>> 32254619.594  ops/s
>>> 
>>> LZ4 high is the one instance where block size mattered a lot. It's a bit
>>> suspicious really when you look at the ratio of performance to block size
>>> being close to 1:1. I couldn't spot a bug in the benchmark though.
>>> 
>>> Compression ratios with LZ4 fast for the text of Alice in Wonderland was:
>>> 
>>> Chunk size 8192, ratio 0.709473
>>> Chunk size 16384, ratio 0.667236
>>> Chunk size 32768, ratio 0.634735
>>> Chunk size 65536, ratio 0.607208
>>> 
>>> By way of comparison I also ran deflate with maximum compression:
>>> 
>>> Chunk size 8192, ratio 0.426434
>>> Chunk size 16384, ratio 0.402423
>>> Chunk size 32768, ratio 0.381627
>>> Chunk size 65536, ratio 0.364865
>>> 
>>> Ariel
>>> 
 On Thu, Oct 18, 2018, at 5:32 AM, Benedict Elliott Smith wrote:
 FWIW, I’m not -0, just think that long after the freeze date a change
 like this needs a strong mandate from the community. 

Re: CASSANDRA-13241 lower default chunk_length_in_kb

2018-10-19 Thread Jeff Jirsa
Agree with Sylvain (and I think Benedict) - there’s no compelling reason to 
violate the freeze here. We’ve had the wrong default for years - add a note to 
the docs that we’ll be changing it in the future, but let’s not violate the 
freeze now.

-- 
Jeff Jirsa


> On Oct 19, 2018, at 10:06 AM, Sylvain Lebresne  wrote:
> 
> Fwiw, as much as I agree this is a change worth doing in general, I do am
> -0 for 4.0. Both the "compact sequencing" and the change of default really.
> We're closing on 2 months within the freeze, and for me a freeze do include
> not changing defaults, because changing default ideally imply a decent
> amount of analysis/benchmark of the consequence of that change[1] and that
> doesn't enter in my definition of a freeze.
> 
> [1]: to be extra clear, I'm not saying we've always done this, far from it.
> But I hope we can all agree we were wrong to no do it when we didn't and
> should strive to improve, not repeat past mistakes.
> --
> Sylvain
> 
> 
>> On Thu, Oct 18, 2018 at 8:55 PM Ariel Weisberg  wrote:
>> 
>> Hi,
>> 
>> For those who were asking about the performance impact of block size on
>> compression I wrote a microbenchmark.
>> 
>> https://pastebin.com/RHDNLGdC
>> 
>> [java] Benchmark   Mode
>> Cnt  Score  Error  Units
>> [java] CompactIntegerSequenceBench.benchCompressLZ4Fast16kthrpt
>> 15  331190055.685 ±  8079758.044  ops/s
>> [java] CompactIntegerSequenceBench.benchCompressLZ4Fast32kthrpt
>> 15  353024925.655 ±  7980400.003  ops/s
>> [java] CompactIntegerSequenceBench.benchCompressLZ4Fast64kthrpt
>> 15  365664477.654 ± 10083336.038  ops/s
>> [java] CompactIntegerSequenceBench.benchCompressLZ4Fast8k thrpt
>> 15  305518114.172 ± 11043705.883  ops/s
>> [java] CompactIntegerSequenceBench.benchDecompressLZ4Fast16k  thrpt
>> 15  688369529.911 ± 25620873.933  ops/s
>> [java] CompactIntegerSequenceBench.benchDecompressLZ4Fast32k  thrpt
>> 15  703635848.895 ±  5296941.704  ops/s
>> [java] CompactIntegerSequenceBench.benchDecompressLZ4Fast64k  thrpt
>> 15  695537044.676 ± 17400763.731  ops/s
>> [java] CompactIntegerSequenceBench.benchDecompressLZ4Fast8k   thrpt
>> 15  727725713.128 ±  4252436.331  ops/s
>> 
>> To summarize, compression is 8.5% slower and decompression is 1% faster.
>> This is measuring the impact on compression/decompression not the huge
>> impact that would occur if we decompressed data we don't need less often.
>> 
>> I didn't test decompression of Snappy and LZ4 high, but I did test
>> compression.
>> 
>> Snappy:
>> [java] CompactIntegerSequenceBench.benchCompressSnappy16k   thrpt
>> 2  196574766.116  ops/s
>> [java] CompactIntegerSequenceBench.benchCompressSnappy32k   thrpt
>> 2  198538643.844  ops/s
>> [java] CompactIntegerSequenceBench.benchCompressSnappy64k   thrpt
>> 2  194600497.613  ops/s
>> [java] CompactIntegerSequenceBench.benchCompressSnappy8kthrpt
>> 2  186040175.059  ops/s
>> 
>> LZ4 high compressor:
>> [java] CompactIntegerSequenceBench.bench16k thrpt2
>> 20822947.578  ops/s
>> [java] CompactIntegerSequenceBench.bench32k thrpt2
>> 12037342.253  ops/s
>> [java] CompactIntegerSequenceBench.bench64k  thrpt2
>> 6782534.469  ops/s
>> [java] CompactIntegerSequenceBench.bench8k   thrpt2
>> 32254619.594  ops/s
>> 
>> LZ4 high is the one instance where block size mattered a lot. It's a bit
>> suspicious really when you look at the ratio of performance to block size
>> being close to 1:1. I couldn't spot a bug in the benchmark though.
>> 
>> Compression ratios with LZ4 fast for the text of Alice in Wonderland was:
>> 
>> Chunk size 8192, ratio 0.709473
>> Chunk size 16384, ratio 0.667236
>> Chunk size 32768, ratio 0.634735
>> Chunk size 65536, ratio 0.607208
>> 
>> By way of comparison I also ran deflate with maximum compression:
>> 
>> Chunk size 8192, ratio 0.426434
>> Chunk size 16384, ratio 0.402423
>> Chunk size 32768, ratio 0.381627
>> Chunk size 65536, ratio 0.364865
>> 
>> Ariel
>> 
>>> On Thu, Oct 18, 2018, at 5:32 AM, Benedict Elliott Smith wrote:
>>> FWIW, I’m not -0, just think that long after the freeze date a change
>>> like this needs a strong mandate from the community.  I think the change
>>> is a good one.
>>> 
>>> 
>>> 
>>> 
>>> 
 On 17 Oct 2018, at 22:09, Ariel Weisberg  wrote:
 
 Hi,
 
 It's really not appreciably slower compared to the decompression we
>> are going to do which is going to take several microseconds. Decompression
>> is also going to be faster because we are going to do less unnecessary
>> decompression and the decompression itself may be faster since it may fit
>> in a higher level cache better. I ran a microbenchmark comparing them.
 
 
>> https://issues.apache.org/jira/browse/CASSANDRA-13241?focusedCommentId=16653988&page=com.atlassian.jira.plugin.system.issuetabpanels:

Re: CASSANDRA-13241 lower default chunk_length_in_kb

2018-10-19 Thread Sylvain Lebresne
Fwiw, as much as I agree this is a change worth doing in general, I am
-0 for 4.0, for both the "compact sequencing" and the change of default.
We're closing in on 2 months into the freeze, and for me a freeze does include
not changing defaults, because changing a default ideally implies a decent
amount of analysis/benchmarking of the consequences of that change[1], and that
doesn't fit my definition of a freeze.

[1]: to be extra clear, I'm not saying we've always done this, far from it.
But I hope we can all agree we were wrong not to do it when we didn't, and
should strive to improve, not repeat past mistakes.
--
Sylvain


On Thu, Oct 18, 2018 at 8:55 PM Ariel Weisberg  wrote:

> Hi,
>
> For those who were asking about the performance impact of block size on
> compression I wrote a microbenchmark.
>
> https://pastebin.com/RHDNLGdC
>
>  [java] Benchmark   Mode
> Cnt  Score  Error  Units
>  [java] CompactIntegerSequenceBench.benchCompressLZ4Fast16kthrpt
>  15  331190055.685 ±  8079758.044  ops/s
>  [java] CompactIntegerSequenceBench.benchCompressLZ4Fast32kthrpt
>  15  353024925.655 ±  7980400.003  ops/s
>  [java] CompactIntegerSequenceBench.benchCompressLZ4Fast64kthrpt
>  15  365664477.654 ± 10083336.038  ops/s
>  [java] CompactIntegerSequenceBench.benchCompressLZ4Fast8k thrpt
>  15  305518114.172 ± 11043705.883  ops/s
>  [java] CompactIntegerSequenceBench.benchDecompressLZ4Fast16k  thrpt
>  15  688369529.911 ± 25620873.933  ops/s
>  [java] CompactIntegerSequenceBench.benchDecompressLZ4Fast32k  thrpt
>  15  703635848.895 ±  5296941.704  ops/s
>  [java] CompactIntegerSequenceBench.benchDecompressLZ4Fast64k  thrpt
>  15  695537044.676 ± 17400763.731  ops/s
>  [java] CompactIntegerSequenceBench.benchDecompressLZ4Fast8k   thrpt
>  15  727725713.128 ±  4252436.331  ops/s
>
> To summarize, compression is 8.5% slower and decompression is 1% faster.
> This is measuring the impact on compression/decompression not the huge
> impact that would occur if we decompressed data we don't need less often.
>
> I didn't test decompression of Snappy and LZ4 high, but I did test
> compression.
>
> Snappy:
>  [java] CompactIntegerSequenceBench.benchCompressSnappy16k   thrpt
> 2  196574766.116  ops/s
>  [java] CompactIntegerSequenceBench.benchCompressSnappy32k   thrpt
> 2  198538643.844  ops/s
>  [java] CompactIntegerSequenceBench.benchCompressSnappy64k   thrpt
> 2  194600497.613  ops/s
>  [java] CompactIntegerSequenceBench.benchCompressSnappy8kthrpt
> 2  186040175.059  ops/s
>
> LZ4 high compressor:
>  [java] CompactIntegerSequenceBench.bench16k  thrpt2
> 20822947.578  ops/s
>  [java] CompactIntegerSequenceBench.bench32k  thrpt2
> 12037342.253  ops/s
>  [java] CompactIntegerSequenceBench.bench64k  thrpt2
>  6782534.469  ops/s
>  [java] CompactIntegerSequenceBench.bench8k   thrpt2
> 32254619.594  ops/s
>
> LZ4 high is the one instance where block size mattered a lot. It's a bit
> suspicious really when you look at the ratio of performance to block size
> being close to 1:1. I couldn't spot a bug in the benchmark though.
>
> Compression ratios with LZ4 fast for the text of Alice in Wonderland was:
>
> Chunk size 8192, ratio 0.709473
> Chunk size 16384, ratio 0.667236
> Chunk size 32768, ratio 0.634735
> Chunk size 65536, ratio 0.607208
>
> By way of comparison I also ran deflate with maximum compression:
>
> Chunk size 8192, ratio 0.426434
> Chunk size 16384, ratio 0.402423
> Chunk size 32768, ratio 0.381627
> Chunk size 65536, ratio 0.364865
>
> Ariel
>
> On Thu, Oct 18, 2018, at 5:32 AM, Benedict Elliott Smith wrote:
> > FWIW, I’m not -0, just think that long after the freeze date a change
> > like this needs a strong mandate from the community.  I think the change
> > is a good one.
> >
> >
> >
> >
> >
> > > On 17 Oct 2018, at 22:09, Ariel Weisberg  wrote:
> > >
> > > Hi,
> > >
> > > It's really not appreciably slower compared to the decompression we
> are going to do which is going to take several microseconds. Decompression
> is also going to be faster because we are going to do less unnecessary
> decompression and the decompression itself may be faster since it may fit
> in a higher level cache better. I ran a microbenchmark comparing them.
> > >
> > >
> https://issues.apache.org/jira/browse/CASSANDRA-13241?focusedCommentId=16653988&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16653988
> > >
> > > Fetching a long from memory:   56 nanoseconds
> > > Compact integer sequence   :   80 nanoseconds
> > > Summing integer sequence   :  165 nanoseconds
> > >
> > > Currently we have one +1 from Kurt to change the representation and
> possibly a -0 from Benedict. That's not really enough to make an exception
> to the code freeze. If you want it to happen (or not) you n

Re: CASSANDRA-13241 lower default chunk_length_in_kb

2018-10-18 Thread Ariel Weisberg
Hi,

For those who were asking about the performance impact of block size on 
compression I wrote a microbenchmark.

https://pastebin.com/RHDNLGdC

 [java] Benchmark                                               Mode  Cnt          Score          Error  Units
 [java] CompactIntegerSequenceBench.benchCompressLZ4Fast16k    thrpt   15  331190055.685 ±  8079758.044  ops/s
 [java] CompactIntegerSequenceBench.benchCompressLZ4Fast32k    thrpt   15  353024925.655 ±  7980400.003  ops/s
 [java] CompactIntegerSequenceBench.benchCompressLZ4Fast64k    thrpt   15  365664477.654 ± 10083336.038  ops/s
 [java] CompactIntegerSequenceBench.benchCompressLZ4Fast8k     thrpt   15  305518114.172 ± 11043705.883  ops/s
 [java] CompactIntegerSequenceBench.benchDecompressLZ4Fast16k  thrpt   15  688369529.911 ± 25620873.933  ops/s
 [java] CompactIntegerSequenceBench.benchDecompressLZ4Fast32k  thrpt   15  703635848.895 ±  5296941.704  ops/s
 [java] CompactIntegerSequenceBench.benchDecompressLZ4Fast64k  thrpt   15  695537044.676 ± 17400763.731  ops/s
 [java] CompactIntegerSequenceBench.benchDecompressLZ4Fast8k   thrpt   15  727725713.128 ±  4252436.331  ops/s

To summarize, compression is 8.5% slower and decompression is 1% faster. This 
measures only the cost of compression/decompression itself, not the much larger 
benefit of less often decompressing data we don't need.

I didn't test decompression of Snappy and LZ4 high, but I did test compression.

Snappy:
 [java] CompactIntegerSequenceBench.benchCompressSnappy16k  thrpt    2  196574766.116  ops/s
 [java] CompactIntegerSequenceBench.benchCompressSnappy32k  thrpt    2  198538643.844  ops/s
 [java] CompactIntegerSequenceBench.benchCompressSnappy64k  thrpt    2  194600497.613  ops/s
 [java] CompactIntegerSequenceBench.benchCompressSnappy8k   thrpt    2  186040175.059  ops/s

LZ4 high compressor:
 [java] CompactIntegerSequenceBench.bench16k  thrpt    2  20822947.578  ops/s
 [java] CompactIntegerSequenceBench.bench32k  thrpt    2  12037342.253  ops/s
 [java] CompactIntegerSequenceBench.bench64k  thrpt    2   6782534.469  ops/s
 [java] CompactIntegerSequenceBench.bench8k   thrpt    2  32254619.594  ops/s

LZ4 high is the one instance where block size mattered a lot. It's a bit 
suspicious really when you look at the ratio of performance to block size being 
close to 1:1. I couldn't spot a bug in the benchmark though.

Compression ratios with LZ4 fast for the text of Alice in Wonderland were:

Chunk size 8192, ratio 0.709473
Chunk size 16384, ratio 0.667236
Chunk size 32768, ratio 0.634735
Chunk size 65536, ratio 0.607208

By way of comparison I also ran deflate with maximum compression:

Chunk size 8192, ratio 0.426434
Chunk size 16384, ratio 0.402423
Chunk size 32768, ratio 0.381627
Chunk size 65536, ratio 0.364865
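
For anyone who wants to reproduce the ratio-vs-chunk-size effect without the pastebin harness, here is a minimal, hedged sketch using only `java.util.zip` (deflate at maximum compression, as above). The corpus is a repetitive stand-in rather than the actual Alice in Wonderland text, so the absolute ratios will differ, but the trend of larger chunks compressing better should hold:

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;

public class ChunkRatioDemo {
    // Build a stand-in corpus (~1 MiB of repetitive English text).
    static byte[] buildCorpus() {
        StringBuilder sb = new StringBuilder();
        while (sb.length() < (1 << 20)) {
            sb.append("Alice was beginning to get very tired of sitting by her sister ")
              .append("on the bank, and of having nothing to do. ");
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    // Compress the data in independent fixed-size chunks (as compressed
    // SSTables do) and return total compressed bytes / total raw bytes.
    static double chunkedRatio(byte[] data, int chunkSize) {
        long compressedBytes = 0;
        byte[] out = new byte[chunkSize + 1024]; // headroom for incompressible chunks
        for (int off = 0; off < data.length; off += chunkSize) {
            int len = Math.min(chunkSize, data.length - off);
            Deflater d = new Deflater(Deflater.BEST_COMPRESSION);
            d.setInput(data, off, len);
            d.finish();
            while (!d.finished()) {
                compressedBytes += d.deflate(out);
            }
            d.end();
        }
        return (double) compressedBytes / data.length;
    }

    public static void main(String[] args) {
        byte[] data = buildCorpus();
        for (int kb : new int[] {8, 16, 32, 64}) {
            System.out.printf("Chunk size %d, ratio %f%n",
                    kb * 1024, chunkedRatio(data, kb * 1024));
        }
    }
}
```

Each chunk is compressed independently, so smaller chunks reset the dictionary and pay the per-chunk framing cost more often, which is why the 8 KiB ratio comes out worse than the 64 KiB one in the numbers above.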

Ariel
 
On Thu, Oct 18, 2018, at 5:32 AM, Benedict Elliott Smith wrote:
> FWIW, I’m not -0, just think that long after the freeze date a change 
> like this needs a strong mandate from the community.  I think the change 
> is a good one.
> 
> 
> 
> 
> 
> > On 17 Oct 2018, at 22:09, Ariel Weisberg  wrote:
> > 
> > Hi,
> > 
> > It's really not appreciably slower compared to the decompression we are 
> > going to do which is going to take several microseconds. Decompression is 
> > also going to be faster because we are going to do less unnecessary 
> > decompression and the decompression itself may be faster since it may fit 
> > in a higher level cache better. I ran a microbenchmark comparing them.
> > 
> > https://issues.apache.org/jira/browse/CASSANDRA-13241?focusedCommentId=16653988&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16653988
> > 
> > Fetching a long from memory:   56 nanoseconds
> > Compact integer sequence   :   80 nanoseconds
> > Summing integer sequence   :  165 nanoseconds
> > 
> > Currently we have one +1 from Kurt to change the representation and 
> > possibly a -0 from Benedict. That's not really enough to make an exception 
> > to the code freeze. If you want it to happen (or not) you need to speak up 
> > otherwise only the default will change.
> > 
> > Regards,
> > Ariel
> > 
> > On Wed, Oct 17, 2018, at 6:40 AM, kurt greaves wrote:
> >> I think if we're going to drop it to 16k, we should invest in the compact
> >> sequencing as well. Just lowering it to 16k will have potentially a painful
> >> impact on anyone running low memory nodes, but if we can do it without the
> >> memory impact I don't think there's any reason to wait another major
> >> version to implement it.
> >> 
> >> Having said that, we should probably benchmark the two representations
> >> Ariel has come up with.
> >> 
> >> On Wed, 17 Oct 2018 at 20:17, Alain RODRIGUEZ  wrote:
> >> 
> >>> +1
> >>> 
> >>> I would guess a lot of C* clusters/tables have this option set to the
> >>> default val

Re: CASSANDRA-13241 lower default chunk_length_in_kb

2018-10-18 Thread Benedict Elliott Smith
FWIW, I’m not -0, just think that long after the freeze date a change like this 
needs a strong mandate from the community.  I think the change is a good one.





> On 17 Oct 2018, at 22:09, Ariel Weisberg  wrote:
> 
> Hi,
> 
> It's really not appreciably slower compared to the decompression we are going 
> to do which is going to take several microseconds. Decompression is also 
> going to be faster because we are going to do less unnecessary decompression 
> and the decompression itself may be faster since it may fit in a higher level 
> cache better. I ran a microbenchmark comparing them.
> 
> https://issues.apache.org/jira/browse/CASSANDRA-13241?focusedCommentId=16653988&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16653988
> 
> Fetching a long from memory:   56 nanoseconds
> Compact integer sequence   :   80 nanoseconds
> Summing integer sequence   :  165 nanoseconds
> 
> Currently we have one +1 from Kurt to change the representation and possibly 
> a -0 from Benedict. That's not really enough to make an exception to the code 
> freeze. If you want it to happen (or not) you need to speak up otherwise only 
> the default will change.
> 
> Regards,
> Ariel
> 
> On Wed, Oct 17, 2018, at 6:40 AM, kurt greaves wrote:
>> I think if we're going to drop it to 16k, we should invest in the compact
>> sequencing as well. Just lowering it to 16k will have potentially a painful
>> impact on anyone running low memory nodes, but if we can do it without the
>> memory impact I don't think there's any reason to wait another major
>> version to implement it.
>> 
>> Having said that, we should probably benchmark the two representations
>> Ariel has come up with.
>> 
>> On Wed, 17 Oct 2018 at 20:17, Alain RODRIGUEZ  wrote:
>> 
>>> +1
>>> 
>>> I would guess a lot of C* clusters/tables have this option set to the
>>> default value, and not many of them are having the need for reading so big
>>> chunks of data.
>>> I believe this will greatly limit disk overreads for a fair amount (a big
>>> majority?) of new users. It seems fair enough to change this default value,
>>> I also think 4.0 is a nice place to do this.
>>> 
>>> Thanks for taking care of this Ariel and for making sure there is a
>>> consensus here as well,
>>> 
>>> C*heers,
>>> ---
>>> Alain Rodriguez - al...@thelastpickle.com
>>> France / Spain
>>> 
>>> The Last Pickle - Apache Cassandra Consulting
>>> http://www.thelastpickle.com
>>> 
>>> Le sam. 13 oct. 2018 à 08:52, Ariel Weisberg  a écrit :
>>> 
 Hi,
 
 This would only impact new tables, existing tables would get their
 chunk_length_in_kb from the existing schema. It's something we record in
>>> a
 system table.
 
 I have an implementation of a compact integer sequence that only requires
 37% of the memory required today. So we could do this with only slightly
 more than doubling the memory used. I'll post that to the JIRA soon.
 
 Ariel
 
 On Fri, Oct 12, 2018, at 1:56 AM, Jeff Jirsa wrote:
> 
> 
> I think 16k is a better default, but it should only affect new tables.
> Whoever changes it, please make sure you think about the upgrade path.
> 
> 
>> On Oct 12, 2018, at 2:31 AM, Ben Bromhead 
>>> wrote:
>> 
>> This is something that's bugged me for ages, tbh the performance gain
 for
>> most use cases far outweighs the increase in memory usage and I would
 even
>> be in favor of changing the default now, optimizing the storage cost
 later
>> (if it's found to be worth it).
>> 
>> For some anecdotal evidence:
>> 4kb is usually what we end setting it to, 16kb feels more reasonable
 given
>> the memory impact, but what would be the point if practically, most
 folks
>> set it to 4kb anyway?
>> 
>> Note that chunk_length will largely be dependent on your read sizes,
 but 4k
>> is the floor for most physical devices in terms of ones block size.
>> 
>> +1 for making this change in 4.0 given the small size and the large
>> improvement to new users experience (as long as we are explicit in
>>> the
>> documentation about memory consumption).
>> 
>> 
>>> On Thu, Oct 11, 2018 at 7:11 PM Ariel Weisberg 
 wrote:
>>> 
>>> Hi,
>>> 
>>> This is regarding
 https://issues.apache.org/jira/browse/CASSANDRA-13241
>>> 
>>> This ticket has languished for a while. IMO it's too late in 4.0 to
>>> implement a more memory efficient representation for compressed
>>> chunk
>>> offsets. However I don't think we should put out another release
>>> with
 the
>>> current 64k default as it's pretty unreasonable.
>>> 
>>> I propose that we lower the value to 16kb. 4k might never be the
 correct
>>> default anyways as there is a cost to compression and 16k will still
 be a
>>> large improvement.
>>> 
 Benedict and Jon Haddad are both +1 on making this change for 4.0. In the past there has been some consensus about reducing this value although maybe with more memory efficiency.

Re: CASSANDRA-13241 lower default chunk_length_in_kb

2018-10-17 Thread Ariel Weisberg
Hi,

It's really not appreciably slower compared to the decompression we are going 
to do, which is going to take several microseconds. Decompression is also going 
to be faster because we are going to do less unnecessary decompression, and the 
decompression itself may be faster since it may fit in a higher-level cache 
better. I ran a microbenchmark comparing them.

https://issues.apache.org/jira/browse/CASSANDRA-13241?focusedCommentId=16653988&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16653988

Fetching a long from memory:   56 nanoseconds
Compact integer sequence   :   80 nanoseconds
Summing integer sequence   :  165 nanoseconds
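To make the representation trade-off concrete, here is an illustrative sketch of a compact monotonic integer sequence: a full 8-byte base offset per block of chunks plus small per-chunk deltas. This is my own toy layout, not the actual Cassandra patch (which reportedly reaches ~37% of the original footprint; this simplified scheme lands around 56% for uniform 16 KiB chunks).

```python
import array

class CompactOffsets:
    """Toy compact monotonic sequence (NOT Cassandra's actual layout):
    every BLOCK-th offset is stored as a full 8-byte long, the rest as
    4-byte deltas from the preceding base."""
    BLOCK = 16

    def __init__(self, offsets):
        self.bases = array.array("q")   # 8-byte signed longs
        self.deltas = array.array("I")  # 4-byte unsigned deltas
        for i, off in enumerate(offsets):
            if i % self.BLOCK == 0:
                self.bases.append(off)
            self.deltas.append(off - self.bases[-1])

    def __getitem__(self, i):
        # One extra add and index per lookup -- the ~80ns vs ~56ns cost above.
        return self.bases[i // self.BLOCK] + self.deltas[i]

    def nbytes(self):
        return 8 * len(self.bases) + 4 * len(self.deltas)

# Offsets for 1M compressed chunks, ~16 KiB apart (uniform for illustration;
# real compressed chunk sizes vary).
offsets = [i * 16384 for i in range(1_000_000)]
compact = CompactOffsets(offsets)
assert all(compact[i] == offsets[i] for i in range(0, 1_000_000, 99_999))
print(f"plain longs: {8 * len(offsets):,} bytes, compact: {compact.nbytes():,} bytes")
```

With uniform spacing the per-block deltas stay well under 2^32, so 4-byte deltas suffice; a production layout would also have to handle variable chunk sizes and bounds.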

Currently we have one +1 from Kurt to change the representation and possibly a 
-0 from Benedict. That's not really enough to make an exception to the code 
freeze. If you want it to happen (or not), you need to speak up; otherwise only 
the default will change.

Regards,
Ariel

On Wed, Oct 17, 2018, at 6:40 AM, kurt greaves wrote:
> I think if we're going to drop it to 16k, we should invest in the compact
> sequencing as well. Just lowering it to 16k will have potentially a painful
> impact on anyone running low memory nodes, but if we can do it without the
> memory impact I don't think there's any reason to wait another major
> version to implement it.
> 
> Having said that, we should probably benchmark the two representations
> Ariel has come up with.
> 
> On Wed, 17 Oct 2018 at 20:17, Alain RODRIGUEZ  wrote:
> 
> > +1
> >
> > I would guess a lot of C* clusters/tables have this option set to the
> > default value, and not many of them are having the need for reading so big
> > chunks of data.
> > I believe this will greatly limit disk overreads for a fair amount (a big
> > majority?) of new users. It seems fair enough to change this default value,
> > I also think 4.0 is a nice place to do this.
> >
> > Thanks for taking care of this Ariel and for making sure there is a
> > consensus here as well,
> >
> > C*heers,
> > ---
> > Alain Rodriguez - al...@thelastpickle.com
> > France / Spain
> >
> > The Last Pickle - Apache Cassandra Consulting
> > http://www.thelastpickle.com
> >
> > On Sat, Oct 13, 2018 at 08:52, Ariel Weisberg  wrote:
> >
> > > Hi,
> > >
> > > This would only impact new tables, existing tables would get their
> > > chunk_length_in_kb from the existing schema. It's something we record in
> > a
> > > system table.
> > >
> > > I have an implementation of a compact integer sequence that only requires
> > > 37% of the memory required today. So we could do this with only slightly
> > > more than doubling the memory used. I'll post that to the JIRA soon.
> > >
> > > Ariel
> > >
> > > On Fri, Oct 12, 2018, at 1:56 AM, Jeff Jirsa wrote:
> > > >
> > > >
> > > > I think 16k is a better default, but it should only affect new tables.
> > > > Whoever changes it, please make sure you think about the upgrade path.
> > > >
> > > >
> > > > > On Oct 12, 2018, at 2:31 AM, Ben Bromhead 
> > wrote:
> > > > >
> > > > > This is something that's bugged me for ages, tbh the performance gain
> > > for
> > > > > most use cases far outweighs the increase in memory usage and I would
> > > even
> > > > > be in favor of changing the default now, optimizing the storage cost
> > > later
> > > > > (if it's found to be worth it).
> > > > >
> > > > > For some anecdotal evidence:
> > > > > 4kb is usually what we end setting it to, 16kb feels more reasonable
> > > given
> > > > > the memory impact, but what would be the point if practically, most
> > > folks
> > > > > set it to 4kb anyway?
> > > > >
> > > > > Note that chunk_length will largely be dependent on your read sizes,
> > > but 4k
> > > > > is the floor for most physical devices in terms of ones block size.
> > > > >
> > > > > +1 for making this change in 4.0 given the small size and the large
> > > > > improvement to new users experience (as long as we are explicit in
> > the
> > > > > documentation about memory consumption).
> > > > >
> > > > >
> > > > >> On Thu, Oct 11, 2018 at 7:11 PM Ariel Weisberg 
> > > wrote:
> > > > >>
> > > > >> Hi,
> > > > >>
> > > > >> This is regarding
> > > https://issues.apache.org/jira/browse/CASSANDRA-13241
> > > > >>
> > > > >> This ticket has languished for a while. IMO it's too late in 4.0 to
> > > > >> implement a more memory efficient representation for compressed
> > chunk
> > > > >> offsets. However I don't think we should put out another release
> > with
> > > the
> > > > >> current 64k default as it's pretty unreasonable.
> > > > >>
> > > > >> I propose that we lower the value to 16kb. 4k might never be the
> > > correct
> > > > >> default anyways as there is a cost to compression and 16k will still
> > > be a
> > > > >> large improvement.
> > > > >>
> > > >> Benedict and Jon Haddad are both +1 on making this change for 4.0. In
> > > >> the past there has been some consensus about reducing this value although
> > > >> maybe with more memory efficiency.

Re: CASSANDRA-13241 lower default chunk_length_in_kb

2018-10-17 Thread kurt greaves
I think if we're going to drop it to 16k, we should invest in the compact
sequencing as well. Just lowering it to 16k will potentially have a painful
impact on anyone running low-memory nodes, but if we can do it without the
memory impact I don't think there's any reason to wait another major
version to implement it.

Having said that, we should probably benchmark the two representations
Ariel has come up with.

On Wed, 17 Oct 2018 at 20:17, Alain RODRIGUEZ  wrote:

> +1
>
> I would guess a lot of C* clusters/tables have this option set to the
> default value, and not many of them are having the need for reading so big
> chunks of data.
> I believe this will greatly limit disk overreads for a fair amount (a big
> majority?) of new users. It seems fair enough to change this default value,
> I also think 4.0 is a nice place to do this.
>
> Thanks for taking care of this Ariel and for making sure there is a
> consensus here as well,
>
> C*heers,
> ---
> Alain Rodriguez - al...@thelastpickle.com
> France / Spain
>
> The Last Pickle - Apache Cassandra Consulting
> http://www.thelastpickle.com
>
> On Sat, Oct 13, 2018 at 08:52, Ariel Weisberg  wrote:
>
> > Hi,
> >
> > This would only impact new tables, existing tables would get their
> > chunk_length_in_kb from the existing schema. It's something we record in
> a
> > system table.
> >
> > I have an implementation of a compact integer sequence that only requires
> > 37% of the memory required today. So we could do this with only slightly
> > more than doubling the memory used. I'll post that to the JIRA soon.
> >
> > Ariel
> >
> > On Fri, Oct 12, 2018, at 1:56 AM, Jeff Jirsa wrote:
> > >
> > >
> > > I think 16k is a better default, but it should only affect new tables.
> > > Whoever changes it, please make sure you think about the upgrade path.
> > >
> > >
> > > > On Oct 12, 2018, at 2:31 AM, Ben Bromhead 
> wrote:
> > > >
> > > > This is something that's bugged me for ages, tbh the performance gain
> > for
> > > > most use cases far outweighs the increase in memory usage and I would
> > even
> > > > be in favor of changing the default now, optimizing the storage cost
> > later
> > > > (if it's found to be worth it).
> > > >
> > > > For some anecdotal evidence:
> > > > 4kb is usually what we end setting it to, 16kb feels more reasonable
> > given
> > > > the memory impact, but what would be the point if practically, most
> > folks
> > > > set it to 4kb anyway?
> > > >
> > > > Note that chunk_length will largely be dependent on your read sizes,
> > but 4k
> > > > is the floor for most physical devices in terms of ones block size.
> > > >
> > > > +1 for making this change in 4.0 given the small size and the large
> > > > improvement to new users experience (as long as we are explicit in
> the
> > > > documentation about memory consumption).
> > > >
> > > >
> > > >> On Thu, Oct 11, 2018 at 7:11 PM Ariel Weisberg 
> > wrote:
> > > >>
> > > >> Hi,
> > > >>
> > > >> This is regarding
> > https://issues.apache.org/jira/browse/CASSANDRA-13241
> > > >>
> > > >> This ticket has languished for a while. IMO it's too late in 4.0 to
> > > >> implement a more memory efficient representation for compressed
> chunk
> > > >> offsets. However I don't think we should put out another release
> with
> > the
> > > >> current 64k default as it's pretty unreasonable.
> > > >>
> > > >> I propose that we lower the value to 16kb. 4k might never be the
> > correct
> > > >> default anyways as there is a cost to compression and 16k will still
> > be a
> > > >> large improvement.
> > > >>
> > > >> Benedict and Jon Haddad are both +1 on making this change for 4.0.
> In
> > the
> > > >> past there has been some consensus about reducing this value
> although
> > maybe
> > > >> with more memory efficiency.
> > > >>
> > > >> The napkin math for what this costs is:
> > > >> "If you have 1TB of uncompressed data, with 64k chunks that's 16M
> > chunks
> > > >> at 8 bytes each (128MB).
> > > >> With 16k chunks, that's 512MB.
> > > >> With 4k chunks, it's 2G.
> > > >> Per terabyte of data (pre-compression)."
> > > >>
> > > >>
> >
> https://issues.apache.org/jira/browse/CASSANDRA-13241?focusedCommentId=15886621&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15886621
> > > >>
> > > >> By way of comparison memory mapping the files has a similar cost per
> > 4k
> > > >> page of 8 bytes. Multiple mappings makes this more expensive. With a
> > > >> default of 16kb this would be 4x less expensive than memory mapping
> a
> > file.
> > > >> I only mention this to give a sense of the costs we are already
> > paying. I
> > > >> am not saying they are directly related.
> > > >>
> > > >> I'll wait a week for discussion and if there is consensus make the
> > change.
> > > >>
> > > >> Regards,
> > > >> Ariel
> > > >>
> > > >>
> -
> > > >> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
>

Re: CASSANDRA-13241 lower default chunk_length_in_kb

2018-10-17 Thread Alain RODRIGUEZ
+1

I would guess a lot of C* clusters/tables have this option set to the
default value, and not many of them need to read such big chunks of data.
I believe this will greatly limit disk overreads for a fair amount (a big
majority?) of new users. It seems fair enough to change this default value,
and I also think 4.0 is a nice place to do it.

Thanks for taking care of this Ariel and for making sure there is a
consensus here as well,

C*heers,
---
Alain Rodriguez - al...@thelastpickle.com
France / Spain

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

On Sat, Oct 13, 2018 at 08:52, Ariel Weisberg  wrote:

> Hi,
>
> This would only impact new tables, existing tables would get their
> chunk_length_in_kb from the existing schema. It's something we record in a
> system table.
>
> I have an implementation of a compact integer sequence that only requires
> 37% of the memory required today. So we could do this with only slightly
> more than doubling the memory used. I'll post that to the JIRA soon.
>
> Ariel
>
> On Fri, Oct 12, 2018, at 1:56 AM, Jeff Jirsa wrote:
> >
> >
> > I think 16k is a better default, but it should only affect new tables.
> > Whoever changes it, please make sure you think about the upgrade path.
> >
> >
> > > On Oct 12, 2018, at 2:31 AM, Ben Bromhead  wrote:
> > >
> > > This is something that's bugged me for ages, tbh the performance gain
> for
> > > most use cases far outweighs the increase in memory usage and I would
> even
> > > be in favor of changing the default now, optimizing the storage cost
> later
> > > (if it's found to be worth it).
> > >
> > > For some anecdotal evidence:
> > > 4kb is usually what we end setting it to, 16kb feels more reasonable
> given
> > > the memory impact, but what would be the point if practically, most
> folks
> > > set it to 4kb anyway?
> > >
> > > Note that chunk_length will largely be dependent on your read sizes,
> but 4k
> > > is the floor for most physical devices in terms of ones block size.
> > >
> > > +1 for making this change in 4.0 given the small size and the large
> > > improvement to new users experience (as long as we are explicit in the
> > > documentation about memory consumption).
> > >
> > >
> > >> On Thu, Oct 11, 2018 at 7:11 PM Ariel Weisberg 
> wrote:
> > >>
> > >> Hi,
> > >>
> > >> This is regarding
> https://issues.apache.org/jira/browse/CASSANDRA-13241
> > >>
> > >> This ticket has languished for a while. IMO it's too late in 4.0 to
> > >> implement a more memory efficient representation for compressed chunk
> > >> offsets. However I don't think we should put out another release with
> the
> > >> current 64k default as it's pretty unreasonable.
> > >>
> > >> I propose that we lower the value to 16kb. 4k might never be the
> correct
> > >> default anyways as there is a cost to compression and 16k will still
> be a
> > >> large improvement.
> > >>
> > >> Benedict and Jon Haddad are both +1 on making this change for 4.0. In
> the
> > >> past there has been some consensus about reducing this value although
> maybe
> > >> with more memory efficiency.
> > >>
> > >> The napkin math for what this costs is:
> > >> "If you have 1TB of uncompressed data, with 64k chunks that's 16M
> chunks
> > >> at 8 bytes each (128MB).
> > >> With 16k chunks, that's 512MB.
> > >> With 4k chunks, it's 2G.
> > >> Per terabyte of data (pre-compression)."
> > >>
> > >>
> https://issues.apache.org/jira/browse/CASSANDRA-13241?focusedCommentId=15886621&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15886621
> > >>
> > >> By way of comparison memory mapping the files has a similar cost per
> 4k
> > >> page of 8 bytes. Multiple mappings makes this more expensive. With a
> > >> default of 16kb this would be 4x less expensive than memory mapping a
> file.
> > >> I only mention this to give a sense of the costs we are already
> paying. I
> > >> am not saying they are directly related.
> > >>
> > >> I'll wait a week for discussion and if there is consensus make the
> change.
> > >>
> > >> Regards,
> > >> Ariel
> > >>
> > >>
> > >> --
> > > Ben Bromhead
> > > CTO | Instaclustr 
> > > +1 650 284 9692
> > > Reliability at Scale
> > > Cassandra, Spark, Elasticsearch on AWS, Azure, GCP and Softlayer
> >
> >
>
>
>


Re: CASSANDRA-13241 lower default chunk_length_in_kb

2018-10-12 Thread Ariel Weisberg
Hi,

This would only impact new tables; existing tables would get their 
chunk_length_in_kb from the existing schema. It's something we record in a 
system table.

I have an implementation of a compact integer sequence that only requires 37% 
of the memory required today. So we could do this with only slightly more than 
doubling the memory used. I'll post that to the JIRA soon.

Ariel

On Fri, Oct 12, 2018, at 1:56 AM, Jeff Jirsa wrote:
> 
> 
> I think 16k is a better default, but it should only affect new tables. 
> Whoever changes it, please make sure you think about the upgrade path. 
> 
> 
> > On Oct 12, 2018, at 2:31 AM, Ben Bromhead  wrote:
> > 
> > This is something that's bugged me for ages, tbh the performance gain for
> > most use cases far outweighs the increase in memory usage and I would even
> > be in favor of changing the default now, optimizing the storage cost later
> > (if it's found to be worth it).
> > 
> > For some anecdotal evidence:
> > 4kb is usually what we end setting it to, 16kb feels more reasonable given
> > the memory impact, but what would be the point if practically, most folks
> > set it to 4kb anyway?
> > 
> > Note that chunk_length will largely be dependent on your read sizes, but 4k
> > is the floor for most physical devices in terms of ones block size.
> > 
> > +1 for making this change in 4.0 given the small size and the large
> > improvement to new users experience (as long as we are explicit in the
> > documentation about memory consumption).
> > 
> > 
> >> On Thu, Oct 11, 2018 at 7:11 PM Ariel Weisberg  wrote:
> >> 
> >> Hi,
> >> 
> >> This is regarding https://issues.apache.org/jira/browse/CASSANDRA-13241
> >> 
> >> This ticket has languished for a while. IMO it's too late in 4.0 to
> >> implement a more memory efficient representation for compressed chunk
> >> offsets. However I don't think we should put out another release with the
> >> current 64k default as it's pretty unreasonable.
> >> 
> >> I propose that we lower the value to 16kb. 4k might never be the correct
> >> default anyways as there is a cost to compression and 16k will still be a
> >> large improvement.
> >> 
> >> Benedict and Jon Haddad are both +1 on making this change for 4.0. In the
> >> past there has been some consensus about reducing this value although maybe
> >> with more memory efficiency.
> >> 
> >> The napkin math for what this costs is:
> >> "If you have 1TB of uncompressed data, with 64k chunks that's 16M chunks
> >> at 8 bytes each (128MB).
> >> With 16k chunks, that's 512MB.
> >> With 4k chunks, it's 2G.
> >> Per terabyte of data (pre-compression)."
> >> 
> >> https://issues.apache.org/jira/browse/CASSANDRA-13241?focusedCommentId=15886621&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15886621
> >> 
> >> By way of comparison memory mapping the files has a similar cost per 4k
> >> page of 8 bytes. Multiple mappings makes this more expensive. With a
> >> default of 16kb this would be 4x less expensive than memory mapping a file.
> >> I only mention this to give a sense of the costs we are already paying. I
> >> am not saying they are directly related.
> >> 
> >> I'll wait a week for discussion and if there is consensus make the change.
> >> 
> >> Regards,
> >> Ariel
> >> 
> >> 
> >> --
> > Ben Bromhead
> > CTO | Instaclustr 
> > +1 650 284 9692
> > Reliability at Scale
> > Cassandra, Spark, Elasticsearch on AWS, Azure, GCP and Softlayer
> 
> 




Re: CASSANDRA-13241 lower default chunk_length_in_kb

2018-10-12 Thread Jeff Jirsa




> On Oct 12, 2018, at 6:46 AM, Pavel Yaskevich  wrote:
> 
>> On Thu, Oct 11, 2018 at 4:31 PM Ben Bromhead  wrote:
>> 
>> This is something that's bugged me for ages, tbh the performance gain for
>> most use cases far outweighs the increase in memory usage and I would even
>> be in favor of changing the default now, optimizing the storage cost later
>> (if it's found to be worth it).
>> 
>> For some anecdotal evidence:
>> 4kb is usually what we end setting it to, 16kb feels more reasonable given
>> the memory impact, but what would be the point if practically, most folks
>> set it to 4kb anyway?
>> 
>> Note that chunk_length will largely be dependent on your read sizes, but 4k
>> is the floor for most physical devices in terms of ones block size.
>> 
> 
> It might be worth while to investigate how splitting chunk size into data,
> index and compaction sizes would affect performance.
> 

Data chunk and index chunk are already different (though one is table level and 
one is per instance), but I’m not parsing the compaction comment? 



Re: CASSANDRA-13241 lower default chunk_length_in_kb

2018-10-11 Thread Jeff Jirsa



I think 16k is a better default, but it should only affect new tables. Whoever 
changes it, please make sure you think about the upgrade path. 


> On Oct 12, 2018, at 2:31 AM, Ben Bromhead  wrote:
> 
> This is something that's bugged me for ages, tbh the performance gain for
> most use cases far outweighs the increase in memory usage and I would even
> be in favor of changing the default now, optimizing the storage cost later
> (if it's found to be worth it).
> 
> For some anecdotal evidence:
> 4kb is usually what we end setting it to, 16kb feels more reasonable given
> the memory impact, but what would be the point if practically, most folks
> set it to 4kb anyway?
> 
> Note that chunk_length will largely be dependent on your read sizes, but 4k
> is the floor for most physical devices in terms of ones block size.
> 
> +1 for making this change in 4.0 given the small size and the large
> improvement to new users experience (as long as we are explicit in the
> documentation about memory consumption).
> 
> 
>> On Thu, Oct 11, 2018 at 7:11 PM Ariel Weisberg  wrote:
>> 
>> Hi,
>> 
>> This is regarding https://issues.apache.org/jira/browse/CASSANDRA-13241
>> 
>> This ticket has languished for a while. IMO it's too late in 4.0 to
>> implement a more memory efficient representation for compressed chunk
>> offsets. However I don't think we should put out another release with the
>> current 64k default as it's pretty unreasonable.
>> 
>> I propose that we lower the value to 16kb. 4k might never be the correct
>> default anyways as there is a cost to compression and 16k will still be a
>> large improvement.
>> 
>> Benedict and Jon Haddad are both +1 on making this change for 4.0. In the
>> past there has been some consensus about reducing this value although maybe
>> with more memory efficiency.
>> 
>> The napkin math for what this costs is:
>> "If you have 1TB of uncompressed data, with 64k chunks that's 16M chunks
>> at 8 bytes each (128MB).
>> With 16k chunks, that's 512MB.
>> With 4k chunks, it's 2G.
>> Per terabyte of data (pre-compression)."
>> 
>> https://issues.apache.org/jira/browse/CASSANDRA-13241?focusedCommentId=15886621&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15886621
>> 
>> By way of comparison memory mapping the files has a similar cost per 4k
>> page of 8 bytes. Multiple mappings makes this more expensive. With a
>> default of 16kb this would be 4x less expensive than memory mapping a file.
>> I only mention this to give a sense of the costs we are already paying. I
>> am not saying they are directly related.
>> 
>> I'll wait a week for discussion and if there is consensus make the change.
>> 
>> Regards,
>> Ariel
>> 
>> 
>> --
> Ben Bromhead
> CTO | Instaclustr 
> +1 650 284 9692
> Reliability at Scale
> Cassandra, Spark, Elasticsearch on AWS, Azure, GCP and Softlayer




Re: CASSANDRA-13241 lower default chunk_length_in_kb

2018-10-11 Thread Pavel Yaskevich
On Thu, Oct 11, 2018 at 4:31 PM Ben Bromhead  wrote:

> This is something that's bugged me for ages, tbh the performance gain for
> most use cases far outweighs the increase in memory usage and I would even
> be in favor of changing the default now, optimizing the storage cost later
> (if it's found to be worth it).
>
> For some anecdotal evidence:
> 4kb is usually what we end setting it to, 16kb feels more reasonable given
> the memory impact, but what would be the point if practically, most folks
> set it to 4kb anyway?
>
> Note that chunk_length will largely be dependent on your read sizes, but 4k
> is the floor for most physical devices in terms of ones block size.
>

It might be worthwhile to investigate how splitting chunk size into data,
index and compaction sizes would affect performance.


>
> +1 for making this change in 4.0 given the small size and the large
> improvement to new users experience (as long as we are explicit in the
> documentation about memory consumption).
>
>
> On Thu, Oct 11, 2018 at 7:11 PM Ariel Weisberg  wrote:
>
> > Hi,
> >
> > This is regarding https://issues.apache.org/jira/browse/CASSANDRA-13241
> >
> > This ticket has languished for a while. IMO it's too late in 4.0 to
> > implement a more memory efficient representation for compressed chunk
> > offsets. However I don't think we should put out another release with the
> > current 64k default as it's pretty unreasonable.
> >
> > I propose that we lower the value to 16kb. 4k might never be the correct
> > default anyways as there is a cost to compression and 16k will still be a
> > large improvement.
> >
> > Benedict and Jon Haddad are both +1 on making this change for 4.0. In the
> > past there has been some consensus about reducing this value although
> maybe
> > with more memory efficiency.
> >
> > The napkin math for what this costs is:
> > "If you have 1TB of uncompressed data, with 64k chunks that's 16M chunks
> > at 8 bytes each (128MB).
> > With 16k chunks, that's 512MB.
> > With 4k chunks, it's 2G.
> > Per terabyte of data (pre-compression)."
> >
> >
> https://issues.apache.org/jira/browse/CASSANDRA-13241?focusedCommentId=15886621&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15886621
> >
> > By way of comparison memory mapping the files has a similar cost per 4k
> > page of 8 bytes. Multiple mappings makes this more expensive. With a
> > default of 16kb this would be 4x less expensive than memory mapping a
> file.
> > I only mention this to give a sense of the costs we are already paying. I
> > am not saying they are directly related.
> >
> > I'll wait a week for discussion and if there is consensus make the
> change.
> >
> > Regards,
> > Ariel
> >
> >
> > --
> Ben Bromhead
> CTO | Instaclustr 
> +1 650 284 9692
> Reliability at Scale
> Cassandra, Spark, Elasticsearch on AWS, Azure, GCP and Softlayer
>


Re: CASSANDRA-13241 lower default chunk_length_in_kb

2018-10-11 Thread Ben Bromhead
This is something that's bugged me for ages; tbh, the performance gain for
most use cases far outweighs the increase in memory usage, and I would even
be in favor of changing the default now, optimizing the storage cost later
(if it's found to be worth it).

For some anecdotal evidence:
4kb is usually what we end up setting it to. 16kb feels more reasonable given
the memory impact, but what would be the point if, practically, most folks
set it to 4kb anyway?

Note that chunk_length will largely be dependent on your read sizes, but 4k
is the floor for most physical devices in terms of their block size.

+1 for making this change in 4.0 given the small size and the large
improvement to new users' experience (as long as we are explicit in the
documentation about memory consumption).


On Thu, Oct 11, 2018 at 7:11 PM Ariel Weisberg  wrote:

> Hi,
>
> This is regarding https://issues.apache.org/jira/browse/CASSANDRA-13241
>
> This ticket has languished for a while. IMO it's too late in 4.0 to
> implement a more memory efficient representation for compressed chunk
> offsets. However I don't think we should put out another release with the
> current 64k default as it's pretty unreasonable.
>
> I propose that we lower the value to 16kb. 4k might never be the correct
> default anyways as there is a cost to compression and 16k will still be a
> large improvement.
>
> Benedict and Jon Haddad are both +1 on making this change for 4.0. In the
> past there has been some consensus about reducing this value although maybe
> with more memory efficiency.
>
> The napkin math for what this costs is:
> "If you have 1TB of uncompressed data, with 64k chunks that's 16M chunks
> at 8 bytes each (128MB).
> With 16k chunks, that's 512MB.
> With 4k chunks, it's 2G.
> Per terabyte of data (pre-compression)."
>
> https://issues.apache.org/jira/browse/CASSANDRA-13241?focusedCommentId=15886621&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15886621
>
> By way of comparison memory mapping the files has a similar cost per 4k
> page of 8 bytes. Multiple mappings makes this more expensive. With a
> default of 16kb this would be 4x less expensive than memory mapping a file.
> I only mention this to give a sense of the costs we are already paying. I
> am not saying they are directly related.
>
> I'll wait a week for discussion and if there is consensus make the change.
>
> Regards,
> Ariel
>
>
> --
Ben Bromhead
CTO | Instaclustr 
+1 650 284 9692
Reliability at Scale
Cassandra, Spark, Elasticsearch on AWS, Azure, GCP and Softlayer


CASSANDRA-13241 lower default chunk_length_in_kb

2018-10-11 Thread Ariel Weisberg
Hi,

This is regarding https://issues.apache.org/jira/browse/CASSANDRA-13241

This ticket has languished for a while. IMO it's too late in 4.0 to implement a 
more memory-efficient representation for compressed chunk offsets. However, I 
don't think we should put out another release with the current 64k default, as 
it's pretty unreasonable.

I propose that we lower the value to 16kb. 4k might never be the correct 
default anyway, as there is a cost to compression, and 16k will still be a large 
improvement.

Benedict and Jon Haddad are both +1 on making this change for 4.0. In the past 
there has been some consensus about reducing this value although maybe with 
more memory efficiency.

The napkin math for what this costs is:
"If you have 1TB of uncompressed data, with 64k chunks that's 16M chunks at 8 
bytes each (128MB).
With 16k chunks, that's 512MB.
With 4k chunks, it's 2G.
Per terabyte of data (pre-compression)."
https://issues.apache.org/jira/browse/CASSANDRA-13241?focusedCommentId=15886621&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15886621
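The napkin math above can be reproduced directly: the number of chunks is the uncompressed size divided by the chunk length, at 8 bytes (one long offset) per chunk. A short sketch (function name is mine):

```python
def offset_memory_bytes(data_bytes: int, chunk_kb: int) -> int:
    """Memory for compressed-chunk offsets: one 8-byte long per chunk."""
    chunk_bytes = chunk_kb * 1024
    num_chunks = data_bytes // chunk_bytes
    return num_chunks * 8

TB = 1024 ** 4  # 1 TiB of uncompressed data
for kb in (64, 16, 4):
    mb = offset_memory_bytes(TB, kb) // (1024 ** 2)
    print(f"{kb:>2}k chunks: {mb:,} MB per TB of uncompressed data")
```

This reproduces the quoted figures: 128 MB at 64k, 512 MB at 16k, and 2 GB at 4k per terabyte of pre-compression data.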

By way of comparison, memory mapping the files has a similar cost: 8 bytes per 
4k page. Multiple mappings make this more expensive. With a default of 16kb this 
would be 4x less expensive than memory mapping a file. I only mention this to 
give a sense of the costs we are already paying; I am not saying they are 
directly related.
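The memory-mapping comparison can be put in numbers with the same kind of arithmetic, assuming 8 bytes of page-table metadata per 4 KiB page as stated above (function names are mine):

```python
PAGE = 4 * 1024  # 4 KiB pages

def mmap_metadata_bytes(file_bytes: int) -> int:
    # ~8 bytes of page-table entry per mapped 4 KiB page, per mapping.
    return (file_bytes // PAGE) * 8

def chunk_offset_bytes(data_bytes: int, chunk_kb: int) -> int:
    # One 8-byte offset per compressed chunk.
    return (data_bytes // (chunk_kb * 1024)) * 8

TB = 1024 ** 4
print(mmap_metadata_bytes(TB) // 2**20, "MB of page tables per TB mapped")
print(chunk_offset_bytes(TB, 16) // 2**20, "MB of 16k chunk offsets per TB")
```

Per terabyte, that is 2048 MB of page-table entries versus 512 MB of 16k chunk offsets: the 4x difference mentioned above.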

I'll wait a week for discussion and if there is consensus make the change.

Regards,
Ariel
