Re: [DISCUSS] Update default disk_access_mode to mmap_index_only on 5.0

2023-09-06 Thread Paulo Motta
> I wonder why disk_access_mode property is not in cassandra.yaml (looking
> into trunk right now)

I think there's a prehistoric reason why it was removed but I can't
remember right now.

> Do you all think we can add it there with brief explanation what each
> option does?

We could reinclude it as long as we provide a clear recommendation on when
to change from the default since this is an advanced setting which should
be rarely changed. But I still think we should provide a more
stable/foolproof default (mmap_index_only) since the current default (mmap)
is known to cause instability in some scenarios.

Also there is a technicality with changing the default, if we change the
"auto" behavior from mmap to mmap_index_only this may affect users relying
on the default "mmap" behavior. Not sure the best way to address that, is a
big NEWS note sufficient? Even though users are expected to read NEWS when
upgrading we know well not all users read it.
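
To make the compatibility concern concrete, here is a rough sketch of how the "auto" value could resolve under the current and the proposed default. This is a simplified model, not the actual Cassandra code: the class and method names (DiskAccessModeSketch, dataMode, indexMode) are made up for illustration, and it assumes mmap_index_only means "mmap the index files, use standard buffered I/O for data files". It also ignores any platform checks the real resolution may perform.

```java
public class DiskAccessModeSketch {
    /** The values discussed in this thread, plus "standard" (plain buffered I/O). */
    enum Mode { auto, mmap, mmap_index_only, standard }

    /** Replace "auto" with whatever the release treats as its default (hypothetical helper). */
    static Mode effective(Mode configured, Mode autoDefault) {
        return configured == Mode.auto ? autoDefault : configured;
    }

    /** Access mode used for data files under this simplified model. */
    static Mode dataMode(Mode configured, Mode autoDefault) {
        Mode m = effective(configured, autoDefault);
        // Assumption: mmap_index_only reads data files with standard I/O.
        return m == Mode.mmap_index_only ? Mode.standard : m;
    }

    /** Access mode used for index files under this simplified model. */
    static Mode indexMode(Mode configured, Mode autoDefault) {
        Mode m = effective(configured, autoDefault);
        // Assumption: mmap_index_only still memory-maps index files.
        return m == Mode.mmap_index_only ? Mode.mmap : m;
    }

    public static void main(String[] args) {
        // A user who never set the property (i.e. runs with "auto").
        System.out.println("current default:  data=" + dataMode(Mode.auto, Mode.mmap)
                + " index=" + indexMode(Mode.auto, Mode.mmap));
        System.out.println("proposed default: data=" + dataMode(Mode.auto, Mode.mmap_index_only)
                + " index=" + indexMode(Mode.auto, Mode.mmap_index_only));
        // A user who set mmap explicitly is unaffected by the new default.
        System.out.println("explicit mmap:    data=" + dataMode(Mode.mmap, Mode.mmap_index_only)
                + " index=" + indexMode(Mode.mmap, Mode.mmap_index_only));
    }
}
```

In this model only users who left the property at "auto" see a change, and only for data files. Anyone who wants the old behavior after an upgrade would need to set mmap explicitly, which is what the NEWS note would have to spell out.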

> Shall we also share this thread with @user?

Thanks Ekaterina! If we decide to change the default we can run this
through the user@ list to see what the user community thinks.

On Wed, Sep 6, 2023 at 4:45 PM Ekaterina Dimitrova wrote:

> Thanks for starting this discussion, Paulo!
>
> Shall we also share this thread with @user?
>
> On Wed, 6 Sep 2023 at 16:35, C. Scott Andreas wrote:
>
>> Supportive of switching the default to mmap_index_only as well.
>>
>> I don’t have numbers handy to share, but my experience has been
>> significantly lower read latency and I wouldn’t run with auto. I’ve also
>> not observed substantial heap pressure after switching - it was strictly an
>> improvement.
>>
>> - Scott
>>
>> —
>> Mobile
>>
>> On Sep 6, 2023, at 8:50 AM, Paulo Motta  wrote:
>>
>> 
>>
>> Hi,
>>
>> I've been bitten by OOMs with disk_access_mode:auto/mmap that were fixed
>> by changing to disk_access_mode:mmap_index_only. In a particular benchmark
>> I got 5x more read throughput on 3.11.x with disk_access_mode:
>> mmap_index_only vs disk_access_mode: auto/mmap.
>>
>> Changing disk_access_mode to mmap_index_only seems to be a common
>> recommendation on forums[1][2][3][4] and slack (find by searching
>> disk_access_mode in the #cassandra channel on https://the-asf.slack.com/
>> ).
>>
>> It's not clear to me when using the default disk_access_mode:auto/mmap is
>> beneficial, perhaps only when the read set fits in memory? Mick seems to
>> think on CASSANDRA-15531 [5], that mmap_index_only has a higher heap cost
>> and should be only used when warranted. However it's not uncommon to see
>> people being bitten with OOMs or lower read performance due to the default
>> disk_access_mode, so it makes me think it's not the best fool-proof default.
>>
>> Should we consider changing default "auto" behavior of "disk_access_mode"
>> to be "mmap_index_only" instead of "mmap" in 5.0 since it's likely safer
>> and perhaps more performant?
>>
>> Thanks,
>>
>> Paulo
>>
>> [1]
>> https://stackoverflow.com/questions/72272035/troubleshooting-and-fixing-cassandra-oom-issue
>> [2] https://phabricator.wikimedia.org/T137419
>> [3] https://stackoverflow.com/a/55975471
>> [4]
>> https://support.datastax.com/s/article/FAQ-Use-of-disk-access-mode-in-DSE-51-and-earlier
>> [5] https://issues.apache.org/jira/browse/CASSANDRA-15531
>>
>>


Re: [DISCUSS] Update default disk_access_mode to mmap_index_only on 5.0

2023-09-06 Thread Ekaterina Dimitrova
Thanks for starting this discussion, Paulo!

Shall we also share this thread with @user?

On Wed, 6 Sep 2023 at 16:35, C. Scott Andreas  wrote:

> Supportive of switching the default to mmap_index_only as well.
>
> I don’t have numbers handy to share, but my experience has been
> significantly lower read latency and I wouldn’t run with auto. I’ve also
> not observed substantial heap pressure after switching - it was strictly an
> improvement.
>
> - Scott
>
> —
> Mobile
>
> On Sep 6, 2023, at 8:50 AM, Paulo Motta  wrote:
>
> 
>
> Hi,
>
> I've been bitten by OOMs with disk_access_mode:auto/mmap that were fixed
> by changing to disk_access_mode:mmap_index_only. In a particular benchmark
> I got 5x more read throughput on 3.11.x with disk_access_mode:
> mmap_index_only vs disk_access_mode: auto/mmap.
>
> Changing disk_access_mode to mmap_index_only seems to be a common
> recommendation on forums[1][2][3][4] and slack (find by searching
> disk_access_mode in the #cassandra channel on https://the-asf.slack.com/).
>
> It's not clear to me when using the default disk_access_mode:auto/mmap is
> beneficial, perhaps only when the read set fits in memory? Mick seems to
> think on CASSANDRA-15531 [5], that mmap_index_only has a higher heap cost
> and should be only used when warranted. However it's not uncommon to see
> people being bitten with OOMs or lower read performance due to the default
> disk_access_mode, so it makes me think it's not the best fool-proof default.
>
> Should we consider changing default "auto" behavior of "disk_access_mode"
> to be "mmap_index_only" instead of "mmap" in 5.0 since it's likely safer
> and perhaps more performant?
>
> Thanks,
>
> Paulo
>
> [1]
> https://stackoverflow.com/questions/72272035/troubleshooting-and-fixing-cassandra-oom-issue
> [2] https://phabricator.wikimedia.org/T137419
> [3] https://stackoverflow.com/a/55975471
> [4]
> https://support.datastax.com/s/article/FAQ-Use-of-disk-access-mode-in-DSE-51-and-earlier
> [5] https://issues.apache.org/jira/browse/CASSANDRA-15531
>
>


Re: [DISCUSS] Update default disk_access_mode to mmap_index_only on 5.0

2023-09-06 Thread Miklosovic, Stefan
I wonder why disk_access_mode property is not in cassandra.yaml (looking into 
trunk right now). Do you all think we can add it there with brief explanation 
what each option does?


From: Caleb Rackliffe 
Sent: Wednesday, September 6, 2023 21:08
To: dev@cassandra.apache.org
Subject: Re: [DISCUSS] Update default disk_access_mode to mmap_index_only on 5.0

+100 to this

We'd have to come up w/ a pretty compelling counterexample to NOT switch the 
default to mmap_index_only at this point.

On Wed, Sep 6, 2023 at 11:40 AM Brandon Williams wrote:
Given https://issues.apache.org/jira/browse/CASSANDRA-17237 I think it
makes sense.  At the least I think we should restore disk_access_mode
so that users are more aware of the options available.

Kind Regards,
Brandon

On Wed, Sep 6, 2023 at 10:50 AM Paulo Motta wrote:
>
> Hi,
>
> I've been bitten by OOMs with disk_access_mode:auto/mmap that were fixed by 
> changing to disk_access_mode:mmap_index_only. In a particular benchmark I got 
> 5x more read throughput on 3.11.x with disk_access_mode: mmap_index_only vs 
> disk_access_mode: auto/mmap.
>
> Changing disk_access_mode to mmap_index_only seems to be a common 
> recommendation on forums[1][2][3][4] and slack (find by searching 
> disk_access_mode in the #cassandra channel on 
> https://the-asf.slack.com/).
>
> It's not clear to me when using the default disk_access_mode:auto/mmap is 
> beneficial, perhaps only when the read set fits in memory? Mick seems to 
> think on CASSANDRA-15531 [5], that mmap_index_only has a higher heap cost and 
> should be only used when warranted. However it's not uncommon to see people 
> being bitten with OOMs or lower read performance due to the default 
> disk_access_mode, so it makes me think it's not the best fool-proof default.
>
> Should we consider changing default "auto" behavior of "disk_access_mode" to 
> be "mmap_index_only" instead of "mmap" in 5.0 since it's likely safer and 
> perhaps more performant?
>
> Thanks,
>
> Paulo
>
> [1] 
> https://stackoverflow.com/questions/72272035/troubleshooting-and-fixing-cassandra-oom-issue
> [2] 
> https://phabricator.wikimedia.org/T137419
> [3] 
> https://stackoverflow.com/a/55975471
> [4] 
> https://support.datastax.com/s/article/FAQ-Use-of-disk-access-mode-in-DSE-51-and-earlier

Re: [DISCUSS] Update default disk_access_mode to mmap_index_only on 5.0

2023-09-06 Thread Caleb Rackliffe
+100 to this

We'd have to come up w/ a pretty compelling counterexample to NOT switch
the default to mmap_index_only at this point.

On Wed, Sep 6, 2023 at 11:40 AM Brandon Williams  wrote:

> Given https://issues.apache.org/jira/browse/CASSANDRA-17237 I think it
> makes sense.  At the least I think we should restore disk_access_mode
> so that users are more aware of the options available.
>
> Kind Regards,
> Brandon
>
> On Wed, Sep 6, 2023 at 10:50 AM Paulo Motta wrote:
> >
> > Hi,
> >
> > I've been bitten by OOMs with disk_access_mode:auto/mmap that were fixed
> > by changing to disk_access_mode:mmap_index_only. In a particular benchmark
> > I got 5x more read throughput on 3.11.x with disk_access_mode:
> > mmap_index_only vs disk_access_mode: auto/mmap.
> >
> > Changing disk_access_mode to mmap_index_only seems to be a common
> > recommendation on forums[1][2][3][4] and slack (find by searching
> > disk_access_mode in the #cassandra channel on https://the-asf.slack.com/).
> >
> > It's not clear to me when using the default disk_access_mode:auto/mmap
> > is beneficial, perhaps only when the read set fits in memory? Mick seems to
> > think on CASSANDRA-15531 [5], that mmap_index_only has a higher heap cost
> > and should be only used when warranted. However it's not uncommon to see
> > people being bitten with OOMs or lower read performance due to the default
> > disk_access_mode, so it makes me think it's not the best fool-proof default.
> >
> > Should we consider changing default "auto" behavior of
> > "disk_access_mode" to be "mmap_index_only" instead of "mmap" in 5.0 since
> > it's likely safer and perhaps more performant?
> >
> > Thanks,
> >
> > Paulo
> >
> > [1]
> > https://stackoverflow.com/questions/72272035/troubleshooting-and-fixing-cassandra-oom-issue
> > [2] https://phabricator.wikimedia.org/T137419
> > [3] https://stackoverflow.com/a/55975471
> > [4]
> > https://support.datastax.com/s/article/FAQ-Use-of-disk-access-mode-in-DSE-51-and-earlier
> > [5] https://issues.apache.org/jira/browse/CASSANDRA-15531
>


Re: [DISCUSS] Update default disk_access_mode to mmap_index_only on 5.0

2023-09-06 Thread Brandon Williams
Given https://issues.apache.org/jira/browse/CASSANDRA-17237 I think it
makes sense.  At the least I think we should restore disk_access_mode
so that users are more aware of the options available.

Kind Regards,
Brandon

On Wed, Sep 6, 2023 at 10:50 AM Paulo Motta  wrote:
>
> Hi,
>
> I've been bitten by OOMs with disk_access_mode:auto/mmap that were fixed by 
> changing to disk_access_mode:mmap_index_only. In a particular benchmark I got 
> 5x more read throughput on 3.11.x with disk_access_mode: mmap_index_only vs 
> disk_access_mode: auto/mmap.
>
> Changing disk_access_mode to mmap_index_only seems to be a common 
> recommendation on forums[1][2][3][4] and slack (find by searching 
> disk_access_mode in the #cassandra channel on https://the-asf.slack.com/).
>
> It's not clear to me when using the default disk_access_mode:auto/mmap is 
> beneficial, perhaps only when the read set fits in memory? Mick seems to 
> think on CASSANDRA-15531 [5], that mmap_index_only has a higher heap cost and 
> should be only used when warranted. However it's not uncommon to see people 
> being bitten with OOMs or lower read performance due to the default 
> disk_access_mode, so it makes me think it's not the best fool-proof default.
>
> Should we consider changing default "auto" behavior of "disk_access_mode" to 
> be "mmap_index_only" instead of "mmap" in 5.0 since it's likely safer and 
> perhaps more performant?
>
> Thanks,
>
> Paulo
>
> [1] 
> https://stackoverflow.com/questions/72272035/troubleshooting-and-fixing-cassandra-oom-issue
> [2] https://phabricator.wikimedia.org/T137419
> [3] https://stackoverflow.com/a/55975471
> [4] 
> https://support.datastax.com/s/article/FAQ-Use-of-disk-access-mode-in-DSE-51-and-earlier
> [5] https://issues.apache.org/jira/browse/CASSANDRA-15531


Re: [DISCUSS] Update default disk_access_mode to mmap_index_only on 5.0

2023-09-06 Thread C. Scott Andreas
Supportive of switching the default to mmap_index_only as well.

I don’t have numbers handy to share, but my experience has been significantly 
lower read latency and I wouldn’t run with auto. I’ve also not observed 
substantial heap pressure after switching - it was strictly an improvement.

- Scott

—
Mobile

> On Sep 6, 2023, at 8:50 AM, Paulo Motta  wrote:
> 
> 
> Hi,
> 
> I've been bitten by OOMs with disk_access_mode:auto/mmap that were fixed by 
> changing to disk_access_mode:mmap_index_only. In a particular benchmark I got 
> 5x more read throughput on 3.11.x with disk_access_mode: mmap_index_only vs 
> disk_access_mode: auto/mmap.
> 
> Changing disk_access_mode to mmap_index_only seems to be a common 
> recommendation on forums[1][2][3][4] and slack (find by searching 
> disk_access_mode in the #cassandra channel on https://the-asf.slack.com/).
> 
> It's not clear to me when using the default disk_access_mode:auto/mmap is 
> beneficial, perhaps only when the read set fits in memory? Mick seems to 
> think on CASSANDRA-15531 [5], that mmap_index_only has a higher heap cost and 
> should be only used when warranted. However it's not uncommon to see people 
> being bitten with OOMs or lower read performance due to the default 
> disk_access_mode, so it makes me think it's not the best fool-proof default.
> 
> Should we consider changing default "auto" behavior of "disk_access_mode" to 
> be "mmap_index_only" instead of "mmap" in 5.0 since it's likely safer and 
> perhaps more performant?
> 
> Thanks,
> 
> Paulo
> 
> [1] 
> https://stackoverflow.com/questions/72272035/troubleshooting-and-fixing-cassandra-oom-issue
> [2] https://phabricator.wikimedia.org/T137419
> [3] https://stackoverflow.com/a/55975471
> [4] 
> https://support.datastax.com/s/article/FAQ-Use-of-disk-access-mode-in-DSE-51-and-earlier
> [5] https://issues.apache.org/jira/browse/CASSANDRA-15531


[DISCUSS] Update default disk_access_mode to mmap_index_only on 5.0

2023-09-06 Thread Paulo Motta
Hi,

I've been bitten by OOMs with disk_access_mode:auto/mmap that were fixed by
changing to disk_access_mode:mmap_index_only. In a particular benchmark I
got 5x more read throughput on 3.11.x with disk_access_mode:
mmap_index_only vs disk_access_mode: auto/mmap.

Changing disk_access_mode to mmap_index_only seems to be a common
recommendation on forums[1][2][3][4] and slack (find by searching
disk_access_mode in the #cassandra channel on https://the-asf.slack.com/).

It's not clear to me when using the default disk_access_mode:auto/mmap is
beneficial, perhaps only when the read set fits in memory? Mick seems to
think on CASSANDRA-15531 [5], that mmap_index_only has a higher heap cost
and should be only used when warranted. However it's not uncommon to see
people being bitten with OOMs or lower read performance due to the default
disk_access_mode, so it makes me think it's not the best fool-proof default.

Should we consider changing default "auto" behavior of "disk_access_mode"
to be "mmap_index_only" instead of "mmap" in 5.0 since it's likely safer
and perhaps more performant?

Thanks,

Paulo

[1]
https://stackoverflow.com/questions/72272035/troubleshooting-and-fixing-cassandra-oom-issue
[2] https://phabricator.wikimedia.org/T137419
[3] https://stackoverflow.com/a/55975471
[4]
https://support.datastax.com/s/article/FAQ-Use-of-disk-access-mode-in-DSE-51-and-earlier
[5] https://issues.apache.org/jira/browse/CASSANDRA-15531


Re: [Discuss] disabling io.netty.transport.noNative in tests

2023-09-06 Thread Alex Petrov
I think most of the time people actually use netty _with_ native. This might
have been introduced when we tried to make shaded in-JVM dtest jars. If
all tests are passing, and we actually do have a confirmation that native Netty
is being used, I would say +1 to remove `noNative`.

Just to make sure though, did you have a chance to see if the upgrade tests 
also work fine?
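
As a starting point for that confirmation, a test could at least log whether the flag is still set in the JVM it runs in. The sketch below uses only the JDK; the class name NoNativeCheck is made up, and Boolean.parseBoolean is only an approximation of how Netty itself interprets the property. It reports the flag, not whether the native library actually loaded. With Netty on the classpath on Linux, io.netty.channel.epoll.Epoll.isAvailable() would be the more direct check.

```java
public class NoNativeCheck {
    /** The system property discussed in this thread. */
    static final String NO_NATIVE = "io.netty.transport.noNative";

    /** True when the property asks for native transports to be skipped (approximate parsing). */
    static boolean nativeDisabled() {
        return Boolean.parseBoolean(System.getProperty(NO_NATIVE, "false"));
    }

    public static void main(String[] args) {
        // Only reports the flag; it does not prove a native transport was loaded.
        System.out.println(NO_NATIVE + " disables native transports: " + nativeDisabled());
    }
}
```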

On Thu, Aug 31, 2023, at 1:20 PM, Miklosovic, Stefan wrote:
> Hi list,
> 
> Currently, we are skipping the usage of native libraries in Netty as part of 
> testing here (1).
> 
> In the 5.0 branch, we upgraded Netty to 4.1.96 and we brought all native
> dependencies to the class path so they are there at runtime (x86, arm, mac).
>
> I conducted a few CI tests for 5.0+ and not having
> "io.netty.transport.noNative" set to "true" introduces no errors. I think we
> were just too motivated here to skip stuff left and right. Having this
> property enabled seems to have no functional effect. Also, one negative
> side-effect of having this property enabled is that it logs exceptions when
> running in-jvm-dtests, e.g. in IDEA, which pollutes the logs unnecessarily and
> is just visual clutter to deal with every time. To silence this, I set
> (2) so it skips the logic in (3) completely, hence no unnecessary logging
> will occur.
> 
> My question is whether we should not remove (4) in 5.0, that means that tests 
> will use native libraries too. That also means that we are running tests 
> closer to a production environment. I just do not see any reason why we 
> should skip this when all tests are just passing with it too with additional 
> benefit of not seeing an exception logged every time when testing it locally.
> 
> Thanks
> 
> (1) 
> https://github.com/apache/cassandra-in-jvm-dtest-api/blob/trunk/src/main/java/org/apache/cassandra/distributed/api/ICluster.java#L95-L102
> (2) 
> https://github.com/apache/cassandra/blob/trunk/test/distributed/org/apache/cassandra/distributed/impl/AbstractCluster.java#L196
> (3) 
> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/NativeTransportService.java#L163
> (4) 
> https://github.com/apache/cassandra-in-jvm-dtest-api/blob/trunk/src/main/java/org/apache/cassandra/distributed/api/ICluster.java#L101