Re: [DISCUSS] Update default disk_access_mode to mmap_index_only on 5.0

2023-11-15 Thread Caleb Rackliffe
Added one nit to the PR. Otherwise, this is awesome :)

On Wed, Nov 15, 2023 at 11:01 AM Jordan West  wrote:

> I would also like to back this proposal. We change this default because
> several incidents have occurred by leaving the default of auto. There are
> rare cases where auto/mmap is the better option but as for a default
> mmap_index_only is safer.
>
> On Wed, Nov 15, 2023 at 6:35 AM Paulo Motta  wrote:
>
>> Hi,
>>
>> I would like to get back to this. I proposed this default configuration
>> change on the user list ~1 month ago and there were no comments [1].
>>
>> I created CASSANDRA-19021 [2] to make the proposed change and Stefan
>> kindly submitted a patch, CI is looking good.
>>
>> Any objections to making this change in 5.0? If not, we will merge in 24
>> hours.
>>
>> Thanks,
>>
>> Paulo
>>
>> [1] - https://lists.apache.org/thread/w0gkdj7fhylycqwmd73p0kfck7jr8qth
>> [2] - https://issues.apache.org/jira/browse/CASSANDRA-19021
>>
>> On Wed, Sep 6, 2023 at 5:12 PM Paulo Motta 
>> wrote:
>>
>>> > I wonder why disk_access_mode property is not in cassandra.yaml
>>> (looking into trunk right now)
>>>
>>> I think there's a prehistoric reason why it was removed but I can't
>>> remember right now.
>>>
>>> > Do you all think we can add it there with brief explanation what each
>>> option does?
>>>
>>> We could reinclude it as long as we provide a clear recommendation on
>>> when to change from the default since this is an advanced setting which
>>> should be rarely changed. But I still think we should provide a more
>>> stable/foolproof default (mmap_index_only) since the current default (mmap)
>>> is known to cause instability in some scenarios.
>>>
>>> Also there is a technicality with changing the default, if we change the
>>> "auto" behavior from mmap to mmap_index_only this may affect users relying
>>> on the default "mmap" behavior. Not sure the best way to address that, is a
>>> big NEWS note sufficient? Even though users are expected to read NEWS when
>>> upgrading we know well not all users read it.
>>>
>>> > Shall we also share this thread with @user?
>>>
>>> Thanks Ekaterina! If we decide to change the default we can run this
>>> through the user@ list to see what the user community thinks.
>>>
>>> On Wed, Sep 6, 2023 at 4:45 PM Ekaterina Dimitrova wrote:
>>>
>>>> Thanks for starting this discussion, Paulo!
>>>>
>>>> Shall we also share this thread with @user?
>>>>
>>>> On Wed, 6 Sep 2023 at 16:35, C. Scott Andreas wrote:
>>>>
>>>>> Supportive of switching the default to mmap_index_only as well.
>>>>>
>>>>> I don’t have numbers handy to share, but my experience has been
>>>>> significantly lower read latency and I wouldn’t run with auto. I’ve also
>>>>> not observed substantial heap pressure after switching - it was strictly an
>>>>> improvement.
>>>>>
>>>>> - Scott
>>>>>
>>>>> —
>>>>> Mobile
>>>>>
>>>>> On Sep 6, 2023, at 8:50 AM, Paulo Motta wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> I've been bitten by OOMs with disk_access_mode:auto/mmap that were
>>>>> fixed by changing to disk_access_mode:mmap_index_only. In a particular
>>>>> benchmark I got 5x more read throughput on 3.11.x with disk_access_mode:
>>>>> mmap_index_only vs disk_access_mode: auto/mmap.
>>>>>
>>>>> Changing disk_access_mode to mmap_index_only seems to be a common
>>>>> recommendation on forums[1][2][3][4] and slack (find by searching
>>>>> disk_access_mode in the #cassandra channel on
>>>>> https://the-asf.slack.com/).
>>>>>
>>>>> It's not clear to me when using the default
>>>>> disk_access_mode:auto/mmap is beneficial, perhaps only when the read set
>>>>> fits in memory? Mick seems to think on CASSANDRA-15531 [5], that
>>>>> mmap_index_only has a higher heap cost and should be only used when
>>>>> warranted. However it's not uncommon to see people being bitten with OOMs
>>>>> or lower read performance due to the default disk_access_mode, so it makes
>>>>> me think it's not the best fool-proof default.
>>>>>
>>>>> Should we consider changing default "auto" behavior of
>>>>> "disk_access_mode" to be "mmap_index_only" instead of "mmap" in 5.0 since
>>>>> it's likely safer and perhaps more performant?
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Paulo
>>>>>
>>>>> [1]
>>>>> https://stackoverflow.com/questions/72272035/troubleshooting-and-fixing-cassandra-oom-issue
>>>>> [2] https://phabricator.wikimedia.org/T137419
>>>>> [3] https://stackoverflow.com/a/55975471
>>>>> [4]
>>>>> https://support.datastax.com/s/article/FAQ-Use-of-disk-access-mode-in-DSE-51-and-earlier
>>>>> [5] https://issues.apache.org/jira/browse/CASSANDRA-15531
>>>>>
>>>>>


Re: [DISCUSS] Update default disk_access_mode to mmap_index_only on 5.0

2023-11-15 Thread Jordan West
I would also like to back this proposal. We change this default because
several incidents have occurred by leaving the default of auto. There are
rare cases where auto/mmap is the better option, but as a default,
mmap_index_only is safer.

On Wed, Nov 15, 2023 at 6:35 AM Paulo Motta  wrote:

> Hi,
>
> I would like to get back to this. I proposed this default configuration
> change on the user list ~1 month ago and there were no comments [1].
>
> I created CASSANDRA-19021 [2] to make the proposed change and Stefan
> kindly submitted a patch, CI is looking good.
>
> Any objections to making this change in 5.0? If not, we will merge in 24
> hours.
>
> Thanks,
>
> Paulo
>
> [1] - https://lists.apache.org/thread/w0gkdj7fhylycqwmd73p0kfck7jr8qth
> [2] - https://issues.apache.org/jira/browse/CASSANDRA-19021
>
> On Wed, Sep 6, 2023 at 5:12 PM Paulo Motta 
> wrote:
>
>> > I wonder why disk_access_mode property is not in cassandra.yaml
>> (looking into trunk right now)
>>
>> I think there's a prehistoric reason why it was removed but I can't
>> remember right now.
>>
>> > Do you all think we can add it there with brief explanation what each
>> option does?
>>
>> We could reinclude it as long as we provide a clear recommendation on
>> when to change from the default since this is an advanced setting which
>> should be rarely changed. But I still think we should provide a more
>> stable/foolproof default (mmap_index_only) since the current default (mmap)
>> is known to cause instability in some scenarios.
>>
>> Also there is a technicality with changing the default, if we change the
>> "auto" behavior from mmap to mmap_index_only this may affect users relying
>> on the default "mmap" behavior. Not sure the best way to address that, is a
>> big NEWS note sufficient? Even though users are expected to read NEWS when
>> upgrading we know well not all users read it.
>>
>> > Shall we also share this thread with @user?
>>
>> Thanks Ekaterina! If we decide to change the default we can run this
>> through the user@ list to see what the user community thinks.
>>
>> On Wed, Sep 6, 2023 at 4:45 PM Ekaterina Dimitrova 
>> wrote:
>>
>>> Thanks for starting this discussion, Paulo!
>>>
>>> Shall we also share this thread with @user?
>>>
>>> On Wed, 6 Sep 2023 at 16:35, C. Scott Andreas 
>>> wrote:
>>>
>>>> Supportive of switching the default to mmap_index_only as well.
>>>>
>>>> I don’t have numbers handy to share, but my experience has been
>>>> significantly lower read latency and I wouldn’t run with auto. I’ve also
>>>> not observed substantial heap pressure after switching - it was strictly an
>>>> improvement.
>>>>
>>>> - Scott
>>>>
>>>> —
>>>> Mobile
>>>>
>>>> On Sep 6, 2023, at 8:50 AM, Paulo Motta wrote:
>>>>
>>>> Hi,
>>>>
>>>> I've been bitten by OOMs with disk_access_mode:auto/mmap that were
>>>> fixed by changing to disk_access_mode:mmap_index_only. In a particular
>>>> benchmark I got 5x more read throughput on 3.11.x with disk_access_mode:
>>>> mmap_index_only vs disk_access_mode: auto/mmap.
>>>>
>>>> Changing disk_access_mode to mmap_index_only seems to be a common
>>>> recommendation on forums[1][2][3][4] and slack (find by searching
>>>> disk_access_mode in the #cassandra channel on
>>>> https://the-asf.slack.com/).
>>>>
>>>> It's not clear to me when using the default
>>>> disk_access_mode:auto/mmap is beneficial, perhaps only when the read set
>>>> fits in memory? Mick seems to think on CASSANDRA-15531 [5], that
>>>> mmap_index_only has a higher heap cost and should be only used when
>>>> warranted. However it's not uncommon to see people being bitten with OOMs
>>>> or lower read performance due to the default disk_access_mode, so it makes
>>>> me think it's not the best fool-proof default.
>>>>
>>>> Should we consider changing default "auto" behavior of
>>>> "disk_access_mode" to be "mmap_index_only" instead of "mmap" in 5.0 since
>>>> it's likely safer and perhaps more performant?
>>>>
>>>> Thanks,
>>>>
>>>> Paulo
>>>>
>>>> [1]
>>>> https://stackoverflow.com/questions/72272035/troubleshooting-and-fixing-cassandra-oom-issue
>>>> [2] https://phabricator.wikimedia.org/T137419
>>>> [3] https://stackoverflow.com/a/55975471
>>>> [4]
>>>> https://support.datastax.com/s/article/FAQ-Use-of-disk-access-mode-in-DSE-51-and-earlier
>>>> [5] https://issues.apache.org/jira/browse/CASSANDRA-15531



Re: [DISCUSS] Update default disk_access_mode to mmap_index_only on 5.0

2023-11-15 Thread Paulo Motta
Hi,

I would like to get back to this. I proposed this default configuration
change on the user list ~1 month ago and there were no comments [1].

I created CASSANDRA-19021 [2] to make the proposed change, and Stefan kindly
submitted a patch. CI is looking good.

Any objections to making this change in 5.0? If not, we will merge in 24
hours.

Thanks,

Paulo

[1] - https://lists.apache.org/thread/w0gkdj7fhylycqwmd73p0kfck7jr8qth
[2] - https://issues.apache.org/jira/browse/CASSANDRA-19021

On Wed, Sep 6, 2023 at 5:12 PM Paulo Motta  wrote:

> > I wonder why disk_access_mode property is not in cassandra.yaml (looking
> into trunk right now)
>
> I think there's a prehistoric reason why it was removed but I can't
> remember right now.
>
> > Do you all think we can add it there with brief explanation what each
> option does?
>
> We could reinclude it as long as we provide a clear recommendation on when
> to change from the default since this is an advanced setting which should
> be rarely changed. But I still think we should provide a more
> stable/foolproof default (mmap_index_only) since the current default (mmap)
> is known to cause instability in some scenarios.
>
> Also there is a technicality with changing the default, if we change the
> "auto" behavior from mmap to mmap_index_only this may affect users relying
> on the default "mmap" behavior. Not sure the best way to address that, is a
> big NEWS note sufficient? Even though users are expected to read NEWS when
> upgrading we know well not all users read it.
>
> > Shall we also share this thread with @user?
>
> Thanks Ekaterina! If we decide to change the default we can run this
> through the user@ list to see what the user community thinks.
>
> On Wed, Sep 6, 2023 at 4:45 PM Ekaterina Dimitrova 
> wrote:
>
>> Thanks for starting this discussion, Paulo!
>>
>> Shall we also share this thread with @user?
>>
>> On Wed, 6 Sep 2023 at 16:35, C. Scott Andreas 
>> wrote:
>>
>>> Supportive of switching the default to mmap_index_only as well.
>>>
>>> I don’t have numbers handy to share, but my experience has been
>>> significantly lower read latency and I wouldn’t run with auto. I’ve also
>>> not observed substantial heap pressure after switching - it was strictly an
>>> improvement.
>>>
>>> - Scott
>>>
>>> —
>>> Mobile
>>>
>>> On Sep 6, 2023, at 8:50 AM, Paulo Motta 
>>> wrote:
>>>
>>> 
>>>
>>> Hi,
>>>
>>> I've been bitten by OOMs with disk_access_mode:auto/mmap that were fixed
>>> by changing to disk_access_mode:mmap_index_only. In a particular benchmark
>>> I got 5x more read throughput on 3.11.x with disk_access_mode:
>>> mmap_index_only vs disk_access_mode: auto/mmap.
>>>
>>> Changing disk_access_mode to mmap_index_only seems to be a common
>>> recommendation on forums[1][2][3][4] and slack (find by searching
>>> disk_access_mode in the #cassandra channel on https://the-asf.slack.com/
>>> ).
>>>
>>> It's not clear to me when using the default
>>> disk_access_mode:auto/mmap is beneficial, perhaps only when the read set
>>> fits in memory? Mick seems to think on CASSANDRA-15531 [5], that
>>> mmap_index_only has a higher heap cost and should be only used when
>>> warranted. However it's not uncommon to see people being bitten with OOMs
>>> or lower read performance due to the default disk_access_mode, so it makes
>>> me think it's not the best fool-proof default.
>>>
>>> Should we consider changing default "auto" behavior of
>>> "disk_access_mode" to be "mmap_index_only" instead of "mmap" in 5.0 since
>>> it's likely safer and perhaps more performant?
>>>
>>> Thanks,
>>>
>>> Paulo
>>>
>>> [1]
>>> https://stackoverflow.com/questions/72272035/troubleshooting-and-fixing-cassandra-oom-issue
>>> [2] https://phabricator.wikimedia.org/T137419
>>> [3] https://stackoverflow.com/a/55975471
>>> [4]
>>> https://support.datastax.com/s/article/FAQ-Use-of-disk-access-mode-in-DSE-51-and-earlier
>>> [5] https://issues.apache.org/jira/browse/CASSANDRA-15531
>>>
>>>


Re: [DISCUSS] Update default disk_access_mode to mmap_index_only on 5.0

2023-09-06 Thread Paulo Motta
> I wonder why disk_access_mode property is not in cassandra.yaml (looking
into trunk right now)

I think there's a prehistoric reason why it was removed but I can't
remember right now.

> Do you all think we can add it there with brief explanation what each
option does?

We could reinclude it as long as we provide a clear recommendation on when
to change from the default since this is an advanced setting which should
be rarely changed. But I still think we should provide a more
stable/foolproof default (mmap_index_only) since the current default (mmap)
is known to cause instability in some scenarios.

Also, there is a technicality with changing the default: if we change the
"auto" behavior from mmap to mmap_index_only, this may affect users relying
on the default "mmap" behavior. I'm not sure of the best way to address that.
Is a big NEWS note sufficient? Even though users are expected to read NEWS
when upgrading, we know well that not all users read it.
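
One way to picture who is affected by that technicality is a small pre-upgrade check. The sketch below is only an illustration and not part of the proposed patch: it assumes the setting appears as a top-level `disk_access_mode:` key in cassandra.yaml, and the function names are hypothetical. It treats a node as affected when the key is absent or set to "auto", since only those nodes follow the default.

```python
import re


def explicit_disk_access_mode(yaml_text: str):
    """Return the explicitly configured disk_access_mode, or None if absent.

    Hypothetical helper: scans for an uncommented top-level key instead of
    parsing YAML, which is enough for a rough pre-upgrade check.
    """
    for line in yaml_text.splitlines():
        # Drop trailing comments, then surrounding whitespace.
        stripped = line.split("#", 1)[0].strip()
        match = re.match(r"^disk_access_mode\s*:\s*(\S+)$", stripped)
        if match:
            return match.group(1).strip("'\"")
    return None


def affected_by_default_change(yaml_text: str) -> bool:
    """True when the node relies on the default, i.e. key missing or 'auto'."""
    mode = explicit_disk_access_mode(yaml_text)
    return mode is None or mode == "auto"


if __name__ == "__main__":
    pinned = "cluster_name: demo\ndisk_access_mode: mmap\n"
    unpinned = "cluster_name: demo\n"
    print(affected_by_default_change(pinned))
    print(affected_by_default_change(unpinned))
```

Under these assumptions, a node that pins `disk_access_mode: mmap` keeps its behavior across the upgrade, while a node with no entry picks up whatever "auto" resolves to in the new version.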

> Shall we also share this thread with @user?

Thanks Ekaterina! If we decide to change the default we can run this
through the user@ list to see what the user community thinks.

On Wed, Sep 6, 2023 at 4:45 PM Ekaterina Dimitrova 
wrote:

> Thanks for starting this discussion, Paulo!
>
> Shall we also share this thread with @user?
>
> On Wed, 6 Sep 2023 at 16:35, C. Scott Andreas 
> wrote:
>
>> Supportive of switching the default to mmap_index_only as well.
>>
>> I don’t have numbers handy to share, but my experience has been
>> significantly lower read latency and I wouldn’t run with auto. I’ve also
>> not observed substantial heap pressure after switching - it was strictly an
>> improvement.
>>
>> - Scott
>>
>> —
>> Mobile
>>
>> On Sep 6, 2023, at 8:50 AM, Paulo Motta  wrote:
>>
>> 
>>
>> Hi,
>>
>> I've been bitten by OOMs with disk_access_mode:auto/mmap that were fixed
>> by changing to disk_access_mode:mmap_index_only. In a particular benchmark
>> I got 5x more read throughput on 3.11.x with disk_access_mode:
>> mmap_index_only vs disk_access_mode: auto/mmap.
>>
>> Changing disk_access_mode to mmap_index_only seems to be a common
>> recommendation on forums[1][2][3][4] and slack (find by searching
>> disk_access_mode in the #cassandra channel on https://the-asf.slack.com/
>> ).
>>
>> It's not clear to me when using the default disk_access_mode:auto/mmap is
>> beneficial, perhaps only when the read set fits in memory? Mick seems to
>> think on CASSANDRA-15531 [5], that mmap_index_only has a higher heap cost
>> and should be only used when warranted. However it's not uncommon to see
>> people being bitten with OOMs or lower read performance due to the default
>> disk_access_mode, so it makes me think it's not the best fool-proof default.
>>
>> Should we consider changing default "auto" behavior of "disk_access_mode"
>> to be "mmap_index_only" instead of "mmap" in 5.0 since it's likely safer
>> and perhaps more performant?
>>
>> Thanks,
>>
>> Paulo
>>
>> [1]
>> https://stackoverflow.com/questions/72272035/troubleshooting-and-fixing-cassandra-oom-issue
>> [2] https://phabricator.wikimedia.org/T137419
>> [3] https://stackoverflow.com/a/55975471
>> [4]
>> https://support.datastax.com/s/article/FAQ-Use-of-disk-access-mode-in-DSE-51-and-earlier
>> [5] https://issues.apache.org/jira/browse/CASSANDRA-15531
>>
>>


Re: [DISCUSS] Update default disk_access_mode to mmap_index_only on 5.0

2023-09-06 Thread Ekaterina Dimitrova
Thanks for starting this discussion, Paulo!

Shall we also share this thread with @user?

On Wed, 6 Sep 2023 at 16:35, C. Scott Andreas  wrote:

> Supportive of switching the default to mmap_index_only as well.
>
> I don’t have numbers handy to share, but my experience has been
> significantly lower read latency and I wouldn’t run with auto. I’ve also
> not observed substantial heap pressure after switching - it was strictly an
> improvement.
>
> - Scott
>
> —
> Mobile
>
> On Sep 6, 2023, at 8:50 AM, Paulo Motta  wrote:
>
> 
>
> Hi,
>
> I've been bitten by OOMs with disk_access_mode:auto/mmap that were fixed
> by changing to disk_access_mode:mmap_index_only. In a particular benchmark
> I got 5x more read throughput on 3.11.x with disk_access_mode:
> mmap_index_only vs disk_access_mode: auto/mmap.
>
> Changing disk_access_mode to mmap_index_only seems to be a common
> recommendation on forums[1][2][3][4] and slack (find by searching
> disk_access_mode in the #cassandra channel on https://the-asf.slack.com/).
>
> It's not clear to me when using the default disk_access_mode:auto/mmap is
> beneficial, perhaps only when the read set fits in memory? Mick seems to
> think on CASSANDRA-15531 [5], that mmap_index_only has a higher heap cost
> and should be only used when warranted. However it's not uncommon to see
> people being bitten with OOMs or lower read performance due to the default
> disk_access_mode, so it makes me think it's not the best fool-proof default.
>
> Should we consider changing default "auto" behavior of "disk_access_mode"
> to be "mmap_index_only" instead of "mmap" in 5.0 since it's likely safer
> and perhaps more performant?
>
> Thanks,
>
> Paulo
>
> [1]
> https://stackoverflow.com/questions/72272035/troubleshooting-and-fixing-cassandra-oom-issue
> [2] https://phabricator.wikimedia.org/T137419
> [3] https://stackoverflow.com/a/55975471
> [4]
> https://support.datastax.com/s/article/FAQ-Use-of-disk-access-mode-in-DSE-51-and-earlier
> [5] https://issues.apache.org/jira/browse/CASSANDRA-15531
>
>


Re: [DISCUSS] Update default disk_access_mode to mmap_index_only on 5.0

2023-09-06 Thread Miklosovic, Stefan
I wonder why the disk_access_mode property is not in cassandra.yaml (looking
into trunk right now). Do you all think we can add it there with a brief
explanation of what each option does?


From: Caleb Rackliffe 
Sent: Wednesday, September 6, 2023 21:08
To: dev@cassandra.apache.org
Subject: Re: [DISCUSS] Update default disk_access_mode to mmap_index_only on 5.0


+100 to this

We'd have to come up w/ a pretty compelling counterexample to NOT switch the 
default to mmap_index_only at this point.

On Wed, Sep 6, 2023 at 11:40 AM Brandon Williams wrote:
Given https://issues.apache.org/jira/browse/CASSANDRA-17237 I think it
makes sense.  At the least I think we should restore disk_access_mode
so that users are more aware of the options available.

Kind Regards,
Brandon

On Wed, Sep 6, 2023 at 10:50 AM Paulo Motta wrote:
>
> Hi,
>
> I've been bitten by OOMs with disk_access_mode:auto/mmap that were fixed by 
> changing to disk_access_mode:mmap_index_only. In a particular benchmark I got 
> 5x more read throughput on 3.11.x with disk_access_mode: mmap_index_only vs 
> disk_access_mode: auto/mmap.
>
> Changing disk_access_mode to mmap_index_only seems to be a common 
> recommendation on forums[1][2][3][4] and slack (find by searching 
> disk_access_mode in the #cassandra channel on 
> https://the-asf.slack.com/).
>
> It's not clear to me when using the default disk_access_mode:auto/mmap is 
> beneficial, perhaps only when the read set fits in memory? Mick seems to 
> think on CASSANDRA-15531 [5], that mmap_index_only has a higher heap cost and 
> should be only used when warranted. However it's not uncommon to see people 
> being bitten with OOMs or lower read performance due to the default 
> disk_access_mode, so it makes me think it's not the best fool-proof default.
>
> Should we consider changing default "auto" behavior of "disk_access_mode" to 
> be "mmap_index_only" instead of "mmap" in 5.0 since it's likely safer and 
> perhaps more performant?
>
> Thanks,
>
> Paulo
>
> [1]
> https://stackoverflow.com/questions/72272035/troubleshooting-and-fixing-cassandra-oom-issue
> [2]
> https://phabricator.wikimedia.org/T137419
> [3]
> https://stackoverflow.com/a/55975471
> [4]
> https://support.datastax.com/s/article/FAQ-Use-of-disk-access-mode-in-DSE-51-and-earlier

Re: [DISCUSS] Update default disk_access_mode to mmap_index_only on 5.0

2023-09-06 Thread Caleb Rackliffe
+100 to this

We'd have to come up w/ a pretty compelling counterexample to NOT switch
the default to mmap_index_only at this point.

On Wed, Sep 6, 2023 at 11:40 AM Brandon Williams  wrote:

> Given https://issues.apache.org/jira/browse/CASSANDRA-17237 I think it
> makes sense.  At the least I think we should restore disk_access_mode
> so that users are more aware of the options available.
>
> Kind Regards,
> Brandon
>
> On Wed, Sep 6, 2023 at 10:50 AM Paulo Motta 
> wrote:
> >
> > Hi,
> >
> > I've been bitten by OOMs with disk_access_mode:auto/mmap that were fixed
> by changing to disk_access_mode:mmap_index_only. In a particular benchmark
> I got 5x more read throughput on 3.11.x with disk_access_mode:
> mmap_index_only vs disk_access_mode: auto/mmap.
> >
> > Changing disk_access_mode to mmap_index_only seems to be a common
> recommendation on forums[1][2][3][4] and slack (find by searching
> disk_access_mode in the #cassandra channel on https://the-asf.slack.com/).
> >
> > It's not clear to me when using the default disk_access_mode:auto/mmap
> is beneficial, perhaps only when the read set fits in memory? Mick seems to
> think on CASSANDRA-15531 [5], that mmap_index_only has a higher heap cost
> and should be only used when warranted. However it's not uncommon to see
> people being bitten with OOMs or lower read performance due to the default
> disk_access_mode, so it makes me think it's not the best fool-proof default.
> >
> > Should we consider changing default "auto" behavior of
> "disk_access_mode" to be "mmap_index_only" instead of "mmap" in 5.0 since
> it's likely safer and perhaps more performant?
> >
> > Thanks,
> >
> > Paulo
> >
> > [1]
> https://stackoverflow.com/questions/72272035/troubleshooting-and-fixing-cassandra-oom-issue
> > [2] https://phabricator.wikimedia.org/T137419
> > [3] https://stackoverflow.com/a/55975471
> > [4]
> https://support.datastax.com/s/article/FAQ-Use-of-disk-access-mode-in-DSE-51-and-earlier
> > [5] https://issues.apache.org/jira/browse/CASSANDRA-15531
>


Re: [DISCUSS] Update default disk_access_mode to mmap_index_only on 5.0

2023-09-06 Thread Brandon Williams
Given https://issues.apache.org/jira/browse/CASSANDRA-17237 I think it
makes sense.  At the least I think we should restore disk_access_mode
so that users are more aware of the options available.

Kind Regards,
Brandon

On Wed, Sep 6, 2023 at 10:50 AM Paulo Motta  wrote:
>
> Hi,
>
> I've been bitten by OOMs with disk_access_mode:auto/mmap that were fixed by 
> changing to disk_access_mode:mmap_index_only. In a particular benchmark I got 
> 5x more read throughput on 3.11.x with disk_access_mode: mmap_index_only vs 
> disk_access_mode: auto/mmap.
>
> Changing disk_access_mode to mmap_index_only seems to be a common 
> recommendation on forums[1][2][3][4] and slack (find by searching 
> disk_access_mode in the #cassandra channel on https://the-asf.slack.com/).
>
> It's not clear to me when using the default disk_access_mode:auto/mmap is 
> beneficial, perhaps only when the read set fits in memory? Mick seems to 
> think on CASSANDRA-15531 [5], that mmap_index_only has a higher heap cost and 
> should be only used when warranted. However it's not uncommon to see people 
> being bitten with OOMs or lower read performance due to the default 
> disk_access_mode, so it makes me think it's not the best fool-proof default.
>
> Should we consider changing default "auto" behavior of "disk_access_mode" to 
> be "mmap_index_only" instead of "mmap" in 5.0 since it's likely safer and 
> perhaps more performant?
>
> Thanks,
>
> Paulo
>
> [1] 
> https://stackoverflow.com/questions/72272035/troubleshooting-and-fixing-cassandra-oom-issue
> [2] https://phabricator.wikimedia.org/T137419
> [3] https://stackoverflow.com/a/55975471
> [4] 
> https://support.datastax.com/s/article/FAQ-Use-of-disk-access-mode-in-DSE-51-and-earlier
> [5] https://issues.apache.org/jira/browse/CASSANDRA-15531


Re: [DISCUSS] Update default disk_access_mode to mmap_index_only on 5.0

2023-09-06 Thread C. Scott Andreas
Supportive of switching the default to mmap_index_only as well.

I don’t have numbers handy to share, but my experience has been significantly 
lower read latency and I wouldn’t run with auto. I’ve also not observed 
substantial heap pressure after switching - it was strictly an improvement.

- Scott

—
Mobile

> On Sep 6, 2023, at 8:50 AM, Paulo Motta  wrote:
> 
> 
> Hi,
> 
> I've been bitten by OOMs with disk_access_mode:auto/mmap that were fixed by 
> changing to disk_access_mode:mmap_index_only. In a particular benchmark I got 
> 5x more read throughput on 3.11.x with disk_access_mode: mmap_index_only vs 
> disk_access_mode: auto/mmap.
> 
> Changing disk_access_mode to mmap_index_only seems to be a common 
> recommendation on forums[1][2][3][4] and slack (find by searching 
> disk_access_mode in the #cassandra channel on https://the-asf.slack.com/).
> 
> It's not clear to me when using the default disk_access_mode:auto/mmap is 
> beneficial, perhaps only when the read set fits in memory? Mick seems to 
> think on CASSANDRA-15531 [5], that mmap_index_only has a higher heap cost and 
> should be only used when warranted. However it's not uncommon to see people 
> being bitten with OOMs or lower read performance due to the default 
> disk_access_mode, so it makes me think it's not the best fool-proof default.
> 
> Should we consider changing default "auto" behavior of "disk_access_mode" to 
> be "mmap_index_only" instead of "mmap" in 5.0 since it's likely safer and 
> perhaps more performant?
> 
> Thanks,
> 
> Paulo
> 
> [1] 
> https://stackoverflow.com/questions/72272035/troubleshooting-and-fixing-cassandra-oom-issue
> [2] https://phabricator.wikimedia.org/T137419
> [3] https://stackoverflow.com/a/55975471
> [4] 
> https://support.datastax.com/s/article/FAQ-Use-of-disk-access-mode-in-DSE-51-and-earlier
> [5] https://issues.apache.org/jira/browse/CASSANDRA-15531


[DISCUSS] Update default disk_access_mode to mmap_index_only on 5.0

2023-09-06 Thread Paulo Motta
Hi,

I've been bitten by OOMs with disk_access_mode:auto/mmap that were fixed by
changing to disk_access_mode:mmap_index_only. In a particular benchmark I
got 5x more read throughput on 3.11.x with disk_access_mode:
mmap_index_only vs disk_access_mode: auto/mmap.

Changing disk_access_mode to mmap_index_only seems to be a common
recommendation on forums[1][2][3][4] and slack (find by searching
disk_access_mode in the #cassandra channel on https://the-asf.slack.com/).

It's not clear to me when using the default disk_access_mode:auto/mmap is
beneficial; perhaps only when the read set fits in memory? Mick seems to
think, on CASSANDRA-15531 [5], that mmap_index_only has a higher heap cost
and should only be used when warranted. However, it's not uncommon to see
people being bitten by OOMs or lower read performance due to the default
disk_access_mode, so it makes me think it's not the best foolproof default.

Should we consider changing the default "auto" behavior of "disk_access_mode"
to be "mmap_index_only" instead of "mmap" in 5.0, since it's likely safer
and perhaps more performant?

Thanks,

Paulo

[1]
https://stackoverflow.com/questions/72272035/troubleshooting-and-fixing-cassandra-oom-issue
[2] https://phabricator.wikimedia.org/T137419
[3] https://stackoverflow.com/a/55975471
[4]
https://support.datastax.com/s/article/FAQ-Use-of-disk-access-mode-in-DSE-51-and-earlier
[5] https://issues.apache.org/jira/browse/CASSANDRA-15531
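
To make the proposal above concrete, here is a minimal sketch of what changing the meaning of "auto" amounts to. It is a simplified model written for illustration, not Cassandra source: the function name and the `auto_means` parameter are hypothetical, and the mode table assumes the commonly documented values (mmap maps both data and index files, mmap_index_only maps only index files, standard maps neither).

```python
# Simplified model (an assumption, not Cassandra source) of how a configured
# disk_access_mode could resolve into per-file access strategies.

ACCESS_TABLE = {
    "mmap": {"data": "mmap", "index": "mmap"},
    "mmap_index_only": {"data": "standard", "index": "mmap"},
    "standard": {"data": "standard", "index": "standard"},
}


def resolve_access_modes(configured: str, auto_means: str = "mmap") -> dict:
    """Return how data and index files would be read for a given setting.

    `auto_means` is a hypothetical knob standing in for the release default:
    "mmap" models the current behavior, "mmap_index_only" the proposed one.
    """
    mode = auto_means if configured == "auto" else configured
    if mode not in ACCESS_TABLE:
        raise ValueError(f"unknown disk_access_mode: {configured!r}")
    return dict(ACCESS_TABLE[mode])


if __name__ == "__main__":
    print("auto, current default: ", resolve_access_modes("auto", "mmap"))
    print("auto, proposed default:", resolve_access_modes("auto", "mmap_index_only"))
    print("explicit mmap:         ", resolve_access_modes("mmap", "mmap_index_only"))
```

In this model only nodes left on "auto" see a difference: their data files stop being memory-mapped while index files stay mapped. An explicit setting resolves the same way regardless of the default.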