Re: [DISCUSS] Update default disk_access_mode to mmap_index_only on 5.0
Added one nit to the PR. Otherwise, this is awesome :)

On Wed, Nov 15, 2023 at 11:01 AM Jordan West wrote:

> I would also like to back this proposal. We change this default because
> several incidents have occurred by leaving the default of auto. There are
> rare cases where auto/mmap is the better option but as for a default
> mmap_index_only is safer.
Re: [DISCUSS] Update default disk_access_mode to mmap_index_only on 5.0
I would also like to back this proposal. We change this default because several incidents have occurred by leaving the default of auto. There are rare cases where auto/mmap is the better option but as for a default mmap_index_only is safer.

On Wed, Nov 15, 2023 at 6:35 AM Paulo Motta wrote:

> Hi,
>
> I would like to get back to this. I proposed this default configuration
> change on the user list ~1 month ago and there were no comments [1].
>
> I created CASSANDRA-19021 [2] to make the proposed change and Stefan
> kindly submitted a patch, CI is looking good.
>
> Any objections to making this change in 5.0? If not, we will merge in 24
> hours.
>
> Thanks,
>
> Paulo
>
> [1] - https://lists.apache.org/thread/w0gkdj7fhylycqwmd73p0kfck7jr8qth
> [2] - https://issues.apache.org/jira/browse/CASSANDRA-19021
Re: [DISCUSS] Update default disk_access_mode to mmap_index_only on 5.0
Hi,

I would like to get back to this. I proposed this default configuration change on the user list ~1 month ago and there were no comments [1].

I created CASSANDRA-19021 [2] to make the proposed change and Stefan kindly submitted a patch, CI is looking good.

Any objections to making this change in 5.0? If not, we will merge in 24 hours.

Thanks,

Paulo

[1] - https://lists.apache.org/thread/w0gkdj7fhylycqwmd73p0kfck7jr8qth
[2] - https://issues.apache.org/jira/browse/CASSANDRA-19021

On Wed, Sep 6, 2023 at 5:12 PM Paulo Motta wrote:

> > I wonder why disk_access_mode property is not in cassandra.yaml (looking
> > into trunk right now)
>
> I think there's a prehistoric reason why it was removed but I can't
> remember right now.
>
> > Do you all think we can add it there with brief explanation what each
> > option does?
>
> We could reinclude it as long as we provide a clear recommendation on when
> to change from the default since this is an advanced setting which should
> be rarely changed. But I still think we should provide a more
> stable/foolproof default (mmap_index_only) since the current default (mmap)
> is known to cause instability in some scenarios.
>
> Also there is a technicality with changing the default, if we change the
> "auto" behavior from mmap to mmap_index_only this may affect users relying
> on the default "mmap" behavior. Not sure the best way to address that, is a
> big NEWS note sufficient? Even though users are expected to read NEWS when
> upgrading we know well not all users read it.
>
> > Shall we also share this thread with @user?
>
> Thanks Ekaterina! If we decide to change the default we can run this
> through the user@ list to see what the user community thinks.
Re: [DISCUSS] Update default disk_access_mode to mmap_index_only on 5.0
> I wonder why disk_access_mode property is not in cassandra.yaml (looking into trunk right now)

I think there's a prehistoric reason why it was removed but I can't remember right now.

> Do you all think we can add it there with brief explanation what each option does?

We could reinclude it as long as we provide a clear recommendation on when to change from the default since this is an advanced setting which should be rarely changed. But I still think we should provide a more stable/foolproof default (mmap_index_only) since the current default (mmap) is known to cause instability in some scenarios.

Also there is a technicality with changing the default, if we change the "auto" behavior from mmap to mmap_index_only this may affect users relying on the default "mmap" behavior. Not sure the best way to address that, is a big NEWS note sufficient? Even though users are expected to read NEWS when upgrading we know well not all users read it.

> Shall we also share this thread with @user?

Thanks Ekaterina! If we decide to change the default we can run this through the user@ list to see what the user community thinks.

On Wed, Sep 6, 2023 at 4:45 PM Ekaterina Dimitrova wrote:

> Thanks for starting this discussion, Paulo!
>
> Shall we also share this thread with @user?
Re: [DISCUSS] Update default disk_access_mode to mmap_index_only on 5.0
Thanks for starting this discussion, Paulo!

Shall we also share this thread with @user?

On Wed, 6 Sep 2023 at 16:35, C. Scott Andreas wrote:

> Supportive of switching the default to mmap_index_only as well.
>
> I don’t have numbers handy to share, but my experience has been
> significantly lower read latency and I wouldn’t run with auto. I’ve also
> not observed substantial heap pressure after switching - it was strictly an
> improvement.
>
> - Scott
Re: [DISCUSS] Update default disk_access_mode to mmap_index_only on 5.0
I wonder why disk_access_mode property is not in cassandra.yaml (looking into trunk right now).

Do you all think we can add it there with brief explanation what each option does?

From: Caleb Rackliffe
Sent: Wednesday, September 6, 2023 21:08
To: dev@cassandra.apache.org
Subject: Re: [DISCUSS] Update default disk_access_mode to mmap_index_only on 5.0

> +100 to this
>
> We'd have to come up w/ a pretty compelling counterexample to NOT switch
> the default to mmap_index_only at this point.
Re: [DISCUSS] Update default disk_access_mode to mmap_index_only on 5.0
+100 to this

We'd have to come up w/ a pretty compelling counterexample to NOT switch the default to mmap_index_only at this point.

On Wed, Sep 6, 2023 at 11:40 AM Brandon Williams wrote:

> Given https://issues.apache.org/jira/browse/CASSANDRA-17237 I think it
> makes sense. At the least I think we should restore disk_access_mode
> so that users are more aware of the options available.
>
> Kind Regards,
> Brandon
Re: [DISCUSS] Update default disk_access_mode to mmap_index_only on 5.0
Given https://issues.apache.org/jira/browse/CASSANDRA-17237 I think it makes sense. At the least I think we should restore disk_access_mode so that users are more aware of the options available.

Kind Regards,
Brandon

On Wed, Sep 6, 2023 at 10:50 AM Paulo Motta wrote:

> Hi,
>
> I've been bitten by OOMs with disk_access_mode:auto/mmap that were fixed by
> changing to disk_access_mode:mmap_index_only. In a particular benchmark I got
> 5x more read throughput on 3.11.x with disk_access_mode: mmap_index_only vs
> disk_access_mode: auto/mmap.
>
> Changing disk_access_mode to mmap_index_only seems to be a common
> recommendation on forums[1][2][3][4] and slack (find by searching
> disk_access_mode in the #cassandra channel on https://the-asf.slack.com/).
>
> It's not clear to me when using the default disk_access_mode:auto/mmap is
> beneficial, perhaps only when the read set fits in memory? Mick seems to
> think on CASSANDRA-15531 [5], that mmap_index_only has a higher heap cost and
> should be only used when warranted. However it's not uncommon to see people
> being bitten with OOMs or lower read performance due to the default
> disk_access_mode, so it makes me think it's not the best fool-proof default.
>
> Should we consider changing default "auto" behavior of "disk_access_mode" to
> be "mmap_index_only" instead of "mmap" in 5.0 since it's likely safer and
> perhaps more performant?
>
> Thanks,
>
> Paulo
>
> [1] https://stackoverflow.com/questions/72272035/troubleshooting-and-fixing-cassandra-oom-issue
> [2] https://phabricator.wikimedia.org/T137419
> [3] https://stackoverflow.com/a/55975471
> [4] https://support.datastax.com/s/article/FAQ-Use-of-disk-access-mode-in-DSE-51-and-earlier
> [5] https://issues.apache.org/jira/browse/CASSANDRA-15531
Re: [DISCUSS] Update default disk_access_mode to mmap_index_only on 5.0
Supportive of switching the default to mmap_index_only as well.

I don’t have numbers handy to share, but my experience has been significantly lower read latency and I wouldn’t run with auto. I’ve also not observed substantial heap pressure after switching - it was strictly an improvement.

- Scott

> On Sep 6, 2023, at 8:50 AM, Paulo Motta wrote:
>
> Hi,
>
> I've been bitten by OOMs with disk_access_mode:auto/mmap that were fixed by
> changing to disk_access_mode:mmap_index_only. In a particular benchmark I got
> 5x more read throughput on 3.11.x with disk_access_mode: mmap_index_only vs
> disk_access_mode: auto/mmap.
>
> Changing disk_access_mode to mmap_index_only seems to be a common
> recommendation on forums[1][2][3][4] and slack (find by searching
> disk_access_mode in the #cassandra channel on https://the-asf.slack.com/).
>
> It's not clear to me when using the default disk_access_mode:auto/mmap is
> beneficial, perhaps only when the read set fits in memory? Mick seems to
> think on CASSANDRA-15531 [5], that mmap_index_only has a higher heap cost and
> should be only used when warranted. However it's not uncommon to see people
> being bitten with OOMs or lower read performance due to the default
> disk_access_mode, so it makes me think it's not the best fool-proof default.
>
> Should we consider changing default "auto" behavior of "disk_access_mode" to
> be "mmap_index_only" instead of "mmap" in 5.0 since it's likely safer and
> perhaps more performant?
>
> Thanks,
>
> Paulo
>
> [1] https://stackoverflow.com/questions/72272035/troubleshooting-and-fixing-cassandra-oom-issue
> [2] https://phabricator.wikimedia.org/T137419
> [3] https://stackoverflow.com/a/55975471
> [4] https://support.datastax.com/s/article/FAQ-Use-of-disk-access-mode-in-DSE-51-and-earlier
> [5] https://issues.apache.org/jira/browse/CASSANDRA-15531
[DISCUSS] Update default disk_access_mode to mmap_index_only on 5.0
Hi,

I've been bitten by OOMs with disk_access_mode:auto/mmap that were fixed by changing to disk_access_mode:mmap_index_only. In a particular benchmark I got 5x more read throughput on 3.11.x with disk_access_mode: mmap_index_only vs disk_access_mode: auto/mmap.

Changing disk_access_mode to mmap_index_only seems to be a common recommendation on forums[1][2][3][4] and slack (find by searching disk_access_mode in the #cassandra channel on https://the-asf.slack.com/).

It's not clear to me when using the default disk_access_mode:auto/mmap is beneficial, perhaps only when the read set fits in memory? Mick seems to think on CASSANDRA-15531 [5], that mmap_index_only has a higher heap cost and should be only used when warranted. However it's not uncommon to see people being bitten with OOMs or lower read performance due to the default disk_access_mode, so it makes me think it's not the best fool-proof default.

Should we consider changing default "auto" behavior of "disk_access_mode" to be "mmap_index_only" instead of "mmap" in 5.0 since it's likely safer and perhaps more performant?

Thanks,

Paulo

[1] https://stackoverflow.com/questions/72272035/troubleshooting-and-fixing-cassandra-oom-issue
[2] https://phabricator.wikimedia.org/T137419
[3] https://stackoverflow.com/a/55975471
[4] https://support.datastax.com/s/article/FAQ-Use-of-disk-access-mode-in-DSE-51-and-earlier
[5] https://issues.apache.org/jira/browse/CASSANDRA-15531
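
For readers who want to pin the behaviour explicitly rather than rely on "auto", here is a minimal sketch of what the setting discussed above might look like in cassandra.yaml. It assumes the property is added by hand, since the thread notes it is not listed in the shipped file on trunk, and it only uses values named in this thread (auto, mmap, mmap_index_only). The comments are illustrative and not taken from the project's documentation.

```yaml
# cassandra.yaml (sketch, not the shipped file)
# Values mentioned in this thread:
#   auto            - currently resolves to mmap, per the proposal above
#   mmap            - the behaviour reported above to cause OOMs in some setups
#   mmap_index_only - the proposed safer default
disk_access_mode: mmap_index_only
```

Setting the value explicitly also sidesteps the upgrade concern raised in the thread: a node with an explicit value should not change behaviour if the meaning of "auto" changes in 5.0.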