Re: Open File Descriptors not cleared post upgrade from 3.11.9 to 4.0.5.

2023-08-16 Thread vaibhav khedkar
Thank you, Patrick.

We have plans to upgrade anyway, so we will keep this issue in mind and
probably expedite the upgrade.

I have created and updated a bug,
https://issues.apache.org/jira/browse/CASSANDRA-18770 , in case you are
interested.

Thanks
vaibhav

On Wed, Aug 16, 2023 at 1:34 PM Patrick Lee 
wrote:

> I don’t have a ticket.  What I saw in a scenario was a cluster that was
> upgraded from 3.11 to 4.0.X, we added another ring that was running Java
> 11.  Nodes on the ring with Java 8 saw this issue you described while the
> other ring running Java 11 did not.  Then if I updated from Java 8 to Java
> 11 I did not run into this issue again.  But the percentage of
> nodes/clusters that I experienced this issue was extremely low. We just
> happen to be in a phase of a lot of updates/upgrades of multiple things so
> it was just all baked into our process.  Completely get that updating the
> Java version is not a simple thing but wanted to throw that little
> Interesting bit of info that I had experienced.
>
> Sent from my iPhone
>
> On Aug 16, 2023, at 1:28 PM, vaibhav khedkar  wrote:
>
> 
> Thanks Patrick,
>
>
> We do have plans to upgrade to *java 11* eventually but we will go
> through internal testing and would also need some time given the size of
> our infrastructure.
>
> Is it safe to assume that the issue exists in the combination of upgrades
> from 3.11.x to 4.0.x *and* running on JAVA 8 ?
>
> Did you happen to have created a ticket for this when you observed the
> issue ?
>
> Thanks
> vaibhav
>
> On Wed, Aug 16, 2023 at 11:15 AM Patrick Lee 
> wrote:
>
>> I've actually noticed this as well on a few clusters I deal with but
>> after upgrading Cassandra from 3.11 to 4 we also changed to use Java 11
>> shortly after the cluster upgrade.  After I moved to Java 11 I have not
>> experienced a problem.
>>
>> On Wed, Aug 16, 2023 at 12:12 PM vaibhav khedkar 
>> wrote:
>>
>>> Thank you Scott
>>>
>>> We are seeing it for all the tables (Filter, Data ..etc )
>>>
>>> /nb-1-big-Statistics.db.tmp (deleted)
>>> /nb-3-big-Statistics.db.tmp (deleted)
>>> /nb-2-big-Data.db (deleted)
>>> /nb-2-big-Statistics.db.tmp (deleted)
>>> /nb-2-big-Index.db (deleted)
>>> /nb-2-big-Statistics.db.tmp (deleted)
>>> /nb-2-big-CompressionInfo.db (deleted)
>>> /nb-2-big-Filter.db (deleted)
>>> /nb-2-big-Summary.db (deleted)
>>> /nb-2-big-Digest.crc32 (deleted)
>>>
>>> Also I believe the property `disk_access_mode` is only present in the
>>> DSE version of cassandra and we use Open source.
>>>
>>> I will certainly open the ticket for this.
>>>
>>> Thanks
>>> vaibhav
>>>
>>> On Wed, Aug 16, 2023 at 9:34 AM C. Scott Andreas 
>>> wrote:
>>>
>>>> Vaibhav, thank you for reaching out and sharing this issue report.
>>>>
>>>> Could you run an `lsof` and share which SSTable files you see open
>>>> (e.g., all SSTable components or a subset of them); and also share the
>>>> value of the `disk_access_mode` property from your cassandra.yaml?
>>>>
>>>> Opening a Jira ticket for this for discussion / investigation is
>>>> probably a good next step.
>>>>
>>>> Thanks,
>>>>
>>>> – Scott
>>>>
>>>> On Aug 16, 2023, at 9:28 AM, vaibhav khedkar
>>>> wrote:
>>>>
>>>>
>>>> Hi everyone,
>>>>
>>>> We recently upgraded our fleet of ~2500 Cassandra instances from 3.11.9
>>>> to 4.0.5.
>>>>
>>>> After the upgrade, we are seeing a unique issue where the compacted
>>>> SSTables's file descriptors are still present and are never cleared. This
>>>> is causing false disk alerts. We have to restart nodes very often in order
>>>> to reclaim the disk space.
>>>>
>>>> We use the Apache version of C* along with CentOS Linux 7.9.2009 and
>>>> Java 8.
>>>>
>>>> We are looking for help in resolving this issue. If you have any
>>>> suggestions, please let us know.
>>>>
>>>> Thank you,
>>>> vaibhav
>>>>
>>>
>>> --
>>> Best Regards,
>>> Vaibhav Khedkar
>>> Google Inc.
>>> Mobile : (806)- 252 - 2912
>>>
>>> Email: vkhedk...@gmail.com
>>>
>>
>
> --
> Best Regards,
> Vaibhav Khedkar
> Google Inc.
> Mobile : (806)- 252 - 2912
>
> Email: vkhedk...@gmail.com
>
>

-- 
Best Regards,
Vaibhav Khedkar
Google Inc.
Mobile : (806)- 252 - 2912

Email: vkhedk...@gmail.com


Re: Open File Descriptors not cleared post upgrade from 3.11.9 to 4.0.5.

2023-08-16 Thread Patrick Lee
I don’t have a ticket.  What I saw was a cluster that was upgraded from 3.11
to 4.0.X, to which we added another ring running Java 11.  Nodes on the ring
with Java 8 saw the issue you described, while the ring running Java 11 did
not.  When I then updated from Java 8 to Java 11, I did not run into the
issue again.  The percentage of nodes/clusters on which I experienced this
was extremely low.  We just happened to be in a phase of a lot of
updates/upgrades of multiple things, so it was all baked into our process.
I completely get that updating the Java version is not a simple thing, but I
wanted to share that interesting bit of info from my own experience.

Sent from my iPhone

On Aug 16, 2023, at 1:28 PM, vaibhav khedkar wrote:

> Thanks Patrick,
>
> We do have plans to upgrade to java 11 eventually but we will go through
> internal testing and would also need some time given the size of our
> infrastructure.
>
> Is it safe to assume that the issue exists in the combination of upgrades
> from 3.11.x to 4.0.x and running on JAVA 8 ?
>
> Did you happen to have created a ticket for this when you observed the
> issue ?
>
> Thanks
> vaibhav
>
> On Wed, Aug 16, 2023 at 11:15 AM Patrick Lee wrote:
>
>> I've actually noticed this as well on a few clusters I deal with but
>> after upgrading Cassandra from 3.11 to 4 we also changed to use Java 11
>> shortly after the cluster upgrade.  After I moved to Java 11 I have not
>> experienced a problem.
>>
>> On Wed, Aug 16, 2023 at 12:12 PM vaibhav khedkar wrote:
>>
>>> Thank you Scott
>>>
>>> We are seeing it for all the tables (Filter, Data ..etc )
>>>
>>> /nb-1-big-Statistics.db.tmp (deleted)
>>> /nb-3-big-Statistics.db.tmp (deleted)
>>> /nb-2-big-Data.db (deleted)
>>> /nb-2-big-Statistics.db.tmp (deleted)
>>> /nb-2-big-Index.db (deleted)
>>> /nb-2-big-Statistics.db.tmp (deleted)
>>> /nb-2-big-CompressionInfo.db (deleted)
>>> /nb-2-big-Filter.db (deleted)
>>> /nb-2-big-Summary.db (deleted)
>>> /nb-2-big-Digest.crc32 (deleted)
>>>
>>> Also I believe the property `disk_access_mode` is only present in the
>>> DSE version of cassandra and we use Open source.
>>>
>>> I will certainly open the ticket for this.
>>>
>>> Thanks
>>> vaibhav
>>>
>>> On Wed, Aug 16, 2023 at 9:34 AM C. Scott Andreas wrote:
>>>
>>>> Vaibhav, thank you for reaching out and sharing this issue report.
>>>>
>>>> Could you run an `lsof` and share which SSTable files you see open
>>>> (e.g., all SSTable components or a subset of them); and also share the
>>>> value of the `disk_access_mode` property from your cassandra.yaml?
>>>>
>>>> Opening a Jira ticket for this for discussion / investigation is
>>>> probably a good next step.
>>>>
>>>> Thanks,
>>>>
>>>> – Scott
>>>>
>>>> On Aug 16, 2023, at 9:28 AM, vaibhav khedkar wrote:
>>>>
>>>> Hi everyone,
>>>>
>>>> We recently upgraded our fleet of ~2500 Cassandra instances from 3.11.9
>>>> to 4.0.5.
>>>>
>>>> After the upgrade, we are seeing a unique issue where the compacted
>>>> SSTables's file descriptors are still present and are never cleared. This
>>>> is causing false disk alerts. We have to restart nodes very often in order
>>>> to reclaim the disk space.
>>>>
>>>> We use the Apache version of C* along with CentOS Linux 7.9.2009 and
>>>> Java 8.
>>>>
>>>> We are looking for help in resolving this issue. If you have any
>>>> suggestions, please let us know.
>>>>
>>>> Thank you,
>>>> vaibhav
>>>
>>> --
>>> Best Regards,
>>> Vaibhav Khedkar
>>> Google Inc.
>>> Mobile : (806)- 252 - 2912
>>> Email: vkhedk...@gmail.com
>
> --
> Best Regards,
> Vaibhav Khedkar
> Google Inc.
> Mobile : (806)- 252 - 2912
> Email: vkhedk...@gmail.com


Re: Unsubscribe

2023-08-16 Thread C. Scott Andreas

Hi Mark,

You can unsubscribe from this mailing list by sending a blank email to
"user-unsubscr...@cassandra.apache.org" from the address that is subscribed
to the list. Other members of the list are not able to take this action on
someone's behalf.

Details on how to join and leave lists are here:
https://cassandra.apache.org/_/community.html#discussions

Cheers,

– Scott

On Aug 16, 2023, at 11:33 AM, Mark Furlong wrote:

> Please unsubscribe from this list
>
> Thanks
> Mark Furlong
> Sr. Database Administrator
> mfurl...@ancestry.com
> M: 801-859-7427
> O: 801-705-7115
> 1300 W Traverse Pkwy
> Lehi, UT 84043
>
> We empower journeys of personal discovery to enrich lives

Unsubscribe

2023-08-16 Thread Mark Furlong
Please unsubscribe from this list


Thanks
Mark Furlong
Sr. Database Administrator
mfurl...@ancestry.com
M: 801-859-7427
O: 801-705-7115
1300 W Traverse Pkwy
Lehi, UT 84043


We empower journeys of personal discovery to enrich lives




Re: Open File Descriptors not cleared post upgrade from 3.11.9 to 4.0.5.

2023-08-16 Thread vaibhav khedkar
Thanks Patrick,


We do have plans to upgrade to *Java 11* eventually, but we will need to go
through internal testing first, and that will take some time given the size
of our infrastructure.

Is it safe to assume that the issue exists only with the combination of an
upgrade from 3.11.x to 4.0.x *and* running on Java 8?

Did you happen to create a ticket for this when you observed the
issue?

Thanks
vaibhav

On Wed, Aug 16, 2023 at 11:15 AM Patrick Lee 
wrote:

> I've actually noticed this as well on a few clusters I deal with but after
> upgrading Cassandra from 3.11 to 4 we also changed to use Java 11 shortly
> after the cluster upgrade.  After I moved to Java 11 I have not experienced
> a problem.
>
> On Wed, Aug 16, 2023 at 12:12 PM vaibhav khedkar 
> wrote:
>
>> Thank you Scott
>>
>> We are seeing it for all the tables (Filter, Data ..etc )
>>
>> /nb-1-big-Statistics.db.tmp (deleted)
>> /nb-3-big-Statistics.db.tmp (deleted)
>> /nb-2-big-Data.db (deleted)
>> /nb-2-big-Statistics.db.tmp (deleted)
>> /nb-2-big-Index.db (deleted)
>> /nb-2-big-Statistics.db.tmp (deleted)
>> /nb-2-big-CompressionInfo.db (deleted)
>> /nb-2-big-Filter.db (deleted)
>> /nb-2-big-Summary.db (deleted)
>> /nb-2-big-Digest.crc32 (deleted)
>>
>> Also I believe the property `disk_access_mode` is only present in the DSE
>> version of cassandra and we use Open source.
>>
>> I will certainly open the ticket for this.
>>
>> Thanks
>> vaibhav
>>
>> On Wed, Aug 16, 2023 at 9:34 AM C. Scott Andreas 
>> wrote:
>>
>>> Vaibhav, thank you for reaching out and sharing this issue report.
>>>
>>> Could you run an `lsof` and share which SSTable files you see open
>>> (e.g., all SSTable components or a subset of them); and also share the
>>> value of the `disk_access_mode` property from your cassandra.yaml?
>>>
>>> Opening a Jira ticket for this for discussion / investigation is
>>> probably a good next step.
>>>
>>> Thanks,
>>>
>>> – Scott
>>>
>>> On Aug 16, 2023, at 9:28 AM, vaibhav khedkar 
>>> wrote:
>>>
>>>
>>> Hi everyone,
>>>
>>> We recently upgraded our fleet of ~2500 Cassandra instances from 3.11.9
>>> to 4.0.5.
>>>
>>> After the upgrade, we are seeing a unique issue where the compacted
>>> SSTables's file descriptors are still present and are never cleared. This
>>> is causing false disk alerts. We have to restart nodes very often in order
>>> to reclaim the disk space.
>>>
>>> We use the Apache version of C* along with CentOS Linux 7.9.2009 and
>>> Java 8.
>>>
>>> We are looking for help in resolving this issue. If you have any
>>> suggestions, please let us know.
>>>
>>> Thank you,
>>> vaibhav
>>>
>>>
>>>
>>
>> --
>> Best Regards,
>> Vaibhav Khedkar
>> Google Inc.
>> Mobile : (806)- 252 - 2912
>>
>> Email: vkhedk...@gmail.com
>>
>

-- 
Best Regards,
Vaibhav Khedkar
Google Inc.
Mobile : (806)- 252 - 2912

Email: vkhedk...@gmail.com


Re: Open File Descriptors not cleared post upgrade from 3.11.9 to 4.0.5.

2023-08-16 Thread Patrick Lee
I've actually noticed this as well on a few clusters I deal with, but
shortly after upgrading Cassandra from 3.11 to 4 we also switched those
clusters to Java 11.  Since moving to Java 11, I have not experienced the
problem.
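
For anyone wanting to check which nodes are still on Java 8, here is a minimal Python sketch that maps the version string printed by `java -version` to a major version. The function names are made up for illustration, and the sample strings in the comments are only examples of the two common numbering schemes (the legacy `1.x` scheme and the newer one), not output from any cluster in this thread.

```python
import re

def extract_version(java_version_output):
    """Pull the quoted version string out of `java -version` style output."""
    m = re.search(r'version "([^"]+)"', java_version_output)
    if not m:
        raise ValueError("no version string found")
    return m.group(1)

def java_major(version_string):
    """Return the major Java version for strings like "1.8.0_352" or "11.0.20"."""
    m = re.match(r"(\d+)(?:\.(\d+))?", version_string)
    if not m:
        raise ValueError(f"unrecognised version: {version_string!r}")
    first = int(m.group(1))
    # Legacy scheme: "1.8.x" means Java 8.
    if first == 1 and m.group(2) is not None:
        return int(m.group(2))
    return first
```

Note that `java -version` writes to stderr, so something like `subprocess.run(["java", "-version"], capture_output=True, text=True).stderr` would be the input to `extract_version`.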

On Wed, Aug 16, 2023 at 12:12 PM vaibhav khedkar 
wrote:

> Thank you Scott
>
> We are seeing it for all the tables (Filter, Data ..etc )
>
> /nb-1-big-Statistics.db.tmp (deleted)
> /nb-3-big-Statistics.db.tmp (deleted)
> /nb-2-big-Data.db (deleted)
> /nb-2-big-Statistics.db.tmp (deleted)
> /nb-2-big-Index.db (deleted)
> /nb-2-big-Statistics.db.tmp (deleted)
> /nb-2-big-CompressionInfo.db (deleted)
> /nb-2-big-Filter.db (deleted)
> /nb-2-big-Summary.db (deleted)
> /nb-2-big-Digest.crc32 (deleted)
>
> Also I believe the property `disk_access_mode` is only present in the DSE
> version of cassandra and we use Open source.
>
> I will certainly open the ticket for this.
>
> Thanks
> vaibhav
>
> On Wed, Aug 16, 2023 at 9:34 AM C. Scott Andreas 
> wrote:
>
>> Vaibhav, thank you for reaching out and sharing this issue report.
>>
>> Could you run an `lsof` and share which SSTable files you see open (e.g.,
>> all SSTable components or a subset of them); and also share the value of
>> the `disk_access_mode` property from your cassandra.yaml?
>>
>> Opening a Jira ticket for this for discussion / investigation is probably
>> a good next step.
>>
>> Thanks,
>>
>> – Scott
>>
>> On Aug 16, 2023, at 9:28 AM, vaibhav khedkar  wrote:
>>
>>
>> Hi everyone,
>>
>> We recently upgraded our fleet of ~2500 Cassandra instances from 3.11.9
>> to 4.0.5.
>>
>> After the upgrade, we are seeing a unique issue where the compacted
>> SSTables's file descriptors are still present and are never cleared. This
>> is causing false disk alerts. We have to restart nodes very often in order
>> to reclaim the disk space.
>>
>> We use the Apache version of C* along with CentOS Linux 7.9.2009 and Java
>> 8.
>>
>> We are looking for help in resolving this issue. If you have any
>> suggestions, please let us know.
>>
>> Thank you,
>> vaibhav
>>
>>
>>
>
> --
> Best Regards,
> Vaibhav Khedkar
> Google Inc.
> Mobile : (806)- 252 - 2912
>
> Email: vkhedk...@gmail.com
>


Re: Open File Descriptors not cleared post upgrade from 3.11.9 to 4.0.5.

2023-08-16 Thread vaibhav khedkar
Thank you, Scott.

We are seeing it for all the SSTable components (Filter, Data, etc.):

/nb-1-big-Statistics.db.tmp (deleted)
/nb-3-big-Statistics.db.tmp (deleted)
/nb-2-big-Data.db (deleted)
/nb-2-big-Statistics.db.tmp (deleted)
/nb-2-big-Index.db (deleted)
/nb-2-big-Statistics.db.tmp (deleted)
/nb-2-big-CompressionInfo.db (deleted)
/nb-2-big-Filter.db (deleted)
/nb-2-big-Summary.db (deleted)
/nb-2-big-Digest.crc32 (deleted)
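
A listing like the one above can be tallied per SSTable component with a short script. This is only a sketch: it assumes file names of the form shown here (`nb-<generation>-big-<Component>`, optionally with a `.tmp` suffix) and that the name is the last field on each `lsof` line; `summarize` is a hypothetical helper name.

```python
import re
from collections import Counter

# Matches names such as "nb-2-big-Data.db", optionally followed by ".tmp"
# and lsof's " (deleted)" marker, at the end of a line.
SSTABLE_RE = re.compile(
    r"(?P<version>[a-z]{2})-(?P<generation>\d+)-(?P<format>[a-z]+)-"
    r"(?P<component>[A-Za-z0-9]+\.[A-Za-z0-9]+)(?P<tmp>\.tmp)?"
    r"(?P<deleted> \(deleted\))?\s*$"
)

def summarize(lines):
    """Count deleted-but-still-open SSTable files per component."""
    counts = Counter()
    for line in lines:
        m = SSTABLE_RE.search(line)
        if m and m.group("deleted"):
            counts[m.group("component")] += 1
    return counts
```

Feeding it the ten lines above reports Statistics.db four times and every other component once.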

Also, I believe the `disk_access_mode` property is only present in the DSE
version of Cassandra, and we use the open-source version.

I will certainly open a ticket for this.

Thanks
vaibhav

On Wed, Aug 16, 2023 at 9:34 AM C. Scott Andreas 
wrote:

> Vaibhav, thank you for reaching out and sharing this issue report.
>
> Could you run an `lsof` and share which SSTable files you see open (e.g.,
> all SSTable components or a subset of them); and also share the value of
> the `disk_access_mode` property from your cassandra.yaml?
>
> Opening a Jira ticket for this for discussion / investigation is probably
> a good next step.
>
> Thanks,
>
> – Scott
>
> On Aug 16, 2023, at 9:28 AM, vaibhav khedkar  wrote:
>
>
> Hi everyone,
>
> We recently upgraded our fleet of ~2500 Cassandra instances from 3.11.9 to
> 4.0.5.
>
> After the upgrade, we are seeing a unique issue where the compacted
> SSTables's file descriptors are still present and are never cleared. This
> is causing false disk alerts. We have to restart nodes very often in order
> to reclaim the disk space.
>
> We use the Apache version of C* along with CentOS Linux 7.9.2009 and Java
> 8.
>
> We are looking for help in resolving this issue. If you have any
> suggestions, please let us know.
>
> Thank you,
> vaibhav
>
>
>

-- 
Best Regards,
Vaibhav Khedkar
Google Inc.
Mobile : (806)- 252 - 2912

Email: vkhedk...@gmail.com


Re: Open File Descriptors not cleared post upgrade from 3.11.9 to 4.0.5.

2023-08-16 Thread C. Scott Andreas

Vaibhav, thank you for reaching out and sharing this issue report.

Could you run an `lsof` and share which SSTable files you see open (e.g., all
SSTable components or a subset of them); and also share the value of the
`disk_access_mode` property from your cassandra.yaml?

Opening a Jira ticket for this for discussion / investigation is probably a
good next step.

Thanks,

– Scott

On Aug 16, 2023, at 9:28 AM, vaibhav khedkar wrote:

> Hi everyone,
>
> We recently upgraded our fleet of ~2500 Cassandra instances from 3.11.9 to
> 4.0.5.
>
> After the upgrade, we are seeing a unique issue where the compacted
> SSTables's file descriptors are still present and are never cleared. This
> is causing false disk alerts. We have to restart nodes very often in order
> to reclaim the disk space.
>
> We use the Apache version of C* along with CentOS Linux 7.9.2009 and
> Java 8.
>
> We are looking for help in resolving this issue. If you have any
> suggestions, please let us know.
>
> Thank you,
> vaibhav

Open File Descriptors not cleared post upgrade from 3.11.9 to 4.0.5.

2023-08-16 Thread vaibhav khedkar
Hi everyone,

We recently upgraded our fleet of ~2500 Cassandra instances from 3.11.9 to
4.0.5.

After the upgrade, we are seeing a unique issue where the file descriptors
of compacted SSTables remain open and are never cleared. This is causing
false disk alerts. We have to restart nodes very often in order to reclaim
the disk space.

We use the Apache version of C* along with CentOS Linux 7.9.2009 and Java 8.

We are looking for help in resolving this issue. If you have any
suggestions, please let us know.

Thank you,
vaibhav
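
To quantify how much disk space is being held this way on a node, a minimal Python sketch along these lines may help. It assumes Linux with `/proc` mounted and enough privileges to read the target process's `fd` directory; `deleted_open_files` is a hypothetical helper name, and it only sees open descriptors, not files that are kept alive solely through a memory mapping.

```python
import os

DELETED_SUFFIX = " (deleted)"

def deleted_open_files(pid):
    """Return (path, size_in_bytes) for files a process holds open after deletion."""
    fd_dir = f"/proc/{pid}/fd"
    results = []
    for fd in os.listdir(fd_dir):
        link = os.path.join(fd_dir, fd)
        try:
            target = os.readlink(link)
            if target.endswith(DELETED_SUFFIX):
                # stat() follows the /proc link to the still-open inode.
                size = os.stat(link).st_size
                results.append((target[: -len(DELETED_SUFFIX)], size))
        except OSError:
            # The descriptor was closed while we were scanning.
            continue
    return results
```

Calling `deleted_open_files(<cassandra pid>)` and summing the sizes gives the space that would be reclaimed if those descriptors were closed, which is roughly the gap between what `df` and `du` report.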


Re: Big Data Question

2023-08-16 Thread Jeff Jirsa
A lot of things depend on actual cluster config - compaction settings (LCS
vs STCS vs TWCS) and token allocation (single token, vnodes, etc) matter a
ton.

With 4.0 and LCS, streaming for replacement is MUCH faster, so much so that
most people should be fine with 4-8TB/node, because the rebuild time is
decreased by an order of magnitude.
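
As a rough illustration of what the higher density means for the 1 PB example in the question, here is a small Python sketch. It assumes decimal units (1 PB = 1000 TB, matching the 500-instance figure in the question) and treats the 1 PB as the total on-disk footprint; `nodes_needed` is a made-up helper.

```python
import math

def nodes_needed(total_tb, tb_per_node):
    """Number of nodes required to hold total_tb at a given per-node density."""
    return math.ceil(total_tb / tb_per_node)

TOTAL_TB = 1000  # 1 PB, decimal units (assumption)

for density_tb in (2, 4, 8):
    print(f"{density_tb} TB/node -> {nodes_needed(TOTAL_TB, density_tb)} nodes")
```

Under those assumptions the instance count drops from 500 at 2 TB/node to 250 at 4 TB/node and 125 at 8 TB/node.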

If you happen to have large physical machines, running multiple instances
on a machine (each with a single token, and making sure you match rack
awareness) sorta approximates vnodes without some of the unpleasant side
effects.
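
One reading of "match rack awareness" is that all instances on the same physical machine should report the same rack, so that rack-aware replica placement does not put two replicas on one box. A minimal Python sketch of that check follows; the instance names, host names and rack labels in the usage note are hypothetical, and the inventory format is an assumption.

```python
from collections import defaultdict

def hosts_spanning_racks(instances):
    """Find physical hosts whose instances are spread over more than one rack.

    instances: iterable of (instance_name, physical_host, rack) tuples.
    Returns {physical_host: sorted list of racks} for offending hosts only.
    """
    racks_by_host = defaultdict(set)
    for _name, host, rack in instances:
        racks_by_host[host].add(rack)
    return {h: sorted(r) for h, r in racks_by_host.items() if len(r) > 1}
```

For example, `hosts_spanning_racks([("c1", "hostA", "rack1"), ("c2", "hostA", "rack2")])` flags `hostA`, while an inventory in which every host maps to a single rack returns an empty dict.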

If you happen to run on more-reliable-storage (like EBS, or a SAN, and you
understand what that means from a business continuity perspective), then
you can assume that your rebuild frequency is probably an order of
magnitude less often, so you can adjust your risk calculation based on
measured reliability there (again, EBS and other disaggregated disks still
fail, just less often than single physical flash devices).

Seed nodes never really need to change significantly. You should be fine
with 2-3 per DC no matter the instance count.




On Wed, Aug 16, 2023 at 8:34 AM Joe Obernberger <
joseph.obernber...@gmail.com> wrote:

> General question on how to configure Cassandra.  Say I have 1PByte of
> data to store.  The general rule of thumb is that each node (or at least
> instance of Cassandra) shouldn't handle more than 2TBytes of disk.  That
> means 500 instances of Cassandra.
>
> Assuming you have very fast persistent storage (such as a NetApp,
> PorterWorx etc.), would using Kubernetes or some orchestration layer to
> handle those nodes be a viable approach?  Perhaps the worker nodes would
> have enough RAM to run 4 instances (pods) of Cassandra, you would need
> 125 servers.
> Another approach is to build your servers with 5 (or more) SSD devices -
> one for OS, four for each instance of Cassandra running on that server.
> Then build some scripts/ansible/puppet that would manage Cassandra
> start/stops, and other maintenance items.
>
> Where I think this runs into problems is with repairs, or sstablescrubs
> that can take days to run on a single instance.  How is that handled 'in
> the real world'?  With seed nodes, how many would you have in such a
> configuration?
> Thanks for any thoughts!
>
> -Joe
>
>
> --
> This email has been checked for viruses by AVG antivirus software.
> www.avg.com
>


Big Data Question

2023-08-16 Thread Joe Obernberger
General question on how to configure Cassandra.  Say I have 1PByte of 
data to store.  The general rule of thumb is that each node (or at least 
instance of Cassandra) shouldn't handle more than 2TBytes of disk.  That 
means 500 instances of Cassandra.


Assuming you have very fast persistent storage (such as a NetApp,
Portworx, etc.), would using Kubernetes or some other orchestration layer
to handle those nodes be a viable approach?  If each worker node had enough
RAM to run 4 instances (pods) of Cassandra, you would need 125 servers.
Another approach is to build your servers with 5 (or more) SSD devices:
one for the OS and one for each of four Cassandra instances running on that
server.  Then build some scripts/ansible/puppet that would manage Cassandra
start/stops and other maintenance items.
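
The arithmetic behind those numbers, as a small Python sketch. It assumes decimal units (1 PB = 1000 TB) and one OS SSD plus one data SSD per instance on each server; `plan` and its parameter names are just for illustration.

```python
import math

def plan(total_tb, tb_per_instance, instances_per_server):
    """Return instance, server and SSD counts for the layout described above."""
    instances = math.ceil(total_tb / tb_per_instance)
    servers = math.ceil(instances / instances_per_server)
    # One SSD for the OS plus one per Cassandra instance on each server.
    ssds = servers * (1 + instances_per_server)
    return {"instances": instances, "servers": servers, "ssds": ssds}

print(plan(total_tb=1000, tb_per_instance=2, instances_per_server=4))
```

With 2 TB per instance and 4 instances per server this gives 500 instances on 125 servers, or 625 SSDs in the five-SSD layout.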


Where I think this runs into problems is with repairs, or sstablescrubs 
that can take days to run on a single instance.  How is that handled 'in 
the real world'?  With seed nodes, how many would you have in such a 
configuration?

Thanks for any thoughts!

-Joe

