Re: Urgent Problem - Disk full

2018-04-04 Thread Jürgen Albersdorfer
Thank you all for your hints on this. I added another data folder on the commitlog disk to relieve the immediate urgency. The next step will be to reorganize and deduplicate the data into a second table, then drop the original one, clear the snapshot, and consolidate all data files back off the commitlog disk.
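A minimal sketch of that reorganization step, assuming a hypothetical time-series schema (keyspace, table, and column names below are invented, and a real migration would likely use a bulk loader or Spark rather than a single-threaded copy loop):

```python
# Hypothetical schema: keyspace "metrics", original table "samples".
# Requires the DataStax Python driver: pip install cassandra-driver
from cassandra.cluster import Cluster

session = Cluster(["127.0.0.1"]).connect("metrics")

# 1. Create the second table with the same layout as the original.
session.execute("""
    CREATE TABLE IF NOT EXISTS samples_v2 (
        sensor_id uuid,
        ts        timestamp,
        value     double,
        PRIMARY KEY ((sensor_id), ts)
    ) WITH compaction = {'class': 'TimeWindowCompactionStrategy'}
""")

# 2. Copy rows across. Rows with the same primary key overwrite each other,
#    so duplicates collapse on write.
for row in session.execute("SELECT sensor_id, ts, value FROM samples"):
    session.execute(
        "INSERT INTO samples_v2 (sensor_id, ts, value) VALUES (%s, %s, %s)",
        (row.sensor_id, row.ts, row.value),
    )

# 3. After verifying the copy, drop the original table (this auto-snapshots
#    by default) and clear the snapshot on every node to free the space:
#    DROP TABLE metrics.samples;
#    nodetool clearsnapshot -- metrics
```

The disk space only comes back once the original table is dropped and its auto-snapshot cleared on each node.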

RE: Urgent Problem - Disk full

2018-04-04 Thread Kenneth Brotman
Agreed that you tend to add capacity to nodes or add nodes once you know you have no unneeded data in the cluster. From: Alain RODRIGUEZ [mailto:arodr...@gmail.com] Sent: Wednesday, April 04, 2018 9:10 AM To: user cassandra.apache.org Subject: Re: Urgent Problem - Disk full Hi, When

Re: datastax cassandra minimum hardware recommendation

2018-04-04 Thread Ben Bromhead
Also, DS charges by core ;) Anecdotally, we run a large fleet of Apache C* nodes on AWS, and a good portion of supported instances run with 16 GB of RAM and 4 cores, which is fine for those workloads. On Wed, Apr 4, 2018 at 11:08 AM sujeet jog wrote: > Thanks Alain > >

Re: Urgent Problem - Disk full

2018-04-04 Thread Alain RODRIGUEZ
Hi, When the disks are full, here are the options I can think of, depending on the situation and how 'full' the disk really is: - Add capacity - Add a disk and use JBOD, adding a second location folder for the sstables, move some of them around, then restart Cassandra. Or add a new node. - Reduce
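A rough sketch of the "add a disk / JBOD" option above, assuming Cassandra is stopped on the node, the second directory has already been added to data_file_directories in cassandra.yaml, and the paths below are placeholders:

```python
# Placeholder paths: OLD_DIR is the table directory on the full disk,
# NEW_DIR lives on the freshly added disk (already listed in
# data_file_directories). Run only with Cassandra stopped on the node.
import shutil
from pathlib import Path

OLD_DIR = Path("/var/lib/cassandra/data/metrics/samples-<table-id>")
NEW_DIR = Path("/mnt/disk2/cassandra/data/metrics/samples-<table-id>")
NEW_DIR.mkdir(parents=True, exist_ok=True)

# Pick a few of the largest SSTables. Each SSTable is a set of component
# files sharing one prefix (Data, Index, Summary, ...) and all components
# must move together.
data_files = sorted(OLD_DIR.glob("*-Data.db"),
                    key=lambda p: p.stat().st_size, reverse=True)
for data_file in data_files[:3]:
    prefix = data_file.name[:-len("Data.db")]
    for component in OLD_DIR.glob(prefix + "*"):
        shutil.move(str(component), str(NEW_DIR / component.name))

# Restart Cassandra afterwards so it sees both data locations.
```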

Re: Text or....

2018-04-04 Thread Jon Haddad
Depending on the compression rate, I think it would generate less garbage on the Cassandra side if you compressed it client side. Something to test out. > On Apr 4, 2018, at 7:19 AM, Jeff Jirsa wrote: > > Compressing server side and validating checksums is hugely important

Re: datastax cassandra minimum hardware recommendation

2018-04-04 Thread sujeet jog
Thanks Alain On Wed, Apr 4, 2018 at 3:12 PM, Alain RODRIGUEZ wrote: > Hello. > > For questions to Datastax, I recommend you to ask them directly. I often > had a quick answer and they probably can answer this better than we do :). > > Apache Cassandra (and probably

RE: Urgent Problem - Disk full

2018-04-04 Thread Kenneth Brotman
There are also old snapshots to remove, which could account for a significant amount of disk space. -Original Message- From: Kenneth Brotman [mailto:kenbrot...@yahoo.com.INVALID] Sent: Wednesday, April 04, 2018 7:28 AM To: user@cassandra.apache.org Subject: RE: Urgent Problem - Disk full Jeff,

RE: Urgent Problem - Disk full

2018-04-04 Thread Kenneth Brotman
Jeff, Just wondering: why wouldn't the answer be to: 1. move anything you want to archive to colder storage off the cluster, 2. run nodetool cleanup, 3. take a snapshot, 4. use the DELETE command to remove the archived data. Kenneth Brotman -Original Message- From: Jeff
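A hedged sketch of steps 2-4 above (step 1, copying to colder storage, is omitted); keyspace, table, and column names are invented:

```python
# Placeholder names: keyspace "metrics", table "samples", partition key
# "sensor_id", clustering column "ts".
import datetime
import subprocess
import uuid

from cassandra.cluster import Cluster

# 2. Reclaim data this node no longer owns (relevant after topology changes).
subprocess.run(["nodetool", "cleanup", "metrics"], check=True)

# 3. Snapshot before deleting anything, so the data stays recoverable.
subprocess.run(["nodetool", "snapshot", "-t", "pre-archive-delete", "metrics"],
               check=True)

# 4. Delete the rows that were already copied to colder storage.
session = Cluster(["127.0.0.1"]).connect("metrics")
sensor_id = uuid.uuid4()                 # placeholder partition key
cutoff = datetime.datetime(2017, 1, 1)   # placeholder archive boundary
session.execute(
    "DELETE FROM samples WHERE sensor_id = %s AND ts < %s",
    (sensor_id, cutoff),
)
```

Note that the DELETE only writes tombstones; the space comes back after gc_grace_seconds, once compaction has dropped the covered data.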

Re: Text or....

2018-04-04 Thread Jeff Jirsa
Compressing server side and validating checksums is hugely important in the more frequently used versions of Cassandra - so since you probably want to run compression on the server anyway, I’m not sure why you’d compress it twice. -- Jeff Jirsa > On Apr 4, 2018, at 6:23 AM, DuyHai Doan

Re: Urgent Problem - Disk full

2018-04-04 Thread Jeff Jirsa
There is zero reason to believe a full repair would make this better, and a lot of reason to believe it’ll make it worse. For casual observers following along at home, this is probably not the answer you’re looking for. -- Jeff Jirsa > On Apr 4, 2018, at 4:37 AM, Rahul Singh

Re: Urgent Problem - Disk full

2018-04-04 Thread Jeff Jirsa
Yes, this works in TWCS. Note though that if you have tombstone compaction subproperties set, there may be sstables with newer filesystem timestamps that actually hold older Cassandra data, in which case sstablemetadata can help find the sstables with truly old timestamps. Also, if you’ve
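A small sketch of the sstablemetadata suggestion, listing the SSTables whose newest data is oldest; the path is a placeholder and the "Maximum timestamp" output parsing is an assumption that may vary between Cassandra versions:

```python
# Placeholder table directory; sstablemetadata ships with Cassandra.
import subprocess
from pathlib import Path

table_dir = Path("/var/lib/cassandra/data/metrics/samples-<table-id>")

results = []
for data_file in table_dir.glob("*-Data.db"):
    out = subprocess.run(["sstablemetadata", str(data_file)],
                         capture_output=True, text=True).stdout
    for line in out.splitlines():
        if line.startswith("Maximum timestamp"):
            # e.g. "Maximum timestamp: 1522800000000000" (microseconds)
            max_ts = int(line.split(":", 1)[1].split()[0])
            results.append((max_ts, data_file.name))
            break

# SSTables whose *newest* data is oldest are the best candidates to act on.
for max_ts, name in sorted(results)[:10]:
    print(max_ts, name)
```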

Re: Text or....

2018-04-04 Thread DuyHai Doan
Compressing client-side is better because it will save: 1) a lot of bandwidth on the network, 2) a lot of Cassandra CPU because there is no decompression server-side, and 3) a lot of Cassandra heap because the compressed blob should be relatively small (text data compresses very well) compared to the raw size. On
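A minimal sketch of what client-side compression could look like with the Python driver, assuming a hypothetical table docs(id uuid PRIMARY KEY, body blob); zlib stands in for whatever algorithm gives an acceptable ratio/speed:

```python
# Hypothetical table: CREATE TABLE my_keyspace.docs (id uuid PRIMARY KEY, body blob)
import uuid
import zlib

from cassandra.cluster import Cluster

session = Cluster(["127.0.0.1"]).connect("my_keyspace")

text = "some long application payload " * 2000    # roughly 60k characters
compressed = zlib.compress(text.encode("utf-8"))  # compress before it leaves the client

doc_id = uuid.uuid4()
session.execute("INSERT INTO docs (id, body) VALUES (%s, %s)", (doc_id, compressed))

# On read, decompress on the client; the server only ever handles the blob.
row = session.execute("SELECT body FROM docs WHERE id = %s", (doc_id,)).one()
original = zlib.decompress(row.body).decode("utf-8")
assert original == text
```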

Re: Text or....

2018-04-04 Thread Jeronimo de A. Barros
Hi, We use a pseudo file-system table where the chunks are 64 KB blobs, and we have never had any performance issue. The primary-key structure is ((file-uuid), chunk-id). Jero On Wed, Apr 4, 2018 at 9:25 AM, shalom sagges wrote: > Hi All, > > A certain application is
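A sketch of that chunked layout, with invented keyspace and table names; each row holds one 64 KB blob chunk under the ((file_uuid), chunk_id) key described above:

```python
# Invented names: keyspace "files_ks", table "file_chunks".
import uuid
from cassandra.cluster import Cluster

CHUNK_SIZE = 64 * 1024  # 64 KB per chunk, as described above

session = Cluster(["127.0.0.1"]).connect("files_ks")
session.execute("""
    CREATE TABLE IF NOT EXISTS file_chunks (
        file_uuid uuid,
        chunk_id  int,
        data      blob,
        PRIMARY KEY ((file_uuid), chunk_id)
    )
""")

def store_file(payload: bytes) -> uuid.UUID:
    """Split the payload into 64 KB chunks, one row per chunk."""
    file_id = uuid.uuid4()
    for i in range(0, len(payload), CHUNK_SIZE):
        session.execute(
            "INSERT INTO file_chunks (file_uuid, chunk_id, data) VALUES (%s, %s, %s)",
            (file_id, i // CHUNK_SIZE, payload[i:i + CHUNK_SIZE]),
        )
    return file_id

def read_file(file_id: uuid.UUID) -> bytes:
    """Chunks come back ordered by chunk_id, so joining them rebuilds the file."""
    rows = session.execute(
        "SELECT data FROM file_chunks WHERE file_uuid = %s", (file_id,))
    return b"".join(row.data for row in rows)
```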

Re: Text or....

2018-04-04 Thread Nicolas Guyomar
Hi Shalom, You might want to compress on the application side before inserting into Cassandra, using the algorithm of your choice, based on the compression ratio and speed you find acceptable for your use case. On 4 April 2018 at 14:38, shalom sagges wrote: > Thanks

Re: Text or....

2018-04-04 Thread shalom sagges
Thanks DuyHai! I'm using the default table compression. Is there anything else I should look into? Regarding table compression, I understand that for write-heavy tables it's best to keep the default and not compress further. Have I understood correctly? On Wed, Apr 4, 2018 at 3:28 PM,

RE: Urgent Problem - Disk full

2018-04-04 Thread Kenneth Brotman
Assuming the data model is good and there haven’t been any sudden jumps in memory use, it seems like the normal thing to do is archive some of the old time series data that you don’t care about. Kenneth Brotman From: Rahul Singh [mailto:rahul.xavier.si...@gmail.com] Sent: Wednesday,

Re: Text or....

2018-04-04 Thread DuyHai Doan
Compress it and store it as a blob. Unless you ever need to index it, but I guess even with SASI, indexing such a huge text block is not a good idea. On Wed, Apr 4, 2018 at 2:25 PM, shalom sagges wrote: > Hi All, > > A certain application is writing ~55,000 characters for a

Text or....

2018-04-04 Thread shalom sagges
Hi All, A certain application is writing ~55,000 characters for a single row. Most of these characters are written to one column with the "text" data type. This looks insanely large for one row. Would you suggest changing the data type from "text" to BLOB, or is there any other option that might fit this

Re: Urgent Problem - Disk full

2018-04-04 Thread Rahul Singh
Nothing a full repair won’t be able to fix. On Apr 4, 2018, 7:32 AM -0400, Jürgen Albersdorfer , wrote: > Hi, > > I have an urgent Problem. - I will run out of disk space in near future. > Largest Table is a Time-Series Table with

Urgent Problem - Disk full

2018-04-04 Thread Jürgen Albersdorfer
Hi, I have an urgent problem - I will run out of disk space in the near future. The largest table is a time-series table with TimeWindowCompactionStrategy (TWCS) and default_time_to_live = 0; the keyspace replication factor is RF=3. I run C* version 3.11.2. We have grown the cluster over time, so SSTable files

Re: datastax cassandra minimum hardware recommendation

2018-04-04 Thread Rahul Singh
Agree with Alain. Remember that DSE is not Cassandra. It includes Cassandra, Solr, Spark, and Graph, so if you run all or some of them, it's more than just Cassandra. OpsCenter is another thing altogether. -- Rahul Singh rahul.si...@anant.us Anant Corporation On Apr 4, 2018, 5:42 AM -0400, Alain

Re: datastax cassandra minimum hardware recommendation

2018-04-04 Thread Alain RODRIGUEZ
Hello. For questions to Datastax, I recommend asking them directly. I have often had a quick answer, and they can probably answer this better than we do :). Apache Cassandra (and probably DSE Cassandra) can work with 8 CPUs (and fewer!). I would not go much lower, though. I believe the memory amount

datastax cassandra minimum hardware recommendation

2018-04-04 Thread sujeet jog
The Datastax site has a hardware recommendation of 16 CPUs / 32 GB RAM for DSE Enterprise. Any idea what the minimum supported hardware recommendation is? Can each node be 8 CPUs and still be covered by support?