Is there any reason not to use TTL? No compaction strategy is going to cope with frequent massive deletions. In fact, a queue-like data model is a Cassandra anti-pattern.
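
For example, a minimal sketch of what that could look like (the keyspace and
table names are placeholders, and 172800 seconds is 48 hours):

    # Hypothetical names; a default TTL makes the explicit delete job
    # unnecessary for data written after the change (existing rows keep
    # whatever TTL, or lack of one, they were written with).
    cqlsh -e "ALTER TABLE my_ks.my_table WITH default_time_to_live = 172800;"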

On 17/09/2021 23:54, Abdul Patel wrote:
TWCS is best for TTL, not for explicit deletes, correct?


On Friday, September 17, 2021, Abdul Patel <abd786...@gmail.com> wrote:

    The 48-hour deletion job deletes data older than 48 hours.
    LCS was used as it's more of a write-once, read-many application.

    On Friday, September 17, 2021, Bowen Song <bo...@bso.ng> wrote:

        Congratulations! You've just found the cause. Does all data
        get deleted 48 hours after it is inserted? If so, are you
        sure LCS is the right compaction strategy for this table?
        TWCS sounds like a much better fit for this purpose.
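
        A rough sketch of the switch, with placeholder keyspace/table names
        and an illustrative 4-hour window:

            # TWCS groups SSTables by write time, so fully expired windows
            # can be dropped as whole files instead of being compacted away.
            cqlsh -e "ALTER TABLE my_ks.my_table
                      WITH compaction = {'class': 'TimeWindowCompactionStrategy',
                                         'compaction_window_unit': 'HOURS',
                                         'compaction_window_size': 4};"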

        On 17/09/2021 19:16, Abdul Patel wrote:
        Thanks.
        The application deletes data older than 48 hours.
        Auto compaction works, but as the disk is full, the error
        log only says there is not enough space to run compaction.


        On Friday, September 17, 2021, Bowen Song <bo...@bso.ng> wrote:

            If major compaction is failing due to disk space
            constraints, you could copy the files to another server
            and run a major compaction there instead (i.e. start
            Cassandra on the new server without joining the existing
            cluster). If you must replace the node, at least use the
            '-Dcassandra.replace_address=...' parameter instead of
            'nodetool decommission' followed by re-adding, because
            the latter changes the token ranges on the node, and
            that makes troubleshooting harder.
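
            As a rough sketch only, assuming the packaged install paths and a
            placeholder address for the node being replaced:

                # On the new (replacement) node, before its first start.
                echo 'JVM_OPTS="$JVM_OPTS -Dcassandra.replace_address=10.0.0.12"' \
                    >> /etc/cassandra/cassandra-env.sh
                systemctl start cassandra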

            22GB of data amplifying to nearly 300GB sounds very
            unlikely to me; there must be something else going on.
            Have you turned off auto compaction? Did you change the
            default parameters (namely 'fanout_size') for LCS?
            If that doesn't give you a clue, have a look at the
            SSTable data files: do you notice anything unusual? For
            example, too many small files, or some files that are
            extraordinarily large. Also have a look at the logs; is
            there anything unusual there? And do you know the
            application logic? Does it do a lot of deletes or
            updates (including 'upsert')? Writes with TTL? Does the
            table have a default TTL?
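
            A few quick checks, with placeholder names and the default package
            data path:

                nodetool compactionstats                  # is compaction running or backed up?
                nodetool tablestats my_ks.my_table        # SSTable count, space used, tombstones
                cqlsh -e "DESCRIBE TABLE my_ks.my_table;" # compaction options and default_time_to_live
                ls -lhS /var/lib/cassandra/data/my_ks/my_table-*/*-Data.db | head  # odd file sizes?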

            On 17/09/2021 13:45, Abdul Patel wrote:
            Close to 300GB of data. Nodetool decommission/removenode
            and added back one node, and it came back to 22GB.
            Can't run major compaction as there isn't much space left.

            On Friday, September 17, 2021, Bowen Song <bo...@bso.ng> wrote:

                Okay, so how big exactly is the data on disk? You
                said removing and adding a new node gives you 20GB
                on disk; was that done via the
                '-Dcassandra.replace_address=...' parameter? If not,
                the new node will almost certainly have a different
                token range and is not directly comparable to the
                existing node if you have uneven partitions or a
                small number of partitions in the table. Also, try a
                major compaction; it's a lot easier than replacing a
                node.
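
                For reference, the major compaction itself is just (keyspace
                and table names are placeholders):

                    # Needs enough free disk headroom to rewrite the SSTables.
                    nodetool compact my_ks my_table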


                On 17/09/2021 12:28, Abdul Patel wrote:
                Yes, I checked and cleared all snapshots, and I also
                had incremental backups in the backup folder, which
                I removed. It's purely data.


                On Friday, September 17, 2021, Bowen Song <bo...@bso.ng> wrote:

                    Assuming your total disk space is a lot bigger
                    than 50GB (accounting for disk space
                    amplification, commit log, logs, OS data, etc.),
                    I would suspect the disk space is being used by
                    something else. Have you checked that the space
                    is actually being used by the Cassandra data
                    directory? If so, have a look at the
                    'nodetool listsnapshots' output as well.
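
                    Something along these lines, assuming the default package
                    data locations:

                        du -sh /var/lib/cassandra/data/*     # is the space under the data directory?
                        du -sh /var/lib/cassandra/commitlog  # the commit log can grow too
                        nodetool listsnapshots               # snapshots pin old SSTables via hard links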


                    On 17/09/2021 05:48, Abdul Patel wrote:

                        Hello,

                        We have Cassandra with LeveledCompactionStrategy,
                        and recently found the filesystem almost 90%
                        full, but the data is only 10M records. Will
                        manual compaction work? We're not sure it's
                        recommended, and space is also a constraint.
                        We tried removing and adding one node, and the
                        data is now at 20GB, which looks appropriate.
                        So is the only way to reclaim space to
                        remove/add a node?
