Tombstones are not generated only by deletes. They are also created when:

- You insert or fully update a non-frozen collection, for example by replacing the value of the column with another value, like UPDATE table SET field = new_value …. Cassandra then inserts a tombstone marker to prevent possible overlap with previous data, even if no data previously existed. A large number of tombstones can significantly affect read performance.
- You insert null explicitly, instead of using UNSET for missing data.
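To make the two cases above concrete, here is a minimal sketch of a write-side helper. It is an illustration only: `build_update`, its parameters, and the table and column names in the usage example are hypothetical and not taken from the application discussed in this thread. It only builds the CQL text and the bind values, it does not talk to a cluster. Columns whose value is None are left out of the statement instead of being bound as null, and collection columns listed in `append_to` are written with `col = col + ?` rather than being replaced as a whole.

```python
from typing import Any, Dict, Iterable, List, Tuple


def build_update(
    table: str,
    key: Dict[str, Any],
    values: Dict[str, Any],
    append_to: Iterable[str] = (),
) -> Tuple[str, List[Any]]:
    """Build a parameterized CQL UPDATE that avoids needless tombstones.

    Hypothetical helper for illustration; returns (cql_text, bind_values).
    """
    append_cols = set(append_to)
    assignments: List[str] = []
    params: List[Any] = []
    for column, value in values.items():
        if value is None:
            # Binding null would write a tombstone cell, so skip the column.
            continue
        if column in append_cols:
            # "col = col + ?" adds elements without replacing the collection.
            assignments.append(f"{column} = {column} + ?")
        else:
            assignments.append(f"{column} = ?")
        params.append(value)
    if not assignments:
        raise ValueError("nothing to update")
    where = " AND ".join(f"{column} = ?" for column in key)
    params.extend(key.values())
    cql = f"UPDATE {table} SET {', '.join(assignments)} WHERE {where}"
    return cql, params


if __name__ == "__main__":
    cql, params = build_update(
        "example_table",
        key={"container_id": "c1"},
        values={"user_agent": "browser-mobile", "referrer": None, "tags": ["a"]},
        append_to=["tags"],
    )
    print(cql)
    print(params)
```

The trade-off is that skipping None means this helper can never clear a column that already holds a value; where that is really needed, an explicit null or DELETE (and the tombstone that comes with it) is still required.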
On Fri, Aug 21, 2020 at 7:57 AM Attila Wind <attilaw@swf.technology> wrote:

> Hi Cassandra Gurus,
>
> Recently I captured a very interesting warning in the logs saying
>
> 2020-08-19 08:08:32.492 [cassandra-client-keytiles_data_webhits-nio-worker-2] WARN com.datastax.driver.core.RequestHandler - Query '[3 bound values] select * from visit_session_by_start_time_v4 where container_id=? and first_action_time_frame_id >= ? and first_action_time_frame_id <= ?;' generated server side warning(s): *Read 6628 live rows and 6628 tombstone cells* for query SELECT * FROM keytiles_data_webhits.visit_session_by_start_time_v4 WHERE container_id = 5YzsPfE2Gcu8sd-76626 AND first_action_time_frame_id > 443837 AND first_action_time_frame_id <= 443670 AND user_agent_type > browser-mobile AND unique_webclient_id > 045d1683-c702-48bd-9d2b-dcf1ca87ac7c AND first_action_ts > 1597815766 LIMIT 6628 (see tombstone_warn_threshold)
>
> What makes this interesting to me is the fact that we never issue any kind of deletes against this table for now, not even row level deletes.
> So I'm wondering what can result in tombstone creation in Cassandra, apart from explicit DELETE queries and TTL setup...
>
> My suspicion is (but I'm not sure) that as we are going with a "select *" read strategy, then calculating everything in-memory, and eventually writing back with kinda "update *" queries to Cassandra in this table (so not updating just a few columns but everything), this can lead to these... Can it?
> I tried to search around this symptom but was not successful, so I decided to ask you guys, maybe someone can give us a pointer...
>
> Some more info:
>
> - the table does not have TTL set - this mechanism is turned off
> - the LIMIT param in the query above comes from the paging size
> - we are using Cassandra4 alpha3
> - we also have a few similarly built tables where we follow the above described "update *" policy on the write path - however those tables are counter tables... when we mass-read them into memory we also go with "select *" logic, reading up tons of rows. The point is we never saw such a warning for these counter tables, however we are handling them in the same fashion... ok, counter tables work differently, but it is still interesting to me why those never generated things like this
>
> thanks!
> --
> Attila Wind
>
> http://www.linkedin.com/in/attilaw
> Mobile: +49 176 43556932

--
With best wishes, Alex Ott
http://alexott.net/
Twitter: alexott_en (English), alexott (Russian)