Storing your data as a single blob can be dangerous. Indeed, you lose the ability to perform atomic updates on each column.
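To make the risk concrete, here is a minimal Python sketch that simulates last-write-wins reconciliation per cell. It is not Cassandra code: the `lww` and `merge_rows` helpers, the timestamps and the values are made up for illustration. It assumes both writers read the same initial row before writing, and it does not model timestamp ties.

```python
from typing import Any, Dict, Tuple

Cell = Tuple[int, Any]  # (write timestamp, value)


def lww(a: Cell, b: Cell) -> Cell:
    """Last write wins: keep the cell with the higher timestamp."""
    return b if b[0] > a[0] else a


def merge_rows(current: Dict[str, Cell], update: Dict[str, Cell]) -> Dict[str, Cell]:
    """Reconcile an update into a row, one column (cell) at a time."""
    merged = dict(current)
    for column, cell in update.items():
        merged[column] = lww(merged[column], cell) if column in merged else cell
    return merged


# Separate columns: each update writes only the column it changes.
base = {"firstname": (0, "old Firstname"), "lastname": (0, "old Lastname")}
update1 = {"firstname": (1, "new Firstname")}
update2 = {"lastname": (2, "new Lastname")}
per_column = merge_rows(merge_rows(base, update1), update2)

# Single blob: each writer read the base row, changed one field,
# and wrote the whole record back as one cell.
blob_base = {"person": (0, ("old Firstname", "old Lastname"))}
blob_update1 = {"person": (1, ("new Firstname", "old Lastname"))}
blob_update2 = {"person": (2, ("old Firstname", "new Lastname"))}
as_blob = merge_rows(merge_rows(blob_base, blob_update1), blob_update2)

print({column: value for column, (_, value) in per_column.items()})
# {'firstname': 'new Firstname', 'lastname': 'new Lastname'}
print(as_blob["person"][1])
# ('old Firstname', 'new Lastname')
```

With separate columns both changes survive. With a blob, the later write silently brings back the old Firstname.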
In Cassandra, LWW (last write wins) is the rule. Suppose there are 2 concurrent updates on the same row (let's say it's a Person record): the 1st update changes column Firstname and the 2nd update changes column Lastname.

Now, with a single blob, depending on the timestamps of the 2 updates, you'll have either:

- old Firstname, new Lastname
- new Firstname, old Lastname

Updating the columns individually guarantees that you end up with new Firstname, new Lastname.

On Fri, Jan 4, 2019 at 8:17 PM Jonathan Haddad <j...@jonhaddad.com> wrote:

> Those are two different cases though. It *sounds like* (again, I may be missing the point) you're trying to overwrite a value with another value. You're either going to serialize a blob and overwrite a single cell, or you're going to overwrite all the cells and include a tombstone.
>
> When you do a read, reading a single tombstone vs a single value is essentially the same thing, performance wise.
>
> In your description you said "~ 20-100 events", and you're overwriting the event each time, so I don't know how you got to 10K tombstones either. Compaction will bring multiple tombstones together for a cell in the same way it compacts multiple values for a single cell.
>
> It sounds to me like you're taking some advice about tombstones out of context and trying to apply the advice to a different problem. Again, I might be misunderstanding what you're doing.
>
> On Fri, Jan 4, 2019 at 10:49 AM Tomas Bartalos <tomas.barta...@gmail.com> wrote:
>
>> Hello Jon,
>>
>> I thought having tombstones is much higher overhead than just overwriting values. The compaction overhead can be similar, but I think the read performance is much worse.
>>
>> Tombstones accumulate and hang around for 10 days (by default) before they are eligible for compaction.
>>
>> Also we have tombstone warning and error thresholds. If Cassandra scans more than 10 000 tombstones, she will abort the query.
>>
>> According to this article:
>> https://opencredo.com/blogs/cassandra-tombstones-common-issues/
>>
>> "The cassandra.yaml comments explain it perfectly: *“When executing a scan, within or across a partition, we need to keep the tombstones seen in memory so we can return them to the coordinator, which will use them to make sure other replicas also know about the deleted rows. With workloads that generate a lot of tombstones, this can cause performance problems and even exhaust the server heap.”*"
>>
>> Regards,
>> Tomas
>>
>> On Fri, 4 Jan 2019, 7:06 pm Jonathan Haddad <j...@jonhaddad.com> wrote:
>>
>>> If you're overwriting values, it really doesn't matter much if it's a tombstone or any other value, they still need to be compacted and have the same overhead at read time.
>>>
>>> Tombstones are problematic when you try to use Cassandra as a queue (or something like a queue) and you need to scan over thousands of tombstones in order to get to the real data. You're simply overwriting a row and trying to avoid a single tombstone.
>>>
>>> Maybe I'm missing something here. Why do you think overwriting a single cell with a tombstone is any worse than overwriting a single cell with a value?
>>>
>>> Jon
>>>
>>> On Fri, Jan 4, 2019 at 9:57 AM Tomas Bartalos <tomas.barta...@gmail.com> wrote:
>>>
>>>> Hello,
>>>>
>>>> I believe your approach is the same as using Spark with "spark.cassandra.output.ignoreNulls=true". This will not cover the situation when a value has to be overwritten with null.
>>>>
>>>> I found one possible solution - change the schema to keep only primary key fields and move all other fields to a frozen UDT.
>>>> create table (year, month, day, id, frozen<Event>, primary key((year, month, day), id) )
>>>> In this way anything that is null inside event doesn't create a tombstone, since event is serialized to BLOB.
>>>> The penalty is the need to deserialize the whole Event when selecting only a few columns.
>>>> Can anyone confirm if this is a good solution performance wise?
>>>>
>>>> Thank you,
>>>>
>>>> On Fri, 4 Jan 2019, 2:20 pm DuyHai Doan <doanduy...@gmail.com> wrote:
>>>>
>>>>> "The problem is I can't know the combination of set/unset values" --> Just for this requirement, Achilles has had a working solution for many years using the INSERT_NOT_NULL_FIELDS strategy:
>>>>>
>>>>> https://github.com/doanduyhai/Achilles/wiki/Insert-Strategy
>>>>>
>>>>> Or you can use the Update API, which by design only performs updates on non-null fields:
>>>>> https://github.com/doanduyhai/Achilles/wiki/Quick-Reference#updating-all-non-null-fields-for-an-entity
>>>>>
>>>>> Behind the scenes, for each new combination of INSERT INTO table(x,y,z) statement, Achilles will check its prepared statement cache and, if the statement does not exist yet, create a new prepared statement and put it into the cache for later re-use.
>>>>>
>>>>> Disclaimer: I'm the creator of Achilles
>>>>>
>>>>> On Thu, Dec 27, 2018 at 10:21 PM Tomas Bartalos <tomas.barta...@gmail.com> wrote:
>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> The problem is I can't know the combination of set/unset values. From my perspective every value should be set. The event from Kafka represents the complete state of the happening at a certain point in time. In my table I want to store the latest event, so the most recent state of the happening (in this table I don't care about the history). Actually I used the wrong expression, since it's just the opposite of "incremental update": every event carries all data (state) for a specific point in time.
>>>>>>
>>>>>> The event is represented with a nested json structure.
>>>>>> Top level elements of the json are table fields with types like text, boolean, timestamp, list, and the nested elements are UDT fields.
>>>>>>
>>>>>> Simplified example:
>>>>>> There is a new purchase for the happening, event:
>>>>>> {total_amount: 50, items : [A, B, C, new_item], purchase_time : '2018-12-27 13:30', specials: null, customer : {... }, fare_amount,...}
>>>>>> I don't know what actually happened for this event: maybe there is a new item purchased, maybe some customer info has been changed, maybe the specials have been revoked and I have to reset them. I just need to store the state as it arrived from Kafka; there might already be an event for this happening saved before, or maybe this is the first one.
>>>>>>
>>>>>> BR,
>>>>>> Tomas
>>>>>>
>>>>>> On Thu, 27 Dec 2018, 9:36 pm Eric Stevens <migh...@gmail.com> wrote:
>>>>>>
>>>>>>> Depending on the use case, creating separate prepared statements for each combination of set / unset values in large INSERT/UPDATE statements may be prohibitive.
>>>>>>>
>>>>>>> Instead, you can look into driver level support for UNSET values. Requires Cassandra 2.2 or later IIRC.
>>>>>>>
>>>>>>> See:
>>>>>>> Java Driver: https://docs.datastax.com/en/developer/java-driver/3.0/manual/statements/prepared/#parameters-and-binding
>>>>>>> Python Driver: https://www.datastax.com/dev/blog/python-driver-2-6-0-rc1-with-cassandra-2-2-features#distinguishing_between_null_and_unset_values
>>>>>>> Node Driver: https://docs.datastax.com/en/developer/nodejs-driver/3.5/features/datatypes/nulls/#unset
>>>>>>>
>>>>>>> On Thu, Dec 27, 2018 at 3:21 PM Durity, Sean R <sean_r_dur...@homedepot.com> wrote:
>>>>>>>
>>>>>>>> You say the events are incremental updates. I am interpreting this to mean only some columns are updated. Others should keep their original values.
>>>>>>>>
>>>>>>>> You are correct that inserting null creates a tombstone.
>>>>>>>>
>>>>>>>> Can you only insert the columns that actually have new values? Just skip the columns with no information. (Make the insert generator a bit smarter.)
>>>>>>>>
>>>>>>>> CREATE TABLE happening (id text PRIMARY KEY, event text, a text, b text, c text);
>>>>>>>> INSERT INTO happening (id, event, a, b, c) VALUES ('MainEvent', 'The most complete info we have right now', 'Priceless', '10 pm', 'Grand Ballroom');
>>>>>>>> -- b changes
>>>>>>>> INSERT INTO happening (id, b) VALUES ('MainEvent', '9:30 pm');
>>>>>>>>
>>>>>>>> Sean Durity
>>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: Tomas Bartalos <tomas.barta...@gmail.com>
>>>>>>>> Sent: Thursday, December 27, 2018 9:27 AM
>>>>>>>> To: user@cassandra.apache.org
>>>>>>>> Subject: [EXTERNAL] Howto avoid tombstones when inserting NULL values
>>>>>>>>
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> I’d start with describing my use case and how I’d like to use Cassandra to solve my storage needs.
>>>>>>>> We're processing a stream of events for various happenings. Every event has a unique happening_id.
>>>>>>>> One happening may have many events, usually ~ 20-100 events. I’d like to store only the latest event for the same happening (Event is an incremental update and it contains all up-to-date data about the happening).
>>>>>>>> Technically the events are streamed from Kafka, processed with Spark and saved to Cassandra.
>>>>>>>> In Cassandra we use upserts (insert with same primary key). So far so good, however there comes the tombstone...
>>>>>>>>
>>>>>>>> When I’m inserting a field with a NULL value, Cassandra creates a tombstone for this field.
>>>>>>>> As I understood, this is due to space efficiency: Cassandra doesn’t have to remember there is a NULL value, she just deletes the respective column and a delete creates a ... tombstone.
>>>>>>>> I was hoping there could be an option to tell Cassandra not to be so space efficient and store "unset" info without generating tombstones.
>>>>>>>> Something similar to inserting empty strings instead of null values:
>>>>>>>>
>>>>>>>> CREATE TABLE happening (id text PRIMARY KEY, event text);
>>>>>>>> INSERT INTO happening (id, event) VALUES ('1', 'event1');
>>>>>>>> -- tombstone is generated
>>>>>>>> INSERT INTO happening (id, event) VALUES ('1', null);
>>>>>>>> -- tombstone is not generated
>>>>>>>> INSERT INTO happening (id, event) VALUES ('1', '');
>>>>>>>>
>>>>>>>> Possible solutions:
>>>>>>>> 1. Disable tombstones with gc_grace_seconds = 0 or set it to a reasonably low value (1 hour?). Not good, since phantom data may re-appear.
>>>>>>>> 2. Ignore NULLs on the Spark side with "spark.cassandra.output.ignoreNulls=true". Not good, since this will never overwrite a previously inserted event field with an "empty" one.
>>>>>>>> 3. On inserts with Spark, find all NULL values and replace them with an "empty" equivalent (empty string for text, 0 for integer). Very inefficient, and it is problematic to find an "empty" equivalent for some data types.
>>>>>>>>
>>>>>>>> Until tombstones appeared Cassandra was the right fit for our use case, however now I’m not sure if we’re heading in the right direction.
>>>>>>>> Could you please give me some advice on how to solve this problem?
>>>>>>>>
>>>>>>>> Thank you,
>>>>>>>> Tomas
>>>>>>>>
>>>>>>>> ---------------------------------------------------------------------
>>>>>>>> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
>>>>>>>> For additional commands, e-mail: user-h...@cassandra.apache.org
>>>>>>>>
>>>
>>> --
>>> Jon Haddad
>>> http://www.rustyrazorblade.com
>>> twitter: rustyrazorblade