Re: Re: Dynamic Columns

Sylvain Lebresne Wed, 21 Jan 2015 13:08:09 -0800

On Wed, Jan 21, 2015 at 6:19 PM, Peter Lin <wool...@gmail.com> wrote:

> the dynamic column can't be part of the primary key. The temporal entity
> key can be the default UUID or the user can choose the field in their
> object. Within our framework, we have a concept of temporal links between one
> or more temporal entities. Polluting the primary key with the dynamic column
> wouldn't work.
>

Not totally sure I understand. Are you talking about the underlying storage
space used? If you are, we can discuss it (it's not too hard to remedy it
in CQL; I was mainly trying to illustrate my point, not pretending this
was a drop-in solution for your use case) but it's more of a performance
discussion, and I think we've somewhat left the realm of "there's things
CQL3 doesn't support".

> Please excuse the confusing RDB comparison. My point is that Cassandra's
> dynamic column feature is the "unique" feature that makes it better than
> traditional RDB or newSql like VoltDB for building temporal databases. With
> databases that require static schema + alter table for managing schema
> evolution, it makes it harder and results in down time.
>

Here again you seem to imply that CQL doesn't support dynamic columns, or
has somewhat inferior support for them, but that's just not true.
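
As a rough sketch only, assuming a table shaped like the table t from my
earlier mail quoted further down (one row per dynamic column, clustered under
the partition key), and with blob values that are made up purely for
illustration, adding a brand new dynamic column needs no ALTER TABLE, just an
INSERT:

```cql
-- All values below are hypothetical; in practice your helper classes
-- would produce the serialized blobs.

-- Set a static (per-partition) column.
INSERT INTO t (key, my_static_column_1) VALUES (0x01, 42);

-- Add two "dynamic columns" to the same partition. No schema change is
-- needed: each one is simply a new row under the same partition key.
INSERT INTO t (key, dynamic_column_name, dynamic_column_value)
VALUES (0x01, 0xaa, 0xcafe);
INSERT INTO t (key, dynamic_column_name, dynamic_column_value)
VALUES (0x01, 0xbb, 0xbabe);

-- Read one dynamic column by name...
SELECT dynamic_column_value FROM t
WHERE key = 0x01 AND dynamic_column_name = 0xaa;

-- ...or all of them, ordered by dynamic_column_name (the clustering
-- column), together with the static value shared by the partition.
SELECT dynamic_column_name, dynamic_column_value, my_static_column_1 FROM t
WHERE key = 0x01;
```

The second SELECT is roughly the equivalent of slicing a whole row in thrift.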

> One of the challenges of data management over time is evolving the data
> model and making queries simple. If the record is 5 years old, it probably
> has a different schema than a record inserted this week. With temporal
> databases, every update is an insert, so it's a little bit more complex
> than just "use a blob". There's a whole level of complication with temporal
> data and CQL3 custom types isn't clear to me. I've read the CQL3
> documentation on the custom types several times and it is rather poor. It
> gives me the impression there's still work needed to get custom types in
> good shape.
>

I'm sorry but that's a bit of hand waving. Custom types (and by that I mean
user-provided AbstractType implementations) work in CQL *exactly* like in
thrift: they are not in a better or worse shape than in thrift. And while
the documentation on CQL3 is indeed poor on this part, so is the thrift
documentation on the same subject (besides, I don't think your whole
point is that the documentation could be improved). Again, what
you can do in thrift, you can do in CQL.

> I consistently recommend new users learn and understand both Thrift and
> CQL.
>

I understand that you do this with the best of intentions, and don't take it
the wrong way, but it is my opinion that you are being counterproductive by
doing so, for 2 reasons:
1) you don't only recommend that users learn both APIs, you justify that
advice by asserting that there is a whole family of important use cases
that thrift supports and CQL does not. Except that I contend that this
assertion is technically incorrect, and so far I haven't seen any
example proving me wrong.
2) there is a wealth of evidence that trying to learn both thrift and CQL
confuses the hell out of new users. Which is btw not surprising: both APIs
present the same concepts in seemingly different ways (even though they
really are the same concepts) and even have conflicting vocabulary, so it's
obviously confusing when you try to learn those concepts in the first
place. Learning CQL when you know thrift well is fine, and why not
learn thrift once you know and understand CQL well, but learning both at
once is imo bad advice. It could maybe (maybe) be justified if what you say
about a whole family of use cases not being doable with CQL was true, but
it's not.

--
Sylvain

>
>
>
> On Wed, Jan 21, 2015 at 11:45 AM, Sylvain Lebresne <sylv...@datastax.com>
> wrote:
>
>> On Wed, Jan 21, 2015 at 4:44 PM, Peter Lin <wool...@gmail.com> wrote:
>>
>>> I don't remember other people's examples in detail due to my shitty
>>> memory, so I'd rather not misquote.
>>>
>>
>> Fair enough, but maybe you shouldn't use "people's examples you don't
>> remember" as an argument then. Those examples might be wrong or outdated and
>> that kind of stuff creates confusion for everyone.
>>
>>
>>>
>>> In my case, I mix static and dynamic columns in a single column family
>>> with primitives and objects. The objects are temporal object graphs with a
>>> known type. Doing this type of stuff is basically transparent for me, since
>>> I'm using thrift and our data modeler generates helper classes. Our tooling
>>> seamlessly converts the bytes back to the target object. We have a few
>>> standard static columns related to temporal metadata. At any time, dynamic
>>> columns can be added and they can be primitives or objects.
>>>
>>
>> I don't see anything in that that cannot be done with CQL. You can mix
>> static and dynamic columns in CQL thanks to static columns. More precisely,
>> you can do what you're describing with a table looking a bit like this:
>>   CREATE TABLE t (
>>     key blob,
>>     my_static_column_1 int static,
>>     my_static_column_2 float static,
>>     my_static_column_3 blob static,
>>     ....,
>>     dynamic_column_name blob,
>>     dynamic_column_value blob,
>>     PRIMARY KEY (key, dynamic_column_name)
>>   );
>>
>> And your helper classes will serialize your objects as they probably do
>> today (if you use a custom comparator, you can do that too). And let it be
>> clear that I'm not pretending that doing it this way is tremendously
>> simpler than thrift. But I'm saying that 1) it's possible and 2) while it's
>> not meaningfully simpler than thrift, it's not really harder either (and
>> in fact, it's actually less verbose with CQL than with raw thrift).
>>
>>
>>>
>>> For the record, doing this kind of stuff in a relational database sucks
>>> horribly.
>>>
>>
>> I don't know what that has to do with CQL to be honest. If you're doing
>> relational with CQL you're doing it wrong. And please note that I'm not
>> saying CQL is the perfect API for modeling temporal data. But I don't get
>> how thrift, which is a very crude API, is a much better API at that than CQL
>> (or, again, how it allows you to do things you can't with CQL).
>>
>> --
>> Sylvain
>>
>
>
