I'm looking for some best practices w/r/t supporting arbitrary columns. It
seems from the docs I've read around CQL that they are supported in some
capacity via collections but you can't exceed 64K in size. For my
requirements that would cause problems.
So my questions are:
1) Is using Thrift a
Hello Mark
Dynamic columns, as you said, are perfectly supported by CQL3 via
clustering columns. And no, using collections for storing dynamic data is a
very bad idea if the cardinality is very high (> 1000 elements)
1) Is using Thrift a valid approach in the era of CQL? -- Less and less.
Thanks DuyHai,
I have a follow up question to #2. You mentioned ideally I would create a
new table instead of mutating an existing one.
This strikes me as bad practice in the world of multi tenant systems. I
don't want to create a table per customer. So I'm wondering if dynamically
modifying the
On Fri, Jun 13, 2014 at 11:54 AM, DuyHai Doan doanduy...@gmail.com wrote:
Dynamic columns, as you said, are perfectly supported by CQL3 via
clustering columns.
Perfectly supported seems a bit expansive as a claim. They are not quite
the same thing as actual dynamic columns and are
This strikes me as bad practice in the world of multi tenant systems. I
don't want to create a table per customer. So I'm wondering if dynamically
modifying the table is an accepted practice? -- Can you give some details
about your use case ? How would you alter a table structure to adapt it
to a
I like CQL, but it's not a hammer.
If thrift is more appropriate for you, then use it. If Cassandra gets to
the point where Thrift is removed, I'll just fork Cassandra. That's what's
great about open source.
On Fri, Jun 13, 2014 at 3:47 PM, DuyHai Doan doanduy...@gmail.com wrote:
This strikes
My use case requires the support of arbitrary columns much like a CRM. My
users can define 'custom' fields within the application. Ideally I wouldn't
have to change the schema at all, which is why I like the old thrift
approach rather than the CQL approach.
Having said all that, I'd be willing to
Like you, I make extensive use of dynamic columns for similar reasons.
In our project, one of the goals is to give end users the ability to
design their own schema without having to alter a table. If people really
want strong schema, then just use old Sql or NewSql. RDB gives you the full
power
Hi Mark
I believe that in your table you want to have some common fields that
will be there whatever the customer is, and other fields that are entirely
customer-dependent, is that right?
In this case, creating a table with static columns for the common fields
and a clustering column representing all
Yeah I don't anticipate more than 1000 properties, well under in fact. I
guess the trade off of using the clustered columns is that I'd have a table
that would be tall and skinny which also has its challenges w/r/t memory.
I'll look into your suggestion a bit more and consider some others around
Well, before discussing dynamic columns, we should first define them
clearly. What do people mean by dynamic columns exactly?
Is it the ability to add many columns of the same type to an existing
physical row? If yes then CQL3 does support it with clustering columns.
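
A rough illustration of that point, not taken from the thread: the Python sketch below models one physical row (partition) whose cells are kept sorted by a clustering column, so new "columns" can be added without a schema change. The table name, the column names, and the `Partition` class are all hypothetical, and every value is text because this only covers the "many columns of the same type" case.

```python
from bisect import insort

# Hypothetical CQL3 schema that this sketch models (names are illustrative):
#   CREATE TABLE customer_properties (
#       customer_id    text,
#       property_name  text,
#       property_value text,
#       PRIMARY KEY ((customer_id), property_name)
#   );
# customer_id is the partition key; property_name is the clustering column.


class Partition:
    """One physical row: cells kept sorted by clustering column value."""

    def __init__(self):
        self._names = []  # clustering column values, kept in sorted order
        self._cells = {}  # property_name -> property_value

    def upsert(self, name, value):
        # A new property is just a new cell; no schema change is needed.
        if name not in self._cells:
            insort(self._names, name)
        self._cells[name] = value

    def slice(self, start, end):
        """Return (name, value) pairs with start <= name <= end, in order."""
        return [(n, self._cells[n]) for n in self._names if start <= n <= end]


table = {}  # partition key -> Partition


def insert(customer_id, property_name, property_value):
    table.setdefault(customer_id, Partition()).upsert(property_name, property_value)


insert("acme", "region", "EMEA")
insert("acme", "account_tier", "gold")
insert("acme", "renewal_date", "2015-01-01")

rows = table["acme"].slice("r", "s")
```

Here `rows` holds the `region` and `renewal_date` cells in clustering order, which is roughly what a range query on the clustering column within one partition would return.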
On Fri, Jun
When I say dynamic column, I mean non-static columns of different types
within the same row. Some could be an object or one of the defined
datatypes.
With Thrift I use the appropriate serializer to handle these dynamic
columns.
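
As a minimal sketch of that idea (an illustration only, not the poster's actual code): values of different types are stored as raw bytes and the application picks the serializer from the value's type. The registry, the one-byte type tag, and the helper names are assumptions made so the example is self-describing; the thread only says that the application code knows which serializer to use.

```python
import struct

# Hypothetical registry: Python type -> (tag byte, encoder, decoder).
# The tag byte is an assumption of this sketch, not something from the thread.
SERIALIZERS = {
    int: (b"i", lambda v: struct.pack(">q", v), lambda b: struct.unpack(">q", b)[0]),
    float: (b"d", lambda v: struct.pack(">d", v), lambda b: struct.unpack(">d", b)[0]),
    str: (b"s", lambda v: v.encode("utf-8"), lambda b: b.decode("utf-8")),
}
DECODERS = {tag: decode for tag, _encode, decode in SERIALIZERS.values()}


def to_bytes(value):
    """Encode a value as tag + payload, choosing the serializer by type."""
    tag, encode, _decode = SERIALIZERS[type(value)]
    return tag + encode(value)


def from_bytes(raw):
    """Decode a tagged byte string back into a typed value."""
    return DECODERS[raw[:1]](raw[1:])


# One "row" holding columns of different types, all stored as bytes.
row = {
    "age": to_bytes(42),
    "score": to_bytes(3.5),
    "name": to_bytes("Mark"),
}
decoded = {column: from_bytes(raw) for column, raw in row.items()}
```

The store only ever sees bytes; all type knowledge lives in the client, which is the trade-off discussed in the replies that follow.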
On Fri, Jun 13, 2014 at 4:55 PM, DuyHai Doan doanduy...@gmail.com
In thrift, when creating a column family, you need to define
1) the row/partition key type
2) the column comparator type
3) the validation type for the actual value (cell in CQL3 terminology)
Unless you use the dynamic composites feature, which does not exist (and
probably won't) in CQL3, I don't
the validation type is set to bytes, and my code is type safe, so it knows
which serializers to use. Those dynamic columns are driven off the types in
Java.
Having said that, CQL3 does have a new custom type feature, but the
documentation is basically non-existent on how that actually works. One
the validation type is set to bytes, and my code is type safe, so it knows
which serializers to use. Those dynamic columns are driven off the types in
Java. -- Correct. However, you are still bound by the column comparator
type which should be fixed (unless again you set it to bytes, in this case
With a text-based query approach like CQL, you lose the type with dynamic
columns. Yes, we're storing it as bytes, but it is simpler and easier with
Thrift to do these types of things.
I like CQL3 and what it does, but text-based query languages make certain
dynamic schema use cases painful.
There are always pros and cons with a query language.
But as far as I can see, the advantages of Thrift over CQL3 are:
1) Thrift requires a little bit less decoding server-side (a difference of
around 10% in CPU usage).
2) Thrift uses more compact storage because
My 2 cents…
A motivation for CQL3 AFAIK was to make Cassandra more familiar to SQL users.
This is a valid goal, and works well in many cases.
Equally there are use cases (that some might find ugly) where Cassandra is
chosen explicitly because of the sorts of things you can do at the thrift
Just to add 2 more cents... :)
The CQL3 protocol is asynchronous. This can provide a substantial
throughput increase, according to my benchmarking, when one uses
non-blocking techniques.
It is also peer-to-peer. Hence the server can generate events to send to
the client, e.g. schema changes - in
Without a doubt there are nice features of CQL3, like notifications and async.
I want to see CQL3 mature and handle all the use cases that Thrift handles
easily today. It's to everyone's benefit to work together and improve CQL3.
Another benefit of Thrift drivers today is being able to use object API
Note as I mentioned mid post, thrift also supports async nowadays (there was a
recent discussion on cassandra dev and the choice was not to move to it)
I think the binary protocol is the way forward; CQL3 needs some new features,
or there need to be some other types of requests you can make
Yes, Thrift does have async, though I haven't had to use it yet.
Right now I'm working on adding CAS to Hector followed by multi slice.
On Fri, Jun 13, 2014 at 9:01 PM, graham sanderson gra...@vast.com wrote:
Note as I mentioned mid post, thrift also supports async nowadays (there
was a