Re: Ball is rolling on High Performance Cassandra Cookbook second edition

Edward Capriolo Wed, 27 Jun 2012 14:33:05 -0700

On Wed, Jun 27, 2012 at 4:34 PM, Brian O'Neill <b...@alumni.brown.edu> wrote:
> RE: API method signatures changing
>
> That triggers another thought...
>
> What terminology will you use in the book to describe the data model?  CQL?
>
> When we wrote the RefCard on DZone, we intentionally favored/used CQL
> terminology.  On advisement from Jonathan and Kris Hahn, we wanted to start
> the process of sunsetting the legacy terms (keyspace, column family, etc.)
> in favor of the more familiar CQL terms (schema, table, etc.). I've gone on
> record in favor of the switch, but it is probably something worth noting in
> the book since that terminology does not yet align with all the client
> APIs (e.g. Hector, Astyanax, etc.).
>
> I'm not sure when the client APIs will catch up to the new terminology, but
> we may want to ask, so as to future-proof the recipes as much as possible.
>
> -brian
>
>
>
>
> On Wed, Jun 27, 2012 at 4:18 PM, Edward Capriolo <edlinuxg...@gmail.com>
> wrote:
>>
>> On Wed, Jun 27, 2012 at 3:08 PM, Courtney Robinson <court...@crlog.info>
>> wrote:
>> > Sounds good.
>> > One thing I'd like to see is more coverage on Cassandra Internals. Out
>> > of the box Cassandra's great but having a little inside knowledge can
>> > be very useful because it helps you design your applications to work
>> > with Cassandra, rather than having to later make endless optimizations
>> > that could probably have been avoided had you done your implementation
>> > slightly differently.
>> >
>> > Another thing that may be worth adding would be a recipe that showed an
>> > approach to evaluating Cassandra for your organization/use case. I
>> > realize that's going to vary on a case by case basis but one thing I've
>> > noticed is that some people dive in without really thinking through
>> > whether Cassandra is actually the right fit for what they're doing. It
>> > sort of becomes a hammer for anything that looks like a nail.
>> >
>> > On Tue, Jun 26, 2012 at 10:25 PM, Edward Capriolo
>> > <edlinuxg...@gmail.com>
>> > wrote:
>> >>
>> >> Hello all,
>> >>
>> >> It has not been very long since the first book was published but
>> >> several things have been added to Cassandra and a few things have
>> >> changed. I am putting together a list of changed content, for example
>> >> features like the old per Column family memtable flush settings versus
>> >> the new system with the global variable.
>> >>
>> >> My editors have given me the green light to grow the second edition
>> >> from ~200 pages currently up to 300 pages! This gives us the ability
>> >> to add more items/sections to the text.
>> >>
>> >> Some things were missing from the first edition such as Hector
>> >> support. Nate has offered to help me in this area. Please feel free to
>> >> contact me with any ideas and suggestions of recipes you would like to
>> >> see in the book. Also get in touch if you want to write a recipe. Several
>> >> people added content to the first edition and it would be great to see
>> >> that type of participation again.
>> >>
>> >> Thank you,
>> >> Edward
>> >
>> >
>> >
>> >
>> > --
>> > Courtney Robinson
>> > court...@crlog.info
>> > http://crlog.info
>> > 07535691628 (No private #s)
>> >
>>
>> Thanks for the comments. Yes the "INTERNALS" chapter was a bit tricky.
>> The challenge of writing about internals is they go stale fairly
>> quickly. I was considering writing a partitioner for the internals
>> chapter but then I thought about it more:
>> 1) It's hard
>> 2) The APIs can change. (They work the same way across versions but
>> they may have a different signature, etc.)
>> 3) 99.99% of people should be using the random partitioner :)
>>
>> But I agree the internals chapter can be made much stronger than it is.
>>
>> The recipe format is strict. It naturally conflicts with the typical
>> use-case style, where you write a good amount of text talking about the
>> problem domain, previous solutions, and bragging about company X. We
>> cannot do that with the recipe style, but we can do our best to make the
>> recipes as real-world as possible. I tried to do that throughout the
>> text; you do not find many examples like 'writing foo records to bar
>> column families'. However, the format does not allow the extensive text
>> blocks mentioned above, so it is difficult to set the stage for a complex
>> and detailed real-world problem. Still, I think for some examples we can
>> take the next step and make the recipe more practical and more
>> use-case-like.
>
>
>
>
> --
> Brian ONeill
> Lead Architect, Health Market Science (http://healthmarketscience.com)
> mobile:215.588.6024
> blog: http://weblogs.java.net/blog/boneill42/
> blog: http://brianoneill.blogspot.com/
>


As for terminology, I guess you can consider me a hard-liner, as I have
a few problems with calling a column family a table. I might be in the
minority, but I know I am not alone. On one hand, aliases make
integration easier
(https://issues.apache.org/jira/browse/CASSANDRA-2743), but on the other
hand, if a user does not understand what a column family is, they will
likely use Cassandra incorrectly.

Maybe this is just a semantics debate, because a table in a
column-oriented database is different from a table in a row-oriented
database, but the column family data model is one of the cornerstones
of Cassandra. Globally replacing "column family" with "table" in the
text is not a good idea.

We will have to be smart about it, as Thrift, the CLI, the internals,
and the high-level clients will keep the old terminology for some time.

I definitely plan to add an entire chapter on CQL. I think we can put
it after the CLI chapter; the introduction to CQL can attempt to cover
the ground between the old-school and the new-school thinking.

Edward
