Re: Ball is rolling on High Performance Cassandra Cookbook second edition

Franc Carter Wed, 27 Jun 2012 15:08:42 -0700

On Thu, Jun 28, 2012 at 7:32 AM, Edward Capriolo <edlinuxg...@gmail.com> wrote:


> On Wed, Jun 27, 2012 at 4:34 PM, Brian O'Neill <b...@alumni.brown.edu>
> wrote:
> > RE: API method signatures changing
> >
> > That triggers another thought...
> >
> > What terminology will you use in the book to describe the data model?
> CQL?
> >
> > When we wrote the RefCard on DZone, we intentionally favored/used CQL
> > terminology.  On advisement from Jonathan and Kris Hahn, we wanted to
> start
> > the process of sunsetting the legacy terms (keyspace, column family,
> etc.)
> > in favor of the more familiar CQL terms (schema, table, etc.). I've gone
> on
> > record in favor of the switch, but it is probably something worth noting
> > in the book since that terminology does not yet align with all the client
> > APIs (e.g. Hector, Astyanax, etc.).
> >
> > I'm not sure when the client APIs will catch up to the new terminology,
> > but we may want to inquire, so as to future-proof the recipes as much as
> > possible.
> >
> > -brian
> >
> >
> >
> >
> > On Wed, Jun 27, 2012 at 4:18 PM, Edward Capriolo <edlinuxg...@gmail.com>
> > wrote:
> >>
> >> On Wed, Jun 27, 2012 at 3:08 PM, Courtney Robinson <court...@crlog.info>
> >> wrote:
> >> > Sounds good.
> >> > One thing I'd like to see is more coverage on Cassandra Internals. Out
> >> > of
> >> > the box Cassandra's great but having a little inside knowledge can be
> >> > very
> >> > useful because it helps you design your applications to work with
> >> > Cassandra;
> >> > rather than having to later make endless optimizations that could
> >> > probably
> >> > have been avoided had you done your implementation slightly
> differently.
> >> >
> >> > Another thing that may be worth adding would be a recipe that showed
> an
> >> > approach to evaluating Cassandra for your organization/use case. I
> >> > realize
> >> > that's going to vary on a case by case basis but one thing I've
> noticed
> >> > is
> >> > that some people dive in without really thinking through whether
> >> > Cassandra
> >> > is actually the right fit for what they're doing. It sort of becomes a
> >> > hammer for anything that looks like a nail.
> >> >
> >> > On Tue, Jun 26, 2012 at 10:25 PM, Edward Capriolo
> >> > <edlinuxg...@gmail.com>
> >> > wrote:
> >> >>
> >> >> Hello all,
> >> >>
> >> >> It has not been very long since the first book was published but
> >> >> several things have been added to Cassandra and a few things have
> >> >> changed. I am putting together a list of changed content, for example
> >> >> features like the old per Column family memtable flush settings
> versus
> >> >> the new system with the global variable.
> >> >>
> >> >> My editors have given me the green light to grow the second edition
> >> >> from ~200 pages currently up to 300 pages! This gives us the ability
> >> >> to add more items/sections to the text.
> >> >>
> >> >> Some things were missing from the first edition such as Hector
> >> >> support. Nate has offered to help me in this area. Please feel free to
> >> >> contact me with any ideas and suggestions of recipes you would like to
> >> >> see in
> >> >> the book. Also get in touch if you want to write a recipe. Several
> >> >> people added content to the first edition and it would be great to
> see
> >> >> that type of participation again.
> >> >>
> >> >> Thank you,
> >> >> Edward
> >> >
> >> >
> >> >
> >> >
> >> > --
> >> > Courtney Robinson
> >> > court...@crlog.info
> >> > http://crlog.info
> >> > 07535691628 (No private #s)
> >> >
> >>
> >> Thanks for the comments. Yes the "INTERNALS" chapter was a bit tricky.
> >> The challenge of writing about internals is they go stale fairly
> >> quickly. I was considering writing a partitioner for the internals
> >> chapter but then I thought about it more:
> >> 1) It's hard
> >> 2) The APIs can change. (They work the same way across versions but
> >> they may have a different signature etc)
> >> 3) 99.99% of people should be using the random partitioner :)
> >>
> >> But I agree the external chapter can be made much stronger than it is.
> >>
> >> The recipe format is strict. It naturally conflicts with the typical use
> >> case style, where you write a good amount of text talking about the
> >> problem domain and previous solutions, and bragging about
> >> company X. We can not do that with the recipe style, but we can do our
> >> best to make the recipes as real world as possible. I tried to do that
> >> throughout the text, you do not find many examples like 'writing foo
> >> records to bar column families'. However the format does not allow
> >> extensive text blocks mentioned above so it is difficult to set the
> >> stage for a complex and detailed real world problem. Still, I think
> >> for some examples we can take the next step and make the recipe more
> >> real world practical and more use-case like.
> >
> >
> >
> >
> > --
> > Brian ONeill
> > Lead Architect, Health Market Science (http://healthmarketscience.com)
> > mobile:215.588.6024
> > blog: http://weblogs.java.net/blog/boneill42/
> > blog: http://brianoneill.blogspot.com/
> >
>
> As for terminology, I guess you can consider me a hard-liner as I have
> a few problems with calling a column family a table. I might be in the
> minority, but I know I am not alone. On one hand aliases make the
> integration easier
> https://issues.apache.org/jira/browse/CASSANDRA-2743, but on the other
> hand if a user does not understand what a column family is they will
> likely use Cassandra incorrectly.
>

This is my view as well. One of the big hurdles I noticed with developers
moving to Cassandra is that there is a strong tendency to apply RDBMS
thinking to Cassandra. This is unsurprising, since the majority of data store
conceptualisation exists in this framework. I can see that using names that
have connections with RDBMS is likely to encourage this.

cheers


> Maybe this is just a semantics debate because a table in a column
> oriented database is different than a table in a row oriented
> database, but the column family data model is one of the cornerstones
> of Cassandra. Globally replacing column family with table for the text
> is not a good idea.
>
> We will have to be smart about it, as Thrift, the CLI, the internals,
> and the high level clients will be like this for some time.
>
> I definitely plan to add an entire chapter on CQL. I think we can put
> it after the CLI chapter, the introduction of CQL can attempt to cover
> the ground between the old school and the new school thinking.
>
> Edward
>



-- 

*Franc Carter* | Systems architect | Sirca Ltd
 <marc.zianideferra...@sirca.org.au>

franc.car...@sirca.org.au | www.sirca.org.au

Tel: +61 2 9236 9118

Level 9, 80 Clarence St, Sydney NSW 2000

PO Box H58, Australia Square, Sydney NSW 1215
