Re: Ball is rolling on High Performance Cassandra Cookbook second edition

2012-10-01 Thread Edward Capriolo
Hello all,

Work has begun on the second edition!  Keep hitting me up with ideas.
In particular I am looking for someone who has done work with
Flume+Cassandra and Pig+Cassandra. Both of these topics will be
covered to some extent in the second edition, but these are two
areas in which I could use some help, as I do not have extensive
experience with either combination.

Contact me if you have any other ideas as well.

Edward

On Tue, Jun 26, 2012 at 5:25 PM, Edward Capriolo edlinuxg...@gmail.com wrote:
 Hello all,

 It has not been very long since the first book was published but
 several things have been added to Cassandra and a few things have
 changed. I am putting together a list of changed content, for example
 features like the old per Column family memtable flush settings versus
 the new system with the global variable.

 My editors have given me the green light to grow the second edition
 from ~200 pages currently up to 300 pages! This gives us the ability
 to add more items/sections to the text.

 Some things were missing from the first edition such as Hector
 support. Nate has offered to help me in this area. Please feel free to contact
 me with any ideas and suggestions of recipes you would like to see in
 the book. Also get in touch if you want to write a recipe. Several
 people added content to the first edition and it would be great to see
 that type of participation again.

 Thank you,
 Edward


Re: Ball is rolling on High Performance Cassandra Cookbook second edition

2012-07-01 Thread Jonathan Ellis
On Wed, Jun 27, 2012 at 5:11 PM, Aaron Turner synfina...@gmail.com wrote:
 Honestly, I think using the same terms as a RDBMS does
 makes users think they're exactly the same thing and have the same
 properties... which is close enough in some cases, but dangerous in
 others.

The point is that thinking in terms of the storage engine is difficult
and unnecessary.  You can represent that data relationally, which is
the Right Thing to do both because people are familiar with that world
and because it decouples model from representation, which lets us
change the latter if necessary.

http://www.datastax.com/dev/blog/schema-in-cassandra-1-1
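
To make the decoupling of model from representation concrete, here is a minimal Python sketch (not Cassandra code). It assumes a simplified version of the layout used for a CQL table with a compound primary key: the first key column selects the storage row, and each remaining value is stored in a cell named by the pair (clustering value, column name). The `timeline` data, its column names, and both function names are hypothetical, and details such as marker cells and timestamps are left out.

```python
from collections import defaultdict


def to_storage_cells(rows, partition_key, clustering_key):
    """Flatten relational rows into {storage_row_key: {cell_name: value}}."""
    storage = defaultdict(dict)
    for row in rows:
        row_key = row[partition_key]
        cluster = row[clustering_key]
        for column, value in row.items():
            if column in (partition_key, clustering_key):
                continue  # key columns are encoded in the row key / cell name
            storage[row_key][(cluster, column)] = value
    return dict(storage)


def to_relational_rows(storage, partition_key, clustering_key):
    """Rebuild the relational view from the storage-style cells."""
    rows = {}
    for row_key, cells in storage.items():
        for (cluster, column), value in cells.items():
            row = rows.setdefault(
                (row_key, cluster),
                {partition_key: row_key, clustering_key: cluster},
            )
            row[column] = value
    return [rows[key] for key in sorted(rows)]


# Hypothetical example data: one logical "table" of posts per user.
timeline = [
    {"user_id": "alice", "posted_at": 1, "body": "hello"},
    {"user_id": "alice", "posted_at": 2, "body": "world"},
    {"user_id": "bob", "posted_at": 1, "body": "hi"},
]
cells = to_storage_cells(timeline, "user_id", "posted_at")
print(cells)
```

The application only ever sees the list of rows; the cell layout could change without the rows changing, which is the point being made about decoupling.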

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


Re: Ball is rolling on High Performance Cassandra Cookbook second edition

2012-06-27 Thread Raj N
Great stuff!!!




Re: Ball is rolling on High Performance Cassandra Cookbook second edition

2012-06-27 Thread Robin Verlangen
Hi Edward,

Looking forward to your book. It's always interesting to read what others
have to say about a certain subject, and hopefully even learn new things!


-- 
With kind regards,

Robin Verlangen
*Software engineer*
W http://www.robinverlangen.nl
E ro...@us2.nl



Re: Ball is rolling on High Performance Cassandra Cookbook second edition

2012-06-27 Thread Aaron Turner
Hey Edward,

I finally posted my (short) blog post on using Hector with Jruby:

http://synfin.net/sock_stream/technology/code/cassandra-hector-jruby-awesome

If you're interested in documenting that in more detail in your book,
let me know and I can help you with it.

-Aaron


-- 
Aaron Turner
http://synfin.net/         Twitter: @synfinatic
http://tcpreplay.synfin.net/ - Pcap editing and replay tools for Unix & Windows
Those who would give up essential Liberty, to purchase a little temporary
Safety, deserve neither Liberty nor Safety.
    -- Benjamin Franklin
carpe diem quam minimum credula postero


Re: Ball is rolling on High Performance Cassandra Cookbook second edition

2012-06-27 Thread Courtney Robinson
Sounds good.
One thing I'd like to see is more coverage on Cassandra Internals. Out of
the box Cassandra's great but having a little inside knowledge can be very
useful because it helps you design your applications to work with
Cassandra; rather than having to later make endless optimizations that
could probably have been avoided had you done your implementation slightly
differently.

Another thing that may be worth adding would be a recipe that showed an
approach to evaluating Cassandra for your organization/use case. I realize
that's going to vary on a case by case basis but one thing I've noticed is
that some people dive in without really thinking through whether Cassandra
is actually the right fit for what they're doing. It sort of becomes a
hammer for anything that looks like a nail.


-- 
Courtney Robinson
court...@crlog.info
http://crlog.info
07535691628 (No private #s)


Re: Ball is rolling on High Performance Cassandra Cookbook second edition

2012-06-27 Thread Edward Capriolo
On Wed, Jun 27, 2012 at 3:08 PM, Courtney Robinson court...@crlog.info wrote:
 Sounds good.
 One thing I'd like to see is more coverage on Cassandra Internals. Out of
 the box Cassandra's great but having a little inside knowledge can be very
 useful because it helps you design your applications to work with Cassandra;
 rather than having to later make endless optimizations that could probably
 have been avoided had you done your implementation slightly differently.

 Another thing that may be worth adding would be a recipe that showed an
 approach to evaluating Cassandra for your organization/use case. I realize
 that's going to vary on a case by case basis but one thing I've noticed is
 that some people dive in without really thinking through whether Cassandra
 is actually the right fit for what they're doing. It sort of becomes a
 hammer for anything that looks like a nail.


Thanks for the comments. Yes, the internals chapter was a bit tricky.
The challenge of writing about internals is that they go stale fairly
quickly. I was considering writing a partitioner for the internals
chapter, but then I thought about it more:
1) It's hard
2) The APIs can change. (They work the same way across versions, but
they may have a different signature, etc.)
3) 99.99% of people should be using the random partitioner :)
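
For readers unfamiliar with what the random partitioner does, the idea can be sketched in a few lines of Python. This is an illustration only, not Cassandra's implementation: it assumes the token is the absolute value of the key's MD5 digest read as a signed integer, and that a key belongs to the first node whose token is greater than or equal to the key's token, wrapping around the ring. The node names, `build_ring`, and `owner` are hypothetical.

```python
import bisect
import hashlib


def md5_token(key: bytes) -> int:
    """Hash a row key to a token in the range 0..2**127 (assumed conversion)."""
    digest = hashlib.md5(key).digest()
    return abs(int.from_bytes(digest, byteorder="big", signed=True))


def build_ring(nodes):
    """Give each node an evenly spaced token; returns a sorted (token, node) list."""
    span = 2 ** 127
    return [(i * span // len(nodes), node) for i, node in enumerate(nodes)]


def owner(ring, token):
    """Return the first node whose token is >= the key token, wrapping around."""
    tokens = [t for t, _ in ring]
    index = bisect.bisect_left(tokens, token)
    return ring[index % len(ring)][1]


ring = build_ring(["node-a", "node-b", "node-c", "node-d"])
counts = {}
for i in range(1000):
    node = owner(ring, md5_token(f"user-{i}".encode()))
    counts[node] = counts.get(node, 0) + 1
print(counts)  # keys spread roughly evenly across the four nodes
```

Because the hash scatters even sequential keys across the ring, load balances without any knowledge of the key distribution, which is one reason a custom partitioner is rarely needed.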

But I agree the internals chapter can be made much stronger than it is.

The recipe format is strict. It naturally conflicts with the typical
use-case style. In a use case you write a good amount of text about
the problem domain, previous solutions, and bragging about company X.
We cannot do that with the recipe style, but we can do our best to
make the recipes as real-world as possible. I tried to do that
throughout the text; you do not find many examples like 'writing foo
records to bar column families'. However, the format does not allow
the extensive text blocks mentioned above, so it is difficult to set
the stage for a complex and detailed real-world problem. Still, I
think for some examples we can take the next step and make the recipe
more practical and more use-case-like.


Re: Ball is rolling on High Performance Cassandra Cookbook second edition

2012-06-27 Thread Brian O'Neill
RE: API method signatures changing

That triggers another thought...

What terminology will you use in the book to describe the data model?  CQL?

When we wrote the RefCard on DZone
(http://refcardz.dzone.com/refcardz/apache-cassandra), we intentionally
favored/used CQL terminology.  On advisement from Jonathan and Kris Hahn,
we wanted to start the process of sunsetting the legacy terms (keyspace,
column family, etc.) in favor of the more familiar CQL terms (schema,
table, etc.). I've gone on record
(http://css.dzone.com/articles/new-refcard-apache-cassandra) in favor of
the switch, but it is probably something worth noting in the book since
that terminology does not yet align with all the client APIs (e.g.
Hector, Astyanax, etc.).

I'm not sure when the client APIs will catch up to the new terminology, but
we may want to inquire, so as to future-proof the recipes as much as possible.

-brian





-- 
Brian ONeill
Lead Architect, Health Market Science (http://healthmarketscience.com)
mobile:215.588.6024
blog: http://weblogs.java.net/blog/boneill42/
blog: http://brianoneill.blogspot.com/


Re: Ball is rolling on High Performance Cassandra Cookbook second edition

2012-06-27 Thread Edward Capriolo
On Wed, Jun 27, 2012 at 4:34 PM, Brian O'Neill b...@alumni.brown.edu wrote:
 RE: API method signatures changing

 That triggers another thought...

 What terminology will you use in the book to describe the data model?  CQL?

 When we wrote the RefCard on DZone, we intentionally favored/used CQL
 terminology.  On advisement from Jonathan and Kris Hahn, we wanted to start
 the process of sunsetting the legacy terms (keyspace, column family, etc.)
 in favor of the more familiar CQL terms (schema, table, etc.). I've gone on
 record in favor of the switch, but it is probably something worth noting in
 the book since that terminology does not yet align with all the client APIs
 (e.g. Hector, Astyanax, etc.).

 I'm not sure when the client APIs will catch up to the new terminology, but
 we may want to inquire, so as to future-proof the recipes as much as possible.

 -brian





As for terminology, I guess you can consider me a hard-liner, as I have
a few problems with calling a column family a table. I might be in the
minority, but I know I am not alone. On one hand, aliases make the
integration easier
(https://issues.apache.org/jira/browse/CASSANDRA-2743), but on the other
hand, if a user does not understand what a column family is, they will
likely use Cassandra incorrectly.

Maybe this is just a semantics debate, because a table in a
column-oriented database is different than a table in a row-oriented

Re: Ball is rolling on High Performance Cassandra Cookbook second edition

2012-06-27 Thread Bill
I'm looking forward to getting a few copies of this. Some areas that
would be great to cover:

 - Indexing strategies
 - Configuring clients/env for sane timestamping
 - Efficient CQL
 - Top 8/10 perf issues/stacktraces and common resolutions
 - Understanding nodetool tpstats/cfhistograms/cfstats and what they're
   actually saying
 - Capacity sizing (disk/RAM overhead needed)
 - Compaction choice/strategies for kinds of workload

Bill



Re: Ball is rolling on High Performance Cassandra Cookbook second edition

2012-06-27 Thread Rustam Aliyev

Hi Edward,

That's great news!

One thing I'd like to see in the new edition is Counters, known issues
and how to avoid them:
 - avoid double counting (don't retry on failure, use write consistency
   level ONE, use dedicated Hector connector?)
 - delete counters (tricky, reset to zero?)
 - other tips and tricks
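
The double-counting concern in the first bullet can be illustrated with a toy Python model. This is not Cassandra or Hector code: `FlakyCounterStore` is a hypothetical in-memory counter whose increment is applied but whose acknowledgement is lost, which is roughly how a timeout looks to a client. Since an increment is not idempotent, blindly retrying it applies the delta twice.

```python
class FlakyCounterStore:
    """Toy counter: on selected calls the write lands but the ack is lost."""

    def __init__(self, lose_ack_on_calls):
        self.value = 0
        self.calls = 0
        self.lose_ack_on_calls = set(lose_ack_on_calls)

    def increment(self, delta):
        self.calls += 1
        self.value += delta  # the write is applied on the server side...
        if self.calls in self.lose_ack_on_calls:
            raise TimeoutError("ack lost")  # ...but the client never hears back


def increment_with_retry(store, delta, retries=1):
    """Naive client: retry on timeout, which can apply the delta twice."""
    for _ in range(retries + 1):
        try:
            store.increment(delta)
            return True
        except TimeoutError:
            continue
    return False


def increment_once(store, delta):
    """Cautious client: never retry; report the outcome as unknown instead."""
    try:
        store.increment(delta)
        return True
    except TimeoutError:
        return False


retrying = FlakyCounterStore(lose_ack_on_calls={1})
increment_with_retry(retrying, 1)
print(retrying.value)  # 2: the single logical increment was counted twice

cautious = FlakyCounterStore(lose_ack_on_calls={1})
increment_once(cautious, 1)
print(cautious.value)  # 1: counted once, though the client cannot be sure
```

The cautious client trades a possible undercount (when the write truly failed) for never overcounting, so the caller has to decide which error is more acceptable.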

I personally had (and still have to some extent) problems with
maintaining counter accuracy.


Best,
Rustam.




Re: Ball is rolling on High Performance Cassandra Cookbook second edition

2012-06-27 Thread Franc Carter
On Thu, Jun 28, 2012 at 7:32 AM, Edward Capriolo edlinuxg...@gmail.com wrote:

 As for terminology, I guess you can consider me a hard-liner as I have
 a few problems with calling a column family a table. I might be in the
 minority, but I know I am not alone. On one hand aliases make the
 integration easier
 https://issues.apache.org/jira/browse/CASSANDRA-2743, but on the other
 hand if a user 

Re: Ball is rolling on High Performance Cassandra Cookbook second edition

2012-06-27 Thread Aaron Turner
On Wed, Jun 27, 2012 at 1:34 PM, Brian O'Neill b...@alumni.brown.edu wrote:
 RE: API method signatures changing

 That triggers another thought...

 What terminology will you use in the book to describe the data model?  CQL?

 When we wrote the RefCard on DZone, we intentionally favored/used CQL
 terminology.  On advisement from Jonathan and Kris Hahn, we wanted to start
 the process of sunsetting the legacy terms (keyspace, column family, etc.)
 in favor of the more familiar CQL terms (schema, table, etc.). I've gone on
 record in favor of the switch, but it is probably something worth noting in
 the book since that terminology does not yet align with all the client APIs
 (e.g. Hector, Astyanax, etc.).

 I'm not sure when the client APIs will catch up to the new terminology, but
 we may want to inquire, so as to future-proof the recipes as much as possible.

Not just client APIs but documentation as well.  When I was a new
user, yeah, the different terminology was a bit off-putting, but it was
consistent and it didn't take long to realize a CF was like a SQL
table, etc.  Honestly, I think using the same terms as an RDBMS does
makes users think they're exactly the same thing and have the same
properties... which is close enough in some cases, but dangerous in
others.

That said, while I found the first edition informative, I found the
Java/Hector code examples hard to read.  Part of that was because I
don't know Java (I know enough other languages that I can follow
along) and part of that is that Java is so verbose that it just
doesn't fit on the printed page.  I think CQL lends itself to making
the book more readable to a wider audience, but I think there should
be a chapter on Hector/pycassa/etc.  Of course, you still need to
write code around it, and if that's Java I'm not sure how much it
matters.



-- 
Aaron Turner
http://synfin.net/         Twitter: @synfinatic
http://tcpreplay.synfin.net/ - Pcap editing and replay tools for Unix & Windows
Those who would give up essential Liberty, to purchase a little temporary
Safety, deserve neither Liberty nor Safety.
    -- Benjamin Franklin
carpe diem quam minimum credula postero


Ball is rolling on High Performance Cassandra Cookbook second edition

2012-06-26 Thread Edward Capriolo
Hello all,

It has not been very long since the first book was published but
several things have been added to Cassandra and a few things have
changed. I am putting together a list of changed content, for example
features like the old per Column family memtable flush settings versus
the new system with the global variable.

My editors have given me the green light to grow the second edition
from ~200 pages currently up to 300 pages! This gives us the ability
to add more items/sections to the text.

Some things were missing from the first edition such as Hector
support. Nate has offered to help me in this area. Please feel free to contact
me with any ideas and suggestions of recipes you would like to see in
the book. Also get in touch if you want to write a recipe. Several
people added content to the first edition and it would be great to see
that type of participation again.

Thank you,
Edward