Re: yet a couple more questions on composite columns

2012-02-06 Thread Jim Ancona
On Sat, Feb 4, 2012 at 8:54 PM, Yiming Sun yiming@gmail.com wrote:
 Interesting idea, Jim.  Is there a reason you don't use
 metadata:{accountId} instead?  For performance reasons?

No, because the column comparator is defined as
CompositeType(DateType, AsciiType), and all column names must conform
to that.

Jim
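
To illustrate why a name like metadata:{accountId} would be rejected, here is a rough Python sketch. It does not talk to Cassandra at all; parse_composite and conforms are made-up helpers that only mimic a (DateType, AsciiType) comparator by requiring the first component to parse as a date and the second to be plain ASCII. The ':' separator and the account42 value are assumptions for illustration.

```python
from datetime import date


def parse_composite(name):
    """Hypothetical stand-in for a CompositeType(DateType, AsciiType) check.

    Splits 'first:second', requires the first part to be an ISO date and
    the second part to be plain ASCII. Raises ValueError otherwise.
    """
    first, sep, second = name.partition(":")
    if not sep:
        raise ValueError("expected two components separated by ':'")
    day = date.fromisoformat(first)  # raises ValueError for non-dates
    second.encode("ascii")           # UnicodeEncodeError is a ValueError subclass
    return (day, second)


def conforms(name):
    """True if the name would pass the (date, ascii) check above."""
    try:
        parse_composite(name)
        return True
    except ValueError:
        return False


print(conforms("1970-01-01:account42"))  # epoch sentinel: first part is a date
print(conforms("metadata:account42"))    # 'metadata' cannot be read as a date
```

The point is only that every column name in the CF has to satisfy the same component types, so a sentinel has to be a legal value of the first component's type.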



 On Sat, Feb 4, 2012 at 6:24 PM, Jim Ancona j...@anconafamily.com wrote:

 I've used special values which still comply with the Composite
 schema for the metadata columns, e.g. a column of
 1970-01-01:{accountId} for a metadata column where the Composite is
 DateType:UTF8Type.

 Jim

 On Sat, Feb 4, 2012 at 2:13 PM, Yiming Sun yiming@gmail.com wrote:
  Thanks Andrey and Chris.  It sounds like we don't necessarily have to use
  composite columns.  From what I understand about dynamic CF, each row may
  have completely different data from other rows;  but in our case, the data
  in each row is similar to other rows; my concern was more about the
  homogeneity of the data between columns.
 
  In our original supercolumn-based schema, one special supercolumn is called
  metadata which contains a number of subcolumns to hold metadata describing
  each collection (e.g. number of documents, etc.), then the rest of the
  supercolumns in the same row are all IDs of documents belonging to the
  collection, and for each document supercolumn, the subcolumns contain the
  document content as well as metadata on each individual document (e.g.
  checksum of each document).
 
  To move away from the supercolumn schema, I could either create two CFs, one
  to hold metadata, the other document content; or I could create just one CF
  mixing metadata and doc content in the same row, and using composite column
  names to identify if the particular column is metadata or a document.  I am
  just wondering if you have any input on the pros and cons of each schema.
 
  -- Y.
 
 
  On Fri, Feb 3, 2012 at 10:27 PM, Chris Gerken chrisger...@mindspring.com wrote:
 
  On 4 February 2012 06:21, Yiming Sun yiming@gmail.com wrote:
 
  I cannot have one composite column name with 3 components while another
  with 4 components?
 
   Just use 4 components and leave the last one empty (if it is the same type)?!
 
  Another question I have is how flexible composite columns actually are.
  If my data model has a CF containing US zip codes with the following
  composite columns:
 
  {OH:Spring Field} : 45503
  {OH:Columbus} : 43085
  {FL:Spring Field} : 32401
  {FL:Key West}  : 33040
 
  I know I can ask Cassandra to give me the zip codes of all cities in
  OH.  But can I ask it to give me the zip codes of all cities named Spring
  Field using this model?  Thanks.
 
  No. You have to specify the first composite component first.
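
A small Python sketch of the reasoning behind that answer, under the assumption that columns in a row are kept sorted by (state, city), the way a composite comparator would order them. Nothing here is a Cassandra API; slice_by_state and scan_by_city are illustrative names only.

```python
from bisect import bisect_left

# Columns of one row, sorted by (state, city) as a composite comparator would.
columns = sorted([
    (("OH", "Spring Field"), "45503"),
    (("OH", "Columbus"), "43085"),
    (("FL", "Spring Field"), "32401"),
    (("FL", "Key West"), "33040"),
])
names = [name for name, _ in columns]


def slice_by_state(state):
    """Leading component fixed: matching columns form one contiguous range."""
    lo = bisect_left(names, (state,))
    hi = bisect_left(names, (state + "\x00",))
    return columns[lo:hi]


def scan_by_city(city):
    """Second component only: matches are scattered, so every column is checked."""
    return [col for col in columns if col[0][1] == city]


print(slice_by_state("OH"))
print(scan_by_city("Spring Field"))
```

Because the sort order is driven by the first component, "all cities in OH" is a single range, while "all cities named Spring Field" has no single start and end point to slice between.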
 
 
  I'd use a dynamic CF:
  row key = state abbreviation
  column name = city name
  column value = zip code (or a complex object, one of whose properties is
  zip code)
 
  You can iterate over the columns in a single row to get a state's city
  names and their zip codes, and you can do a get_range_slices on all keys for
  the columns starting and ending on the city name to find out the zip codes
  for all cities with the given name.
 
  I think
 
  - Chris
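
The layout described above can be sketched in plain Python as a dict of rows. This is only an in-memory model under that assumption, not a client call: zip_cf, cities_in_state and zips_for_city are hypothetical names, and zips_for_city stands in for the idea of a get_range_slices over all keys restricted to one column name.

```python
# Hypothetical in-memory model of the dynamic CF: row key -> {column name: value}
zip_cf = {
    "OH": {"Columbus": "43085", "Spring Field": "45503"},
    "FL": {"Key West": "33040", "Spring Field": "32401"},
}


def cities_in_state(state):
    """One row read: all columns of a single row, in column-name order."""
    return sorted(zip_cf.get(state, {}).items())


def zips_for_city(city):
    """Visit every row key and pick out just the one column, if present."""
    return {state: row[city] for state, row in zip_cf.items() if city in row}


print(cities_in_state("OH"))
print(zips_for_city("Spring Field"))
```

Note that the second lookup still touches every row key, so its cost grows with the number of rows rather than with the number of matches.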
 
 




Re: yet a couple more questions on composite columns

2012-02-06 Thread Yiming Sun
Thanks for the clarification, Jim.  I didn't know the first comparator was
defined as DateType. Yeah, in that case, the beginning of the epoch is the
only choice.

-- Y.




Re: yet a couple more questions on composite columns

2012-02-05 Thread R. Verlangen
Yiming, I am using 2 CFs. Performance-wise this should not be an issue. I
use it as a data store for small files. My 2 CFs are:

FilesMeta
FilesData






Re: yet a couple more questions on composite columns

2012-02-05 Thread Yiming Sun
Thanks R.V.!! We are also dealing with many small files, so this sounds
really promising.

-- Y.







Re: yet a couple more questions on composite columns

2012-02-04 Thread Yiming Sun
Thanks Andrey and Chris.  It sounds like we don't necessarily have to use
composite columns.  From what I understand about dynamic CF, each row may
have completely different data from other rows;  but in our case, the data
in each row is similar to other rows; my concern was more about the
homogeneity of the data between columns.

In our original supercolumn-based schema, one special supercolumn is called
metadata which contains a number of subcolumns to hold metadata
describing each collection (e.g. number of documents, etc.), then the rest
of the supercolumns in the same row are all IDs of documents belonging to the
collection, and for each document supercolumn, the subcolumns contain the
document content as well as metadata on each individual document (e.g.
checksum of each document).

To move away from the supercolumn schema, I could either create two CFs,
one to hold metadata, the other document content; or I could create just
one CF mixing metadata and doc content in the same row, and using composite
column names to identify if the particular column is metadata or a
document.  I am just wondering if you have any input on the pros and cons
of each schema.
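
For concreteness, the single-CF option could look roughly like the Python sketch below. Everything in it is an assumption made for illustration: the "meta"/"doc" first component, the three-component names, the empty middle component for collection-level metadata, and the columns_with_prefix helper. It models one row as a dict and is not tied to any client library.

```python
# Hypothetical single-CF layout: one row per collection, composite column names.
# The first component ("doc" or "meta") is an assumed discriminator.
row = {
    ("meta", "", "doc_count"): "2",
    ("doc", "doc-001", "content"): "first document text",
    ("doc", "doc-001", "checksum"): "abc123",
    ("doc", "doc-002", "content"): "second document text",
    ("doc", "doc-002", "checksum"): "def456",
}


def columns_with_prefix(row, *prefix):
    """Return columns whose composite name starts with the given components."""
    n = len(prefix)
    return {name: value for name, value in sorted(row.items()) if name[:n] == prefix}


print(columns_with_prefix(row, "meta"))            # collection-level metadata
print(columns_with_prefix(row, "doc", "doc-002"))  # everything for one document
```

With this shape, metadata and each document's columns sit in separate contiguous ranges of the same row, which is roughly what the supercolumn grouping provided.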

-- Y.




Re: yet a couple more questions on composite columns

2012-02-04 Thread R. Verlangen
I also made something like this a while ago. I decided to go for the
2-rows-solution: by doing that you don't have the need for super columns.
Cassandra is really good at reading, so this should not be an issue.

Cheers!






Re: yet a couple more questions on composite columns

2012-02-04 Thread Yiming Sun
Interesting idea, R.V.  But what did you do with the row keys?







Re: yet a couple more questions on composite columns

2012-02-04 Thread R. Verlangen
I just kept both row keys the same. This made fetching them both trivial:
when you have A, you can fetch B, and vice versa.
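
A minimal Python sketch of the same-key idea, assuming two column families modelled as plain dicts. The store_file and load_file helpers, the size and owner columns, and the example key are all hypothetical; only the idea of sharing one row key across both CFs comes from the message above.

```python
# Hypothetical in-memory stand-ins for the two column families.
files_meta = {}  # row key -> {metadata column: value}
files_data = {}  # row key -> {data column: value}


def store_file(key, content, **meta):
    """Write metadata and content under the same row key in both CFs."""
    files_meta[key] = dict(meta, size=str(len(content)))
    files_data[key] = {"content": content}


def load_file(key):
    """Knowing the key is enough to read from either CF, or both."""
    return files_meta[key], files_data[key]["content"]


store_file("report.txt", b"hello", owner="rv")
print(load_file("report.txt"))
```

Since the key is shared, no extra lookup table is needed to get from a metadata row to its data row or back.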








Re: yet a couple more questions on composite columns

2012-02-04 Thread Jim Ancona
I've used special values which still comply with the Composite
schema for the metadata columns, e.g. a column of
1970-01-01:{accountId} for a metadata column where the Composite is
DateType:UTF8Type.

Jim
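
As a rough illustration of the sentinel idea, the Python sketch below models column names as (date, text) tuples. The account ids and values are invented, and it assumes no real data column is ever dated 1970-01-01 or earlier, which is what lets the sentinel columns be told apart and sort ahead of everything else.

```python
from datetime import date

EPOCH = date(1970, 1, 1)  # sentinel first component reserved for metadata

# Hypothetical row: column names are (date, text) tuples, like DateType:UTF8Type.
row = {
    (date(2012, 2, 4), "acct-1"): "event payload",
    (EPOCH, "acct-1"): "metadata for acct-1",
    (date(2011, 12, 25), "acct-2"): "older event",
    (EPOCH, "acct-2"): "metadata for acct-2",
}

ordered = sorted(row)  # comparator order: by date, then by text
metadata_names = [name for name in ordered if name[0] == EPOCH]
data_names = [name for name in ordered if name[0] != EPOCH]

print(metadata_names)
print(data_names)
```

Under that assumption the metadata columns form one contiguous block at the start of the row, so they can be read with a single range on the first component.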





Re: yet a couple more questions on composite columns

2012-02-04 Thread Yiming Sun
R.V., I am a little confused.  I was under the impression that you cannot
have two rows with the same key - unless you were referring to two
different CFs?









Re: yet a couple more questions on composite columns

2012-02-04 Thread Yiming Sun
Interesting idea, Jim.  Is there a reason you don't use
metadata:{accountId} instead?  For performance reasons?

On Sat, Feb 4, 2012 at 6:24 PM, Jim Ancona j...@anconafamily.com wrote:

 I've used special values which still comply with the Composite
 schema for the metadata columns, e.g. a column of
 1970-01-01:{accountId} for a metadata column where the Composite is
 DateType:UTF8Type.

 Jim

 On Sat, Feb 4, 2012 at 2:13 PM, Yiming Sun yiming@gmail.com wrote:
  Thanks Andrey and Chris.  It sounds like we don't necessarily have to use
  composite columns.  From what I understand about dynamic CFs, each row may
  have completely different data from other rows; but in our case, the data
  in each row is similar to other rows; my concern was more about the
  homogeneity of the data between columns.
 
  In our original supercolumn-based schema, one special supercolumn is
  called "metadata" and contains a number of subcolumns holding metadata
  that describes each collection (e.g. number of documents, etc.); the rest
  of the supercolumns in the same row are the IDs of the documents belonging
  to the collection, and for each document supercolumn, the subcolumns
  contain the document content as well as metadata on the individual
  document (e.g. the checksum of each document).
 
  To move away from the supercolumn schema, I could either create two CFs,
  one to hold metadata and the other document content; or I could create
  just one CF mixing metadata and doc content in the same row, using
  composite column names to identify whether a particular column is metadata
  or a document.  I am just wondering if you have any input on the pros and
  cons of each schema.
 
  -- Y.
 
 
  On Fri, Feb 3, 2012 at 10:27 PM, Chris Gerken 
 chrisger...@mindspring.com
  wrote:
 
 
 
 
  On 4 February 2012 06:21, Yiming Sun yiming@gmail.com wrote:
 
  I cannot have one composite column name with 3 components while another
  with 4 components?
 
   Just use 4 components and leave the last one empty (if it is the same type)?!
 
  Another question I have is how flexible composite columns actually are.
   If my data model has a CF containing US zip codes with the following
  composite columns:
 
  {OH:Spring Field} : 45503
  {OH:Columbus} : 43085
  {FL:Spring Field} : 32401
  {FL:Key West}  : 33040
 
  I know I can ask Cassandra to give me the zip codes of all cities in
  OH.  But can I ask it to give me the zip codes of all cities named
  Spring Field using this model?  Thanks.
 
  No. You have to specify the first composite component first.
 
 
  I'd use a dynamic CF:
  row key = state abbreviation
  column name = city name
  column value = zip code (or a complex object, one of whose properties is
  zip code)
 
  You can iterate over the columns in a single row to get a state's city
  names and their zip codes, and you can do a get_range_slices on all keys
  for the columns starting and ending on the city name to find the zip
  codes for all cities with the given name.
 
  I think
 
  - Chris
 
 



Re: yet a couple more questions on composite columns

2012-02-03 Thread Andrey V. Panov
On 4 February 2012 06:21, Yiming Sun yiming@gmail.com wrote:

 I cannot have one composite column name with 3 components while another
 with 4 components?

 Just use 4 components and leave the last one empty (if it is the same type)?!

Another question I have is how flexible composite columns actually are.  If
 my data model has a CF containing US zip codes with the following composite
 columns:

 {OH:Spring Field} : 45503
 {OH:Columbus} : 43085
 {FL:Spring Field} : 32401
 {FL:Key West}  : 33040

 I know I can ask cassandra to give me the zip codes of all cities in OH.
  But can I ask it to give me the zip codes of all cities named Spring
 Field using this model?  Thanks.

No. You have to specify the first composite component first.
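
A toy illustration of why that is, in plain Python rather than Cassandra's API. It assumes column names behave like (state, city) tuples kept in sorted order, which is roughly how a composite comparator orders them. The helper names slice_by_state and scan_by_city are made up for this sketch.

```python
from bisect import bisect_left

# One row whose column names are (state, city) composites, kept sorted.
columns = sorted([
    (("OH", "Spring Field"), "45503"),
    (("OH", "Columbus"), "43085"),
    (("FL", "Spring Field"), "32401"),
    (("FL", "Key West"), "33040"),
])
names = [name for name, _ in columns]

def slice_by_state(state):
    """Columns sharing a first component are adjacent, so one range read works."""
    start = bisect_left(names, (state,))  # first name with this prefix
    result = []
    for name, value in columns[start:]:
        if name[0] != state:
            break  # left the contiguous block for this state
        result.append((name[1], value))
    return result

def scan_by_city(city):
    """Columns sharing only the second component are scattered: full scan."""
    return [(name[0], value) for name, value in columns if name[1] == city]
```

slice_by_state("OH") stops as soon as the state changes, while scan_by_city("Spring Field") has to look at every column, because the matching names are not next to each other in sort order.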


Re: yet a couple more questions on composite columns

2012-02-03 Thread Chris Gerken

 
 
 On 4 February 2012 06:21, Yiming Sun yiming@gmail.com wrote:
 I cannot have one composite column name with 3 components while another with 
 4 components?
  Just use 4 components and leave the last one empty (if it is the same type)?!
 
 Another question I have is how flexible composite columns actually are.  If 
 my data model has a CF containing US zip codes with the following composite 
 columns:
 
 {OH:Spring Field} : 45503
 {OH:Columbus} : 43085
 {FL:Spring Field} : 32401
 {FL:Key West}  : 33040
 
 I know I can ask cassandra to give me the zip codes of all cities in OH.  
 But can I ask it to give me the zip codes of all cities named Spring Field 
 using this model?  Thanks.
 No. You have to specify the first composite component first.

I'd use a dynamic CF:
row key = state abbreviation 
column name = city name
column value = zip code (or a complex object, one of whose properties is zip 
code)

You can iterate over the columns in a single row to get a state's city names
and their zip codes, and you can do a get_range_slices on all keys for the
columns starting and ending on the city name to find the zip codes for all
cities with the given name.

I think

- Chris
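
A minimal sketch of the layout suggested above, using plain Python dictionaries as a stand-in for the column family. This is not a Cassandra client call. The function names cities_in_state and zips_for_city are hypothetical, and the second one only imitates what a get_range_slices over all keys with a one-column slice would return.

```python
# Toy model: row key = state, column name = city, column value = zip code.
rows = {
    "OH": {"Columbus": "43085", "Spring Field": "45503"},
    "FL": {"Key West": "33040", "Spring Field": "32401"},
}

def cities_in_state(state):
    """Single-row read: all columns of one row, in column-name order."""
    return sorted(rows.get(state, {}).items())

def zips_for_city(city):
    """Visit every row and pick out the one column named `city`, if present."""
    return {
        state: cols[city]
        for state, cols in sorted(rows.items())
        if city in cols
    }
```

The per-state lookup touches one row, while the per-city lookup visits every row key, so in this sketch its cost grows with the number of states rather than with the number of matching cities.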