Re: yet a couple more questions on composite columns
On Sat, Feb 4, 2012 at 8:54 PM, Yiming Sun yiming@gmail.com wrote:
> Interesting idea, Jim. Is there a reason you don't use metadata:{accountId} instead? For performance reasons?

No, because the column comparator is defined as CompositeType(DateType, AsciiType), and all column names must conform to that.

Jim
Re: yet a couple more questions on composite columns
Thanks for the clarification, Jim. I didn't know the first comparator was defined as DateType. Yeah, in that case, the beginning of the epoch is the only choice.

-- Y.

On Mon, Feb 6, 2012 at 11:35 AM, Jim Ancona j...@anconafamily.com wrote:
> No, because the column comparator is defined as CompositeType(DateType, AsciiType), and all column names must conform to that.
Re: yet a couple more questions on composite columns
Yiming,

I am using 2 CFs. Performance-wise this should not be an issue. I use it as a data store for small files. My 2 CFs are:

FilesMeta
FilesData

2012/2/5 Yiming Sun yiming@gmail.com
> Interesting idea, Jim. Is there a reason you don't use metadata:{accountId} instead? For performance reasons?
Re: yet a couple more questions on composite columns
Thanks R.V.!! We are also dealing with many small files, so this sounds really promising.

-- Y.

On Sun, Feb 5, 2012 at 9:59 AM, R. Verlangen ro...@us2.nl wrote:
> Yiming,
>
> I am using 2 CFs. Performance-wise this should not be an issue. I use it as a data store for small files. My 2 CFs are:
>
> FilesMeta
> FilesData
Re: yet a couple more questions on composite columns
Thanks Andrey and Chris. It sounds like we don't necessarily have to use composite columns. From what I understand about dynamic CFs, each row may have completely different data from other rows; but in our case, the data in each row is similar to that in other rows. My concern was more about the homogeneity of the data between columns.

In our original supercolumn-based schema, one special supercolumn is called metadata and contains a number of subcolumns holding metadata that describes each collection (e.g. number of documents). The rest of the supercolumns in the same row are all IDs of documents belonging to the collection, and for each document supercolumn, the subcolumns contain the document content as well as metadata on the individual document (e.g. its checksum).

To move away from the supercolumn schema, I could either create two CFs, one to hold metadata and the other document content; or I could create just one CF that mixes metadata and document content in the same row, using composite column names to identify whether a particular column is metadata or a document. I am just wondering if you have any input on the pros and cons of each schema.

-- Y.

On Fri, Feb 3, 2012 at 10:27 PM, Chris Gerken chrisger...@mindspring.com wrote:
> I'd use a dynamic CF:
>
> row key = state abbreviation
> column name = city name
> column value = zip code (or a complex object, one of whose properties is zip code)
>
> You can iterate over the columns in a single row to get a state's city names and their zip codes, and you can do a get_range_slices on all keys for the columns starting and ending on the city name to find the zip codes for cities with the given name.
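The single-CF option described in this message can be pictured with a small in-memory model. This is only a sketch under stated assumptions, not Cassandra client code: the row is a plain Python dict, the three-component name layout (kind, doc_id, field) and the sample IDs and values are hypothetical, and metadata columns leave the middle component empty so that every column name has the same shape.

```python
# Hypothetical single-CF row for one collection. Every column name has the same
# three components: (kind, doc_id, field). Metadata columns leave doc_id empty.
collection_row = {
    ("doc", "doc-1", "checksum"): "9f2c",
    ("metadata", "", "num_documents"): "2",
    ("doc", "doc-1", "content"): "first document text",
    ("doc", "doc-2", "content"): "second document text",
    ("doc", "doc-2", "checksum"): "41aa",
}


def columns_of_kind(row, kind):
    """Return the columns whose first component equals `kind`, in name order."""
    return {name: value for name, value in sorted(row.items()) if name[0] == kind}


def document_ids(row):
    """Collect the distinct document IDs stored in the row."""
    return sorted({name[1] for name in row if name[0] == "doc"})


print(columns_of_kind(collection_row, "metadata"))
print(document_ids(collection_row))
```

With two CFs the first tuple component would disappear and the metadata and document columns would simply live in separate rows that share a key.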
Re: yet a couple more questions on composite columns
I also made something like this a while ago. I decided to go for the 2-rows solution: by doing that you don't need super columns. Cassandra is really good at reading, so this should not be an issue.

Cheers!

2012/2/4 Yiming Sun yiming@gmail.com
> [...] To move away from the supercolumn schema, I could either create two CFs, one to hold metadata, the other document content; or I could create just one CF mixing metadata and doc content in the same row, and using composite column names to identify if the particular column is metadata or a document. I am just wondering if you have any inputs on the pros and cons of each schema.
Re: yet a couple more questions on composite columns
Interesting idea, R.V. But what did you do with the row keys?

On Sat, Feb 4, 2012 at 2:29 PM, R. Verlangen ro...@us2.nl wrote:
> I also made something like this a while ago. I decided to go for the 2-rows-solution: by doing that you don't have the need for super columns. Cassandra is really good at reading, so this should not be an issue.
>
> Cheers!
Re: yet a couple more questions on composite columns
I just kept both row keys the same. That makes fetching them both trivial: when you have A, you can fetch B, and vice versa.

2012/2/4 Yiming Sun yiming@gmail.com
> Interesting idea, R.V. But what did you do with the row keys?
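The shared-row-key idea from this message can be sketched as follows. It is an illustration only: it assumes the two rows live in two separate column families (the names FilesMeta and FilesData come from elsewhere in the thread), models each CF as a Python dict, and uses hypothetical helper names and sample values.

```python
# Hypothetical stand-ins for the two column families, keyed by the same row key.
files_meta = {}
files_data = {}


def store_file(key, content, **meta):
    """Write the metadata row and the content row under one shared key."""
    files_meta[key] = dict(meta, size=len(content))
    files_data[key] = {"content": content}


def load_file(key):
    """Knowing the key of either row is enough to fetch the other one."""
    return files_meta[key], files_data[key]["content"]


store_file("report.txt", b"hello", checksum="abc123")
meta, content = load_file("report.txt")
print(meta, content)
```

Because the key is identical in both places, no lookup table or secondary index is needed to go from a metadata row to its content row or back.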
Re: yet a couple more questions on composite columns
I've used special values which still comply with the Composite schema for the metadata columns, e.g. a column of 1970-01-01:{accountId} for a metadata column where the Composite is DateType:UTF8Type.

Jim

On Sat, Feb 4, 2012 at 2:13 PM, Yiming Sun yiming@gmail.com wrote:
> [...] To move away from the supercolumn schema, I could either create two CFs, one to hold metadata, the other document content; or I could create just one CF mixing metadata and doc content in the same row, and using composite column names to identify if the particular column is metadata or a document. I am just wondering if you have any inputs on the pros and cons of each schema.
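Jim's sentinel-value trick can be illustrated with a small in-memory model. This is a sketch under assumptions, not his actual schema or any client API: composite column names are modelled as (date, text) tuples, the row is a plain dict, and the account and document identifiers are hypothetical. It also assumes every real column carries a date later than the epoch.

```python
from datetime import date

EPOCH = date(1970, 1, 1)  # sentinel first component reserved for metadata columns

# Hypothetical row: every column name is a (date, text) pair, so the metadata
# column conforms to the same composite layout as the data columns.
row = {
    (date(2012, 2, 4), "doc-17"): "document payload",
    (EPOCH, "account-42"): "metadata payload",
    (date(2012, 1, 15), "doc-03"): "document payload",
}

ordered = sorted(row)  # composite names compare component by component
metadata = [name for name in ordered if name[0] == EPOCH]
documents = [name for name in ordered if name[0] != EPOCH]

print(metadata)
print(documents)
```

Since the sentinel date sorts before any real date, the metadata columns end up grouped at the start of the row, ahead of the dated columns.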
Re: yet a couple more questions on composite columns
R.V., I am a little confused. I was under the impression that you cannot have two rows with the same key - unless you were referring to two different CFs?

On Sat, Feb 4, 2012 at 6:11 PM, R. Verlangen ro...@us2.nl wrote:
> I just kept both row keys the same. This was very trivial for fetching them both. When you have A, you can fetch B, and vice versa.
Re: yet a couple more questions on composite columns
Interesting idea, Jim. Is there a reason you don't use metadata:{accountId} instead? For performance reasons?

On Sat, Feb 4, 2012 at 6:24 PM, Jim Ancona j...@anconafamily.com wrote:
> I've used special values which still comply with the Composite schema for the metadata columns, e.g. a column of 1970-01-01:{accountId} for a metadata column where the Composite is DateType:UTF8Type.
>
> Jim
Re: yet a couple more questions on composite columns
On 4 February 2012 06:21, Yiming Sun yiming@gmail.com wrote:
> I cannot have one composite column name with 3 components while another has 4 components?

Just put 4 components and leave the last one empty (if it is the same type)?!

> Another question I have is how flexible composite columns actually are. If my data model has a CF containing US zip codes with the following composite columns:
>
> {OH:Spring Field} : 45503
> {OH:Columbus} : 43085
> {FL:Spring Field} : 32401
> {FL:Key West} : 33040
>
> I know I can ask cassandra to give me the zip codes of all cities in OH. But can I ask it to give me the zip codes of all cities named Spring Field using this model? Thanks.

No. You have to set the first composite component first.
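The reason the answer is "no" can be seen in a small model of how composite names sort. This is a sketch under assumptions rather than Cassandra code: column names are modelled as (state, city) tuples held in sorted order, and the helper names are hypothetical. A lookup on the first component is a contiguous range; a lookup on the second component alone is not.

```python
from bisect import bisect_left

# Composite column names modelled as (state, city) tuples, kept in sorted order.
columns = sorted([
    (("OH", "Spring Field"), "45503"),
    (("OH", "Columbus"), "43085"),
    (("FL", "Spring Field"), "32401"),
    (("FL", "Key West"), "33040"),
])
names = [name for name, _ in columns]


def slice_by_first(state):
    """Contiguous range: every name whose first component equals `state`."""
    start = bisect_left(names, (state,))
    end = start
    while end < len(names) and names[end][0] == state:
        end += 1
    return columns[start:end]


def filter_by_second(city):
    """No contiguous range exists, so every column has to be examined."""
    return [(name, value) for name, value in columns if name[1] == city]


print(slice_by_first("OH"))
print(filter_by_second("Spring Field"))
```

The second function works here only because the whole row is in memory; it is a full scan, not a slice.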
Re: yet a couple more questions on composite columns
On 4 February 2012 06:21, Yiming Sun yiming@gmail.com wrote:
> Another question I have is how flexible composite columns actually are. If my data model has a CF containing US zip codes with the following composite columns:
>
> {OH:Spring Field} : 45503
> {OH:Columbus} : 43085
> {FL:Spring Field} : 32401
> {FL:Key West} : 33040
>
> I know I can ask cassandra to give me the zip codes of all cities in OH. But can I ask it to give me the zip codes of all cities named Spring Field using this model? Thanks.

I'd use a dynamic CF:

row key = state abbreviation
column name = city name
column value = zip code (or a complex object, one of whose properties is zip code)

You can iterate over the columns in a single row to get a state's city names and their zip codes, and you can do a get_range_slices on all keys for the columns starting and ending on the city name to find the zip codes for cities with the given name.

I think

- Chris
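Chris's layout can be tried out with a small in-memory model. This is only a sketch: the column family is a nested Python dict rather than a real Cassandra CF, the function names are hypothetical, and the second function merely imitates what a get_range_slices call over all keys with the city name as both start and end column would return.

```python
# Hypothetical in-memory stand-in for the dynamic column family:
# row key = state abbreviation, column name = city name, column value = zip code.
zip_cf = {
    "OH": {"Columbus": "43085", "Spring Field": "45503"},
    "FL": {"Key West": "33040", "Spring Field": "32401"},
}


def cities_in_state(cf, state):
    """Iterate over the columns of one row, in column-name order."""
    return sorted(cf.get(state, {}).items())


def zips_for_city(cf, city):
    """Visit every row and pick out the single column named `city`."""
    return {state: cols[city] for state, cols in sorted(cf.items()) if city in cols}


print(cities_in_state(zip_cf, "OH"))
print(zips_for_city(zip_cf, "Spring Field"))
```

The per-state lookup touches one row, while the per-city lookup touches every row, which matches the trade-off implied in the message above.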