Thanks R.V.!! We are also dealing with many small files, so this sounds really promising.
-- Y.

On Sun, Feb 5, 2012 at 9:59 AM, R. Verlangen <ro...@us2.nl> wrote:

> Yiming, I am using 2 CFs. Performance-wise this should not be an issue. I
> use it as a data store for small files. My 2 CFs are:
>
> FilesMeta
> FilesData
>
>
> 2012/2/5 Yiming Sun <yiming....@gmail.com>
>
>> Interesting idea, Jim. Is there a reason you don't use
>> "metadata:{accountId}" instead? For performance reasons?
>>
>>
>> On Sat, Feb 4, 2012 at 6:24 PM, Jim Ancona <j...@anconafamily.com> wrote:
>>
>>> I've used "special" values which still comply with the Composite
>>> schema for the metadata columns, e.g. a column of
>>> 1970-01-01:{accountId} for a metadata column where the Composite is
>>> DateType:UTF8Type.
>>>
>>> Jim
>>>
>>> On Sat, Feb 4, 2012 at 2:13 PM, Yiming Sun <yiming....@gmail.com> wrote:
>>> > Thanks Andrey and Chris. It sounds like we don't necessarily have to use
>>> > composite columns. From what I understand about dynamic CFs, each row may
>>> > have completely different data from other rows; but in our case, the data
>>> > in each row is similar to other rows; my concern was more about the
>>> > homogeneity of the data between columns.
>>> >
>>> > In our original supercolumn-based schema, one special supercolumn is called
>>> > "metadata", which contains a number of subcolumns to hold metadata describing
>>> > each collection (e.g. number of documents, etc.). The rest of the
>>> > supercolumns in the same row are all IDs of documents belonging to the
>>> > collection, and for each document supercolumn, the subcolumns contain the
>>> > document content as well as metadata on the individual document (e.g. a
>>> > checksum of each document).
>>> >
>>> > To move away from the supercolumn schema, I could either create two CFs, one
>>> > to hold metadata, the other document content; or I could create just one CF
>>> > mixing metadata and doc content in the same row, and using composite column
>>> > names to identify whether a particular column is metadata or a document. I am
>>> > just wondering if you have any input on the pros and cons of each schema.
>>> >
>>> > -- Y.
>>> >
>>> >
>>> > On Fri, Feb 3, 2012 at 10:27 PM, Chris Gerken <chrisger...@mindspring.com>
>>> > wrote:
>>> >>
>>> >>
>>> >> On 4 February 2012 06:21, Yiming Sun <yiming....@gmail.com> wrote:
>>> >>>
>>> >>> I cannot have one composite column name with 3 components while another
>>> >>> with 4 components?
>>> >>
>>> >> Just put 4 components and leave the last one empty (if it is the same type)?!
>>> >>
>>> >>> Another question I have is how flexible composite columns actually are.
>>> >>> If my data model has a CF containing US zip codes with the following
>>> >>> composite columns:
>>> >>>
>>> >>> {OH:Spring Field} : 45503
>>> >>> {OH:Columbus} : 43085
>>> >>> {FL:Spring Field} : 32401
>>> >>> {FL:Key West} : 33040
>>> >>>
>>> >>> I know I can ask Cassandra to "give me the zip codes of all cities in
>>> >>> OH". But can I ask it to "give me the zip codes of all cities named Spring
>>> >>> Field" using this model? Thanks.
>>> >>
>>> >> No. You have to set the first composite component first.
>>> >>
>>> >>
>>> >> I'd use a dynamic CF:
>>> >> row key = state abbreviation
>>> >> column name = city name
>>> >> column value = zip code (or a complex object, one of whose properties is
>>> >> zip code)
>>> >>
>>> >> You can iterate over the columns in a single row to get a state's city
>>> >> names and their zip codes, and you can do a get_range_slices on all keys for
>>> >> the columns starting and ending on the city name to find out the zip codes
>>> >> for all cities with the given name.
>>> >>
>>> >> I think
>>> >>
>>> >> - Chris
>>> >
>>> >
>>>
>>
>>
>
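
For anyone reading this thread later, here is a rough sketch of the zip code question discussed above. It is plain in-memory Python, not a Cassandra client, and it only assumes that composite column names within a row sort component by component. The function names (zips_in_state, zips_for_city_scan, zips_for_city) are made up for illustration. It shows why "all cities in OH" is a contiguous slice, why "all cities named Spring Field" is not, and how the dynamic CF layout (row key = state, column name = city) answers the second question by reading one column from every row.

```python
from bisect import bisect_left, bisect_right

# Composite column names (state, city), kept sorted the way a two-component
# composite comparator would order them inside a single row (assumption).
columns = sorted([
    (("OH", "Spring Field"), "45503"),
    (("OH", "Columbus"), "43085"),
    (("FL", "Spring Field"), "32401"),
    (("FL", "Key West"), "33040"),
])
names = [name for name, _ in columns]


def zips_in_state(state):
    """Contiguous slice: every column whose first component equals `state`."""
    lo = bisect_left(names, (state,))
    hi = bisect_right(names, (state, chr(0x10FFFF)))
    return {names[i][1]: columns[i][1] for i in range(lo, hi)}


def zips_for_city_scan(city):
    """Matches on the second component are not contiguous, so scan every column."""
    return {state: zip_code for (state, c), zip_code in columns if c == city}


# Dynamic CF layout: row key = state, column name = city, value = zip code.
rows = {
    "OH": {"Columbus": "43085", "Spring Field": "45503"},
    "FL": {"Key West": "33040", "Spring Field": "32401"},
}


def zips_for_city(city):
    """Visit every row key and read the single column named `city`, if present."""
    return {state: cols[city] for state, cols in rows.items() if city in cols}


if __name__ == "__main__":
    print(zips_in_state("OH"))
    print(zips_for_city_scan("Spring Field"))
    print(zips_for_city("Spring Field"))
```

In both layouts the by-city lookup touches every state. The dynamic CF version just spreads that work across rows, which is roughly what a get_range_slices over all keys with a one-column slice would do.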