Thanks R.V.!! We are also dealing with many small files, so this sounds
really promising.

-- Y.

On Sun, Feb 5, 2012 at 9:59 AM, R. Verlangen <ro...@us2.nl> wrote:

> Yiming, I am using 2 CFs. Performance-wise this should not be an issue. I
> use them as a data store for small files. My 2 CFs are:
>
> FilesMeta
> FilesData
>
>
> 2012/2/5 Yiming Sun <yiming....@gmail.com>
>
>> Interesting idea, Jim.  Is there a reason you don't use
>> "metadata:{accountId}" instead?  For performance reasons?
>>
>>
>> On Sat, Feb 4, 2012 at 6:24 PM, Jim Ancona <j...@anconafamily.com> wrote:
>>
>>> I've used "special" values which still comply with the Composite
>>> schema for the metadata columns, e.g. a column of
>>> 1970-01-01:{accountId} for a metadata column where the Composite is
>>> DateType:UTF8Type.
>>>
>>> Jim
>>>
>>> On Sat, Feb 4, 2012 at 2:13 PM, Yiming Sun <yiming....@gmail.com> wrote:
>>> > Thanks Andrey and Chris.  It sounds like we don't necessarily have to use
>>> > composite columns.  From what I understand about dynamic CF, each row may
>>> > have completely different data from other rows;  but in our case, the data
>>> > in each row is similar to other rows; my concern was more about the
>>> > homogeneity of the data between columns.
>>> >
>>> > In our original supercolumn-based schema, one special supercolumn is called
>>> > "metadata", which contains a number of subcolumns to hold metadata describing
>>> > each collection (e.g. number of documents), then the rest of the
>>> > supercolumns in the same row are all IDs of documents belonging to the
>>> > collection, and for each document supercolumn, the subcolumns contain the
>>> > document content as well as metadata on the individual document (e.g. checksum
>>> > of each document).
>>> >
>>> > To move away from the supercolumn schema, I could either create two CFs, one
>>> > to hold metadata, the other document content; or I could create just one CF
>>> > mixing metadata and doc content in the same row, and using composite column
>>> > names to identify whether a particular column is metadata or a document.  I am
>>> > just wondering if you have any input on the pros and cons of each schema.
>>> >
>>> > -- Y.
>>> >
>>> >
>>> > On Fri, Feb 3, 2012 at 10:27 PM, Chris Gerken <chrisger...@mindspring.com> wrote:
>>> >>
>>> >>
>>> >>
>>> >>
>>> >> On 4 February 2012 06:21, Yiming Sun <yiming....@gmail.com> wrote:
>>> >>>
>>> >>> I cannot have one composite column name with 3 components while
>>> >>> another with 4 components?
>>> >>
>>> >> Just use 4 components and leave the last one empty (if it is the same type).
>>> >>
>>> >>> Another question I have is how flexible composite columns actually are.
>>> >>> If my data model has a CF containing US zip codes with the following
>>> >>> composite columns:
>>> >>>
>>> >>> {OH:Spring Field} : 45503
>>> >>> {OH:Columbus} : 43085
>>> >>> {FL:Spring Field} : 32401
>>> >>> {FL:Key West}  : 33040
>>> >>>
>>> >>> I know I can ask Cassandra to "give me the zip codes of all cities in
>>> >>> OH".  But can I ask it to "give me the zip codes of all cities named
>>> >>> Spring Field" using this model?  Thanks.
>>> >>
>>> >> No. You have to set the first composite component first.
>>> >>
>>> >>
>>> >> I'd use a dynamic CF:
>>> >> row key = state abbreviation
>>> >> column name = city name
>>> >> column value = zip code (or a complex object, one of whose properties is
>>> >> zip code)
>>> >>
>>> >> You can iterate over the columns in a single row to get a state's city
>>> >> names and their zip codes, and you can do a get_range_slices on all keys for
>>> >> the columns starting and ending on the city name to find out the zip codes
>>> >> for cities with the given name.
>>> >>
>>> >> I think
>>> >>
>>> >> - Chris
>>> >
>>> >
>>>
>>
>>
>
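
For reference, the zip code question quoted above can be sketched in plain Python with no Cassandra client involved. This is only an in-memory model, assuming that columns within a row are kept sorted by name; the function names (zips_in_state, zips_for_city_scan, zips_for_city) are made up for illustration. It shows why a lookup on the first composite component is a contiguous slice, why a lookup on the second component alone has to look at every column, and how the dynamic CF that Chris suggests (row key = state, column name = city) would answer the same two questions.

```python
# Layout 1: one row whose composite column names are (state, city).
# A sorted list stands in for Cassandra's sorted column storage (assumption).
composite_row = sorted([
    (("OH", "Spring Field"), "45503"),
    (("OH", "Columbus"), "43085"),
    (("FL", "Spring Field"), "32401"),
    (("FL", "Key West"), "33040"),
])


def zips_in_state(state):
    """Slice on the first component: matching columns are contiguous."""
    result = {}
    started = False
    for (st, city), zip_code in composite_row:
        if st == state:
            result[city] = zip_code
            started = True
        elif started:
            break  # past the contiguous run for this state
    return result


def zips_for_city_scan(city):
    """Second component only: matches are scattered, so scan every column."""
    return {st: z for (st, c), z in composite_row if c == city}


# Layout 2: the dynamic CF from the thread, one row per state,
# column name = city, column value = zip code.
rows_by_state = {}
for (st, city), zip_code in composite_row:
    rows_by_state.setdefault(st, {})[city] = zip_code


def zips_for_city(city):
    """Mimics a range query over all row keys for a single column name."""
    return {st: cols[city] for st, cols in rows_by_state.items() if city in cols}
```

In both layouts the per-city lookup still touches every state; the dynamic CF only makes each per-state read a single named column instead of a scan.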

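Jim's "special value" idea from the quoted thread can be modeled the same way. The sketch below is a plain Python illustration, not Cassandra code: it assumes a DateType:UTF8Type composite compares like a (date, text) tuple, that every real data column carries a date later than 1970-01-01, and the row contents and the name split_row are hypothetical.

```python
from datetime import date

# Sentinel date reserved for metadata columns, as in "1970-01-01:{accountId}".
METADATA_DATE = date(1970, 1, 1)

# Hypothetical row: composite column names of (date, text).
row = {
    (date(2012, 2, 3), "doc-17"): "document content",
    (METADATA_DATE, "acct-42"): "metadata for acct-42",
    (date(2011, 12, 25), "doc-03"): "document content",
}


def split_row(row):
    """Return (metadata names, data names), each in sorted column order."""
    metadata, data = [], []
    for name in sorted(row):
        if name[0] == METADATA_DATE:
            metadata.append(name)
        else:
            data.append(name)
    return metadata, data
```

Because the sentinel date sorts before any real date, the metadata columns sit at the head of the row and still satisfy the composite schema, which a plain "metadata:{accountId}" name would not if the first component must be a date.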