Comments inline.

On Tue, Mar 24, 2009 at 6:53 PM, Jun Rao <[email protected]> wrote:

>
> Prashant,
>
> Thanks for the comments. They are quite useful. Let me try to address some
> of the points that you made.
>
> 1. It is true that in our current implementation, we can glue the changes
> on both the data and the index in one batch_update() call. This way, the
> data and the index will be maintained synchronously. However, maintaining
> the index on the server is likely more efficient since there is less
> communication overhead. You seem to agree with this.


[Avinash] You can update multiple column families for a single key in one
mutation.

>
>
> 2. Cassandra currently doesn't acquire row-lock for row accesses. However,
> the implication is that a reader may see partial updates of a row. For
> example, suppose that a writer updates two columns in different CFs. Then,
> it is possible for a concurrent reader to see the update on one column, but
> not the other one. For some applications, row-level consistency could be
> important. It's probably for this reason, in HBase, a region server
> acquires a row lock for every read and write.


[Avinash] Updates to a single row within a machine are atomic. Which means
what you are stating will not happen. Writes and reads will be serialized at
the Memtable.

>
>
> 3. For our current application, the size of all entities in a group is not
> too large and likely fits within the capacity of a single node. However,
> for other applications, being able to scale a group to more than a node
> could be useful. Storing a group within a single row will prevent scaling
> out the group.

[Avinash] I guess the question is how many entities do you envision in a
group. What do you mean by fitting into one node?

>
>
> Jun
> IBM Almaden Research Center
> K55/B1, 650 Harry Road, San Jose, CA  95120-6099
>
> [email protected]
>
>
> Prashant Malik <[email protected]> wrote on 03/24/2009 11:34:51 AM:
>
> > Some questions Iline
> >
> > On Tue, Mar 24, 2009 at 10:21 AM, Jun Rao <[email protected]>
> wrote:
> >
> > >
> > >
> > > We have an application that has groups and entities. A group has many
> > > entities and an entity has a bunch of (attribute, value) pairs. A
> common
> > > access pattern is to select some number of entities within a group with
> > > attribute X equals to x and ordered by attribute Y. For efficiency, we
> want
> > > to build a secondary index for each group and collocate a group and its
> > > secondary index on the same node. Our current approach is to map a
> group to
> > > a row in Cassandra and each entity to a column in a column family (CF).
> > > Within the same row, we use a separate CF (ordered by name) to
> implement a
> > > secondary index, say on attribute X and Y. In this family, each column
> name
> > > has the form of X:x:Y:y:entityID. We extended the get_slice() function
> so
> > > that it can get a slice of columns starting from a given column. The
> > > extended function uses the column-level index to locate the starting
> column
> > > quickly. (We'd be happy to contribute this extension back to Cassandra
> if
> > > people find this useful). Using the extended get_slice(), we were able
> to
> > > access the entities through the simulated secondary index.
> > >
> > > We see a couple of problems with the current approach. First, our
> > > application has to maintain the index. This is inefficient and could
> leave
> > > the index inconsistent when failure occurs.  Second, mapping each
> entity to
> > > a column may not be a good idea. Often, there is some sort of locking
> for
> > > each row access. Putting many entities per row limits concurrency.
> Today,
> > > in Cassandra, a full row is deserialized into memory during compaction.
> > > This limits the number of entities that can be put in a single row.
> Also,
> > > intuitively, an entity is more naturally represented as a row with
> > > attributes stored as columns.
> >
> >
> > Prashant
> >
> >    1. Application can send the index  and the entityId update in the same
> > write , a write per row is always atomic given that teh index and the
> data
> > have teh same key in the above
> >        case the index will not be out of sync.
> >     2. The maintainance of index by app can be moved into cassandra and I
> > agree with fact that you can add support for it by a built in special CF
> > which you have to do in either of the approaches you are taking.
> >         Infact in the first approach that you are taking it will be
> easier
> > to move the indexes in case of adding nodes to the cluster and when files
> > are split and data is sent over. In the second approach this
> >         process could get comlicated.
> >     3. There is no locking for row access in cassandra.
> >     4. Compactions can be opmtimized for name sorted columns , this is
> one
> > of the workitems we have where we do not deserialize the entire row in
> > compactiopn but only do it in slices , this is easily achievable for
> >         name sorted columns.
> >     5. The entity can still be represented naturally as a supercolumn
> where
> > each of the super column name is the entity Id and each of the columns in
> > the supercolumn are attribute value pairs.
> >     6.  How many entities per  groupid are we talking about here ? why do
> > you feel concurrency is limited by entities per row ?
> >
> >
> > >
> > > To address the above problems, we are thinking of the following new
> > > implementation. Each entity is mapped to a row in Cassandra and uses a
> > > two-part key (groupID, entityID). We use the groupID to hash an entity
> to a
> > > node. This way, all entities for a group will be collocated in the same
> > > node. We then define a special CF to serve as the secondary index. In
> the
> > > definition, we specify what entity attributes need to be indexed  and
> in
> > > what order. Within a node, this special CF will index all rows stored
> > > locally. Every time we insert a new entity, the server automatically
> > > extracts the index key based on the index definition (for example, the
> > > index key can be of the form "hash
> (rowkey):attribute1:attribute2:rowkey)
> > > and add the index entry to the special CF. We can then access the
> entities
> > > using an extended version of the query language in Cassandra. For
> example,
> > > if we issue the following query and there is an index defined by
> > > (attributeX, attributeY), the query can be evaluated using the index in
> the
> > > special CF. (Note that AppEngine supports this flavor of queries.)
> > >
> > > select attributeZ
> > > from ROWS(HASH = hash(groupID))
> > > where attributeX="x"
> > > order by attributeY desc
> > > limit 50
> > >
> > > We are in the middle of prototyping this approach. We'd like to hear if
> > > other people are interested in this too or if people think there are
> better
> > > alternatives.
> > >
> >
> >
> > Prashant
> >
> >    1. In this approach I see a number of problems during addition ,
> removal
> > and moving of  new nodes. These problems are not impossible to solve but
> > just harder and inefficient with the above approach.
> >    2. What are the wins of this approach over the other is not clear to
> me.
> > could you please highlight those.
> >
> >
> > >
> > > Jun
> > > IBM Almaden Research Center
> > > K55/B1, 650 Harry Road, San Jose, CA  95120-6099
> > >
> > > [email protected]
>

Reply via email to