Evan Weaver wrote:
I wanted to start a small discussion to see if there is any interest
in supporting alternative wire protocols or perhaps junking Thrift to
some degree.

Some options:
  * Use JSON over HTTP
  * Use BSON over...something (http://www.mongodb.org/display/DOCS/BSON)
  * Use ASN.1 over...something
  * Use Protocol Buffers over...something
  * Use Thrift, but package Cassandra-specific clients for each language

I have not thought too coherently about this but generic Thrift seems
to be a pain point for everybody.

Hi Evan,

I've been playing around with Cassandra again recently, and I agree Thrift is a pain point; that was already the case when I first looked at the project. But I think the problem is not so much Thrift as how the data is presented to clients.

Much more important to me is that using Cassandra means reading and understanding the service API calls in cassandra.thrift. Personally I wouldn't have designed a fine-grained API over the generic data structures implied by a column store, where simple filters and selects become a litany of get_by_X calls. For example, 4 methods return list<column_t>, 2 return list<string>, 2 return list<superColumn_t>, and there are 5 get_slice and 4 get_column variants. And typical of RPC, none of this stuff composes.

In something like Django there are chained filter() calls (Hibernate has similar Criteria calls), which makes for a stable programming API, where all you need to figure out is the criteria to pass. With Cassandra you have to do that and also find the right method; the API surface is much bigger. Simple keystores and Dynamo-style models get away with fine-grained RPC, as there's not much to do beyond the key lookup and multiget use cases. Those are not a design sweet spot for column store APIs, imvho.
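
To make the composition point concrete, here is a minimal in-memory sketch of what a chained filter() style could look like over a row of columns. Everything in it is hypothetical and for illustration only: Column, ColumnQuery, name_between and newer_than are invented names, not part of cassandra.thrift or any Cassandra client, and the "row" is just a Python list rather than a real store.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass(frozen=True)
class Column:
    """Hypothetical stand-in for a column: name, value, timestamp."""
    name: str
    value: str
    timestamp: int


Predicate = Callable[[Column], bool]


@dataclass(frozen=True)
class ColumnQuery:
    """Hypothetical composable query; each call returns a new query."""
    predicates: Tuple[Predicate, ...] = ()
    max_results: int = -1  # -1 means "no limit"

    def filter(self, predicate: Predicate) -> "ColumnQuery":
        # Add one more criterion without mutating the existing query.
        return ColumnQuery(self.predicates + (predicate,), self.max_results)

    def limit(self, n: int) -> "ColumnQuery":
        return ColumnQuery(self.predicates, n)

    def run(self, columns: Iterable[Column]) -> List[Column]:
        # Keep columns that satisfy every accumulated predicate.
        matched = [c for c in columns if all(p(c) for p in self.predicates)]
        return matched if self.max_results < 0 else matched[: self.max_results]


def name_between(start: str, end: str) -> Predicate:
    """Criterion: column name falls in the inclusive range [start, end]."""
    return lambda c: start <= c.name <= end


def newer_than(ts: int) -> Predicate:
    """Criterion: column timestamp is strictly greater than ts."""
    return lambda c: c.timestamp > ts


# A toy "row" of columns, standing in for data held by the store.
row = [
    Column("a", "1", 10),
    Column("b", "2", 20),
    Column("c", "3", 30),
    Column("d", "4", 40),
]

# One entry point, many criteria: queries are built up by chaining.
recent = ColumnQuery().filter(newer_than(15))
recent_slice = recent.filter(name_between("b", "c")).limit(1)
```

The design point is that a new kind of selection means a new criterion function, not a new method on the service. A fine-grained RPC surface would instead need a separate call for each combination of "by name range", "by timestamp" and "with a limit".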

I think the question for Cassandra is not so much about serialization techniques and speed as whether RPC is the best way to expose the data.

Bill
