[Cassandra Wiki] Update of "API" by NickTelford

Apache Wiki Tue, 02 Mar 2010 04:16:35 -0800

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for 
change notification.


The "API" page has been changed by NickTelford.
http://wiki.apache.org/cassandra/API?action=diff&rev1=41&rev2=42

--------------------------------------------------

  == Overview ==
- The Cassandra Thrift API changed between 0.3 and 0.4; this document explains 
the 0.4 version.  The 0.3 API is described in [[API03]].
+ The Cassandra Thrift API has changed with each release from 0.3 through 0.6; 
this document explains the 0.5 version, with annotations for the changes in 
0.6. The [[API03|0.3 API]] and [[API04|0.4 API]] are archived for reference.
  
  Cassandra's client API is built entirely on top of Thrift. It should be noted 
that these documents mention default values, but these are not generated in all 
of the languages that Thrift supports.
  
@@ -30, +30 @@

  ==== Write ====
  ||'''Level''' ||'''Behavior''' ||
  ||`ZERO` ||Ensure nothing. A write happens asynchronously in background ||
- ||`ANY` ||(Coming in 0.6) Ensure that the write has been written to at least 
1 node, including hinted recipients. ||
+ ||`ANY` ||(Requires 0.6) Ensure that the write has been written to at least 1 
node, including hinted recipients. ||
  ||`ONE` ||Ensure that the write has been written to at least 1 node's commit 
log and memory table before responding to the client. ||
  ||`QUORUM` ||Ensure that the write has been written to `<ReplicationFactor> / 
2 + 1` nodes before responding to the client. ||
  ||`ALL` ||Ensure that the write is written to all `<ReplicationFactor>` nodes 
before responding to the client.  Any unresponsive nodes will fail the 
operation. ||
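
The `QUORUM` row above gives the replica count as `<ReplicationFactor> / 2 + 1`. A minimal sketch of that arithmetic follows, assuming the division is integer division; the function name `quorum` is made up for illustration and is not part of the Thrift API.

```python
def quorum(replication_factor: int) -> int:
    """Replicas that must acknowledge a QUORUM write (illustrative helper)."""
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    # <ReplicationFactor> / 2 + 1, with the division assumed to round down
    return replication_factor // 2 + 1


for rf in (1, 2, 3, 5):
    print("ReplicationFactor", rf, "-> quorum of", quorum(rf))
```

With a replication factor of 3, for example, two replicas must acknowledge the write, so one replica can be down without failing the operation.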
- 
  
  ==== Read ====
  ||'''Level''' ||'''Behavior''' ||
@@ -44, +43 @@

  ||`QUORUM` ||Will query all storage nodes and return the record with the most 
recent timestamp once it has at least a majority of replicas reported.  Again, 
the remaining replicas will be checked in the background. ||
  ||`ALL` ||Will query all storage nodes and return the record with the most 
recent timestamp once all nodes have replied.  Any unresponsive nodes will fail 
the operation. ||
  
+ === ColumnOrSuperColumn ===
+ Due to the lack of inheritance in Thrift, `Column` and `SuperColumn` 
structures are aggregated by the `ColumnOrSuperColumn` structure. This is used 
wherever either a `Column` or `SuperColumn` would normally be expected.
  
- === ColumnPath and ColumnParent ===
+ If the underlying column is a `Column`, it will be contained within the 
`column` attribute. If the underlying column is a `SuperColumn`, it will be 
contained within the `super_column` attribute. The two are mutually exclusive - 
i.e. only one may be populated.
+ ||'''Attribute''' ||'''Type''' ||'''Default''' ||'''Required''' 
||'''Description''' ||
+ ||`column` ||`Column` ||n/a ||N ||The `Column` if this `ColumnOrSuperColumn` 
is aggregating a `Column`.||
+ ||`super_column` ||`SuperColumn` ||n/a ||N ||The `SuperColumn` if this 
`ColumnOrSuperColumn` is aggregating a `SuperColumn` ||
+ 
+ === Column ===
+ A `Column` is a triplet of a name, a value and a timestamp. As described 
above, `Column` names are unique within a row. Timestamps are arbitrary: they 
can be any integer you specify, but they must be used consistently across your 
application. A fine-grained timestamp, such as milliseconds since the UNIX 
epoch, is recommended. See [[DataModel]] for more information.
+ ||'''Attribute''' ||'''Type''' ||'''Default''' ||'''Required''' 
||'''Description''' ||
+ ||`name` ||`binary` ||n/a ||Y ||The name of the `Column`. ||
+ ||`value` ||`binary` ||n/a ||Y ||The value of the `Column`. ||
+ ||`timestamp` ||`i64` ||n/a ||Y ||The timestamp of the `Column`. ||
+ 
+ === SuperColumn ===
+ A `SuperColumn` contains no data itself, but instead stores another level of 
`Columns` below the key. See [[DataModel]] for more details on what 
`SuperColumns` are and how they should be used.
+ ||'''Attribute''' ||'''Type''' ||'''Default''' ||'''Required''' 
||'''Description''' ||
+ ||`name` ||`binary` ||n/a ||Y ||The name of the `SuperColumn`. ||
+ ||`columns` ||`list<Column>` ||n/a ||Y ||The `Columns` within the 
`SuperColumn`. ||
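
The three structures above can be modelled roughly as follows. This is a sketch using plain Python dataclasses rather than Thrift-generated classes; the exclusivity check in `__post_init__` and the `unwrap` helper are illustrative assumptions, not behaviour the generated code is known to have.

```python
import time
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class Column:
    name: bytes
    value: bytes
    timestamp: int  # i64 in Thrift


@dataclass
class SuperColumn:
    name: bytes
    columns: List[Column] = field(default_factory=list)


@dataclass
class ColumnOrSuperColumn:
    column: Optional[Column] = None
    super_column: Optional[SuperColumn] = None

    def __post_init__(self) -> None:
        # Hypothetical guard: the two attributes are mutually exclusive.
        if self.column is not None and self.super_column is not None:
            raise ValueError("only one of column / super_column may be set")

    def unwrap(self) -> Union[Column, SuperColumn, None]:
        """Return whichever attribute is populated (hypothetical helper)."""
        return self.column if self.column is not None else self.super_column


# Milliseconds since the UNIX epoch, as recommended for timestamps.
now_ms = int(time.time() * 1000)
email = Column(name=b"email", value=b"user@example.org", timestamp=now_ms)
print(ColumnOrSuperColumn(column=email).unwrap())
```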
+ 
+ === ColumnPath ===
  The `ColumnPath` is the path to a single column in Cassandra. It might make 
sense to think of `ColumnPath` and `ColumnParent` in terms of a directory 
structure.
  ||'''Attribute''' ||'''Type''' ||'''Default''' ||'''Required''' 
||'''Description''' ||
  ||`column_family` ||`string` ||n/a ||Y ||The name of the CF of the column 
being looked up. ||
  ||`super_column` ||`binary` ||n/a ||N ||The super column name. ||
  ||`column` ||`binary` ||n/a ||N ||The column name. ||
  
- 
- 
- 
- `ColumnPath` is used to looking up a single column.  `ColumnParent` is used 
when selecting groups of columns from the same !ColumnFamily.  In directory 
structure terms, imagine `ColumnParent` as `ColumnPath + '/../'`.
+ === ColumnParent ===
+ The `ColumnParent` is the path to the parent of a particular set of 
`Columns`. It is used when selecting groups of columns from the same 
!ColumnFamily. In directory structure terms, imagine `ColumnParent` as 
`ColumnPath + '/../'`.
+ ||'''Attribute''' ||'''Type''' ||'''Default''' ||'''Required''' 
||'''Description''' ||
+ ||`column_family` ||`string` ||n/a ||Y ||The name of the CF of the column 
being looked up. ||
+ ||`super_column` ||`binary` ||n/a ||N ||The super column name. ||
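
The directory analogy can be made concrete with a small sketch. The dataclasses below only mirror the attribute tables above, and `parent_of` is a hypothetical helper that does not exist in the Thrift API; it shows what "`ColumnPath + '/../'`" would mean for each kind of path.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ColumnPath:
    column_family: str
    super_column: Optional[bytes] = None
    column: Optional[bytes] = None


@dataclass(frozen=True)
class ColumnParent:
    column_family: str
    super_column: Optional[bytes] = None


def parent_of(path: ColumnPath) -> ColumnParent:
    """Drop the last component of the path, like appending '/../'."""
    if path.column is None:
        # The path names a super column (or just the CF): the parent is the CF.
        return ColumnParent(path.column_family)
    # The path names a column: the parent is the CF or CF/SuperColumn above it.
    return ColumnParent(path.column_family, path.super_column)


print(parent_of(ColumnPath("Users", b"address", b"city")))
print(parent_of(ColumnPath("Users", column=b"email")))
```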
  
  === SlicePredicate ===
  A `SlicePredicate` is similar to a 
[[http://en.wikipedia.org/wiki/Predicate_(mathematical_logic)|mathematical 
predicate]], which is described as "a property that the elements of a set have 
in common."
  
  A `SlicePredicate` in Cassandra is described with either a list of 
`column_names` or a `SliceRange`.
  ||'''Attribute''' ||'''Type''' ||'''Default''' ||'''Required''' 
||'''Description''' ||
- ||`column_names` ||`list` ||n/a ||N ||A list of column names to retrieve. 
This can be used similar to Memcached's "multi-get" feature to fetch N known 
column names. For instance, if you know you wish to fetch columns 'Joe', 
'Jack', and 'Jim' you can pass those column names as a list to fetch all three 
at once. ||
+ ||`column_names` ||`list<binary>` ||n/a ||N ||A list of column names to 
retrieve. This can be used similar to Memcached's "multi-get" feature to fetch 
N known column names. For instance, if you know you wish to fetch columns 
'Joe', 'Jack', and 'Jim' you can pass those column names as a list to fetch all 
three at once. ||
  ||`slice_range` ||`SliceRange` ||n/a ||N ||A `SliceRange` describing how to 
range, order, and/or limit the slice. ||
  
- 
- 
- 
  If `column_names` is specified, `slice_range` is ignored.
  
  === SliceRange ===
- A slice range is a structure that stores basic range, ordering and limit 
information for a query that will return multiple columns. It could be thought 
of as Cassandra's version of `LIMIT` and `ORDER BY`.
+ A `SliceRange` is a structure that stores basic range, ordering and limit 
information for a query that will return multiple columns. It could be thought 
of as Cassandra's version of `LIMIT` and `ORDER BY`.
  ||'''Attribute''' ||'''Type''' ||'''Default''' ||'''Required''' 
||'''Description''' ||
  ||`start` ||`binary` ||n/a ||Y ||The column name to start the slice with. 
This attribute is not required, though there is no default value, and can be 
safely set to `''`, i.e., an empty byte array, to start with the first column 
name.  Otherwise, it must be a valid value under the rules of the Comparator 
defined for the given `ColumnFamily`. ||
  ||`finish` ||`binary` ||n/a ||Y ||The column name to stop the slice at. This 
attribute is not required, though there is no default value, and can be safely 
set to an empty byte array to not stop until `count` results are seen. 
Otherwise, it must also be a valid value to the `ColumnFamily` Comparator. ||
- ||`reversed` ||`bool` ||`false` ||N ||Whether the results should be ordered 
in reversed order. Similar to `ORDER BY blah DESC` in SQL. ||
+ ||`reversed` ||`bool` ||`false` ||Y ||Whether the results should be ordered 
in reversed order. Similar to `ORDER BY blah DESC` in SQL. ||
- ||`count` ||`integer` ||`100` ||N ||How many columns to return. Similar to 
`LIMIT 100` in SQL. May be arbitrarily large, but Thrift will materialize the 
whole result into memory before returning it to the client, so be aware that 
you may be better served by iterating through slices by passing the last value 
of one call in as the `start` of the next instead of increasing `count` 
arbitrarily large. ||
+ ||`count` ||`integer` ||`100` ||Y ||How many columns to return. Similar to 
`LIMIT 100` in SQL. May be arbitrarily large, but Thrift will materialize the 
whole result into memory before returning it to the client, so be aware that 
you may be better served by iterating through slices by passing the last value 
of one call in as the `start` of the next instead of increasing `count` 
arbitrarily large. ||
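
The `count` description suggests iterating through slices by passing the last value of one call in as the `start` of the next. The sketch below illustrates that loop against a toy in-memory row. `get_slice` here is a stand-in written for this example, not the Thrift method, and it assumes an inclusive `start` and a byte-order Comparator.

```python
from typing import Dict, Iterator, List, Tuple

Row = Dict[bytes, bytes]


def get_slice(row: Row, start: bytes = b"", finish: bytes = b"",
              count: int = 100) -> List[Tuple[bytes, bytes]]:
    """Toy in-memory stand-in for a slice query on one row."""
    selected = [
        name for name in sorted(row)  # assumes a byte-order Comparator
        if (start == b"" or name >= start) and (finish == b"" or name <= finish)
    ]
    return [(name, row[name]) for name in selected[:count]]


def iter_columns(row: Row, page_size: int = 100) -> Iterator[Tuple[bytes, bytes]]:
    """Yield every column, fetching at most page_size columns per call."""
    if page_size < 2:
        # Each later page repeats one column, so a page of 1 makes no progress.
        raise ValueError("page_size must be at least 2")
    start = b""
    while True:
        page = get_slice(row, start=start, count=page_size)
        if start != b"":
            # start is inclusive, so drop the column already seen last time.
            page = page[1:]
        if not page:
            return
        yield from page
        start = page[-1][0]


row = {b"col%02d" % i: b"value" for i in range(7)}
names = [name for name, _ in iter_columns(row, page_size=3)]
print(names == sorted(row))
```

Keeping `page_size` modest bounds how much of the result is materialized in memory per call, at the cost of more round trips.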
  
+ === KeyRange ===
+ '''''Requires Cassandra 0.6'''''
  
+ A `KeyRange` is used by `get_range_slices` to define the range of keys to get 
the slices for.
  
+ The semantics of start keys and tokens are slightly different. Keys are 
start-inclusive; tokens are start-exclusive. Token ranges may also wrap -- that 
is, the end token may be less than the start one. Thus, a range from keyX to 
keyX is a one-element range, but a range from tokenY to tokenY is the full ring.
- 
- === ColumnOrSuperColumn ===
- Methods for fetching rows/records from Cassandra will return either a single 
instance of `ColumnOrSuperColumn` (`get()`) or a list of 
`ColumnOrSuperColumn`'s (`get_slice()`). If you're looking up a `SuperColumn` 
(or list of `SuperColumn`'s) then the resulting instances of 
`ColumnOrSuperColumn` will have the requested `SuperColumn` in the attribute 
`super_column`. For queries resulting in `Column`'s those values will be in the 
attribute `column`. This change was made between 0.3 and 0.4 to standardize on 
single query methods that may return either a `SuperColumn` or `Column`.
  ||'''Attribute''' ||'''Type''' ||'''Default''' ||'''Required''' 
||'''Description''' ||
- ||`column` ||`Column` ||n/a ||N ||The `Column` returned by `get()` or 
`get_slice()`. ||
- ||`super_column` ||`SuperColumn` ||n/a ||N ||The `SuperColumn` returned by 
`get()` or `get_slice()`. ||
+ ||`start_key` ||`string` ||n/a ||N ||The first key in the inclusive 
`KeyRange`. ||
+ ||`end_key` ||`string` ||n/a ||N ||The last key in the inclusive `KeyRange`. 
||
+ ||`start_token` ||`string` ||n/a ||N ||The first token in the exclusive 
`KeyRange`. ||
+ ||`end_token` ||`string` ||n/a ||N ||The last token in the exclusive 
`KeyRange`. ||
+ ||`count` ||`i32` ||100 ||Y ||The total number of keys to permit in the 
`KeyRange`. ||
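
The difference between key ranges and token ranges described above can be sketched as two membership tests. Several things here are assumptions made for illustration: tokens are modelled as integers (the struct carries them as strings), the end of a token range is treated as inclusive, and both function names are hypothetical.

```python
def key_in_range(key: str, start_key: str, end_key: str) -> bool:
    """Keys are start-inclusive, so keyX..keyX contains exactly keyX."""
    return start_key <= key <= end_key


def token_in_range(token: int, start_token: int, end_token: int) -> bool:
    """Tokens are start-exclusive and the range may wrap around the ring."""
    if start_token == end_token:
        return True  # tokenY..tokenY is the full ring
    if start_token < end_token:
        return start_token < token <= end_token
    # Wrapping range: the end token is less than the start token.
    return token > start_token or token <= end_token


print(key_in_range("k1", "k1", "k1"))
print(token_in_range(10, 80, 20))
```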
  
+ === KeySlice ===
+ '''''Requires Cassandra 0.6'''''
  
+ A `KeySlice` encapsulates a mapping of a key to the slice of columns for it, 
as returned by the `get_range_slices` operation. Normally, when slicing a 
single key, a `list<ColumnOrSuperColumn>` of the slice would be returned. When 
slicing multiple keys or a range of keys, a `list<KeySlice>` is returned 
instead so that each slice can be mapped to its key.
+ ||'''Attribute''' ||'''Type''' ||'''Default''' ||'''Required''' 
||'''Description''' ||
+ ||`key` ||`string` ||n/a ||Y ||The key for the slice. ||
+ ||`columns` ||`list<ColumnOrSuperColumn>` ||n/a ||Y ||The columns in the 
slice. ||
  
+ === TokenRange ===
+ '''''Requires Cassandra 0.6'''''
+ 
+ A structure representing structural information about the cluster provided by 
the `describe` utility methods detailed below.
+ ||'''Attribute''' ||'''Type''' ||'''Default''' ||'''Required''' 
||'''Description''' ||
+ ||`start_token` ||`string` ||n/a ||Y ||The first token in the `TokenRange`. ||
+ ||`end_token` ||`string` ||n/a ||Y ||The last token in the `TokenRange`. ||
+ ||`endpoints` ||`list<string>` ||n/a ||Y ||A list of the endpoints (nodes) 
that replicate data in the `TokenRange`. ||
+ 
+ === Mutation ===
+ '''''Requires Cassandra 0.6'''''
+ 
+ A `Mutation` encapsulates either a column to insert, or a deletion to execute 
for a key. Like `ColumnOrSuperColumn`, the two properties are mutually 
exclusive - you may only set one on a Mutation.
+ ||'''Attribute''' ||'''Type''' ||'''Default''' ||'''Required''' 
||'''Description''' ||
+ ||`column_or_supercolumn` ||`ColumnOrSuperColumn` ||n/a ||N ||The column to 
insert in to the key. ||
+ ||`deletion` ||`Deletion` ||n/a ||N ||The deletion to execute on the key. ||
+ 
+ === Deletion ===
+ '''''Requires Cassandra 0.6'''''
+ 
+ A `Deletion` encapsulates an operation that will delete all columns matching 
the specified `timestamp` and `predicate`. If `super_column` is specified, the 
`Deletion` will operate on columns within the `SuperColumn` - otherwise it will 
operate on columns in the top-level of the key.
+ ||'''Attribute''' ||'''Type''' ||'''Default''' ||'''Required''' 
||'''Description''' ||
+ ||`timestamp` ||`i64` ||n/a ||Y ||The timestamp of the column(s) to be 
deleted. ||
+ ||`super_column` ||`binary` ||n/a ||N ||The super column to delete the 
column(s) from. ||
+ ||`predicate` ||`SlicePredicate` ||n/a ||N ||A predicate to match the 
column(s) to be deleted from the key/super column. ||
+ 
+ === AuthenticationRequest ===
+ '''''Requires Cassandra 0.6'''''
+ 
+ A structure that encapsulates a request for the connection to be 
authenticated. The authentication credentials are arbitrary - this structure 
simply provides a mapping of credential name to credential value.
+ ||'''Attribute''' ||'''Type''' ||'''Default''' ||'''Required''' 
||'''Description''' ||
+ ||`credentials` ||`map<string, string>` ||n/a ||Y ||A map of named 
credentials. ||
  
  == Method calls ==
  === get ===
   . `ColumnOrSuperColumn get(keyspace, key, column_path, consistency_level)`
  
- Get the `Column` or `SuperColumn` at the given `column_path`.  If no value is 
present, NotFoundException is thrown.  (This is the only method that can throw 
an exception under non-failure conditions.)
+ Get the `Column` or `SuperColumn` at the given `column_path`.  If no value is 
present, `NotFoundException` is thrown.  (This is the only method that can 
throw an exception under non-failure conditions.)
  
  === get_slice ===
   . `list<ColumnOrSuperColumn> get_slice(keyspace, key, column_parent, 
predicate, consistency_level)`
@@ -102, +161 @@

  Get the group of columns contained by `column_parent` (either a 
`ColumnFamily` name or a `ColumnFamily/SuperColumn` name pair) specified by the 
given `SlicePredicate` struct.
  
  === multiget ===
+ ''Deprecated in 0.6 - use `multiget_slice` instead''
+ 
   . `map<string,ColumnOrSuperColumn> multiget(keyspace, keys, column_path, 
consistency_level)`
  
  Perform a `get` for `column_path` in parallel on the given `list<string> 
keys`.  The return value maps keys to the `ColumnOrSuperColumn` found.  If no 
value corresponding to a key is present, the key will still be in the map, but 
both the `column` and `super_column` references of the `ColumnOrSuperColumn` 
object it maps to will be null.
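
Because missing keys still appear in the `multiget` result, callers need to check both attributes before using an entry. A minimal sketch of that filtering follows, using a plain dataclass in place of the Thrift-generated `ColumnOrSuperColumn`; the helper `found_only` and the sample data are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ColumnOrSuperColumn:
    column: Optional[object] = None
    super_column: Optional[object] = None


def found_only(result: Dict[str, ColumnOrSuperColumn]) -> Dict[str, ColumnOrSuperColumn]:
    """Keep only the keys for which a Column or SuperColumn was found."""
    return {
        key: cosc for key, cosc in result.items()
        if cosc.column is not None or cosc.super_column is not None
    }


# Shape of a multiget result where "bob" has no value at the column_path.
result = {
    "alice": ColumnOrSuperColumn(column=("email", "alice@example.org", 1)),
    "bob": ColumnOrSuperColumn(),
}
print(sorted(found_only(result)))
```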
@@ -109, +170 @@

  === multiget_slice ===
   . `map<string,list<ColumnOrSuperColumn>> multiget_slice(keyspace, keys, 
column_parent, predicate, consistency_level)`
  
- Performs a `get_slice` for `column_parent` and `predicate` for the given keys 
in parallel.
+ Retrieves slices for `column_parent` and `predicate` on each of the given 
keys in parallel. `keys` is a `list<string>` of the keys to get slices for.
+ 
+ This is similar to `get_range_slices` (Cassandra 0.6) or `get_range_slice` 
(Cassandra 0.5) except operating on a set of non-contiguous keys instead of a 
range of keys. 
  
  === get_count ===
   . `i32 get_count(keyspace, key, column_parent, consistency_level)`
@@ -119, +182 @@

  The method is not O(1). It takes all the columns from disk to calculate the 
answer. The only benefit of the method is that you do not need to pull all the 
columns over Thrift interface to count them.
  
  === get_range_slice ===
+ ''Deprecated in 0.6 - use `get_range_slices` instead''
+ 
   . `list<KeySlice> get_range_slice(keyspace, column_parent, predicate, 
start_key, finish_key, row_count=100, consistency_level)`
  
- Replaces get_key_range. Returns a list of slices, sorted by row key, starting 
with start, ending with finish (both inclusive) and at most count long. The 
empty string ("") can be used as a sentinel value to get the first/last 
existing key (or first/last column in the column predicate parameter). Unlike 
get_key_range, this applies the given predicate to all keys in the range, not 
just those with undeleted matching data.  This method is only allowed when 
using an order-preserving partitioner.
+ Replaces `get_key_range`. Returns a list of slices, sorted by row key, 
starting with `start_key`, ending with `finish_key` (both inclusive) and 
containing at most `row_count` entries. The empty string ("") can be used as a 
sentinel value to get the first/last existing key (or first/last column in the 
column predicate parameter). Unlike `get_key_range`, this applies the given 
predicate to all keys in the range, not just those with undeleted matching 
data.  This method is only allowed when using an order-preserving partitioner.
+ 
+ === get_range_slices ===
+ ''Requires Cassandra 0.6''
+ 
+  . `list<KeySlice> get_range_slices(keyspace, column_parent, predicate, 
range, consistency_level)`
+ 
+ Replaces `get_range_slice`. Returns a list of slices for the keys within the 
specified `KeyRange`. Unlike `get_key_range`, this applies the given predicate 
to all keys in the range, not just those with undeleted matching data.  This 
method is only allowed when using an order-preserving partitioner.
  
  === get_key_range ===
+ ''Deprecated in 0.5 - use `get_range_slice` instead''
+ 
+ ''Removed in 0.6 - use `get_range_slices` instead''
+ 
   . `list<string> get_key_range(keyspace, column_family, start, finish, 
count=100, consistency_level)`
  
  Returns a list of keys starting with `start`, ending with `finish` (both 
inclusive), and at most `count` long.  The empty string ("") can be used as a 
sentinel value to get the first/last existing key.  (The semantics are similar 
to the corresponding components of `SliceRange`.)  This method is only allowed 
when using an order-preserving partitioner.
- 
- ''Note'': `get_key_range`'s design is kind of fundamentally broken, so we're 
deprecating it in favor of `get_range_slice` starting in 0.5. `get_range_slice` 
should be used instead.  In trunk (0.6) this method has been removed entirely.
  
  === insert ===
   . `insert(keyspace, key, column_path, value, timestamp, consistency_level)`
@@ -136, +210 @@

  Insert a `Column` consisting of (`column_path.column`, `value`, `timestamp`) 
at the given `column_path.column_family` and optional 
`column_path.super_column`.  Note that `column_path.column` is here required, 
since a !SuperColumn cannot directly contain binary values -- it can only 
contain sub-Columns.
  
  === batch_insert ===
+ ''Deprecated in 0.6 - use `batch_mutate` instead''
+ 
   . `batch_insert(keyspace, key, batch_mutation, consistency_level)`
  
  Insert Columns or SuperColumns across different Column Families for the same 
row key. `batch_mutation` is a `map<string, list<ColumnOrSuperColumn>>` -- a 
map which pairs column family names with the relevant `ColumnOrSuperColumn` 
objects to insert.
+ 
+ === batch_mutate ===
+ ''Requires Cassandra 0.6''
+ 
+  . `batch_mutate(keyspace, mutation_map, consistency_level)`
+ 
+ Executes the specified mutations on the keyspace. `mutation_map` is a 
`map<string, map<string, list<Mutation>>>`: the outer map maps each key to an 
inner map, which in turn maps each column family name to the list of 
`Mutation`s to apply to it. It can be read as 
`map<key : string, map<column_family : string, list<Mutation>>>`.
+ 
+ A `Mutation` specifies either columns to insert or columns to delete. See 
`Mutation` and `Deletion` above for more details.
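
The nesting of `mutation_map` is easier to see when it is built step by step. The sketch below uses plain dataclasses rather than Thrift-generated classes and stops short of calling a client. `add_mutation` is a hypothetical helper, a bare `Column` stands in for `ColumnOrSuperColumn` to keep the example short, and the "exactly one attribute" check is an assumption layered on the mutual-exclusivity rule.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Column:
    name: bytes
    value: bytes
    timestamp: int


@dataclass
class Deletion:
    timestamp: int
    super_column: Optional[bytes] = None
    predicate: Optional[object] = None  # a SlicePredicate in the real API


@dataclass
class Mutation:
    column_or_supercolumn: Optional[Column] = None  # simplified stand-in
    deletion: Optional[Deletion] = None

    def __post_init__(self) -> None:
        # Hypothetical guard: require exactly one of the two attributes.
        if (self.column_or_supercolumn is None) == (self.deletion is None):
            raise ValueError("set exactly one of column_or_supercolumn / deletion")


# map<key : string, map<column_family : string, list<Mutation>>>
MutationMap = Dict[str, Dict[str, List[Mutation]]]


def add_mutation(mutation_map: MutationMap, key: str,
                 column_family: str, mutation: Mutation) -> None:
    """Append a mutation under key -> column family, creating levels as needed."""
    mutation_map.setdefault(key, {}).setdefault(column_family, []).append(mutation)


mutation_map: MutationMap = {}
add_mutation(mutation_map, "alice", "Users",
             Mutation(column_or_supercolumn=Column(b"email", b"a@example.org", 1)))
add_mutation(mutation_map, "alice", "Users", Mutation(deletion=Deletion(timestamp=2)))
add_mutation(mutation_map, "bob", "Users", Mutation(deletion=Deletion(timestamp=2)))
print({key: {cf: len(ms) for cf, ms in cfs.items()}
       for key, cfs in mutation_map.items()})
```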
  
  === remove ===
   . `remove(keyspace, key, column_path, timestamp, consistency_level)`
