[Cassandra Wiki] Update of "API" by JonathanEllis

Apache Wiki Mon, 31 Aug 2009 11:19:25 -0700

The following page has been changed by JonathanEllis:
http://wiki.apache.org/cassandra/API

The comment on the change is:
fix some inaccuracies

------------------------------------------------------------------------------
   Keyspace:: Contains multiple Column Families.
   CF:: !ColumnFamily.
   SCF:: !ColumnFamily of type "Super".
-  Key:: A unique value that identifies a row in a CF. Keys must be unique 
inside a given CF.
+  Key:: A unique value that identifies a row in a CF.
  
  == Exceptions ==
   NotFoundException:: A specific column was requested that does not exist.
   InvalidRequestException:: Invalid request could mean keyspace or column 
family does not exist, required parameters are missing, or a parameter is 
malformed. `why` contains an associated error message.
   UnavailableException:: Not all the replicas required could be created and/or 
read.
-  TApplicationException:: Internal server error.
+  TApplicationException:: Internal server error or invalid Thrift method 
(possible if you are using an older version of a Thrift client with a newer 
build of the Cassandra server).
  
  == Structures ==
  
  === ConsistencyLevel ===
  
- The `ConsistencyLevel` is an `enum` that controls both read and write 
behavior based on `<ReplicationFactor>` in your `storage-conf.xml`. The 
different consistency levels have different meanings, depending on if you're 
doing a write or read operation.
+ The `ConsistencyLevel` is an `enum` that controls both read and write 
behavior based on `<ReplicationFactor>` in your `storage-conf.xml`. The 
different consistency levels have different meanings, depending on whether 
you're doing a write or read operation.  Note that if `W` + `R` > 
`ReplicationFactor`, where W is the number of nodes to block for on write, and 
R the number to block for on reads, you will have strongly consistent behavior; 
that is, readers will always see the most recent write.  Of the combinations 
that satisfy this, the most interesting is to do `QUORUM` reads and writes, 
which gives you consistency while still allowing availability as long as fewer 
than half of the `ReplicationFactor` replicas have failed.  Of course if 
latency is more important than consistency then you can use lower values for 
either or both.
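
The `W` + `R` > `ReplicationFactor` rule can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the Cassandra API; the function names `quorum` and `is_strongly_consistent` are hypothetical and exist only for illustration.

```python
def quorum(replication_factor: int) -> int:
    """Replicas a QUORUM operation blocks for: ReplicationFactor / 2 + 1,
    using integer division."""
    return replication_factor // 2 + 1


def is_strongly_consistent(w: int, r: int, replication_factor: int) -> bool:
    """True when W + R > ReplicationFactor, i.e. every read set overlaps
    every write set in at least one replica."""
    return w + r > replication_factor


if __name__ == "__main__":
    rf = 3
    q = quorum(rf)
    # QUORUM for both reads and writes satisfies the rule.
    print(q, is_strongly_consistent(q, q, rf))
    # ONE for both reads and writes does not.
    print(is_strongly_consistent(1, 1, rf))
```

With `ReplicationFactor` 3, a quorum is 2 nodes, so one replica can be down and `QUORUM` reads and writes still succeed.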
  
  ==== Write ====
  
  ||'''Level'''||'''Behavior'''||
- ||`ZERO`||Ensure nothing. A write happens async in background||
+ ||`ZERO`||Ensure nothing. A write happens asynchronously in background||
  ||`ONE`||Ensure that the write has been written to at least 1 node's commit 
log and memory table before responding to the client.||
  ||`QUORUM`||Ensure that the write has been written to `<ReplicationFactor> / 
2 + 1` nodes before responding to the client.||
  ||`ALL`||Ensure that the write is written to `<ReplicationFactor>` nodes 
before responding to the client.||
@@ -37, +37 @@

  ==== Read ====
  
  ||'''Level'''||'''Behavior'''||
- ||`ZERO`||Not supported.||
- ||`ONE`||Will return the record returned by the first node to respond. A 
background thread is always fired off to fix any consistency issues when 
`ConsistencyLevel.ONE` is used. This means subsequent calls will have correct 
data while the initial read may not.||
- ||`QUORUM`||Will query all storage nodes and return the record that is 
prevailing in consistency. For instance, if `foo = 1` on nodes A and B, while 
`foo = 2` on node C then the prevailing consistency is `foo = 1`. A background 
thread will be fired off to fix consistency issues.||
- ||`ALL`||Not supported.||
+ ||`ZERO`||Not supported, because it doesn't make sense.||
+ ||`ONE`||Will return the record returned by the first node to respond. A 
consistency check is always done in a background thread to fix any consistency 
issues when `ConsistencyLevel.ONE` is used. This means subsequent calls will 
have correct data even if the initial read gets an older value.  (This is 
called `read repair`.)||
+ ||`QUORUM`||Will query all storage nodes and return the record with the most 
recent timestamp once at least a majority of replicas have reported.  Again, 
the remaining replicas will be checked in the background.||
+ ||`ALL`||Not yet supported, but we plan to eventually.||
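
The `QUORUM` read rule above (wait for a majority, then return the value with the most recent timestamp) can be sketched as follows. This is an illustration only, not Cassandra's implementation; the `Reply` type and `resolve_quorum_read` function are hypothetical names.

```python
from typing import List, NamedTuple


class Reply(NamedTuple):
    """One replica's answer to a read (hypothetical type for this sketch)."""
    node: str
    value: bytes
    timestamp: int


def resolve_quorum_read(replies: List[Reply], replication_factor: int) -> Reply:
    """Return the reply with the most recent timestamp, provided that at
    least a majority of replicas have reported."""
    needed = replication_factor // 2 + 1
    if len(replies) < needed:
        raise RuntimeError(
            "only %d of %d required replicas replied" % (len(replies), needed)
        )
    # The newest timestamp wins, regardless of how many replicas hold it.
    return max(replies, key=lambda reply: reply.timestamp)


if __name__ == "__main__":
    replies = [Reply("A", b"1", 10), Reply("C", b"2", 20)]
    print(resolve_quorum_read(replies, replication_factor=3).value)
```

Note that the winner is chosen by timestamp, not by counting how many replicas agree on a value.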
  
- === ColumnPath ===
+ === ColumnPath and ColumnParent ===
  
- The `ColumnPath` is the path to a single column in Cassandra. It might make 
sense to think of `ColumnPath` (and the soon-to-be-discussed `ColumnParent`) in 
terms of a directory structure. 
+ The `ColumnPath` is the path to a single column in Cassandra. It might make 
sense to think of `ColumnPath` and `ColumnParent` in terms of a directory 
structure. 
  
  
||'''Attribute'''||'''Type'''||'''Default'''||'''Required'''||'''Description'''||
  ||`column_family`||`string`||n/a||Y||The name of the CF of the column being 
looked up.||
  ||`super_column`||`binary`||n/a||N||The super column name.||
  ||`column`||`binary`||n/a||N||The column name.||
  
+ `ColumnPath` is used to look up a single column.  `ColumnParent` is used 
when selecting groups of columns from the same !ColumnFamily.  In directory 
structure terms, imagine `ColumnParent` as `ColumnPath + '/../'`. 
- When looking up a key 
- 
- === ColumnParent ===
- 
- Imagine `ColumnParent` as `ColumnPath + '/../'`. 
- 
- === SliceRange ===
- 
- A slice range is a structure that stores basic range, ordering and limit 
information for a query that will return multiple keys. It could be thought of 
as Cassandra's version of `LIMIT` and `ORDER BY`.
- 
- 
||'''Attribute'''||'''Type'''||'''Default'''||'''Required'''||'''Description'''||
- ||`start`||`binary`||n/a||Y||The column name to start the slice with. This 
attribute is not required, though there is no default value, and can be safely 
set to `''`. Can be numerical or characters.||
- ||`finish`||`binary`||n/a||Y||The column name to stop the slice at. This 
attribute is not required, though there is no default value, and can be safely 
set to `''`. Can be an integer or string.||
- ||`reversed`||`bool`||`false`||N||Whether the results should be ordered in 
reversed order. Similar to `ORDER BY blah DESC` in MySQL.||
- ||`count`||`integer`||`100`||N||How many keys to return. Similar to `LIMIT 
100` in MySQL.||
  
  === SlicePredicate ===
  
- A `SlicePredicate` is similar to a 
[http://en.wikipedia.org/wiki/Predicate_(mathematical_logic) mathematic 
predicate], which is described as:
+ A `SlicePredicate` is similar to a 
[http://en.wikipedia.org/wiki/Predicate_(mathematical_logic) mathematical 
predicate], which is described as "a property that the elements of a set have 
in common."
  
-     Sometimes it is inconvenient or impossible to describe a set by listing 
all of its elements. Another useful way to define a set is by specifying a 
property that the elements of the set have in common.
- 
- `SlicePredicate`'s in Cassandra are described with either a list of 
`column_names`, a `SliceRange`, or both.
+ A `SlicePredicate` in Cassandra is described with either a list of 
`column_names` or a `SliceRange`.
  
  
||'''Attribute'''||'''Type'''||'''Default'''||'''Required'''||'''Description'''||
  ||`column_names`||`list`||n/a||N||A list of column names to retrieve. This 
can be used similar to Memcached's "multi-get" feature to fetch N known column 
names. For instance, if you know you wish to fetch columns 'Joe', 'Jack', and 
'Jim' you can pass those column names as a list to fetch all three at once.||
  ||`slice_range`||`SliceRange`||n/a||N||A `SliceRange` describing how to 
range, order, and/or limit the slice.||
+ 
+ If `column_names` is specified, `slice_range` is ignored.
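
The precedence rule can be sketched against an in-memory row. This is a minimal illustration, not the Thrift API: `select_columns` is a hypothetical helper, the slice range is modelled as a `(start, finish, count)` tuple, column names are assumed to be compared bytewise, and both ends of the range are assumed to be inclusive.

```python
from typing import Dict, List, Optional, Tuple

Column = Tuple[bytes, bytes]  # (name, value)


def select_columns(
    row: Dict[bytes, bytes],
    column_names: Optional[List[bytes]] = None,
    slice_range: Optional[Tuple[bytes, bytes, int]] = None,
) -> List[Column]:
    """Apply a predicate to one row; column_names wins over slice_range."""
    if column_names is not None:
        # slice_range is ignored entirely when column_names is given.
        return [(name, row[name]) for name in column_names if name in row]
    if slice_range is None:
        raise ValueError("either column_names or slice_range is needed")
    start, finish, count = slice_range
    # An empty start or finish means the range is unbounded on that side.
    names = sorted(
        name for name in row
        if name >= start and (finish == b"" or name <= finish)
    )
    return [(name, row[name]) for name in names[:count]]


if __name__ == "__main__":
    row = {b"Jack": b"1", b"Jim": b"2", b"Joe": b"3", b"Zed": b"4"}
    print(select_columns(row, column_names=[b"Joe", b"Jim"],
                         slice_range=(b"", b"", 1)))
    print(select_columns(row, slice_range=(b"Jim", b"", 2)))
```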
+ 
+ === SliceRange ===
+ 
+ A slice range is a structure that stores basic range, ordering and limit 
information for a query that will return multiple columns. It could be thought 
of as Cassandra's version of `LIMIT` and `ORDER BY`.
+ 
+ 
||'''Attribute'''||'''Type'''||'''Default'''||'''Required'''||'''Description'''||
+ ||`start`||`binary`||n/a||Y||The column name to start the slice with. This 
attribute is not required, though there is no default value, and can be safely 
set to `''`, i.e., an empty byte array, to start with the first column name.  
Otherwise, it must be a valid value under the rules of the Comparator defined 
for the given `ColumnFamily`.||
+ ||`finish`||`binary`||n/a||Y||The column name to stop the slice at. This 
attribute is not required, though there is no default value, and can be safely 
set to an empty byte array to not stop until `count` results are seen. 
Otherwise, it must also be a valid value for the `ColumnFamily` Comparator. ||
+ ||`reversed`||`bool`||`false`||N||Whether the results should be ordered in 
reversed order. Similar to `ORDER BY blah DESC` in SQL.||
+ ||`count`||`integer`||`100`||N||How many columns to return. Similar to `LIMIT 
100` in SQL. May be arbitrarily large, but Thrift will materialize the whole 
result into memory before returning it to the client, so be aware that you may 
be better served by iterating through slices, passing the last value of one 
call in as the `start` of the next, instead of making `count` arbitrarily 
large.||
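
Iterating through slices, as suggested for `count`, might look like the sketch below. It runs against an in-memory row rather than a live cluster: `get_slice` here is a hypothetical stand-in for the real call, and it assumes bytewise ordering of column names and an inclusive `start`. Because `start` is inclusive, every page after the first repeats the last column of the previous page, so the sketch drops that duplicate.

```python
from typing import Dict, Iterator, List, Tuple

Column = Tuple[bytes, bytes]  # (name, value)


def get_slice(row: Dict[bytes, bytes], start: bytes = b"",
              finish: bytes = b"", count: int = 100) -> List[Column]:
    """In-memory stand-in for a slice query (hypothetical, for illustration)."""
    names = sorted(
        name for name in row
        if name >= start and (finish == b"" or name <= finish)
    )
    return [(name, row[name]) for name in names[:count]]


def iterate_columns(row: Dict[bytes, bytes],
                    page_size: int = 100) -> Iterator[Column]:
    """Yield every column of a row, fetching at most page_size per call."""
    if page_size < 2:
        # With an inclusive start, a page of 1 would never make progress.
        raise ValueError("page_size must be at least 2")
    start = b""
    first_page = True
    while True:
        page = get_slice(row, start=start, count=page_size)
        # Skip the column already returned at the end of the previous page.
        fresh = page if first_page else page[1:]
        if not fresh:
            return
        yield from fresh
        if len(page) < page_size:
            return  # a short page means the row is exhausted
        start = page[-1][0]
        first_page = False


if __name__ == "__main__":
    row = {b"c%d" % i: b"v%d" % i for i in range(7)}
    for name, value in iterate_columns(row, page_size=3):
        print(name, value)
```

Each call holds at most `page_size` columns in memory, at the cost of one redundant column per page.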
  
  === ColumnOrSuperColumn ===
  
@@ -87, +83 @@

  ||`column`||`Column`||n/a||N||The `Column` returned by `get()` or 
`get_slice()`.||
  ||`super_column`||`SuperColumn`||n/a||N||The `SuperColumn` returned by 
`get()` or `get_slice()`.||
  
+ == Method calls ==
+ 
+ TODO
+ 
+ == Examples ==
+ 
+ Would someone public-spirited add some examples of the results you'd get with 
these methods?
+
