There are two functions in the 0.7 API (http://wiki.apache.org/cassandra/API) that count the columns in a row: get_count() and multiget_count() (not listed on the wiki yet). Both take a SlicePredicate, whose slice range may have an empty start and finish to cover every column in the row.
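To make the column-count calls concrete, here is a rough sketch in Python. The SliceRange, SlicePredicate and ColumnParent classes below are only stand-ins that borrow the Thrift struct names, and FakeClient is a hypothetical in-memory substitute for a real connection, so treat the whole thing as an illustration rather than working client code. As far as I know the count field of the slice range also caps the number returned, so the sketch models that too.

```python
from dataclasses import dataclass
from typing import Dict, Optional


# Stand-ins that borrow the Thrift struct names (subset of fields only).
# A real client would import the generated classes instead.
@dataclass
class SliceRange:
    start: bytes = b""    # empty = begin at the first column
    finish: bytes = b""   # empty = run to the last column
    count: int = 100      # upper bound on the columns considered


@dataclass
class SlicePredicate:
    slice_range: Optional[SliceRange] = None


@dataclass
class ColumnParent:
    column_family: str


class FakeClient:
    """Hypothetical in-memory stand-in for a Thrift connection."""

    def __init__(self, data: Dict[str, Dict[bytes, Dict[bytes, bytes]]]):
        # {column_family: {row_key: {column_name: value}}}
        self.data = data

    def get_count(self, key, column_parent, predicate, consistency_level=None):
        rng = predicate.slice_range
        names = sorted(self.data[column_parent.column_family].get(key, {}))
        matched = [
            n for n in names
            if (not rng.start or n >= rng.start)
            and (not rng.finish or n <= rng.finish)
        ]
        # Assumption: the range's count limits the answer.
        return min(len(matched), rng.count)

    def multiget_count(self, keys, column_parent, predicate, consistency_level=None):
        # One count per requested row key.
        return {
            k: self.get_count(k, column_parent, predicate, consistency_level)
            for k in keys
        }


client = FakeClient(
    {"Users": {b"alice": {b"age": b"30", b"email": b"a@x", b"name": b"A"}}}
)
parent = ColumnParent("Users")

# Empty start and finish select every column; raise count above the default.
everything = SlicePredicate(SliceRange(start=b"", finish=b"", count=1000))

n = client.get_count(b"alice", parent, everything)
counts = client.multiget_count([b"alice", b"bob"], parent, everything)
capped = client.get_count(b"alice", parent, SlicePredicate(SliceRange(count=2)))
```

If the cap applies the way I think it does, a wide row needs a count larger than the number of columns you expect, otherwise the answer silently tops out at the limit.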
The only way to count rows is to use get_range_slices(), which returns the columns you request. To reduce the bandwidth of the query, ask it to return a single column.

However, the result of these functions is not guaranteed to be correct. Cassandra does not lock its internal structures, so while it is busy processing your request other connections may be adding columns and rows, and by the time the count gets back to you it may already be out of date. The same reasoning explains why there are no aggregate functions.

Do you need to count the rows as a one-off, or is it part of your application design?

Hope that helps
Aaron

On 29 Jan 2011, at 05:02, Victor Kabdebon wrote:

> Buddasystem is right.
> A count returns columns to the client which count it. My advice : do not
> count big columns / supercolumns. People in the dev team are trying to
> develop distributed counters but I don't know the state of this research.
>
> Best regards,
> Victor Kabdebon
> http://www.voxnucleus.fr
>
> 2011/1/28 buddhasystem <potek...@bnl.gov>
>
> > As far as I know, there are no aggregate operations built into Cassandra,
> > which means you'll have to retrieve all of the data to count it in the
> > client. I had a thread on this topic 2 weeks ago. It's pretty bad.
> >
> > --
> > View this message in context:
> > http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Cassandra-and-count-tp5969159p5970315.html
> > Sent from the cassandra-u...@incubator.apache.org mailing list archive at
> > Nabble.com.
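For reference, the row-counting approach from the reply at the top of this message (page through a range query, asking for one column per row) might look roughly like the sketch below. fetch_keys is a hypothetical callable standing in for a range-slice call that returns row keys in the cluster's iteration order with an inclusive start key, and make_fake_fetch is an in-memory substitute used only so the example runs; neither is part of any Cassandra client. The stand-in orders rows by key, which a real cluster may not do.

```python
from typing import Callable, Iterable, List


def count_rows(fetch_keys: Callable[[bytes, int], List[bytes]],
               page_size: int = 1000) -> int:
    """Count rows by paging through a range query, one page at a time."""
    if page_size < 2:
        raise ValueError("page_size must be at least 2 to make progress")
    total = 0
    start = b""  # empty start key = begin at the first row
    while True:
        page = fetch_keys(start, page_size)
        # The start key is inclusive, so every page after the first
        # repeats the last key that was already counted.
        fresh = page[1:] if (start and page and page[0] == start) else page
        total += len(fresh)
        # A short page, or a page with nothing new, means we are done.
        if len(page) < page_size or not fresh:
            return total
        start = page[-1]


def make_fake_fetch(all_keys: Iterable[bytes]) -> Callable[[bytes, int], List[bytes]]:
    """Hypothetical in-memory stand-in for the range query."""
    ordered = sorted(all_keys)

    def fetch(start: bytes, count: int) -> List[bytes]:
        return [k for k in ordered if k >= start][:count]

    return fetch


fetch = make_fake_fetch([b"k%03d" % i for i in range(25)])
total_rows = count_rows(fetch, page_size=10)
empty_rows = count_rows(make_fake_fetch([]), page_size=10)
```

Because nothing is locked between pages, rows written or removed while the loop runs can be missed or counted anyway, so the total is only an approximation on a live cluster.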