I just found an estimateKeys() method of the ColumnFamilyStoreMBean.
Is there any indication about how it works?
Sheng
2011/3/28 Sheng Chen chensheng2...@gmail.com
Hi all,
I want to know how many records I am holding in Cassandra, just like
count(*) in sql.
What can I do? Thank you.
Sheng
Would the Cassandra team consider adding an alias for the nodetool
repair command?
That thought has crossed my mind lately too; particularly in one of
the recent threads.
The problem seems analogous to 'fsck', and the distinction between
fully expected by-design behavior needing fsck/repair is
Hi Aaron,
Thank you for your reply, I appreciate the suggestions you made.
Yesterday I managed to get everything (our main read) in one CF, with the
use of a structure in a value like you suggested.
Designing a new data model is different from what I'm used to, but if you
keep in mind that you
The CassandraBulkLoader example is written to use Super Columns, so that seems odd.
Do you have the rest of the error stack?
Aaron
On 31 Mar 2011, at 04:54, George Ciubotaru wrote:
Hello,
I’m using CassandraBulkLoader.java
AFAIK Cassandra will just pick the directory with the most space.
Also AFAIK using multiple directories should only be considered a safety valve
to fix problems such as the one you describe; see
http://www.mail-archive.com/user@cassandra.apache.org/msg07874.html
Aaron
On 31 Mar 2011, at
--
Darío Bravo
Ok, we'll do it for sure!
Thanks,
Roberto
On 31 March 2011 14:56, aaron morton aa...@thelastpickle.com wrote:
Next time it happens take a note of the snapshot folder, different
processes name the folder differently. It may help track down what created
the snapshot.
Cheers
Aaron
On 31
From my understanding of replica copies, cassandra picks which nodes to
replicate the data based on replication strategy, and those same replica
partner nodes are always used according to token ring distribution.
If you change the replication strategy, does cassandra pick new nodes to
On Thu, Mar 31, 2011 at 3:52 AM, T Akhayo t.akh...@gmail.com wrote:
Hi Aaron,
Thank you for your reply, I appreciate the suggestions you made.
Yesterday I managed to get everything (our main read) in one CF, with the
use of a structure in a value like you suggested.
Designing a new data
silly question, would every cassandra installation need to have manual repairs
done on it?
It would seem cassandra's read repair and regular compaction would take care
of keeping the data clean.
Am I missing something?
On Mar 30, 2011, at 7:46 PM, Peter Schuller wrote:
I just wanted to
Thanks Edward,
Anyone able to provide some answers for the other questions?
On 03/26/2011 07:25 AM, Edward Capriolo wrote:
On Fri, Mar 25, 2011 at 2:11 PM, ian douglas i...@armorgames.com wrote:
On 03/25/2011 10:12 AM, Jonathan Ellis wrote:
On Fri, Mar 25, 2011 at 11:59 AM, ian
silly question, would every cassandra installation need to have manual
repairs done on it?
It would seem cassandra's read repair and regular compaction would take
care of keeping the data clean.
Am I missing something?
See my previous posts in this thread for the distinct reasons to run
If I am not wrong, nodetool repair needs to be run on all the nodes in a
staggered manner. It is required to take care of tombstones. Please correct
me, team, if I am wrong :)
See Distributed Deletes:
http://wiki.apache.org/cassandra/Operations
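
For anyone who wants to see the tombstone issue in miniature, here is a toy sketch in Python. It is not Cassandra code: the replica dictionaries, GC_GRACE, compact() and read_repair() are all hypothetical stand-ins, and read_repair() is also used in place of a real nodetool repair. It only illustrates why a replica that missed a delete can bring the value back once the other replicas have dropped their tombstones.

```python
GC_GRACE = 10  # hypothetical grace period, in arbitrary time units


def compact(replica, now):
    """Drop tombstones older than GC_GRACE, as compaction eventually would."""
    for key, (value, ts, deleted) in list(replica.items()):
        if deleted and now - ts > GC_GRACE:
            del replica[key]


def read_repair(replicas, key):
    """Pick the newest version held by any replica and copy it everywhere."""
    versions = [r[key] for r in replicas if key in r]
    if not versions:
        return None
    newest = max(versions, key=lambda v: v[1])  # highest timestamp wins
    for r in replicas:
        r[key] = newest
    return None if newest[2] else newest[0]


def simulate(repair_before_gc):
    a, b, c = {}, {}, {}
    for r in (a, b, c):
        r["k"] = ("v", 1, False)      # write at t=1 reaches all replicas
    for r in (a, b):
        r["k"] = (None, 2, True)      # delete at t=2 misses replica c
    if repair_before_gc:
        read_repair([a, b, c], "k")   # stands in for a repair run
    for r in (a, b, c):
        compact(r, now=20)            # tombstones are now past GC_GRACE
    return read_repair([a, b, c], "k")
```

With the repair step, every replica holds the tombstone before it is collected and the key stays deleted. Without it, replica c still holds the old value after the tombstones are gone, and the next read hands it back.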
In the pycassa.pool.ConnectionPool class, I can specify all the nodes
in server_list parameter.
But over time, when nodes get decommissioned and new nodes with new IPs
get added, how can the server_list parameter be refreshed?
Do I have to modify it manually, or is there a way to update the list
I'm rebalancing a cluster of 2 nodes at this point. Netstats on the source
node reports progress of the stream, whereas on the receiving end netstats
states that progress = 0. Did anyone see that?
Do I need both nodes listed as seeds in cassandra.yaml?
TIA/
ConnectionPool has a set_server_list() method that you can use to update the
list of servers. (It appears this method did not make it into the docs;
I'll make sure it gets in there.) Pycassa doesn't make any attempt to
update the server list automatically right now.
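
A rough sketch of how a periodic refresh could look, in Python. Only set_server_list() comes from the message above; FakePool, refresh_servers() and the discover callback are hypothetical names made up for illustration, and FakePool merely stands in for pycassa.pool.ConnectionPool so the snippet runs without a cluster.

```python
class FakePool:
    """Hypothetical stand-in for pycassa.pool.ConnectionPool."""

    def __init__(self, server_list):
        self.server_list = list(server_list)

    def set_server_list(self, server_list):
        self.server_list = list(server_list)


def refresh_servers(pool, discover):
    """Ask `discover` for the current nodes and hand them to the pool.

    `discover` is any callable returning 'host:port' strings; where that
    list comes from (config file, ring description, ...) is up to you.
    """
    servers = sorted(set(discover()))
    if servers:  # keep the old list if discovery returned nothing
        pool.set_server_list(servers)
    return servers


pool = FakePool(["10.0.0.1:9160"])
refresh_servers(pool, lambda: ["10.0.0.2:9160", "10.0.0.3:9160"])
```

Calling refresh_servers() from a timer or before handing out connections would keep the list current; an empty discovery result leaves the existing list untouched.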
By the way, there is a
Thanks Aaron,
I have already checked out Twissandra. I was mainly looking to see how
Secondary Indexes can be used and how they affect Data Modeling. There doesn't
seem to be a lot of coverage on them.
In addition, I couldn't tell what kind of Partitioner Twissandra is using and
why.
cheers,
Hi,
I am doing reporting on data that is not very important, and I am okay with
some of it being lost.
I would like to minimize the time taken for the actual data insert.
I am using Cassandra 0.7.4.
If it matters, I am using Hector to connect to Cassandra.
cZERO consistency level in Thrift Generated code
Only the following Levels are provided, I am wondering if the ZERO
consistency level is removed in Cassandra 0.7.X ?
Yes, it's gone.
If so, could you please explain why it was removed and what the best
option is, given my context?
https://issues.apache.org/jira/browse/CASSANDRA-1607
On Thu, Mar 31, 2011 at 2:53 PM, Peter Schuller
peter.schul...@infidyne.com wrote:
Only the following Levels are provided, I am wondering if the ZERO
consistency level is removed in Cassandra 0.7.X ?
Yes, it's gone.
If so, Could you please explain why was it removed and what is the best
Peter -
Thanks a lot for elaborating on repairs. Still, it's a bit fuzzy to me why
it is so important to run a repair before the GCGraceSeconds kicks in. Does
this mean a delete does not get replicated? In other words when I delete
something on a node, doesn't cassandra set tombstones
I just configured a cluster of two nodes -- do these token values make sense?
The reason I'm asking is that so far I don't see any load balancing
happening, judging from performance.
Address Status State Load Owns Token
Thanks a lot for elaborating on repairs. Still, it's a bit fuzzy to me why
it is so important to run a repair before the GCGraceSeconds kicks in. Does
this mean a delete does not get replicated? In other words when I delete
something on a node, doesn't cassandra set tombstones on
I experience something that looks exactly like
https://issues.apache.org/jira/browse/CASSANDRA-1178
On cassandra 0.7.3 when using index slice queries (lots of them)
Crashing multiple nodes and rendering the cluster useless. But I have no clue
where to look if index queries still leak fd
Does
A script that I have says the following:
$ python ctokens.py
How many nodes are in your cluster? 2
node 0: 0
node 1: 85070591730234615865843651857942052864
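
The ctokens.py source is not shown in this thread, so the following is only a guess at the arithmetic behind it: a minimal sketch that spaces tokens evenly over a ring of size 2**127, which is an assumption based on the RandomPartitioner token range. The function name initial_tokens is made up here.

```python
def initial_tokens(node_count):
    """Evenly spaced tokens for node_count nodes, starting at zero.

    Assumes a RandomPartitioner-style ring of size 2**127.
    """
    ring_size = 2 ** 127
    return [i * ring_size // node_count for i in range(node_count)]


for i, token in enumerate(initial_tokens(2)):
    print("node %d: %d" % (i, token))
```

For two nodes this reproduces the values quoted above.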
The first token should be zero, for the reasons discussed here:
Peter, I want to join everyone else thanking you for helping out so much
with this thread, and especially for pointing out the problems with the DS
docs on this topic. We have some corrections posted today, and will keep
looking to improve the information.
On Thu, Mar 31, 2011 at 3:11 PM, Peter
It does not have a yaml file, so I am assuming it's the default Random
Partitioner.
Aaron
On 1 Apr 2011, at 04:51, Drew Kutcharian wrote:
Thanks Aaron,
I have already checked out Twissandra. I was mainly looking to see how
Secondary Indexes can be used and how they affect Data Modeling.
I've been looking at replacing our PostgreSQL backend for RTG (an SNMP
based polling and graphing solution for network traffic/ports) with
something using Cassandra in order to solve our scalability and
redundancy requirements. Based on a lot of what I've read, Cassandra
is an ideal data store for
There is no reason to change the RF on the system keyspace, it should probably
not be allowed.
The system keyspace uses a LocalPartitioner and its data is not replicated
through the same mechanism as a user keyspace.
Aaron
On 31 Mar 2011, at 10:22, Jeremy Stribling wrote:
On 03/30/2011
I know cloudkick is doing something like this, and we're developing our own
in-house method, but it would be nice for there to be a generically-available
package that would do this. Lately I've been wishing that someone would take
graphite (written in python) and put the frontend on top of
Where are the connection refused messages? Are they client side? Can you
connect to the cluster with nodetool and run the ring command?
Aaron
On 31 Mar 2011, at 11:44, Anurag Gujral wrote:
I restarted the cassandra node with more disks. When I try to connect to
cassandra I get connection
We have a solution for time series data on cassandra at Twitter that
we'd like to open source, but it requires 0.8/trunk so we're not going
to release it until that's stable.
See
http://www.slideshare.net/kevinweil/rainbird-realtime-analytics-at-twitter-strata-2011
-ryan
On Thu, Mar 31, 2011
It iterates over all the SSTables on disk and estimates the number of keys by
looking at how big the index is. It does not count the actual keys.
aaron
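
To make the idea of estimating from index size concrete, here is a small Python sketch. It is not the Cassandra implementation: it assumes each SSTable keeps a sampled index with roughly one entry per INDEX_INTERVAL keys, and both the constant's value and the function name estimate_keys are assumptions for illustration.

```python
INDEX_INTERVAL = 128  # assumed: one sampled index entry per 128 keys


def estimate_keys(sampled_index_sizes, index_interval=INDEX_INTERVAL):
    """Estimate the total key count from per-SSTable sampled index sizes.

    No keys are read or counted; each sample is simply taken to represent
    about `index_interval` keys.
    """
    return sum(samples * index_interval for samples in sampled_index_sizes)


# Two SSTables with 10 and 5 sampled index entries respectively.
print(estimate_keys([10, 5]))
```

Because the same key can live in several SSTables until compaction merges them, an estimate built this way can overcount, which fits the point that it is not an actual count.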
On 31 Mar 2011, at 17:46, Sheng Chen wrote:
I just found an estimateKeys() method of the ColumnFamilyStoreMBean.
Is there any indication
Just finished looking at the slides. It looks awesome!
On 3/31/11 4:19 PM, Ryan King r...@twitter.com wrote:
We have a solution for time series data on cassandra at Twitter that
we'd like to open source, but it requires 0.8/trunk so we're not going
to release it until that's stable.
See
On Thu, Mar 31, 2011 at 6:15 PM, Eric Gilmore e...@datastax.com wrote:
A script that I have says the following:
$ python ctokens.py
How many nodes are in your cluster? 2
node 0: 0
node 1: 85070591730234615865843651857942052864
The first token should be zero, for the reasons discussed here:
On Thu, Mar 31, 2011 at 4:19 PM, Ryan King r...@twitter.com wrote:
We have a solution for time series data on cassandra at Twitter that
we'd like to open source, but it requires 0.8/trunk so we're not going
to release it until that's stable.
See
Cassandra 0.7.4:
nodetool -h `hostname` cfhistograms system schema
Exception in thread main java.lang.reflect.UndeclaredThrowableException
at $Proxy5.getRecentReadLatencyHistogramMicros(Unknown Source)
at org.apache.cassandra.tools.NodeCmd.printCfHistograms(NodeCmd.java:452)
It looks like if I use system schema it fails. Is it because of
LocalPartitioner?
I ran with other keyspace and got following output.
Offset  SSTables  Write Latency  Read Latency  Row Size  Column Count
1       0         0              0             0         0
2       0         0              0             0         0
179     0         0              0             320       320
Can someone please help me understand the output in
ant on my command line had completed without error.
Next I tried to build cassandra 0.7.4 in eclipse, and had luck.
So I'll explore cassandra code with eclipse, rather than IDEA.
maki
2011/3/31 Maki Watanabe watanabe.m...@gmail.com:
Not yet. I'll try.
maki
2011/3/31 Tommy Tynjä
Gregori,
Congrats on writing the fud-liest post of the month award. Firstly if
you don't like updates give up on computers and software. Especially
give up on anything that has to do with nosql because it is fast
evolving.
If you think you have a problem with the cassandra api, then what you
Hi All,
I am trying out a very simple scenario and I don't seem to get it working. It
would be great if I am pointed to some things here.
I have set up a 2 node cluster, cassandra.yaml being the default and same for
each other than the seed: being each other and I have set the Thrift RPC
On Thu, Mar 31, 2011 at 8:25 PM, mcasandra mohitanch...@gmail.com wrote:
It looks like if I use system schema it fails. Is it because of
LocalPartitioner?
I ran with other keyspace and got following output.
Offset SSTables Write Latency Read Latency Row Size Column Count
1 0 0 0 0 0
2 0 0
Yup, I screwed up the token setting, my bad.
Now, I moved the tokens. I still observe that read latency deteriorated with
3 machines vs original one. Replication factor is 1, Cassandra version 0.7.2
(didn't have time to upgrade as I need results by this weekend).
Key and row caching was disabled
I've got a single node of cassandra 0.7.4, and I used the java stress tool
to insert about 100 million records.
The inserts took about 6 hours (45k inserts/sec) but the following minor
compactions last for 2 days and the pending compaction jobs are still
increasing.
From jconsole I can read the
What's going on in the logs? CPU? i/o?
On Thu, Mar 31, 2011 at 4:20 AM, Or Yanay o...@peer39.com wrote:
Hi all,
My production cluster reads got stuck.
The ring gives:
Address Status State Load Owns Token
Index queries (ColumnFamilyStore.scan) don't do any low-level i/o
themselves, they go through CFS.getColumnFamily, which is what normal
row fetches also go through. So if there is a leak there it's
unlikely to be specific to indexes.
What is your open-file limit (remember that sockets count