Duplicate result of get_indexed_slices, depending on indexClause.count
Hi All, I have been using Cassandra 0.7.2 and 0.7.4 with the Thrift API (using Java). I noticed that if I am querying a column family with indexed columns, I sometimes get a duplicate result from get_indexed_slices, depending on the number of rows in the CF and the count I set in IndexClause.count. It also depends on the order of rows in the CF. For example, consider the following CF that I call Attributes:

    create column family Attributes with comparator=UTF8Type and column_metadata=[
        {column_name: range_id, validation_class: LongType, index_type: KEYS},
        {column_name: attr_key, validation_class: UTF8Type, index_type: KEYS},
        {column_name: attr_val, validation_class: BytesType, index_type: KEYS}
    ];

And suppose I have the following rows in the CF:

    key       range_id  attr_key  attr_val
    1/@1/0    1         A         1
    1/5/0     1         B         1000
    3/@1/0    2         A         1
    3/5/0     2         B         1001
    5/@1/0    3         A         2
    5/5/0     3         B         1002
    7/@1/0    4         A         2
    7/5/0     4         B         1003

Now if I run a query with an IndexClause like this (in pseudocode): attr_key == A AND attr_val == 1, with indexClause.count = 4, then I will get the rows with the following keys from get_indexed_slices: 1/@1/0, 3/@1/0, 3/@1/0. The last key is a duplicate! This is very sensitive to the order of rows in the CF, the number of rows, and the value you set in indexClause.count. I noticed that when the number of rows in the CF is twice indexClause.count, this issue can happen depending on the order of rows in the CF. This seems to be a bug, and it occurs in both 0.7.2 and 0.7.4. Is there a solution to this problem? Many Thanks, Sam
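Until the server-side issue is resolved, one pragmatic workaround is to de-duplicate the returned rows by key on the client. A minimal sketch in Python, assuming the client library hands back (key, columns) pairs:

    def dedupe_rows(rows):
        # Keep only the first occurrence of each row key; get_indexed_slices
        # can return the same key twice under the conditions described above.
        seen = set()
        for key, columns in rows:
            if key not in seen:
                seen.add(key)
                yield key, columns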
Re: Indexes on heterogeneous rows
Does get_indexed_slices in 0.7.4 already work that way? It seems to always take the first indexed column with EQ. Or is this a new feature of the coming 0.7.5 or 0.8? -----Original Message----- From: Jonathan Ellis [mailto:jbel...@gmail.com] Sent: April 15, 2011 0:21 To: user@cassandra.apache.org Cc: David Boxenhorn; aaron morton Subject: Re: Indexes on heterogeneous rows This should work reasonably well w/ 0.7 indexes. Cassandra tracks statistics on index selectivity, so it would plan that query as an index lookup on e=5, then iterate over those results and return only rows that also have type=2. On Thu, Apr 14, 2011 at 5:33 AM, David Boxenhorn da...@taotown.com wrote: Thank you for your answer, and sorry about the sloppy terminology. I'm thinking of the scenario where there are a small number of results in the result set, but there are billions of rows in the first of your secondary indexes. That is, I want to do something like (not sure of the CQL syntax): select * where type=2 and e=5, where there are billions of rows of type 2, but some manageable number of those rows have e=5. As I understand it, secondary indexes are like column families, where each value is a column. So the billions of rows where type=2 would go into a single row of the secondary index. This sounds like a problem to me; is it? I'm assuming that the billions of rows that don't have column e at all (those rows of other types) are not a problem at all... On Thu, Apr 14, 2011 at 12:12 PM, aaron morton aa...@thelastpickle.com wrote: Need to clear up some terminology here. Rows have a key and can be retrieved by key. This is *sort of* the primary index, but not primary in the normal RDBMS sense. Rows can have different columns, and the column names are sorted and can be efficiently selected. There are secondary indexes in Cassandra 0.7 based on column values: http://www.datastax.com/dev/blog/whats-new-cassandra-07-secondary-indexes So you could create secondary indexes on the a, e, and h columns and get rows that have specific values. There are some limitations to secondary indexes; read the linked article. Or you can make your own secondary indexes using row keys as the index values. If you have billions of rows, how many do you need to read back at once? Hope that helps. Aaron On 14 Apr 2011, at 04:23, David Boxenhorn wrote: Is it possible in 0.7.x to have indexes on heterogeneous rows, which have different sets of columns? For example, let's say you have three types of objects (1, 2, 3), each of which has three members. If your rows had the following pattern: type=1 a=? b=? c=? / type=2 d=? e=? f=? / type=3 g=? h=? i=? could you index type as your primary index, and also index a, e, h as secondary indexes, to get the objects of that type that you are looking for? Would it work if you had billions of rows of each type? -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of DataStax, the source for professional Cassandra support http://www.datastax.com
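For reference, a two-expression index query like the one discussed could look as follows with pycassa (a sketch; the keyspace and column family names are hypothetical, and per Jonathan's explanation the server picks the most selective EQ-indexed expression as the driving index and filters on the rest):

    import pycassa
    from pycassa.index import create_index_clause, create_index_expression

    pool = pycassa.connect('Keyspace1', ['localhost:9160'])
    cf = pycassa.ColumnFamily(pool, 'Objects')  # hypothetical CF

    clause = create_index_clause(
        [create_index_expression('e', 5),       # indexed column: drives the lookup
         create_index_expression('type', 2)],   # applied as a post-filter
        count=100)

    for key, columns in cf.get_indexed_slices(clause):
        print key, columns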
Re: Duplicate result of get_indexed_slices, depending on indexClause.count
https://issues.apache.org/jira/browse/CASSANDRA-2406 On Fri, Apr 15, 2011 at 1:43 AM, sam_ amin_shar...@yahoo.com wrote: [quoted text elided; see the original message above] -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of DataStax, the source for professional Cassandra support http://www.datastax.com
CL.ONE gives UnavailableException on ok node
Just experienced something I don't understand yet. Running a 3 node cluster successfully for a few days now, then one of the nodes went down (server required reboot). After this the other two nodes kept throwing UnavailableExceptions like:

    UnavailableException()
        at org.apache.cassandra.service.WriteResponseHandler.assureSufficientLiveNodes(WriteResponseHandler.java:127)
        at org.apache.cassandra.service.StorageProxy.mutate(StorageProxy.java:118)
        at no.finntech.countstats.listener.CassandraMessageListener$1.run(CassandraMessageListener.java:356)

(this code being loosely based off the second example in http://wiki.apache.org/cassandra/ScribeToCassandra ). This seems a bit weird to me when StorageProxy.mutate(..) is being called with ConsistencyLevel.ONE. I'm running 0.7.4 so I doubt it to be CASSANDRA-2069. ~mck -- Everything you can imagine is real. Pablo Picasso | http://semb.wever.org | http://sesat.no | http://tech.finn.no | Java XSS Filter
How to warm up a cold node
Hi everyone, is there any recommended procedure to warm up a node before bringing it up? Thanks!
Re: How to warm up a cold node
Hi everyone, is there any recommended procedure to warm up a node before bringing it up? Currently the only out-of-the-box support for warming up caches is that implied by the key cache and row cache, which will pre-heat on start-up. Indexes will be indirectly preheated by index sampling, to the extent that the operating system retains them in page cache. If you're wanting to pre-heat sstables there's currently no way to do that (but it would be a useful feature to have). Pragmatically, you can script something that e.g. does "cat path/to/keyspace/* > /dev/null" or similar. But that only works if the total database size fits reasonably well in page cache. Pre-heating sstables on a per-cf basis on start-up would be a nice feature to have. -- / Peter Schuller
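A Python equivalent of the cat trick, for anyone who wants a starting point (a sketch; the data path is hypothetical and, as noted above, this only helps if the data fits reasonably well in page cache):

    import glob

    def warm_page_cache(pattern, chunk=8 * 1024 * 1024):
        # Read every matching file and discard the bytes; the useful side
        # effect is that the OS page cache now holds the file contents.
        for path in glob.glob(pattern):
            with open(path, 'rb') as f:
                while f.read(chunk):
                    pass

    warm_page_cache('/var/lib/cassandra/data/Keyspace1/*-Data.db')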
Re: How to warm up a cold node
How difficult do you think this could be? I would be interested in developing this if it's feasible. On Fri, 2011-04-15 at 16:19 +0200, Peter Schuller wrote: [quoted text elided]
question about performance of Cassandra 0.7.4 under a read-heavy workload.
I just deployed Cassandra 0.7.4 as a 6-server cluster and tested its performance via YCSB. The result seems confusing when compared to that of Cassandra 0.6.6. Under a write-heavy workload (i.e., write/read: 50%/50%), Cassandra 0.7.4 obtains a really satisfactory latency. I mean both the read latency and the write latency are much lower than those of Cassandra 0.6.6. However, under a read-heavy workload (i.e., write/read: 5%/95%), Cassandra 0.7.4 performs far worse than Cassandra 0.6.6 does. Did I miss something?
Consistency model
I've been experimenting with the consistency model of Cassandra, and I found something that seems a bit unexpected. In my experiment, I have 2 processes, a reader and a writer, each accessing a Cassandra cluster with a replication factor greater than 1. In addition, sometimes I generate background traffic to simulate a busy cluster by uploading a large data file to another table. The writer executes a loop where it writes a single row that contains just a sequentially increasing sequence number and a timestamp. In Python this looks something like:

    while time.time() < start_time + duration:
        target_server = random.sample(servers, 1)[0]
        target_server = '%s:9160' % target_server
        row = {'seqnum': str(seqnum), 'timestamp': str(time.time())}
        seqnum += 1
        # print 'uploading to server %s, %s' % (target_server, row)
        pool = pycassa.connect('Keyspace1', [target_server])
        cf = pycassa.ColumnFamily(pool, 'Standard1')
        cf.insert('foo', row, write_consistency_level=consistency_level)
        pool.dispose()
        if sleeptime > 0.0:
            time.sleep(sleeptime)

The reader simply executes a loop reading this row and reporting whenever a sequence number is *less* than the previous sequence number. As expected, with consistency_level=ConsistencyLevel.ONE there are many inconsistencies, especially with a high replication factor. What is unexpected is that I still detect inconsistencies when it is set at ConsistencyLevel.QUORUM. This is unexpected because the documentation seems to imply that QUORUM will give consistent results. With background traffic the average difference in timestamps was 0.6s, and the maximum was 3.5s. This means that a client sees a version of the row, and can subsequently see another version of the row that is 3.5s older than the previous. What I imagine is happening is this, but I'd like someone who knows what they're talking about to tell me if it's actually the case: I think Cassandra is not using an atomic commit protocol to commit to the quorum of servers chosen when the write is made. This means that at some point in the middle of the write, some subset of the quorum have seen the write, while others have not. At this time, there is a quorum of servers that have not seen the update, so depending on which quorum the client reads from, it may or may not see the update. Of course, I understand that the client is not *choosing* a bad quorum to read from, it is just the first `q` servers to respond, but in this case it is effectively random and sometimes a bad quorum is chosen. Does anyone have any other insight into what is going on here?
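For completeness, the reader side (only described above) could look roughly like this, reusing the same pycassa setup and the writer's consistency_level variable:

    import pycassa

    pool = pycassa.connect('Keyspace1', ['server1:9160'])
    cf = pycassa.ColumnFamily(pool, 'Standard1')
    last = -1
    while True:
        row = cf.get('foo', read_consistency_level=consistency_level)
        seq = int(row['seqnum'])
        if seq < last:
            # A smaller sequence number than previously observed means we
            # just read an older version of the row.
            print 'inconsistency: read seqnum %d after %d' % (seq, last)
        last = max(last, seq)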
Key cache hit rate
How to interpret the key cache hit rate? What does this number mean?

    Keyspace: StressKeyspace
        Read Count: 87579
        Read Latency: 11.792417360326105 ms.
        Write Count: 179749
        Write Latency: 0.009272318622078566 ms.
        Pending Tasks: 0
        Column Family: StressStandard
        SSTable count: 59
        Space used (live): 52432078035
        Space used (total): 52432078035
        Memtable Columns Count: 229
        Memtable Data Size: 114103248
        Memtable Switch Count: 375
        Read Count: 87579
        Read Latency: NaN ms.
        Write Count: 179751
        Write Latency: 0.007 ms.
        Pending Tasks: 0
        Key cache capacity: 100
        Key cache size: 78576
        Key cache hit rate: 3.8880248833592535E-4
        Row cache: disabled
        Compacted row minimum size: 182786
        Compacted row maximum size: 5839588
        Compacted row mean size: 532956
Re: What's the best modeling approach for ordering events by date?
Hi. So, the OPP will direct all activity for a range of keys to a particular node (or set of nodes, in accordance with your replication factor). Depending on the volume of writes, this could be fine. Depending on the distribution of key values you write at any given time, it can also be fine. But if you're using the OPP, and your keys align with the time of receiving the data, and your application writes that data as it receives it, you're going to be placing write activity on effectively one node at a time, for the range of time allocated to that node. If you use RP, and can divide time into finer slices such that you have multiple tweets in a row, you trade off a more complex read in exchange for better distribution of load throughout your cluster. The necessity of this depends on your particulars. In your TweetsBySecond example, you're using a deterministic set of keys (the keys correspond to seconds since epoch). Querying for ranges of time is nice with OPP, but if the ranges of time you're interested in are constrained, you don't specifically need OPP. You could use RP and request all the keys for the seconds contained within the time range of interest. In this way, you balance writes across the cluster more effectively than you would with OPP, while still getting a workable data set. Again, the degree to which you need this depends on your situation. Others on the list will no doubt have more informed opinions on this than me. :) On Thu, Apr 14, 2011 at 8:00 PM, Guillermo Winkler gwink...@inconcertcc.com wrote: Hi Ethan, I want to present the events ordered by time, always in pages of 20/40 events. If the events are tweets, you can have 1000 tweets from the same second or you can have 30 tweets in a 10 minute range. But I always want to be able to page through the results in an orderly fashion. I think that using seconds since epoch is what I'm doing, that is, dividing time into a fixed series of intervals. Each second is an interval, and all of the events for that particular second are columns of that row. Again with tweets for easier visualization:

    TweetsBySecond : {
        12121121212 : {      <- seconds since epoch
            id1, id2, id3    <- all the tweet ids that occurred in that particular second
        },
        12121212123 : {
            id4, id5
        },
        12121212124 : {
            id6
        }
    }

The problem is you can't do that using OPP in Cassandra 0.7, or is it just me missing something? Thanks for your answer, Guille On Thu, Apr 14, 2011 at 4:49 PM, Ethan Rowe et...@the-rowes.com wrote: How do you plan to read the data? Entire histories, or in relatively confined slices of time? Do the events have any attributes by which you might segregate them, apart from time? If you can divide time into a fixed series of intervals, you can insert members of a given interval as columns (or supercolumns) in a row. But it depends how you want to use the data on the read side. On Thu, Apr 14, 2011 at 12:25 PM, Guillermo Winkler gwink...@inconcertcc.com wrote: I have a huge number of events I need to consume later, ordered by the date the event occurred. My first approach to this problem was to use seconds since epoch as the row key, and event ids as column names (empty value), this way:

    EventsByDate : {
        SecondsSinceEpoch : {
            evid:, evid:, evid:
        }
    }

And use OPP as the partitioner, using GetRangeSlices to retrieve ordered events sequentially.
Now I have two problems to solve: 1) The system is realtime, so all the events in a given moment are hitting the same box. 2) Migrating from Cassandra 0.6 to Cassandra 0.7, OPP doesn't seem to like LongType for row keys; was this purposely deprecated? I was thinking about secondary indexes, but they do not assure the order the rows come out of Cassandra. Does anyone have a better approach to model events by date given those restrictions? Thanks, Guille
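A sketch of the RandomPartitioner approach Ethan describes: since the per-second row keys are deterministic, a time range can be fetched with a multiget instead of a range scan (pycassa shown; the CF name matches the example above, and seconds with no events simply come back absent):

    import pycassa

    pool = pycassa.connect('Keyspace1', ['localhost:9160'])
    events = pycassa.ColumnFamily(pool, 'TweetsBySecond')

    def ids_between(start_epoch, end_epoch):
        # Enumerate the per-second keys explicitly; no OPP range scan needed.
        keys = [str(s) for s in range(start_epoch, end_epoch + 1)]
        rows = events.multiget(keys)  # mapping of key -> {tweet_id: value}
        for second in sorted(rows, key=int):
            for tweet_id in rows[second]:
                yield int(second), tweet_id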
Two versions of schema
Is there a problem?

    [default@StressKeyspace] update column family StressStandard with keys_cached=100;
    854ee0a0-6792-11e0-81f9-93d987913479
    Waiting for schema agreement...
    The schema has not settled in 10 seconds; further migrations are ill-advised until it does.
    Versions are 854ee0a0-6792-11e0-81f9-93d987913479:[10.18.62.202, 10.18.62.203, 10.18.62.200, 10.18.62.204, 10.18.62.199, 10.18.62.196, 10.18.62.197],
                 22d165ff-6783-11e0-81f9-93d987913479:[10.18.62.198]

I remember reading somewhere before that when you have 2 versions of schemas you are basically in trouble. Can someone explain what it means and its implications?
Problems with subcolumn retrieval after upgrade from 0.6 to 0.7
I'm having some issues with a few of my ColumnFamilies after a Cassandra upgrade/import from 0.6.1 to 0.7.4. I followed the instructions to upgrade and everything seemed to work OK... until I got into the application and noticed some weird behavior. I was getting the following stack trace in Cassandra occasionally when I did get operations for a single subcolumn of some of the Super type CFs:

    ERROR 12:56:05,669 Internal error processing get
    java.lang.AssertionError
        at org.apache.cassandra.thrift.CassandraServer.get(CassandraServer.java:300)
        at org.apache.cassandra.thrift.Cassandra$Processor$get.process(Cassandra.java:2655)
        at org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2555)
        at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:206)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:636)

The assertion that is failing is the check that only one column is retrieved by the get. I did some debugging with the cli and a remote debugger and found a few interesting patterns. First, the problem does not seem consistently duplicatable. If one supercolumn is affected, though, it will happen more frequently for subcolumns that, when sorted, appear at the beginning of the range. For columns near the end of the range, it seems to be more intermittent, and almost never occurs when I step through the code line by line. The only factor I can think of that might cause issues is that I am using custom data types for all supercolumns and columns. I originally thought I might be reading past the end of the ByteBuffer, but I have quadruple-checked that this is not the case. Abe Sanderson
recurring EOFException exception in 0.7.4
I've been struggling with these kinds of exceptions for some time now. I thought it might have been a one-time thing, so on the 2 nodes where I saw this problem I pulled in fresh data with a repair on an empty data directory. Unfortunately, this problem is now coming up on a new node that has, up until now, not had this problem. What could be causing this? Could it be related to encoding? Why are these rows not readable? This exception prevents Cassandra from doing repairs, and even minor compactions. It also messes up memtable management (with a normal load of 25 GB, disk goes to almost 100% full on a 500 GB hd). This is incredibly frustrating. This is the only pain point I have had with Cassandra so far. By the way, this node was never upgraded - it was 0.7.4 from the start, so that eliminates format compatibility problems.

    ERROR [CompactionExecutor:1] 2011-04-15 21:31:23,479 PrecompactedRow.java (line 82) Skipping row DecoratedKey(105452551814086725777389040553659117532, 4d657373616765456e726963686d656e743a313032343937) in /var/lib/cassandra/data/DFS/main-f-91-Data.db
    java.io.EOFException
        at java.io.RandomAccessFile.readFully(RandomAccessFile.java:383)
        at java.io.RandomAccessFile.readFully(RandomAccessFile.java:361)
        at org.apache.cassandra.io.util.BufferedRandomAccessFile.readBytes(BufferedRandomAccessFile.java:270)
        at org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:315)
        at org.apache.cassandra.utils.ByteBufferUtil.readWithLength(ByteBufferUtil.java:272)
        at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:94)
        at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:35)
        at org.apache.cassandra.db.ColumnFamilySerializer.deserializeColumns(ColumnFamilySerializer.java:129)
        at org.apache.cassandra.io.sstable.SSTableIdentityIterator.getColumnFamilyWithColumns(SSTableIdentityIterator.java:176)
        at org.apache.cassandra.io.PrecompactedRow.<init>(PrecompactedRow.java:78)
        at org.apache.cassandra.io.CompactionIterator.getCompactedRow(CompactionIterator.java:147)
        at org.apache.cassandra.io.CompactionIterator.getReduced(CompactionIterator.java:108)
        at org.apache.cassandra.io.CompactionIterator.getReduced(CompactionIterator.java:43)
        at org.apache.cassandra.utils.ReducingIterator.computeNext(ReducingIterator.java:73)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131)
        at org.apache.commons.collections.iterators.FilterIterator.setNextObject(FilterIterator.java:183)
        at org.apache.commons.collections.iterators.FilterIterator.hasNext(FilterIterator.java:94)
        at org.apache.cassandra.db.CompactionManager.doCompaction(CompactionManager.java:449)
        at org.apache.cassandra.db.CompactionManager$1.call(CompactionManager.java:124)
        at org.apache.cassandra.db.CompactionManager$1.call(CompactionManager.java:94)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
cluster IP question and Jconsole?
I have followed the description here http://www.edwardcapriolo.com/roller/edwardcapriolo/entry/lauching_5_node_cassandra_clusters to create 5 instances of Cassandra on one CentOS 5.5 machine. Using nodetool shows the 5 nodes are all running fine. Note the 5 nodes are using IPs 127.0.0.1 to 127.0.0.5. I understand 127.0.0.1 points to the local server, but what about 127.0.0.2 to 127.0.0.5? They don't look like valid IPs to me, so how come all 5 nodes are working OK? Another question: I have installed MX4J in instance 127.0.0.1 on port 8081. I am able to connect to http://server:8081/ from the browser. However, how do I connect using JConsole installed on another Windows machine? (My CentOS 5.5 box doesn't have X installed; only SSH is allowed.) Thanks.
Re: Cassandra 2 DC deployment
You are right about the automatic fallback to ONE. It's quite possible that if 2 nodes die for some reason I will have the same problem. So probably the right thing to do would be to read/write at ONE only when we lose a DC, by changing some manual configuration. Since we shouldn't be losing DCs that often, this should be an acceptable change. So my follow-up questions would be - Seems reasonable to have a human do it, since it seems that you really want QUORUM - so presumably there is some kind of negative impact and you don't want that sporadically happening every time there is a hiccup. But of course I don't know the context. When would be the right time to start reading/writing at QUORUM again? I'd say usually as soon as possible, but it will depend on the details of your situation. For example, if you have 2 DCs with 5 nodes in one and 1 node in another, and there is a partition - the DC with just one node will start seeing older data (from the point of view of writes done in the 1-node DC) if you start asking for quorum, since a lot of the time a quorum will be 4 nodes in the other DC. So if there is interest in preferring the local DC's copy of the data after an emergency fallback to CL.ONE, it may be detrimental to go QUORUM too early. But this will depend on what your application is actually doing and what is important to you. Should we be marking the 2 nodes in the lost DC as down? Should we be doing some administrative work on Cassandra before we start reading/writing at QUORUM again? Are you talking about permanently losing a DC, then, rather than just a transient partition? For non-permanent situations it seems counter-productive to mark the other DC's nodes as down. Oh, and by the way, keep in mind you can choose to use LOCAL_QUORUM to get intra-site consistency (rather than ONE). As for administrative work: I can't answer in general since we're talking about very special circumstances, but at least it's valid to say that whenever you have some kind of issue that has caused inconsistency, running 'nodetool repair' (perhaps earlier than the standard weekly/whatever repair) is the most efficient way to achieve consistency again. -- / Peter Schuller
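A sketch of the manual switch discussed above, in pycassa terms (module paths and names assumed; adjust for your client): the consistency level comes from operator-controlled state rather than automatic fallback, so a human decides when to run degraded and when to return to quorum:

    import pycassa
    from pycassa.cassandra.ttypes import ConsistencyLevel

    DEGRADED = False  # flipped by an operator when a DC is lost

    def write_cl():
        # LOCAL_QUORUM gives intra-site consistency in normal operation;
        # ONE keeps writes available while a DC is down.
        return ConsistencyLevel.ONE if DEGRADED else ConsistencyLevel.LOCAL_QUORUM

    def store(cf, key, columns):
        cf.insert(key, columns, write_consistency_level=write_cl())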
RE: recurring EOFException exception in 0.7.4
Try running nodetool scrub on the CF: it's pretty good at detecting and fixing most corruption problems. Dan -----Original Message----- From: Jonathan Colby [mailto:jonathan.co...@gmail.com] Sent: April-15-11 15:41 To: user@cassandra.apache.org Subject: recurring EOFException exception in 0.7.4 [quoted text elided]
RE: Consistency model
So Cassandra does not use an atomic commit protocol at the cluster level. Strong consistency on a quorum read is only guaranteed *after* a successful quorum write. The behaviour you are seeing is possible if you are reading in the middle of a write, or if the write failed (which should be reported to your code via an exception). Dan -----Original Message----- From: James Cipar [mailto:jci...@cmu.edu] Sent: April-15-11 14:15 To: user@cassandra.apache.org Subject: Consistency model [quoted text elided]
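The rule behind Dan's point can be stated compactly: a read is guaranteed to see the latest *successful* write whenever the read and write replica sets must overlap, i.e. R + W > N. A quick check of the condition:

    def overlaps(n, r, w):
        # Strong consistency condition: read and write replica counts
        # must sum to more than the replication factor N.
        return r + w > n

    def quorum(n):
        return n // 2 + 1

    assert overlaps(3, quorum(3), quorum(3))  # QUORUM writes + QUORUM reads: 2 + 2 > 3
    assert not overlaps(3, 1, 1)              # ONE + ONE: reads may be stale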
Schemas diverging while dynamically creating CF.
Hello, We're testing Cassandra for integration with IndexTank. In this first try, we're creating one column family for each user. In practice, on the first run and for the first few documents (a few hundred), a new CF is created, and a document is immediately added to it. A few (up to 50) requests of this type are issued in parallel (for different column families). The end result, and quite repeatable, is having the cluster split with different schema versions, and they never agree. Any thoughts? Thanks, Spike. -- Alejandro Perez IndexTank follow us @indextank http://twitter.com/indextank | read our blog http://blog.indextank.com/ | subscribe to our user mailing list http://groups.google.com/group/indextank
Re: CL.ONE gives UnavailableException on ok node
Sure sounds like you have RF=1 to me. On Fri, Apr 15, 2011 at 7:45 AM, Mick Semb Wever m...@apache.org wrote: [quoted text elided] -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of DataStax, the source for professional Cassandra support http://www.datastax.com
Re: CL.ONE gives UnavailableException on ok node
On Fri, 2011-04-15 at 15:43 -0500, Jonathan Ellis wrote: Sure sounds like you have RF=1 to me. Yes, that's right. I see... so the answer here is that I should be using CL.ANY? (So the write goes through and hinted handoff can get it to the correct node later on.) ~mck -- The fox condemns the trap, not himself. William Blake | http://semb.wever.org | http://sesat.no | http://tech.finn.no | Java XSS Filter
RE: Schemas diverging while dynamically creating CF.
Uh... don't create a column family per user. Column families are meant to be fairly static; conceptually equivalent to a table in a relational database. Why do you need (or even want) a CF per user? Reconsider your data model; a single column family with an inverted index for a 'user' column is probably more what you are looking for. Operationally, the fewer CFs the better. Dan From: Alejandro Perez [mailto:sp...@indextank.com] Sent: April-15-11 16:39 To: user@cassandra.apache.org Cc: Support Subject: Schemas diverging while dynamically creating CF. [quoted text elided]
Re: Schemas diverging while dynamically creating CF.
Thanks for the quick response! I will reconsider the schema. However, the problem troubles me somewhat. How are schema changes supposed to be done? Should I serialize them? Should I halt other cluster operations while I do the schema change? Is this a known problem with Cassandra? The other question, and I think the more important one for me now: how do I repair the cluster without losing data once the schemas diverge? Right now the only way I have is to erase all data and have the cluster start empty. Should this problem ever happen in production, it's important there's a way to recover the data. On Fri, Apr 15, 2011 at 1:57 PM, Dan Hendry dan.hendry.j...@gmail.com wrote: [quoted text elided] -- Alejandro Perez IndexTank follow us @indextank http://twitter.com/indextank | read our blog http://blog.indextank.com/ | subscribe to our user mailing list http://groups.google.com/group/indextank
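One way to make dynamic schema changes safer in the meantime is to serialize them and wait for cluster-wide agreement between each change, using the Thrift call describe_schema_versions available in 0.7. A sketch, assuming `client` is a connected Thrift Cassandra.Client:

    import time

    def wait_for_schema_agreement(client, timeout=30.0):
        # Poll until all reachable nodes report the same schema version.
        deadline = time.time() + timeout
        while time.time() < deadline:
            versions = client.describe_schema_versions()
            # Down nodes are reported under the 'UNREACHABLE' pseudo-version.
            live = [v for v in versions if v != 'UNREACHABLE']
            if len(live) <= 1:
                return True
            time.sleep(0.5)
        return False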
Re: CL.ONE gives UnavailableException on ok node
Yes, if you want to keep writes available w/ RF=1 then you need to use CL.ANY. On Fri, Apr 15, 2011 at 3:48 PM, Mick Semb Wever m...@apache.org wrote: [quoted text elided] -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of DataStax, the source for professional Cassandra support http://www.datastax.com
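In pycassa terms, the CL.ANY write discussed above looks like this (a sketch; the CF name is hypothetical). Note that a write accepted only as a hint is not readable until handoff delivers it to the natural replica:

    import pycassa
    from pycassa.cassandra.ttypes import ConsistencyLevel

    pool = pycassa.connect('Keyspace1', ['host1:9160'])
    cf = pycassa.ColumnFamily(pool, 'Counts')  # hypothetical CF

    # Succeeds even if the key's only replica (RF=1) is down: a live node
    # stores a hint and replays it when the replica comes back.
    cf.insert('some-key', {'col': 'val'},
              write_consistency_level=ConsistencyLevel.ANY)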
Upcoming Bay area Cassandra events
FYI, there are a couple of Cassandra events coming up in April and May in the Bay area: Wednesday, April 27, 1pm-6pm: Free Cassandra training by DataStax, hosted by Ooyala! *Space is limited*; you can sign up at http://www.datastax.com/freetraining. Wednesday, April 27, 6pm-8pm (yes, the evening of the training day): DataStax and Ooyala will be hosting a meet n' greet with pizza, beer, and Cassandra. The event begins with a happy hour from 6PM to 7PM. Following the happy hour, Ooyala staff will show how they're using Cassandra to power their analytics. (Some background material at [1].) DataStax engineers will also be there to share details about Brisk [2], the new open source Hadoop distribution that uses Cassandra for its core services. RSVP at http://www.meetup.com/Cassandra-User-Group-Meeting/events/17283903/ Monday, May 9, 2011, 6:45pm: The San Francisco Geo Meetup will feature a presentation by Mike Malone of SimpleGeo. Mike will explain how and why the company built its own data indexing scheme using Apache Cassandra; some background is at [3]. This is a great opportunity to see the type of problems that arise when working with multidimensional spatial data. RSVP at http://www.meetup.com/geomeetup/events/17034143/ [1] http://www.ooyala.com/whitepapers/Cassandrawhitepaper.pdf [2] http://www.datastax.com/wp-content/uploads/2011/03/WP-Brisk.pdf [3] http://www.slideshare.net/mmalone/working-with-dimensional-data-in-distributed-hash-tables -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of DataStax, the source for professional Cassandra support http://www.datastax.com
Re: Cassandra Database Modeling
Rows can have up to 2 billion columns; the max column size is 2 GB. But less than 10 MB sounds like a sane limit for a single column. For the serialisation, it depends on what your data looks like; the point is that JSON is not space efficient. You may get away with just compressing it (gzip, lzo...), or you may need to create your own space-efficient binary format. Start with compressing, and use the C-accelerated simplejson package. struct.pack is a way to encode bytes, typically to exchange with other programs. Good luck. Aaron On 15/04/2011, at 3:59 PM, csharpplusproject csharpplusproj...@gmail.com wrote: Aaron, Thank you so much. So, the way things appear, it is definitely possible that I could be making queries that would return all 10M particle pairs (at least, I should plan for it). What would be the best design in such a case? I read somewhere that the recommended maximum size of a row (meaning, including all columns) should be around 10 MB, and that it's better not to exceed that. Is that correct? As for packing data efficiently, what would be the best way? Would packing the data using, say (in Python terms), struct.pack(...) be at all helpful? Thanks, Shalom. -----Original Message----- From: aaron morton aa...@thelastpickle.com Reply-to: user@cassandra.apache.org To: user@cassandra.apache.org Subject: Re: Cassandra Database Modeling Date: Thu, 14 Apr 2011 20:54:43 +1200 WRT your query, it depends on how big a slice you want to get and how time critical it is. E.g. could you be making queries that would return all 10M pairs? Or would the queries generally want to get some small fraction of the data set? Again, it depends on how the sim runs. If your sim has stop-the-world pauses where you have a full view of the data space, then you could grab all the points at a certain distance and efficiently pack them up. Where efficiently means not using JSON. http://wiki.apache.org/cassandra/LargeDataSetConsiderations http://wiki.apache.org/cassandra/CassandraLimitations Aaron On 13 Apr 2011, at 15:48, csharpplusproject wrote: Aaron, Thank you so much for your help. It is greatly appreciated! Looking at the design of the particle pairs: - key: experiment_id.time_interval - column name: pair_id - column value: distance, angle, other data packed together as JSON or some other format. You wrote that retrieving millions of columns (I will have about 10,000,000 particle pairs) would be slow. You are also right that the retrieval of millions of columns into Python won't be fast. My desired query is to get all particle pairs in time interval [Tn..T(n+1)] where the distance between the two particles is smaller than X and the angle between the two particles is greater than Y. In such a query, given the fact that retrieving millions of columns could be slow, would it be best to, say, 'concatenate' all values for all particle pairs for a given 'experiment_id.time_interval' into one column? If data is stored in this way, I will be getting from Cassandra a binary string / JSON object that I will have to 'unpack' in my application. Is this a recommended approach? Are there better approaches? Is there a limit to the size that can be stored in one 'cell' (by 'cell' I mean the intersection between a key and a data column)? Is there a limit to the size of data of one key? One data column? Thanks in advance for any help / guidance.
-----Original Message----- From: aaron morton aa...@thelastpickle.com Reply-to: user@cassandra.apache.org To: user@cassandra.apache.org Subject: Re: Cassandra Database Modeling Date: Wed, 13 Apr 2011 10:14:21 +1200 Yes for interactive == real time queries. Hadoop-based techniques are for non-time-critical queries, but they do have greater analytical capabilities. particle_pairs: 1) Yes and no and sort of. Under the hood, the get_slice API call will be used by your client library to pull back chunks of (ordered) columns. Most client libraries abstract away the chunking for you. 2) If you are using a packed structure like JSON then no, Cassandra will have no idea what you've put in the columns other than bytes. It really depends on how much data you have per pair, but generally it's easier to pull back more data than to try to get exactly what you need. The downside is you have to update all the data. 3) No, you would need to update all the data for the pair. I was assuming most of the data is written once, and that your simulation has something like a stop-the-world phase between time slices where state is dumped and then read to start the next interval. You could either read it first, or we can come up with something else. distance_cf: 1) The query would return a list of columns, which have a name and value (as well as a timestamp and TTL). 2) Depends on the client library; if using Python, go for https://github.com/pycassa/pycassa
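To make Aaron's two options concrete, a sketch for a hypothetical (distance, angle) pair record, using only the standard library (simplejson would be a drop-in replacement for json here):

    import json, struct, zlib

    pair = {'distance': 12.5, 'angle': 0.73}

    # Option 1: keep JSON but compress the blob before storing it.
    blob_json = zlib.compress(json.dumps(pair))

    # Option 2: a fixed binary layout - two little-endian doubles, 16 bytes flat.
    blob_packed = struct.pack('<dd', pair['distance'], pair['angle'])
    distance, angle = struct.unpack('<dd', blob_packed)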
Re: question about performance of Cassandra 0.7.4 under a read-heavy workload.
Will need to know more about the number of requests, iostats etc. There is no reason for it to run slower. Aaron On 16/04/2011, at 2:35 AM, 魏金仙 sei_...@126.com wrote: [quoted text elided]
Re: Key cache hit rate
Move the decimal point 4 places to the left: 3.888E-4 is 0.0003888, the fraction of your reads that get a hit from the key cache - i.e. only about 0.04% of reads are hitting it. Aaron On 16/04/2011, at 6:25 AM, mcasandra mohitanch...@gmail.com wrote: [quoted text elided]
DatabaseDescriptor.defsVersion
Hey all, I've been seeing a very rare issue with schema change conflicts on 0.7.3 (I am serializing all schema changes to a single Cassandra node and waiting for them to finish before continuing). Occasionally a node in the cluster will never report the correct schema, and I think it may have to do with synchronization on DatabaseDescriptor.defsVersion. As far as I can tell, it is a static variable accessed by multiple threads but is not protected by synchronized/volatile. I was able to write a test in which one thread never reads the modification done by another thread (as is allowed for an unsynchronized variable). Should this be fixed, or is there a higher-level reason this does not need to be synchronized (in which case I should continue looking for the reason why my schemas don't agree)? Thanks. -Jeffrey
Re:Re: question about performance of Cassandra 0.7.4 under a read-heavy workload.
To make a comparison, 10 threads were run against the two workloads separately. Below is the result for Cassandra 0.7.4:

    Write-heavy workload (write/read: 50%/50%):
        median throughput: 5816 operations/second (i.e., 2908 writes and 2908 reads)
        update latency: 1.32 ms
        read latency: 1.81 ms
    Read-heavy workload (write/read: 5%/95%):
        median throughput: 40 operations/second (i.e., 2 writes and 38 reads)
        update latency: 1.85 ms
        read latency: 90.43 ms

And for Cassandra 0.6.6, the result is:

    Write-heavy workload (write/read: 50%/50%):
        median throughput: 3284 operations/second (i.e., 1642 writes and 1642 reads)
        update latency: 2.29 ms
        read latency: 3.51 ms
    Read-heavy workload (write/read: 5%/95%):
        median throughput: 2759 operations/second (i.e., 138 writes and 2621 reads)
        update latency: 2.33 ms
        read latency: 3.53 ms

All the tests were run in the same environment, and most Cassandra configurations are at their defaults, except that we chose OrderPreservingPartitioner for all the tests and set concurrent_reads to 8 (which is the default value in 0.6.6, but the default in 0.7.4 is 32). At 2011-04-16 06:53:01, Aaron Morton aa...@thelastpickle.com wrote: [quoted text elided]
Re: cluster IP question and Jconsole?
127.0.0.2 to 127.0.0.5 are valid IP addresses. They are just alias addresses for your loopback interface. Verify with: % ifconfig -a 127.0.0.0/8 is reserved for loopback, so you can't connect to these addresses from remote machines. You may be able to configure SSH port forwarding from your monitoring host to the Cassandra node, though I haven't tried. maki 2011/4/16 tinhuty he tinh...@hotmail.com: [quoted text elided]
Re: cluster IP question and Jconsole?
Maki, thanks for your reply. For the second question, I wasn't using the loopback address; I was using the actual IP address for that server. I am able to telnet to that IP on port 8081, but using JConsole failed. -----Original Message----- From: Maki Watanabe Sent: Friday, April 15, 2011 9:43 PM To: user@cassandra.apache.org Cc: tinhuty he Subject: Re: cluster IP question and Jconsole? [quoted text elided]
RE: DatabaseDescriptor.defsVersion
Done: https://issues.apache.org/jira/browse/CASSANDRA-2490 -Jeffrey -----Original Message----- From: Jonathan Ellis [mailto:jbel...@gmail.com] Sent: Friday, April 15, 2011 7:39 PM To: user@cassandra.apache.org Cc: Jeffrey Wang Subject: Re: DatabaseDescriptor.defsVersion I think you found a bug; it should be volatile. (Cassandra does already make sure that only one change runs internally at a time.) Can you create a ticket? On Fri, Apr 15, 2011 at 6:04 PM, Jeffrey Wang jw...@palantir.com wrote: [quoted text elided] -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of DataStax, the source for professional Cassandra support http://www.datastax.com
What will be the steps for adding new nodes
I have a 0.6.4 Cassandra cluster of two nodes with full replication (replication factor 2). I want to add two more nodes and balance the cluster (keeping replication factor 2). I want all of them to be seeds. What should be the simple steps: 1. Add <AutoBootstrap>true</AutoBootstrap> to all the nodes, or only the new ones? 2. Add <Seed>[new_node]</Seed> to the config file of the old nodes before adding the new ones? 3. Do the old nodes need to be restarted (if no change is needed in their config file)? TX,