Re: Read-repair working, repair not working?
When a new repair session starts it logs

logger.info(String.format("[repair #%s] new session: will sync %s on range %s for %s.%s", getName(), repairedNodes(), range, tablename, Arrays.toString(cfnames)));

When it completes it logs this

logger.info(String.format("[repair #%s] session completed successfully", getName()));

Or this on failure

logger.error(String.format("[repair #%s] session completed with the following error", getName()), exception);

Cheers

-
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 10/02/2013, at 9:56 PM, Brian Fleming wrote:

> Hi,
>
> I have a 20 node cluster running v1.0.7 split between 5 data centres, each with an RF of 2, containing a ~1TB unique dataset / ~10TB of total data.
>
> I’ve had some intermittent issues with a new data centre (3 nodes, RF=2) I brought online late last year with data consistency & availability: I’d request data, nothing would be returned, I would then re-request the data and it would correctly be returned, i.e. read-repair appeared to be occurring. However running repairs on the nodes didn’t resolve this (I tried general ‘repair’ commands as well as targeted keyspace commands) – this didn’t alter the behaviour.
>
> After a lot of fruitless investigation, I decided to wipe & re-install/re-populate the nodes. The re-install & repair operations are now complete: I see the expected amount of data on the nodes, however I am still seeing the same behaviour, i.e. I only get data after one failed attempt.
>
> When I run repair commands, I don’t see any errors in the logs.
> I see the expected ‘AntiEntropySessions’ count in ‘nodetool tpstats’ during repair sessions.
> I see a number of dropped ‘MUTATION’ operations: just under 5% of the total ‘MutationStage’ count.
>
> Questions:
> - Could anybody suggest anything specific to look at to see why the repair operations aren’t having the desired effect?
> - Would increasing the logging level to ‘DEBUG’ show read-repair activity (to confirm that this is happening, when & for what proportion of total requests)?
> - Is there something obvious that I could be missing here?
>
> Many thanks,
> Brian
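A quick way to check whether every repair session that starts also finishes is to scan system.log for the three messages quoted above. A minimal sketch, assuming the default log location and the 1.0-era message format (adjust the path and patterns for your install):

```python
import re

# Patterns matching the three repair log messages quoted above.
NEW = re.compile(r"\[repair #(?P<id>[^\]]+)\] new session")
OK = re.compile(r"\[repair #(?P<id>[^\]]+)\] session completed successfully")
ERR = re.compile(r"\[repair #(?P<id>[^\]]+)\] session completed with the following error")

def repair_outcomes(lines):
    """Return (started, succeeded, failed) sets of repair-session ids."""
    started, ok, err = set(), set(), set()
    for line in lines:
        for pat, bucket in ((NEW, started), (OK, ok), (ERR, err)):
            m = pat.search(line)
            if m:
                bucket.add(m.group("id"))
    return started, ok, err

# Usage (the log path is an assumption):
# with open("/var/log/cassandra/system.log") as f:
#     started, ok, err = repair_outcomes(f)
#     print("unfinished sessions:", started - ok - err)
```

Sessions that started but never logged either completion message are the ones worth chasing.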
Read-repair working, repair not working?
Hi,

I have a 20 node cluster running v1.0.7 split between 5 data centres, each with an RF of 2, containing a ~1TB unique dataset / ~10TB of total data.

I’ve had some intermittent issues with data consistency & availability in a new data centre (3 nodes, RF=2) I brought online late last year: I’d request data, nothing would be returned; I would then re-request the data and it would correctly be returned, i.e. read-repair appeared to be occurring. However, running repairs on the nodes didn’t resolve this (I tried general ‘repair’ commands as well as targeted keyspace commands) – this didn’t alter the behaviour.

After a lot of fruitless investigation, I decided to wipe & re-install/re-populate the nodes. The re-install & repair operations are now complete: I see the expected amount of data on the nodes, however I am still seeing the same behaviour, i.e. I only get data after one failed attempt.

When I run repair commands, I don’t see any errors in the logs. I see the expected ‘AntiEntropySessions’ count in ‘nodetool tpstats’ during repair sessions. I see a number of dropped ‘MUTATION’ operations: just under 5% of the total ‘MutationStage’ count.

Questions:
- Could anybody suggest anything specific to look at to see why the repair operations aren’t having the desired effect?
- Would increasing the logging level to ‘DEBUG’ show read-repair activity (to confirm that this is happening, when & for what proportion of total requests)?
- Is there something obvious that I could be missing here?

Many thanks,
Brian
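On the DEBUG question: in the 1.0.x series logging is configured through log4j-server.properties, and categories can be raised per-class rather than for the whole server. A sketch of what that might look like, assuming the read-path class names from the 1.0 source tree (worth verifying against the exact version you run):

```properties
# Keep the root logger at INFO...
log4j.rootLogger=INFO,stdout,R

# ...but enable DEBUG only on the read-path classes involved in
# read repair, to avoid drowning the log.
# (Class names assumed from the 1.0.x source tree - verify first.)
log4j.logger.org.apache.cassandra.service.ReadCallback=DEBUG
log4j.logger.org.apache.cassandra.service.RowRepairResolver=DEBUG
```

Scoping DEBUG to a couple of classes keeps the log volume manageable on a busy node.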
Re: Cassandra upgrade issues...
Hi Sylvain,

Simple as that!!! Using the 1.1.5 nodetool version works as expected. My mistake.

Many thanks,
Brian

On Thu, Nov 1, 2012 at 8:24 AM, Sylvain Lebresne wrote:

> The first thing I would check is whether nodetool is using the right jar. It sounds a lot like the server has been correctly updated but nodetool hasn't, and still uses the old classes.
> Check the nodetool executable — it's a shell script — and try echoing the CLASSPATH in there and check it correctly points to what it should.
>
> --
> Sylvain
>
> On Thu, Nov 1, 2012 at 9:10 AM, Brian Fleming wrote:
>
>> Hi,
>>
>> I was testing upgrading from Cassandra v1.0.7 to v1.1.5 yesterday on a single node dev cluster with ~6.5GB of data & it went smoothly in that no errors were thrown, the data was migrated to the new directory structure, I can still read/write data as expected, etc. However nodetool commands are behaving strangely – full details below.
>>
>> I couldn’t find anything relevant online relating to these exceptions – any help/pointers would be greatly appreciated.
>>
>> Thanks & Regards,
>> Brian
>>
>> ‘nodetool cleanup’ runs successfully.
>>
>> ‘nodetool info’ produces:
>>
>> Token: 82358484304664259547357526550084691083
>> Gossip active: true
>> Load: 7.69 GB
>> Generation No: 1351697611
>> Uptime (seconds): 58387
>> Heap Memory (MB): 936.91 / 1928.00
>> Exception in thread "main" java.lang.ClassCastException: java.lang.String cannot be cast to org.apache.cassandra.dht.Token
>>         at org.apache.cassandra.tools.NodeProbe.getEndpoint(NodeProbe.java:546)
>>         at org.apache.cassandra.tools.NodeProbe.getDataCenter(NodeProbe.java:559)
>>         at org.apache.cassandra.tools.NodeCmd.printInfo(NodeCmd.java:313)
>>         at org.apache.cassandra.tools.NodeCmd.main(NodeCmd.java:651)
>>
>> ‘nodetool repair’ produces:
>>
>> Exception in thread "main" java.lang.reflect.UndeclaredThrowableException
>>         at $Proxy0.forceTableRepair(Unknown Source)
>>         at org.apache.cassandra.tools.NodeProbe.forceTableRepair(NodeProbe.java:203)
>>         at org.apache.cassandra.tools.NodeCmd.optionalKSandCFs(NodeCmd.java:880)
>>         at org.apache.cassandra.tools.NodeCmd.main(NodeCmd.java:719)
>> Caused by: javax.management.ReflectionException: Signature mismatch for operation forceTableRepair: (java.lang.String, [Ljava.lang.String;) should be (java.lang.String, boolean, [Ljava.lang.String;)
>>         at com.sun.jmx.mbeanserver.PerInterface.noSuchMethod(PerInterface.java:152)
>>         at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:117)
>>         at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:262)
>>         at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:836)
>>         at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:761)
>>         at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1427)
>>         at javax.management.remote.rmi.RMIConnectionImpl.access$200(RMIConnectionImpl.java:72)
>>         at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1265)
>>         at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1360)
>>         at javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:788)
>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>         at java.lang.reflect.Method.invoke(Method.java:597)
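Sylvain's suggestion — echo the CLASSPATH inside the nodetool script and eyeball it — can also be automated for a fleet of nodes. A small sketch, assuming jar names embed the version like `apache-cassandra-1.1.5.jar` (which is the packaging convention, but verify against your install):

```python
import re

def cassandra_jar_versions(classpath):
    """Extract version strings from Cassandra jars on a ':'-separated classpath."""
    return sorted(set(re.findall(r"apache-cassandra-([\d.]+)\.jar", classpath)))

def mismatched(classpath, server_version):
    """True if the classpath points at Cassandra jars other than (exactly) the
    running server's version - the situation that produced the JMX
    'Signature mismatch' above."""
    return cassandra_jar_versions(classpath) != [server_version]

# Usage: add `echo $CLASSPATH` to the nodetool shell script, capture the
# output, and feed it in alongside the version the server reports.
```

A mix of versions on the classpath, or no match at all against the server's version, is exactly the stale-jar situation that caused the exceptions above.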
Node repair : excessive data
Hi,

We simulated a node 'failure' on one of our nodes by deleting the entire Cassandra installation directory & reconfiguring a fresh instance with the same token. When we issued a 'repair' it started streaming data back onto the node as expected. However, after the repair completed, we had over 2.5 times the original load. Issuing a 'cleanup' reduced this to about 1.5 times the original load. We observed an increase in the number of keys via 'cfstats', which is obviously accounting for the increased load.

Would anybody know why the repair pulled more keys in than it had initially with the same token? How can we avoid this recurring? If we didn't have sufficient headroom on the disk to handle, say, 3 times the load, we could be in a difficult situation should we experience a genuine failure.

(We're using Cassandra 1.0.5, 12 nodes split across 2 data centres; total cluster load during testing was about 150GB.)

Many thanks,
Brian
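Whatever the cause of the over-streaming, the practical takeaway is to budget disk for the peak load during a rebuild, before cleanup/compaction reclaims space. A tiny sketch of that headroom arithmetic, using the 2.5x factor observed above (the per-node figures are illustrative, derived from the 150GB / 12 node numbers in the post):

```python
def required_headroom_gb(node_load_gb, repair_bloat_factor=2.5):
    """Disk needed to survive rebuilding a node: peak load during repair,
    before 'cleanup' and compaction reclaim the excess."""
    return node_load_gb * repair_bloat_factor

# Illustrative: 150GB across 12 nodes averages 12.5GB per node, so with
# the observed 2.5x bloat a rebuild can transiently need ~31GB of disk.
per_node = 150 / 12
peak = required_headroom_gb(per_node)
```

If disks are sized only for steady-state load plus normal compaction overhead, a rebuild under this behaviour can fill them.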
Re: Efficiency of Cross Data Center Replication...?
Great - thanks Jake.

B.

On Wed, Nov 16, 2011 at 8:40 PM, Jake Luciani wrote:

> the former
>
> On Wed, Nov 16, 2011 at 3:33 PM, Brian Fleming wrote:
>
>> Hi All,
>>
>> I have a question about inter-data centre replication: if you have 2 data centres, each with a local RF of 2 (i.e. total RF of 4), and write to a node in DC1, how efficient is the replication to DC2 – i.e. is that data:
>> - replicated over to a single node in DC2 once and internally replicated, or
>> - replicated explicitly to two separate nodes?
>>
>> Obviously from a LAN resource utilisation perspective, the former would be preferable.
>>
>> Many thanks,
>> Brian
>
> --
> http://twitter.com/tjake
Efficiency of Cross Data Center Replication...?
Hi All,

I have a question about inter-data centre replication: if you have 2 data centres, each with a local RF of 2 (i.e. total RF of 4), and write to a node in DC1, how efficient is the replication to DC2 – i.e. is that data:
- replicated over to a single node in DC2 once and internally replicated, or
- replicated explicitly to two separate nodes?

Obviously from a LAN resource utilisation perspective, the former would be preferable.

Many thanks,
Brian
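As Jake's reply above confirms ("the former"), the coordinator ships a single copy of the write per remote data centre, and the receiving node forwards it to the other replicas inside its own DC. A toy model of the inter-DC traffic under the two schemes (names and structure are illustrative, not Cassandra API):

```python
def forwarded_wan_copies(rf_by_dc, local_dc):
    """Inter-DC copies per write under the forwarding scheme:
    one copy per remote data centre, regardless of that DC's RF."""
    return sum(1 for dc in rf_by_dc if dc != local_dc)

def naive_wan_copies(rf_by_dc, local_dc):
    """What it would cost if every remote replica were written
    directly across the inter-DC link."""
    return sum(rf for dc, rf in rf_by_dc.items() if dc != local_dc)
```

With two DCs at RF=2 each the saving is 1 copy vs 2 per write; with higher per-DC RFs or more data centres the gap widens accordingly.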
Monitoring....
Hi,

Has anybody used any solutions for harvesting and storing Cassandra JMX metrics for monitoring, trend analysis, etc.?

JConsole is useful for single-node monitoring etc., but not scalable, & data obviously doesn't persist between sessions...

Many thanks,
Brian
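Short of a full JMX-harvesting stack, a cron job that scrapes `nodetool tpstats` and ships the numbers to whatever metrics store you use already gives persistence and trend data. A sketch of the parsing step — the four-column layout (pool name, active, pending, completed) is assumed from 1.0-era output and should be checked against your version:

```python
def parse_tpstats(text):
    """Parse `nodetool tpstats` output into
    {pool_name: {"active": n, "pending": n, "completed": n}}."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        # Data rows look like: PoolName  active  pending  completed
        if len(parts) == 4 and all(p.isdigit() for p in parts[1:]):
            name, active, pending, completed = parts
            stats[name] = {"active": int(active),
                           "pending": int(pending),
                           "completed": int(completed)}
    return stats

# Usage: capture `nodetool -h <host> tpstats` on a schedule, parse it,
# and append the dict (with a timestamp) to your metrics store.
```

Tracking pending counts and dropped-message lines over time is exactly the kind of trend JConsole can't give you across sessions.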