Re: Corrupt SSTABLE over and over

2016-08-17 Thread Kai Wang
This might not be good news to you. But my experience is that C* 2.X/Windows is not ready for production yet. I've seen various file system related errors. And in one of the JIRAs I was told major work (or rework) is done in 3.X to improve C* stability on Windows. On Tue, Aug 16, 2016 at 3:44 AM,

Re: Corrupt SSTABLE over and over

2016-08-15 Thread Bryan Cheng
Hi Alaa, Sounds like you have problems that go beyond Cassandra, likely filesystem corruption or bad disks. I don't know enough about Windows to give you any specific advice, but I'd try a run of chkdsk to start. --Bryan On Fri, Aug 12, 2016 at 5:19 PM, Alaa Zubaidi (PDF)

Re: Corrupt SSTABLE over and over

2016-08-12 Thread Alaa Zubaidi (PDF)
Hi Bryan, Changing disk_failure_policy to best_effort and running nodetool scrub did not work; it generated another error: java.nio.file.AccessDeniedException. I also tried removing all files (data, commitlog, savedcaches) and restarting the node fresh, and I am still getting corruption. and Still

Re: Corrupt SSTABLE over and over

2016-08-12 Thread Bryan Cheng
Should also add that if the scope of corruption is _very_ large, and you have a good, aggressive repair policy (read: you are confident in the consistency of the data elsewhere in the cluster), you may just want to decommission and rebuild that node. On Fri, Aug 12, 2016 at 11:55 AM, Bryan Cheng

Re: Corrupt SSTABLE over and over

2016-08-12 Thread Bryan Cheng
Looks like you're doing the offline scrub; have you tried online? Here's my typical process for corrupt SSTables. With disk_failure_policy set to stop, examine the failing SSTables. If they are very small (in the range of KB), it is unlikely that there is any salvageable data there. Just delete
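A minimal sketch of the size check described above, assuming the SSTable data files end in `-Data.db` (as in the file names quoted elsewhere in this thread). The helper name `find_small_sstables` and the 64 KiB cutoff are illustrative assumptions, not anything from Cassandra itself; the script only lists candidates and deletes nothing.

```python
from pathlib import Path

# Assumption: "in the range of KB" is interpreted here as under 64 KiB.
SMALL_THRESHOLD_BYTES = 64 * 1024


def find_small_sstables(data_dir, threshold=SMALL_THRESHOLD_BYTES):
    """Return (path, size_in_bytes) pairs for *-Data.db files below threshold.

    The result is sorted by size, smallest first. Nothing is modified or
    deleted; this is only a listing aid for manual inspection.
    """
    results = []
    for path in Path(data_dir).rglob("*-Data.db"):
        size = path.stat().st_size
        if size < threshold:
            results.append((path, size))
    return sorted(results, key=lambda item: item[1])
```

Calling `find_small_sstables(...)` with the node's data directory returns the small files to review by hand before deciding whether to remove them.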

Re: Corrupt SSTABLE over and over

2016-08-12 Thread Alaa Zubaidi (PDF)
Hi Jason, Thanks for your input... That's what I am afraid of. Did you find any HW errors in the VMware and HW logs? Any indication that the HW is the reason? I need to make sure that this is the reason before asking the customer to spend more money. Thanks, Alaa On Thu, Aug 11, 2016 at 11:02 PM,

Re: Corrupt SSTABLE over and over

2016-08-12 Thread Alaa Zubaidi (PDF)
One more thing I noticed: the corrupted SSTable is mentioned twice in the log file. [CompactionExecutor:10253] 2016-08-11 08:59:01,952 - Compacting (.) [...la-1104-big-Data.db, ] [CompactionExecutor:10253] 2016-08-11 09:32:04,814 - Compacting (.) [...la-1104-big-Data.db]
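To check for this kind of repeated mention across a whole log, one could count how often each data file appears in "Compacting" lines. This is a rough sketch that assumes log lines shaped like the two quoted above; the function name `count_compaction_mentions` and the regular expression are illustrative, not part of any Cassandra tooling.

```python
import re
from collections import Counter

# Matches a token ending in -Data.db, stopping at whitespace, commas, or brackets.
SSTABLE_RE = re.compile(r"[^\s,\[\]]*-Data\.db")


def count_compaction_mentions(log_lines):
    """Count how many times each *-Data.db name appears in 'Compacting' lines."""
    counts = Counter()
    for line in log_lines:
        if "Compacting" not in line:
            continue  # ignore lines that are not compaction entries
        for name in SSTABLE_RE.findall(line):
            counts[name] += 1
    return counts
```

Passing the lines of the system log to `count_compaction_mentions` and looking for names with a count above 1 would show which SSTables were picked up by more than one compaction.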

Re: Corrupt SSTABLE over and over

2016-08-12 Thread Jason Wee
Is Cassandra running on a virtual server (VMware)? > I tried sstablescrub but it crashed with hs-err-pid-... Maybe try with a larger heap allocated to sstablescrub. I ran into this SSTable corruption as well (on Cassandra 1.2): first I tried nodetool scrub, and it still persisted, then offline sstablescrub still

Corrupt SSTABLE over and over

2016-08-11 Thread Alaa Zubaidi (PDF)
Hi, I have a 16-node cluster, Cassandra 2.2.1 on Windows, local installation (NOT on the cloud), and I am getting: Error [CompactionExecutor:2] 2016-08-12 06:51:52,983 CassandraDaemon.java:183 - Exception in thread Thread[CompactionExecutor:2,1,main] org.apache.cassandra.io.FSReadError: