OK, so those transactions are probably noise then. I changed the LogFormatter to not stop when it detects CRC errors so that I could see what nodes were in error. It turns out that those errors appear on larger nodes (>5MB). We do override jute.maxbuffer to more than the default 1MB.
Could there be some unexpected behavior with larger nodes? On Fri, Aug 14, 2015 at 11:40 AM, Flavio Junqueira < [email protected]> wrote: > Note that the zxid is different in each of the lines. > > -Flavio > > > On 14 Aug 2015, at 19:27, Benjamin Jaton <[email protected]> > wrote: > > > > I see a lot of those: > > > > *8/11/15 11:57:57 AM PDT session 0x14efe8c1ba2000b cxid 0x1ba004 zxid > > 0x78b5a setData '/path/to/node,(....),60221* > > immediately followed by > > *8/11/15 11:57:57 AM PDT session 0x14efe8c1ba2000b cxid 0x1ba00a zxid > > 0x78b5b error -110* > > > > Is -110 a NodeExists exception? A setData should not be in error if the > > node already exist no? > > > > > > On Fri, Aug 14, 2015 at 9:50 AM, Benjamin Jaton < > [email protected]> > > wrote: > > > >> This is what I have when I run the LogFormatter on the last tlog: > >> > >> *Exception in thread "main" java.io.IOException: CRC doesn't match > >> 1863394799 vs 480060806* > >> > >> Do you have any idea on what could cause this to happen? > >> > >> On Thu, Aug 13, 2015 at 4:16 PM, Flavio Junqueira <[email protected]> > wrote: > >> > >>> Hi Benjamin, > >>> > >>> Have you tried LogFormatter to check the file? I guess it won't work if > >>> your file is really corrupt, but it might be worth a shot. > Unfortunately, > >>> I'm not aware of a way around it other than deleting the file and and > >>> losing at least part of the transactions in the log. > >>> > >>> -Flavio > >>> > >>>> On 13 Aug 2015, at 17:27, Benjamin Jaton <[email protected]> > >>> wrote: > >>>> > >>>> Hello, > >>>> > >>>> Does anybody know how those "CRC check failed" errors can occur? > >>>> > >>>> Apparently the data has been corrupted, what would be the most likely > >>>> scenario for this to happen? > >>>> > >>>> java.io.IOException: CRC check failed > >>>> at > >>>> > >>> > org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:612) > >>>> > >>>> at > >>>> > >>> > org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:157) > >>>> > >>>> at > >>> > org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:223) > >>>> at > >>>> > >>> > org.apache.zookeeper.server.ZooKeeperServer.loadData(ZooKeeperServer.java:272) > >>>> > >>>> at > >>>> > >>> > org.apache.zookeeper.server.ZooKeeperServer.startdata(ZooKeeperServer.java:399) > >>>> > >>>> at > >>>> > >>> > org.apache.zookeeper.server.NIOServerCnxnFactory.startup(NIOServerCnxnFactory.java:122) > >>>> > >>>> at > >>>> > >>> > org.apache.zookeeper.server.ZooKeeperServerMain.runFromConfig(ZooKeeperServerMain.java:113) > >>>> > >>>> at > >>>> > >>> > org.apache.zookeeper.server.ZooKeeperServerMain.initializeAndRun(ZooKeeperServerMain.java:86) > >>>> > >>>> at > >>>> > >>> > org.apache.zookeeper.server.ZooKeeperServerMain.main(ZooKeeperServerMain.java:52) > >>>> > >>>> > >>>> As a result my ZK can't be started, is it recoverable? > >>>> > >>>> Thanks, > >>>> Ben > >>>> > >>>> ---------- Forwarded message ---------- > >>>> From: Apache Jenkins Server <[email protected]> > >>>> Date: Sat, Apr 26, 2014 at 11:35 PM > >>>> Subject: ZooKeeper_branch33_solaris - Build # 867 - Still Failing > >>>> To: [email protected] > >>>> > >>>> > >>>> See https://builds.apache.org/job/ZooKeeper_branch33_solaris/867/ > >>>> > >>>> > >>> > ################################################################################### > >>>> ########################## LAST 60 LINES OF THE CONSOLE > >>>> ########################### > >>>> [...truncated 301 lines...] > >>>> [junit] java.io.IOException: CRC check failed > >>>> [junit] at > >>>> > >>> > org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:611) > >>>> [junit] at > >>>> org.apache.zookeeper.server.CRCTest.testChecksums(CRCTest.java:165) > >>>> [junit] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native > >>>> Method) > >>>> [junit] at > >>>> > >>> > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) > >>>> [junit] at > >>>> > >>> > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) > >>>> [junit] at java.lang.reflect.Method.invoke(Method.java:597) > >>>> [junit] at junit.framework.TestCase.runTest(TestCase.java:168) > >>>> [junit] at junit.framework.TestCase.runBare(TestCase.java:134) > >>>> [junit] at > >>> junit.framework.TestResult$1.protect(TestResult.java:110) > >>>> [junit] at > >>>> junit.framework.TestResult.runProtected(TestResult.java:128) > >>>> [junit] at junit.framework.TestResult.run(TestResult.java:113) > >>>> [junit] at junit.framework.TestCase.run(TestCase.java:124) > >>>> [junit] at junit.framework.TestSuite.runTest(TestSuite.java:232) > >>>> [junit] at junit.framework.TestSuite.run(TestSuite.java:227) > >>>> [junit] at > >>>> > >>> > org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:83) > >>>> [junit] at > >>>> junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:39) > >>>> [junit] at > >>>> > >>> > org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:421) > >>>> [junit] at > >>>> > >>> > org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:912) > >>>> [junit] at > >>>> > >>> > org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:766) > >>>> [junit] 2014-04-27 06:35:01,089 - INFO [main:CRCTest@68] - > FINISHED > >>>> testChecksums > >>>> [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 54.234 > >>> sec > >>>> [junit] java.io.FileNotFoundException: > >>>> > >>> > /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper_branch33_solaris/trunk/junitvmwatcher1383325470.properties > >>>> (No such file or directory) > >>>> [junit] at java.io.FileInputStream.open(Native Method) > >>>> [junit] at > >>> java.io.FileInputStream.<init>(FileInputStream.java:120) > >>>> [junit] at java.io.FileReader.<init>(FileReader.java:55) > >>>> [junit] at > >>>> > >>> > org.apache.tools.ant.taskdefs.optional.junit.JUnitTask.executeAsForked(JUnitTask.java:1028) > >>>> [junit] at > >>>> > >>> > org.apache.tools.ant.taskdefs.optional.junit.JUnitTask.execute(JUnitTask.java:817) > >>>> [junit] at > >>>> > >>> > org.apache.tools.ant.taskdefs.optional.junit.JUnitTask.executeOrQueue(JUnitTask.java:1657) > >>>> [junit] at > >>>> > >>> > org.apache.tools.ant.taskdefs.optional.junit.JUnitTask.execute(JUnitTask.java:764) > >>>> [junit] at > >>>> org.apache.tools.ant.UnknownElement.execute(UnknownElement.java:288) > >>>> [junit] at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown > >>>> Source) > >>>> [junit] at > >>>> > >>> > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) > >>>> [junit] at java.lang.reflect.Method.invoke(Method.java:597) > >>>> [junit] at > >>>> > >>> > org.apache.tools.ant.dispatch.DispatchUtils.execute(DispatchUtils.java:105) > >>>> [junit] at org.apache.tools.ant.Task.perform(Task.java:348) > >>>> [junit] at org.apache.tools.ant.Target.execute(Target.java:357) > >>>> [junit] at > >>> org.apache.tools.ant.Target.performTasks(Target.java:385) > >>>> [junit] at > >>>> org.apache.tools.ant.Project.executeSortedTargets(Project.java:1329) > >>>> [junit] at > >>>> org.apache.tools.ant.Project.executeTarget(Project.java:1298) > >>>> [junit] at > >>>> > >>> > org.apache.tools.ant.helper.DefaultExecutor.executeTargets(DefaultExecutor.java:41) > >>>> [junit] at > >>>> org.apache.tools.ant.Project.executeTargets(Project.java:1181) > >>>> [junit] at org.apache.tools.ant.Main.runBuild(Main.java:698) > >>>> [junit] at org.apache.tools.ant.Main.startAnt(Main.java:199) > >>>> [junit] at > >>>> org.apache.tools.ant.launch.Launcher.run(Launcher.java:257) > >>>> [junit] at > >>>> org.apache.tools.ant.launch.Launcher.main(Launcher.java:104) > >>>> [junit] Running org.apache.zookeeper.server.DataTreeUnitTest > >>>> [junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0 sec > >>>> > >>>> BUILD FAILED > >>>> > >>> > /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper_branch33_solaris/trunk/build.xml:814: > >>>> Process fork failed. > >>>> > >>>> Total time: 2 minutes 54 seconds > >>>> Build step 'Invoke Ant' marked build as failure > >>>> [locks-and-latches] Releasing all the locks > >>>> [locks-and-latches] All the locks released > >>>> Recording test results > >>>> Email was triggered for: Failure > >>>> Sending email for trigger: Failure > >>>> > >>>> > >>>> > >>>> > >>> > ################################################################################### > >>>> ############################## FAILED TESTS (if any) > >>>> ############################## > >>>> 1 tests failed. > >>>> FAILED: org.apache.zookeeper.server.DataTreeUnitTest.unknown > >>>> > >>>> Error Message: > >>>> Forked Java VM exited abnormally. Please note the time in the report > >>> does > >>>> not reflect the time until the VM exit. > >>>> > >>>> Stack Trace: > >>>> junit.framework.AssertionFailedError: Forked Java VM exited > abnormally. > >>>> Please note the time in the report does not reflect the time until the > >>> VM > >>>> exit. > >>> > >>> > >> > >
