Well, color me shocked -- the verify found some bad data. It looks like two keys have bad checksums (which I assume is what created the UNDEFINEDs, too?).
CORRUPT 2
REFERENCED 2199999908
UNDEFINED 2
UNREFERENCED 874770

I ran two tabletservers on my desktop, turned on hflush instead of hsync, switched from GZ to snappy, upped the split threshold to 4g, and let CI run for ~5 hours. I killed the tservers about a dozen times by hand throughout the day (kill -9), and the master once or twice. The datanode was left alone. This was running on Hadoop 2.6.0-SNAPSHOT from around 9/14/2014.

The offending keys are:

389a85668b6ebf8e 2ff6:4a78 [] 1411499115242 3a10885b-d481-4d00-be00-0477e231ey65:000000008576b169:0cd98965c9ccc1d0:ba15529e

and

7e56b58a0c7df128 5fa0:6249 [] 1411499311578 3a10885b-d481-4d00-be00-0477e231e965:0000p000872d60eb:499fa72752d82a7c:5c5f19e8

Both were written a little after 3:00pm eastern (I stopped CI around 3:30pm eastern). I don't see anything immediately wrong in the tserver logs, and it doesn't appear that I restarted either tserver around the timestamps of the above keys. I see no errors in the DN logs in that time window either.

I don't have a clue how to even start looking at this to figure out whether something actually went wrong or whether it's some other sort of issue.

To be clear, this as it stands isn't sufficient to make me change my vote.

On Tue, Sep 23, 2014 at 3:04 PM, Josh Elser <[email protected]> wrote:
> +1
>
> * Verified checksums+sigs
> * Build from source tarball and ran all unit+functional tests against
> Apache Hadoop 2.5.1 and 2.6.0-SNAPSHOT
> * Ingested 2B records w/ CI + clean verify with single tserver (Apache
> Hadoop 2.6.0-SNAPSHOT + Apache ZooKeeper 3.4.5)
> * Ingested ~2.5B records w/ CI with 2 tservers and some manual
> agitation (Apache Hadoop 2.6.0-SNAPSHOT + Apache ZooKeeper 3.4.5)
> - Currently running verify, will report if I get a failed verify
> * Ran some Hive queries (w/ Apache Hive-0.14.0-SNAPSHOT & Apache Tez
> 0.6.0-SNAPSHOT)
> * Ran some Pig queries (w/ Apache Pig-0.13.0)
>
> Thanks for organizing this, Corey!!
>
> On Fri, Sep 19, 2014 at 10:49 PM, Corey Nolet <[email protected]> wrote:
>> Devs,
>>
>> Please consider the following candidate for Apache Accumulo 1.6.1
>>
>> Branch: 1.6.1-rc1
>> SHA1: 88c5473b3b49d797d3dabebd12fe517e9b248ba2
>> Staging Repository:
>> https://repository.apache.org/content/repositories/orgapacheaccumulo-1017/
>>
>> Source tarball:
>> http://repository.apache.org/content/repositories/orgapacheaccumulo-1017/org/apache/accumulo/accumulo/1.6.1/accumulo-1.6.1-src.tar.gz
>> Binary tarball:
>> http://repository.apache.org/content/repositories/orgapacheaccumulo-1017/org/apache/accumulo/accumulo/1.6.1/accumulo-1.6.1-bin.tar.gz
>> (Append ".sha1", ".md5" or ".asc" to download the signature/hash for a
>> given artifact.)
>>
>> Signing keys available at: https://www.apache.org/dist/accumulo/KEYS
>>
>> Over 1.6.1, we have 188 issues resolved
>> https://git-wip-us.apache.org/repos/asf?p=accumulo.git;a=blob;f=CHANGES;h=91b9d31e3b9dc53f1a576cc49bbc061919eb0070;hb=1.6.1-rc1
>>
>> Testing: All unit and functional tests are passing.
>>
>> Vote will be open until Thursday, September 25th 12:00AM UTC (9/24 8:00PM
>> ET, 9/24 5:00PM PT)
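
As a possible starting point for the two CORRUPT keys reported at the top of this message, here is a minimal sketch that scans a CI value for characters that should not be there and checks whether each one is a single bit flip away from a valid character. It assumes the CI value is laid out as colon-separated fields (uuid:count:prevRow:checksum) made of lowercase hex digits, with '-' allowed in the UUID field; that layout and the function names are assumptions for illustration, not something taken from the CI code.

```python
# Lowercase hex digits, which is what the CI value fields are assumed to contain.
HEX = set("0123456789abcdef")


def single_bit_neighbors(ch):
    """Return the hex digits reachable from ch by flipping exactly one bit."""
    code = ord(ch)
    candidates = []
    for bit in range(8):
        flipped = chr(code ^ (1 << bit))
        if flipped in HEX:
            candidates.append(flipped)
    return candidates


def suspicious_chars(value):
    """Return (field_index, offset, char, candidates) for each unexpected character.

    Assumption: fields are separated by ':' and hold lowercase hex digits,
    with '-' also allowed in field 0 (the UUID).
    """
    findings = []
    for i, field in enumerate(value.split(":")):
        for j, ch in enumerate(field):
            if ch in HEX or (i == 0 and ch == "-"):
                continue
            findings.append((i, j, ch, single_bit_neighbors(ch)))
    return findings


if __name__ == "__main__":
    values = [
        "3a10885b-d481-4d00-be00-0477e231ey65:000000008576b169:0cd98965c9ccc1d0:ba15529e",
        "3a10885b-d481-4d00-be00-0477e231e965:0000p000872d60eb:499fa72752d82a7c:5c5f19e8",
    ]
    for v in values:
        for finding in suspicious_chars(v):
            print(finding)
```

Run against the two values quoted above, this flags the 'y' in the UUID field of the first value and the 'p' in the count field of the second; each is one bit away from a hex digit ('9' and '0' respectively). That would be consistent with a single flipped bit per value, but it says nothing about where in the write or read path the flip happened.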
