I have run into problems with creating a shaded jar for a Spark job, where old versions of the Hadoop client libs ended up in the shaded jar.
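In case someone wants to verify this on their own job: below is a minimal sketch (my own, class name arbitrary) that logs which Hadoop client the JVM actually loaded, using the standard org.apache.hadoop.util.VersionInfo API. Run it on the driver and again inside a task to see what the executors picked up. A few more sketches for the points raised in the quoted thread follow at the bottom of this mail.

import org.apache.hadoop.util.VersionInfo;

public class HadoopClientCheck {
    public static void main(String[] args) {
        // Version string the Hadoop client libs on the classpath report.
        System.out.println("Hadoop client version: " + VersionInfo.getVersion());

        // The jar VersionInfo was loaded from. If this points into the
        // shaded Spark jar instead of the cluster's Hadoop install, the
        // shade plugin bundled its own (possibly old) client classes.
        System.out.println("Loaded from: "
            + VersionInfo.class.getProtectionDomain().getCodeSource().getLocation());
    }
}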
On Fri, Jul 8, 2016 at 6:03 PM, Keith Turner <[email protected]> wrote:
> Is Accumulo writing RFiles to the encrypted HDFS instance, and are those
> ok? If only the Spark job is having issues, maybe it is using a different
> Hadoop client lib or a different Hadoop config when it writes files.
>
> On Fri, Jul 8, 2016 at 5:29 PM, Russ Weeks <[email protected]> wrote:
>
>> Thanks, Bill and Josh! rfile-info was the tool I was looking for.
>>
>> It definitely seems related to HDFS encryption. I put a gist here
>> showing the results:
>>
>> https://gist.github.com/anonymous/13a8ba2c5f6794dc6207f1ae09b12825
>>
>> The two files contain exactly the same key-value pairs. They differ by
>> 2 bytes in the footer of the RFile. The file written to the encrypted
>> HDFS directory is consistently corrupt. I'm not confident yet that it's
>> always corrupt in the same place, because I see several different
>> errors, but in this case those 2 bytes were wrong.
>>
>> -Russ
>>
>> On Fri, Jul 8, 2016 at 12:30 PM Josh Elser <[email protected]> wrote:
>>
>>> Yeah, I'd lean towards something corrupting the file as well. We
>>> presently have two BCFile versions: 2.0 and 1.0. Both are supported by
>>> the code, so it should not be possible to create a bad RFile using our
>>> APIs (assuming correctness from the filesystem, anyways).
>>>
>>> I'm reminded of HADOOP-11674, but a quick check does show that it is
>>> fixed in your HDP-2.3.4 version (sorry for injecting $vendor here).
>>>
>>> Some other thoughts on how you could proceed:
>>>
>>> * Can Spark write the file to the local fs? Maybe you can rule out
>>>   HDFS w/ encryption as a contributing issue by writing directly to
>>>   local disk and then uploading the files to HDFS after the fact (as
>>>   a test).
>>> * `accumulo rfile-info` should fail in the same way if the metadata is
>>>   busted, as a way to verify things.
>>> * You can use rfile-info on both files, in HDFS and on the local fs
>>>   (tying into the first point).
>>> * If you can share one of the invalid files, we can rip it apart and
>>>   see what's going on.
>>>
>>> William Slacum wrote:
>>> > I wonder if the file isn't being decrypted properly. I don't see why
>>> > it would write out incompatible file versions.
>>> >
>>> > On Fri, Jul 8, 2016 at 3:02 PM, Josh Elser <[email protected]> wrote:
>>> >
>>> > Interesting! I have not run into this one before.
>>> >
>>> > You could use `accumulo rfile-info`, but I'd guess that would net
>>> > the same exception you see below.
>>> >
>>> > Let me see if I can dig a little into the code and come up with a
>>> > plausible explanation.
>>> >
>>> > Russ Weeks wrote:
>>> >
>>> > Hi, folks,
>>> >
>>> > Has anybody ever encountered a problem where the RFiles that are
>>> > generated by AccumuloFileOutputFormat can't be imported using
>>> > TableOperations.importDirectory?
>>> >
>>> > I'm seeing this problem very frequently for small RFiles and
>>> > occasionally for larger RFiles. The errors shown in the monitor's
>>> > log UI suggest a corrupt file to me. For instance, the stack trace
>>> > below shows a case where the BCFileVersion was incorrect, but
>>> > sometimes it will complain about an invalid length, a negative
>>> > offset, or an invalid codec.
>>> >
>>> > I'm using HDP Accumulo 1.7.0 (1.7.0.2.3.4.12-1) on an encrypted
>>> > HDFS volume, with Kerberos turned on. The RFiles are generated by
>>> > AccumuloFileOutputFormat from a Spark job.
>>> > A very small RFile that exhibits this problem is available here:
>>> >
>>> > http://firebar.newbrightidea.com/downloads/bad_rfiles/I0000waz.rf
>>> >
>>> > I'm pretty confident that the keys are being written to the RFile
>>> > in order. Are there any tools I could use to inspect the internal
>>> > structure of the RFile?
>>> >
>>> > Thanks,
>>> > -Russ
>>> >
>>> > Unable to find tablets that overlap file
>>> > hdfs://[redacted]/accumulo/data/tables/f/b-0000ze9/I0000zeb.rf
>>> > java.lang.RuntimeException: Incompatible BCFile fileBCFileVersion.
>>> > at org.apache.accumulo.core.file.rfile.bcfile.BCFile$Reader.<init>(BCFile.java:828)
>>> > at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.init(CachableBlockFile.java:246)
>>> > at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.getBCFile(CachableBlockFile.java:257)
>>> > at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.access$100(CachableBlockFile.java:137)
>>> > at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader$MetaBlockLoader.get(CachableBlockFile.java:209)
>>> > at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.getBlock(CachableBlockFile.java:313)
>>> > at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.getMetaBlock(CachableBlockFile.java:368)
>>> > at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.getMetaBlock(CachableBlockFile.java:137)
>>> > at org.apache.accumulo.core.file.rfile.RFile$Reader.<init>(RFile.java:843)
>>> > at org.apache.accumulo.core.file.rfile.RFileOperations.openReader(RFileOperations.java:79)
>>> > at org.apache.accumulo.core.file.DispatchingFileFactory.openReader(DispatchingFileFactory.java:69)
>>> > at org.apache.accumulo.server.client.BulkImporter.findOverlappingTablets(BulkImporter.java:644)
>>> > at org.apache.accumulo.server.client.BulkImporter.findOverlappingTablets(BulkImporter.java:615)
>>> > at org.apache.accumulo.server.client.BulkImporter$1.run(BulkImporter.java:146)
>>> > at org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
>>> > at org.apache.htrace.wrappers.TraceRunnable.run(TraceRunnable.java:57)
>>> > at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>>> > at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>>> > at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>> > at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>> > at org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
>>> > at java.lang.Thread.run(Thread.java:745)
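A few sketches for the points raised above. First, to reproduce the comparison in Russ's gist: a byte-by-byte diff of local copies of the two files (e.g. after `hadoop fs -get`) pins down the corrupt offsets. A minimal sketch, with the two paths passed as placeholder arguments:

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class RFileByteDiff {
    public static void main(String[] args) throws IOException {
        // args[0] / args[1]: local copies of the good and the bad RFile.
        byte[] good = Files.readAllBytes(Paths.get(args[0]));
        byte[] bad = Files.readAllBytes(Paths.get(args[1]));
        int len = Math.min(good.length, bad.length);
        for (int i = 0; i < len; i++) {
            if (good[i] != bad[i]) {
                System.out.printf("offset %d: %02x vs %02x%n", i, good[i], bad[i]);
            }
        }
        if (good.length != bad.length) {
            System.out.printf("lengths differ: %d vs %d%n", good.length, bad.length);
        }
    }
}

Given Russ's observation, this should report exactly two offsets near the end of the file.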
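Second, since the exception comes from BCFile's version check and the corrupt bytes sit in the footer, dumping the trailer directly may help. Going by my reading of BCFile.java, the file ends with a 4-byte version (major and minor shorts) followed by a 16-byte magic; treat that layout as an assumption, not a documented format:

import java.io.DataInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;

public class BCFileTrailerDump {
    public static void main(String[] args) throws IOException {
        File f = new File(args[0]); // local copy of the RFile
        try (DataInputStream in = new DataInputStream(new FileInputStream(f))) {
            // Assumed trailer layout: ... | version (2 shorts) | magic (16 bytes)
            in.skipBytes((int) (f.length() - 20)); // fine for small test files
            System.out.printf("BCFile version: %d.%d%n", in.readShort(), in.readShort());
            byte[] magic = new byte[16];
            in.readFully(magic);
            StringBuilder hex = new StringBuilder();
            for (byte b : magic) {
                hex.append(String.format("%02x", b));
            }
            System.out.println("magic: " + hex);
        }
    }
}

If the version prints as anything other than the 1.0 or 2.0 that Josh mentions, that alone would trigger the "Incompatible BCFile fileBCFileVersion" error.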
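Third, for Josh's suggestion of writing to local disk first: if the job uses the stock AccumuloFileOutputFormat, pointing the output at a file:// URI takes HDFS (and its encryption zone) out of the picture. A sketch assuming a JavaPairRDD<Key, Value> that is already globally sorted by key, with a hypothetical output path:

import org.apache.accumulo.core.client.mapreduce.AccumuloFileOutputFormat;
import org.apache.accumulo.core.data.Key;
import org.apache.accumulo.core.data.Value;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class LocalRFileWriteTest {
    // sortedKvs must be sorted by Key; RFiles require ordered writes.
    static void writeToLocalDisk(JavaSparkContext sc, JavaPairRDD<Key, Value> sortedKvs) {
        sortedKvs.saveAsNewAPIHadoopFile(
            "file:///tmp/rfile-test",      // local fs instead of hdfs://
            Key.class, Value.class,
            AccumuloFileOutputFormat.class,
            sc.hadoopConfiguration());
    }
}

The resulting files can then be checked with `accumulo rfile-info` locally and pushed into HDFS with `hadoop fs -put` before bulk-importing, per Josh's test.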
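Finally, for anyone following along, the failing step itself is the 1.7 bulk-import call, which looks roughly like this (table name and directories are placeholders; conn is an authenticated Connector):

import org.apache.accumulo.core.client.Connector;

public class BulkImportSketch {
    static void bulkImport(Connector conn) throws Exception {
        // The first directory holds the RFiles from AccumuloFileOutputFormat;
        // the failure directory receives any files that are rejected.
        conn.tableOperations().importDirectory(
            "mytable", "/tmp/bulk/files", "/tmp/bulk/failures", false);
    }
}

It is during this call that the servers open the RFiles (the BulkImporter.findOverlappingTablets frames in the stack trace above) and hit the version check.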
