On Thu, Oct 1, 2015 at 12:32 PM, Tahir Hameed <[email protected]> wrote:
> I tried the above method, but I'm using version 0.98.4.2.2.4.4-16-hadoop2
> for HBase on the cluster, and version 0.12.0-hadoop2 for Apache Crunch. I've
> tried using 0.11.0-hadoop2 by applying a patch for CRUNCH-536, but I
> fall into other errors. I haven't been able to find a git release for
> 0.12.0-hadoop2 to add the CRUNCH-536 changes to it.
You can check out the 0.13 release from here: https://github.com/apache/crunch/tree/apache-crunch-0.13.0 (starting from 0.13, Crunch is hadoop2-only). This release includes the changes for CRUNCH-536.

> Also, I am already using TableMapReduceUtil.initCredentials(mrJob.getJob())
> in my own code as well for all tables I read. I read the table and convert
> it into a readable instance to be accessed in another DoFn. I have two
> pipelines; one of them works absolutely fine (no errors), while the other
> one has Kerberos authentication errors. The only difference I see is the
> use of a PTable in one and a PGroupedTable in the other.

The use of a PTable vs. a PGroupedTable affects the topology of the job graph, which also changes what gets executed in which job. That can definitely have an effect on the use of ReadableData. From the stack trace that you posted, it definitely looks like an HTable is being read from within the initialize method of your DoFn -- even though your Crunch code probably defines the HBase-based PTable earlier, it's only read as needed (which appears to be in the initialize method).

> Would sharing the code, for instance, be more helpful in identifying the
> problem?

Yes, definitely. If you can share a minimal example of the full code that works and a version that doesn't, it'll probably help a lot in resolving this.

- Gabriel
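As an illustration of the credential setup discussed above, here is a minimal sketch (not from the thread) of attaching HBase delegation tokens to every MapReduce job in a Crunch plan, not just the first one. It assumes Crunch's MRJob interface exposing getJob(), as in the quoted snippet; the helper class and method names are hypothetical, and the code needs the crunch-core and hbase-server jars on the classpath:

    // Hypothetical helper: obtain an HBase delegation token for each planned
    // MapReduce job so that tasks in later jobs of the pipeline can also
    // authenticate against HBase under Kerberos.
    import java.io.IOException;
    import org.apache.crunch.impl.mr.MRJob;
    import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;

    public final class HBaseCredentialsHelper {
      public static void addTokens(Iterable<MRJob> plannedJobs) throws IOException {
        for (MRJob mrJob : plannedJobs) {
          // initCredentials obtains a delegation token for the current user
          // and stores it in the job's credentials.
          TableMapReduceUtil.initCredentials(mrJob.getJob());
        }
      }
    }

If only one of two otherwise-similar pipelines fails, comparing which jobs in each plan actually carry the token is a reasonable first check.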
