I'm testing HBase 0.94.12 (Hadoop 1.0.4) in our systems and I get an NPE when setting up a bulk load. I'll start by noting that we run a Hadoop/HBase OSGi client with bundled versions of Hadoop and HBase.
We currently run in production with a bundled 0.94.2 client that we created, and everything works, although TableMapReduceUtil.addDependencyJars logs WARNs like "Could not find jar for class class... in order to ship it to the cluster". That happens because of class loader (CL) issues. It doesn't matter much, because those classes are available on the classpath of all nodes/region servers.

When testing HBase 0.94.12 (again bundled), I get the following NPE:

Caused by: java.lang.NullPointerException: null
    at java.io.File.<init>(File.java:251)
    at java.util.zip.ZipFile.<init>(ZipFile.java:115)
    at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.updateMap(TableMapReduceUtil.java:617)
    at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.findOrCreateJar(TableMapReduceUtil.java:597)
    at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.addDependencyJars(TableMapReduceUtil.java:557)
    at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.addDependencyJars(TableMapReduceUtil.java:518)
    at com.infolinks.hadoop.framework.InfolinksHFileOutputFormat.configureIncrementalLoad(InfolinksHFileOutputFormat.java:114)
    at com.infolinks.redmap.services.impl.dailybatchprocess.HadoopJobBulkLoader.initBulkLoad(HadoopJobBulkLoader.java:95)
    at com.infolinks.redmap.services.impl.dailybatchprocess.HadoopJobBulkLoader.preSplitAndInitBulkLoad(HadoopJobBulkLoader.java:60)
    at com.infolinks.redmap.services.impl.dailybatchprocess.UrlsBulkLoadProcess.init(UrlsBulkLoadProcess.java:40)
    ... 9 common frames omitted

Since getJar(my_class) (in TableMapReduceUtil) may return null, calling updateMap with a null jar throws an NPE. Shouldn't the code check null == jar before calling updateMap? Why not allow a failure to add dependency jars and just log a WARN?

Thanks,
Amit.
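
P.S. To make the suggestion concrete, here is a minimal sketch of the kind of guard I have in mind. It is not the actual TableMapReduceUtil code: the class name JarGuardSketch and the method updateMapIfPresent are hypothetical, and the jar-scanning loop is only my assumption of roughly what updateMap does. The point is just that a null jar path is logged and skipped instead of being passed to the ZipFile constructor, which is where the NPE in the trace above originates.

```java
import java.io.IOException;
import java.util.Enumeration;
import java.util.HashMap;
import java.util.Map;
import java.util.logging.Logger;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical stand-in, NOT the real TableMapReduceUtil.
final class JarGuardSketch {
    private static final Logger LOG =
        Logger.getLogger(JarGuardSketch.class.getName());

    /**
     * Records the class files found in the given jar, or logs a warning
     * and does nothing when no jar could be located (jar == null).
     * Returns true only if the jar was actually scanned.
     */
    static boolean updateMapIfPresent(String jar, Map<String, String> packagedClasses)
            throws IOException {
        if (jar == null || jar.isEmpty()) {
            // new ZipFile((String) null) would throw a NullPointerException
            // from File.<init>, so bail out before reaching it.
            LOG.warning("Could not find jar; skipping it instead of failing");
            return false;
        }
        try (ZipFile zip = new ZipFile(jar)) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                if (entry.getName().endsWith("class")) {
                    packagedClasses.put(entry.getName(), jar);
                }
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        Map<String, String> packagedClasses = new HashMap<>();
        // Simulates getJar(my_class) returning null under an OSGi class loader.
        boolean scanned = updateMapIfPresent(null, packagedClasses);
        System.out.println("scanned=" + scanned + ", entries=" + packagedClasses.size());
    }
}
```

With a guard like this, our OSGi setup would keep the 0.94.2 behaviour: a WARN per class that cannot be shipped, and the job setup continues.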
