Hi, I added an initial proposal to https://issues.apache.org/jira/browse/HADOOP-6695, which should address George's use case.
Cheers

On Sat, Apr 10, 2010 at 4:01 PM, Ted Yu <yuzhih...@gmail.com> wrote:
> For the reason why the NPE wasn't thrown from TableInputFormatBase.getSplits(), I think the job tracker sent the job to 192.168.1.16 and TableInputFormatBase.table was null on that machine.
> I guess TableInputFormatBase.table depends on zookeeper to initialize.
>
> My two cents.
>
> On Sat, Apr 10, 2010 at 3:03 PM, George Stathis <gstat...@gmail.com> wrote:
>> On Sat, Apr 10, 2010 at 5:15 PM, Stack <st...@duboce.net> wrote:
>> > George:
>> >
>> > I think Edward was referring to the later paragraph where it says:
>> >
>> > "Another possibility, if for example you do not have access to hadoop-env.sh or are unable to restart the hadoop cluster, is bundling the hbase jars into a mapreduce job jar adding it and its dependencies under the job jar lib/ directory and the hbase conf into the job jars top-level directory."
>>
>> I see. My misunderstanding.
>>
>> > The above could be better phrased, especially as building the fat jar can take a bit of messing to get right.
>> >
>> > I haven't done one for hbase in a while. Maybe others have.
>> >
>> > From your note above, this is the interesting one:
>> >
>> > "If the zookeeper JAR is included in HADOOP_CLASSPATH, the ClassNotFoundExceptions go away, but then the original NPE re-appears:
>> > org.apache.hadoop.hbase.mapreduce.TableInputFormatBase$TableRecordReader.restart(TableInputFormatBase.java:110)"
>> >
>> > This is the null HTable? We're probably somehow suppressing the root cause of the NPE.
>>
>> This is the null HTable, yes. This is line 110 of TableInputFormatBase.java, inside the inner class TableRecordReader.restart() method:
>> ...
>> 110     this.scanner = this.htable.getScanner(newScan);
>> ...
>>
>> The TableRecordReader.htable variable is set inside TableInputFormatBase.createRecordReader() from the TableInputFormatBase.table variable. If that were null, then so would TableRecordReader.htable. The weird thing is that the NPE is thrown after the splits are done, from what I can see in the output logs. If that's true, then TableInputFormatBase.table cannot really be null; the NPE would be thrown at line 278 of the TableInputFormatBase.getSplits() method instead:
>> ...
>> 278     Pair<byte[][], byte[][]> keys = table.getStartEndKeys();
>> ...
>>
>> So, I'm a bit at a loss.
>>
>> > Is this TRUNK or head of the branch?
>>
>> This is the official hbase-0.20.3.tar.gz release as downloaded from a mirror. The hadoop version is the 0.20.2 tar as downloaded from a mirror.
>>
>> > If so, do you have > 1 zk servers in your ensemble? Just asking.
>>
>> All of this is happening on a single machine running on the pseudo-distributed setup (http://hadoop.apache.org/hbase/docs/current/api/overview-summary.html#pseudo-distrib) with a single instance of everything set to localhost. The built-in zk is used and controlled by HBase, as opposed to a separate zk installation. hbase.cluster.distributed is set to false.
>>
>> I'm curious: how would the number of zk instances be tied to this?
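To make the hypothesis above concrete: if the input format's initialization failure on the task side is caught and merely logged (for example because a jar needed to reach ZooKeeper is missing from the child task's classpath), the table reference stays null and the NPE only shows up later in the record reader, far from the real cause. The following is only a hypothetical sketch of that failure mode, not the actual HBase 0.20.3 source; the class and method names are made up for illustration.

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Scan;

    // Hypothetical illustration of a swallowed initialization failure.
    public class NullTableIllustration {

      private HTable table;  // stays null if initialization fails

      // Called when the framework configures the input format on the task side.
      public void setConf(Configuration conf, String tableName) {
        try {
          this.table = new HTable(new HBaseConfiguration(conf), tableName);
        } catch (Throwable t) {
          // If the failure is only logged here, nothing blows up yet...
          System.err.println("table initialization failed: " + t);
        }
      }

      // The analogue of TableRecordReader.restart(): the NPE surfaces here.
      public void restart(byte[] firstRow) throws IOException {
        Scan scan = new Scan(firstRow);
        // ...and the root cause is hidden: this line throws a
        // NullPointerException far away from where the real problem happened.
        this.table.getScanner(scan);
      }
    }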
>> >> >> > >> > St.Ack >> > >> > On Sat, Apr 10, 2010 at 10:44 AM, George Stathis <gstat...@gmail.com> >> > wrote: >> > > Actually, the HBase documentation discourages physically copying JARs >> > from >> > > the HBase classpath to the Hadoop one: >> > > >> > > From the HBase API documentation ( >> > > >> > >> http://hadoop.apache.org/hbase/docs/current/api/org/apache/hadoop/hbase/mapreduce/package-summary.html >> > > ): >> > > >> > > "HBase, MapReduce and the CLASSPATH >> > > >> > > MapReduce jobs deployed to a MapReduce cluster do not by default have >> > access >> > > to the HBase configuration under $HBASE_CONF_DIR nor to HBase classes. >> > You >> > > could add hbase-site.xml to $HADOOP_HOME/conf and add hbase jars to >> the >> > > $HADOOP_HOME/lib and copy these changes across your cluster *but a >> > cleaner >> > > means of adding hbase configuration and classes to the cluster >> CLASSPATH >> > is >> > > by uncommenting HADOOP_CLASSPATH in $HADOOP_HOME/conf/hadoop-env.sh >> > adding >> > > hbase dependencies here.*" >> > > >> > > It seems that the approach in bold is not sufficient and that not all >> > mapred >> > > jobs have access to the required JARs unless the first approach is >> taken. >> > > >> > > -GS >> > > >> > > On Sat, Apr 10, 2010 at 1:35 PM, Edward Capriolo < >> edlinuxg...@gmail.com >> > >wrote: >> > > >> > >> On Sat, Apr 10, 2010 at 1:31 PM, George Stathis <gstat...@gmail.com> >> > >> wrote: >> > >> >> > >> > Ted, >> > >> > >> > >> > HADOOP-6695 is an improvement request and a different issue from >> what >> > I >> > >> am >> > >> > encountering. What I am referring to is not a dynamic classloading >> > issue. >> > >> > It >> > >> > happens even after the servers are being restarted. You are >> requesting >> > >> for >> > >> > Hadoop to automatically detect new JARs without restarting when >> they >> > are >> > >> > placed in its' classpath. I'm saying that my MapRed jobs fail >> unless >> > some >> > >> > JARs are physically present in the hadoop lib directory, regardless >> of >> > >> > server restarts and HADOOP_CLASSPATH settings. >> > >> > >> > >> > I hope this clarifies things. >> > >> > >> > >> > -GS >> > >> > >> > >> > On Sat, Apr 10, 2010 at 1:11 PM, <yuzhih...@gmail.com> wrote: >> > >> > >> > >> > > I logged HADOOP-6695 >> > >> > > >> > >> > > Cheers >> > >> > > Sent from my Verizon Wireless BlackBerry >> > >> > > >> > >> > > -----Original Message----- >> > >> > > From: George Stathis <gstat...@gmail.com> >> > >> > > Date: Sat, 10 Apr 2010 12:11:37 >> > >> > > To: <hbase-user@hadoop.apache.org> >> > >> > > Subject: Re: org.apache.hadoop.hbase.mapreduce.Export fails with >> an >> > NPE >> > >> > > >> > >> > > OK, the issue remains in our Ubuntu EC2 dev environment, so it's >> not >> > >> just >> > >> > > my >> > >> > > local setup. 
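Aside: one way to pin down where the jars stop being visible is a throwaway map-only diagnostic job. The sketch below is illustrative only (the class name is made up); run over any small text input, it reports in the task logs whether the tasktracker child JVMs can actually see the zookeeper and hbase classes and what java.class.path they were handed.

    import java.io.IOException;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    // Throwaway mapper: logs what the child task JVM can see on its classpath.
    public class ClasspathProbeMapper
        extends Mapper<LongWritable, Text, NullWritable, NullWritable> {

      @Override
      protected void setup(Context context) {
        // What did the tasktracker actually hand to the child JVM?
        System.err.println("task classpath: " + System.getProperty("java.class.path"));
        for (String cls : new String[] {
            "org.apache.zookeeper.ZooKeeper",
            "org.apache.hadoop.hbase.client.HTable" }) {
          try {
            Class.forName(cls);
            System.err.println(cls + " is visible to this task");
          } catch (ClassNotFoundException e) {
            System.err.println(cls + " is NOT visible to this task");
          }
        }
      }

      @Override
      protected void map(LongWritable key, Text value, Context context)
          throws IOException, InterruptedException {
        // no-op: the diagnostics in setup() are all we need in the task logs
      }
    }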
Here are some more observations based on some tests >> I >> > just >> > >> > > ran: >> > >> > > >> > >> > > - If the zookeeper JAR is omitted from HADOOP_CLASSPATH, then >> > there >> > >> are >> > >> > > ClassNotFoundExceptions thrown as would be expected >> > >> > > - If the zookeeper JAR is included in HADOOP_CLASSPATH, >> > >> > > the ClassNotFoundExceptions go away, but then the original NPE >> > >> > > re-appears: >> > >> > > >> > >> > >> > >> >> > >> org.apache.hadoop.hbase.mapreduce.TableInputFormatBase$TableRecordReader.restart(TableInputFormatBase.java:110) >> > >> > > - If the zookeeper JAR in physically included in >> $HADOOP_HOME/lib, >> > >> then >> > >> > > the NPE goes away as well >> > >> > > >> > >> > > So, while it seems that the HADOOP_CLASSPATH is indeed being >> read, >> > >> > > something >> > >> > > is missing during the MapRed process that keeps the htable from >> > being >> > >> > > instantiated properly in TableInputFormatBase unless some JARs >> are >> > >> > > physically present in $HADOOP_HOME/lib. Note that this issue is >> not >> > >> > > specific >> > >> > > to the zookeeper JAR either. We have enabled the transactional >> > contrib >> > >> > > indexed tables and we have the same problem if we don't >> physically >> > >> > > include hbase-transactional-0.20.3.jar in the hadoop lib even >> though >> > >> it's >> > >> > > included in HADOOP_CLASSPATH. >> > >> > > >> > >> > > It feels like there is a discrepancy in the way classloading is >> done >> > >> > > between >> > >> > > the various components. But I'm not sure whether this is even an >> > HBase >> > >> > > issue >> > >> > > and not a Hadoop one. Seems like this might be a JIRA ticket >> > candidate. >> > >> > Any >> > >> > > thoughts on which project should look at this first? >> > >> > > >> > >> > > -GS >> > >> > > >> > >> > > On Fri, Apr 9, 2010 at 8:29 PM, George Stathis < >> gstat...@gmail.com> >> > >> > wrote: >> > >> > > >> > >> > > > Here is mine: >> > >> > > > >> > >> > > > export >> > >> > > > >> > >> > > >> > >> > >> > >> >> > >> HADOOP_CLASSPATH="$HBASE_HOME/hbase-0.20.3.jar:$HBASE_HOME/hbase-0.20.3-test.jar:$HBASE_HOME/lib/zookeeper-3.2.2.jar:$HBASE_HOME/conf" >> > >> > > > >> > >> > > > $HBASE_HOME is defined in my .bash_profile, so it's already >> there >> > and >> > >> I >> > >> > > see >> > >> > > > it expanded in the debug statements with the correct path. I >> even >> > >> tried >> > >> > > > hard-coding the $HBASE_HOME path above just in case and I had >> the >> > >> same >> > >> > > > issue. >> > >> > > > >> > >> > > > I any case, I'm passed it now. I'll have to check whether the >> same >> > >> > issue >> > >> > > > happens on our dev environment running on Ubuntu on EC2. If >> not, >> > then >> > >> > at >> > >> > > > least it's localized to my OSX environment. >> > >> > > > >> > >> > > > -GS >> > >> > > > >> > >> > > > >> > >> > > > On Fri, Apr 9, 2010 at 7:32 PM, Stack <st...@duboce.net> >> wrote: >> > >> > > > >> > >> > > >> Very odd. I don't have to do that running MR jobs. I wonder >> > whats >> > >> > > >> different? (I'm using 0.20.4 near-candidate rather than >> 0.20.3, >> > >> > > >> 1.6.0u14). I have a HADOOP_ENV like this. 
>> > >> > > >> >> > >> > > >> export HBASE_HOME=/home/hadoop/0.20 >> > >> > > >> export HBASE_VERSION=20.4-dev >> > >> > > >> #export >> > >> > > >> >> > >> > > >> > >> > >> > >> >> > >> HADOOP_CLASSPATH="$HBASE_HOME/conf:$HBASE_HOME/build/hbase-0.20.4-dev.jar:$HBASE_HOME/build/hbase-0.20.4-dev-test.jar:$HBASE_HOME/lib/zookeeper-3.2.2.jar" >> > >> > > >> export >> > >> > > >> >> > >> > > >> > >> > >> > >> >> > >> HADOOP_CLASSPATH="$HBASE_HOME/conf:$HBASE_HOME/build/hbase-0.${HBASE_VERSION}.jar:$HBASE_HOME/build/hbase-0.${HBASE_VERSION}-test.jar:$HBASE_HOME/lib/zookeeper-3.2.2.jar" >> > >> > > >> >> > >> > > >> St.Ack >> > >> > > >> >> > >> > > >> On Fri, Apr 9, 2010 at 4:19 PM, George Stathis < >> > gstat...@gmail.com> >> > >> > > >> wrote: >> > >> > > >> > Solved: for those interested, I had to explicitly copy >> > >> > > >> zookeeper-3.2.2.jar >> > >> > > >> > to $HADOOP_HOME/lib even though I had added its' path to >> > >> > > >> $HADOOP_CLASSPATH >> > >> > > >> > under $HADOOP_HOME/conf/hadoop-env.sh. >> > >> > > >> > >> > >> > > >> > It makes no sense to me why that particular JAR would not >> get >> > >> picked >> > >> > > up. >> > >> > > >> It >> > >> > > >> > was even listed in the classpath debug output when I ran the >> > job >> > >> > using >> > >> > > >> the >> > >> > > >> > hadoop shell script. If anyone can enlighten, please do. >> > >> > > >> > >> > >> > > >> > -GS >> > >> > > >> > >> > >> > > >> > On Fri, Apr 9, 2010 at 5:56 PM, George Stathis < >> > >> gstat...@gmail.com> >> > >> > > >> wrote: >> > >> > > >> > >> > >> > > >> >> No dice. Classpath is now set. Same error. Meanwhile, I'm >> > running >> > >> > "$ >> > >> > > >> hadoop >> > >> > > >> >> org.apache.hadoop.hbase.PerformanceEvaluation >> sequentialWrite >> > 1" >> > >> > just >> > >> > > >> fine, >> > >> > > >> >> so MapRed is working at least. >> > >> > > >> >> >> > >> > > >> >> Still looking for suggestions then I guess. >> > >> > > >> >> >> > >> > > >> >> -GS >> > >> > > >> >> >> > >> > > >> >> >> > >> > > >> >> On Fri, Apr 9, 2010 at 5:31 PM, George Stathis < >> > >> gstat...@gmail.com >> > >> > > >> > >> > > >> wrote: >> > >> > > >> >> >> > >> > > >> >>> RTFMing >> > >> > > >> >>> >> > >> > > >> >> > >> > > >> > >> > >> > >> >> > >> http://hadoop.apache.org/hbase/docs/current/api/org/apache/hadoop/hbase/mapreduce/package-summary.htmlright >> > >> > > >> >>> now...Hadoop classpath not being set properly could be the >> > >> > issue... >> > >> > > >> >>> >> > >> > > >> >>> >> > >> > > >> >>> On Fri, Apr 9, 2010 at 5:26 PM, George Stathis < >> > >> > gstat...@gmail.com >> > >> > > >> >wrote: >> > >> > > >> >>> >> > >> > > >> >>>> Hi folks, >> > >> > > >> >>>> >> > >> > > >> >>>> I hope this is just a newbie problem. >> > >> > > >> >>>> >> > >> > > >> >>>> Context: >> > >> > > >> >>>> - Running 0.20.3 tag locally in pseudo cluster mode >> > >> > > >> >>>> - $HBASE_HOME is in env and $PATH >> > >> > > >> >>>> - Running org.apache.hadoop.hbase.mapreduce.Export in the >> > shell >> > >> > > such >> > >> > > >> >>>> as: $ hbase org.apache.hadoop.hbase.mapreduce.Export >> > channels >> > >> > > >> >>>> /bkps/channels/01 >> > >> > > >> >>>> >> > >> > > >> >>>> Symptom: >> > >> > > >> >>>> - Getting an NPE at >> > >> > > >> >>>> >> > >> > > >> >> > >> > > >> > >> > >> > >> >> > >> org.apache.hadoop.hbase.mapreduce.TableInputFormatBase$TableRecordReader.restart(TableInputFormatBase.java:110): >> > >> > > >> >>>> >> > >> > > >> >>>> [...] 
>> > >> > > >> >>>> 110 this.scanner = this.htable.getScanner(newScan); >> > >> > > >> >>>> [...] >> > >> > > >> >>>> >> > >> > > >> >>>> Full output is bellow. Not sure why htable is still null >> at >> > >> that >> > >> > > >> point. >> > >> > > >> >>>> User error? >> > >> > > >> >>>> >> > >> > > >> >>>> Any help is appreciated. >> > >> > > >> >>>> >> > >> > > >> >>>> -GS >> > >> > > >> >>>> >> > >> > > >> >>>> Full output: >> > >> > > >> >>>> >> > >> > > >> >>>> $ hbase org.apache.hadoop.hbase.mapreduce.Export channels >> > >> > > >> >>>> /bkps/channels/01 >> > >> > > >> >>>> 2010-04-09 17:13:57.407::INFO: Logging to STDERR via >> > >> > > >> >>>> org.mortbay.log.StdErrLog >> > >> > > >> >>>> 2010-04-09 17:13:57.408::INFO: verisons=1, starttime=0, >> > >> > > >> >>>> endtime=9223372036854775807 >> > >> > > >> >>>> 10/04/09 17:13:58 DEBUG zookeeper.ZooKeeperWrapper: Read >> > ZNode >> > >> > > >> >>>> /hbase/root-region-server got 192.168.1.16:52159 >> > >> > > >> >>>> 10/04/09 17:13:58 DEBUG >> > client.HConnectionManager$TableServers: >> > >> > > Found >> > >> > > >> >>>> ROOT at 192.168.1.16:52159 >> > >> > > >> >>>> 10/04/09 17:13:58 DEBUG >> > client.HConnectionManager$TableServers: >> > >> > > >> Cached >> > >> > > >> >>>> location for .META.,,1 is 192.168.1.16:52159 >> > >> > > >> >>>> 10/04/09 17:13:58 DEBUG >> > client.HConnectionManager$TableServers: >> > >> > > >> Cached >> > >> > > >> >>>> location for channels,,1270753106916 is >> 192.168.1.16:52159 >> > >> > > >> >>>> 10/04/09 17:13:58 DEBUG >> > client.HConnectionManager$TableServers: >> > >> > > Cache >> > >> > > >> hit >> > >> > > >> >>>> for row <> in tableName channels: location server >> > >> > > 192.168.1.16:52159 >> > >> > > >> , >> > >> > > >> >>>> location region name channels,,1270753106916 >> > >> > > >> >>>> 10/04/09 17:13:58 DEBUG mapreduce.TableInputFormatBase: >> > >> > getSplits: >> > >> > > >> split >> > >> > > >> >>>> -> 0 -> 192.168.1.16:, >> > >> > > >> >>>> 10/04/09 17:13:58 INFO mapred.JobClient: Running job: >> > >> > > >> >>>> job_201004091642_0009 >> > >> > > >> >>>> 10/04/09 17:13:59 INFO mapred.JobClient: map 0% reduce >> 0% >> > >> > > >> >>>> 10/04/09 17:14:09 INFO mapred.JobClient: Task Id : >> > >> > > >> >>>> attempt_201004091642_0009_m_000000_0, Status : FAILED >> > >> > > >> >>>> java.lang.NullPointerException >> > >> > > >> >>>> at >> > >> > > >> >>>> >> > >> > > >> >> > >> > > >> > >> > >> > >> >> > >> org.apache.hadoop.hbase.mapreduce.TableInputFormatBase$TableRecordReader.restart(TableInputFormatBase.java:110) >> > >> > > >> >>>> at >> > >> > > >> >>>> >> > >> > > >> >> > >> > > >> > >> > >> > >> >> > >> org.apache.hadoop.hbase.mapreduce.TableInputFormatBase$TableRecordReader.init(TableInputFormatBase.java:119) >> > >> > > >> >>>> at >> > >> > > >> >>>> >> > >> > > >> >> > >> > > >> > >> > >> > >> >> > >> org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.createRecordReader(TableInputFormatBase.java:262) >> > >> > > >> >>>> at >> > >> > org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:588) >> > >> > > >> >>>> at >> org.apache.hadoop.mapred.MapTask.run(MapTask.java:305) >> > >> > > >> >>>> at org.apache.hadoop.mapred.Child.main(Child.java:170) >> > >> > > >> >>>> >> > >> > > >> >>>> 10/04/09 17:14:15 INFO mapred.JobClient: Task Id : >> > >> > > >> >>>> attempt_201004091642_0009_m_000000_1, Status : FAILED >> > >> > > >> >>>> java.lang.NullPointerException >> > >> > > >> >>>> at >> > >> > > >> >>>> >> > >> > > >> >> > >> > > >> > >> > >> > >> >> > >> 
org.apache.hadoop.hbase.mapreduce.TableInputFormatBase$TableRecordReader.restart(TableInputFormatBase.java:110) >> > >> > > >> >>>> at >> > >> > > >> >>>> >> > >> > > >> >> > >> > > >> > >> > >> > >> >> > >> org.apache.hadoop.hbase.mapreduce.TableInputFormatBase$TableRecordReader.init(TableInputFormatBase.java:119) >> > >> > > >> >>>> at >> > >> > > >> >>>> >> > >> > > >> >> > >> > > >> > >> > >> > >> >> > >> org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.createRecordReader(TableInputFormatBase.java:262) >> > >> > > >> >>>> at >> > >> > org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:588) >> > >> > > >> >>>> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305) >> > >> > > >> >>>> at org.apache.hadoop.mapred.Child.main(Child.java:170) >> > >> > > >> >>>> >> > >> > > >> >>>> 10/04/09 17:14:21 INFO mapred.JobClient: Task Id : >> > >> > > >> >>>> attempt_201004091642_0009_m_000000_2, Status : FAILED >> > >> > > >> >>>> java.lang.NullPointerException >> > >> > > >> >>>> at >> > >> > > >> >>>> >> > >> > > >> >> > >> > > >> > >> > >> > >> >> > >> org.apache.hadoop.hbase.mapreduce.TableInputFormatBase$TableRecordReader.restart(TableInputFormatBase.java:110) >> > >> > > >> >>>> at >> > >> > > >> >>>> >> > >> > > >> >> > >> > > >> > >> > >> > >> >> > >> org.apache.hadoop.hbase.mapreduce.TableInputFormatBase$TableRecordReader.init(TableInputFormatBase.java:119) >> > >> > > >> >>>> at >> > >> > > >> >>>> >> > >> > > >> >> > >> > > >> > >> > >> > >> >> > >> org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.createRecordReader(TableInputFormatBase.java:262) >> > >> > > >> >>>> at >> > >> > org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:588) >> > >> > > >> >>>> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305) >> > >> > > >> >>>> at org.apache.hadoop.mapred.Child.main(Child.java:170) >> > >> > > >> >>>> >> > >> > > >> >>>> 10/04/09 17:14:30 INFO mapred.JobClient: Job complete: >> > >> > > >> >>>> job_201004091642_0009 >> > >> > > >> >>>> 10/04/09 17:14:30 INFO mapred.JobClient: Counters: 3 >> > >> > > >> >>>> 10/04/09 17:14:30 INFO mapred.JobClient: Job Counters >> > >> > > >> >>>> 10/04/09 17:14:30 INFO mapred.JobClient: Launched map >> > >> tasks=4 >> > >> > > >> >>>> 10/04/09 17:14:30 INFO mapred.JobClient: Data-local >> map >> > >> > tasks=4 >> > >> > > >> >>>> 10/04/09 17:14:30 INFO mapred.JobClient: Failed map >> > tasks=1 >> > >> > > >> >>>> 10/04/09 17:14:30 DEBUG zookeeper.ZooKeeperWrapper: >> Closed >> > >> > > connection >> > >> > > >> >>>> with ZooKeeper >> > >> > > >> >>>> >> > >> > > >> >>>> >> > >> > > >> >>>> >> > >> > > >> >>>> >> > >> > > >> >>> >> > >> > > >> >> >> > >> > > >> > >> > >> > > >> >> > >> > > > >> > >> > > > >> > >> > > >> > >> > > >> > >> > >> > >> >> > >> I know that adding the hbase jars to the hadoop classpath is one of >> the >> > >> suggested methods. Personally I like the one big jar approach. >> Rational: >> > >> system administration. Say you are using Hadoop X.Y.Z and you are >> adding >> > >> this post install work, copying libraries, edit files. etc. Now when >> you >> > >> update HBase you have to do that work again, or you update hadoop and >> > you >> > >> have to do that work again. You are doubling your administrative >> > workload >> > >> every upgrade to either hive or hbase. >> > >> >> > >> On the other side of the coin, eclipse has a FAT JAR plugin that >> builds >> > one >> > >> big jar. Big jar means a little longer to start the job but that is >> > >> negligible. >> > >> >> > > >> > >> > >
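For completeness, a third option that the thread does not use is shipping the extra jars with the job through the DistributedCache (roughly what the -libjars generic option does for drivers that go through GenericOptionsParser). The sketch below is only an illustration under stated assumptions: the class name and HDFS paths are made up, and the jars must already have been copied into HDFS (for example with hadoop fs -put).

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.filecache.DistributedCache;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.mapreduce.Job;

    // Sketch: add hbase/zookeeper jars to every task's classpath via the cache.
    public class ExportWithCachedJars {
      public static void main(String[] args) throws Exception {
        Configuration conf = new HBaseConfiguration();

        // Make the hbase and zookeeper classes visible to every task JVM.
        // Paths are illustrative; the jars must already live in HDFS.
        DistributedCache.addFileToClassPath(new Path("/lib/hbase-0.20.3.jar"), conf);
        DistributedCache.addFileToClassPath(new Path("/lib/zookeeper-3.2.2.jar"), conf);

        Job job = new Job(conf, "export-with-cached-jars");
        // ...configure the input/output formats and mapper as usual, then:
        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }

Like the fat jar, this keeps $HADOOP_HOME/lib untouched, so upgrading Hadoop or HBase does not mean re-copying libraries across the cluster.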