I forgot to mention that I am working with Royston on this issue. We have made 
some progress...

We have managed to get rid of our version/classpath issue by compiling the Java 
client class on the namenode with the HBase classpath and running it from there 
(however, we then get errors on the map nodes because the client classes have 
not been distributed to them).

So if we compile and run on the namenode as follows:

[hadoop1@namenode src]$ javac uk/org/cse/ingestion/SampleUploader.java -cp  
`hbase classpath`
[hadoop1@namenode src]$ java -cp `hbase classpath` 
uk.org.cse.ingestion.SampleUploader sample.10.csv tomstable dat no 
1>class_output 2>&1

The output from this can be found here: http://pastebin.com/jn5e7E2K

Note that, in contrast to the previous runs and the jar-based run below, we get 
the appropriate ZooKeeper version in the client output:

12/04/11 12:20:12 INFO zookeeper.ZooKeeper: Client 
environment:zookeeper.version=3.4.3-1240972, built on 02/06/2012 10:48 GMT

In contrast to this, if we build a jar from the same place on the namenode and 
deploy using hadoop as follows:

jar cf SampleUploader.jar src/uk
HADOOP_CLASSPATH=`hbase classpath` hadoop jar SampleUploader.jar 
uk.org.cse.ingestion.SampleUploader sample.10.csv tomstable dat no 1>jar_output 
2>&1 

We get the following output: http://pastebin.com/xG98KfYe

The offending client output shows that we are using the old version of 
ZooKeeper:

12/04/11 12:03:25 INFO zookeeper.ZooKeeper: Client 
environment:zookeeper.version=3.3.3-1073969, built on 02/23/2011 22:27 GMT

It seems that the "hadoop" command is introducing the older client API jars 
into the classpath ahead of (or in place of) the latest ones. Re-running the 
successful class invocation above with the "hadoop" command in place of "java" 
produces the same error as the jar run:
 
HADOOP_CLASSPATH=`hbase classpath` hadoop uk.org.cse.ingestion.SampleUploader 
sample.10.csv tomstable dat no 1>class_output_with_hadoop 2>&1

The output (http://pastebin.com/8tfCqGgV) shows the incorrect zookeeper client 
in use.
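Classpath resolution is first-match-wins, so whichever ZooKeeper jar appears 
first determines the client version that gets loaded. One quick way to see 
which jar wins is to split the effective classpath into lines and look at the 
order. The sketch below uses a made-up classpath for illustration; on the 
cluster you would feed in the output of `hadoop classpath` instead:

```shell
# Hypothetical classpath: a stale ZooKeeper jar prepended ahead of the HBase one.
cp="/usr/local/zookeeper-3.3.3/zookeeper-3.3.3.jar:/usr/local/hbase/lib/zookeeper-3.4.3.jar"

# Split on ':' and list the zookeeper entries in resolution order; the first
# match is the jar the JVM will actually load the client classes from.
printf '%s\n' "$cp" | tr ':' '\n' | grep -n 'zookeeper'
```

Here the 3.3.3 jar is listed first, so that is the client that would run, 
regardless of which jars appear later on the classpath.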

Our hadoop-env.sh file has the following additions to the HADOOP_CLASSPATH: 

# Extra Java CLASSPATH elements.  Optional.
export HADOOP_CLASSPATH="$ZOOKEEPER_INSTALL/*:$HADOOP_CLASSPATH"
export HADOOP_CLASSPATH="$PIGDIR/*:$HADOOP_CLASSPATH"

Commenting out these lines and rerunning gets rid of the version errors! We are 
now getting what appears to be an unrelated local disk error in the job 
output, but our client version mismatch appears to have been resolved by two 
things:

1) We removed all references to HADOOP_CLASSPATH in hadoop-env.sh and replaced 
them with the following, so that any pre-existing HADOOP_CLASSPATH settings 
take precedence:

# Extra Java CLASSPATH elements.  Optional.
export HADOOP_CLASSPATH="$HADOOP_CLASSPATH:$ZOOKEEPER_INSTALL/*"
export HADOOP_CLASSPATH="$HADOOP_CLASSPATH:$PIGDIR/*"
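To sketch why the ordering matters (all paths below are made up for 
illustration): when the caller exports HADOOP_CLASSPATH=`hbase classpath` 
before invoking hadoop, the old prepend form in hadoop-env.sh buries the 
caller's jars behind the stale ones, while the append form keeps the caller's 
entries first:

```shell
# Hypothetical values standing in for the real cluster paths.
HADOOP_CLASSPATH="/usr/local/hbase/lib/zookeeper-3.4.3.jar"  # set by the caller
ZOOKEEPER_INSTALL="/usr/local/zookeeper-3.3.3"

# Old hadoop-env.sh form: prepends, so the stale ZK jars shadow the caller's.
OLD_CP="$ZOOKEEPER_INSTALL/*:$HADOOP_CLASSPATH"

# New form: appends, so whatever the caller set stays first and wins.
NEW_CP="$HADOOP_CLASSPATH:$ZOOKEEPER_INSTALL/*"

echo "old: $OLD_CP"
echo "new: $NEW_CP"
```

With the old form the first classpath entry is $ZOOKEEPER_INSTALL/*; with the 
new form it is the caller's HBase jar, which is why the version errors went 
away.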

2) We ran the job with the following (so that HADOOP_CLASSPATH contained all 
the appropriate HBase API jars):

HADOOP_CLASSPATH=`hbase classpath` hadoop jar SampleUploader.jar 
uk.org.cse.ingestion.SampleUploader sample.10.csv tomstable dat no

We are now dealing with the following error:

[sshexec] org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find 
any valid local directory for 
taskTracker/hadoop1/distcache/-6735763131868259398_188156722_559071878/namenode/tmp/mapred/staging/hadoop1/.staging/job_201204111219_0013/libjars/hbase-0.95-SNAPSHOT.jar
  [sshexec]     at 
org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:381)
  [sshexec]     at 
org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:146)
  [sshexec]     at 
org.apache.hadoop.filecache.TrackerDistributedCacheManager.getLocalCache(TrackerDistributedCacheManager.java:172)
  [sshexec]     at 
org.apache.hadoop.filecache.TaskDistributedCacheManager.setupCache(TaskDistributedCacheManager.java:187)
  [sshexec]     at 
org.apache.hadoop.mapred.TaskTracker$4.run(TaskTracker.java:1212)
  [sshexec]     at java.security.AccessController.doPrivileged(Native Method)
  [sshexec]     at javax.security.auth.Subject.doAs(Subject.java:396)
  [sshexec]     at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)
  [sshexec]     at 
org.apache.hadoop.mapred.TaskTracker.initializeJob(TaskTracker.java:1203)
  [sshexec]     at 
org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:1118)
  [sshexec]     at 
org.apache.hadoop.mapred.TaskTracker$5.run(TaskTracker.java:2430)
  [sshexec]     at java.lang.Thread.run(Thread.java:662)
  [sshexec]

Thanks,
Tom



-----Original Message-----
From: Tom Wilcox [mailto:[email protected]] 
Sent: 11 April 2012 10:36
To: [email protected]
Subject: RE: Not a host:port issue

I am not sure how to confirm which version of the HBase API the client is 
using. Although we are referencing the HBase-0.95-SNAPSHOT and zookeeper 3.4.3 
jars, we are still seeing the following message in the program output when 
building a job:

[sshexec] 12/04/11 10:35:13 INFO zookeeper.ZooKeeper: Client 
environment:zookeeper.version=3.3.3-1073969, built on 02/23/2011 22:27 GMT

Note that it is stating the zookeeper version as 3.3.3 and that it was built in 
February (whereas our referenced jars were built yesterday).

How should I be using the HBase jars with our HBase Client Java program to 
ensure that it is the latest version (and how can I properly confirm this)?

Thanks,
Tom

-----Original Message-----
From: Royston Sellman [mailto:[email protected]] 
Sent: 10 April 2012 18:38
To: [email protected]
Subject: RE: Not a host:port issue

The CLASSPATH(S) are here: http://pastebin.com/wbwEL9Li
Looks to me like the client is 0.95-SNAPSHOT as is our HBase server.
However I just noticed the client is built with ZK 3.4.3 but our ZK server is 
3.3.3. Is there any incompatibility between those versions of ZK? (I'm going to 
make them the same but that will take a few minutes :)

Thanks,
Royston



-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Stack
Sent: 10 April 2012 17:08
To: [email protected]
Subject: Re: Not a host:port issue

On Tue, Apr 10, 2012 at 2:58 AM, Royston Sellman 
<[email protected]> wrote:
>  [sshexec] java.lang.IllegalArgumentException: Not a host:port pair: 
>  [][][]
>

We changed how we persist names to zookeeper in 0.92.x.  It used to be a 
host:port but now is a ServerName which is host comma port comma startcode and 
all is prefixed with zk sequenceid.
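As a sketch of the new payload shape (with entirely made-up values), the 
ServerName portion splits on commas like this:

```shell
# Hypothetical ServerName as persisted in 0.92+: host,port,startcode
sn="regionserver1.example.com,60020,1333900000000"

host=${sn%%,*}                  # everything up to the first comma
port_and_start=${sn#*,}         # drop the host
port=${port_and_start%%,*}      # up to the next comma
startcode=${port_and_start#*,}  # remainder

echo "host=$host port=$port startcode=$startcode"
```

An old client still expecting a plain host:port string cannot parse this, 
which would produce exactly the "Not a host:port pair" exception above.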

It looks like your mapreduce job is using an old hbase client.  Is that 
possible?  Can you check its CLASSPATH?

St.Ack
