Hi,
Have you run the LoadIncrementalHFiles tool after ImportTsv?
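When -Dimporttsv.bulk.output is given, ImportTsv only writes HFiles into that directory; nothing reaches the table until those files are bulk loaded. Roughly, as a sketch using the jar and paths from your mail below (and assuming your hbase-site.xml is on the classpath):

HADOOP_CLASSPATH=`${HBASE_HOME}/bin/hbase classpath` ${HADOOP_HOME}/bin/hadoop \
    jar ${HBASE_HOME}/hbase-0.94.6.1.jar completebulkload \
    hdfs://cldx-1139-1033:9000/hbase/storefileoutput CUSTOMERS

completebulkload is the same LoadIncrementalHFiles tool under another name; you can also invoke it directly as ${HBASE_HOME}/bin/hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles <storefile dir> <table>. See http://hbase.apache.org/book/ops_mgt.html#completebulkload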
-Anoop-
________________________________________
From: Omkar Joshi [[email protected]]
Sent: Tuesday, April 16, 2013 12:01 PM
To: [email protected]
Subject: Data not loaded in table via ImportTSV
Hi,
The background thread is this:
http://mail-archives.apache.org/mod_mbox/hbase-user/201304.mbox/%3ce689a42b73c5a545ad77332a4fc75d8c1efbd80...@vshinmsmbx01.vshodc.lntinfotech.com%3E
I'm referring to the HBase doc:
http://hbase.apache.org/book/ops_mgt.html#importtsv
Accordingly, my command is:
HADOOP_CLASSPATH=`${HBASE_HOME}/bin/hbase classpath` ${HADOOP_HOME}/bin/hadoop \
    jar ${HBASE_HOME}/hbase-0.94.6.1.jar importtsv '-Dimporttsv.separator=;' \
    -Dimporttsv.columns=HBASE_ROW_KEY,CUSTOMER_INFO:NAME,CUSTOMER_INFO:EMAIL,CUSTOMER_INFO:ADDRESS,CUSTOMER_INFO:MOBILE \
    -Dimporttsv.bulk.output=hdfs://cldx-1139-1033:9000/hbase/storefileoutput \
    CUSTOMERS hdfs://cldx-1139-1033:9000/hbase/copiedFromLocal/customer.txt
..../*classpath echoed here*/
13/04/16 17:18:43 INFO zookeeper.ZooKeeper: Client environment:java.library.path=/home/hduser/hadoop_ecosystem/apache_hadoop/hadoop_installation/hadoop-1.0.4/libexec/../lib/native/Linux-amd64-64
13/04/16 17:18:43 INFO zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/tmp
13/04/16 17:18:43 INFO zookeeper.ZooKeeper: Client environment:java.compiler=<NA>
13/04/16 17:18:43 INFO zookeeper.ZooKeeper: Client environment:os.name=Linux
13/04/16 17:18:43 INFO zookeeper.ZooKeeper: Client environment:os.arch=amd64
13/04/16 17:18:43 INFO zookeeper.ZooKeeper: Client environment:os.version=3.2.0-23-generic
13/04/16 17:18:43 INFO zookeeper.ZooKeeper: Client environment:user.name=hduser
13/04/16 17:18:43 INFO zookeeper.ZooKeeper: Client environment:user.home=/home/hduser
13/04/16 17:18:43 INFO zookeeper.ZooKeeper: Client environment:user.dir=/home/hduser/hadoop_ecosystem/apache_hbase/hbase_installation/hbase-0.94.6.1/bin
13/04/16 17:18:43 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=cldx-1140-1034:2181 sessionTimeout=180000 watcher=hconnection
13/04/16 17:18:43 INFO zookeeper.ClientCnxn: Opening socket connection to server cldx-1140-1034/172.25.6.71:2181. Will not attempt to authenticate using SASL (unknown error)
13/04/16 17:18:43 INFO zookeeper.RecoverableZooKeeper: The identifier of this process is 5483@cldx-1139-1033
13/04/16 17:18:43 INFO zookeeper.ClientCnxn: Socket connection established to cldx-1140-1034/172.25.6.71:2181, initiating session
13/04/16 17:18:43 INFO zookeeper.ClientCnxn: Session establishment complete on server cldx-1140-1034/172.25.6.71:2181, sessionid = 0x13def2889530023, negotiated timeout = 180000
13/04/16 17:18:44 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=cldx-1140-1034:2181 sessionTimeout=180000 watcher=catalogtracker-on-org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@34d03009
13/04/16 17:18:44 INFO zookeeper.RecoverableZooKeeper: The identifier of this process is 5483@cldx-1139-1033
13/04/16 17:18:44 INFO zookeeper.ClientCnxn: Opening socket connection to server cldx-1140-1034/172.25.6.71:2181. Will not attempt to authenticate using SASL (unknown error)
13/04/16 17:18:44 INFO zookeeper.ClientCnxn: Socket connection established to cldx-1140-1034/172.25.6.71:2181, initiating session
13/04/16 17:18:44 INFO zookeeper.ClientCnxn: Session establishment complete on server cldx-1140-1034/172.25.6.71:2181, sessionid = 0x13def2889530024, negotiated timeout = 180000
13/04/16 17:18:44 INFO zookeeper.ZooKeeper: Session: 0x13def2889530024 closed
13/04/16 17:18:44 INFO zookeeper.ClientCnxn: EventThread shut down
13/04/16 17:18:44 INFO mapreduce.HFileOutputFormat: Looking up current regions for table org.apache.hadoop.hbase.client.HTable@238cfdf
13/04/16 17:18:44 INFO mapreduce.HFileOutputFormat: Configuring 1 reduce partitions to match current region count
13/04/16 17:18:44 INFO mapreduce.HFileOutputFormat: Writing partition information to hdfs://cldx-1139-1033:9000/user/hduser/partitions_4159cd24-b8ff-4919-854b-a7d1da5069ad
13/04/16 17:18:44 INFO util.NativeCodeLoader: Loaded the native-hadoop library
13/04/16 17:18:44 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
13/04/16 17:18:44 INFO compress.CodecPool: Got brand-new compressor
13/04/16 17:18:44 INFO mapreduce.HFileOutputFormat: Incremental table output configured.
13/04/16 17:18:47 INFO input.FileInputFormat: Total input paths to process : 1
13/04/16 17:18:47 WARN snappy.LoadSnappy: Snappy native library not loaded
13/04/16 17:18:47 INFO mapred.JobClient: Running job: job_201304091909_0010
13/04/16 17:18:48 INFO mapred.JobClient:  map 0% reduce 0%
13/04/16 17:19:07 INFO mapred.JobClient:  map 100% reduce 0%
13/04/16 17:19:19 INFO mapred.JobClient:  map 100% reduce 100%
13/04/16 17:19:24 INFO mapred.JobClient: Job complete: job_201304091909_0010
13/04/16 17:19:24 INFO mapred.JobClient: Counters: 30
13/04/16 17:19:24 INFO mapred.JobClient:   Job Counters
13/04/16 17:19:24 INFO mapred.JobClient:     Launched reduce tasks=1
13/04/16 17:19:24 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=16567
13/04/16 17:19:24 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
13/04/16 17:19:24 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
13/04/16 17:19:24 INFO mapred.JobClient:     Launched map tasks=1
13/04/16 17:19:24 INFO mapred.JobClient:     Data-local map tasks=1
13/04/16 17:19:24 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=10953
13/04/16 17:19:24 INFO mapred.JobClient:   ImportTsv
13/04/16 17:19:24 INFO mapred.JobClient:     Bad Lines=0
13/04/16 17:19:24 INFO mapred.JobClient:   File Output Format Counters
13/04/16 17:19:24 INFO mapred.JobClient:     Bytes Written=1984
13/04/16 17:19:24 INFO mapred.JobClient:   FileSystemCounters
13/04/16 17:19:24 INFO mapred.JobClient:     FILE_BYTES_READ=1753
13/04/16 17:19:24 INFO mapred.JobClient:     HDFS_BYTES_READ=563
13/04/16 17:19:24 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=74351
13/04/16 17:19:24 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=1984
13/04/16 17:19:24 INFO mapred.JobClient:   File Input Format Counters
13/04/16 17:19:24 INFO mapred.JobClient:     Bytes Read=433
13/04/16 17:19:24 INFO mapred.JobClient:   Map-Reduce Framework
13/04/16 17:19:24 INFO mapred.JobClient:     Map output materialized bytes=1600
13/04/16 17:19:24 INFO mapred.JobClient:     Map input records=5
13/04/16 17:19:24 INFO mapred.JobClient:     Reduce shuffle bytes=0
13/04/16 17:19:24 INFO mapred.JobClient:     Spilled Records=10
13/04/16 17:19:24 INFO mapred.JobClient:     Map output bytes=1574
13/04/16 17:19:24 INFO mapred.JobClient:     Total committed heap usage (bytes)=212664320
13/04/16 17:19:24 INFO mapred.JobClient:     CPU time spent (ms)=4780
13/04/16 17:19:24 INFO mapred.JobClient:     Combine input records=0
13/04/16 17:19:24 INFO mapred.JobClient:     SPLIT_RAW_BYTES=130
13/04/16 17:19:24 INFO mapred.JobClient:     Reduce input records=5
13/04/16 17:19:24 INFO mapred.JobClient:     Reduce input groups=5
13/04/16 17:19:24 INFO mapred.JobClient:     Combine output records=0
13/04/16 17:19:24 INFO mapred.JobClient:     Physical memory (bytes) snapshot=279982080
13/04/16 17:19:24 INFO mapred.JobClient:     Reduce output records=20
13/04/16 17:19:24 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=2010615808
13/04/16 17:19:24 INFO mapred.JobClient:     Map output records=5
As seen, there aren't any bad lines, and the mapper output 5 records (the source text file has 5 records).
HDFS shows the following:
hduser@cldx-1139-1033:~/hadoop_ecosystem/apache_hbase/hbase_installation/hbase-0.94.6.1/bin$ hadoop fs -ls /hbase
Warning: $HADOOP_HOME is deprecated.
Found 12 items
drwxr-xr-x   - hduser supergroup          0 2013-04-09 19:47 /hbase/-ROOT-
drwxr-xr-x   - hduser supergroup          0 2013-04-09 19:47 /hbase/.META.
drwxr-xr-x   - hduser supergroup          0 2013-04-16 16:02 /hbase/.archive
drwxr-xr-x   - hduser supergroup          0 2013-04-09 19:47 /hbase/.logs
drwxr-xr-x   - hduser supergroup          0 2013-04-09 19:47 /hbase/.oldlogs
drwxr-xr-x   - hduser supergroup          0 2013-04-16 16:05 /hbase/.tmp
drwxr-xr-x   - hduser supergroup          0 2013-04-16 16:05 /hbase/CUSTOMERS
drwxr-xr-x   - hduser supergroup          0 2013-04-16 17:14 /hbase/copiedFromLocal
-rw-r--r--   4 hduser supergroup         38 2013-04-09 19:47 /hbase/hbase.id
-rw-r--r--   4 hduser supergroup          3 2013-04-09 19:47 /hbase/hbase.version
drwxr-xr-x   - hduser supergroup          0 2013-04-16 17:19 /hbase/storefileoutput
drwxr-xr-x   - hduser supergroup          0 2013-04-09 22:03 /hbase/users
hduser@cldx-1139-1033:~/hadoop_ecosystem/apache_hbase/hbase_installation/hbase-0.94.6.1/bin$ hadoop fs -ls /hbase/storefileoutput
Warning: $HADOOP_HOME is deprecated.
Found 3 items
drwxr-xr-x   - hduser supergroup          0 2013-04-16 17:19 /hbase/storefileoutput/CUSTOMER_INFO
-rw-r--r--   4 hduser supergroup          0 2013-04-16 17:19 /hbase/storefileoutput/_SUCCESS
drwxr-xr-x   - hduser supergroup          0 2013-04-16 17:18 /hbase/storefileoutput/_logs
hduser@cldx-1139-1033:~/hadoop_ecosystem/apache_hbase/hbase_installation/hbase-0.94.6.1/bin$ hadoop fs -ls /hbase/storefileoutput/CUSTOMER_INFO
Warning: $HADOOP_HOME is deprecated.
Found 1 items
-rw-r--r--   4 hduser supergroup       1984 2013-04-16 17:19 /hbase/storefileoutput/CUSTOMER_INFO/64a822e4ff82456785740925eccd392f
But no rows were inserted into the CUSTOMERS table:
hduser@cldx-1139-1033:~$ $HBASE_HOME/bin/hbase shell
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 0.94.6.1, r1464658, Thu Apr 4 10:58:50 PDT 2013
hbase(main):001:0> scan 'CUSTOMERS'
ROW COLUMN+CELL
0 row(s) in 0.8240 seconds
Do I need to execute some additional step (completebulkload?) to push the data? I'm not sure whether this is required.
Regards,
Omkar Joshi