Fixed. The issue was with HBase and the WAL. I shut down HBase, deleted the WAL, and things work fine now.
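For anyone who hits the same thing, the cleanup was roughly the following. This is a sketch, not a transcript: the exact WAL path depends on your hbase.rootdir (the path below is the HDP 2.2 default), and the stop/start commands depend on how you manage HBase (Ambari, init scripts, etc.). Note that deleting the WAL discards any edits not yet flushed to HFiles.

```shell
# 1. Stop HBase cleanly (via Ambari on HDP, or your service scripts).

# 2. Remove the WAL directories from HDFS.
#    /apps/hbase/data is the default hbase.rootdir on HDP 2.2 -- adjust to yours.
hdfs dfs -rm -r /apps/hbase/data/WALs
hdfs dfs -rm -r /apps/hbase/data/oldWALs

# 3. Start HBase again and re-run the bulk load.
```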
Ralph

From: Ralph Perko <[email protected]>
Reply-To: "[email protected]"
Date: Monday, March 30, 2015 at 10:10 AM
To: "[email protected]"
Subject: bulk load issue

Hi,

I recently ran into a new issue with the CSV bulk loader. The MapReduce jobs run fine, but the HBase loading portion then seems to get stuck in a cycle of RpcRetryingCaller retries on the index tables.

Sample output (there are many of these for all index tables):

15/03/30 09:55:21 INFO client.RpcRetryingCaller: Call exception, tries=20, retries=35, started=1424604 ms ago, cancelled=false, msg=row '' on table 'RAW_DATA_IDX' at region=RAW_DATA_IDX,\x09\x00\x00\x00\x00\x00\x00\x00,1427732357435.bfcbd84ad20046978ecf07b1a49b992c., hostname=server1,60020,1427726430154, seqNum=2
15/03/30 09:55:21 INFO client.RpcRetryingCaller: Call exception, tries=20, retries=35, started=1424636 ms ago, cancelled=false, msg=row '' on table 'RAW_DATA_IDX' at region=RAW_DATA_IDX,\x08\x00\x00\x00\x00\x00\x00\x00,1427732357435.bac4e9778524eac1de2c1bc6bba11fde., hostname=server1,60020,1427726430154, seqNum=2

I recently upgraded to Phoenix 4.3 (and Hortonworks 2.2 -- HBase 0.98.4.2.2.0.0); things worked prior to this.
Create statement (obfuscated a bit):

CREATE TABLE IF NOT EXISTS data_table (
    file_name VARCHAR NOT NULL,
    rec_num INTEGER NOT NULL,
    m.f1 VARCHAR,
    m.f2 VARCHAR,
    m.f3 VARCHAR,
    m.f4 VARCHAR,
    m.f5 VARCHAR,
    m.f6 VARCHAR,
    m.f7 VARCHAR
    CONSTRAINT pkey PRIMARY KEY (file_name, rec_num)
)
TTL='7776000', IMMUTABLE_ROWS=true, KEEP_DELETED_CELLS='false', COMPRESSION='SNAPPY', SALT_BUCKETS=10, SPLIT_POLICY='org.apache.hadoop.hbase.regionserver.ConstantSizeRegionSplitPolicy';

-- indexes
CREATE INDEX IF NOT EXISTS raw_data_idx ON data_table(m.f1)
    TTL='7776000', KEEP_DELETED_CELLS='false', COMPRESSION='SNAPPY', MAX_FILESIZE='1000000000', SPLIT_POLICY='org.apache.hadoop.hbase.regionserver.ConstantSizeRegionSplitPolicy';

CREATE INDEX IF NOT EXISTS data_table_f2f3_idx ON data_table(m.f2, m.f3)
    TTL='7776000', KEEP_DELETED_CELLS='false', COMPRESSION='SNAPPY', MAX_FILESIZE='1000000000', SPLIT_POLICY='org.apache.hadoop.hbase.regionserver.ConstantSizeRegionSplitPolicy';

CREATE INDEX IF NOT EXISTS data_table_f4f5_idx ON data_table(m.f4, m.f5)
    TTL='7776000', KEEP_DELETED_CELLS='false', COMPRESSION='SNAPPY', MAX_FILESIZE='1000000000', SPLIT_POLICY='org.apache.hadoop.hbase.regionserver.ConstantSizeRegionSplitPolicy';

CREATE INDEX IF NOT EXISTS data_table_f6f7_idx ON data_table(m.f6, m.f7)
    TTL='7776000', KEEP_DELETED_CELLS='false', COMPRESSION='SNAPPY', MAX_FILESIZE='1000000000', SPLIT_POLICY='org.apache.hadoop.hbase.regionserver.ConstantSizeRegionSplitPolicy';

Any thoughts on what could be causing this?

Thanks,
Ralph
