Hello – here is more info if needed. This table has a lot of data and regions.
Could it be related to this? Tests with a new table do not fail.
My apologies for the wonky log format and the PDF – I am unable to copy the
logs over as text on this system.
Thanks,
Table def:
CREATE TABLE IF NOT EXISTS log_table
(
fst VARCHAR NOT NULL,
f VARCHAR NOT NULL,
r INTEGER NOT NULL,
ft INTEGER,
av INTEGER,
s VARCHAR,
sa VARCHAR,
da VARCHAR,
ttl BIGINT,
rqt INTEGER,
fl BIGINT,
rqdn VARCHAR,
rfdn VARCHAR,
ra VARCHAR,
pf INTEGER,
ans VARCHAR,
aus VARCHAR,
ts INTEGER,
lst VARCHAR,
sc VARCHAR,
so VARCHAR,
sla DOUBLE,
slo DOUBLE,
dc VARCHAR,
do VARCHAR,
dla DOUBLE,
dlo DOUBLE,
rc VARCHAR,
ro VARCHAR,
rla DOUBLE,
rlo DOUBLE,
CONSTRAINT pkey PRIMARY KEY (fst, f, r)
)
TTL='5616000',KEEP_DELETED_CELLS='false',IMMUTABLE_ROWS=true,COMPRESSION='SNAPPY',SALT_BUCKETS=40,MAX_FILESIZE='10000000000',SPLIT_POLICY='org.apache.hadoop.hbase.regionserver.ConstantSizeRegionSplitPolicy';
CREATE INDEX IF NOT EXISTS log_table_site_idx ON log_table(s)
TTL='5616000',KEEP_DELETED_CELLS='false',COMPRESSION='SNAPPY',MAX_FILESIZE='10000000000',SPLIT_POLICY='org.apache.hadoop.hbase.regionserver.ConstantSizeRegionSplitPolicy';
CREATE INDEX IF NOT EXISTS log_table_addr_idx ON log_table(sa,da,ra)
TTL='5616000',KEEP_DELETED_CELLS='false',COMPRESSION='SNAPPY',MAX_FILESIZE='10000000000',SPLIT_POLICY='org.apache.hadoop.hbase.regionserver.ConstantSizeRegionSplitPolicy';
CREATE INDEX IF NOT EXISTS log_table_cc_idx ON log_table(sc,dc,rc)
TTL='5616000',KEEP_DELETED_CELLS='false',COMPRESSION='SNAPPY',MAX_FILESIZE='10000000000',SPLIT_POLICY='org.apache.hadoop.hbase.regionserver.ConstantSizeRegionSplitPolicy';
CREATE INDEX IF NOT EXISTS log_table_o_idx ON log_table(so,do,ro)
TTL='5616000',KEEP_DELETED_CELLS='false',COMPRESSION='SNAPPY',MAX_FILESIZE='10000000000',SPLIT_POLICY='org.apache.hadoop.hbase.regionserver.ConstantSizeRegionSplitPolicy';
CREATE INDEX IF NOT EXISTS log_table_rqfqdn_idx ON log_table(rqdn)
TTL='5616000',KEEP_DELETED_CELLS='false',COMPRESSION='SNAPPY',MAX_FILESIZE='10000000000',SPLIT_POLICY='org.apache.hadoop.hbase.regionserver.ConstantSizeRegionSplitPolicy';
CREATE INDEX IF NOT EXISTS log_table_refqdn_idx ON log_table(rfdn)
TTL='5616000',KEEP_DELETED_CELLS='false',COMPRESSION='SNAPPY',MAX_FILESIZE='10000000000',SPLIT_POLICY='org.apache.hadoop.hbase.regionserver.ConstantSizeRegionSplitPolicy';
From: "Perko, Ralph J" <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Monday, July 6, 2015 at 3:43 PM
To: "[email protected]" <[email protected]>
Subject: out of memory - unable to create new native thread
Hi,
I am using a Pig script to regularly load data into HBase/Phoenix. Recently,
"OutOfMemoryError: unable to create new native thread" errors have cropped up
in the Pig-generated MR job – specifically in the mapper tasks. I have not seen
this before, and it only occurs on this one data load; other similar scripts
complete successfully. We also recently upgraded to Phoenix 4.4. My
understanding is that this error is less about heap memory and more about
resource availability from the OS.
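For reference, a quick way to check the relevant OS limits on a task node
(assumes Linux; names and paths below are standard, but which limit is the
culprit will vary by setup):

```shell
# Check OS-level thread limits on a task node (Linux).
ulimit -u                          # max processes/threads for this user
cat /proc/sys/kernel/threads-max   # system-wide thread limit
cat /proc/sys/kernel/pid_max       # PID space shared by all threads
ps -eLf | wc -l                    # rough count of threads currently running
```

If the running-thread count is anywhere near `ulimit -u` for the user the
task trackers run as, thread creation will fail with exactly this error.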
The attached PDF contains some of the relevant log entries from the MR job.
Any thoughts on what could be causing this?
Thanks,
Ralph
Phoenix 4.4.0
HBase 0.98.4
Pig script:
register $phoenix_jar;
register $project_jar;
register $piggybank_jar;
SET job.name 'Load $data into $table_name';
Z = load '$data' USING org.apache.pig.piggybank.storage.CSVExcelStorage() as (
f:chararray,
r:int,
...
);
X = FILTER Z BY f is not null AND r is not null and fst is not null;
D = foreach X generate
fst,
gov.pnnl.pig.Format(f),
r,
...
;
STORE D into 'hbase://$table_name/FST,F,R,...' using
org.apache.phoenix.pig.PhoenixHBaseStorage('$zookeeper','-batchSize 500');
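One knob sometimes lowered when the Phoenix client inside the mappers runs
out of native threads is the Phoenix query thread pool. A sketch, assuming
Pig's SET statements reach the job configuration that the embedded Phoenix
client reads; phoenix.query.threadPoolSize is a standard Phoenix client
property, and the value 64 is illustrative only, not a recommendation:

SET phoenix.query.threadPoolSize 64;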