Unfortunately, without already knowing that that is the cause, it is difficult to get to that point. Container logs, NodeManager logs: nothing indicated anything incorrect was happening other than the inconsistent export/rowcounter results. I had reviewed all the HBase/YARN/HDFS bugs in the list but didn't see one that seemed like a smoking gun, just a bunch of possible ones. My ignorance of the inner workings of HBase/YARN likely played a big part in that, though. I do appreciate you pointing out 'the one'!
From: Ted Yu
Sent: Tuesday, February 20, 11:15 PM
Subject: Re: Inconsistent rows exported/counted when looking at a set, unchanged past time frame.
To: user@hbase.apache.org

If you look at https://www.cloudera.com/documentation/enterprise/release-notes/topics/cdh_rn_fixed_in_58.html#fixed_issues585 , you would see the following:

HBASE-15378 - Scanner cannot handle heartbeat message with no results

which fixed what you observed in the previous release.

FYI

On Tue, Feb 20, 2018 at 9:07 PM, Andrew Kettmann <andrew.kettm...@evolve24.com> wrote:

> Josh,
>
> We upgraded from CDH 5.8.0 -> 5.8.5, which seems to have fixed the issue. Three rowcounts in a row that were not consistent before on a static table are now consistent. We are doing some further testing, but it looks like you called it with:
>
> 'scans on RegionServers stop prematurely before all of the data is read'
>
> Thanks for the pointer in that direction; I was bashing my face against this for two weeks trying to figure out this inconsistency. I appreciate the clue!
>
> Andrew Kettmann
> Consultant, Platform Services Group
>
> -----Original Message-----
> From: Josh Elser [mailto:els...@apache.org]
> Sent: Monday, February 12, 2018 11:59 AM
> To: user@hbase.apache.org
> Subject: Re: Inconsistent rows exported/counted when looking at a set, unchanged past time frame.
>
> Hi Andrew,
>
> Yes. The answer is, of course, that you should see consistent results from HBase if there are no mutations in flight to that table. Whether you're reading "current" or "back-in-time", as long as you're not dealing with raw scans (where compactions may persist delete tombstones), this should hold just the same.
>
> Are you modifying older cells with newer data when you insert data? Remember that MAX_VERSIONS for a table defaults to 1. Consider the following:
>
> * Timestamps are of the form "tX", and t1 < t2 < t3 < ...
> * You are querying the time range [t1, t5].
> * You have a cell for "row1" at t3 with value "foo".
> * RowCounter over [t1, t5] would return "1".
> * Your ingest writes a new cell for "row1" of "bar" at t6.
> * RowCounter over [t1, t5] would return "0" normally, or "1" if you use RAW scans ***
> * A compaction would run over the region containing "row1".
> * RowCounter over [t1, t5] would return "0" (RAW or normal).
>
> It's also possible that you're hitting some sort of bug around missing records at query time. I'm not sure which upstream versions the CDH versions you're using line up to, but there have certainly been issues in the past around query-time data loss (e.g. scans on RegionServers stopping prematurely before all of the data is read).
>
> Good luck!
>
> *** Going off of memory here. I think this is how it works, but you should be able to test easily ;)
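Josh's footnote invites testing, and his walkthrough above is easy to reproduce in the HBase shell. A minimal sketch follows; the table and family names ('tsdemo', 'f') are hypothetical, small integer timestamps stand in for t1..t6, and note that a shell TIMERANGE upper bound is exclusive, so [t1, t5] becomes [1, 6]:

  create 'tsdemo', {NAME => 'f', VERSIONS => 1}     # hypothetical table; MAX_VERSIONS = 1 as in the walkthrough
  put 'tsdemo', 'row1', 'f:q', 'foo', 3             # cell at t3
  scan 'tsdemo', {TIMERANGE => [1, 6]}              # returns row1 -> "foo"
  put 'tsdemo', 'row1', 'f:q', 'bar', 6             # newer cell at t6; only one version is retained
  scan 'tsdemo', {TIMERANGE => [1, 6]}              # returns nothing: the newest version is at t6, outside the range
  scan 'tsdemo', {TIMERANGE => [1, 6], RAW => true, VERSIONS => 10}   # may still show "foo" while the shadowed cell physically remains
  flush 'tsdemo'
  major_compact 'tsdemo'                            # asynchronous; allow it to finish before the next scan
  scan 'tsdemo', {TIMERANGE => [1, 6], RAW => true, VERSIONS => 10}   # "foo" purged; empty, RAW or not

Whether the RAW scan still shows "foo" before the major compaction depends on what is physically present in memstore and store files, which is exactly the behavior Josh hedges on in his footnote.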
> On 2/9/18 5:30 PM, Andrew Kettmann wrote:
> > A simpler question would be this:
> >
> > Given:
> >
> > * a set timeframe in the past (2-3 days, roughly a year ago)
> > * we are NOT removing records from the table at all
> > * we ARE inserting into this table actively
> >
> > Should I expect two consecutive runs of the rowcounter mapreduce job to return an identical number?
> >
> > Andrew Kettmann
> > Consultant, Platform Services Group
> >
> > From: Andrew Kettmann
> > Sent: Thursday, February 08, 2018 11:35 AM
> > To: user@hbase.apache.org
> > Subject: Inconsistent rows exported/counted when looking at a set, unchanged past time frame.
> >
> > First the version details:
> >
> > Running HBase/YARN/HDFS using Cloudera Manager 5.12.1.
> > HBase: Version 1.2.0-cdh5.8.0
> > HDFS/YARN: Hadoop 2.6.0-cdh5.8.0
> > hbck and hdfs fsck return healthy
> >
> > 15 nodes, recently sized down from 30 (requirements for other services, e.g. Solr, were reduced)
> >
> > The simplest example of the inconsistency is using rowcounter. If I run the same mapreduce job twice in a row, I get different counts:
> >
> > hbase org.apache.hadoop.hbase.mapreduce.Driver rowcounter \
> >   -Dmapreduce.map.speculative=false TABLENAME \
> >   --starttime=1485907200000 --endtime=1486058400000
> >
> > Looking at org.apache.hadoop.hbase.mapreduce.RowCounter$RowCounterMapper$Counters:
> > Run 1: 4876683
> > Run 2: 4866351
> >
> > Similarly with exports of the same date/time range, consecutive runs of the export get different results:
> >
> > hbase org.apache.hadoop.hbase.mapreduce.Export \
> >   -Dmapred.map.tasks.speculative.execution=false \
> >   -Dmapred.reduce.tasks.speculative.execution=false \
> >   TABLENAME HDFSPATH 1 1485907200000 1486058400000
> >
> > From Map input/output records:
> > Run 1: 4296778
> > Run 2: 4297307
> >
> > None of the results show anything for spilled records, and there are no failed maps. Sometimes the row count increases, sometimes it decreases. We aren't using any row filter queries; we just want to export chunks of the data for a specific time range. This table is actively being read from and written to, but I am asking about a date range in early 2017 in this case, so I would have thought that should have no impact. Another point is that the rowcount job and the export return ridiculously different numbers. There should be no older versions of rows involved, as we are set to keep only the newest version, and I can confirm that there are rows that are consistently missing from the exports. The table definition is below.
> >
> > hbase(main):001:0> describe 'TABLENAME'
> > Table TABLENAME is ENABLED
> > TABLENAME
> > COLUMN FAMILIES DESCRIPTION
> > {NAME => 'text', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW',
> > REPLICATION_SCOPE => '0', COMPRESSION => 'SNAPPY', VERSIONS => '1',
> > MIN_VERSIONS => '0', TTL => 'FOREVER', KEEP_DELETED_CELLS => 'FALSE',
> > BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}
> > 1 row(s) in 0.2800 seconds
> >
> > Any advice/suggestions would be greatly appreciated. Are some of my assumptions wrong regarding import/export, i.e. that results should be consistent given consistent start/end times?
> >
> > Andrew Kettmann
> > Platform Services Group
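Since the early-2017 window is static, two consecutive RowCounter runs over it should match exactly, and any drift points at a read-path bug such as HBASE-15378 rather than at ingest. A minimal sketch of that consistency check, wrapping the exact command from the thread; it assumes the job's console output includes the counter as a line containing "ROWS=<count>":

  #!/usr/bin/env bash
  # Sketch: run RowCounter twice over the same frozen time window and compare
  # the ROWS counter. TABLE/START/END are taken from the thread above; the
  # grep pattern is an assumption about how the counter appears in the output.
  set -euo pipefail

  TABLE=TABLENAME
  START=1485907200000
  END=1486058400000

  count_rows() {
    hbase org.apache.hadoop.hbase.mapreduce.Driver rowcounter \
      -Dmapreduce.map.speculative=false \
      "$TABLE" --starttime="$START" --endtime="$END" 2>&1 \
      | grep -o 'ROWS=[0-9]*' | tail -1 | cut -d= -f2
  }

  run1=$(count_rows)
  run2=$(count_rows)
  echo "run1=$run1 run2=$run2"
  if [ "$run1" = "$run2" ]; then
    echo "consistent"
  else
    echo "INCONSISTENT: a static window should always count the same"
  fi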