Re: Difference between ResultScanner and initTableMapperJob
I got a timeout when trying to search for this row (185_) and for a different
row (20_):

hbase(main):016:0> scan 'server_based_data', {FILTER => "(PrefixFilter ('20'))", COLUMNS => 'raw_data:top', TIMERANGE => [149920560, 149920620]}
ROW                                COLUMN+CELL

ERROR: Call id=7856, waitTime=120001, operationTimeout=120000 expired.

I tried to increase the timeout, but now, after waiting over 1 hr, it still
hasn't come back:

hbase(main):017:0> @shell.hbase.configuration.setInt("hbase.client.scanner.timeout.period", 240000)
hbase(main):018:0> scan 'server_based_data', {FILTER => "(PrefixFilter ('20_'))", COLUMNS => 'raw_data:top', TIMERANGE => [149920560, 149920620]}
ROW                                COLUMN+CELL
(Still no output after waiting over 1 hr)

I also checked other failed/killed mappers. Below is a small sample of the
"bad" rowkeys. These deleted rowkeys show up under all sorts of hashes, so
scanning a row after one particular "bad" rowkey won't tell us much, since
the bad rowkeys occur across all sorts of rows/hashes.

2017-07-07 20:25:59,640 INFO [main] org.apache.hadoop.hbase.mapreduce.TableRecordReaderImpl: recovered from org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=36, exceptions:
Fri Jul 07 20:25:59 PDT 2017, null, java.net.SocketTimeoutException: callTimeout=40000, callDuration=40306: row '145_app129023.lhr1.mydomain.com_1482214200' on table 'server_based_data' at region=server_based_data,145_app129023.lhr1.mydomain.com_1482214200,1483679406846.fbc6c1e473b944fcf1eedd03a3b8e2ec., hostname=hslave35139.ams9.mydomain.com,60020,1483577331446, seqNum=8165882

2017-07-07 20:29:22,280 INFO [main] org.apache.hadoop.hbase.mapreduce.TableRecordReaderImpl: recovered from org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=36, exceptions:
Fri Jul 07 20:29:22 PDT 2017, null, java.net.SocketTimeoutException: callTimeout=40000, callDuration=40303: row '162_app128162.sjc4.mydomain.com_1485642420' on table 'server_based_data' at region=server_based_data,162_app128162.sjc4.mydomain.com_1485642420,1485969672759.37985ed5325cf4afb4bd54afa25728e9., hostname=hslave35150.ams9.mydomain.com,60020,1483579082784, seqNum=5489984

2017-07-07 20:28:52,216 INFO [main] org.apache.hadoop.hbase.mapreduce.TableRecordReaderImpl: recovered from org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=36, exceptions:
Fri Jul 07 20:28:52 PDT 2017, null, java.net.SocketTimeoutException: callTimeout=40000, callDuration=40304: row '41_db160190.iad3.mydomain.com_1486067940' on table 'server_based_data' at region=server_based_data,41_db160190.iad3.mydomain.com_1486067940,1487094006943.f67c3b9836107bdbe6a533e2829c509a., hostname=hslave35150.ams9.mydomain.com,60020,1483579082784, seqNum=5423139

On Tue, Jul 11, 2017 at 2:12 PM, Ted Yu wrote:

> bq. it can find 0 rows in less than 1 sec
>
> What if you perform a scan with start row lower than the deleted key, can
> you reproduce the hanging scan ?
>
> Cheers
>
> On Tue, Jul 11, 2017 at 1:55 PM, S L wrote:
>
> > Same error as from the hadoop job output I initially posted.
> >
> > SocketTimeoutException/RetriesExhaustedException is caused by a key
> > that should be deleted/expired:
> >
> > row '184_app128057.syd2.mydomain.com_1485646620'
> >
> > The funny thing is, when I execute a "get 'tablename', 'rowkey'" from
> > "hbase shell", it can find 0 rows in less than 1 sec. It seems like the
> > initTableMapperJob method is sitting there for 40 sec trying to reach
> > this non-existent key for some reason.
> >
> > 2017-07-07 20:28:19,974 INFO [main] org.apache.hadoop.mapred.MapTask: bufstart = 0; bufvoid = 268435456
> >
> > 2017-07-07 20:28:19,974 INFO [main] org.apache.hadoop.mapred.MapTask: kvstart = 67108860; length = 16777216
> >
> > 2017-07-07 20:28:19,980 INFO [main] org.apache.hadoop.mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
> >
> > 2017-07-07 20:29:25,248 INFO [main] org.apache.hadoop.hbase.mapreduce.TableRecordReaderImpl: recovered from org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=36, exceptions:
> >
> > Fri Jul 07 20:29:25 PDT 2017, null, java.net.SocketTimeoutException: callTimeout=40000, callDuration=40314: row '184_app128057.syd2.mydomain.com_1485646620' on table 'server_based_data' at region=server_based_data,184_app128057.syd2.mydomain.com_1485646620,1486597623524.37ccf993b84fd15b24c0c4efbb95b7f5., hostname=hslave35120.ams9.mydomain.com,60020,1498245230342, seqNum=9247698
> >
> >        at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.throwEnrichedException(RpcRetryingCallerWithReadReplicas.java:276)
> >        at
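A minimal sketch of the probe Ted suggests, assuming the HBase 1.x client
API; the start/stop keys and timeout values below are illustrative, with the
suspect rowkey taken from the log above:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    public class ProbeBadRegion {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            // Fail fast instead of hanging: the default retry count means a
            // wedged region can stall a client for many minutes.
            conf.setInt("hbase.client.retries.number", 3);
            conf.setInt("hbase.client.scanner.timeout.period", 30000);

            // Start just below the rowkey the mappers stall on and stop just
            // above it, so the scan only touches the suspect region.
            Scan scan = new Scan();
            scan.setStartRow(Bytes.toBytes("184_app128057.syd2.mydomain.com_1485646619"));
            scan.setStopRow(Bytes.toBytes("184_app128057.syd2.mydomain.com_1485646621"));
            scan.setCaching(10);
            scan.setCacheBlocks(false);

            try (Connection conn = ConnectionFactory.createConnection(conf);
                 Table table = conn.getTable(TableName.valueOf("server_based_data"));
                 ResultScanner scanner = table.getScanner(scan)) {
                for (Result r : scanner) {
                    System.out.println("Found row: " + Bytes.toString(r.getRow()));
                }
                System.out.println("Probe scan completed without hanging.");
            }
        }
    }

If this bounded scan also hangs, the problem lives in the region itself
rather than in the MapReduce layer; if it returns quickly, the load from
1200 concurrent mappers becomes the more likely suspect.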
Re: Difference between ResultScanner and initTableMapperJob
bq. it can find 0 rows in less than 1 sec

What if you perform a scan with start row lower than the deleted key, can
you reproduce the hanging scan ?

Cheers

On Tue, Jul 11, 2017 at 1:55 PM, S L wrote:

> Same error as from the hadoop job output I initially posted.
>
> SocketTimeoutException/RetriesExhaustedException is caused by a key that
> should be deleted/expired:
>
> row '184_app128057.syd2.mydomain.com_1485646620'
>
> The funny thing is, when I execute a "get 'tablename', 'rowkey'" from
> "hbase shell", it can find 0 rows in less than 1 sec. It seems like the
> initTableMapperJob method is sitting there for 40 sec trying to reach this
> non-existent key for some reason.
>
> 2017-07-07 20:28:19,974 INFO [main] org.apache.hadoop.mapred.MapTask: bufstart = 0; bufvoid = 268435456
>
> 2017-07-07 20:28:19,974 INFO [main] org.apache.hadoop.mapred.MapTask: kvstart = 67108860; length = 16777216
>
> 2017-07-07 20:28:19,980 INFO [main] org.apache.hadoop.mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
>
> 2017-07-07 20:29:25,248 INFO [main] org.apache.hadoop.hbase.mapreduce.TableRecordReaderImpl: recovered from org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=36, exceptions:
>
> Fri Jul 07 20:29:25 PDT 2017, null, java.net.SocketTimeoutException: callTimeout=40000, callDuration=40314: row '184_app128057.syd2.mydomain.com_1485646620' on table 'server_based_data' at region=server_based_data,184_app128057.syd2.mydomain.com_1485646620,1486597623524.37ccf993b84fd15b24c0c4efbb95b7f5., hostname=hslave35120.ams9.mydomain.com,60020,1498245230342, seqNum=9247698
>
>        at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.throwEnrichedException(RpcRetryingCallerWithReadReplicas.java:276)
>        at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:207)
>        at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:60)
>        at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)
>        at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:320)
>        at org.apache.hadoop.hbase.client.ClientScanner.loadCache(ClientScanner.java:403)
>        at org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:364)
>        at org.apache.hadoop.hbase.mapreduce.TableRecordReaderImpl.nextKeyValue(TableRecordReaderImpl.java:222)
>        at org.apache.hadoop.hbase.mapreduce.TableRecordReader.nextKeyValue(TableRecordReader.java:147)
>        at org.apache.hadoop.hbase.mapreduce.TableInputFormatBase$1.nextKeyValue(TableInputFormatBase.java:216)
>        at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:556)
>        at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
>        at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
>        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
>        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
>        at java.security.AccessController.doPrivileged(Native Method)
>        at javax.security.auth.Subject.doAs(Subject.java:415)
>        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
>        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
>
> Caused by: java.net.SocketTimeoutException: callTimeout=40000, callDuration=40314: row '184_app128057.syd2.mydomain.com_1485646620' on table 'server_based_data' at region=server_based_data,184_app128057.syd2.mydomain.com_1485646620,1486597623524.37ccf993b84fd15b24c0c4efbb95b7f5., hostname=hslave35120.ams9.mydomain.com,60020,1498245230342, seqNum=9247698
>        at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:159)
>        at org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture.run(ResultBoundedCompletionService.java:65)
>        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>        at java.lang.Thread.run(Thread.java:745)
>
> Caused by: java.io.IOException: Call to hslave35120.ams9.mydomain.com/10.216.35.120:60020 failed on local exception: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=2, waitTime=40001, operationTimeout=40000 expired.
>        at org.apache.hadoop.hbase.ipc.AbstractRpcClient.wrapException(AbstractRpcClient.java:291)
>        at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1272)
>        at
Re: Difference between ResultScanner and initTableMapperJob
Same error as from the hadoop job output I initially posted.

SocketTimeoutException/RetriesExhaustedException is caused by a key that
should be deleted/expired:

row '184_app128057.syd2.mydomain.com_1485646620'

The funny thing is, when I execute a "get 'tablename', 'rowkey'" from
"hbase shell", it can find 0 rows in less than 1 sec. It seems like the
initTableMapperJob method is sitting there for 40 sec trying to reach this
non-existent key for some reason.

2017-07-07 20:28:19,974 INFO [main] org.apache.hadoop.mapred.MapTask: bufstart = 0; bufvoid = 268435456

2017-07-07 20:28:19,974 INFO [main] org.apache.hadoop.mapred.MapTask: kvstart = 67108860; length = 16777216

2017-07-07 20:28:19,980 INFO [main] org.apache.hadoop.mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer

2017-07-07 20:29:25,248 INFO [main] org.apache.hadoop.hbase.mapreduce.TableRecordReaderImpl: recovered from org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=36, exceptions:

Fri Jul 07 20:29:25 PDT 2017, null, java.net.SocketTimeoutException: callTimeout=40000, callDuration=40314: row '184_app128057.syd2.mydomain.com_1485646620' on table 'server_based_data' at region=server_based_data,184_app128057.syd2.mydomain.com_1485646620,1486597623524.37ccf993b84fd15b24c0c4efbb95b7f5., hostname=hslave35120.ams9.mydomain.com,60020,1498245230342, seqNum=9247698

        at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.throwEnrichedException(RpcRetryingCallerWithReadReplicas.java:276)
        at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:207)
        at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:60)
        at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)
        at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:320)
        at org.apache.hadoop.hbase.client.ClientScanner.loadCache(ClientScanner.java:403)
        at org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:364)
        at org.apache.hadoop.hbase.mapreduce.TableRecordReaderImpl.nextKeyValue(TableRecordReaderImpl.java:222)
        at org.apache.hadoop.hbase.mapreduce.TableRecordReader.nextKeyValue(TableRecordReader.java:147)
        at org.apache.hadoop.hbase.mapreduce.TableInputFormatBase$1.nextKeyValue(TableInputFormatBase.java:216)
        at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:556)
        at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
        at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

Caused by: java.net.SocketTimeoutException: callTimeout=40000, callDuration=40314: row '184_app128057.syd2.mydomain.com_1485646620' on table 'server_based_data' at region=server_based_data,184_app128057.syd2.mydomain.com_1485646620,1486597623524.37ccf993b84fd15b24c0c4efbb95b7f5., hostname=hslave35120.ams9.mydomain.com,60020,1498245230342, seqNum=9247698
        at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:159)
        at org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture.run(ResultBoundedCompletionService.java:65)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)

Caused by: java.io.IOException: Call to hslave35120.ams9.mydomain.com/10.216.35.120:60020 failed on local exception: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=2, waitTime=40001, operationTimeout=40000 expired.
        at org.apache.hadoop.hbase.ipc.AbstractRpcClient.wrapException(AbstractRpcClient.java:291)
        at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1272)
        at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:226)
        at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:331)
        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.scan(ClientProtos.java:34094)
        at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:219)
        at
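For reference, the retry count and the 40-second call timeout in the trace
above are governed by client-side settings. A minimal sketch of how they
might be tightened so a stuck region surfaces in seconds instead of minutes;
the property keys are standard HBase client settings, but the values here
are illustrative starting points, not recommendations:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;

    public final class ScanClientSettings {
        // Returns a Configuration whose scans fail fast rather than
        // retrying dozens of times against an unresponsive region server.
        public static Configuration create() {
            Configuration conf = HBaseConfiguration.create();
            conf.setInt("hbase.rpc.timeout", 30000);                   // cap on each RPC
            conf.setInt("hbase.client.scanner.timeout.period", 30000); // cap on scanner RPCs
            conf.setInt("hbase.client.retries.number", 3);             // far below the default
            return conf;
        }
    }

Passing such a Configuration to ConnectionFactory.createConnection (or to
the MapReduce Job) turns a multi-minute hang into a quick, diagnosable
failure.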
Re: Difference between ResultScanner and initTableMapperJob
Can you take a look at the server log on hslave35150.ams9.mydomain.com
around 17/07/07 20:23:31 ?

See if there is some clue in the log.

On Tue, Jul 11, 2017 at 12:18 PM, S L wrote:

> In case I forgot to say, the keys that the log shows causing the
> RetriesExhaustedException should be deleted/gone from the table because
> their TTL has been exceeded.
>
> Fri Jul 07 20:23:26 PDT 2017, null, java.net.SocketTimeoutException: callTimeout=40000, callDuration=40303: row '41_db160190.iad3.mydomain.com_1486067940' on table 'server_based_data' at region=server_based_data,41_db160190.iad3.mydomain.com_1486067940,1487094006943.f67c3b9836107bdbe6a533e2829c509a., hostname=hslave35150.ams9.mydomain.com,60020,1483579082784, seqNum=5423139
>
> The timestamp here is from Feb 2, 2017. My TTL is 30 days. Since I ran
> the job on July 7, 2017, Feb 2017 is way past the 30-day TTL.
>
> describe 'server_based_data'
>
> Table server_based_data is ENABLED
> server_based_data
> COLUMN FAMILIES DESCRIPTION
> {NAME => 'raw_data', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW',
> REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION => 'SNAPPY',
> MIN_VERSIONS => '0', TTL => '2592000 SECONDS (30 DAYS)',
> KEEP_DELETED_CELLS => 'FALSE', BLOCKSIZE => '65536', IN_MEMORY => 'false',
> BLOCKCACHE => 'true'}
> 1 row(s) in 0.5180 seconds
>
> On Tue, Jul 11, 2017 at 12:11 PM, S L wrote:
>
> > Sorry for not being clear. I tried with both versions: first 1.0.1,
> > then 1.2.0-cdh5.7.2. We are currently running on Cloudera 5.7.2, which
> > is why I used that version of the jar.
> >
> > I had set the timeout to be as short as 30 sec and as long as 2 min,
> > but I was still running into the problem. When setting the timeout to
> > 2 min, the job took almost 50 min to "complete". Complete is in quotes
> > because it fails (see pastebin below).
> >
> > Here's a copy of the hadoop output logs via pastebin. The log is 11000
> > lines, so I just pasted up to the first couple of exceptions and then
> > pasted the end, where it jumps from 80% maps to 100% and from 21%
> > reduce to 100% because Yarn or something killed the job.
> >
> > https://pastebin.com/KwriyPn6
> > http://imgur.com/a/ouPZ5 - screenshot of the failed mapreduce job from
> > cloudera manager/Yarn
> >
> > On Mon, Jul 10, 2017 at 8:50 PM, Ted Yu wrote:
> >
> > > bq. for hbase-client/hbase-server version 1.0.1 and 1.2.0-cdh5.7.2.
> > >
> > > You mean the error occurred for both versions or, client is on 1.0.1
> > > and server is on 1.2.0 ?
> > >
> > > There should be more to the RetriesExhaustedException.
> > > Can you pastebin the full stack trace ?
> > >
> > > Cheers
> > >
> > > On Mon, Jul 10, 2017 at 2:21 PM, S L wrote:
> > >
> > > > I hope someone can tell me what the difference between these two
> > > > API calls is. I'm getting weird results between the two of them.
> > > > This is happening for hbase-client/hbase-server versions 1.0.1 and
> > > > 1.2.0-cdh5.7.2.
> > > >
> > > > First off, my rowkeys are in the format hash_name_timestamp,
> > > > e.g. 100_servername_1234567890. The hbase table has a TTL of 30
> > > > days, so things older than 30 days should disappear after
> > > > compaction.
> > > >
> > > > The following is code for using ResultScanner. It doesn't use
> > > > MapReduce, so it takes a very long time to complete. I can't run
> > > > my job this way because it takes too long. However, for debugging
> > > > purposes, I don't have any problems with this method. It lists all
> > > > keys for the specified time range, which look valid to me, since
> > > > all the timestamps of the returned keys are within the past 30
> > > > days and within the specified time range:
> > > >
> > > > Scan scan = new Scan();
> > > > scan.addColumn(Bytes.toBytes("raw_data"), Bytes.toBytes(fileType));
> > > > scan.setCaching(500);
> > > > scan.setCacheBlocks(false);
> > > > scan.setTimeRange(start, end);
> > > >
> > > > Connection fConnection = ConnectionFactory.createConnection(conf);
> > > > Table table = fConnection.getTable(TableName.valueOf(tableName));
> > > > ResultScanner scanner = table.getScanner(scan);
> > > > for (Result result = scanner.next(); result != null; result =
> > > > scanner.next()) {
> > > >     System.out.println("Found row: " + Bytes.toString(result.getRow()));
> > > > }
> > > >
> > > > The following code doesn't work, but it uses MapReduce, which runs
> > > > way faster than the ResultScanner way, since it divides things up
> > > > into 1200 maps. The problem is I'm getting rowkeys that should
> > > > have disappeared due to TTL expiring:
> > > >
> > > > Scan scan = new Scan();
> > > > scan.addColumn(Bytes.toBytes("raw_data"), Bytes.toBytes(fileType));
> > > > scan.setCaching(500);
> > > > scan.setCacheBlocks(false);
> > > > scan.setTimeRange(start, end);
Re: Difference between ResultScanner and initTableMapperJob
Sorry for not being clear. I tried with both versions: first 1.0.1, then
1.2.0-cdh5.7.2. We are currently running on Cloudera 5.7.2, which is why I
used that version of the jar.

I had set the timeout to be as short as 30 sec and as long as 2 min, but I
was still running into the problem. When setting the timeout to 2 min, the
job took almost 50 min to "complete". Complete is in quotes because it
fails (see pastebin below).

Here's a copy of the hadoop output logs via pastebin. The log is 11000
lines, so I just pasted up to the first couple of exceptions and then pasted
the end, where it jumps from 80% maps to 100% and from 21% reduce to 100%
because Yarn or something killed the job.

https://pastebin.com/KwriyPn6
http://imgur.com/a/ouPZ5 - screenshot of the failed mapreduce job from
cloudera manager/Yarn

On Mon, Jul 10, 2017 at 8:50 PM, Ted Yu wrote:

> bq. for hbase-client/hbase-server version 1.0.1 and 1.2.0-cdh5.7.2.
>
> You mean the error occurred for both versions or, client is on 1.0.1 and
> server is on 1.2.0 ?
>
> There should be more to the RetriesExhaustedException.
> Can you pastebin the full stack trace ?
>
> Cheers
>
> On Mon, Jul 10, 2017 at 2:21 PM, S L wrote:
>
> > I hope someone can tell me what the difference between these two API
> > calls is. I'm getting weird results between the two of them. This is
> > happening for hbase-client/hbase-server versions 1.0.1 and
> > 1.2.0-cdh5.7.2.
> >
> > First off, my rowkeys are in the format hash_name_timestamp,
> > e.g. 100_servername_1234567890. The hbase table has a TTL of 30 days,
> > so things older than 30 days should disappear after compaction.
> >
> > The following is code for using ResultScanner. It doesn't use
> > MapReduce, so it takes a very long time to complete. I can't run my job
> > this way because it takes too long. However, for debugging purposes, I
> > don't have any problems with this method. It lists all keys for the
> > specified time range, which look valid to me, since all the timestamps
> > of the returned keys are within the past 30 days and within the
> > specified time range:
> >
> > Scan scan = new Scan();
> > scan.addColumn(Bytes.toBytes("raw_data"), Bytes.toBytes(fileType));
> > scan.setCaching(500);
> > scan.setCacheBlocks(false);
> > scan.setTimeRange(start, end);
> >
> > Connection fConnection = ConnectionFactory.createConnection(conf);
> > Table table = fConnection.getTable(TableName.valueOf(tableName));
> > ResultScanner scanner = table.getScanner(scan);
> > for (Result result = scanner.next(); result != null; result =
> > scanner.next()) {
> >     System.out.println("Found row: " + Bytes.toString(result.getRow()));
> > }
> >
> > The following code doesn't work, but it uses MapReduce, which runs way
> > faster than the ResultScanner way, since it divides things up into 1200
> > maps. The problem is I'm getting rowkeys that should have disappeared
> > due to TTL expiring:
> >
> > Scan scan = new Scan();
> > scan.addColumn(Bytes.toBytes("raw_data"), Bytes.toBytes(fileType));
> > scan.setCaching(500);
> > scan.setCacheBlocks(false);
> > scan.setTimeRange(start, end);
> > TableMapReduceUtil.initTableMapperJob(tableName, scan, MTTRMapper.class,
> > Text.class, IntWritable.class, job);
> >
> > Here is the error that I get, which eventually kills the whole MR job
> > later because over 25% of the mappers failed:
> >
> > > Error: org.apache.hadoop.hbase.client.RetriesExhaustedException:
> > > Failed after attempts=36, exceptions: Wed Jun 28 13:46:57 PDT 2017,
> > > null, java.net.SocketTimeoutException: callTimeout=120000,
> > > callDuration=120301: row '65_app129041.iad1.mydomain.com_1476641940'
> > > on table 'server_based_data' at region=server_based_data
> >
> > I'll try to study the code for the hbase-client and hbase-server jars,
> > but hopefully someone will know offhand what the difference between the
> > methods is and what is causing the initTableMapperJob call to fail.
Re: Difference between ResultScanner and initTableMapperJob
In case I forgot to say, the keys that the log shows causing the
RetriesExhaustedException should be deleted/gone from the table because
their TTL has been exceeded.

Fri Jul 07 20:23:26 PDT 2017, null, java.net.SocketTimeoutException: callTimeout=40000, callDuration=40303: row '41_db160190.iad3.mydomain.com_1486067940' on table 'server_based_data' at region=server_based_data,41_db160190.iad3.mydomain.com_1486067940,1487094006943.f67c3b9836107bdbe6a533e2829c509a., hostname=hslave35150.ams9.mydomain.com,60020,1483579082784, seqNum=5423139

The timestamp here is from Feb 2, 2017. My TTL is 30 days. Since I ran the
job on July 7, 2017, Feb 2017 is way past the 30-day TTL.

describe 'server_based_data'

Table server_based_data is ENABLED
server_based_data
COLUMN FAMILIES DESCRIPTION
{NAME => 'raw_data', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW',
REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION => 'SNAPPY',
MIN_VERSIONS => '0', TTL => '2592000 SECONDS (30 DAYS)',
KEEP_DELETED_CELLS => 'FALSE', BLOCKSIZE => '65536', IN_MEMORY => 'false',
BLOCKCACHE => 'true'}
1 row(s) in 0.5180 seconds

On Tue, Jul 11, 2017 at 12:11 PM, S L wrote:

> Sorry for not being clear. I tried with both versions: first 1.0.1, then
> 1.2.0-cdh5.7.2. We are currently running on Cloudera 5.7.2, which is why
> I used that version of the jar.
>
> I had set the timeout to be as short as 30 sec and as long as 2 min, but
> I was still running into the problem. When setting the timeout to 2 min,
> the job took almost 50 min to "complete". Complete is in quotes because
> it fails (see pastebin below).
>
> Here's a copy of the hadoop output logs via pastebin. The log is 11000
> lines, so I just pasted up to the first couple of exceptions and then
> pasted the end, where it jumps from 80% maps to 100% and from 21% reduce
> to 100% because Yarn or something killed the job.
>
> https://pastebin.com/KwriyPn6
> http://imgur.com/a/ouPZ5 - screenshot of the failed mapreduce job from
> cloudera manager/Yarn
>
> On Mon, Jul 10, 2017 at 8:50 PM, Ted Yu wrote:
>
> > bq. for hbase-client/hbase-server version 1.0.1 and 1.2.0-cdh5.7.2.
> >
> > You mean the error occurred for both versions or, client is on 1.0.1
> > and server is on 1.2.0 ?
> >
> > There should be more to the RetriesExhaustedException.
> > Can you pastebin the full stack trace ?
> >
> > Cheers
> >
> > On Mon, Jul 10, 2017 at 2:21 PM, S L wrote:
> >
> > > I hope someone can tell me what the difference between these two API
> > > calls is. I'm getting weird results between the two of them. This is
> > > happening for hbase-client/hbase-server versions 1.0.1 and
> > > 1.2.0-cdh5.7.2.
> > >
> > > First off, my rowkeys are in the format hash_name_timestamp,
> > > e.g. 100_servername_1234567890. The hbase table has a TTL of 30
> > > days, so things older than 30 days should disappear after compaction.
> > >
> > > The following is code for using ResultScanner. It doesn't use
> > > MapReduce, so it takes a very long time to complete. I can't run my
> > > job this way because it takes too long. However, for debugging
> > > purposes, I don't have any problems with this method. It lists all
> > > keys for the specified time range, which look valid to me, since all
> > > the timestamps of the returned keys are within the past 30 days and
> > > within the specified time range:
> > >
> > > Scan scan = new Scan();
> > > scan.addColumn(Bytes.toBytes("raw_data"), Bytes.toBytes(fileType));
> > > scan.setCaching(500);
> > > scan.setCacheBlocks(false);
> > > scan.setTimeRange(start, end);
> > >
> > > Connection fConnection = ConnectionFactory.createConnection(conf);
> > > Table table = fConnection.getTable(TableName.valueOf(tableName));
> > > ResultScanner scanner = table.getScanner(scan);
> > > for (Result result = scanner.next(); result != null; result =
> > > scanner.next()) {
> > >     System.out.println("Found row: " + Bytes.toString(result.getRow()));
> > > }
> > >
> > > The following code doesn't work, but it uses MapReduce, which runs
> > > way faster than the ResultScanner way, since it divides things up
> > > into 1200 maps. The problem is I'm getting rowkeys that should have
> > > disappeared due to TTL expiring:
> > >
> > > Scan scan = new Scan();
> > > scan.addColumn(Bytes.toBytes("raw_data"), Bytes.toBytes(fileType));
> > > scan.setCaching(500);
> > > scan.setCacheBlocks(false);
> > > scan.setTimeRange(start, end);
> > > TableMapReduceUtil.initTableMapperJob(tableName, scan, MTTRMapper.class,
> > > Text.class, IntWritable.class, job);
> > >
> > > Here is the error that I get, which eventually kills the whole MR
> > > job later because over 25% of the mappers failed:
> > >
> > > > Error: org.apache.hadoop.hbase.client.RetriesExhaustedException:
> > > > Failed after attempts=36, exceptions: Wed Jun 28 13:46:57 PDT 2017,
> > > > null, java.net.SocketTimeoutException:
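One thing worth noting here (a general HBase behavior, not something this
thread confirmed as the fix): TTL-expired cells are only filtered out at
read time until a major compaction physically removes them, and region
boundaries derived from old rowkeys persist even after that, which is why
expired keys can still appear in region names. A hedged sketch of
triggering a major compaction through the 1.x Admin API:

    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Admin;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;

    public class MajorCompactTable {
        public static void main(String[] args) throws Exception {
            try (Connection conn =
                     ConnectionFactory.createConnection(HBaseConfiguration.create());
                 Admin admin = conn.getAdmin()) {
                // Asynchronous request; the compaction itself runs on the
                // region servers and physically drops TTL-expired cells.
                admin.majorCompact(TableName.valueOf("server_based_data"));
            }
        }
    }

The same request can be made from the shell with
major_compact 'server_based_data'.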
Re: Difference between ResultScanner and initTableMapperJob
You should pay more attention to the RetriesExhaustedException. As you have
already found out, the important difference between a normal scan and the
MapReduce job is that the MapReduce job ends up running many concurrent
tasks (1200 maps, as you pointed out). Maybe 1200 tasks running in parallel
create a big load on the client/server side and result in the
RetriesExhaustedException. Anyway, we still need more logs here to locate
the root problem.

Best Regards
Allan Yang

2017-07-11 11:50 GMT+08:00 Ted Yu:

> bq. for hbase-client/hbase-server version 1.0.1 and 1.2.0-cdh5.7.2.
>
> You mean the error occurred for both versions or, client is on 1.0.1 and
> server is on 1.2.0 ?
>
> There should be more to the RetriesExhaustedException.
> Can you pastebin the full stack trace ?
>
> Cheers
>
> On Mon, Jul 10, 2017 at 2:21 PM, S L wrote:
>
> > I hope someone can tell me what the difference between these two API
> > calls is. I'm getting weird results between the two of them. This is
> > happening for hbase-client/hbase-server versions 1.0.1 and
> > 1.2.0-cdh5.7.2.
> >
> > First off, my rowkeys are in the format hash_name_timestamp,
> > e.g. 100_servername_1234567890. The hbase table has a TTL of 30 days,
> > so things older than 30 days should disappear after compaction.
> >
> > The following is code for using ResultScanner. It doesn't use
> > MapReduce, so it takes a very long time to complete. I can't run my job
> > this way because it takes too long. However, for debugging purposes, I
> > don't have any problems with this method. It lists all keys for the
> > specified time range, which look valid to me, since all the timestamps
> > of the returned keys are within the past 30 days and within the
> > specified time range:
> >
> > Scan scan = new Scan();
> > scan.addColumn(Bytes.toBytes("raw_data"), Bytes.toBytes(fileType));
> > scan.setCaching(500);
> > scan.setCacheBlocks(false);
> > scan.setTimeRange(start, end);
> >
> > Connection fConnection = ConnectionFactory.createConnection(conf);
> > Table table = fConnection.getTable(TableName.valueOf(tableName));
> > ResultScanner scanner = table.getScanner(scan);
> > for (Result result = scanner.next(); result != null; result =
> > scanner.next()) {
> >     System.out.println("Found row: " + Bytes.toString(result.getRow()));
> > }
> >
> > The following code doesn't work, but it uses MapReduce, which runs way
> > faster than the ResultScanner way, since it divides things up into 1200
> > maps. The problem is I'm getting rowkeys that should have disappeared
> > due to TTL expiring:
> >
> > Scan scan = new Scan();
> > scan.addColumn(Bytes.toBytes("raw_data"), Bytes.toBytes(fileType));
> > scan.setCaching(500);
> > scan.setCacheBlocks(false);
> > scan.setTimeRange(start, end);
> > TableMapReduceUtil.initTableMapperJob(tableName, scan, MTTRMapper.class,
> > Text.class, IntWritable.class, job);
> >
> > Here is the error that I get, which eventually kills the whole MR job
> > later because over 25% of the mappers failed:
> >
> > > Error: org.apache.hadoop.hbase.client.RetriesExhaustedException:
> > > Failed after attempts=36, exceptions: Wed Jun 28 13:46:57 PDT 2017,
> > > null, java.net.SocketTimeoutException: callTimeout=120000,
> > > callDuration=120301: row '65_app129041.iad1.mydomain.com_1476641940'
> > > on table 'server_based_data' at region=server_based_data
> >
> > I'll try to study the code for the hbase-client and hbase-server jars,
> > but hopefully someone will know offhand what the difference between the
> > methods is and what is causing the initTableMapperJob call to fail.
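If Allan's load theory is right, one knob to experiment with (an assumption
on my part, not something verified in this thread) is capping how many of
the 1200 map tasks run at once. The property below exists in upstream
Hadoop 2.7+ as mapreduce.job.running.map.limit; whether the CDH 5.7 build
honors it should be checked before relying on it:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.mapreduce.Job;

    public class ThrottledJobSetup {
        // Builds a Job that runs at most 100 mappers concurrently, so the
        // 1200 region scans hit the cluster in waves instead of all at
        // once. 100 is an arbitrary starting point; 0 means no limit.
        public static Job create() throws Exception {
            Configuration conf = HBaseConfiguration.create();
            conf.setInt("mapreduce.job.running.map.limit", 100);
            return Job.getInstance(conf, "mttr-scan-throttled");
        }
    }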
Re: Difference between ResultScanner and initTableMapperJob
bq. for hbase-client/hbase-server version 1.0.1 and 1.2.0-cdh5.7.2.

You mean the error occurred for both versions or, client is on 1.0.1 and
server is on 1.2.0 ?

There should be more to the RetriesExhaustedException.
Can you pastebin the full stack trace ?

Cheers

On Mon, Jul 10, 2017 at 2:21 PM, S L wrote:

> I hope someone can tell me what the difference between these two API
> calls is. I'm getting weird results between the two of them. This is
> happening for hbase-client/hbase-server versions 1.0.1 and 1.2.0-cdh5.7.2.
>
> First off, my rowkeys are in the format hash_name_timestamp,
> e.g. 100_servername_1234567890. The hbase table has a TTL of 30 days, so
> things older than 30 days should disappear after compaction.
>
> The following is code for using ResultScanner. It doesn't use MapReduce,
> so it takes a very long time to complete. I can't run my job this way
> because it takes too long. However, for debugging purposes, I don't have
> any problems with this method. It lists all keys for the specified time
> range, which look valid to me, since all the timestamps of the returned
> keys are within the past 30 days and within the specified time range:
>
> Scan scan = new Scan();
> scan.addColumn(Bytes.toBytes("raw_data"), Bytes.toBytes(fileType));
> scan.setCaching(500);
> scan.setCacheBlocks(false);
> scan.setTimeRange(start, end);
>
> Connection fConnection = ConnectionFactory.createConnection(conf);
> Table table = fConnection.getTable(TableName.valueOf(tableName));
> ResultScanner scanner = table.getScanner(scan);
> for (Result result = scanner.next(); result != null; result =
> scanner.next()) {
>     System.out.println("Found row: " + Bytes.toString(result.getRow()));
> }
>
> The following code doesn't work, but it uses MapReduce, which runs way
> faster than the ResultScanner way, since it divides things up into 1200
> maps. The problem is I'm getting rowkeys that should have disappeared due
> to TTL expiring:
>
> Scan scan = new Scan();
> scan.addColumn(Bytes.toBytes("raw_data"), Bytes.toBytes(fileType));
> scan.setCaching(500);
> scan.setCacheBlocks(false);
> scan.setTimeRange(start, end);
> TableMapReduceUtil.initTableMapperJob(tableName, scan, MTTRMapper.class,
> Text.class, IntWritable.class, job);
>
> Here is the error that I get, which eventually kills the whole MR job
> later because over 25% of the mappers failed:
>
> > Error: org.apache.hadoop.hbase.client.RetriesExhaustedException:
> > Failed after attempts=36, exceptions: Wed Jun 28 13:46:57 PDT 2017,
> > null, java.net.SocketTimeoutException: callTimeout=120000,
> > callDuration=120301: row '65_app129041.iad1.mydomain.com_1476641940'
> > on table 'server_based_data' at region=server_based_data
>
> I'll try to study the code for the hbase-client and hbase-server jars,
> but hopefully someone will know offhand what the difference between the
> methods is and what is causing the initTableMapperJob call to fail.
Difference between ResultScanner and initTableMapperJob
I hope someone can tell me what the difference between these two API calls
is. I'm getting weird results between the two of them. This is happening
for hbase-client/hbase-server versions 1.0.1 and 1.2.0-cdh5.7.2.

First off, my rowkeys are in the format hash_name_timestamp,
e.g. 100_servername_1234567890. The hbase table has a TTL of 30 days, so
things older than 30 days should disappear after compaction.

The following is code for using ResultScanner. It doesn't use MapReduce, so
it takes a very long time to complete. I can't run my job this way because
it takes too long. However, for debugging purposes, I don't have any
problems with this method. It lists all keys for the specified time range,
which look valid to me, since all the timestamps of the returned keys are
within the past 30 days and within the specified time range:

Scan scan = new Scan();
scan.addColumn(Bytes.toBytes("raw_data"), Bytes.toBytes(fileType));
scan.setCaching(500);
scan.setCacheBlocks(false);
scan.setTimeRange(start, end);

Connection fConnection = ConnectionFactory.createConnection(conf);
Table table = fConnection.getTable(TableName.valueOf(tableName));
ResultScanner scanner = table.getScanner(scan);
for (Result result = scanner.next(); result != null; result = scanner.next()) {
    System.out.println("Found row: " + Bytes.toString(result.getRow()));
}

The following code doesn't work, but it uses MapReduce, which runs way
faster than the ResultScanner way, since it divides things up into 1200
maps. The problem is I'm getting rowkeys that should have disappeared due
to TTL expiring:

Scan scan = new Scan();
scan.addColumn(Bytes.toBytes("raw_data"), Bytes.toBytes(fileType));
scan.setCaching(500);
scan.setCacheBlocks(false);
scan.setTimeRange(start, end);
TableMapReduceUtil.initTableMapperJob(tableName, scan, MTTRMapper.class,
Text.class, IntWritable.class, job);

Here is the error that I get, which eventually kills the whole MR job later
because over 25% of the mappers failed:

> Error: org.apache.hadoop.hbase.client.RetriesExhaustedException:
> Failed after attempts=36, exceptions: Wed Jun 28 13:46:57 PDT 2017,
> null, java.net.SocketTimeoutException: callTimeout=120000,
> callDuration=120301: row '65_app129041.iad1.mydomain.com_1476641940'
> on table 'server_based_data' at region=server_based_data

I'll try to study the code for the hbase-client and hbase-server jars, but
hopefully someone will know offhand what the difference between the two
methods is and what is causing the initTableMapperJob call to fail.
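To make the difference concrete: table.getScanner() drives a single
client-side scanner sequentially across all regions, while
initTableMapperJob makes TableInputFormat emit one input split per region,
so each of the 1200 mappers opens its own scanner directly against one
region, and a single wedged region surfaces as failed map attempts. Below
is a sketch of a complete driver around the poster's code, under stated
assumptions: the mapper is a stand-in for MTTRMapper (whose body was never
posted), the start/end timestamps come from the command line, and
mapreduce.map.failures.maxpercent (the MRv2 spelling of the old
mapred.max.map.failures.percent knob) is set so a few bad regions don't
kill the whole job:

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
    import org.apache.hadoop.hbase.mapreduce.TableMapper;
    import org.apache.hadoop.hbase.util.Bytes;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;

    public class MTTRJobDriver {

        // Stand-in for the poster's MTTRMapper: emits each rowkey with a 1.
        public static class MTTRMapper extends TableMapper<Text, IntWritable> {
            private static final IntWritable ONE = new IntWritable(1);
            @Override
            protected void map(ImmutableBytesWritable key, Result value,
                               Context context)
                    throws IOException, InterruptedException {
                context.write(new Text(
                    Bytes.toString(key.get(), key.getOffset(), key.getLength())), ONE);
            }
        }

        public static void main(String[] args) throws Exception {
            long start = Long.parseLong(args[0]);  // time range in ms
            long end = Long.parseLong(args[1]);

            Configuration conf = HBaseConfiguration.create();
            conf.setInt("hbase.client.retries.number", 3);        // fail fast on a stuck region
            conf.setInt("mapreduce.map.failures.maxpercent", 10); // tolerate a few bad regions

            Job job = Job.getInstance(conf, "mttr-scan");
            job.setJarByClass(MTTRJobDriver.class);
            job.setNumReduceTasks(0);
            job.setOutputFormatClass(NullOutputFormat.class); // demo: discard output

            Scan scan = new Scan();
            scan.addColumn(Bytes.toBytes("raw_data"), Bytes.toBytes("top"));
            scan.setCaching(500);
            scan.setCacheBlocks(false);
            scan.setTimeRange(start, end);

            // One input split, and therefore one mapper, per region.
            TableMapReduceUtil.initTableMapperJob("server_based_data", scan,
                MTTRMapper.class, Text.class, IntWritable.class, job);

            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }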