[ https://issues.apache.org/jira/browse/HBASE-21642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16728939#comment-16728939 ]
Zheng Hu commented on HBASE-21642:
----------------------------------
When running CopyTable on a MOB table by scanning a snapshot, I hit the following NullPointerException:
{code}
2018-12-26 16:52:51,088 DEBUG [LocalJobRunner Map Task Executor #0] ipc.AbstractRpcClient(483): Stopping rpc client
2018-12-26 16:52:51,095 WARN [Thread-1048] mapred.LocalJobRunner$Job(560): job_local2134482229_0002
java.lang.Exception: java.lang.NullPointerException
    at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.lang.NullPointerException
    at org.apache.hadoop.hbase.regionserver.HMobStore.readCell(HMobStore.java:409)
    at org.apache.hadoop.hbase.regionserver.HMobStore.resolve(HMobStore.java:346)
    at org.apache.hadoop.hbase.regionserver.MobStoreScanner.next(MobStoreScanner.java:73)
    at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:153)
    at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.populateResult(HRegion.java:6631)
    at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:6795)
    at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:6568)
    at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:6554)
    at org.apache.hadoop.hbase.client.ClientSideRegionScanner.next(ClientSideRegionScanner.java:77)
    at org.apache.hadoop.hbase.mapreduce.TableSnapshotInputFormatImpl$RecordReader.nextKeyValue(TableSnapshotInputFormatImpl.java:241)
    at org.apache.hadoop.hbase.mapreduce.TableSnapshotInputFormat$TableSnapshotRegionRecordReader.nextKeyValue(TableSnapshotInputFormat.java:166)
    at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:556)
    at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
    at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
{code}
It's a bug in scanning a snapshot of a MOB table.
> CopyTable by reading snapshot and bulkloading will save a lot of time.
> ----------------------------------------------------------------------
>
> Key: HBASE-21642
> URL: https://issues.apache.org/jira/browse/HBASE-21642
> Project: HBase
> Issue Type: Improvement
> Reporter: Zheng Hu
> Assignee: Zheng Hu
> Priority: Major
>
> In our HBase clusters, some users need to merge the data of two different
> tables into one. Currently, CopyTable scans the source table and puts
> mutations into the destination table.
> Although CopyTable with bulkload is much faster than CopyTable with scan
> and put, it still takes a long time to scan the source table. Worse,
> CopyTable with a table scan hurts the cluster's availability: scanning
> consumes a lot of RegionServer (RS) resources, including CPU, memory, GC
> stop-the-world pauses, RS handlers, disk I/O, and network I/O, all of which
> affect availability.
> So in our clusters, we have tried to run all scanning jobs against a
> snapshot instead of the live table. This at least isolates the CPU, memory,
> and GC resources of the online RS from those of the scanning job. What's
> more, snapshot scanning is much faster than scanning through the RS, and it
> is more stable.
> So here I'll make the CopyTable tool support snapshot scanning.
>
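For context, the snapshot-scanning approach described above can be sketched roughly as follows. This is not the patch for this issue, only a minimal illustration of a map-only job that reads a snapshot through TableSnapshotInputFormat (the input format visible in the stack trace) instead of scanning the live table. The class name SnapshotScanSketch, the RowCountMapper, the counter names, and the two command-line arguments are hypothetical; a real copy job would emit cells for bulk loading rather than count rows.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;

// Hypothetical sketch: scan an HBase snapshot from a MapReduce job without
// going through the RegionServers. Requires the HBase MapReduce jars.
public class SnapshotScanSketch {

  // Illustrative mapper: only counts rows read from the snapshot.
  public static class RowCountMapper extends TableMapper<NullWritable, NullWritable> {
    @Override
    protected void map(ImmutableBytesWritable row, Result value, Context context)
        throws IOException, InterruptedException {
      context.getCounter("sketch", "ROWS").increment(1);
    }
  }

  public static void main(String[] args) throws Exception {
    String snapshotName = args[0];         // name of an existing snapshot (assumption)
    Path restoreDir = new Path(args[1]);   // temporary dir the snapshot is restored into

    Configuration conf = HBaseConfiguration.create();
    Job job = Job.getInstance(conf, "snapshot-scan-sketch");
    job.setJarByClass(SnapshotScanSketch.class);

    // Configures TableSnapshotInputFormat so that map tasks read the
    // snapshot's files directly from the filesystem.
    TableMapReduceUtil.initTableSnapshotMapperJob(
        snapshotName, new Scan(), RowCountMapper.class,
        NullWritable.class, NullWritable.class, job,
        true, restoreDir);

    job.setNumReduceTasks(0);                          // map-only job
    job.setOutputFormatClass(NullOutputFormat.class);  // nothing is written
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

Because the map tasks open the snapshot's files themselves, the scan cost lands on the job's own processes rather than on the online RegionServers, which is the isolation the description is after.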
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)