[
https://issues.apache.org/jira/browse/PHOENIX-4094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
chenglei updated PHOENIX-4094:
------------------------------
Description:
I used phoenix-4.x-HBase-0.98 in my HBase cluster. When I restarted my HBase
cluster at a certain point, I noticed that some RegionServers had a lot of error
logs like the following:
{code:java}
2017-08-01 11:53:10,669 WARN  [rsync.slave005.bizhbasetest.sjs.ted,60020,1501511894174-index-writer--pool2-t786] regionserver.HRegion: Failed getting lock in batch put, row=\x10\x00\x00\x00913f0eed-6710-4de9-8bac-077a106bb9ae_0
org.apache.hadoop.hbase.regionserver.WrongRegionException: Requested row out of range for row lock on HRegion BIZARCH_NS_PRODUCT.BIZTRACER_SPAN,90ffd783-b0a3-4f8a-81ef-0a7535fea197_0,1490066612493.463220cd8fad7254481595911e62d74d., startKey='90ffd783-b0a3-4f8a-81ef-0a7535fea197_0', getEndKey()='917fc343-3331-47fa-907c-df83a6f302f7_0', row='\x10\x00\x00\x00913f0eed-6710-4de9-8bac-077a106bb9ae_0'
        at org.apache.hadoop.hbase.regionserver.HRegion.checkRow(HRegion.java:3539)
        at org.apache.hadoop.hbase.regionserver.HRegion.getRowLock(HRegion.java:3557)
        at org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutation(HRegion.java:2394)
        at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2261)
        at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2213)
        at org.apache.phoenix.util.IndexUtil.writeLocalUpdates(IndexUtil.java:671)
        at org.apache.phoenix.hbase.index.write.ParallelWriterIndexCommitter$1.call(ParallelWriterIndexCommitter.java:157)
        at org.apache.phoenix.hbase.index.write.ParallelWriterIndexCommitter$1.call(ParallelWriterIndexCommitter.java:134)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
{code}
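For illustration only, here is a minimal, self-contained sketch (plain Java, not Phoenix or HBase code; the class and method names are made up) of the range check that fails above: the mutation's row key sorts outside the data region's [startKey, endKey) range, which is exactly what HRegion.checkRow rejects with WrongRegionException. The byte values are taken from the log message.
{code:java}
import java.nio.charset.StandardCharsets;

public class RowRangeCheckDemo {

    // Unsigned lexicographic byte comparison, the same ordering HBase row keys use.
    static int compare(byte[] a, byte[] b) {
        for (int i = 0; i < Math.min(a.length, b.length); i++) {
            int cmp = (a[i] & 0xff) - (b[i] & 0xff);
            if (cmp != 0) {
                return cmp;
            }
        }
        return a.length - b.length;
    }

    // A row belongs to a region when startKey <= row < endKey (an empty endKey means the last region).
    static boolean rowInRange(byte[] row, byte[] startKey, byte[] endKey) {
        return compare(row, startKey) >= 0 && (endKey.length == 0 || compare(row, endKey) < 0);
    }

    public static void main(String[] args) {
        byte[] startKey = "90ffd783-b0a3-4f8a-81ef-0a7535fea197_0".getBytes(StandardCharsets.UTF_8);
        byte[] endKey   = "917fc343-3331-47fa-907c-df83a6f302f7_0".getBytes(StandardCharsets.UTF_8);

        // The index row key starts with the bytes \x10\x00\x00\x00, which sort below
        // '9' (0x39), so it cannot fall inside this data region's range.
        byte[] suffix = "913f0eed-6710-4de9-8bac-077a106bb9ae_0".getBytes(StandardCharsets.UTF_8);
        byte[] indexRow = new byte[4 + suffix.length];
        indexRow[0] = 0x10; // the remaining three prefix bytes stay 0x00
        System.arraycopy(suffix, 0, indexRow, 4, suffix.length);

        System.out.println("in range? " + rowInRange(indexRow, startKey, endKey)); // prints: in range? false
    }
}
{code}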
The problem is caused by the ParallelWriterIndexCommitter.write method: at line
151 below, if {{allowLocalUpdates}} is true, it writes the index mutations to the
local region unconditionally, which is obviously inappropriate.
{code:java}
150                    try {
151                        if (allowLocalUpdates && env != null) {
152                            try {
153                                throwFailureIfDone();
154                                IndexUtil.writeLocalUpdates(env.getRegion(), mutations, true);
155                                return null;
156                            } catch (IOException ignord) {
157                                // when it's failed we fall back to the standard & slow way
158                                if (LOG.isDebugEnabled()) {
159                                    LOG.debug("indexRegion.batchMutate failed and fall back to HTable.batch(). Got error="
160                                        + ignord);
161                                }
162                            }
163                        }
{code}
When a data table has a global index table and we replay the WALs to the index
table in the Indexer.postOpen method, the call at line 691 below passes the
{{allowLocalUpdates}} parameter as true, so the {{updates}} intended for the index
table are incorrectly written to the current data table region:
{code:java}
688         // do the usual writer stuff, killing the server again, if we can't manage to make the index
689         // writes succeed again
690         try {
691             writer.writeAndKillYourselfOnFailure(updates, true);
692         } catch (IOException e) {
693             LOG.error("During WAL replay of outstanding index updates, "
694                     + "Exception is thrown instead of killing server during index writing", e);
695         }
696     } finally {
{code}
However, the master and the other 4.x branches are correct:
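For comparison, here is a sketch of the guarded write along the lines of what the newer branches do (reconstructed for illustration, not an exact copy of the committed code): the extra table-name check keeps mutations destined for a global index table from being applied to the local data table region.
{code:java}
// Sketch only: write to the local region only when the target index table is the
// same table this region belongs to; otherwise fall through to the normal
// HTable-based write path used for global index tables.
if (allowLocalUpdates
        && env != null
        && tableReference.getTableName().equals(
            env.getRegion().getTableDesc().getNameAsString())) {
    try {
        throwFailureIfDone();
        IndexUtil.writeLocalUpdates(env.getRegion(), mutations, true);
        return null;
    } catch (IOException ignored) {
        // when it fails we fall back to the standard & slow way
        if (LOG.isDebugEnabled()) {
            LOG.debug("indexRegion.batchMutate failed and fall back to HTable.batch(). Got error="
                    + ignored);
        }
    }
}
{code}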
> ParallelWriterIndexCommitter incorrectly applies local updates to index tables for 4.x-HBase-0.98
> ------------------------------------------------------------------------------------------------
>
> Key: PHOENIX-4094
> URL: https://issues.apache.org/jira/browse/PHOENIX-4094
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 4.11.0
> Reporter: chenglei
>
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)