[jira] [Commented] (HBASE-28076) NPE on initialization error in RecoveredReplicationSourceShipper

2023-09-17 Thread Istvan Toth (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17766196#comment-17766196
 ] 

Istvan Toth commented on HBASE-28076:
-

It's an internal branch based on 2.4.12, but it has a lot of backports from 
branch-2.

> NPE on initialization error in RecoveredReplicationSourceShipper
> 
>
> Key: HBASE-28076
> URL: https://issues.apache.org/jira/browse/HBASE-28076
> Project: HBase
> Issue Type: Bug
> Affects Versions: 2.6.0, 2.4.17, 2.5.5
> Reporter: Istvan Toth
> Assignee: Istvan Toth
>Priority: Minor
> Fix For: 2.6.0, 2.4.18, 2.5.6
>
>
> When we run into problems starting RecoveredReplicationSourceShipper, we try 
> to stop the reader thread which we haven't initialized yet, resulting in an 
> NPE.
> {noformat}
> ERROR org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Unexpected exception in redacted currentPath=hdfs://redacted
> java.lang.NullPointerException
>         at org.apache.hadoop.hbase.replication.regionserver.RecoveredReplicationSourceShipper.terminate(RecoveredReplicationSourceShipper.java:100)
>         at org.apache.hadoop.hbase.replication.regionserver.RecoveredReplicationSourceShipper.getRecoveredQueueStartPos(RecoveredReplicationSourceShipper.java:87)
>         at org.apache.hadoop.hbase.replication.regionserver.RecoveredReplicationSourceShipper.getStartPosition(RecoveredReplicationSourceShipper.java:62)
>         at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.lambda$tryStartNewShipper$3(ReplicationSource.java:349)
>         at java.util.concurrent.ConcurrentHashMap.compute(ConcurrentHashMap.java:1853)
>         at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.tryStartNewShipper(ReplicationSource.java:341)
>         at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.initialize(ReplicationSource.java:601)
>         at java.lang.Thread.run(Thread.java:750)
> {noformat}
> A simple null check should fix this.
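
The null check suggested in the description could look roughly like the sketch below. It is only an illustration: `ShipperNullCheckSketch`, `EntryReader`, and `startReader` are hypothetical stand-ins, not the actual HBase classes, and the real patch may differ.

```java
// Hypothetical, simplified stand-ins; these are NOT the actual HBase classes.
final class ShipperNullCheckSketch {

  /** Stand-in for the reader that the shipper creates during initialization. */
  static final class EntryReader {
    private volatile boolean running = true;

    void setReaderRunning(boolean running) {
      this.running = running;
    }

    boolean isReaderRunning() {
      return running;
    }
  }

  // Stays null when initialization fails before the reader is created.
  private volatile EntryReader entryReader;
  private volatile boolean terminated;

  void startReader() {
    entryReader = new EntryReader();
  }

  /** Stops the shipper; tolerates a reader that was never created. */
  void terminate() {
    terminated = true;
    EntryReader reader = entryReader; // read the field once
    if (reader != null) {
      reader.setReaderRunning(false);
    }
  }

  boolean isTerminated() {
    return terminated;
  }

  boolean isReaderRunning() {
    EntryReader reader = entryReader;
    return reader != null && reader.isReaderRunning();
  }

  public static void main(String[] args) {
    // Terminate before the reader exists, as in the reported failure path.
    ShipperNullCheckSketch shipper = new ShipperNullCheckSketch();
    shipper.terminate();
    System.out.println("terminated=" + shipper.isTerminated());
  }
}
```

Copying the field into a local before the check keeps the guard and the call on the same reference, which matters if another thread can assign the field in between.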



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HBASE-28076) NPE on initialization error in RecoveredReplicationSourceShipper

2023-09-17 Thread Duo Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17766186#comment-17766186
 ] 

Duo Zhang commented on HBASE-28076:
---

Which HBase version throws this NPE?



[jira] [Commented] (HBASE-28076) NPE on initialization error in RecoveredReplicationSourceShipper

2023-09-15 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17765760#comment-17765760
 ] 

Hudson commented on HBASE-28076:


Results for branch branch-2
[build #880 on 
builds.a.o|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2/880/]: 
(/) *{color:green}+1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2/880/General_20Nightly_20Build_20Report/]


(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2/880/JDK8_20Nightly_20Build_20Report_20_28Hadoop2_29/]


(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2/880/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2/880/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}




[jira] [Commented] (HBASE-28076) NPE on initialization error in RecoveredReplicationSourceShipper

2023-09-15 Thread Karthik Palanisamy (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17765761#comment-17765761
 ] 

Karthik Palanisamy commented on HBASE-28076:


Thanks [~stoty]. 

[~zhangduo] Aside from this particular issue, there is another race condition 
that results in a NullPointerException. The exact cause of this NPE is 
currently unknown. The code appears to be accessing a queue that no longer 
exists because it was removed before it could be accessed.
{code:java}
2023-09-10 20:02:35,365 ERROR org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Unexpected exception in ReplicationExecutor
..
..
..
java.lang.NullPointerException
        at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.cleanOldLogs(ReplicationSourceManager.java:563)
        at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.logPositionAndCleanOldLogs(ReplicationSourceManager.java:549)
        at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceInterface.logPositionAndCleanOldLogs(ReplicationSourceInterface.java:202)
        at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceShipper.updateLogPosition(ReplicationSourceShipper.java:269)
        at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceShipper.shipEdits(ReplicationSourceShipper.java:163)
        at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceShipper.run(ReplicationSourceShipper.java:119)
{code}
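
Since the cause of this second NPE is unknown, the following is only a generic illustration of how a lookup can race with a removal and how a guard avoids the NPE. The class `QueueLookupSketch`, its map layout, and its method names are assumptions made for the example and are not taken from `ReplicationSourceManager`.

```java
import java.util.Map;
import java.util.NavigableSet;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentSkipListSet;

// Hypothetical model of "WAL names grouped by queue id"; not HBase code.
final class QueueLookupSketch {
  private final Map<String, NavigableSet<String>> walsByQueue = new ConcurrentHashMap<>();

  void addWal(String queueId, String wal) {
    walsByQueue.computeIfAbsent(queueId, k -> new ConcurrentSkipListSet<>()).add(wal);
  }

  /** May run on another thread, for example when a queue is dropped. */
  void removeQueue(String queueId) {
    walsByQueue.remove(queueId);
  }

  /** Drops WALs that sort before currentWal; returns how many were dropped. */
  int cleanOldWals(String queueId, String currentWal) {
    NavigableSet<String> wals = walsByQueue.get(queueId);
    if (wals == null) {
      // The queue was removed between scheduling and lookup; nothing to clean.
      return 0;
    }
    NavigableSet<String> old = wals.headSet(currentWal, false);
    int dropped = old.size();
    old.clear();
    return dropped;
  }

  public static void main(String[] args) {
    QueueLookupSketch sketch = new QueueLookupSketch();
    sketch.addWal("peer1", "wal.1");
    sketch.removeQueue("peer1");
    // Without the null guard this call would dereference a missing queue.
    System.out.println("dropped=" + sketch.cleanOldWals("peer1", "wal.2"));
  }
}
```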



[jira] [Commented] (HBASE-28076) NPE on initialization error in RecoveredReplicationSourceShipper

2023-09-15 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17765733#comment-17765733
 ] 

Hudson commented on HBASE-28076:


Results for branch branch-2.4
[build #616 on 
builds.a.o|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.4/616/]:
 (/) *{color:green}+1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.4/616/General_20Nightly_20Build_20Report/]


(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.4/616/JDK8_20Nightly_20Build_20Report_20_28Hadoop2_29/]


(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.4/616/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.4/616/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}




[jira] [Commented] (HBASE-28076) NPE on initialization error in RecoveredReplicationSourceShipper

2023-09-15 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17765722#comment-17765722
 ] 

Hudson commented on HBASE-28076:


Results for branch branch-2.5
[build #402 on 
builds.a.o|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.5/402/]:
 (x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.5/402/General_20Nightly_20Build_20Report/]


(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.5/402/JDK8_20Nightly_20Build_20Report_20_28Hadoop2_29/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.5/402/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.5/402/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}




[jira] [Commented] (HBASE-28076) NPE on initialization error in RecoveredReplicationSourceShipper

2023-09-15 Thread Duo Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17765568#comment-17765568
 ] 

Duo Zhang commented on HBASE-28076:
---

We have this in ReplicationSource.terminate

{code}
for (ReplicationSourceShipper worker : workers) {
  worker.stopWorker();
  if (worker.entryReader != null) {
    worker.entryReader.setReaderRunning(false);
  }
}

if (this.replicationEndpoint != null) {
  this.replicationEndpoint.stop();
}

for (ReplicationSourceShipper worker : workers) {
  if (worker.isAlive() || worker.entryReader.isAlive()) {
    try {
      // Wait worker to stop
      Thread.sleep(this.sleepForRetries);
    } catch (InterruptedException e) {
      LOG.info("{} Interrupted while waiting {} to stop", logPeerId(), worker.getName());
      Thread.currentThread().interrupt();
    }
    // If worker still is alive after waiting, interrupt it
    if (worker.isAlive()) {
      worker.interrupt();
    }
    // If entry reader is alive after waiting, interrupt it
    if (worker.entryReader.isAlive()) {
      worker.entryReader.interrupt();
    }
  }
  if (!server.isAborted() && !server.isStopped()) {
    // If server is running and worker is already stopped but there was still entries batched,
    // we need to clear buffer used for non processed entries
    worker.clearWALEntryBatch();
  }
}
{code}

A bit strange: we have a null check in the first loop but none in the second 
loop...

Let me take a deeper look into it.
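
For illustration, the second loop could apply the same guard as the first one. The sketch below uses hypothetical stand-in classes (`Worker`, `Reader`, `interruptStragglers`) rather than the real `ReplicationSourceShipper`, leaves out the sleep and the batch clearing, and is not necessarily how the issue was actually fixed.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for the worker and its entry reader; not HBase code.
final class TerminateLoopSketch {

  static final class Reader {
    boolean alive = true;
    boolean interrupted;
  }

  static final class Worker {
    boolean alive;
    boolean interrupted;
    Reader entryReader; // null if the worker never finished initializing

    Worker(boolean alive, Reader entryReader) {
      this.alive = alive;
      this.entryReader = entryReader;
    }
  }

  /** Interrupts whatever is still alive; returns the number of interrupts. */
  static int interruptStragglers(List<Worker> workers) {
    int interrupts = 0;
    for (Worker worker : workers) {
      Reader reader = worker.entryReader;
      // Same guard as in the first loop: the reader may not exist yet.
      boolean readerAlive = reader != null && reader.alive;
      if (worker.alive) {
        worker.interrupted = true;
        interrupts++;
      }
      if (readerAlive) {
        reader.interrupted = true;
        interrupts++;
      }
    }
    return interrupts;
  }

  public static void main(String[] args) {
    List<Worker> workers = Arrays.asList(
        new Worker(true, null),          // failed before creating its reader
        new Worker(true, new Reader())); // fully initialized
    System.out.println("interrupts=" + interruptStragglers(workers));
  }
}
```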



[jira] [Commented] (HBASE-28076) NPE on initialization error in RecoveredReplicationSourceShipper

2023-09-14 Thread Duo Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17765405#comment-17765405
 ] 

Duo Zhang commented on HBASE-28076:
---

IIRC the big refactoring was also done on branch-2. I will take a look at the 
code for master and branch-3 today.

Thanks for providing the patch and for the reminder.



[jira] [Commented] (HBASE-28076) NPE on initialization error in RecoveredReplicationSourceShipper

2023-09-12 Thread Istvan Toth (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17764187#comment-17764187
 ] 

Istvan Toth commented on HBASE-28076:
-

This seems to be happening during shutdown, so I have reduced the severity.

> NPE on initialization error in RecoveredReplicationSourceShipper
> 
>
> Key: HBASE-28076
> URL: https://issues.apache.org/jira/browse/HBASE-28076
> Project: HBase
> Issue Type: Bug
> Affects Versions: 2.6.0, 2.4.17, 2.5.5
> Reporter: Istvan Toth
> Assignee: Istvan Toth
> Priority: Minor
>


[jira] [Commented] (HBASE-28076) NPE on initialization error in RecoveredReplicationSourceShipper

2023-09-12 Thread Istvan Toth (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-28076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17764182#comment-17764182
 ] 

Istvan Toth commented on HBASE-28076:
-

This seems to have been heavily refactored in 3.x; I am not sure if we still 
have this problem there.

> NPE on initialization error in RecoveredReplicationSourceShipper
> 
>
> Key: HBASE-28076
> URL: https://issues.apache.org/jira/browse/HBASE-28076
> Project: HBase
> Issue Type: Bug
> Affects Versions: 2.6.0, 2.4.17, 2.5.5
> Reporter: Istvan Toth
> Assignee: Istvan Toth
> Priority: Major
>