Apache9 commented on code in PR #6263:
URL: https://github.com/apache/hbase/pull/6263#discussion_r1766363193


##########
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSink.java:
##########
@@ -492,16 +492,33 @@ private void batch(TableName tableName, 
Collection<List<Row>> allRows, int batch
       }
       
futures.addAll(batchRows.stream().map(table::batchAll).collect(Collectors.toList()));
     }
+    // Here we will always wait until all futures are finished, even if there 
are failures when
+    // getting from a future in the middle. This is because this method may be 
called in a rpc call,
+    // so the batch operations may reference some off heap cells(through 
CellScanner). If we return
+    // earlier here, the rpc call may be finished and they will release the 
off heap cells before
+    // some of the batch operations finish, and then cause corrupt data or 
even crash the region
+    // server. See HBASE-28584 and HBASE-28850 for more details.
+    IOException error = null;
     for (Future<?> future : futures) {
       try {
         FutureUtils.get(future);
       } catch (RetriesExhaustedException e) {
+        IOException ioe;
         if (e.getCause() instanceof TableNotFoundException) {
-          throw new TableNotFoundException("'" + tableName + "'");
+          ioe = new TableNotFoundException("'" + tableName + "'");
+        } else {
+          ioe = e;
+        }
+        if (error == null) {
+          error = ioe;
+        } else {
+          error.addSuppressed(ioe);

Review Comment:
   ```
     public static void main(String[] args) {
       Exception error = new Exception("root");
       error.addSuppressed(new IOException("suppressed1"));
       error.addSuppressed(new IOException("suppressed2"));
       error.printStackTrace();
     }
   ```
   
   The output
   ```
   java.lang.Exception: root
        at 
org.apache.hadoop.hbase.regionserver.regionreplication.RegionReplicationSink.main(RegionReplicationSink.java:461)
        Suppressed: java.io.IOException: suppressed1
                at 
org.apache.hadoop.hbase.regionserver.regionreplication.RegionReplicationSink.main(RegionReplicationSink.java:462)
        Suppressed: java.io.IOException: suppressed2
                at 
org.apache.hadoop.hbase.regionserver.regionreplication.RegionReplicationSink.main(RegionReplicationSink.java:463)
   ```
   
   So I think it is fine to not log it here?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to