Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
ankitsol commented on PR #7617: URL: https://github.com/apache/hbase/pull/7617#issuecomment-4251464552 Hi @Apache9 @wchevreuil, just a gentle ping on this PR when you get a chance. I’ve addressed the review comments -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
ankitsol commented on code in PR #7617: URL: https://github.com/apache/hbase/pull/7617#discussion_r3085931097 ## hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceShipper.java: ## Review Comment: Yes, we intentionally do not restart here; this path can be reached during normal lifecycle transitions (e.g., config updates), while restarts are only triggered explicitly on failure via abortAndRestart(). I have addressed the below comments as per your suggestions.
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
ankitsol commented on code in PR #7617:
URL: https://github.com/apache/hbase/pull/7617#discussion_r3085912904
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceShipper.java:
##
@@ -215,6 +277,12 @@ private void shipEdits(WALEntryBatch entryBatch) {
          entryBatch.getNbOperations(), (endTimeNs - startTimeNs) / 100);
        }
        break;
+      } catch (IOException ioe) {
+        // Offset-Persist failure is treated as fatal to this shipper since it might come from
+        // beforePersistingReplicationOffset. So abort and restart the Shipper, and WAL reading
+        // will resume from the last successfully persisted offset
+        abortAndRestart(ioe);
Review Comment:
Good point, I’ve updated the code to rethrow the exception and handle
abortAndRestart in the top-level run() loop
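A minimal, self-contained sketch of the rethrow-and-restart shape described above. The names shipEdits/abortAndRestart mirror the discussion; everything else (signatures, message strings) is illustrative, not the actual patch:

```java
import java.io.IOException;

public class ShipperLoopSketch {

  // Hypothetical stand-in: offset persistence fails inside shipEdits and the
  // failure is rethrown to the caller instead of being handled locally.
  static void shipEdits() throws IOException {
    throw new IOException("offset persist failed");
  }

  // Hypothetical stand-in: in HBase this would ask ReplicationSource to
  // replace the current shipper with a new one.
  static void abortAndRestart(Throwable cause) {
    System.out.println("restarting shipper: " + cause.getMessage());
  }

  public static void main(String[] args) {
    try {
      shipEdits();
    } catch (IOException ioe) {
      // single place, at the top level of run(), where restart is triggered
      abortAndRestart(ioe);
    }
  }
}
```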
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
anmolnar commented on code in PR #7617: URL: https://github.com/apache/hbase/pull/7617#discussion_r3080258595 ## hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceShipper.java: ## Review Comment: I might be missing something, but to my understanding we don't restart the worker when we enter here, because that's the "finished" branch. The worker is already in the finished state, because there's no more data to process.
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
anmolnar commented on code in PR #7617:
URL: https://github.com/apache/hbase/pull/7617#discussion_r3080238529
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java:
##
@@ -866,4 +866,24 @@ public long getTotalReplicatedEdits() {
  long getSleepForRetries() {
    return sleepForRetries;
  }
+
+  void restartShipper(String walGroupId, ReplicationSourceShipper oldWorker) {
+    workerThreads.compute(walGroupId, (key, current) -> {
+      if (current != oldWorker) {
+        return current; // already replaced
Review Comment:
Okay, we can leave this safety check here, but it's still unclear to me: if a
single `walGroupId` is strictly handled by a single thread in the worker pool,
how is it possible that that single thread calls for restart twice?
What you say ("multiple threads trigger restart around the same time")
doesn't seem possible, because, as I mentioned above, exactly one thread
deals with a given `walGroupId`.
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
ankitsol commented on code in PR #7617:
URL: https://github.com/apache/hbase/pull/7617#discussion_r3074515880
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java:
##
@@ -866,4 +866,24 @@ public long getTotalReplicatedEdits() {
  long getSleepForRetries() {
    return sleepForRetries;
  }
+
+  void restartShipper(String walGroupId, ReplicationSourceShipper oldWorker) {
+    workerThreads.compute(walGroupId, (key, current) -> {
+      if (current != oldWorker) {
+        return current; // already replaced
Review Comment:
One scenario is concurrent restart attempts, though the chance is very rare
imho. For example, if the same worker encounters failure paths close together
(or multiple threads trigger a restart around the same time), one call to
restartShipper may have already replaced the worker in workerThreads. When the
second call executes, the current entry is no longer oldWorker, so we skip
replacing it again.
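The guard under discussion relies on `ConcurrentHashMap.compute` running the remapping function atomically per key. A self-contained sketch of the replace-only-if-same-reference pattern (map key and objects are illustrative, not the HBase types):

```java
import java.util.concurrent.ConcurrentHashMap;

public class ReplaceIfSameSketch {
  public static void main(String[] args) {
    ConcurrentHashMap<String, Object> workers = new ConcurrentHashMap<>();
    Object oldWorker = new Object();
    workers.put("walGroup-1", oldWorker);

    Object replacement = new Object();
    // first restart attempt: current == oldWorker, so the entry is swapped atomically
    workers.compute("walGroup-1", (k, cur) -> cur == oldWorker ? replacement : cur);
    // second attempt holding the same stale reference: entry already replaced, so it is a no-op
    workers.compute("walGroup-1", (k, cur) -> cur == oldWorker ? new Object() : cur);

    // the worker installed by the first attempt survives
    System.out.println(workers.get("walGroup-1") == replacement);
  }
}
```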
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
Apache9 commented on code in PR #7617:
URL: https://github.com/apache/hbase/pull/7617#discussion_r3044610064
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceShipper.java:
##
Review Comment:
What I can recall is that for some reasons, like updating replication config,
we may terminate a replication source and recreate it before the replication
source finishes, so maybe we do not always need to restart the worker when
entering here. But I'm still a bit nervous that we may miss some scenarios
where we need to restart the worker if we only call abortAndRestart when
catching an exception...
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceShipper.java:
##
@@ -215,6 +277,12 @@ private void shipEdits(WALEntryBatch entryBatch) {
          entryBatch.getNbOperations(), (endTimeNs - startTimeNs) / 100);
        }
        break;
+      } catch (IOException ioe) {
+        // Offset-Persist failure is treated as fatal to this shipper since it might come from
+        // beforePersistingReplicationOffset. So abort and restart the Shipper, and WAL reading
+        // will resume from the last successfully persisted offset
+        abortAndRestart(ioe);
Review Comment:
Better to rethrow the exception to the top layer and call abortAndRestart
there? After returning from this method, we may still have other logic in the
parent method, which may cause inconsistency since the new shipper may already
be started...
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java:
##
@@ -866,4 +866,24 @@ public long getTotalReplicatedEdits() {
  long getSleepForRetries() {
    return sleepForRetries;
  }
+
+  void restartShipper(String walGroupId, ReplicationSourceShipper oldWorker) {
+    workerThreads.compute(walGroupId, (key, current) -> {
+      if (current != oldWorker) {
+        return current; // already replaced
+      }
+
+      LOG.warn("Restarting shipper for walGroupId={}", walGroupId);
+
+      try {
+        ReplicationSourceShipper newWorker = createNewShipper(walGroupId);
+        startShipper(newWorker);
+        return newWorker;
+      } catch (Exception e) {
+        LOG.error("Failed to restart shipper for walGroupId={}", walGroupId, e);
+        return current; // retry later
Review Comment:
Checked the code; createNewShipper and startShipper both do not throw any
exception, so do we really need to catch exceptions here?
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
anmolnar commented on PR #7617: URL: https://github.com/apache/hbase/pull/7617#issuecomment-4210586624 > Looking at this. I don't have additional comments, but waiting to approve until existing comments are resolved. Thanks @charlesconnell , much appreciated. Patiently waiting for @Apache9 to comment on latest changes.
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
anmolnar commented on PR #7617: URL: https://github.com/apache/hbase/pull/7617#issuecomment-4210591233 @taklwu @wchevreuil Would you guys like to update your previous reviews?
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
ankitsol commented on code in PR #7617:
URL: https://github.com/apache/hbase/pull/7617#discussion_r3043017939
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceShipper.java:
##
@@ -215,6 +255,13 @@ private void shipEdits(WALEntryBatch entryBatch) {
          entryBatch.getNbOperations(), (endTimeNs - startTimeNs) / 100);
        }
        break;
+      } catch (IOException ioe) {
+        // Offset-Persist failure is treated as fatal to this worker since it might come from
Review Comment:
Hi @Apache9, just a gentle reminder to please take a look at the code review
when you get a chance
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
anmolnar commented on code in PR #7617:
URL: https://github.com/apache/hbase/pull/7617#discussion_r3016292503
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java:
##
@@ -866,4 +866,24 @@ public long getTotalReplicatedEdits() {
  long getSleepForRetries() {
    return sleepForRetries;
  }
+
+  void restartShipper(String walGroupId, ReplicationSourceShipper oldWorker) {
+    workerThreads.compute(walGroupId, (key, current) -> {
+      if (current != oldWorker) {
+        return current; // already replaced
Review Comment:
How is this possible?
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java:
##
@@ -866,4 +866,24 @@ public long getTotalReplicatedEdits() {
  long getSleepForRetries() {
    return sleepForRetries;
  }
+
+  void restartShipper(String walGroupId, ReplicationSourceShipper oldWorker) {
+    workerThreads.compute(walGroupId, (key, current) -> {
+      if (current != oldWorker) {
+        return current; // already replaced
+      }
+
+      LOG.warn("Restarting shipper for walGroupId={}", walGroupId);
+
+      try {
+        ReplicationSourceShipper newWorker = createNewShipper(walGroupId);
Review Comment:
You might want to reuse `tryStartNewShipper(walGroupId)` here.
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
taklwu commented on code in PR #7617:
URL: https://github.com/apache/hbase/pull/7617#discussion_r3012835303
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java:
##
@@ -866,4 +866,24 @@ public long getTotalReplicatedEdits() {
  long getSleepForRetries() {
    return sleepForRetries;
  }
+
+  void restartShipper(String walGroupId, ReplicationSourceShipper oldWorker) {
+    workerThreads.compute(walGroupId, (key, current) -> {
+      if (current != oldWorker) {
+        return current; // already replaced
+      }
+
+      LOG.warn("Restarting shipper for walGroupId={}", walGroupId);
+
+      try {
+        ReplicationSourceShipper newWorker = createNewShipper(walGroupId);
+        startShipper(newWorker);
+        return newWorker;
+      } catch (Exception e) {
+        LOG.error("Failed to restart shipper for walGroupId={}", walGroupId, e);
+        return current; // retry later
Review Comment:
Can you check whether this comment is right? I used the Cursor AI editor,
which also confirmed this may be a potential bug.
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
Copilot commented on code in PR #7617:
URL: https://github.com/apache/hbase/pull/7617#discussion_r3012357137
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceShipper.java:
##
@@ -229,6 +297,41 @@ private void shipEdits(WALEntryBatch entryBatch) {
}
}
+  private boolean shouldPersistLogPosition() {
+    if (accumulatedSizeSinceLastUpdate == 0 || lastShippedBatch == null) {
+      return false;
+    }
+
+    // Default behaviour to update offset immediately after replicate()
+    if (offsetUpdateSizeThresholdBytes == -1 && offsetUpdateIntervalMs == Long.MAX_VALUE) {
+      return true;
+    }
+
+    return (accumulatedSizeSinceLastUpdate >= offsetUpdateSizeThresholdBytes)
+      || (EnvironmentEdgeManager.currentTime() - lastOffsetUpdateTime >= offsetUpdateIntervalMs);
Review Comment:
`shouldPersistLogPosition` treats `offsetUpdateSizeThresholdBytes == -1` as
a real threshold, so `accumulatedSizeSinceLastUpdate >= -1` will be true
whenever any data has been shipped. This makes it impossible to enable
*time-based only* persistence by setting just
`hbase.replication.shipper.offset.update.interval.ms` (you will still persist
immediately after every batch). Consider treating negative/zero thresholds as
"disabled" for size-based persistence and only applying the size comparison
when the configured threshold is > 0.
```suggestion
    // Treat non-positive size thresholds as "disabled" for size-based persistence.
    boolean sizeThresholdReached = offsetUpdateSizeThresholdBytes > 0
      && accumulatedSizeSinceLastUpdate >= offsetUpdateSizeThresholdBytes;
    boolean timeThresholdReached =
      EnvironmentEdgeManager.currentTime() - lastOffsetUpdateTime >= offsetUpdateIntervalMs;
    return sizeThresholdReached || timeThresholdReached;
```
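To make the sentinel problem concrete, here is a tiny standalone demonstration of the two comparisons. The helper names are hypothetical; only the boolean logic is taken from the discussion above:

```java
public class ThresholdSketch {

  // size check as written in the patch: the -1 "disabled" sentinel still satisfies >=
  static boolean original(long accumulated, long threshold) {
    return accumulated >= threshold;
  }

  // guarded version from the suggestion: non-positive thresholds disable the size check
  static boolean guarded(long accumulated, long threshold) {
    return threshold > 0 && accumulated >= threshold;
  }

  public static void main(String[] args) {
    // one byte accumulated, size threshold left at its default of -1
    System.out.println(original(1, -1)); // fires on every batch, so time-only mode is impossible
    System.out.println(guarded(1, -1));  // size-based persistence correctly disabled
  }
}
```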
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceShipper.java:
##
@@ -118,11 +154,24 @@ public final void run() {
      } else {
        shipEdits(entryBatch);
      }
Review Comment:
With deferred offset persistence enabled (size/interval thresholds),
receiving `WALEntryBatch.NO_MORE_DATA` can transition the worker to `FINISHED`
without ever calling `persistLogPosition()` for the last shipped batch. This
can drop the final offset update and any deferred HFile-ref cleanup, causing
re-replication/recovery work. Consider persisting (if
`accumulatedSizeSinceLastUpdate > 0`) before calling `noMoreData()` / before
setting state to `FINISHED`.
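A minimal sketch of the ordering suggested above. The names persistLogPosition/accumulatedSizeSinceLastUpdate mirror the discussion; the control flow and everything else is illustrative, not the actual HBase code:

```java
public class FinishSketch {
  // pretend some shipped bytes have not had their offset persisted yet
  static long accumulatedSizeSinceLastUpdate = 42;
  static boolean offsetPersisted = false;

  static void persistLogPosition() {
    offsetPersisted = true;              // would also run deferred HFile-ref cleanup
    accumulatedSizeSinceLastUpdate = 0;
  }

  // Hypothetical handler for WALEntryBatch.NO_MORE_DATA: flush any deferred
  // offset before the worker transitions to FINISHED.
  static void onNoMoreData() {
    if (accumulatedSizeSinceLastUpdate > 0) {
      persistLogPosition();              // final offset update before finishing
    }
    // ... then set state to FINISHED
  }

  public static void main(String[] args) {
    onNoMoreData();
    System.out.println(offsetPersisted);
  }
}
```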
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceShipper.java:
##
@@ -190,13 +252,13 @@ private void shipEdits(WALEntryBatch entryBatch) {
        } else {
          sleepMultiplier = Math.max(sleepMultiplier - 1, 0);
        }
-        // Clean up hfile references
-        for (Entry entry : entries) {
-          cleanUpHFileRefs(entry.getEdit());
-          LOG.trace("shipped entry {}: ", entry);
+
+        accumulatedSizeSinceLastUpdate += currentSize;
+        entriesForCleanUpHFileRefs.addAll(entries);
Review Comment:
`entriesForCleanUpHFileRefs.addAll(entries)` retains full `WAL.Entry`
objects across batches until the next offset persist. With large
thresholds/intervals this can hold a large amount of WAL data in memory
(including edits/cells) even though cleanup only needs bulk-load-related
metadata. Consider recording only the minimal information needed for cleanup
(e.g., extracted bulk-load store file refs) or only retaining entries that
actually contain `WALEdit.BULK_LOAD` cells.
```suggestion
        // Only retain WAL entries that contain bulk load cells for later HFile ref cleanup
        for (Entry entry : entries) {
          WALEdit edit = entry.getEdit();
          if (edit == null) {
            continue;
          }
          for (Cell cell : edit.getCells()) {
            if (CellUtil.matchFamily(cell, WALEdit.BULK_LOAD)) {
              entriesForCleanUpHFileRefs.add(entry);
              break;
            }
          }
        }
```
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/ReplicationEndpoint.java:
##
@@ -283,4 +283,11 @@ public int getTimeout() {
* @throws IllegalStateException if this service's state isn't FAILED.
*/
Throwable failureCause();
+
+ /**
+ * Hook invoked before persisting replication offsets. Eg: Buffered
endpoints can flush/close WALs
Review Comment:
Javadoc typo: use "e.g.," instead of "Eg:".
```suggestion
   * Hook invoked before persisting replication offsets, e.g., buffered endpoints can flush/close WALs
```
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
ankitsol commented on code in PR #7617:
URL: https://github.com/apache/hbase/pull/7617#discussion_r3011386961
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceShipper.java:
##
@@ -215,6 +255,13 @@ private void shipEdits(WALEntryBatch entryBatch) {
          entryBatch.getNbOperations(), (endTimeNs - startTimeNs) / 100);
        }
        break;
+      } catch (IOException ioe) {
+        // Offset-Persist failure is treated as fatal to this worker since it might come from
Review Comment:
Hi @Apache9 just a gentle reminder on this new commit when you get a moment.
Thank you!
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
ankitsol commented on PR #7617: URL: https://github.com/apache/hbase/pull/7617#issuecomment-4134778608 Failing test `org.apache.hadoop.hbase.master.procedure.TestSnapshotProcedureRSCrashes.testRegionServerCrashWhileVerifyingSnapshot` is unrelated to this change.
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
ankitsol commented on code in PR #7617:
URL: https://github.com/apache/hbase/pull/7617#discussion_r2994837149
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceShipper.java:
##
@@ -90,6 +97,10 @@ public ReplicationSourceShipper(Configuration conf, String walGroupId, Replicati
      this.conf.getInt("replication.source.getEntries.timeout", DEFAULT_TIMEOUT);
    this.shipEditsTimeout =
      this.conf.getInt(HConstants.REPLICATION_SOURCE_SHIPEDITS_TIMEOUT,
        HConstants.REPLICATION_SOURCE_SHIPEDITS_TIMEOUT_DFAULT);
+    this.offsetUpdateIntervalMs =
+      conf.getLong("hbase.replication.shipper.offset.update.interval.ms", Long.MAX_VALUE);
+    this.offsetUpdateSizeThresholdBytes =
+      conf.getLong("hbase.replication.shipper.offset.update.size.threshold", -1L);
Review Comment:
Extracted these constants as a static final field and added defaults too
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
ankitsol commented on code in PR #7617:
URL: https://github.com/apache/hbase/pull/7617#discussion_r2994846425
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceShipper.java:
##
@@ -106,9 +117,25 @@ public final void run() {
        continue;
      }
      try {
-        WALEntryBatch entryBatch = entryReader.poll(getEntriesTimeout);
+        // check time-based offset persistence
+        if (shouldPersistLogPosition()) {
+          // Trigger offset persistence via existing retry/backoff mechanism in shipEdits()
+          WALEntryBatch emptyBatch = createEmptyBatchForTimeBasedFlush();
+          if (emptyBatch != null) shipEdits(emptyBatch);
Review Comment:
I was thinking to add tests to the final PR against our continuous backup
feature branch once I get approval on code implementation
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
ankitsol commented on code in PR #7617:
URL: https://github.com/apache/hbase/pull/7617#discussion_r2994831828
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceShipper.java:
##
@@ -215,6 +255,13 @@ private void shipEdits(WALEntryBatch entryBatch) {
          entryBatch.getNbOperations(), (endTimeNs - startTimeNs) / 100);
        }
        break;
+      } catch (IOException ioe) {
+        // Offset-Persist failure is treated as fatal to this worker since it might come from
Review Comment:
> Better introduce a abortAndRestart method in ReplicationSourceShipper to
call a method in ReplicationSource to replace the current shipper with a new
one. This can avoid introducing a new monitor thread. And also, we can deal
with InterruptedException with the same way, as usually if we want to shutdown,
we will first change a running or close flag and then interrupt, so when we
want to reschedule, we can check the flag to determine whether we should quit
now.
@Apache9 @anmolnar @taklwu I have implemented this approach; please help me
review these code changes.
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
Apache9 commented on code in PR #7617:
URL: https://github.com/apache/hbase/pull/7617#discussion_r2943848541
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceShipper.java:
##
@@ -90,6 +97,10 @@ public ReplicationSourceShipper(Configuration conf, String walGroupId, Replicati
      this.conf.getInt("replication.source.getEntries.timeout", DEFAULT_TIMEOUT);
    this.shipEditsTimeout =
      this.conf.getInt(HConstants.REPLICATION_SOURCE_SHIPEDITS_TIMEOUT,
        HConstants.REPLICATION_SOURCE_SHIPEDITS_TIMEOUT_DFAULT);
+    this.offsetUpdateIntervalMs =
+      conf.getLong("hbase.replication.shipper.offset.update.interval.ms", Long.MAX_VALUE);
+    this.offsetUpdateSizeThresholdBytes =
+      conf.getLong("hbase.replication.shipper.offset.update.size.threshold", -1L);
Review Comment:
HConstants is IA.Public so generally we do not want to add new fields to it
unless we can confirm they should be accessed by user code directly. But I
agree that we should extract these constants as a static final field.
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceShipper.java:
##
@@ -215,6 +255,13 @@ private void shipEdits(WALEntryBatch entryBatch) {
          entryBatch.getNbOperations(), (endTimeNs - startTimeNs) / 100);
        }
        break;
+      } catch (IOException ioe) {
+        // Offset-Persist failure is treated as fatal to this worker since it might come from
Review Comment:
Better introduce a abortAndRestart method in ReplicationSourceShipper to
call a method in ReplicationSource to replace the current shipper with a new
one. This can avoid introducing a new monitor thread. And also, we can deal
with InterruptedException with the same way, as usually if we want to shutdown,
we will first change a running or close flag and then interrupt, so when we
want to reschedule, we can check the flag to determine whether we should quit
now.
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java:
##
@@ -148,6 +151,12 @@ public class ReplicationSource implements ReplicationSourceInterface {
  public static final int DEFAULT_WAIT_ON_ENDPOINT_SECONDS = 30;
  private int waitOnEndpointSeconds = -1;
+  public static final String SHIPPER_MONITOR_INTERVAL =
Review Comment:
Since the code in shipper is all implemented by us, we can just reschedule a
new shipper when it is dead? I mean, we do not need a monitor thread to do this.
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
taklwu commented on code in PR #7617:
URL: https://github.com/apache/hbase/pull/7617#discussion_r2943368507
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceShipper.java:
##
@@ -90,6 +97,10 @@ public ReplicationSourceShipper(Configuration conf, String walGroupId, Replicati
      this.conf.getInt("replication.source.getEntries.timeout", DEFAULT_TIMEOUT);
    this.shipEditsTimeout =
      this.conf.getInt(HConstants.REPLICATION_SOURCE_SHIPEDITS_TIMEOUT,
        HConstants.REPLICATION_SOURCE_SHIPEDITS_TIMEOUT_DFAULT);
+    this.offsetUpdateIntervalMs =
+      conf.getLong("hbase.replication.shipper.offset.update.interval.ms", Long.MAX_VALUE);
+    this.offsetUpdateSizeThresholdBytes =
+      conf.getLong("hbase.replication.shipper.offset.update.size.threshold", -1L);
Review Comment:
nit: why isn't this feature flag added in `HConstants` ?
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
taklwu commented on code in PR #7617:
URL: https://github.com/apache/hbase/pull/7617#discussion_r2943109855
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceShipper.java:
##
@@ -106,9 +117,25 @@ public final void run() {
        continue;
      }
      try {
-        WALEntryBatch entryBatch = entryReader.poll(getEntriesTimeout);
+        // check time-based offset persistence
+        if (shouldPersistLogPosition()) {
+          // Trigger offset persistence via existing retry/backoff mechanism in shipEdits()
+          WALEntryBatch emptyBatch = createEmptyBatchForTimeBasedFlush();
+          if (emptyBatch != null) shipEdits(emptyBatch);
Review Comment:
nit better to keep bracket even if this is a one-liner
```suggestion
          if (emptyBatch != null) {
            shipEdits(emptyBatch);
          }
```
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceShipper.java:
##
@@ -154,15 +196,16 @@ protected void postFinish() {
  private void shipEdits(WALEntryBatch entryBatch) {
    List<Entry> entries = entryBatch.getWalEntries();
    int sleepMultiplier = 0;
-    if (entries.isEmpty()) {
-      updateLogPosition(entryBatch);
-      return;
-    }
    int currentSize = (int) entryBatch.getHeapSize();
    source.getSourceMetrics()
      .setTimeStampNextToReplicate(entries.get(entries.size() - 1).getKey().getWriteTime());
    while (isActive()) {
      try {
+        if (entries.isEmpty()) {
Review Comment:
@ankitsol is this aligned/updated in the latest patch?
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceShipper.java:
##
@@ -215,6 +255,13 @@ private void shipEdits(WALEntryBatch entryBatch) {
          entryBatch.getNbOperations(), (endTimeNs - startTimeNs) / 100);
        }
        break;
+      } catch (IOException ioe) {
+        // Offset-Persist failure is treated as fatal to this worker since it might come from
+        // beforePersistingReplicationOffset.
+        // ReplicationSource will restart the Shipper, and WAL reading
+        // will resume from the last successfully persisted offset
+        throw new ReplicationRuntimeException(
Review Comment:
k, it looks like the previous updateLogPosition issue
https://github.com/apache/hbase/pull/7617/changes#r2802692158 has been handled
here
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
anmolnar commented on PR #7617: URL: https://github.com/apache/hbase/pull/7617#issuecomment-4031807503 @ankitsol Have you checked the CI output? I see a lot of unit test failures related to replication.
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
ankitsol commented on PR #7617: URL: https://github.com/apache/hbase/pull/7617#issuecomment-402300 @Apache9 just a gentle ping on my previous comment to check if I'm heading in the right direction. Thank you!
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
ankitsol commented on PR #7617: URL: https://github.com/apache/hbase/pull/7617#issuecomment-3996883509 @Apache9 @anmolnar I have now added a periodic shipper monitoring task in ReplicationSource (`shipperMonitorExecutor`) which checks for dead (non-FINISHED) shipper threads and recreates them. When the new shipper starts, it reads the replication offset from the persisted offset storage and resumes WAL reading from that point, so any WAL entries whose offset was not successfully persisted will be replicated again. So now if the endpoint (for example an S3-based implementation) fails during `beforePersistingReplicationOffset()` while committing/flushing data, `persistLogPosition()` throws an `IOException`. In `shipEdits()` this `IOException` is caught and rethrown as a `ReplicationRuntimeException`. This causes the ReplicationSourceShipper worker thread to exit (the run() method interrupts and terminates the thread). The monitor then detects the dead worker and recreates the shipper, which resumes replication from the last persisted offset.
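The monitor-and-recreate flow described in this comment can be sketched roughly as below. This is only an illustrative sketch: `ShipperMonitor`, `WorkerState`, and `needsRestart` are hypothetical names, not the PR's actual API.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of a dead-shipper monitor; the real class and method
// names in the PR may differ.
class ShipperMonitor {

  enum WorkerState { RUNNING, FINISHED }

  // A worker must be recreated when its thread has died without reaching
  // FINISHED, i.e. it aborted (for example on an offset-persist failure).
  static boolean needsRestart(boolean threadAlive, WorkerState state) {
    return !threadAlive && state != WorkerState.FINISHED;
  }

  // Periodically scan workers and recreate dead ones; the recreated shipper
  // would read the last persisted offset and resume WAL reading from there.
  static ScheduledExecutorService start(Runnable checkAndRecreate, long periodMs) {
    ScheduledExecutorService exec = Executors.newSingleThreadScheduledExecutor();
    exec.scheduleAtFixedRate(checkAndRecreate, periodMs, periodMs, TimeUnit.MILLISECONDS);
    return exec;
  }
}
```

The key point is the restart predicate: a FINISHED thread is a normal shutdown, while any other dead thread is treated as an abort to recover from.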
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
Apache9 commented on code in PR #7617:
URL: https://github.com/apache/hbase/pull/7617#discussion_r2844329304
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceShipper.java:
##
@@ -154,15 +196,16 @@ protected void postFinish() {
private void shipEdits(WALEntryBatch entryBatch) {
List<Entry> entries = entryBatch.getWalEntries();
int sleepMultiplier = 0;
-if (entries.isEmpty()) {
- updateLogPosition(entryBatch);
- return;
-}
int currentSize = (int) entryBatch.getHeapSize();
source.getSourceMetrics()
  .setTimeStampNextToReplicate(entries.get(entries.size() - 1).getKey().getWriteTime());
while (isActive()) {
try {
+if (entries.isEmpty()) {
Review Comment:
I'm not sure if the current logic in ReplicationSource can handle shipper
abort, but I think this is a possible way to deal with the problem.
When restarting, we read from the external replication offset storage and
restart from the offset.
I think this could also be used to deal with the updateLogPosition
exception, so we do not need to abort the whole region server.
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
ankitsol commented on code in PR #7617:
URL: https://github.com/apache/hbase/pull/7617#discussion_r2840781191
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceShipper.java:
##
@@ -154,15 +196,16 @@ protected void postFinish() {
private void shipEdits(WALEntryBatch entryBatch) {
List<Entry> entries = entryBatch.getWalEntries();
int sleepMultiplier = 0;
-if (entries.isEmpty()) {
- updateLogPosition(entryBatch);
- return;
-}
int currentSize = (int) entryBatch.getHeapSize();
source.getSourceMetrics()
  .setTimeStampNextToReplicate(entries.get(entries.size() - 1).getKey().getWriteTime());
while (isActive()) {
try {
+if (entries.isEmpty()) {
Review Comment:
Hi @Apache9, just a gentle ping on my previous comment to check if I'm
heading in the right direction. Thank you!
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
ankitsol commented on code in PR #7617:
URL: https://github.com/apache/hbase/pull/7617#discussion_r2827260596
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceShipper.java:
##
@@ -154,15 +196,16 @@ protected void postFinish() {
private void shipEdits(WALEntryBatch entryBatch) {
List<Entry> entries = entryBatch.getWalEntries();
int sleepMultiplier = 0;
-if (entries.isEmpty()) {
- updateLogPosition(entryBatch);
- return;
-}
int currentSize = (int) entryBatch.getHeapSize();
source.getSourceMetrics()
  .setTimeStampNextToReplicate(entries.get(entries.size() - 1).getKey().getWriteTime());
while (isActive()) {
try {
+if (entries.isEmpty()) {
Review Comment:
run() method has this catch block to handle `ReplicationRuntimeException`
```
catch (InterruptedException | ReplicationRuntimeException e) {
// It is interrupted and needs to quit.
LOG.warn("Interrupted while waiting for next replication entry batch", e);
Thread.currentThread().interrupt();
}
```
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
ankitsol commented on code in PR #7617:
URL: https://github.com/apache/hbase/pull/7617#discussion_r2827254931
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceShipper.java:
##
@@ -154,15 +196,16 @@ protected void postFinish() {
private void shipEdits(WALEntryBatch entryBatch) {
List<Entry> entries = entryBatch.getWalEntries();
int sleepMultiplier = 0;
-if (entries.isEmpty()) {
- updateLogPosition(entryBatch);
- return;
-}
int currentSize = (int) entryBatch.getHeapSize();
source.getSourceMetrics()
  .setTimeStampNextToReplicate(entries.get(entries.size() - 1).getKey().getWriteTime());
while (isActive()) {
try {
+if (entries.isEmpty()) {
Review Comment:
Currently in `shipEdits()`, the only place where an IOException can be
raised is from `persistLogPosition()` (via
`beforePersistingReplicationOffset()` or `cleanUpHFileRefs()`).
Based on your earlier comment about restarting from the last persisted
offset and resending WAL entries, I am thinking of splitting the existing catch
block into two parts:
```
catch (IOException ioe) {
  // Persist failure → fatal → restart worker → WAL replay from last persisted offset
  throw new ReplicationRuntimeException(
    "Failed to persist replication offset; restarting worker to replay WAL", ioe);
} catch (Exception ex) {
  // Replication/transient failure → retry current batch (existing behaviour)
  source.getSourceMetrics().incrementFailedBatches();
  LOG.warn("{} threw unknown exception:", source.getReplicationEndpoint().getClass().getName(), ex);
  if (sleepForRetries("ReplicationEndpoint threw exception", sleepForRetries, sleepMultiplier,
      maxRetriesMultiplier)) {
    sleepMultiplier++;
  }
}
```
The intention is that propagating `ReplicationRuntimeException` will cause
the shipper worker to exit, and ReplicationSource will recreate it, so WAL
reading resumes from the last persisted offset and all batches since then are
resent.
Does this approach match what you had in mind? @Apache9
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
anmolnar commented on code in PR #7617:
URL: https://github.com/apache/hbase/pull/7617#discussion_r2819270897
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceShipper.java:
##
@@ -154,15 +196,16 @@ protected void postFinish() {
private void shipEdits(WALEntryBatch entryBatch) {
List<Entry> entries = entryBatch.getWalEntries();
int sleepMultiplier = 0;
-if (entries.isEmpty()) {
- updateLogPosition(entryBatch);
- return;
-}
int currentSize = (int) entryBatch.getHeapSize();
source.getSourceMetrics()
  .setTimeStampNextToReplicate(entries.get(entries.size() - 1).getKey().getWriteTime());
while (isActive()) {
try {
+if (entries.isEmpty()) {
Review Comment:
Can we just abort this shipper?
Is `ReplicationSource` going to create a new one if it's aborted?
In that case it will pick up at the last persisted position and retry
correctly, won't it?
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
ankitsol commented on code in PR #7617:
URL: https://github.com/apache/hbase/pull/7617#discussion_r2817632340
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceShipper.java:
##
@@ -229,6 +272,41 @@ private void shipEdits(WALEntryBatch entryBatch) {
}
}
+ private boolean shouldPersistLogPosition() {
+if (accumulatedSizeSinceLastUpdate == 0 || lastShippedBatch == null) {
+ return false;
+}
+
+// Default behaviour to update offset immediately after replicate()
+if (offsetUpdateSizeThresholdBytes == -1 && offsetUpdateIntervalMs == Long.MAX_VALUE) {
Review Comment:
Yes, this is not needed; I just added it here to make the default behaviour explicit.
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
ankitsol commented on code in PR #7617:
URL: https://github.com/apache/hbase/pull/7617#discussion_r2817553228
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceShipper.java:
##
@@ -90,6 +99,22 @@ public ReplicationSourceShipper(Configuration conf, String walGroupId, Replicati
this.conf.getInt("replication.source.getEntries.timeout", DEFAULT_TIMEOUT);
this.shipEditsTimeout = this.conf.getInt(HConstants.REPLICATION_SOURCE_SHIPEDITS_TIMEOUT,
  HConstants.REPLICATION_SOURCE_SHIPEDITS_TIMEOUT_DFAULT);
+ReplicationPeer peer = source.getPeer();
+if (peer != null && peer.getPeerConfig() != null) {
Review Comment:
Thank you for pointing this out; this merge of the peer config happens in ReplicationSourceManager.java.
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
Apache9 commented on code in PR #7617:
URL: https://github.com/apache/hbase/pull/7617#discussion_r2802689035
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceShipper.java:
##
@@ -229,6 +272,41 @@ private void shipEdits(WALEntryBatch entryBatch) {
}
}
+ private boolean shouldPersistLogPosition() {
+if (accumulatedSizeSinceLastUpdate == 0 || lastShippedBatch == null) {
+ return false;
+}
+
+// Default behaviour to update offset immediately after replicate()
+if (offsetUpdateSizeThresholdBytes == -1 && offsetUpdateIntervalMs == Long.MAX_VALUE) {
Review Comment:
This is a bit strange...
If offsetUpdateSizeThresholdBytes is -1, then the check below, accumulatedSizeSinceLastUpdate >= offsetUpdateSizeThresholdBytes, will always return true, so we do not need this check here?
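A minimal sketch of the decision being discussed (the class and parameter names are illustrative, not the PR's exact code): with a size threshold of -1, any positive accumulated size already satisfies the size comparison, so no separate default-behaviour branch is needed.

```java
// Hypothetical, simplified version of shouldPersistLogPosition(); field names
// mirror the quoted diff, but this is not the PR's actual implementation.
class OffsetPersistPolicy {
  static boolean shouldPersist(long accumulatedSize, boolean hasLastBatch,
      long sizeThresholdBytes, long intervalMs, long sinceLastUpdateMs) {
    if (accumulatedSize == 0 || !hasLastBatch) {
      return false;
    }
    // sizeThresholdBytes == -1 means "persist after every batch": any
    // positive accumulatedSize satisfies accumulatedSize >= -1, so an
    // explicit default-behaviour check is redundant.
    return accumulatedSize >= sizeThresholdBytes || sinceLastUpdateMs >= intervalMs;
  }
}
```

For example, with a threshold of -1 the method returns true for any non-empty batch, which reproduces the default persist-immediately behaviour without a dedicated branch.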
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceShipper.java:
##
@@ -105,10 +130,17 @@ public final void run() {
sleepForRetries("Replication is disabled", sleepForRetries, 1, maxRetriesMultiplier);
continue;
}
+ // check time-based offset persistence
+ if (shouldPersistLogPosition()) {
Review Comment:
So this wants to solve the problem that, since we may not always persist the offset when shipping, it is possible that we shipped out a batch and then did not get a new batch for a very long time, so without a check here there is no way for us to persist the offset?
I think we also need to change the poll timeout below; we should not wait longer than the remaining time of the update offset interval?
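The poll-timeout change suggested here could look roughly like the helper below. This is a sketch under assumed names (`PollTimeout`, `bounded`), not the PR's code: cap the entry-reader poll at the time remaining before the next offset persist is due.

```java
// Hypothetical helper: bound the entryReader.poll() timeout by the time left
// in the offset-update interval, so a quiet stream still persists on time.
class PollTimeout {
  static long bounded(long getEntriesTimeoutMs, long offsetUpdateIntervalMs,
      long lastPersistTs, long nowTs) {
    long remaining = offsetUpdateIntervalMs - (nowTs - lastPersistTs);
    if (remaining < 0) {
      remaining = 0; // persist is already overdue; poll should return quickly
    }
    return Math.min(getEntriesTimeoutMs, remaining);
  }
}
```

With this bound, a shipper that last persisted 50 s ago under a 60 s interval polls for at most 10 s, instead of a full `getEntriesTimeout`, before the time-based persist check runs again.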
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceShipper.java:
##
@@ -90,6 +99,22 @@ public ReplicationSourceShipper(Configuration conf, String walGroupId, Replicati
this.conf.getInt("replication.source.getEntries.timeout", DEFAULT_TIMEOUT);
this.shipEditsTimeout = this.conf.getInt(HConstants.REPLICATION_SOURCE_SHIPEDITS_TIMEOUT,
  HConstants.REPLICATION_SOURCE_SHIPEDITS_TIMEOUT_DFAULT);
+ReplicationPeer peer = source.getPeer();
+if (peer != null && peer.getPeerConfig() != null) {
Review Comment:
The conf here already contains all the configurations in the peer config. You can see the implementation of ReplicationPeers for more details; we create a CompoundConfiguration and apply peerConfig.getConfiguration() to it.
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceShipper.java:
##
@@ -154,15 +196,16 @@ protected void postFinish() {
private void shipEdits(WALEntryBatch entryBatch) {
List<Entry> entries = entryBatch.getWalEntries();
int sleepMultiplier = 0;
-if (entries.isEmpty()) {
- updateLogPosition(entryBatch);
- return;
-}
int currentSize = (int) entryBatch.getHeapSize();
source.getSourceMetrics()
  .setTimeStampNextToReplicate(entries.get(entries.size() - 1).getKey().getWriteTime());
while (isActive()) {
try {
+if (entries.isEmpty()) {
Review Comment:
Why move this here?
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceShipper.java:
##
@@ -154,15 +196,16 @@ protected void postFinish() {
private void shipEdits(WALEntryBatch entryBatch) {
List<Entry> entries = entryBatch.getWalEntries();
int sleepMultiplier = 0;
-if (entries.isEmpty()) {
- updateLogPosition(entryBatch);
- return;
-}
int currentSize = (int) entryBatch.getHeapSize();
source.getSourceMetrics()
  .setTimeStampNextToReplicate(entries.get(entries.size() - 1).getKey().getWriteTime());
while (isActive()) {
try {
+if (entries.isEmpty()) {
Review Comment:
OK, I guess I know why you move this here: in the past updateLogPosition could not fail (it would lead to an abort, so we could assume it never fails), but now since we have introduced a callback method, it can fail.
Then I do not think this is the correct way to deal with it. Consider the S3 based solution: if you fail to commit the file on S3, the correct way is to send the data again, but here we just retry committing... I think we should restart from the previous persisted offset and send the data again.
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
Apache9 commented on PR #7617: URL: https://github.com/apache/hbase/pull/7617#issuecomment-3894748231 Will take a look this afternoon. Please give me some time. Thanks.
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
anmolnar commented on code in PR #7617:
URL: https://github.com/apache/hbase/pull/7617#discussion_r2789440736
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceShipper.java:
##
@@ -229,6 +247,41 @@ private void shipEdits(WALEntryBatch entryBatch) {
}
}
+ private boolean shouldPersistLogPosition() {
+ReplicationEndpoint endpoint = source.getReplicationEndpoint();
+long maxBufferSize = endpoint.getMaxBufferSize();
+if (stagedWalSize == 0 || lastShippedBatch == null) {
+ return false;
+}
+if (maxBufferSize == -1) {
+ return true;
+}
+return stagedWalSize >= maxBufferSize
+ || (EnvironmentEdgeManager.currentTime() - lastStagedFlushTs >= endpoint.maxFlushInterval());
+ }
+
+ private void persistLogPosition() throws IOException {
+if (lastShippedBatch == null) {
Review Comment:
We cannot. This method is not getting called if lastShippedBatch is null in
the current implementation. We can still keep this check here for safety
reasons.
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
Apache-HBase commented on PR #7617:
URL: https://github.com/apache/hbase/pull/7617#issuecomment-3860012449
:broken_heart: **-1 overall**
| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|:--------|:-------:|:-------:|
| +0 :ok: | reexec | 0m 37s | | Docker mode activated. |
| -0 :warning: | yetus | 0m 3s | | Unprocessed flag(s):
--brief-report-file --spotbugs-strict-precheck --author-ignore-list
--blanks-eol-ignore-file --blanks-tabs-ignore-file --quick-hadoopcheck |
_ Prechecks _ |
_ master Compile Tests _ |
| +1 :green_heart: | mvninstall | 2m 34s | | master passed |
| +1 :green_heart: | compile | 0m 46s | | master passed |
| +1 :green_heart: | javadoc | 0m 24s | | master passed |
| +1 :green_heart: | shadedjars | 4m 26s | | branch has no errors when
building our shaded downstream artifacts. |
_ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 2m 17s | | the patch passed |
| +1 :green_heart: | compile | 0m 47s | | the patch passed |
| +1 :green_heart: | javac | 0m 47s | | the patch passed |
| +1 :green_heart: | javadoc | 0m 21s | | the patch passed |
| +1 :green_heart: | shadedjars | 4m 21s | | patch has no errors when
building our shaded downstream artifacts. |
_ Other Tests _ |
| -1 :x: | unit | 28m 34s |
[/patch-unit-hbase-server.txt](https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7617/8/artifact/yetus-jdk17-hadoop3-check/output/patch-unit-hbase-server.txt)
| hbase-server in the patch failed. |
| | | 46m 44s | | |
| Subsystem | Report/Notes |
|----------:|:----------|
| Docker | ClientAPI=1.53 ServerAPI=1.53 base:
https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7617/8/artifact/yetus-jdk17-hadoop3-check/output/Dockerfile
|
| GITHUB PR | https://github.com/apache/hbase/pull/7617 |
| Optional Tests | javac javadoc unit compile shadedjars |
| uname | Linux fda2279b6c5b 6.14.0-1018-aws #18~24.04.1-Ubuntu SMP Mon Nov
24 19:46:27 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | master / 640308bf96121abbabd3f9fcad9ba094699af23d |
| Default Java | Eclipse Adoptium-17.0.11+9 |
| Test Results |
https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7617/8/testReport/
|
| Max. process+thread count | 1419 (vs. ulimit of 3) |
| modules | C: hbase-server U: hbase-server |
| Console output |
https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7617/8/console
|
| versions | git=2.34.1 maven=3.9.8 |
| Powered by | Apache Yetus 0.15.0 https://yetus.apache.org |
This message was automatically generated.
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
Apache-HBase commented on PR #7617:
URL: https://github.com/apache/hbase/pull/7617#issuecomment-3859957472
:confetti_ball: **+1 overall**
| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|:--------|:-------:|:-------:|
| +0 :ok: | reexec | 0m 11s | | Docker mode activated. |
_ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files
found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available.
|
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain
any @author tags. |
| +1 :green_heart: | hbaseanti | 0m 0s | | Patch does not have any
anti-patterns. |
_ master Compile Tests _ |
| +1 :green_heart: | mvninstall | 2m 55s | | master passed |
| +1 :green_heart: | compile | 2m 54s | | master passed |
| +1 :green_heart: | checkstyle | 0m 51s | | master passed |
| +1 :green_heart: | spotbugs | 1m 33s | | master passed |
| +1 :green_heart: | spotless | 0m 46s | | branch has no errors when
running spotless:check. |
_ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 2m 30s | | the patch passed |
| +1 :green_heart: | compile | 2m 56s | | the patch passed |
| +1 :green_heart: | javac | 2m 56s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks
issues. |
| -0 :warning: | checkstyle | 0m 48s |
[/results-checkstyle-hbase-server.txt](https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7617/8/artifact/yetus-general-check/output/results-checkstyle-hbase-server.txt)
| hbase-server: The patch generated 1 new + 0 unchanged - 0 fixed = 1 total
(was 0) |
| +1 :green_heart: | spotbugs | 1m 42s | | the patch passed |
| +1 :green_heart: | hadoopcheck | 9m 21s | | Patch does not cause any
errors with Hadoop 3.3.6 3.4.1. |
| +1 :green_heart: | spotless | 0m 39s | | patch has no errors when
running spotless:check. |
_ Other Tests _ |
| +1 :green_heart: | asflicense | 0m 10s | | The patch does not
generate ASF License warnings. |
| | | 33m 19s | | |
| Subsystem | Report/Notes |
|----------:|:----------|
| Docker | ClientAPI=1.53 ServerAPI=1.53 base:
https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7617/8/artifact/yetus-general-check/output/Dockerfile
|
| GITHUB PR | https://github.com/apache/hbase/pull/7617 |
| Optional Tests | dupname asflicense javac spotbugs checkstyle codespell
detsecrets compile hadoopcheck hbaseanti spotless |
| uname | Linux 867472c88f7c 6.14.0-1018-aws #18~24.04.1-Ubuntu SMP Mon Nov
24 19:46:27 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | master / 640308bf96121abbabd3f9fcad9ba094699af23d |
| Default Java | Eclipse Adoptium-17.0.11+9 |
| Max. process+thread count | 85 (vs. ulimit of 3) |
| modules | C: hbase-server U: hbase-server |
| Console output |
https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7617/8/console
|
| versions | git=2.34.1 maven=3.9.8 spotbugs=4.7.3 |
| Powered by | Apache Yetus 0.15.0 https://yetus.apache.org |
This message was automatically generated.
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
Apache-HBase commented on PR #7617:
URL: https://github.com/apache/hbase/pull/7617#issuecomment-3858344435
:broken_heart: **-1 overall**
| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|:--------|:-------:|:-------:|
| +0 :ok: | reexec | 0m 32s | | Docker mode activated. |
| -0 :warning: | yetus | 0m 5s | | Unprocessed flag(s):
--brief-report-file --spotbugs-strict-precheck --author-ignore-list
--blanks-eol-ignore-file --blanks-tabs-ignore-file --quick-hadoopcheck |
_ Prechecks _ |
_ master Compile Tests _ |
| +1 :green_heart: | mvninstall | 3m 38s | | master passed |
| +1 :green_heart: | compile | 1m 1s | | master passed |
| +1 :green_heart: | javadoc | 0m 33s | | master passed |
| +1 :green_heart: | shadedjars | 5m 57s | | branch has no errors when
building our shaded downstream artifacts. |
_ Patch Compile Tests _ |
| -1 :x: | mvninstall | 1m 44s |
[/patch-mvninstall-root.txt](https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7617/7/artifact/yetus-jdk17-hadoop3-check/output/patch-mvninstall-root.txt)
| root in the patch failed. |
| -1 :x: | compile | 0m 59s |
[/patch-compile-hbase-server.txt](https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7617/7/artifact/yetus-jdk17-hadoop3-check/output/patch-compile-hbase-server.txt)
| hbase-server in the patch failed. |
| -0 :warning: | javac | 0m 59s |
[/patch-compile-hbase-server.txt](https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7617/7/artifact/yetus-jdk17-hadoop3-check/output/patch-compile-hbase-server.txt)
| hbase-server in the patch failed. |
| +1 :green_heart: | javadoc | 0m 27s | | the patch passed |
| -1 :x: | shadedjars | 4m 29s | | patch has 16 errors when building
our shaded downstream artifacts. |
_ Other Tests _ |
| -1 :x: | unit | 1m 0s |
[/patch-unit-hbase-server.txt](https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7617/7/artifact/yetus-jdk17-hadoop3-check/output/patch-unit-hbase-server.txt)
| hbase-server in the patch failed. |
| | | 21m 27s | | |
| Subsystem | Report/Notes |
|----------:|:----------|
| Docker | ClientAPI=1.43 ServerAPI=1.43 base:
https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7617/7/artifact/yetus-jdk17-hadoop3-check/output/Dockerfile
|
| GITHUB PR | https://github.com/apache/hbase/pull/7617 |
| Optional Tests | javac javadoc unit compile shadedjars |
| uname | Linux d235f58c2160 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May
23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | master / cfc9d908eb46fd943e83f0f6288d1eecc3ce16ad |
| Default Java | Eclipse Adoptium-17.0.11+9 |
| shadedjars |
https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7617/7/artifact/yetus-jdk17-hadoop3-check/output/patch-shadedjars.txt
|
| Test Results |
https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7617/7/testReport/
|
| Max. process+thread count | 80 (vs. ulimit of 3) |
| modules | C: hbase-server U: hbase-server |
| Console output |
https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7617/7/console
|
| versions | git=2.34.1 maven=3.9.8 |
| Powered by | Apache Yetus 0.15.0 https://yetus.apache.org |
This message was automatically generated.
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
Apache-HBase commented on PR #7617:
URL: https://github.com/apache/hbase/pull/7617#issuecomment-3858309287
:broken_heart: **-1 overall**
| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|:--------|:-------:|:-------:|
| +0 :ok: | reexec | 0m 12s | | Docker mode activated. |
_ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files
found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available.
|
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain
any @author tags. |
| +1 :green_heart: | hbaseanti | 0m 0s | | Patch does not have any
anti-patterns. |
_ master Compile Tests _ |
| +1 :green_heart: | mvninstall | 2m 15s | | master passed |
| +1 :green_heart: | compile | 2m 34s | | master passed |
| +1 :green_heart: | checkstyle | 0m 46s | | master passed |
| +1 :green_heart: | spotbugs | 1m 14s | | master passed |
| +1 :green_heart: | spotless | 0m 34s | | branch has no errors when
running spotless:check. |
_ Patch Compile Tests _ |
| -1 :x: | mvninstall | 1m 15s |
[/patch-mvninstall-root.txt](https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7617/7/artifact/yetus-general-check/output/patch-mvninstall-root.txt)
| root in the patch failed. |
| -1 :x: | compile | 1m 24s |
[/patch-compile-hbase-server.txt](https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7617/7/artifact/yetus-general-check/output/patch-compile-hbase-server.txt)
| hbase-server in the patch failed. |
| -0 :warning: | javac | 1m 24s |
[/patch-compile-hbase-server.txt](https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7617/7/artifact/yetus-general-check/output/patch-compile-hbase-server.txt)
| hbase-server in the patch failed. |
| +1 :green_heart: | blanks | 0m 1s | | The patch has no blanks
issues. |
| -0 :warning: | checkstyle | 0m 45s |
[/results-checkstyle-hbase-server.txt](https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7617/7/artifact/yetus-general-check/output/results-checkstyle-hbase-server.txt)
| hbase-server: The patch generated 1 new + 0 unchanged - 0 fixed = 1 total
(was 0) |
| -1 :x: | spotbugs | 0m 38s |
[/patch-spotbugs-hbase-server.txt](https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7617/7/artifact/yetus-general-check/output/patch-spotbugs-hbase-server.txt)
| hbase-server in the patch failed. |
| -1 :x: | hadoopcheck | 1m 28s | | The patch causes 16 errors with
Hadoop v3.3.6. |
| -1 :x: | hadoopcheck | 2m 57s | | The patch causes 16 errors with
Hadoop v3.4.1. |
| +1 :green_heart: | spotless | 0m 33s | | patch has no errors when
running spotless:check. |
_ Other Tests _ |
| +1 :green_heart: | asflicense | 0m 6s | | The patch does not
generate ASF License warnings. |
| | | 16m 26s | | |
| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.53 ServerAPI=1.53 base:
https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7617/7/artifact/yetus-general-check/output/Dockerfile
|
| GITHUB PR | https://github.com/apache/hbase/pull/7617 |
| Optional Tests | dupname asflicense javac spotbugs checkstyle codespell
detsecrets compile hadoopcheck hbaseanti spotless |
| uname | Linux 8c4a7b145330 6.14.0-1018-aws #18~24.04.1-Ubuntu SMP Mon Nov
24 19:46:27 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | master / cfc9d908eb46fd943e83f0f6288d1eecc3ce16ad |
| Default Java | Eclipse Adoptium-17.0.11+9 |
| hadoopcheck |
https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7617/7/artifact/yetus-general-check/output/patch-javac-3.3.6.txt
|
| hadoopcheck |
https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7617/7/artifact/yetus-general-check/output/patch-javac-3.4.1.txt
|
| Max. process+thread count | 85 (vs. ulimit of 3) |
| modules | C: hbase-server U: hbase-server |
| Console output |
https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7617/7/console
|
| versions | git=2.34.1 maven=3.9.8 spotbugs=4.7.3 |
| Powered by | Apache Yetus 0.15.0 https://yetus.apache.org |
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
Apache-HBase commented on PR #7617:
URL: https://github.com/apache/hbase/pull/7617#issuecomment-3853691977
:broken_heart: **-1 overall**
| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|:--------|:-------:|:-------:|
| +0 :ok: | reexec | 0m 11s | | Docker mode activated. |
_ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files
found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available.
|
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain
any @author tags. |
| +1 :green_heart: | hbaseanti | 0m 0s | | Patch does not have any
anti-patterns. |
_ master Compile Tests _ |
| +1 :green_heart: | mvninstall | 2m 18s | | master passed |
| +1 :green_heart: | compile | 2m 38s | | master passed |
| +1 :green_heart: | checkstyle | 0m 46s | | master passed |
| +1 :green_heart: | spotbugs | 1m 14s | | master passed |
| +1 :green_heart: | spotless | 0m 35s | | branch has no errors when
running spotless:check. |
_ Patch Compile Tests _ |
| -1 :x: | mvninstall | 1m 15s |
[/patch-mvninstall-root.txt](https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7617/6/artifact/yetus-general-check/output/patch-mvninstall-root.txt)
| root in the patch failed. |
| -1 :x: | compile | 1m 23s |
[/patch-compile-hbase-server.txt](https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7617/6/artifact/yetus-general-check/output/patch-compile-hbase-server.txt)
| hbase-server in the patch failed. |
| -0 :warning: | javac | 1m 23s |
[/patch-compile-hbase-server.txt](https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7617/6/artifact/yetus-general-check/output/patch-compile-hbase-server.txt)
| hbase-server in the patch failed. |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks
issues. |
| -0 :warning: | checkstyle | 0m 45s |
[/results-checkstyle-hbase-server.txt](https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7617/6/artifact/yetus-general-check/output/results-checkstyle-hbase-server.txt)
| hbase-server: The patch generated 1 new + 0 unchanged - 0 fixed = 1 total
(was 0) |
| -1 :x: | spotbugs | 0m 39s |
[/patch-spotbugs-hbase-server.txt](https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7617/6/artifact/yetus-general-check/output/patch-spotbugs-hbase-server.txt)
| hbase-server in the patch failed. |
| -1 :x: | hadoopcheck | 1m 29s | | The patch causes 16 errors with
Hadoop v3.3.6. |
| -1 :x: | hadoopcheck | 2m 57s | | The patch causes 16 errors with
Hadoop v3.4.1. |
| +1 :green_heart: | spotless | 0m 33s | | patch has no errors when
running spotless:check. |
_ Other Tests _ |
| +1 :green_heart: | asflicense | 0m 6s | | The patch does not
generate ASF License warnings. |
| | | 16m 27s | | |
| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.53 ServerAPI=1.53 base:
https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7617/6/artifact/yetus-general-check/output/Dockerfile
|
| GITHUB PR | https://github.com/apache/hbase/pull/7617 |
| Optional Tests | dupname asflicense javac spotbugs checkstyle codespell
detsecrets compile hadoopcheck hbaseanti spotless |
| uname | Linux 5ca63016945b 6.14.0-1018-aws #18~24.04.1-Ubuntu SMP Mon Nov
24 19:46:27 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | master / f7e81880f2fb14a4277b9ce69ca597e82e3b7854 |
| Default Java | Eclipse Adoptium-17.0.11+9 |
| hadoopcheck |
https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7617/6/artifact/yetus-general-check/output/patch-javac-3.3.6.txt
|
| hadoopcheck |
https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7617/6/artifact/yetus-general-check/output/patch-javac-3.4.1.txt
|
| Max. process+thread count | 84 (vs. ulimit of 3) |
| modules | C: hbase-server U: hbase-server |
| Console output |
https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7617/6/console
|
| versions | git=2.34.1 maven=3.9.8 spotbugs=4.7.3 |
| Powered by | Apache Yetus 0.15.0 https://yetus.apache.org |
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
Apache-HBase commented on PR #7617:
URL: https://github.com/apache/hbase/pull/7617#issuecomment-3853686338
:broken_heart: **-1 overall**
| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|:--------|:-------:|:-------:|
| +0 :ok: | reexec | 0m 12s | | Docker mode activated. |
| -0 :warning: | yetus | 0m 4s | | Unprocessed flag(s):
--brief-report-file --spotbugs-strict-precheck --author-ignore-list
--blanks-eol-ignore-file --blanks-tabs-ignore-file --quick-hadoopcheck |
_ Prechecks _ |
_ master Compile Tests _ |
| +1 :green_heart: | mvninstall | 2m 37s | | master passed |
| +1 :green_heart: | compile | 0m 47s | | master passed |
| +1 :green_heart: | javadoc | 0m 23s | | master passed |
| +1 :green_heart: | shadedjars | 4m 28s | | branch has no errors when
building our shaded downstream artifacts. |
_ Patch Compile Tests _ |
| -1 :x: | mvninstall | 1m 16s |
[/patch-mvninstall-root.txt](https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7617/6/artifact/yetus-jdk17-hadoop3-check/output/patch-mvninstall-root.txt)
| root in the patch failed. |
| -1 :x: | compile | 0m 46s |
[/patch-compile-hbase-server.txt](https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7617/6/artifact/yetus-jdk17-hadoop3-check/output/patch-compile-hbase-server.txt)
| hbase-server in the patch failed. |
| -0 :warning: | javac | 0m 46s |
[/patch-compile-hbase-server.txt](https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7617/6/artifact/yetus-jdk17-hadoop3-check/output/patch-compile-hbase-server.txt)
| hbase-server in the patch failed. |
| +1 :green_heart: | javadoc | 0m 21s | | the patch passed |
| -1 :x: | shadedjars | 3m 18s | | patch has 16 errors when building
our shaded downstream artifacts. |
_ Other Tests _ |
| -1 :x: | unit | 0m 46s |
[/patch-unit-hbase-server.txt](https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7617/6/artifact/yetus-jdk17-hadoop3-check/output/patch-unit-hbase-server.txt)
| hbase-server in the patch failed. |
| | | 15m 39s | | |
| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.53 ServerAPI=1.53 base:
https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7617/6/artifact/yetus-jdk17-hadoop3-check/output/Dockerfile
|
| GITHUB PR | https://github.com/apache/hbase/pull/7617 |
| Optional Tests | javac javadoc unit compile shadedjars |
| uname | Linux 542437302e36 6.14.0-1018-aws #18~24.04.1-Ubuntu SMP Mon Nov
24 19:46:27 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | master / f7e81880f2fb14a4277b9ce69ca597e82e3b7854 |
| Default Java | Eclipse Adoptium-17.0.11+9 |
| shadedjars |
https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7617/6/artifact/yetus-jdk17-hadoop3-check/output/patch-shadedjars.txt
|
| Test Results |
https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7617/6/testReport/
|
| Max. process+thread count | 75 (vs. ulimit of 3) |
| modules | C: hbase-server U: hbase-server |
| Console output |
https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7617/6/console
|
| versions | git=2.34.1 maven=3.9.8 |
| Powered by | Apache Yetus 0.15.0 https://yetus.apache.org |
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
ankitsol commented on PR #7617: URL: https://github.com/apache/hbase/pull/7617#issuecomment-3853556817 I have moved both the size- and time-based configs to the ReplicationEndpoint peerConfig, with defaults in ReplicationSourceShipper. I am now re-using the ReplicationSourceShipper#shipEdits() retry mechanism for handling exceptions. Please help me with the code review @Apache9
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
ankitsol commented on code in PR #7617:
URL: https://github.com/apache/hbase/pull/7617#discussion_r2739954159
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceShipper.java:
##
@@ -229,6 +247,41 @@ private void shipEdits(WALEntryBatch entryBatch) {
}
}
+  private boolean shouldPersistLogPosition() {
+    ReplicationEndpoint endpoint = source.getReplicationEndpoint();
Review Comment:
Sure, so does it make sense to handle this via Peer Configuration?
I can add the timeout/size logic to ReplicationSourceShipper, but have the
thresholds default to -1 (preserving existing behavior). For my custom
endpoint, I’ll pass specific values via the Peer Config.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
Apache9 commented on code in PR #7617:
URL: https://github.com/apache/hbase/pull/7617#discussion_r2736942086
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceShipper.java:
##
@@ -98,6 +103,14 @@ public final void run() {
LOG.info("Running ReplicationSourceShipper Thread for wal group: {}",
this.walGroupId);
// Loop until we close down
while (isActive()) {
+      // Whether to persist replication offsets based on size/time thresholds
+      if (shouldPersistLogPosition()) {
+        try {
+          persistLogPosition();
+        } catch (IOException e) {
+          LOG.warn("Exception while persisting replication state", e);
Review Comment:
In persistLogPosition, we will call the callback method in
ReplicationEndpoint, and for S3 based replication endpoint, you will close the
S3 file right? This operation could throw exceptions?
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
Apache9 commented on code in PR #7617:
URL: https://github.com/apache/hbase/pull/7617#discussion_r2736930436
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceShipper.java:
##
@@ -229,6 +247,41 @@ private void shipEdits(WALEntryBatch entryBatch) {
}
}
+  private boolean shouldPersistLogPosition() {
+    ReplicationEndpoint endpoint = source.getReplicationEndpoint();
+    long maxBufferSize = endpoint.getMaxBufferSize();
+    if (stagedWalSize == 0 || lastShippedBatch == null) {
+      return false;
+    }
+    if (maxBufferSize == -1) {
+      return true;
+    }
+    return stagedWalSize >= maxBufferSize
+      || (EnvironmentEdgeManager.currentTime() - lastStagedFlushTs >= endpoint.maxFlushInterval());
+  }
+
+  private void persistLogPosition() throws IOException {
+    if (lastShippedBatch == null) {
Review Comment:
Ah, OK, you do not reset the lastShippedBatch when reading a new batch. But it still makes me a bit nervous about how we can get here when lastShippedBatch is null...
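The size/time gate being discussed above can be sketched as a free-standing predicate. This is a minimal sketch only: the method shape and parameter list are hypothetical, since in the patch these are shipper fields (stagedWalSize, lastStagedFlushTs, lastShippedBatch) rather than parameters.

```java
public class PersistThresholdSketch {
    // Mirrors the shouldPersistLogPosition() logic quoted in the diff above.
    // maxBufferSize == -1 disables the thresholds, preserving the old
    // persist-after-every-batch behavior.
    static boolean shouldPersist(boolean hasShippedBatch, long stagedWalSize,
            long maxBufferSize, long maxFlushIntervalMs, long nowMs, long lastFlushTsMs) {
        // Nothing staged yet, or no batch ever shipped: nothing to persist.
        if (stagedWalSize == 0 || !hasShippedBatch) {
            return false;
        }
        // Thresholds disabled: persist on every pass (old behavior).
        if (maxBufferSize == -1) {
            return true;
        }
        // Persist once either the size or the time threshold is crossed.
        return stagedWalSize >= maxBufferSize
            || (nowMs - lastFlushTsMs) >= maxFlushIntervalMs;
    }
}
```

Note that with this shape, a null lastShippedBatch (hasShippedBatch == false) short-circuits before any threshold is consulted, which is the invariant the reviewers are probing.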
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
Apache9 commented on code in PR #7617:
URL: https://github.com/apache/hbase/pull/7617#discussion_r2736924354
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceShipper.java:
##
@@ -229,6 +247,41 @@ private void shipEdits(WALEntryBatch entryBatch) {
}
}
+  private boolean shouldPersistLogPosition() {
+    ReplicationEndpoint endpoint = source.getReplicationEndpoint();
Review Comment:
We only need this configuration in the shipper, as persisting the log position is the duty of the shipper, not the endpoint. And there is no correctness problem if you implement the endpoint correctly, since when persisting the log position you will flush the files in the endpoint. The problem is that the performance may be bad. So when using a different replication endpoint implementation, you should tune the shipper configuration for the persistence interval.
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
ankitsol commented on code in PR #7617:
URL: https://github.com/apache/hbase/pull/7617#discussion_r2736370914
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceShipper.java:
##
@@ -98,6 +103,14 @@ public final void run() {
LOG.info("Running ReplicationSourceShipper Thread for wal group: {}",
this.walGroupId);
// Loop until we close down
while (isActive()) {
+      // Whether to persist replication offsets based on size/time thresholds
+      if (shouldPersistLogPosition()) {
+        try {
+          persistLogPosition();
+        } catch (IOException e) {
+          LOG.warn("Exception while persisting replication state", e);
Review Comment:
persistLogPosition() throws an exception only from cleanUpHFileRefs(), where getBulkLoadDescriptor() can throw an exception, hence I thought a retry is not necessary. I am thinking of keeping this as it is, but moving cleanUpHFileRefs() towards the end of the persistLogPosition() definition (after we update the offset).
Please let me know your thoughts
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
ankitsol commented on code in PR #7617:
URL: https://github.com/apache/hbase/pull/7617#discussion_r2736351890
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceShipper.java:
##
@@ -229,6 +247,41 @@ private void shipEdits(WALEntryBatch entryBatch) {
}
}
+  private boolean shouldPersistLogPosition() {
+    ReplicationEndpoint endpoint = source.getReplicationEndpoint();
Review Comment:
Ok, so we need to have these configs (the time- and size-based ones) at both ends (ReplicationSourceShipper and ReplicationEndpoint), so we can achieve the tuning you mentioned in your previous comment:
https://github.com/apache/hbase/pull/7617#discussion_r2719322657
> For me, I do not think we need to expose this information to shipper?
>
> The design here is that, when using different ReplicationEndpoint, you
need to tune the shipper configuration by your own, as the parameters are not
only affected by ReplicationEndpoint, they also depend on the shipper side.
>
> For example, when you want to reduce the pressure on recording the offset,
you should increase the record interval, i.e, increase batch size, increase the
number of ship times between recording offset, etc. And if you want to reduce
the pressure on memory and the target receiver, you should decrease the batch
size, and for S3 based replication endpoint, there is also a trade off, if you
increase the flush interval, you can get better performance and less files on
S3, but failover will be more complicated as you need to start from long before.
>
> So, this should be in the documentation, just exposing some configuration
from ReplicationEndpoint can not handle all the above situations.
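Reading such thresholds from a peer-level configuration map with shipper-side defaults could look like the sketch below. The key names are hypothetical, invented for illustration; the actual keys and the peer-config plumbing are whatever the patch defines. The defaults of -1 and Long.MAX_VALUE preserve the existing per-batch persistence behavior when nothing is configured.

```java
import java.util.Map;

public class PeerConfigThresholdSketch {
    // Hypothetical configuration keys, for illustration only.
    static final String MAX_BUFFER_SIZE_KEY = "replication.shipper.max.buffer.size";
    static final String MAX_FLUSH_INTERVAL_KEY = "replication.shipper.max.flush.interval.ms";

    // Returns the peer-config value if present, otherwise the shipper default.
    static long getLong(Map<String, String> peerConf, String key, long defaultValue) {
        String v = peerConf.get(key);
        return v == null ? defaultValue : Long.parseLong(v);
    }
}
```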
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
ankitsol commented on code in PR #7617:
URL: https://github.com/apache/hbase/pull/7617#discussion_r2736357408
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceShipper.java:
##
@@ -229,6 +247,41 @@ private void shipEdits(WALEntryBatch entryBatch) {
}
}
+  private boolean shouldPersistLogPosition() {
+    ReplicationEndpoint endpoint = source.getReplicationEndpoint();
Review Comment:
@Apache9 Please let me know if my understanding is correct here
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
ankitsol commented on code in PR #7617:
URL: https://github.com/apache/hbase/pull/7617#discussion_r2736314130
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceShipper.java:
##
@@ -229,6 +247,41 @@ private void shipEdits(WALEntryBatch entryBatch) {
}
}
+  private boolean shouldPersistLogPosition() {
+    ReplicationEndpoint endpoint = source.getReplicationEndpoint();
+    long maxBufferSize = endpoint.getMaxBufferSize();
+    if (stagedWalSize == 0 || lastShippedBatch == null) {
+      return false;
+    }
+    if (maxBufferSize == -1) {
+      return true;
+    }
+    return stagedWalSize >= maxBufferSize
+      || (EnvironmentEdgeManager.currentTime() - lastStagedFlushTs >= endpoint.maxFlushInterval());
+  }
+
+  private void persistLogPosition() throws IOException {
+    if (lastShippedBatch == null) {
Review Comment:
As far as I understand, a 'null' lastShippedBatch means no batch has been replicated yet, so we don't need to update the offset. Please correct me if I am wrong here.
lastShippedBatch is 'null' by default during ReplicationSourceShipper initialisation, and it is updated as soon as a batch is replicated.
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
wchevreuil commented on code in PR #7617:
URL: https://github.com/apache/hbase/pull/7617#discussion_r2736093265
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/ReplicationEndpoint.java:
##
@@ -283,4 +283,35 @@ public int getTimeout() {
* @throws IllegalStateException if this service's state isn't FAILED.
*/
Throwable failureCause();
+
+  /**
+   * @return true if this endpoint buffers WAL entries and requires explicit flush control before
+   *         persisting replication offsets.
+   */
+  default boolean isBufferedReplicationEndpoint() {
+    return false;
+  }
+
+  /**
+   * Maximum WAL size (bytes) to buffer before forcing a flush. Only meaningful when
+   * isBufferedReplicationEndpoint() == true.
+   */
+  default long getMaxBufferSize() {
+    return -1L;
+  }
+
+  /**
+   * Maximum time (ms) to wait before forcing a flush. Only meaningful when
+   * isBufferedReplicationEndpoint() == true.
+   */
+  default long maxFlushInterval() {
+    return Long.MAX_VALUE;
+  }
+
+  /**
+   * Hook invoked before persisting replication offsets. Buffered endpoints should flush/close WALs
+   * here.
+   */
+  default void beforePersistingReplicationOffset() {
Review Comment:
Ok, so maybe call it `beforeUpdatingLogPosition`? And let's be more detailed
in the javadoc comments, like saying this is called by the shipper when it's
about to move the offset forward in the wal reader and potentially make wal
files that were fully read eligible for deletion.
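As a standalone illustration of how a buffered endpoint would use the proposed default methods, here is a sketch against a local mirror of the hooks. The EndpointHooks interface below is not the real org.apache.hadoop.hbase.replication.ReplicationEndpoint, and the S3-like class with its 64 MB / 1 minute values is hypothetical.

```java
public class BufferedEndpointSketch {
    // Local mirror of the default methods proposed in the diff above.
    interface EndpointHooks {
        default boolean isBufferedReplicationEndpoint() { return false; }
        default long getMaxBufferSize() { return -1L; }
        default long maxFlushInterval() { return Long.MAX_VALUE; }
        default void beforePersistingReplicationOffset() {}
    }

    // A plain (non-buffered) endpoint keeps all defaults and is unaffected.
    static class PlainEndpoint implements EndpointHooks {}

    // A buffered endpoint (e.g. one staging WAL entries to S3) overrides all
    // four hooks; the flush happens before the shipper moves the offset.
    static class BufferedEndpoint implements EndpointHooks {
        boolean flushed = false;
        @Override public boolean isBufferedReplicationEndpoint() { return true; }
        @Override public long getMaxBufferSize() { return 64L * 1024 * 1024; } // 64 MB
        @Override public long maxFlushInterval() { return 60_000L; }           // 1 minute
        @Override public void beforePersistingReplicationOffset() { flushed = true; }
    }
}
```

The design point in the thread is that the last hook is really a "before updating the log position" callback: once it returns, fully read WAL files may become eligible for deletion, so a buffered endpoint must have durably flushed by then.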
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
wchevreuil commented on code in PR #7617:
URL: https://github.com/apache/hbase/pull/7617#discussion_r2736079118
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceShipper.java:
##
@@ -229,6 +247,41 @@ private void shipEdits(WALEntryBatch entryBatch) {
}
}
+  private boolean shouldPersistLogPosition() {
+    ReplicationEndpoint endpoint = source.getReplicationEndpoint();
+    long maxBufferSize = endpoint.getMaxBufferSize();
+    if (stagedWalSize == 0 || lastShippedBatch == null) {
+      return false;
+    }
+    if (maxBufferSize == -1) {
+      return true;
+    }
+    return stagedWalSize >= maxBufferSize
+      || (EnvironmentEdgeManager.currentTime() - lastStagedFlushTs >= endpoint.maxFlushInterval());
+  }
+
+  private void persistLogPosition() throws IOException {
+    if (lastShippedBatch == null) {
+      return;
+    }
+
+    ReplicationEndpoint endpoint = source.getReplicationEndpoint();
+    endpoint.beforePersistingReplicationOffset();
+
+    // Clean up hfile references
+    for (Entry entry : entriesForCleanUpHFileRefs) {
+      cleanUpHFileRefs(entry.getEdit());
+      LOG.trace("shipped entry {}: ", entry);
Review Comment:
Please provide a more accurate message. This is not what really happened
here.
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
Apache9 commented on code in PR #7617:
URL: https://github.com/apache/hbase/pull/7617#discussion_r2736040903
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceShipper.java:
##
@@ -229,6 +247,41 @@ private void shipEdits(WALEntryBatch entryBatch) {
}
}
+  private boolean shouldPersistLogPosition() {
+    ReplicationEndpoint endpoint = source.getReplicationEndpoint();
Review Comment:
Here we should use configuration values instead of getting from
ReplicationEndpoint. We can have default configuration values to keep the old
behavior for normal replication.
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceShipper.java:
##
@@ -229,6 +247,41 @@ private void shipEdits(WALEntryBatch entryBatch) {
}
}
+  private boolean shouldPersistLogPosition() {
+    ReplicationEndpoint endpoint = source.getReplicationEndpoint();
+    long maxBufferSize = endpoint.getMaxBufferSize();
+    if (stagedWalSize == 0 || lastShippedBatch == null) {
+      return false;
+    }
+    if (maxBufferSize == -1) {
+      return true;
+    }
+    return stagedWalSize >= maxBufferSize
+      || (EnvironmentEdgeManager.currentTime() - lastStagedFlushTs >= endpoint.maxFlushInterval());
+  }
+
+  private void persistLogPosition() throws IOException {
+    if (lastShippedBatch == null) {
Review Comment:
Since we could accumulate different batches in the above loop, a null batch does not mean we haven't shipped anything out? Why do we just return here if lastShippedBatch is null?
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceShipper.java:
##
@@ -98,6 +103,14 @@ public final void run() {
LOG.info("Running ReplicationSourceShipper Thread for wal group: {}",
this.walGroupId);
// Loop until we close down
while (isActive()) {
+      // Whether to persist replication offsets based on size/time thresholds
+      if (shouldPersistLogPosition()) {
+        try {
+          persistLogPosition();
+        } catch (IOException e) {
+          LOG.warn("Exception while persisting replication state", e);
Review Comment:
This is not enough for handling the exception? Typically we should restart from the last persisted offset and replicate again.
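The recovery contract being asked for here can be sketched as follows. All names are hypothetical; in the actual patch the shipper handles this by aborting and restarting so that reading resumes from the last persisted offset rather than by returning a value.

```java
import java.io.IOException;

public class PersistFailureSketch {
    interface OffsetStore {
        void persist(long offset) throws IOException;
    }

    // On a persist failure the shipper must not keep trusting its in-memory
    // position: it falls back to the last durably persisted offset, and
    // everything after that offset will be read and replicated again.
    static long persistOrRewind(OffsetStore store, long lastPersisted, long candidate) {
        try {
            store.persist(candidate);
            return candidate;                // persisted: advance the position
        } catch (IOException e) {
            return lastPersisted;            // rewind: restart point for re-replication
        }
    }
}
```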
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
wchevreuil commented on code in PR #7617:
URL: https://github.com/apache/hbase/pull/7617#discussion_r2735996225
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceShipper.java:
##
@@ -190,13 +204,16 @@ private void shipEdits(WALEntryBatch entryBatch) {
     } else {
       sleepMultiplier = Math.max(sleepMultiplier - 1, 0);
     }
-    // Clean up hfile references
-    for (Entry entry : entries) {
-      cleanUpHFileRefs(entry.getEdit());
-      LOG.trace("shipped entry {}: ", entry);
+
+    stagedWalSize += currentSize;
+    entriesForCleanUpHFileRefs.addAll(entries);
+    lastShippedBatch = entryBatch;
+    if (shouldPersistLogPosition()) {
Review Comment:
> Are you referring to your original comment about passing the entire entryBatch object?

No, I meant [this check](https://github.com/apache/hbase/pull/7617/changes/8c828290cc7ab36b9db75a72916d6bbfd1dc4e47#diff-1907878c880a164eafc9eb5fb412a376d1d504bf54bfe91192e36693e60dfaf8L251) that was being done inside "shouldPersistLogPosition", which would cause the buffer size to be considered only for specific endpoint types.
@ankitsol has already addressed it in a recent commit.
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
Apache-HBase commented on PR #7617:
URL: https://github.com/apache/hbase/pull/7617#issuecomment-3807670294
:confetti_ball: **+1 overall**
| Vote | Subsystem | Runtime | Logfile | Comment |
| :--: | ---: | :--- | :--: | :---: |
| +0 :ok: | reexec | 0m 30s | | Docker mode activated. |
| -0 :warning: | yetus | 0m 3s | | Unprocessed flag(s):
--brief-report-file --spotbugs-strict-precheck --author-ignore-list
--blanks-eol-ignore-file --blanks-tabs-ignore-file --quick-hadoopcheck |
_ Prechecks _ |
_ master Compile Tests _ |
| +1 :green_heart: | mvninstall | 3m 33s | | master passed |
| +1 :green_heart: | compile | 1m 7s | | master passed |
| +1 :green_heart: | javadoc | 0m 31s | | master passed |
| +1 :green_heart: | shadedjars | 5m 58s | | branch has no errors when
building our shaded downstream artifacts. |
_ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 3m 6s | | the patch passed |
| +1 :green_heart: | compile | 0m 59s | | the patch passed |
| +1 :green_heart: | javac | 0m 59s | | the patch passed |
| -0 :warning: | javadoc | 0m 27s |
[/results-javadoc-javadoc-hbase-server.txt](https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7617/5/artifact/yetus-jdk17-hadoop3-check/output/results-javadoc-javadoc-hbase-server.txt)
| hbase-server generated 1 new + 65 unchanged - 0 fixed = 66 total (was 65) |
| +1 :green_heart: | shadedjars | 5m 52s | | patch has no errors when
building our shaded downstream artifacts. |
_ Other Tests _ |
| +1 :green_heart: | unit | 230m 20s | | hbase-server in the patch
passed. |
| | | 257m 36s | | |
| Subsystem | Report/Notes |
| ---: | :--- |
| Docker | ClientAPI=1.43 ServerAPI=1.43 base:
https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7617/5/artifact/yetus-jdk17-hadoop3-check/output/Dockerfile
|
| GITHUB PR | https://github.com/apache/hbase/pull/7617 |
| Optional Tests | javac javadoc unit compile shadedjars |
| uname | Linux a908a4083825 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May
23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | master / e74eee1698d3bff65b1d1b05e4b1982fe450c72b |
| Default Java | Eclipse Adoptium-17.0.11+9 |
| Test Results |
https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7617/5/testReport/
|
| Max. process+thread count | 4270 (vs. ulimit of 3) |
| modules | C: hbase-server U: hbase-server |
| Console output |
https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7617/5/console
|
| versions | git=2.34.1 maven=3.9.8 |
| Powered by | Apache Yetus 0.15.0 https://yetus.apache.org |
This message was automatically generated.
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
Apache-HBase commented on PR #7617:
URL: https://github.com/apache/hbase/pull/7617#issuecomment-3806617854
:confetti_ball: **+1 overall**
| Vote | Subsystem | Runtime | Logfile | Comment |
| :--: | ---: | :--- | :--: | :---: |
| +0 :ok: | reexec | 0m 11s | | Docker mode activated. |
_ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files
found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available.
|
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain
any @author tags. |
| +1 :green_heart: | hbaseanti | 0m 0s | | Patch does not have any
anti-patterns. |
_ master Compile Tests _ |
| +1 :green_heart: | mvninstall | 2m 15s | | master passed |
| +1 :green_heart: | compile | 2m 35s | | master passed |
| +1 :green_heart: | checkstyle | 0m 45s | | master passed |
| +1 :green_heart: | spotbugs | 1m 13s | | master passed |
| +1 :green_heart: | spotless | 0m 34s | | branch has no errors when
running spotless:check. |
_ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 2m 17s | | the patch passed |
| +1 :green_heart: | compile | 2m 31s | | the patch passed |
| +1 :green_heart: | javac | 2m 31s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks
issues. |
| +1 :green_heart: | checkstyle | 0m 45s | | the patch passed |
| +1 :green_heart: | spotbugs | 1m 19s | | the patch passed |
| +1 :green_heart: | hadoopcheck | 8m 30s | | Patch does not cause any
errors with Hadoop 3.3.6 3.4.1. |
| +1 :green_heart: | spotless | 0m 34s | | patch has no errors when
running spotless:check. |
_ Other Tests _ |
| +1 :green_heart: | asflicense | 0m 8s | | The patch does not
generate ASF License warnings. |
| | | 29m 3s | | |
| Subsystem | Report/Notes |
| ---: | :--- |
| Docker | ClientAPI=1.53 ServerAPI=1.53 base:
https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7617/5/artifact/yetus-general-check/output/Dockerfile
|
| GITHUB PR | https://github.com/apache/hbase/pull/7617 |
| Optional Tests | dupname asflicense javac spotbugs checkstyle codespell
detsecrets compile hadoopcheck hbaseanti spotless |
| uname | Linux 3133a65ee296 6.14.0-1018-aws #18~24.04.1-Ubuntu SMP Mon Nov
24 19:46:27 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | master / e74eee1698d3bff65b1d1b05e4b1982fe450c72b |
| Default Java | Eclipse Adoptium-17.0.11+9 |
| Max. process+thread count | 85 (vs. ulimit of 3) |
| modules | C: hbase-server U: hbase-server |
| Console output |
https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7617/5/console
|
| versions | git=2.34.1 maven=3.9.8 spotbugs=4.7.3 |
| Powered by | Apache Yetus 0.15.0 https://yetus.apache.org |
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
anmolnar commented on code in PR #7617:
URL: https://github.com/apache/hbase/pull/7617#discussion_r2733050407
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceShipper.java:
##
@@ -229,6 +246,53 @@ private void shipEdits(WALEntryBatch entryBatch) {
}
}
+ private boolean shouldPersistLogPosition() {
+ReplicationEndpoint endpoint = source.getReplicationEndpoint();
+if (!endpoint.isBufferedReplicationEndpoint()) {
+ // Non-buffering endpoints persist immediately
+ return true;
+}
+if (stagedWalSize == 0 || lastShippedBatch == null) {
+ return false;
+}
+return stagedWalSize >= endpoint.getMaxBufferSize()
+ || (EnvironmentEdgeManager.currentTime() - lastStagedFlushTs >=
endpoint.maxFlushInterval());
+ }
+
+ @Nullable
+ // Returns IOException instead of throwing so callers can decide
+ // whether to retry (shipEdits) or best-effort log (run()).
+ private IOException persistLogPosition() {
+if (lastShippedBatch == null) {
+ return null;
+}
+
+ReplicationEndpoint endpoint = source.getReplicationEndpoint();
+
+if (endpoint.isBufferedReplicationEndpoint() && stagedWalSize > 0) {
+ source.getReplicationEndpoint().beforePersistingReplicationOffset();
+}
Review Comment:
I think Duo's
[comment](https://github.com/apache/hbase/pull/7617#discussion_r2721491163)
applies here as well.
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
anmolnar commented on code in PR #7617:
URL: https://github.com/apache/hbase/pull/7617#discussion_r2733047546
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceShipper.java:
##
@@ -190,13 +204,16 @@ private void shipEdits(WALEntryBatch entryBatch) {
} else {
sleepMultiplier = Math.max(sleepMultiplier - 1, 0);
}
-// Clean up hfile references
-for (Entry entry : entries) {
- cleanUpHFileRefs(entry.getEdit());
- LOG.trace("shipped entry {}: ", entry);
+
+stagedWalSize += currentSize;
+entriesForCleanUpHFileRefs.addAll(entries);
+lastShippedBatch = entryBatch;
+if (shouldPersistLogPosition()) {
Review Comment:
> That would be fine for me, but then we would need to adjust the
shouldPersistLogPosition method accordingly.
Are you referring to your original comment about passing the entire
`entryBatch` object?
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
Apache-HBase commented on PR #7617:
URL: https://github.com/apache/hbase/pull/7617#issuecomment-3805694243
:broken_heart: **-1 overall**
| Vote | Subsystem | Runtime | Logfile | Comment |
| :--: | ---: | :--- | :--: | :---: |
| +0 :ok: | reexec | 0m 11s | | Docker mode activated. |
| -0 :warning: | yetus | 0m 3s | | Unprocessed flag(s):
--brief-report-file --spotbugs-strict-precheck --author-ignore-list
--blanks-eol-ignore-file --blanks-tabs-ignore-file --quick-hadoopcheck |
_ Prechecks _ |
_ master Compile Tests _ |
| +1 :green_heart: | mvninstall | 2m 23s | | master passed |
| +1 :green_heart: | compile | 0m 48s | | master passed |
| +1 :green_heart: | javadoc | 0m 21s | | master passed |
| +1 :green_heart: | shadedjars | 4m 25s | | branch has no errors when
building our shaded downstream artifacts. |
_ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 2m 17s | | the patch passed |
| +1 :green_heart: | compile | 0m 46s | | the patch passed |
| +1 :green_heart: | javac | 0m 46s | | the patch passed |
| -0 :warning: | javadoc | 0m 21s |
[/results-javadoc-javadoc-hbase-server.txt](https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7617/4/artifact/yetus-jdk17-hadoop3-check/output/results-javadoc-javadoc-hbase-server.txt)
| hbase-server generated 1 new + 65 unchanged - 0 fixed = 66 total (was 65) |
| +1 :green_heart: | shadedjars | 4m 25s | | patch has no errors when
building our shaded downstream artifacts. |
_ Other Tests _ |
| -1 :x: | unit | 216m 8s |
[/patch-unit-hbase-server.txt](https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7617/4/artifact/yetus-jdk17-hadoop3-check/output/patch-unit-hbase-server.txt)
| hbase-server in the patch failed. |
| | | 236m 1s | | |
| Subsystem | Report/Notes |
| ---: | :--- |
| Docker | ClientAPI=1.52 ServerAPI=1.52 base:
https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7617/4/artifact/yetus-jdk17-hadoop3-check/output/Dockerfile
|
| GITHUB PR | https://github.com/apache/hbase/pull/7617 |
| Optional Tests | javac javadoc unit compile shadedjars |
| uname | Linux 127c6853cf95 6.14.0-1018-aws #18~24.04.1-Ubuntu SMP Mon Nov
24 19:46:27 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | master / 8c828290cc7ab36b9db75a72916d6bbfd1dc4e47 |
| Default Java | Eclipse Adoptium-17.0.11+9 |
| Test Results |
https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7617/4/testReport/
|
| Max. process+thread count | 6323 (vs. ulimit of 3) |
| modules | C: hbase-server U: hbase-server |
| Console output |
https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7617/4/console
|
| versions | git=2.34.1 maven=3.9.8 |
| Powered by | Apache Yetus 0.15.0 https://yetus.apache.org |
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
Apache-HBase commented on PR #7617:
URL: https://github.com/apache/hbase/pull/7617#issuecomment-3804659205
:broken_heart: **-1 overall**
| Vote | Subsystem | Runtime | Logfile | Comment |
| :--: | ---: | :--- | :--: | :---: |
| +0 :ok: | reexec | 0m 11s | | Docker mode activated. |
_ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files
found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available.
|
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain
any @author tags. |
| +1 :green_heart: | hbaseanti | 0m 0s | | Patch does not have any
anti-patterns. |
_ master Compile Tests _ |
| +1 :green_heart: | mvninstall | 3m 8s | | master passed |
| +1 :green_heart: | compile | 2m 52s | | master passed |
| +1 :green_heart: | checkstyle | 0m 53s | | master passed |
| +1 :green_heart: | spotbugs | 1m 26s | | master passed |
| +1 :green_heart: | spotless | 0m 43s | | branch has no errors when
running spotless:check. |
_ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 2m 49s | | the patch passed |
| +1 :green_heart: | compile | 2m 47s | | the patch passed |
| +1 :green_heart: | javac | 2m 47s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks
issues. |
| +1 :green_heart: | checkstyle | 0m 51s | | the patch passed |
| +1 :green_heart: | spotbugs | 1m 30s | | the patch passed |
| +1 :green_heart: | hadoopcheck | 9m 26s | | Patch does not cause any
errors with Hadoop 3.3.6 3.4.1. |
| -1 :x: | spotless | 0m 32s |
[/patch-spotless.txt](https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7617/4/artifact/yetus-general-check/output/patch-spotless.txt)
| patch has 24 errors when running spotless:check, run spotless:apply to fix.
|
_ Other Tests _ |
| +1 :green_heart: | asflicense | 0m 10s | | The patch does not
generate ASF License warnings. |
| | | 33m 9s | | |
| Subsystem | Report/Notes |
| ---: | :--- |
| Docker | ClientAPI=1.53 ServerAPI=1.53 base:
https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7617/4/artifact/yetus-general-check/output/Dockerfile
|
| GITHUB PR | https://github.com/apache/hbase/pull/7617 |
| Optional Tests | dupname asflicense javac spotbugs checkstyle codespell
detsecrets compile hadoopcheck hbaseanti spotless |
| uname | Linux 11fb3d023937 6.14.0-1018-aws #18~24.04.1-Ubuntu SMP Mon Nov
24 19:46:27 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | master / 8c828290cc7ab36b9db75a72916d6bbfd1dc4e47 |
| Default Java | Eclipse Adoptium-17.0.11+9 |
| Max. process+thread count | 84 (vs. ulimit of 3) |
| modules | C: hbase-server U: hbase-server |
| Console output |
https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7617/4/console
|
| versions | git=2.34.1 maven=3.9.8 spotbugs=4.7.3 |
| Powered by | Apache Yetus 0.15.0 https://yetus.apache.org |
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
wchevreuil commented on code in PR #7617:
URL: https://github.com/apache/hbase/pull/7617#discussion_r2727222631
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceShipper.java:
##
@@ -190,13 +204,16 @@ private void shipEdits(WALEntryBatch entryBatch) {
} else {
sleepMultiplier = Math.max(sleepMultiplier - 1, 0);
}
-// Clean up hfile references
-for (Entry entry : entries) {
- cleanUpHFileRefs(entry.getEdit());
- LOG.trace("shipped entry {}: ", entry);
+
+stagedWalSize += currentSize;
+entriesForCleanUpHFileRefs.addAll(entries);
+lastShippedBatch = entryBatch;
+if (shouldPersistLogPosition()) {
Review Comment:
Oh, so your idea is to allow the shipper to decide when to persist the log
position based on time and/or storage usage by WALs, regardless of the endpoint
implementation? That would be fine for me, but then we would need to adjust the
shouldPersistLogPosition method accordingly.
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
Apache9 commented on code in PR #7617:
URL: https://github.com/apache/hbase/pull/7617#discussion_r2724276907
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceShipper.java:
##
@@ -190,13 +204,16 @@ private void shipEdits(WALEntryBatch entryBatch) {
} else {
sleepMultiplier = Math.max(sleepMultiplier - 1, 0);
}
-// Clean up hfile references
-for (Entry entry : entries) {
- cleanUpHFileRefs(entry.getEdit());
- LOG.trace("shipped entry {}: ", entry);
+
+stagedWalSize += currentSize;
+entriesForCleanUpHFileRefs.addAll(entries);
+lastShippedBatch = entryBatch;
+if (shouldPersistLogPosition()) {
Review Comment:
I think time-based and size-based persistence is enough for most cases? If
in the future we have some special endpoint which needs a new type of decision,
we can add a new mechanism, no problem.
The reason we do not want to trigger persistence only from the endpoint is
that we have other considerations about when to persist the log position, like
the trade-off between failover cost and pressure on the replication storage. So
here I suggest that we introduce general mechanisms to control the persistence
of the log position, and users can tune them for different approaches.
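The time-based and size-based triggers described here could be sketched roughly as follows. This is an illustrative, standalone sketch, not code from the patch; the class, method, and threshold names are hypothetical, and a real shipper would read the thresholds from configuration.

```java
// Illustrative sketch of shipper-side, configuration-driven persistence
// triggers. Class, method, and threshold names are hypothetical.
class PersistTrigger {
  private final long maxStagedBytes; // size threshold, e.g. from a config key
  private final long maxIntervalMs;  // time threshold, e.g. from a config key
  private long stagedBytes = 0;      // WAL bytes shipped since the last persist
  private long lastPersistTs;        // timestamp of the last persisted offset

  PersistTrigger(long maxStagedBytes, long maxIntervalMs, long nowMs) {
    this.maxStagedBytes = maxStagedBytes;
    this.maxIntervalMs = maxIntervalMs;
    this.lastPersistTs = nowMs;
  }

  /** Account for a shipped batch of the given size. */
  void onBatchShipped(long batchBytes) {
    stagedBytes += batchBytes;
  }

  /** Size-based OR time-based decision, independent of the endpoint type. */
  boolean shouldPersist(long nowMs) {
    if (stagedBytes == 0) {
      return false; // nothing shipped since the last persisted offset
    }
    return stagedBytes >= maxStagedBytes || nowMs - lastPersistTs >= maxIntervalMs;
  }

  /** Reset counters once the offset has actually been persisted. */
  void onPersisted(long nowMs) {
    stagedBytes = 0;
    lastPersistTs = nowMs;
  }
}
```

Under this shape, a special endpoint needing a different policy would be served by adding a new trigger, without the shipper knowing any endpoint internals.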
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
wchevreuil commented on code in PR #7617:
URL: https://github.com/apache/hbase/pull/7617#discussion_r2721937683
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceShipper.java:
##
@@ -190,13 +204,16 @@ private void shipEdits(WALEntryBatch entryBatch) {
} else {
sleepMultiplier = Math.max(sleepMultiplier - 1, 0);
}
-// Clean up hfile references
-for (Entry entry : entries) {
- cleanUpHFileRefs(entry.getEdit());
- LOG.trace("shipped entry {}: ", entry);
+
+stagedWalSize += currentSize;
+entriesForCleanUpHFileRefs.addAll(entries);
+lastShippedBatch = entryBatch;
+if (shouldPersistLogPosition()) {
Review Comment:
IMHO, it doesn't look very cohesive. The shipper seems to be making decisions
based on specific endpoint implementations. If new endpoint implementations with
different logic for updating the log position are introduced in the future, we
would need to revisit the shipper again.
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
Apache9 commented on code in PR #7617:
URL: https://github.com/apache/hbase/pull/7617#discussion_r2721491163
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceShipper.java:
##
@@ -190,13 +204,16 @@ private void shipEdits(WALEntryBatch entryBatch) {
} else {
sleepMultiplier = Math.max(sleepMultiplier - 1, 0);
}
-// Clean up hfile references
-for (Entry entry : entries) {
- cleanUpHFileRefs(entry.getEdit());
- LOG.trace("shipped entry {}: ", entry);
+
+stagedWalSize += currentSize;
+entriesForCleanUpHFileRefs.addAll(entries);
+lastShippedBatch = entryBatch;
+if (shouldPersistLogPosition()) {
Review Comment:
We want to determine whether we need to persist the log position in the shipper,
based on some configurations, not triggered by the replication endpoint. Users
can choose different configuration values based on different replication
endpoint implementations.
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
Apache9 commented on code in PR #7617:
URL: https://github.com/apache/hbase/pull/7617#discussion_r2721486401
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceShipper.java:
##
@@ -229,6 +246,53 @@ private void shipEdits(WALEntryBatch entryBatch) {
}
}
+ private boolean shouldPersistLogPosition() {
+ReplicationEndpoint endpoint = source.getReplicationEndpoint();
+if (!endpoint.isBufferedReplicationEndpoint()) {
+ // Non-buffering endpoints persist immediately
+ return true;
+}
+if (stagedWalSize == 0 || lastShippedBatch == null) {
+ return false;
+}
+return stagedWalSize >= endpoint.getMaxBufferSize()
+ || (EnvironmentEdgeManager.currentTime() - lastStagedFlushTs >=
endpoint.maxFlushInterval());
+ }
Review Comment:
Please see my comment above to get more context
> For me, I do not think we need to expose this information to shipper?
>
>The design here is that, when using different ReplicationEndpoint, you need
to tune the shipper configuration by your own, as the parameters are not only
affected by ReplicationEndpoint, they also depend on the shipper side.
>
>For example, when you want to reduce the pressure on recording the offset,
you should increase the record interval, i.e, increase batch size, increase the
number of ship times between recording offset, etc. And if you want to reduce
the pressure on memory and the target receiver, you should decrease the batch
size, and for S3 based replication endpoint, there is also a trade off, if you
increase the flush interval, you can get better performance and less files on
S3, but failover will be more complicated as you need to start from long before.
>
>So, this should be in the documentation, just exposing some configuration
from ReplicationEndpoint can not handle all the above situations.
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
Apache9 commented on code in PR #7617:
URL: https://github.com/apache/hbase/pull/7617#discussion_r2721481232
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/ReplicationEndpoint.java:
##
@@ -283,4 +283,35 @@ public int getTimeout() {
* @throws IllegalStateException if this service's state isn't FAILED.
*/
Throwable failureCause();
+
+ /**
+ * @return true if this endpoint buffers WAL entries and requires explicit
flush control before
+ * persisting replication offsets.
+ */
+ default boolean isBufferedReplicationEndpoint() {
+return false;
+ }
+
+ /**
+ * Maximum WAL size (bytes) to buffer before forcing a flush. Only
meaningful when
+ * isBufferedReplicationEndpoint() == true.
+ */
+ default long getMaxBufferSize() {
+return -1L;
+ }
+
+ /**
+ * Maximum time (ms) to wait before forcing a flush. Only meaningful when
+ * isBufferedReplicationEndpoint() == true.
+ */
+ default long maxFlushInterval() {
+return Long.MAX_VALUE;
+ }
+
+ /**
+ * Hook invoked before persisting replication offsets. Buffered endpoints
should flush/close WALs
+ * here.
+ */
+ default void beforePersistingReplicationOffset() {
Review Comment:
Sorry, this is exactly what we want to avoid here. This method just tells
the endpoint what the shipper is going to do, and the endpoint can do anything
it wants. For our normal replication framework, there are no flush and close
operations, as we always send everything out and do not return until we get
acks, so basically we do not need to implement this method. And for an S3 based
replication endpoint, we need to close the file to persist it on S3. Maybe in
the future we will have other types of replication endpoint which do other
work, so we do not want to name it `flushAndCloseWAL`.
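To make the intent concrete, here is a hedged sketch of the generic-hook approach described here: the shipper only announces that it is about to persist the offset, and each endpoint decides what that means for it. All names below are illustrative stand-ins, not the actual HBase interfaces.

```java
// Illustrative stand-ins for the generic pre-persist callback. The real
// ReplicationEndpoint API differs; these names are assumptions.
interface OffsetPersistAware {
  /** Invoked by the shipper just before it records the log position. */
  default void beforePersistingReplicationOffset() {
    // Default: nothing to do. A direct-shipping endpoint has already
    // received acks for everything it sent.
  }
}

class S3StyleEndpoint implements OffsetPersistAware {
  private final StringBuilder stagedWal = new StringBuilder();
  boolean sealed = true; // no un-durable staged data yet

  void append(String edit) {
    stagedWal.append(edit).append('\n');
    sealed = false; // staged data not yet durable
  }

  @Override
  public void beforePersistingReplicationOffset() {
    // A buffered endpoint would flush and close its staged WAL file here,
    // so the recorded offset never points past durable data.
    sealed = true;
  }
}
```

Because the hook name describes the shipper's event rather than the endpoint's action, a future endpoint that, say, rotates a network stream instead of closing a file would need no interface change.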
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
wchevreuil commented on code in PR #7617:
URL: https://github.com/apache/hbase/pull/7617#discussion_r2720780067
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceShipper.java:
##
@@ -229,6 +246,53 @@ private void shipEdits(WALEntryBatch entryBatch) {
}
}
+ private boolean shouldPersistLogPosition() {
+ReplicationEndpoint endpoint = source.getReplicationEndpoint();
+if (!endpoint.isBufferedReplicationEndpoint()) {
+ // Non-buffering endpoints persist immediately
+ return true;
+}
+if (stagedWalSize == 0 || lastShippedBatch == null) {
+ return false;
+}
+return stagedWalSize >= endpoint.getMaxBufferSize()
+ || (EnvironmentEdgeManager.currentTime() - lastStagedFlushTs >=
endpoint.maxFlushInterval());
+ }
+
+ @Nullable
+ // Returns IOException instead of throwing so callers can decide
+ // whether to retry (shipEdits) or best-effort log (run()).
+ private IOException persistLogPosition() {
+if (lastShippedBatch == null) {
+ return null;
+}
+
+ReplicationEndpoint endpoint = source.getReplicationEndpoint();
+
+if (endpoint.isBufferedReplicationEndpoint() && stagedWalSize > 0) {
+ source.getReplicationEndpoint().beforePersistingReplicationOffset();
+}
Review Comment:
Also should be in the endpoint.
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceShipper.java:
##
@@ -229,6 +246,53 @@ private void shipEdits(WALEntryBatch entryBatch) {
}
}
+ private boolean shouldPersistLogPosition() {
+ReplicationEndpoint endpoint = source.getReplicationEndpoint();
+if (!endpoint.isBufferedReplicationEndpoint()) {
+ // Non-buffering endpoints persist immediately
+ return true;
+}
+if (stagedWalSize == 0 || lastShippedBatch == null) {
+ return false;
+}
+return stagedWalSize >= endpoint.getMaxBufferSize()
+ || (EnvironmentEdgeManager.currentTime() - lastStagedFlushTs >=
endpoint.maxFlushInterval());
+ }
Review Comment:
This should be in the Endpoint, as the decision varies according to the type
of endpoint. Today we have basically two types, buffered and non-buffered. If
we have new endpoint types in the future, we might again need to come here and
add the related logic.
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/ReplicationEndpoint.java:
##
@@ -283,4 +283,35 @@ public int getTimeout() {
* @throws IllegalStateException if this service's state isn't FAILED.
*/
Throwable failureCause();
+
+ /**
+ * @return true if this endpoint buffers WAL entries and requires explicit
flush control before
+ * persisting replication offsets.
+ */
+ default boolean isBufferedReplicationEndpoint() {
+return false;
+ }
+
+ /**
+ * Maximum WAL size (bytes) to buffer before forcing a flush. Only
meaningful when
+ * isBufferedReplicationEndpoint() == true.
+ */
+ default long getMaxBufferSize() {
+return -1L;
+ }
+
+ /**
+ * Maximum time (ms) to wait before forcing a flush. Only meaningful when
+ * isBufferedReplicationEndpoint() == true.
+ */
+ default long maxFlushInterval() {
+return Long.MAX_VALUE;
+ }
+
+ /**
+ * Hook invoked before persisting replication offsets. Buffered endpoints
should flush/close WALs
+ * here.
+ */
+ default void beforePersistingReplicationOffset() {
Review Comment:
From the endpoint's view, this method is just a flush/close of the given WAL,
so let's name it accordingly.
```suggestion
default void flushAndCloseWAL() {
```
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceShipper.java:
##
@@ -190,13 +204,16 @@ private void shipEdits(WALEntryBatch entryBatch) {
} else {
sleepMultiplier = Math.max(sleepMultiplier - 1, 0);
}
-// Clean up hfile references
-for (Entry entry : entries) {
- cleanUpHFileRefs(entry.getEdit());
- LOG.trace("shipped entry {}: ", entry);
+
+stagedWalSize += currentSize;
+entriesForCleanUpHFileRefs.addAll(entries);
+lastShippedBatch = entryBatch;
+if (shouldPersistLogPosition()) {
Review Comment:
Rather than keeping `stagedWalSize` and `lastShippedBatch` as instance-level
state, we should just pass the `entryBatch` along to
`shouldPersistLogPosition()` (which should be defined/implemented in the
endpoints, btw, see my other related comment) and `persistLogPosition()`.
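A minimal sketch of that stateless shape, with the batch handed in as a parameter instead of tracked in fields. `BatchLike` is a hypothetical stand-in for `WALEntryBatch`, and the threshold parameter takes the place of the instance-level counters; none of this is code from the patch.

```java
// Hypothetical stand-in for WALEntryBatch: just the fields the decision needs.
record BatchLike(long heapSize, long lastWalPosition) {}

final class StatelessPersistDecision {
  /** Decision made purely from the batch handed in; no shipper-level state. */
  static boolean shouldPersistLogPosition(BatchLike batch, long sizeThreshold) {
    return batch != null && batch.heapSize() >= sizeThreshold;
  }

  /** Returns the offset that would be recorded for the given batch. */
  static long offsetToPersist(BatchLike batch) {
    // The real shipper would write this to the replication queue storage;
    // here we just surface the value that would be persisted.
    return batch.lastWalPosition();
  }
}
```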
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
Apache9 commented on code in PR #7617:
URL: https://github.com/apache/hbase/pull/7617#discussion_r2719322657
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/ReplicationEndpoint.java:
##
@@ -283,4 +283,35 @@ public int getTimeout() {
* @throws IllegalStateException if this service's state isn't FAILED.
*/
Throwable failureCause();
+
+ /**
+ * @return true if this endpoint buffers WAL entries and requires explicit
flush control before
+ * persisting replication offsets.
+ */
+ default boolean isBufferedReplicationEndpoint() {
Review Comment:
For me, I do not think we need to expose this information to shipper?
The design here is that, when using different ReplicationEndpoint, you need
to tune the shipper configuration on your own, as the parameters are not only
affected by ReplicationEndpoint, they also depend on the shipper side.
For example, when you want to reduce the pressure on recording the offset,
you should increase the record interval, i.e., increase the batch size, increase
the number of ship times between recording offsets, etc. And if you want to
reduce the pressure on memory and the target receiver, you should decrease the
batch size. For an S3 based replication endpoint, there is also a trade-off: if
you increase the flush interval, you get better performance and fewer files on
S3, but failover will be more complicated as you need to start from long before.
So, this should be in the documentation; just exposing some configuration
from ReplicationEndpoint cannot handle all the above situations.
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceShipper.java:
##
@@ -98,6 +104,13 @@ public final void run() {
     LOG.info("Running ReplicationSourceShipper Thread for wal group: {}", this.walGroupId);
     // Loop until we close down
     while (isActive()) {
+      // check if flush needed for WAL backup, this is need for timeout based flush
Review Comment:
This is not designed for WAL backup only, I have to say. Here in the shipper
we just follow the configured limits, i.e., time based or size based, to
determine whether we need to record the log position, and there is a callback
into the ReplicationEndpoint before recording; the replication endpoint can use
this callback to do something, and the shipper does not know the details.
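The callback contract described above can be sketched in a few lines. All names here are illustrative stand-ins, not the actual HBase API: the shipper fires a hook on the endpoint just before recording the log position, and a buffered endpoint can flush its staged entries inside that hook without the shipper knowing:

```java
// Illustrative sketch, not HBase code: the shipper is agnostic to what the
// endpoint does in the pre-persist hook.
interface OffsetHook {
  void beforePersistingReplicationOffset();
}

class ShipperSketch {
  long recordedPosition = -1;

  // Record the given position, letting the endpoint prepare first
  // (e.g. flush buffered WAL entries to its backing store).
  void persistLogPosition(OffsetHook endpoint, long position) {
    endpoint.beforePersistingReplicationOffset();
    recordedPosition = position; // stand-in for the offset-store update
  }
}
```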
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceShipper.java:
##
@@ -229,6 +246,53 @@ private void shipEdits(WALEntryBatch entryBatch) {
     }
   }
+  private boolean shouldPersistLogPosition() {
+    ReplicationEndpoint endpoint = source.getReplicationEndpoint();
+    if (!endpoint.isBufferedReplicationEndpoint()) {
+      // Non-buffering endpoints persist immediately
+      return true;
+    }
+    if (stagedWalSize == 0 || lastShippedBatch == null) {
+      return false;
+    }
+    return stagedWalSize >= endpoint.getMaxBufferSize()
+      || (EnvironmentEdgeManager.currentTime() - lastStagedFlushTs >= endpoint.maxFlushInterval());
+  }
+
+  @Nullable
+  // Returns IOException instead of throwing so callers can decide
+  // whether to retry (shipEdits) or best-effort log (run()).
+  private IOException persistLogPosition() {
Review Comment:
+1
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
taklwu commented on code in PR #7617:
URL: https://github.com/apache/hbase/pull/7617#discussion_r2718075417
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceShipper.java:
##
@@ -229,6 +246,53 @@ private void shipEdits(WALEntryBatch entryBatch) {
     }
   }
+  private boolean shouldPersistLogPosition() {
+    ReplicationEndpoint endpoint = source.getReplicationEndpoint();
+    if (!endpoint.isBufferedReplicationEndpoint()) {
+      // Non-buffering endpoints persist immediately
+      return true;
+    }
+    if (stagedWalSize == 0 || lastShippedBatch == null) {
+      return false;
+    }
+    return stagedWalSize >= endpoint.getMaxBufferSize()
+      || (EnvironmentEdgeManager.currentTime() - lastStagedFlushTs >= endpoint.maxFlushInterval());
+  }
+
+  @Nullable
+  // Returns IOException instead of throwing so callers can decide
+  // whether to retry (shipEdits) or best-effort log (run()).
+  private IOException persistLogPosition() {
+    if (lastShippedBatch == null) {
+      return null;
+    }
+
+    ReplicationEndpoint endpoint = source.getReplicationEndpoint();
+
+    if (endpoint.isBufferedReplicationEndpoint() && stagedWalSize > 0) {
+      source.getReplicationEndpoint().beforePersistingReplicationOffset();
+    }
+
+    stagedWalSize = 0;
+    lastStagedFlushTs = EnvironmentEdgeManager.currentTime();
+
+    // Clean up hfile references
+    try {
+      for (Entry entry : entriesForCleanUpHFileRefs) {
+        cleanUpHFileRefs(entry.getEdit());
+        LOG.trace("shipped entry {}: ", entry);
+      }
+    } catch (IOException e) {
+      LOG.warn("{} threw exception while cleaning up hfile refs", endpoint.getClass().getName(), e);
+      return e;
Review Comment:
```suggestion
throw e;
```
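The two error-handling styles under discussion (returning the IOException so `run()` can best-effort log it, versus rethrowing as the review suggests) can be contrasted in a small, self-contained sketch. The names below are illustrative, not the actual HBase code:

```java
import java.io.IOException;

class ErrorStyles {
  // Style used in the patch: capture the failure and hand it back;
  // each caller decides whether to retry or merely log it.
  static IOException persistReturning(boolean fail) {
    try {
      doCleanup(fail);
      return null;
    } catch (IOException e) {
      return e;
    }
  }

  // Style suggested in review: declare `throws` and let the exception
  // propagate, forcing every caller to handle the failure explicitly.
  static void persistThrowing(boolean fail) throws IOException {
    doCleanup(fail);
  }

  // Hypothetical stand-in for cleanUpHFileRefs(...).
  private static void doCleanup(boolean fail) throws IOException {
    if (fail) {
      throw new IOException("hfile ref cleanup failed");
    }
  }
}
```

The returning style keeps `run()` simple at the cost of making it easy to silently drop the error; the throwing style matches the pre-patch behavior noted elsewhere in this thread.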
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceShipper.java:
##
@@ -98,6 +104,13 @@ public final void run() {
     LOG.info("Running ReplicationSourceShipper Thread for wal group: {}", this.walGroupId);
     // Loop until we close down
     while (isActive()) {
+      // check if flush needed for WAL backup, this is need for timeout based flush
+      if (shouldPersistLogPosition()) {
+        IOException error = persistLogPosition();
+        if (error != null) {
+          LOG.warn("Exception while persisting replication state", error);
+        }
+      }
Review Comment:
nit: I don't think you need to change the return type to `IOException`;
instead you can just use `try-catch` if you really need to handle any
exception differently.
```suggestion
      if (shouldPersistLogPosition()) {
        persistLogPosition();
      }
```
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceShipper.java:
##
@@ -229,6 +246,53 @@ private void shipEdits(WALEntryBatch entryBatch) {
     }
   }
+  private boolean shouldPersistLogPosition() {
+    ReplicationEndpoint endpoint = source.getReplicationEndpoint();
+    if (!endpoint.isBufferedReplicationEndpoint()) {
+      // Non-buffering endpoints persist immediately
+      return true;
+    }
+    if (stagedWalSize == 0 || lastShippedBatch == null) {
+      return false;
+    }
+    return stagedWalSize >= endpoint.getMaxBufferSize()
+      || (EnvironmentEdgeManager.currentTime() - lastStagedFlushTs >= endpoint.maxFlushInterval());
+  }
+
+  @Nullable
+  // Returns IOException instead of throwing so callers can decide
+  // whether to retry (shipEdits) or best-effort log (run()).
+  private IOException persistLogPosition() {
Review Comment:
```suggestion
  private void persistLogPosition() throws IOException {
```
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
Apache-HBase commented on PR #7617:
URL: https://github.com/apache/hbase/pull/7617#issuecomment-3786077391
:confetti_ball: **+1 overall**
| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|:--------|:-------:|:-------:|
| +0 :ok: | reexec | 0m 29s | | Docker mode activated. |
| -0 :warning: | yetus | 0m 3s | | Unprocessed flag(s):
--brief-report-file --spotbugs-strict-precheck --author-ignore-list
--blanks-eol-ignore-file --blanks-tabs-ignore-file --quick-hadoopcheck |
_ Prechecks _ |
_ master Compile Tests _ |
| +1 :green_heart: | mvninstall | 3m 27s | | master passed |
| +1 :green_heart: | compile | 1m 6s | | master passed |
| +1 :green_heart: | javadoc | 0m 30s | | master passed |
| +1 :green_heart: | shadedjars | 5m 55s | | branch has no errors when
building our shaded downstream artifacts. |
_ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 3m 8s | | the patch passed |
| +1 :green_heart: | compile | 1m 5s | | the patch passed |
| +1 :green_heart: | javac | 1m 5s | | the patch passed |
| +1 :green_heart: | javadoc | 0m 27s | | the patch passed |
| +1 :green_heart: | shadedjars | 5m 51s | | patch has no errors when
building our shaded downstream artifacts. |
_ Other Tests _ |
| +1 :green_heart: | unit | 241m 59s | | hbase-server in the patch
passed. |
| | | 269m 23s | | |
| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.43 ServerAPI=1.43 base:
https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7617/3/artifact/yetus-jdk17-hadoop3-check/output/Dockerfile
|
| GITHUB PR | https://github.com/apache/hbase/pull/7617 |
| Optional Tests | javac javadoc unit compile shadedjars |
| uname | Linux d4d0c9ba21ef 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May
23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | master / daded968719a4ba7ae0af58cfc0b2b1083f35ad7 |
| Default Java | Eclipse Adoptium-17.0.11+9 |
| Test Results |
https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7617/3/testReport/
|
| Max. process+thread count | 3736 (vs. ulimit of 3) |
| modules | C: hbase-server U: hbase-server |
| Console output |
https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7617/3/console
|
| versions | git=2.34.1 maven=3.9.8 |
| Powered by | Apache Yetus 0.15.0 https://yetus.apache.org |
This message was automatically generated.
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
anmolnar commented on PR #7617: URL: https://github.com/apache/hbase/pull/7617#issuecomment-3781194007 @wchevreuil @Apache9 Thank you guys for the suggestion. I'm happy to merge it to either branch once it's accepted. Please also provide feedback on the patch itself. Does it match the approach you suggested on the feature branch, @Apache9?
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
Apache9 commented on PR #7617: URL: https://github.com/apache/hbase/pull/7617#issuecomment-3777600615 > Why are we targeting this to master? Shouldn't it be on the feature branch? > > Also, as already mentioned by @anmolnar and @taklwu, we should refrain from adding logic that is specific to the continuous backup replication in the generic replication interfaces/classes. My suggestion is that we target the master branch to add this feature first, and then reimplement the continuous backup feature on top of this change. Of course, we do not need to merge this to master first; this is just for easier review. We can land this on a feature branch and then rebase the continuous backup branch. Thanks.
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
anmolnar commented on code in PR #7617:
URL: https://github.com/apache/hbase/pull/7617#discussion_r2709926770
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceShipper.java:
##
@@ -229,6 +239,41 @@ private void shipEdits(WALEntryBatch entryBatch) {
     }
   }
+  private boolean shouldPersistLogPosition() {
+    if (stagedWalSize == 0 || lastShippedBatch == null) {
+      return false;
+    }
+    return (stagedWalSize >= source.getReplicationEndpoint().getMaxBufferSize())
Review Comment:
I can agree with this comment too. You introduced two new properties,
`getMaxBufferSize()` and `maxFlushInterval()`, with default values that should
mimic the original behavior. I think you documented it properly in the
`ReplicationEndpoint` interface, and it would be useful to check for it
explicitly too.
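The default-value pattern discussed here can be sketched independently of HBase. The interface and class names below are illustrative: the interface exposes the two buffering knobs as default methods whose values preserve the old "persist on every shipped batch" behavior, so only buffering endpoints need to override them:

```java
// Illustrative sketch, not the actual ReplicationEndpoint interface.
interface EndpointSketch {
  default long getMaxBufferSize() {
    return -1; // negative sentinel: endpoint does not buffer
  }

  default long maxFlushInterval() {
    return Long.MAX_VALUE; // time-based flush effectively never triggers
  }
}

// A plain endpoint inherits the non-buffering defaults unchanged.
class PlainEndpoint implements EndpointSketch {}

// A buffering endpoint (hypothetical values) overrides both knobs.
class BufferedEndpoint implements EndpointSketch {
  public long getMaxBufferSize() { return 64L * 1024 * 1024; } // 64 MB
  public long maxFlushInterval() { return 60_000L; }           // 1 minute
}
```

Because the defaults reproduce the existing behavior, adding the methods to the interface is source-compatible for all current implementations.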
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/ReplicationEndpoint.java:
##
@@ -283,4 +283,29 @@ public int getTimeout() {
* @throws IllegalStateException if this service's state isn't FAILED.
*/
Throwable failureCause();
+
+  // WAL entries are buffered in ContinuousBackupReplicationEndpoint before flushing to WAL backup
+  // file. So we return config value CONF_BACKUP_MAX_WAL_SIZE for ContinuousBackupReplicationEndpoint
+  // and -1 for other ReplicationEndpoint since they don't buffer.
+  // For other ReplicationEndpoint, everytime a WALEntryBatch is shipped, we update replication
+  // offset. Please check ReplicationSourceShipper#shouldFlushStagedWal()
Review Comment:
@ankitsol It seems that this comment is valid.
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceShipper.java:
##
@@ -229,6 +239,41 @@ private void shipEdits(WALEntryBatch entryBatch) {
}
}
+  private boolean shouldPersistLogPosition() {
+    if (stagedWalSize == 0 || lastShippedBatch == null) {
+      return false;
+    }
+    return (stagedWalSize >= source.getReplicationEndpoint().getMaxBufferSize())
+      || (EnvironmentEdgeManager.currentTime() - lastStagedFlushTs
+        >= source.getReplicationEndpoint().maxFlushInterval());
+  }
+
+  private void persistLogPosition() {
+    if (lastShippedBatch == null) {
+      return;
+    }
+    if (stagedWalSize > 0) {
+      source.getReplicationEndpoint().beforePersistingReplicationOffset();
+    }
+    stagedWalSize = 0;
+    lastStagedFlushTs = EnvironmentEdgeManager.currentTime();
+
+    // Clean up hfile references
+    for (Entry entry : entriesForCleanUpHFileRefs) {
+      try {
+        cleanUpHFileRefs(entry.getEdit());
+      } catch (IOException e) {
+        LOG.warn("{} threw unknown exception:",
+          source.getReplicationEndpoint().getClass().getName(), e);
+      }
Review Comment:
Agree. Either the exception should be rethrown here or don't need to catch
at all.
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/ReplicationEndpoint.java:
##
@@ -283,4 +283,29 @@ public int getTimeout() {
* @throws IllegalStateException if this service's state isn't FAILED.
*/
Throwable failureCause();
+
+  // WAL entries are buffered in ContinuousBackupReplicationEndpoint before flushing to WAL backup
+  // file. So we return config value CONF_BACKUP_MAX_WAL_SIZE for ContinuousBackupReplicationEndpoint
+  // and -1 for other ReplicationEndpoint since they don't buffer.
+  // For other ReplicationEndpoint, everytime a WALEntryBatch is shipped, we update replication
+  // offset. Please check ReplicationSourceShipper#shouldFlushStagedWal()
+  default long getMaxBufferSize() {
+    return -1;
+  }
+
+  // WAL entries are buffered in ContinuousBackupReplicationEndpoint before flushing to WAL backup
+  // file. So we return config value CONF_STAGED_WAL_FLUSH_INTERVAL for
+  // ContinuousBackupReplicationEndpoint and Long.MAX_VALUE for other ReplicationEndpoint since
+  // they don't buffer.
+  // For other ReplicationEndpoint, everytime a WALEntryBatch is shipped, we update replication
+  // offset. Please check ReplicationSourceShipper#shouldFlushStagedWal()
Review Comment:
Same here.
##
hbase-server/src/test/java/org/apache/hadoop/hbase/replication/regionserver/TestReplicationSourceShipperBufferedFlush.java:
##
@@ -0,0 +1,131 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
anmolnar commented on code in PR #7617:
URL: https://github.com/apache/hbase/pull/7617#discussion_r2709831534
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/ReplicationEndpoint.java:
##
@@ -283,4 +283,29 @@ public int getTimeout() {
* @throws IllegalStateException if this service's state isn't FAILED.
*/
Throwable failureCause();
+
+  // WAL entries are buffered in ContinuousBackupReplicationEndpoint before flushing to WAL backup
+  // file. So we return config value CONF_BACKUP_MAX_WAL_SIZE for ContinuousBackupReplicationEndpoint
+  // and -1 for other ReplicationEndpoint since they don't buffer.
+  // For other ReplicationEndpoint, everytime a WALEntryBatch is shipped, we update replication
+  // offset. Please check ReplicationSourceShipper#shouldFlushStagedWal()
+  default long getMaxBufferSize() {
+    return -1;
+  }
+
+  // WAL entries are buffered in ContinuousBackupReplicationEndpoint before flushing to WAL backup
Review Comment:
@ankitsol I agree. Please remove these references from this patch.
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
ankitsol commented on PR #7617: URL: https://github.com/apache/hbase/pull/7617#issuecomment-3774098500 Hi @Apache9 @anmolnar @vinayakphegde, just a gentle reminder on this PR when you get a moment 🙂 Thanks! This PR is a follow-up to https://github.com/apache/hbase/pull/7591
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
ankitsol commented on PR #7617: URL: https://github.com/apache/hbase/pull/7617#issuecomment-3772997503 > can you also check if these two failures are related? > > TestHRegionWithInMemoryFlush.testParallelIncrementWithMemStoreFlush TestTags.testFlushAndCompactionwithCombinations These are passing locally.
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
taklwu commented on code in PR #7617:
URL: https://github.com/apache/hbase/pull/7617#discussion_r2684251451
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/ReplicationEndpoint.java:
##
@@ -283,4 +283,29 @@ public int getTimeout() {
* @throws IllegalStateException if this service's state isn't FAILED.
*/
Throwable failureCause();
+
+  // WAL entries are buffered in ContinuousBackupReplicationEndpoint before flushing to WAL backup
+  // file. So we return config value CONF_BACKUP_MAX_WAL_SIZE for ContinuousBackupReplicationEndpoint
+  // and -1 for other ReplicationEndpoint since they don't buffer.
+  // For other ReplicationEndpoint, everytime a WALEntryBatch is shipped, we update replication
+  // offset. Please check ReplicationSourceShipper#shouldFlushStagedWal()
+  default long getMaxBufferSize() {
+    return -1;
+  }
+
+  // WAL entries are buffered in ContinuousBackupReplicationEndpoint before flushing to WAL backup
Review Comment:
nit: `ContinuousBackupReplicationEndpoint` is part of
https://github.com/apache/hbase/pull/7591/ and has not yet been committed to
master; should we mention that earlier in this change?
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceShipper.java:
##
@@ -229,6 +239,41 @@ private void shipEdits(WALEntryBatch entryBatch) {
     }
   }
+  private boolean shouldPersistLogPosition() {
+    if (stagedWalSize == 0 || lastShippedBatch == null) {
+      return false;
+    }
+    return (stagedWalSize >= source.getReplicationEndpoint().getMaxBufferSize())
+      || (EnvironmentEdgeManager.currentTime() - lastStagedFlushTs
+        >= source.getReplicationEndpoint().maxFlushInterval());
+  }
+
+  private void persistLogPosition() {
+    if (lastShippedBatch == null) {
+      return;
+    }
+    if (stagedWalSize > 0) {
+      source.getReplicationEndpoint().beforePersistingReplicationOffset();
+    }
+    stagedWalSize = 0;
+    lastStagedFlushTs = EnvironmentEdgeManager.currentTime();
+
+    // Clean up hfile references
+    for (Entry entry : entriesForCleanUpHFileRefs) {
+      try {
+        cleanUpHFileRefs(entry.getEdit());
+      } catch (IOException e) {
+        LOG.warn("{} threw unknown exception:",
+          source.getReplicationEndpoint().getClass().getName(), e);
+      }
Review Comment:
nit: will this be a behavior change because previously when
`cleanUpHFileRefs` failed, it's throwing thru the function but here we're
logging it only.
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
Copilot commented on code in PR #7617:
URL: https://github.com/apache/hbase/pull/7617#discussion_r2684440781
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceShipper.java:
##
@@ -229,6 +239,41 @@ private void shipEdits(WALEntryBatch entryBatch) {
     }
   }
+  private boolean shouldPersistLogPosition() {
+    if (stagedWalSize == 0 || lastShippedBatch == null) {
+      return false;
+    }
+    return (stagedWalSize >= source.getReplicationEndpoint().getMaxBufferSize())
Review Comment:
The condition checks if stagedWalSize is greater than or equal to
getMaxBufferSize(), but getMaxBufferSize() can return -1 for non-buffering
endpoints (as documented in ReplicationEndpoint). This means the comparison
'stagedWalSize >= -1' would always be true when stagedWalSize > 0, causing
immediate flushes for non-buffering endpoints. While this preserves existing
behavior, the intent is unclear and could be confusing. Consider explicitly
checking for -1 to make the logic more explicit.
```suggestion
    long maxBufferSize = source.getReplicationEndpoint().getMaxBufferSize();
    // For non-buffering endpoints, getMaxBufferSize() returns a negative value (e.g., -1).
    // In that case, we always trigger a flush based on size as soon as there is staged data.
    boolean sizeBasedFlush = (maxBufferSize < 0) || (stagedWalSize >= maxBufferSize);
    return sizeBasedFlush
```
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceShipper.java:
##
@@ -229,6 +239,41 @@ private void shipEdits(WALEntryBatch entryBatch) {
     }
   }
+  private boolean shouldPersistLogPosition() {
+    if (stagedWalSize == 0 || lastShippedBatch == null) {
+      return false;
+    }
+    return (stagedWalSize >= source.getReplicationEndpoint().getMaxBufferSize())
+      || (EnvironmentEdgeManager.currentTime() - lastStagedFlushTs
+        >= source.getReplicationEndpoint().maxFlushInterval());
+  }
+
+  private void persistLogPosition() {
+    if (lastShippedBatch == null) {
+      return;
+    }
+    if (stagedWalSize > 0) {
+      source.getReplicationEndpoint().beforePersistingReplicationOffset();
+    }
+    stagedWalSize = 0;
+    lastStagedFlushTs = EnvironmentEdgeManager.currentTime();
+
+    // Clean up hfile references
+    for (Entry entry : entriesForCleanUpHFileRefs) {
+      try {
+        cleanUpHFileRefs(entry.getEdit());
+      } catch (IOException e) {
Review Comment:
The error handling for IOException has been changed to catch and log the
exception instead of propagating it. This silently suppresses IOException
failures during cleanup, which could hide serious issues like file system
problems. If cleanup failures should be non-fatal, this should be explicitly
documented, or consider at least incrementing a failure metric to track these
errors.
```suggestion
      } catch (IOException e) {
        // Cleanup failures are intentionally treated as non-fatal: replication has already
        // succeeded for these entries, so we log the failure and continue.
```
##
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/ReplicationEndpoint.java:
##
@@ -283,4 +283,29 @@ public int getTimeout() {
* @throws IllegalStateException if this service's state isn't FAILED.
*/
Throwable failureCause();
+
+  // WAL entries are buffered in ContinuousBackupReplicationEndpoint before flushing to WAL backup
+  // file. So we return config value CONF_BACKUP_MAX_WAL_SIZE for ContinuousBackupReplicationEndpoint
+  // and -1 for other ReplicationEndpoint since they don't buffer.
+  // For other ReplicationEndpoint, everytime a WALEntryBatch is shipped, we update replication
+  // offset. Please check ReplicationSourceShipper#shouldFlushStagedWal()
Review Comment:
The comment references 'shouldFlushStagedWal()' but the actual method name
in ReplicationSourceShipper is 'shouldPersistLogPosition()'. This inconsistency
will confuse developers trying to understand the interaction between these
components.
##
hbase-server/src/test/java/org/apache/hadoop/hbase/replication/regionserver/TestReplicationSourceShipperBufferedFlush.java:
##
@@ -0,0 +1,131 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANT
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
Apache-HBase commented on PR #7617:
URL: https://github.com/apache/hbase/pull/7617#issuecomment-3740810146
:broken_heart: **-1 overall**
| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|:--------|:-------:|:-------:|
| +0 :ok: | reexec | 0m 14s | | Docker mode activated. |
| -0 :warning: | yetus | 0m 3s | | Unprocessed flag(s):
--brief-report-file --spotbugs-strict-precheck --author-ignore-list
--blanks-eol-ignore-file --blanks-tabs-ignore-file --quick-hadoopcheck |
_ Prechecks _ |
_ master Compile Tests _ |
| +1 :green_heart: | mvninstall | 3m 19s | | master passed |
| +1 :green_heart: | compile | 0m 59s | | master passed |
| +1 :green_heart: | javadoc | 0m 30s | | master passed |
| +1 :green_heart: | shadedjars | 6m 20s | | branch has no errors when
building our shaded downstream artifacts. |
_ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 2m 55s | | the patch passed |
| +1 :green_heart: | compile | 1m 0s | | the patch passed |
| +1 :green_heart: | javac | 1m 0s | | the patch passed |
| +1 :green_heart: | javadoc | 0m 27s | | the patch passed |
| +1 :green_heart: | shadedjars | 6m 20s | | patch has no errors when
building our shaded downstream artifacts. |
_ Other Tests _ |
| -1 :x: | unit | 251m 26s |
[/patch-unit-hbase-server.txt](https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7617/2/artifact/yetus-jdk17-hadoop3-check/output/patch-unit-hbase-server.txt)
| hbase-server in the patch failed. |
| | | 277m 59s | | |
| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.52 ServerAPI=1.52 base:
https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7617/2/artifact/yetus-jdk17-hadoop3-check/output/Dockerfile
|
| GITHUB PR | https://github.com/apache/hbase/pull/7617 |
| Optional Tests | javac javadoc unit compile shadedjars |
| uname | Linux 82b2e4b8e507 6.14.0-1018-aws #18~24.04.1-Ubuntu SMP Mon Nov
24 19:46:27 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | master / 95000e024dbdf7543366f7e883db615c10d43d49 |
| Default Java | Eclipse Adoptium-17.0.11+9 |
| Test Results |
https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7617/2/testReport/
|
| Max. process+thread count | 4718 (vs. ulimit of 3) |
| modules | C: hbase-server U: hbase-server |
| Console output |
https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7617/2/console
|
| versions | git=2.34.1 maven=3.9.8 |
| Powered by | Apache Yetus 0.15.0 https://yetus.apache.org |
This message was automatically generated.
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
Apache-HBase commented on PR #7617:
URL: https://github.com/apache/hbase/pull/7617#issuecomment-3739962557
:confetti_ball: **+1 overall**
| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|:--------|:-------:|:-------:|
| +0 :ok: | reexec | 0m 28s | | Docker mode activated. |
_ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files
found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available.
|
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain
any @author tags. |
| +1 :green_heart: | hbaseanti | 0m 0s | | Patch does not have any
anti-patterns. |
_ master Compile Tests _ |
| +1 :green_heart: | mvninstall | 3m 39s | | master passed |
| +1 :green_heart: | compile | 4m 9s | | master passed |
| +1 :green_heart: | checkstyle | 1m 10s | | master passed |
| +1 :green_heart: | spotbugs | 2m 5s | | master passed |
| +1 :green_heart: | spotless | 1m 4s | | branch has no errors when
running spotless:check. |
_ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 3m 45s | | the patch passed |
| +1 :green_heart: | compile | 4m 8s | | the patch passed |
| +1 :green_heart: | javac | 4m 8s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks
issues. |
| +1 :green_heart: | checkstyle | 1m 10s | | the patch passed |
| +1 :green_heart: | spotbugs | 2m 28s | | the patch passed |
| +1 :green_heart: | hadoopcheck | 13m 50s | | Patch does not cause any
errors with Hadoop 3.3.6 3.4.1. |
| +1 :green_heart: | spotless | 0m 56s | | patch has no errors when
running spotless:check. |
_ Other Tests _ |
| +1 :green_heart: | asflicense | 0m 13s | | The patch does not
generate ASF License warnings. |
| | | 48m 15s | | |
| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.43 ServerAPI=1.43 base:
https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7617/2/artifact/yetus-general-check/output/Dockerfile
|
| GITHUB PR | https://github.com/apache/hbase/pull/7617 |
| Optional Tests | dupname asflicense javac spotbugs checkstyle codespell
detsecrets compile hadoopcheck hbaseanti spotless |
| uname | Linux 4f1801e5c24c 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May
23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | master / 95000e024dbdf7543366f7e883db615c10d43d49 |
| Default Java | Eclipse Adoptium-17.0.11+9 |
| Max. process+thread count | 83 (vs. ulimit of 3) |
| modules | C: hbase-server U: hbase-server |
| Console output |
https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7617/2/console
|
| versions | git=2.34.1 maven=3.9.8 spotbugs=4.7.3 |
| Powered by | Apache Yetus 0.15.0 https://yetus.apache.org |
This message was automatically generated.
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
ankitsol commented on PR #7617: URL: https://github.com/apache/hbase/pull/7617#issuecomment-3739717187 @Apache9 @anmolnar @vinayakphegde Please review this PR as a follow-up to preview PR https://github.com/apache/hbase/pull/7591
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
Apache-HBase commented on PR #7617:
URL: https://github.com/apache/hbase/pull/7617#issuecomment-3736834357
:confetti_ball: **+1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:-------:|:-------:|
| +0 :ok: | reexec | 0m 28s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | hbaseanti | 0m 0s | | Patch does not have any anti-patterns. |
|||| _ master Compile Tests _ |
| +1 :green_heart: | mvninstall | 3m 30s | | master passed |
| +1 :green_heart: | compile | 3m 28s | | master passed |
| +1 :green_heart: | checkstyle | 1m 0s | | master passed |
| +1 :green_heart: | spotbugs | 1m 39s | | master passed |
| +1 :green_heart: | spotless | 0m 51s | | branch has no errors when running spotless:check. |
|||| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 3m 10s | | the patch passed |
| +1 :green_heart: | compile | 3m 33s | | the patch passed |
| +1 :green_heart: | javac | 3m 33s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 :green_heart: | checkstyle | 1m 0s | | the patch passed |
| +1 :green_heart: | spotbugs | 1m 43s | | the patch passed |
| +1 :green_heart: | hadoopcheck | 12m 7s | | Patch does not cause any errors with Hadoop 3.3.6 3.4.1. |
| +1 :green_heart: | spotless | 0m 46s | | patch has no errors when running spotless:check. |
|||| _ Other Tests _ |
| +1 :green_heart: | asflicense | 0m 11s | | The patch does not generate ASF License warnings. |
| | | 41m 6s | | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7617/1/artifact/yetus-general-check/output/Dockerfile |
| GITHUB PR | https://github.com/apache/hbase/pull/7617 |
| Optional Tests | dupname asflicense javac spotbugs checkstyle codespell detsecrets compile hadoopcheck hbaseanti spotless |
| uname | Linux cb70c3150cf1 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | master / ea8ed6a845cd1ba2f4660974cb3368338cfd7576 |
| Default Java | Eclipse Adoptium-17.0.11+9 |
| Max. process+thread count | 86 (vs. ulimit of 3) |
| modules | C: hbase-server U: hbase-server |
| Console output | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7617/1/console |
| versions | git=2.34.1 maven=3.9.8 spotbugs=4.7.3 |
| Powered by | Apache Yetus 0.15.0 https://yetus.apache.org |

This message was automatically generated.
Re: [PR] HBASE-29823 Control WAL flush and offset persistence from ReplicationSourceShipper [hbase]
Apache-HBase commented on PR #7617:
URL: https://github.com/apache/hbase/pull/7617#issuecomment-3736828446
:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:-------:|:-------:|
| +0 :ok: | reexec | 2m 48s | | Docker mode activated. |
| -0 :warning: | yetus | 0m 2s | | Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --author-ignore-list --blanks-eol-ignore-file --blanks-tabs-ignore-file --quick-hadoopcheck |
|||| _ Prechecks _ |
|||| _ master Compile Tests _ |
| +1 :green_heart: | mvninstall | 3m 19s | | master passed |
| +1 :green_heart: | compile | 0m 58s | | master passed |
| +1 :green_heart: | javadoc | 0m 28s | | master passed |
| +1 :green_heart: | shadedjars | 6m 3s | | branch has no errors when building our shaded downstream artifacts. |
|||| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 3m 4s | | the patch passed |
| +1 :green_heart: | compile | 0m 58s | | the patch passed |
| +1 :green_heart: | javac | 0m 58s | | the patch passed |
| +1 :green_heart: | javadoc | 0m 26s | | the patch passed |
| +1 :green_heart: | shadedjars | 6m 0s | | patch has no errors when building our shaded downstream artifacts. |
|||| _ Other Tests _ |
| -1 :x: | unit | 9m 16s | [/patch-unit-hbase-server.txt](https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7617/1/artifact/yetus-jdk17-hadoop3-check/output/patch-unit-hbase-server.txt) | hbase-server in the patch failed. |
| | | 35m 38s | | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7617/1/artifact/yetus-jdk17-hadoop3-check/output/Dockerfile |
| GITHUB PR | https://github.com/apache/hbase/pull/7617 |
| Optional Tests | javac javadoc unit compile shadedjars |
| uname | Linux 1274cf557e0e 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | master / ea8ed6a845cd1ba2f4660974cb3368338cfd7576 |
| Test Results | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7617/1/testReport/ |
| Max. process+thread count | 1120 (vs. ulimit of 3) |
| modules | C: hbase-server U: hbase-server |
| Console output | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7617/1/console |
| versions | git=2.34.1 maven=3.9.8 |
| Powered by | Apache Yetus 0.15.0 https://yetus.apache.org |

This message was automatically generated.
