xintongsong commented on code in PR #22652:
URL: https://github.com/apache/flink/pull/22652#discussion_r1212515884
##########
flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/hybrid/tiered/storage/TieredStorageProducerClient.java:
##########
@@ -119,7 +149,60 @@ private void writeAccumulatedBuffers(
     private void writeAccumulatedBuffer(
             TieredStorageSubpartitionId subpartitionId, Buffer accumulatedBuffer)
             throws IOException {
-        // TODO, Try to write the accumulated buffer to the appropriate tier. After the tier is
-        // decided, then write the accumulated buffer to the tier.
+        updateStatistics(accumulatedBuffer);
+        Buffer compressedBuffer = compressBufferIfPossible(accumulatedBuffer);
+
+        if (currentSubpartitionTierAgent[subpartitionId.getSubpartitionId()] == null) {
+            chooseStorageTierToStartSegment(subpartitionId);
+        }
+
+        boolean isSuccess =
+                currentSubpartitionTierAgent[subpartitionId.getSubpartitionId()].write(
Review Comment:
The semantics of the `TierProducerAgent#write` return value need to be
documented. E.g., after a write failure, do we allow writing to the same
segment again?
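One way the contract could be pinned down — a sketch only, with a simplified stand-in interface rather than Flink's actual `TierProducerAgent` signature; the "false means the segment is full, start a new segment before writing again" semantics are an assumption drawn from how the PR code reacts to a failed write:

```java
/** Sketch: a simplified stand-in documenting one possible write() contract. */
public class WriteContractSketch {

    /** Hypothetical, simplified form of the tier agent interface. */
    interface TierAgent {
        /**
         * Tries to write a buffer to the current segment of the subpartition.
         *
         * @return true if the buffer was accepted; false if the current segment
         *     cannot take more data. After a {@code false} return, the caller
         *     must NOT retry the same segment; it must first open a new segment
         *     via {@link #tryStartNewSegment} and then write again.
         */
        boolean write(int subpartitionId, byte[] data);

        /** Tries to open a new, empty segment for the subpartition. */
        boolean tryStartNewSegment(int subpartitionId, int segmentId);
    }

    /** Toy agent: each segment accepts a fixed number of buffers. */
    static class CapacityBoundedAgent implements TierAgent {
        private final int segmentCapacity;
        private int written;

        CapacityBoundedAgent(int segmentCapacity) {
            this.segmentCapacity = segmentCapacity;
        }

        @Override
        public boolean write(int subpartitionId, byte[] data) {
            if (written >= segmentCapacity) {
                return false; // segment full: caller must start a new segment first
            }
            written++;
            return true;
        }

        @Override
        public boolean tryStartNewSegment(int subpartitionId, int segmentId) {
            written = 0; // fresh segment, capacity available again
            return true;
        }
    }

    public static void main(String[] args) {
        TierAgent agent = new CapacityBoundedAgent(1);
        if (!agent.write(0, new byte[8])) throw new AssertionError();
        if (agent.write(0, new byte[8])) throw new AssertionError("segment full");
        agent.tryStartNewSegment(0, 1);
        if (!agent.write(0, new byte[8])) throw new AssertionError();
        System.out.println("contract holds");
    }
}
```

Making this explicit in the Javadoc would settle the reviewer's question directly: a failed write invalidates the current segment for that subpartition.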
##########
flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/hybrid/tiered/storage/TieredStorageProducerClient.java:
##########
@@ -100,11 +128,13 @@ public void close() {
      */
     private void writeAccumulatedBuffers(
             TieredStorageSubpartitionId subpartitionId, List<Buffer> accumulatedBuffers) {
+        Queue<Buffer> buffers = new ArrayDeque<>(accumulatedBuffers);
         try {
-            for (Buffer finishedBuffer : accumulatedBuffers) {
-                writeAccumulatedBuffer(subpartitionId, finishedBuffer);
+            while (!buffers.isEmpty()) {
+                writeAccumulatedBuffer(subpartitionId, buffers.poll());
             }
         } catch (IOException e) {
+            buffers.forEach(Buffer::recycleBuffer);
             ExceptionUtils.rethrow(e);
         }
Review Comment:
There's no need to create the `ArrayDeque`. This can be achieved with
`List#iterator()`.
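The reviewer's suggestion can be sketched as follows — a self-contained illustration with a minimal `Buffer` stand-in (only a recycle flag), not Flink's real class. The `Iterator` tracks the consumption point, so on failure `forEachRemaining` recycles exactly the buffers not yet handed to a tier, without ever copying the list:

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class IteratorRecycleSketch {

    /** Minimal stand-in for Flink's Buffer: just records whether it was recycled. */
    static class Buffer {
        boolean recycled;

        void recycleBuffer() {
            recycled = true;
        }
    }

    /**
     * Writes buffers in order; on failure, recycles the buffers not yet consumed.
     * failAtIndex simulates a tier write failure on that buffer.
     */
    static void writeAccumulatedBuffers(List<Buffer> accumulatedBuffers, int failAtIndex) {
        Iterator<Buffer> iterator = accumulatedBuffers.iterator();
        int index = 0;
        try {
            while (iterator.hasNext()) {
                Buffer next = iterator.next();
                if (index++ == failAtIndex) {
                    throw new IOException("simulated tier write failure");
                }
                // writeAccumulatedBuffer(subpartitionId, next) would go here
            }
        } catch (IOException e) {
            // Recycle only what the iterator has not yet consumed. Note the
            // failing buffer itself was already consumed by next(), so it is
            // skipped here -- the same gap the ArrayDeque version has.
            iterator.forEachRemaining(Buffer::recycleBuffer);
        }
    }

    public static void main(String[] args) {
        List<Buffer> buffers = Arrays.asList(new Buffer(), new Buffer(), new Buffer());
        writeAccumulatedBuffers(buffers, 1); // fail on the second buffer
        System.out.println(buffers.get(0).recycled); // false: already written
        System.out.println(buffers.get(1).recycled); // false: consumed by next() before failing
        System.out.println(buffers.get(2).recycled); // true: recycled in the catch block
    }
}
```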
##########
flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/hybrid/tiered/storage/TieredStorageProducerClient.java:
##########
@@ -100,11 +128,13 @@ public void close() {
      */
     private void writeAccumulatedBuffers(
             TieredStorageSubpartitionId subpartitionId, List<Buffer> accumulatedBuffers) {
+        Queue<Buffer> buffers = new ArrayDeque<>(accumulatedBuffers);
         try {
-            for (Buffer finishedBuffer : accumulatedBuffers) {
-                writeAccumulatedBuffer(subpartitionId, finishedBuffer);
+            while (!buffers.isEmpty()) {
+                writeAccumulatedBuffer(subpartitionId, buffers.poll());
             }
         } catch (IOException e) {
+            buffers.forEach(Buffer::recycleBuffer);
             ExceptionUtils.rethrow(e);
         }
Review Comment:
What may cause an exception here? What happens to the particular buffer that
causes the exception?
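One possible answer to the second question, sketched with minimal stand-in types (this is an illustration of ownership-based cleanup, not the PR's actual code): since `writeAccumulatedBuffer` holds the failing buffer when the tier write throws, it can recycle that buffer itself before rethrowing, leaving the caller responsible only for the buffers it has not yet handed over:

```java
import java.io.IOException;

public class FailingBufferSketch {

    /** Minimal stand-in for Flink's Buffer. */
    static class Buffer {
        boolean recycled;

        void recycleBuffer() {
            recycled = true;
        }
    }

    /** Hypothetical tier write that may fail with an IOException. */
    interface TierWriter {
        void write(Buffer buffer) throws IOException;
    }

    static void writeAccumulatedBuffer(Buffer buffer, TierWriter tier) throws IOException {
        try {
            tier.write(buffer);
        } catch (IOException e) {
            // This method owns the failing buffer at this point, so it
            // releases it here instead of leaking it to the caller.
            buffer.recycleBuffer();
            throw e;
        }
    }

    public static void main(String[] args) {
        Buffer buffer = new Buffer();
        try {
            writeAccumulatedBuffer(buffer, b -> {
                throw new IOException("simulated failure");
            });
        } catch (IOException expected) {
            // caller still observes the failure, but the buffer is not leaked
        }
        System.out.println(buffer.recycled); // true
    }
}
```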
##########
flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/hybrid/tiered/storage/TieredStorageProducerClient.java:
##########
@@ -119,7 +149,60 @@ private void writeAccumulatedBuffers(
     private void writeAccumulatedBuffer(
             TieredStorageSubpartitionId subpartitionId, Buffer accumulatedBuffer)
             throws IOException {
-        // TODO, Try to write the accumulated buffer to the appropriate tier. After the tier is
-        // decided, then write the accumulated buffer to the tier.
+        updateStatistics(accumulatedBuffer);
+        Buffer compressedBuffer = compressBufferIfPossible(accumulatedBuffer);
+
+        if (currentSubpartitionTierAgent[subpartitionId.getSubpartitionId()] == null) {
+            chooseStorageTierToStartSegment(subpartitionId);
+        }
+
+        boolean isSuccess =
+                currentSubpartitionTierAgent[subpartitionId.getSubpartitionId()].write(
+                        subpartitionId, compressedBuffer);
+        if (!isSuccess) {
+            chooseStorageTierToStartSegment(subpartitionId);
+            isSuccess =
+                    currentSubpartitionTierAgent[subpartitionId.getSubpartitionId()].write(
+                            subpartitionId, compressedBuffer);
+            checkState(isSuccess, "Failed to write the first buffer to the new segment");
+        }
+    }
+
+    private void chooseStorageTierToStartSegment(TieredStorageSubpartitionId subpartitionId)
+            throws IOException {
+        int subpartitionIndex = subpartitionId.getSubpartitionId();
+        int segmentIndex = currentSubpartitionSegmentId[subpartitionIndex];
+        int nextSegmentIndex = segmentIndex + 1;
+
+        for (TierProducerAgent tierProducerAgent : tierProducerAgents) {
+            if (tierProducerAgent.tryStartNewSegment(subpartitionId, nextSegmentIndex)) {
+                // Update the segment index and the chosen storage tier for the subpartition.
+                currentSubpartitionSegmentId[subpartitionIndex] = nextSegmentIndex;
+                currentSubpartitionTierAgent[subpartitionIndex] = tierProducerAgent;
+                return;
+            }
+        }
+        throw new IOException("Failed to choose a storage tier to start a new segment.");
+    }
+
+    private Buffer compressBufferIfPossible(Buffer buffer) {
+        if (!canBeCompressed(buffer)) {
+            return buffer;
+        }
+
+        return checkNotNull(bufferCompressor).compressToOriginalBuffer(buffer);
+    }
+
+    /**
+     * Whether the buffer can be compressed or not. Note that event is not compressed because it is
+     * usually small and the size can become even larger after compression.
+     */
+    private boolean canBeCompressed(Buffer buffer) {
+        return bufferCompressor != null && buffer.isBuffer() && buffer.readableBytes() > 0;
+    }
+
+    private void updateStatistics(Buffer buffer) {
+        checkNotNull(outputMetrics).getNumBuffersOut().inc();
+        checkNotNull(outputMetrics).getNumBytesOut().inc(buffer.readableBytes());
Review Comment:
This indicates that Flink's metric mechanism is invading the tiered storage.
Alternatively, we can apply a listener pattern in which the tiered result
partition subscribes to tiered storage statistics updates.
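The suggested listener pattern could look like the following — all names here (`TieredStorageStatisticsListener`, `onBufferWritten`, `registerStatisticsListener`) are hypothetical, chosen only for illustration. The producer reports raw counts to subscribers; translating them into `numBuffersOut`/`numBytesOut` metrics stays on the tiered result partition side:

```java
import java.util.ArrayList;
import java.util.List;

public class StatisticsListenerSketch {

    /** Hypothetical subscription interface owned by the tiered storage. */
    interface TieredStorageStatisticsListener {
        void onBufferWritten(long numBytes);
    }

    /** The producer side knows only about listeners, not about Flink metrics. */
    static class Producer {
        private final List<TieredStorageStatisticsListener> listeners = new ArrayList<>();

        void registerStatisticsListener(TieredStorageStatisticsListener listener) {
            listeners.add(listener);
        }

        void writeBuffer(int numBytes) {
            // ... the actual tier write would happen here ...
            for (TieredStorageStatisticsListener listener : listeners) {
                listener.onBufferWritten(numBytes);
            }
        }
    }

    /** Stand-in for the tiered result partition updating its own metrics. */
    static class MetricsSubscriber implements TieredStorageStatisticsListener {
        long numBuffersOut;
        long numBytesOut;

        @Override
        public void onBufferWritten(long numBytes) {
            numBuffersOut++;
            numBytesOut += numBytes;
        }
    }

    public static void main(String[] args) {
        Producer producer = new Producer();
        MetricsSubscriber metrics = new MetricsSubscriber();
        producer.registerStatisticsListener(metrics);
        producer.writeBuffer(128);
        producer.writeBuffer(64);
        System.out.println(metrics.numBuffersOut); // 2
        System.out.println(metrics.numBytesOut);   // 192
    }
}
```

This inverts the dependency: tiered storage no longer needs a reference to `outputMetrics` at all.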
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]