snleee commented on code in PR #10045:
URL: https://github.com/apache/pinot/pull/10045#discussion_r1062679718
##########
pinot-connectors/pinot-flink-connector/src/main/java/org/apache/pinot/connector/flink/sink/PinotSinkFunction.java:
##########
@@ -151,4 +148,18 @@ private void flush()
LOG.info("Pinot segment uploaded to {}", segmentURI);
});
}
+
+ @Override
+ public List<GenericRow> snapshotState(long checkpointId, long timestamp) {
Review Comment:
I see. Have you thought of using Kafka Sink on Flink side and let Pinot
directly consume from the output kafka topic? Pinot's real-time consumption
handles the offset commit by itself and it handles fault tolerance by resuming
to consume from the offset that has been committed recently on the Pinot side.
For this PR, let's at least go with your 2nd approach using Flink state to
unblock you. In case of Flink state, does Flink take care of cleaning up the
RocksDB data? If not, we should handle the data clean up when exiting the
`flush()`.
##########
pinot-connectors/pinot-flink-connector/src/main/java/org/apache/pinot/connector/flink/sink/PinotSinkFunction.java:
##########
@@ -151,4 +148,18 @@ private void flush()
LOG.info("Pinot segment uploaded to {}", segmentURI);
});
}
+
+ @Override
+ public List<GenericRow> snapshotState(long checkpointId, long timestamp) {
Review Comment:
I see. Have you thought of using Kafka Sink on the Flink side and let Pinot
directly consume from the output kafka topic? Pinot's real-time consumption
handles the offset commit by itself and it handles fault tolerance by resuming
to consume from the offset that has been committed recently on the Pinot side.
For this PR, let's at least go with your 2nd approach using Flink state to
unblock you. In case of Flink state, does Flink take care of cleaning up the
RocksDB data? If not, we should handle the data clean up when exiting the
`flush()`.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]