navina commented on code in PR #10995:
URL: https://github.com/apache/pinot/pull/10995#discussion_r1256427515
##########
pinot-plugins/pinot-stream-ingestion/pinot-pulsar/src/main/java/org/apache/pinot/plugin/stream/pulsar/PulsarConfig.java:
##########
@@ -46,6 +46,8 @@ public class PulsarConfig {
private final String _authenticationToken;
private final String _tlsTrustCertsFilePath;
private final boolean _enableKeyValueStitch;
+ private final boolean _populateMetadata;
+ private final boolean _populateRecordTimeFromEventTime;
Review Comment:
@JeffBolle `StreamMessageMetadata#getRecordIngestionTimeMs()` is used in
ingestion delay calculations, so it would be incorrect to populate it from the
user's `eventTime`. Pinot is tracking consumption from the stream and needs the
time at which the record landed in the source.
[See](https://github.com/apache/pinot/blob/68ab6bcd8da321960cbad4715ae80dd33ffd1d60/pinot-core/src/main/java/org/apache/pinot/core/data/manager/realtime/LLRealtimeSegmentDataManager.java#L1611C38-L1611C38)
I'm not very familiar with Pulsar semantics: is `publishTime` set by the
producer of the data, or by the broker when the message is persisted in the
stream? It should be the latter.
In any case, you should avoid using event time as `recordIngestionTimeMs`.
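
To illustrate the concern, here is a minimal sketch of how the choice of timestamp changes a delay computed as "now minus record ingestion time". The class, field, and method names are hypothetical stand-ins, not the Pinot or Pulsar APIs, and the formula is a simplification of whatever the real delay tracker does.

```java
// Hypothetical sketch; none of these names come from Pinot or Pulsar.
final class IngestionDelaySketch {

    // Stand-in for a stream message carrying two timestamps.
    static final class SketchMessage {
        final long publishTimeMs; // assumed: when the record landed in the stream
        final long eventTimeMs;   // assumed: application-supplied, may be arbitrarily old

        SketchMessage(long publishTimeMs, long eventTimeMs) {
            this.publishTimeMs = publishTimeMs;
            this.eventTimeMs = eventTimeMs;
        }
    }

    // Simplified delay: how far behind "now" the consumed record is.
    static long ingestionDelayMs(long nowMs, long recordIngestionTimeMs) {
        return nowMs - recordIngestionTimeMs;
    }

    public static void main(String[] args) {
        long nowMs = 1_000_000_000L;
        // Record was published 500 ms ago, but describes an event from an hour ago.
        SketchMessage msg = new SketchMessage(nowMs - 500L, nowMs - 3_600_000L);

        // Reflects how far the consumer lags behind the stream.
        System.out.println(ingestionDelayMs(nowMs, msg.publishTimeMs));
        // Reflects how old the event is, not how far the consumer lags.
        System.out.println(ingestionDelayMs(nowMs, msg.eventTimeMs));
    }
}
```

With the event time plugged in, a consumer that is fully caught up would still appear to be an hour behind, which is why the two timestamps should not be interchangeable here.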
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]