abhishekrb19 commented on code in PR #18855:
URL: https://github.com/apache/druid/pull/18855#discussion_r2635919315
##########
indexing-service/src/main/java/org/apache/druid/indexing/seekablestream/SeekableStreamIndexTaskRunner.java:
##########
@@ -2144,26 +2146,36 @@ private void refreshMinMaxMessageTime()
);
}
- public boolean withinMinMaxRecordTime(final InputRow row)
+ /**
+ * Returns the rejection reason for a row, or null if the row should be
accepted.
+ * This method is used as a {@link RowFilter} for the {@link
StreamChunkParser}.
+ */
+ @Nullable
+ ThrownAwayReason getRowRejectionReason(final InputRow row)
{
- final boolean beforeMinimumMessageTime =
minMessageTime.isAfter(row.getTimestamp());
- final boolean afterMaximumMessageTime =
maxMessageTime.isBefore(row.getTimestamp());
-
- if (log.isDebugEnabled()) {
- if (beforeMinimumMessageTime) {
+ if (row == null) {
+ return ThrownAwayReason.NULL;
+ }
+ if (minMessageTime.isAfter(row.getTimestamp())) {
+ if (log.isDebugEnabled()) {
log.debug(
"CurrentTimeStamp[%s] is before MinimumMessageTime[%s]",
row.getTimestamp(),
minMessageTime
);
- } else if (afterMaximumMessageTime) {
+ }
+ return ThrownAwayReason.BEFORE_MIN_MESSAGE_TIME;
+ }
+ if (maxMessageTime.isBefore(row.getTimestamp())) {
+ if (log.isDebugEnabled()) {
log.debug(
"CurrentTimeStamp[%s] is after MaximumMessageTime[%s]",
Review Comment:
```suggestion
"CurrentTimeStamp[%s] is after maximumMessageTime[%s]",
```
##########
indexing-service/src/main/java/org/apache/druid/indexing/seekablestream/SeekableStreamIndexTaskRunner.java:
##########
@@ -2144,26 +2146,36 @@ private void refreshMinMaxMessageTime()
);
}
- public boolean withinMinMaxRecordTime(final InputRow row)
+ /**
+ * Returns the rejection reason for a row, or null if the row should be
accepted.
+ * This method is used as a {@link RowFilter} for the {@link
StreamChunkParser}.
+ */
+ @Nullable
+ ThrownAwayReason getRowRejectionReason(final InputRow row)
Review Comment:
```suggestion
ThrownAwayReason getRowRejectionReason(@Nullable final InputRow row)
```
##########
indexing-service/src/main/java/org/apache/druid/indexing/seekablestream/SeekableStreamIndexTaskRunner.java:
##########
@@ -2144,26 +2146,36 @@ private void refreshMinMaxMessageTime()
);
}
- public boolean withinMinMaxRecordTime(final InputRow row)
+ /**
+ * Returns the rejection reason for a row, or null if the row should be
accepted.
+ * This method is used as a {@link RowFilter} for the {@link
StreamChunkParser}.
+ */
+ @Nullable
+ ThrownAwayReason getRowRejectionReason(final InputRow row)
{
- final boolean beforeMinimumMessageTime =
minMessageTime.isAfter(row.getTimestamp());
- final boolean afterMaximumMessageTime =
maxMessageTime.isBefore(row.getTimestamp());
-
- if (log.isDebugEnabled()) {
- if (beforeMinimumMessageTime) {
+ if (row == null) {
+ return ThrownAwayReason.NULL;
+ }
+ if (minMessageTime.isAfter(row.getTimestamp())) {
+ if (log.isDebugEnabled()) {
log.debug(
"CurrentTimeStamp[%s] is before MinimumMessageTime[%s]",
row.getTimestamp(),
minMessageTime
);
- } else if (afterMaximumMessageTime) {
+ }
+ return ThrownAwayReason.BEFORE_MIN_MESSAGE_TIME;
+ }
+ if (maxMessageTime.isBefore(row.getTimestamp())) {
+ if (log.isDebugEnabled()) {
log.debug(
"CurrentTimeStamp[%s] is after MaximumMessageTime[%s]",
row.getTimestamp(),
maxMessageTime
);
}
+ return ThrownAwayReason.AFTER_MAX_MESSAGE_TIME;
}
- return !beforeMinimumMessageTime && !afterMaximumMessageTime;
+ return null;
Review Comment:
I wonder if it'd be cleaner to represent "accepted" as an enum value as well,
rather than returning null. Perhaps an enum like: RowFilterResult.Accepted,
RowFilterResult.Null, RowFilterResult.Before_MaxTime, etc.
Callers could then check for the specific enum value and derive the
ThrownAway dimension only for rejected rows. What do you think?
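A rough sketch of what I have in mind, not a drop-in change. Everything here is hypothetical: the `RowFilterResult` enum and its value names, `isRejected()`, `classify`, and `countRejections` do not exist in the PR, and the timestamps are plain epoch millis (with a null timestamp standing in for a null row) so the example stays self-contained:

```java
import java.util.EnumMap;
import java.util.Map;

class RowFilterResultSketch
{
  // Hypothetical enum: ACCEPTED replaces the null return value;
  // the other values mirror the rejection reasons used in this PR.
  enum RowFilterResult
  {
    ACCEPTED,
    NULL_ROW,
    BEFORE_MIN_MESSAGE_TIME,
    AFTER_MAX_MESSAGE_TIME;

    boolean isRejected()
    {
      return this != ACCEPTED;
    }
  }

  // Stand-in for getRowRejectionReason(). A null timestamp plays the role of a null row.
  static RowFilterResult classify(Long rowTimestamp, long minMessageTime, long maxMessageTime)
  {
    if (rowTimestamp == null) {
      return RowFilterResult.NULL_ROW;
    }
    if (rowTimestamp < minMessageTime) {
      return RowFilterResult.BEFORE_MIN_MESSAGE_TIME;
    }
    if (rowTimestamp > maxMessageTime) {
      return RowFilterResult.AFTER_MAX_MESSAGE_TIME;
    }
    return RowFilterResult.ACCEPTED;
  }

  // Counts only the rejected rows, keyed by rejection reason.
  static Map<RowFilterResult, Integer> countRejections(Long[] timestamps, long min, long max)
  {
    Map<RowFilterResult, Integer> counts = new EnumMap<>(RowFilterResult.class);
    for (Long ts : timestamps) {
      RowFilterResult result = classify(ts, min, max);
      if (result.isRejected()) {
        counts.merge(result, 1, Integer::sum);
      }
    }
    return counts;
  }

  public static void main(String[] args)
  {
    Long[] timestamps = {5L, 15L, null, 25L, 30L};
    System.out.println(countRejections(timestamps, 10L, 20L));
  }
}
```

With this shape the return value is never null, so callers would not need a null check before reading the reason, and the accepted case is an explicit value rather than an absence.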
##########
indexing-service/src/main/java/org/apache/druid/indexing/seekablestream/SeekableStreamIndexTaskRunner.java:
##########
@@ -2144,26 +2146,36 @@ private void refreshMinMaxMessageTime()
);
}
- public boolean withinMinMaxRecordTime(final InputRow row)
+ /**
+ * Returns the rejection reason for a row, or null if the row should be
accepted.
+ * This method is used as a {@link RowFilter} for the {@link
StreamChunkParser}.
+ */
+ @Nullable
+ ThrownAwayReason getRowRejectionReason(final InputRow row)
{
- final boolean beforeMinimumMessageTime =
minMessageTime.isAfter(row.getTimestamp());
- final boolean afterMaximumMessageTime =
maxMessageTime.isBefore(row.getTimestamp());
-
- if (log.isDebugEnabled()) {
- if (beforeMinimumMessageTime) {
+ if (row == null) {
+ return ThrownAwayReason.NULL;
+ }
+ if (minMessageTime.isAfter(row.getTimestamp())) {
+ if (log.isDebugEnabled()) {
log.debug(
"CurrentTimeStamp[%s] is before MinimumMessageTime[%s]",
Review Comment:
nit - while at it:
```suggestion
"CurrentTimeStamp[%s] is before minimumMessageTime[%s]",
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]