[PR] Reposition query submission spot for adaptive server selection [pinot]

2024-06-07 Thread via GitHub


vvivekiyer opened a new pull request, #13327:
URL: https://github.com/apache/pinot/pull/13327

   This is a followup to https://github.com/apache/pinot/pull/13104 
   
   If there are cases of netty channel lock exceptions, we will avoid 
publishing stats for some servers as the exception handling breaks out of the 
loop. With this fix, we will submit stats for all picked servers and at the 
same time release for all servers. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] Fix few metric rules which were affected by the database prefix handling [pinot]

2024-06-07 Thread via GitHub


KKcorps merged PR #13290:
URL: https://github.com/apache/pinot/pull/13290


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] [Backfill] allow externally partitioned segment uploads for upsert tables [pinot]

2024-06-07 Thread via GitHub


rohityadav1993 commented on code in PR #13107:
URL: https://github.com/apache/pinot/pull/13107#discussion_r1630674736


##
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/upsert/ConcurrentMapPartitionUpsertMetadataManager.java:
##
@@ -158,6 +158,45 @@ protected void addOrReplaceSegment(ImmutableSegmentImpl 
segment, ThreadSafeMutab
 }
   }
 
+  /**
+   *  When the replacing segment and current segment are of {@link 
LLCSegmentName} then the PK should resolve to
+   * row in segment with higher sequence id.
+   *  When the replacing segment and current segment are of {@link 
UploadedRealtimeSegmentName} then the PK
+   * should resolve to row in segment with higher creation time
+   *  For other cases resolve based on creation time of segment. In case 
the creation time is same, give
+   * preference to an uploaded segment. A segment which is not LLCSegment can 
be assumed to be uploaded segment and
+   * is given preference.
+   *
+   * @param segmentName replacing segment name
+   * @param currentSegmentName current segment name having the record for the 
given primary key
+   * @param segmentCreationTimeMs replacing segment creation time
+   * @param currentSegmentCreationTimeMs current segment creation time
+   * @return true if the record in replacing segment should replace the record 
in current segment
+   */
+  protected boolean shouldReplaceOnComparisonTie(String segmentName, String 
currentSegmentName,
+  long segmentCreationTimeMs, long currentSegmentCreationTimeMs) {
+
+LLCSegmentName llcSegmentName = LLCSegmentName.of(segmentName);
+LLCSegmentName currentLLCSegmentName = 
LLCSegmentName.of(currentSegmentName);
+if (llcSegmentName != null && currentLLCSegmentName != null) {
+  return llcSegmentName.getSequenceNumber() > 
currentLLCSegmentName.getSequenceNumber();
+}
+
+int creationTimeComparisonRes = Long.compare(segmentCreationTimeMs, 
currentSegmentCreationTimeMs);

Review Comment:
   Thanks for simplifying this
   
   > // favor the first writer
   > return false;
   
   This may be an inconsistent behaviour(there can be two writers and may not 
always sync and write) but I don't have a strong inclination. It will be good 
to keep it as you suggested as it does not modify the existing behaviour 
   
   For later, we can always depend on a critera based on segment metadata/name 
to be consistent. e.g. we can check based on lexical order of segment name 
which can enforce to use right naming conventions.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] Add offset based lag metrics [pinot]

2024-06-07 Thread via GitHub


gortiz commented on code in PR #13298:
URL: https://github.com/apache/pinot/pull/13298#discussion_r1630692381


##
pinot-core/src/main/java/org/apache/pinot/core/data/manager/realtime/IngestionDelayTracker.java:
##
@@ -88,6 +90,16 @@ private static class IngestionTimestamps {
 private final long _ingestionTimeMs;
 private final long _firstStreamIngestionTimeMs;
   }
+
+  private static class IngestionOffsets {
+IngestionOffsets(StreamPartitionMsgOffset offset, StreamPartitionMsgOffset 
latestOffset) {
+  _offset = offset;
+  _latestOffset = latestOffset;
+}
+private final StreamPartitionMsgOffset _offset;
+private final StreamPartitionMsgOffset _latestOffset;

Review Comment:
   nit: can we write the attributes before the constructor? Doing it the other 
way around is very strange.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [I] [multistage] Support Window Functions [pinot]

2024-06-06 Thread via GitHub


Jackie-Jiang closed issue #7213: [multistage] Support Window Functions
URL: https://github.com/apache/pinot/issues/7213


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [I] [multistage] Support Window Functions [pinot]

2024-06-06 Thread via GitHub


Jackie-Jiang commented on issue #7213:
URL: https://github.com/apache/pinot/issues/7213#issuecomment-2154130013

   Closing this as the basic window functions are all supported


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] [Backfill] allow externally partitioned segment uploads for upsert tables [pinot]

2024-06-06 Thread via GitHub


rohityadav1993 commented on code in PR #13107:
URL: https://github.com/apache/pinot/pull/13107#discussion_r1630674736


##
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/upsert/ConcurrentMapPartitionUpsertMetadataManager.java:
##
@@ -158,6 +158,45 @@ protected void addOrReplaceSegment(ImmutableSegmentImpl 
segment, ThreadSafeMutab
 }
   }
 
+  /**
+   *  When the replacing segment and current segment are of {@link 
LLCSegmentName} then the PK should resolve to
+   * row in segment with higher sequence id.
+   *  When the replacing segment and current segment are of {@link 
UploadedRealtimeSegmentName} then the PK
+   * should resolve to row in segment with higher creation time
+   *  For other cases resolve based on creation time of segment. In case 
the creation time is same, give
+   * preference to an uploaded segment. A segment which is not LLCSegment can 
be assumed to be uploaded segment and
+   * is given preference.
+   *
+   * @param segmentName replacing segment name
+   * @param currentSegmentName current segment name having the record for the 
given primary key
+   * @param segmentCreationTimeMs replacing segment creation time
+   * @param currentSegmentCreationTimeMs current segment creation time
+   * @return true if the record in replacing segment should replace the record 
in current segment
+   */
+  protected boolean shouldReplaceOnComparisonTie(String segmentName, String 
currentSegmentName,
+  long segmentCreationTimeMs, long currentSegmentCreationTimeMs) {
+
+LLCSegmentName llcSegmentName = LLCSegmentName.of(segmentName);
+LLCSegmentName currentLLCSegmentName = 
LLCSegmentName.of(currentSegmentName);
+if (llcSegmentName != null && currentLLCSegmentName != null) {
+  return llcSegmentName.getSequenceNumber() > 
currentLLCSegmentName.getSequenceNumber();
+}
+
+int creationTimeComparisonRes = Long.compare(segmentCreationTimeMs, 
currentSegmentCreationTimeMs);

Review Comment:
   > // favor the first writer
   > return false;
   
   This may be an inconsistent behaviour(there can be two writers and may not 
always sync and write) but I don't have a strong inclination. It will be good 
to keep it as you suggested as it does not modify the existing behaviour 
   
   For later, we can always depend on a critera based on segment metadata/name 
to be consistent. e.g. we can check based on lexical order of segment name 
which can enforce to use right naming conventions.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] [Backfill] allow externally partitioned segment uploads for upsert tables [pinot]

2024-06-06 Thread via GitHub


rohityadav1993 commented on code in PR #13107:
URL: https://github.com/apache/pinot/pull/13107#discussion_r1630664998


##
pinot-common/src/main/java/org/apache/pinot/common/utils/UploadedRealtimeSegmentName.java:
##
@@ -0,0 +1,180 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.common.utils;
+
+import com.google.common.base.Joiner;
+import com.google.common.base.Preconditions;
+import java.util.Objects;
+import javax.annotation.Nullable;
+import org.apache.commons.lang3.StringUtils;
+import org.joda.time.format.DateTimeFormat;
+import org.joda.time.format.DateTimeFormatter;
+
+
+/**
+ * Class to represent segment names like: 
{prefix}__{tableName}__{partitionId}__{creationTime}__{suffix}
+ *
+ * This naming convention is adopted to represent a segment uploaded to a 
realtime table. The naming
+ * convention has been kept semantically similar to {@link LLCSegmentName} but 
differs in following ways:
+ *
+ *  prefix to quickly identify the type/source of segment e.g. 
"uploaded"/"minion"
+ *  tableName to be same as the table name of segment
+ *  partitionId to identify the right parition for upsert table segment 
table assignment.
+ *  creationTime creation time of segment of the format MMdd'T'HHmm'Z'
+ *  suffix to deduplicate segment names created at the same time
+ *
+ * Use {@link 
org.apache.pinot.segment.spi.creator.name.UploadedRealtimeSegmentNameGenerator} 
to generate segment names.
+ */
+public class UploadedRealtimeSegmentName implements 
Comparable {
+
+  private static final String SEPARATOR = "__";
+  private static final String DATE_FORMAT = "MMdd'T'HHmm'Z'";
+  private static final DateTimeFormatter DATE_FORMATTER = 
DateTimeFormat.forPattern(DATE_FORMAT).withZoneUTC();
+  private final String _prefix;
+  private final String _tableName;
+  private final int _partitionId;
+  private final String _creationTime;
+  private final String _segmentName;
+  private final String _suffix;
+
+  public UploadedRealtimeSegmentName(String segmentName) {
+
+// split the segment name by the separator and get creation time, sequence 
id, partition id and table name from
+// the end and validate segment name starts with prefix uploaded_
+try {
+  String[] parts = StringUtils.splitByWholeSeparator(segmentName, 
SEPARATOR);
+  Preconditions.checkState(parts.length == 5,
+  "Uploaded segment name must be of the format 
{prefix}__{tableName}__{partitionId}__{creationTime}__{suffix}");
+  _prefix = parts[0];
+  _tableName = parts[1];
+  _partitionId = Integer.parseInt(parts[2]);
+  _creationTime = parts[3];
+  _suffix = parts[4];
+  _segmentName = segmentName;
+} catch (NumberFormatException e) {
+  throw new IllegalArgumentException("Invalid segment name: " + 
segmentName, e);
+}
+  }
+
+  /**
+   * Constructor for UploadedRealtimeSegmentName.
+   * @param tableName
+   * @param partitionId
+   * @param msSinceEpoch
+   * @param prefix
+   * @param suffix
+   */
+  public UploadedRealtimeSegmentName(String tableName, int partitionId, long 
msSinceEpoch, String prefix,
+  String suffix) {
+Preconditions.checkArgument(
+StringUtils.isNotBlank(tableName) && StringUtils.isNotBlank(prefix) && 
StringUtils.isNotBlank(suffix),

Review Comment:
   Ack, will also add this to UploadedRealtimeSegmentNameGenerator as a check.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] Add offset based lag metrics [pinot]

2024-06-06 Thread via GitHub


swaminathanmanish commented on code in PR #13298:
URL: https://github.com/apache/pinot/pull/13298#discussion_r1630651998


##
pinot-core/src/main/java/org/apache/pinot/core/data/manager/realtime/IngestionDelayTracker.java:
##
@@ -174,6 +188,22 @@ private long getIngestionDelayMs(long ingestionTimeMs) {
 return agedIngestionDelayMs;
   }
 
+  private long getPartitionOffsetLag(IngestionOffsets offset) {
+if (offset == null) {

Review Comment:
   Curious, can this even be null?



##
pinot-core/src/main/java/org/apache/pinot/core/data/manager/realtime/IngestionDelayTracker.java:
##
@@ -174,6 +188,22 @@ private long getIngestionDelayMs(long ingestionTimeMs) {
 return agedIngestionDelayMs;
   }
 
+  private long getPartitionOffsetLag(IngestionOffsets offset) {
+if (offset == null) {
+  return 0;
+}
+StreamPartitionMsgOffset msgOffset = offset._offset;

Review Comment:
   This is current offset right? 



##
pinot-core/src/main/java/org/apache/pinot/core/data/manager/realtime/IngestionDelayTracker.java:
##
@@ -174,6 +188,22 @@ private long getIngestionDelayMs(long ingestionTimeMs) {
 return agedIngestionDelayMs;
   }
 
+  private long getPartitionOffsetLag(IngestionOffsets offset) {

Review Comment:
   This comment applies to time based reporting as well and can be taken as 
follow up.  
   
   Should this belong to individual consumer types instead of common code just 
like how getCurrentPartitionLagState is interfaced out, because reporting the 
metric for all types here might not make any sense ? 
   
   Like you said, only for Kafka, this is relevant. 



##
pinot-core/src/main/java/org/apache/pinot/core/data/manager/realtime/IngestionDelayTracker.java:
##
@@ -174,6 +188,22 @@ private long getIngestionDelayMs(long ingestionTimeMs) {
 return agedIngestionDelayMs;
   }
 
+  private long getPartitionOffsetLag(IngestionOffsets offset) {
+if (offset == null) {
+  return 0;
+}
+StreamPartitionMsgOffset msgOffset = offset._offset;
+StreamPartitionMsgOffset latestOffset = offset._latestOffset;
+
+// Compute aged delay for current partition
+// TODO: Support other types of offsets
+if (!(msgOffset instanceof LongMsgOffset && latestOffset instanceof 
LongMsgOffset)) {
+  return 0;

Review Comment:
   What are other types that can fall here? Basically we need to be able to 
explain this metric. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] [Minor] Handle 403 response to send the message as HTTP response [pinot]

2024-06-06 Thread via GitHub


eaugene commented on PR #13292:
URL: https://github.com/apache/pinot/pull/13292#issuecomment-2153869284

   Hi @Jackie-Jiang, can you PTAL for this minor fix? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] [Multi-stage] Clean up RelNode to Operator handling [pinot]

2024-06-06 Thread via GitHub


codecov-commenter commented on PR #13325:
URL: https://github.com/apache/pinot/pull/13325#issuecomment-2153772651

   ## 
[Codecov](https://app.codecov.io/gh/apache/pinot/pull/13325?dropdown=coverage=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 Report
   Attention: Patch coverage is `0%` with `1129 lines` in your changes missing 
coverage. Please review.
   > Project coverage is 0.00%. Comparing base 
[(`59551e4`)](https://app.codecov.io/gh/apache/pinot/commit/59551e45224f1535c4863fd577622b37366ccc97?dropdown=coverage=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 to head 
[(`499106b`)](https://app.codecov.io/gh/apache/pinot/commit/499106b770f1180daa86b24819db6d87f64671fe?dropdown=coverage=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache).
   > Report is 563 commits behind head on master.
   
   | 
[Files](https://app.codecov.io/gh/apache/pinot/pull/13325?dropdown=coverage=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 | Patch % | Lines |
   |---|---|---|
   | 
[...inot/query/planner/serde/PlanNodeDeserializer.java](https://app.codecov.io/gh/apache/pinot/pull/13325?src=pr=tree=pinot-query-planner%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fpinot%2Fquery%2Fplanner%2Fserde%2FPlanNodeDeserializer.java_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache#diff-cGlub3QtcXVlcnktcGxhbm5lci9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvcXVlcnkvcGxhbm5lci9zZXJkZS9QbGFuTm9kZURlc2VyaWFsaXplci5qYXZh)
 | 0.00% | [175 Missing :warning: 
](https://app.codecov.io/gh/apache/pinot/pull/13325?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 |
   | 
[.../pinot/query/planner/serde/PlanNodeSerializer.java](https://app.codecov.io/gh/apache/pinot/pull/13325?src=pr=tree=pinot-query-planner%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fpinot%2Fquery%2Fplanner%2Fserde%2FPlanNodeSerializer.java_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache#diff-cGlub3QtcXVlcnktcGxhbm5lci9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvcXVlcnkvcGxhbm5lci9zZXJkZS9QbGFuTm9kZVNlcmlhbGl6ZXIuamF2YQ==)
 | 0.00% | [122 Missing :warning: 
](https://app.codecov.io/gh/apache/pinot/pull/13325?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 |
   | 
[.../query/planner/logical/RelToPlanNodeConverter.java](https://app.codecov.io/gh/apache/pinot/pull/13325?src=pr=tree=pinot-query-planner%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fpinot%2Fquery%2Fplanner%2Flogical%2FRelToPlanNodeConverter.java_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache#diff-cGlub3QtcXVlcnktcGxhbm5lci9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvcXVlcnkvcGxhbm5lci9sb2dpY2FsL1JlbFRvUGxhbk5vZGVDb252ZXJ0ZXIuamF2YQ==)
 | 0.00% | [109 Missing :warning: 
](https://app.codecov.io/gh/apache/pinot/pull/13325?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 |
   | 
[...uery/runtime/operator/WindowAggregateOperator.java](https://app.codecov.io/gh/apache/pinot/pull/13325?src=pr=tree=pinot-query-runtime%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fpinot%2Fquery%2Fruntime%2Foperator%2FWindowAggregateOperator.java_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache#diff-cGlub3QtcXVlcnktcnVudGltZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvcXVlcnkvcnVudGltZS9vcGVyYXRvci9XaW5kb3dBZ2dyZWdhdGVPcGVyYXRvci5qYXZh)
 | 0.00% | [55 Missing :warning: 
](https://app.codecov.io/gh/apache/pinot/pull/13325?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 |
   | 
[...hysical/colocated/GreedyShuffleRewriteVisitor.java](https://app.codecov.io/gh/apache/pinot/pull/13325?src=pr=tree=pinot-query-planner%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fpinot%2Fquery%2Fplanner%2Fphysical%2Fcolocated%2FGreedyShuffleRewriteVisitor.java_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache#diff-cGlub3QtcXVlcnktcGxhbm5lci9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvcXVlcnkvcGxhbm5lci9waHlzaWNhbC9jb2xvY2F0ZWQvR3JlZWR5U2h1ZmZsZVJld3JpdGVWaXNpdG9yLmphdmE=)
 | 0.00% | [35 Missing :warning: 
](https://app.codecov.io/gh/apache/pinot/pull/13325?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 |
   | 
[...not/query/planner/plannode/MailboxReceiveNode.java](https://app.codecov.io/gh/apache/pinot/pull/13325?src=pr=tree=pinot-query-planner%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fpinot%2Fquery%2Fplanner%2Fplannode%2FMailboxReceiveNode.java_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache#diff-cGlub3QtcXVlcnktcGxhbm5lci9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvcXVlcnkvcGxhbm5lci9wbGFubm9kZS9NYWlsYm94UmVjZWl2ZU5vZGUuamF2YQ==)
 | 0.00% | [26 Missing :warning: 

[PR] [DO NOT MERGE] Handle legacy query and log error [pinot]

2024-06-06 Thread via GitHub


Jackie-Jiang opened a new pull request, #13326:
URL: https://github.com/apache/pinot/pull/13326

   This change can be used to bring back the legacy non-sql support removed in 
#11610 and #11763, and log an error:
   - on broker when alias is applied to filter clause
   - on server when illegal cast is applied for CASE statement


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] Support array sum aggregation function [pinot]

2024-06-06 Thread via GitHub


codecov-commenter commented on PR #13324:
URL: https://github.com/apache/pinot/pull/13324#issuecomment-2153621937

   ## 
[Codecov](https://app.codecov.io/gh/apache/pinot/pull/13324?dropdown=coverage=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 Report
   Attention: Patch coverage is `0%` with `274 lines` in your changes missing 
coverage. Please review.
   > Project coverage is 0.00%. Comparing base 
[(`59551e4`)](https://app.codecov.io/gh/apache/pinot/commit/59551e45224f1535c4863fd577622b37366ccc97?dropdown=coverage=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 to head 
[(`ad6fb9f`)](https://app.codecov.io/gh/apache/pinot/commit/ad6fb9f6abc6ba96aa5e66f3bde0a9e0c2ed9cb7?dropdown=coverage=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache).
   > Report is 563 commits behind head on master.
   
   | 
[Files](https://app.codecov.io/gh/apache/pinot/pull/13324?dropdown=coverage=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 | Patch % | Lines |
   |---|---|---|
   | 
[.../core/operator/docvalsets/RowBasedBlockValSet.java](https://app.codecov.io/gh/apache/pinot/pull/13324?src=pr=tree=pinot-core%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fpinot%2Fcore%2Foperator%2Fdocvalsets%2FRowBasedBlockValSet.java_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9vcGVyYXRvci9kb2N2YWxzZXRzL1Jvd0Jhc2VkQmxvY2tWYWxTZXQuamF2YQ==)
 | 0.00% | [154 Missing :warning: 
](https://app.codecov.io/gh/apache/pinot/pull/13324?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 |
   | 
[...ction/array/SumArrayDoubleAggregationFunction.java](https://app.codecov.io/gh/apache/pinot/pull/13324?src=pr=tree=pinot-core%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fpinot%2Fcore%2Fquery%2Faggregation%2Ffunction%2Farray%2FSumArrayDoubleAggregationFunction.java_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9xdWVyeS9hZ2dyZWdhdGlvbi9mdW5jdGlvbi9hcnJheS9TdW1BcnJheURvdWJsZUFnZ3JlZ2F0aW9uRnVuY3Rpb24uamF2YQ==)
 | 0.00% | [52 Missing :warning: 
](https://app.codecov.io/gh/apache/pinot/pull/13324?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 |
   | 
[...unction/array/SumArrayLongAggregationFunction.java](https://app.codecov.io/gh/apache/pinot/pull/13324?src=pr=tree=pinot-core%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fpinot%2Fcore%2Fquery%2Faggregation%2Ffunction%2Farray%2FSumArrayLongAggregationFunction.java_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9xdWVyeS9hZ2dyZWdhdGlvbi9mdW5jdGlvbi9hcnJheS9TdW1BcnJheUxvbmdBZ2dyZWdhdGlvbkZ1bmN0aW9uLmphdmE=)
 | 0.00% | [52 Missing :warning: 
](https://app.codecov.io/gh/apache/pinot/pull/13324?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 |
   | 
[...che/pinot/segment/spi/AggregationFunctionType.java](https://app.codecov.io/gh/apache/pinot/pull/13324?src=pr=tree=pinot-segment-spi%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fpinot%2Fsegment%2Fspi%2FAggregationFunctionType.java_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache#diff-cGlub3Qtc2VnbWVudC1zcGkvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL3Bpbm90L3NlZ21lbnQvc3BpL0FnZ3JlZ2F0aW9uRnVuY3Rpb25UeXBlLmphdmE=)
 | 0.00% | [14 Missing :warning: 
](https://app.codecov.io/gh/apache/pinot/pull/13324?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 |
   | 
[...gregation/function/AggregationFunctionFactory.java](https://app.codecov.io/gh/apache/pinot/pull/13324?src=pr=tree=pinot-core%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fpinot%2Fcore%2Fquery%2Faggregation%2Ffunction%2FAggregationFunctionFactory.java_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9xdWVyeS9hZ2dyZWdhdGlvbi9mdW5jdGlvbi9BZ2dyZWdhdGlvbkZ1bmN0aW9uRmFjdG9yeS5qYXZh)
 | 0.00% | [2 Missing :warning: 
](https://app.codecov.io/gh/apache/pinot/pull/13324?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 |
   
   Additional details and impacted files
   
   
   ```diff
   @@  Coverage Diff  @@
   ## master   #13324   +/-   ##
   =
   - Coverage 61.75%0.00%   -61.76% 
   =
 Files  2436 2471   +35 
 Lines133233   136267 +3034 
 Branches  2063621125  +489 
   =
   - Hits  822740-82274 
   - Misses44911   

[PR] [Multi-stage] Clean up RelNode to Operator handling [pinot]

2024-06-06 Thread via GitHub


Jackie-Jiang opened a new pull request, #13325:
URL: https://github.com/apache/pinot/pull/13325

   Clean up the flow from `RelNode` from planner side to `MultiStageOperator` 
on the executor side:
   - Cleanup unnecessary fields and conversions in `PlanNode`, and keep it 
close to information from Calcite `RelNode`
   - Unify the constructor for `PlanNode`
   - Cleanup `plan.proto` to reflect the changes in `PlanNode`
   - Cleanup `MultiStageOperator` fields and unnecessary conversions
   - Reduce the memory allocation
   - Cleanup all the `MultiStageOperator` tests


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[PR] Support array sum aggregation function [pinot]

2024-06-06 Thread via GitHub


xiangfu0 opened a new pull request, #13324:
URL: https://github.com/apache/pinot/pull/13324

   Support Aggregation function `SumArrayLong` and `SumArrayDouble` for 
aggregate on Arrays to sum on the corresponding array positions.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] [Backfill] allow externally partitioned segment uploads for upsert tables [pinot]

2024-06-06 Thread via GitHub


klsince commented on code in PR #13107:
URL: https://github.com/apache/pinot/pull/13107#discussion_r1630365372


##
pinot-common/src/main/java/org/apache/pinot/common/utils/UploadedRealtimeSegmentName.java:
##
@@ -0,0 +1,180 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.common.utils;
+
+import com.google.common.base.Joiner;
+import com.google.common.base.Preconditions;
+import java.util.Objects;
+import javax.annotation.Nullable;
+import org.apache.commons.lang3.StringUtils;
+import org.joda.time.format.DateTimeFormat;
+import org.joda.time.format.DateTimeFormatter;
+
+
+/**
+ * Class to represent segment names like: 
{prefix}__{tableName}__{partitionId}__{creationTime}__{suffix}
+ *
+ * This naming convention is adopted to represent a segment uploaded to a 
realtime table. The naming
+ * convention has been kept semantically similar to {@link LLCSegmentName} but 
differs in following ways:
+ *
+ *  prefix to quickly identify the type/source of segment e.g. 
"uploaded"/"minion"
+ *  tableName to be same as the table name of segment
+ *  partitionId to identify the right parition for upsert table segment 
table assignment.
+ *  creationTime creation time of segment of the format MMdd'T'HHmm'Z'
+ *  suffix to deduplicate segment names created at the same time
+ *
+ * Use {@link 
org.apache.pinot.segment.spi.creator.name.UploadedRealtimeSegmentNameGenerator} 
to generate segment names.
+ */
+public class UploadedRealtimeSegmentName implements 
Comparable {
+
+  private static final String SEPARATOR = "__";
+  private static final String DATE_FORMAT = "MMdd'T'HHmm'Z'";
+  private static final DateTimeFormatter DATE_FORMATTER = 
DateTimeFormat.forPattern(DATE_FORMAT).withZoneUTC();
+  private final String _prefix;
+  private final String _tableName;
+  private final int _partitionId;
+  private final String _creationTime;
+  private final String _segmentName;
+  private final String _suffix;
+
+  public UploadedRealtimeSegmentName(String segmentName) {
+
+// split the segment name by the separator and get creation time, sequence 
id, partition id and table name from
+// the end and validate segment name starts with prefix uploaded_
+try {
+  String[] parts = StringUtils.splitByWholeSeparator(segmentName, 
SEPARATOR);
+  Preconditions.checkState(parts.length == 5,
+  "Uploaded segment name must be of the format 
{prefix}__{tableName}__{partitionId}__{creationTime}__{suffix}");
+  _prefix = parts[0];
+  _tableName = parts[1];
+  _partitionId = Integer.parseInt(parts[2]);
+  _creationTime = parts[3];
+  _suffix = parts[4];
+  _segmentName = segmentName;
+} catch (NumberFormatException e) {
+  throw new IllegalArgumentException("Invalid segment name: " + 
segmentName, e);
+}
+  }
+
+  /**
+   * Constructor for UploadedRealtimeSegmentName.
+   * @param tableName
+   * @param partitionId
+   * @param msSinceEpoch
+   * @param prefix
+   * @param suffix
+   */
+  public UploadedRealtimeSegmentName(String tableName, int partitionId, long 
msSinceEpoch, String prefix,
+  String suffix) {
+Preconditions.checkArgument(
+StringUtils.isNotBlank(tableName) && StringUtils.isNotBlank(prefix) && 
StringUtils.isNotBlank(suffix),

Review Comment:
   as in LLCSegmentName, we can also check tableName/suffix/prefix doesn't 
contains(SEPARATOR)



##
pinot-common/src/main/java/org/apache/pinot/common/utils/UploadedRealtimeSegmentName.java:
##
@@ -0,0 +1,180 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS 

Re: [PR] Optimize snapshot flow to only snapshot segments which have updates [pinot]

2024-06-06 Thread via GitHub


tibrewalpratik17 commented on code in PR #13285:
URL: https://github.com/apache/pinot/pull/13285#discussion_r1630391893


##
pinot-core/src/main/java/org/apache/pinot/core/data/manager/realtime/RealtimeSegmentDataManager.java:
##
@@ -703,9 +703,11 @@ public void run() {
 //   persisted.
 // Take upsert snapshot before starting consuming events
 if (_partitionUpsertMetadataManager != null) {
-  _partitionUpsertMetadataManager.takeSnapshot();
-  // If upsertTTL is enabled, we will remove expired primary keys from 
upsertMetadata after taking snapshot.
+  // we should remove keys first and then take snapshot, this is 
because deletedKeysTTL flow removes keys from

Review Comment:
   Updated javadocs as well! 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] Optimize snapshot flow to only snapshot segments which have updates [pinot]

2024-06-06 Thread via GitHub


tibrewalpratik17 commented on code in PR #13285:
URL: https://github.com/apache/pinot/pull/13285#discussion_r1630385916


##
pinot-core/src/main/java/org/apache/pinot/core/data/manager/realtime/RealtimeSegmentDataManager.java:
##
@@ -703,9 +703,11 @@ public void run() {
 //   persisted.
 // Take upsert snapshot before starting consuming events
 if (_partitionUpsertMetadataManager != null) {
-  _partitionUpsertMetadataManager.takeSnapshot();
-  // If upsertTTL is enabled, we will remove expired primary keys from 
upsertMetadata after taking snapshot.
+  // we should remove keys first and then take snapshot, this is 
because deletedKeysTTL flow removes keys from

Review Comment:
   Hey @deemoliu i have updated based on your suggestion! I understand this is 
needed to maintain the last state of a key going out of TTL window. Makes 
sense! 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] Optimize snapshot flow to only snapshot segments which have updates [pinot]

2024-06-06 Thread via GitHub


tibrewalpratik17 commented on code in PR #13285:
URL: https://github.com/apache/pinot/pull/13285#discussion_r1630385916


##
pinot-core/src/main/java/org/apache/pinot/core/data/manager/realtime/RealtimeSegmentDataManager.java:
##
@@ -703,9 +703,11 @@ public void run() {
 //   persisted.
 // Take upsert snapshot before starting consuming events
 if (_partitionUpsertMetadataManager != null) {
-  _partitionUpsertMetadataManager.takeSnapshot();
-  // If upsertTTL is enabled, we will remove expired primary keys from 
upsertMetadata after taking snapshot.
+  // we should remove keys first and then take snapshot, this is 
because deletedKeysTTL flow removes keys from

Review Comment:
   Hey @deemoliu i have updated based on your suggestion! Can you also 
elaborate on why is this necessary? I can maybe update javadocs as well. Going 
through `metadataTTL` code i didn't find any strict requirement for taking 
upsertSnapshot first but i might be definitely missing something.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] Optimize snapshot flow to only snapshot segments which have updates [pinot]

2024-06-06 Thread via GitHub


tibrewalpratik17 commented on code in PR #13285:
URL: https://github.com/apache/pinot/pull/13285#discussion_r1630165678


##
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/upsert/BasePartitionUpsertMetadataManager.java:
##
@@ -,6 +1132,9 @@ protected void addDocId(IndexSegment segment, 
ThreadSafeMutableRoaringBitmap val
 try {
   doAddDocId(validDocIds, queryableDocIds, docId, recordInfo);
 } finally {
+  if (_enableSnapshot) {

Review Comment:
   > Actually this is not inside upsertViewLock. I just wanted to keep it in 
the finally block so that we add the updated segment to the set always.
   
   Nvm! I will move this below after we do 
`_upsertViewLock.readLock().unlock();` if block.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] Optimize snapshot flow to only snapshot segments which have updates [pinot]

2024-06-06 Thread via GitHub


tibrewalpratik17 commented on code in PR #13285:
URL: https://github.com/apache/pinot/pull/13285#discussion_r1630122807


##
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/upsert/BasePartitionUpsertMetadataManager.java:
##
@@ -,6 +1132,9 @@ protected void addDocId(IndexSegment segment, 
ThreadSafeMutableRoaringBitmap val
 try {
   doAddDocId(validDocIds, queryableDocIds, docId, recordInfo);
 } finally {
+  if (_enableSnapshot) {

Review Comment:
   > I'd suggest to move this if-block outside the critical section of 
upsertViewLock
   
   Actually this is not inside upsertViewLock. I just wanted to keep it in the 
finally block so that we add the updated segment to the set always.
   
   > btw, it may help to comment that we don't need to take snasphotLock when 
updating this Set (IIUC)
   
   yeah we don't need snapshotLock here at all while updating this Set. Will 
add a comment



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] Optimize snapshot flow to only snapshot segments which have updates [pinot]

2024-06-06 Thread via GitHub


klsince commented on code in PR #13285:
URL: https://github.com/apache/pinot/pull/13285#discussion_r1630153069


##
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/upsert/BasePartitionUpsertMetadataManager.java:
##
@@ -874,19 +875,32 @@ protected void doTakeSnapshot() {
 numConsumingSegments++;
 continue;
   }
-  ImmutableSegmentImpl immutableSegment = (ImmutableSegmentImpl) segment;
-  if (!immutableSegment.hasValidDocIdsSnapshotFile()) {
-segmentsWithoutSnapshot.add(immutableSegment);
+  if (!_updatedSegmentsSinceLastSnapshot.contains(segment)) {

Review Comment:
   I see. That sounds right to me.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] Optimize snapshot flow to only snapshot segments which have updates [pinot]

2024-06-06 Thread via GitHub


tibrewalpratik17 commented on code in PR #13285:
URL: https://github.com/apache/pinot/pull/13285#discussion_r1630119781


##
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/upsert/BasePartitionUpsertMetadataManager.java:
##
@@ -874,19 +875,32 @@ protected void doTakeSnapshot() {
 numConsumingSegments++;
 continue;
   }
-  ImmutableSegmentImpl immutableSegment = (ImmutableSegmentImpl) segment;
-  if (!immutableSegment.hasValidDocIdsSnapshotFile()) {
-segmentsWithoutSnapshot.add(immutableSegment);
+  if (!_updatedSegmentsSinceLastSnapshot.contains(segment)) {

Review Comment:
   TBH i started this implementation with the exact thing in mind and kept a 
set of `snapshottedSegments`. But then I realised during server restarts, 
`addSegment` would put all the segments in `_updatedSegmentsSinceLastSnapshot` 
right? Enabling snapshotting would require server-restarts as well I suppose? 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] Optimize snapshot flow to only snapshot segments which have updates [pinot]

2024-06-06 Thread via GitHub


klsince commented on code in PR #13285:
URL: https://github.com/apache/pinot/pull/13285#discussion_r1630013857


##
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/upsert/BasePartitionUpsertMetadataManager.java:
##
@@ -874,19 +875,32 @@ protected void doTakeSnapshot() {
 numConsumingSegments++;
 continue;
   }
-  ImmutableSegmentImpl immutableSegment = (ImmutableSegmentImpl) segment;
-  if (!immutableSegment.hasValidDocIdsSnapshotFile()) {
-segmentsWithoutSnapshot.add(immutableSegment);
+  if (!_updatedSegmentsSinceLastSnapshot.contains(segment)) {

Review Comment:
   If we enableSnapshot on an existing upsert table and the set is empty, I 
think we would skip those existing segments until they get updated or possibly 
some of them never get updated in long future and they won't have snapshot on 
disk. (This was a bug I hit on while testing the 
`_updatedSegmentsSinceLastRefresh`)
   
   We would need to take snapshot for all tracked segments for the first time 
when to take snapshot for the partition, and then start to skip segments not 
having updates with this Set.



##
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/upsert/BasePartitionUpsertMetadataManager.java:
##
@@ -,6 +1132,9 @@ protected void addDocId(IndexSegment segment, 
ThreadSafeMutableRoaringBitmap val
 try {
   doAddDocId(validDocIds, queryableDocIds, docId, recordInfo);
 } finally {
+  if (_enableSnapshot) {

Review Comment:
   I'd suggest to move this if-block outside the critical section of 
upsertViewLock
   
   btw, it may help to comment that we don't need to take snasphotLock when 
updating this Set (IIUC)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] Optimize snapshot flow to only snapshot segments which have updates [pinot]

2024-06-06 Thread via GitHub


tibrewalpratik17 commented on PR #13285:
URL: https://github.com/apache/pinot/pull/13285#issuecomment-2153109192

   cc @klsince @Jackie-Jiang 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] Update getValidDocIdsMetadataFromServer to make call in batches to servers and other bug fixes [pinot]

2024-06-06 Thread via GitHub


klsince merged PR #13314:
URL: https://github.com/apache/pinot/pull/13314


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [I] Proposal for a new instance-partition based routing strategy [pinot]

2024-06-06 Thread via GitHub


klsince commented on issue #13284:
URL: https://github.com/apache/pinot/issues/13284#issuecomment-2153094150

   This reminded me of an improvement I tried for strictReplicaGroup 
(https://github.com/apache/pinot/pull/11847) but didn't finish. As you 
mentioned in the issue description, it's a bit too rigid to skip the instance 
if any one of segments hosted on it was unavailable, as often we'd have to skip 
all instances, and reporting that a huge number of segments were unavailable 
(which was kinda misleading).
   
   Basically the improvement I was trying to add was to pick an instance, even 
though it has unavailable segments and reported status of the unavailable 
segments back. The key abstraction in that PR was `InstanceGroup`, which caches 
the mapping from `a set of instances` to `a set of segments on them`. With 
replica group assignment, the set of instances should host the same set of 
segments, but some instances might have unavailable segments and some instance 
might be fine. The mapping info is updated whenever IS/EV gets updated. While 
selecting instances, InstanceGroup is used to quickly identify a instance.
   
   The `InstanceGroup` in the PR simply tracks the segments in a `Set`, but we 
can group segments further by their partitions, then we may do server selection 
based on Instance-Partition as proposed here. 
   
   Just some quick thoughts based on the implementation I tried earlier on. I 
didn't get enough time to finish that improvement, but if it makes sense and 
could be reused by this feature, then I can try my best to get it. btw, feel 
free to comment that PR.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] Update metadata.properties to have schema max-length [pinot]

2024-06-06 Thread via GitHub


tibrewalpratik17 commented on code in PR #13187:
URL: https://github.com/apache/pinot/pull/13187#discussion_r1629984437


##
pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/index/metadata/ColumnMetadataImpl.java:
##
@@ -229,11 +229,17 @@ public static ColumnMetadataImpl 
fromPropertiesConfiguration(String column, Prop
 if (defaultNullValueString != null && storedType == DataType.STRING) {
   defaultNullValueString = 
CommonsConfigurationUtils.recoverSpecialCharacterInPropertyValue(defaultNullValueString);
 }
+int maxLength = config.getInt(Column.getKeyFor(column, 
Column.SCHEMA_MAX_LENGTH), FieldSpec.DEFAULT_MAX_LENGTH);
+String maxLengthExceedStrategyString =
+config.getString(Column.getKeyFor(column, 
Column.SCHEMA_MAX_LENGTH_EXCEED_STRATEGY), null);
+FieldSpec.MaxLengthExceedStrategy maxLengthExceedStrategy = 
maxLengthExceedStrategyString != null
+? 
FieldSpec.MaxLengthExceedStrategy.valueOf(maxLengthExceedStrategyString) : null;
 FieldSpec fieldSpec;
 switch (fieldType) {
   case DIMENSION:
 boolean isSingleValue = config.getBoolean(Column.getKeyFor(column, 
Column.IS_SINGLE_VALUED));
-fieldSpec = new DimensionFieldSpec(column, dataType, isSingleValue, 
defaultNullValueString);
+fieldSpec = new DimensionFieldSpec(column, dataType, isSingleValue, 
maxLength,
+defaultNullValueString, maxLengthExceedStrategy);
 break;
   case METRIC:
 fieldSpec = new MetricFieldSpec(column, dataType, 
defaultNullValueString);

Review Comment:
   I see the constructor was missing in MetricFieldSpec so added that as well



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [I] Proposal for a new instance-partition based routing strategy [pinot]

2024-06-06 Thread via GitHub


Jackie-Jiang commented on issue #13284:
URL: https://github.com/apache/pinot/issues/13284#issuecomment-2153029104

   Great feature to add!
   
   We need the following info to route the query:
   IS: Find ONLINE/CONSUMING segments to query
   EV: Actual segment states
   Segment name or ZK metadata: Which partition does the segment belongs to (we 
don't track this currently because we don't want to read ZK metadata so 
frequently, but we may consider caching it, similar to what we do in the 
partition pruning)
   
   With the above info, we should be able to calculate which partition is 
available on which server, then route accordingly.
   
   cc @klsince


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] Short circuit SubPlanFragmenter because we don't support multiple sub-plans yet [pinot]

2024-06-06 Thread via GitHub


Jackie-Jiang commented on code in PR #13306:
URL: https://github.com/apache/pinot/pull/13306#discussion_r1629889683


##
pinot-query-planner/src/main/java/org/apache/pinot/query/planner/logical/PinotLogicalQueryPlanner.java:
##
@@ -51,70 +48,72 @@ private PinotLogicalQueryPlanner() {
* Converts a Calcite {@link RelRoot} into a Pinot {@link SubPlan}.
*/
   public static SubPlan makePlan(RelRoot relRoot) {
-PlanNode rootNode = relNodeToStageNode(relRoot.rel);
-QueryPlanMetadata metadata =
-new 
QueryPlanMetadata(RelToPlanNodeConverter.getTableNamesFromRelRoot(relRoot.rel), 
relRoot.fields);
+PlanNode rootNode = relNodeToPlanNode(relRoot.rel);
+PlanFragment rootFragment = planNodeToPlanFragment(rootNode);
+return new SubPlan(rootFragment,
+new 
SubPlanMetadata(RelToPlanNodeConverter.getTableNamesFromRelRoot(relRoot.rel), 
relRoot.fields), List.of());
 
+// TODO: Currently we don't support multiple sub-plans. Revisit the 
following logic when we add the support.
 // Fragment the stage tree into multiple SubPlans.
-SubPlanFragmenter.Context subPlanContext = new SubPlanFragmenter.Context();
-subPlanContext._subPlanIdToRootNodeMap.put(0, rootNode);
-subPlanContext._subPlanIdToMetadataMap.put(0, new 
SubPlanMetadata(metadata.getTableNames(), metadata.getFields()));
-rootNode.visit(SubPlanFragmenter.INSTANCE, subPlanContext);
-
-Map subPlanMap = new HashMap<>();
-for (Map.Entry subPlanEntry : 
subPlanContext._subPlanIdToRootNodeMap.entrySet()) {
-  int subPlanId = subPlanEntry.getKey();
-  PlanNode subPlanRoot = subPlanEntry.getValue();
-
-  // Fragment the SubPlan into multiple PlanFragments.
-  PlanFragmenter fragmenter = new PlanFragmenter();
-  PlanFragmenter.Context fragmenterContext = fragmenter.createContext();
-  subPlanRoot = subPlanRoot.visit(fragmenter, fragmenterContext);
-  Int2ObjectOpenHashMap planFragmentMap = 
fragmenter.getPlanFragmentMap();
-  Int2ObjectOpenHashMap childPlanFragmentIdsMap = 
fragmenter.getChildPlanFragmentIdsMap();
-
-  // Sub plan root needs to send final results back to the Broker
-  // TODO: Should be SINGLETON (currently SINGLETON has to be local, so 
use BROADCAST_DISTRIBUTED instead)
-  MailboxSendNode subPlanRootSenderNode =
-  new MailboxSendNode(subPlanRoot.getPlanFragmentId(), 
subPlanRoot.getDataSchema(), 0,
-  RelDistribution.Type.BROADCAST_DISTRIBUTED, 
PinotRelExchangeType.getDefaultExchangeType(), null, null,
-  false, false);
-  subPlanRootSenderNode.addInput(subPlanRoot);
-  PlanFragment planFragment1 = new PlanFragment(1, subPlanRootSenderNode, 
new ArrayList<>());
-  planFragmentMap.put(1, planFragment1);
-  for (Int2ObjectMap.Entry entry : 
childPlanFragmentIdsMap.int2ObjectEntrySet()) {
-PlanFragment planFragment = planFragmentMap.get(entry.getIntKey());
-List childPlanFragments = planFragment.getChildren();
-IntListIterator childPlanFragmentIdIterator = 
entry.getValue().iterator();
-while (childPlanFragmentIdIterator.hasNext()) {
-  
childPlanFragments.add(planFragmentMap.get(childPlanFragmentIdIterator.nextInt()));
-}
-  }
-  MailboxReceiveNode rootReceiveNode =
-  new MailboxReceiveNode(0, subPlanRoot.getDataSchema(), 
subPlanRoot.getPlanFragmentId(),
-  RelDistribution.Type.BROADCAST_DISTRIBUTED, 
PinotRelExchangeType.getDefaultExchangeType(), null, null,
-  false, false, subPlanRootSenderNode);
-  PlanFragment rootPlanFragment = new PlanFragment(0, rootReceiveNode, 
Collections.singletonList(planFragment1));
-  SubPlan subPlan = new SubPlan(rootPlanFragment, 
subPlanContext._subPlanIdToMetadataMap.get(0), new ArrayList<>());
-  subPlanMap.put(subPlanId, subPlan);
-}
-for (Map.Entry> subPlanToChildrenEntry : 
subPlanContext._subPlanIdToChildrenMap.entrySet()) {
-  int subPlanId = subPlanToChildrenEntry.getKey();
-  List subPlanChildren = subPlanToChildrenEntry.getValue();
-  for (int subPlanChild : subPlanChildren) {
-
subPlanMap.get(subPlanId).getChildren().add(subPlanMap.get(subPlanChild));
-  }
-}
-return subPlanMap.get(0);
+//SubPlanFragmenter.Context subPlanContext = new 
SubPlanFragmenter.Context();
+//subPlanContext._subPlanIdToRootNodeMap.put(0, rootNode);
+//subPlanContext._subPlanIdToMetadataMap.put(0,
+//new 
SubPlanMetadata(RelToPlanNodeConverter.getTableNamesFromRelRoot(relRoot.rel), 
relRoot.fields));
+//rootNode.visit(SubPlanFragmenter.INSTANCE, subPlanContext);
+//
+//Map subPlanMap = new HashMap<>();
+//for (Map.Entry subPlanEntry : 
subPlanContext._subPlanIdToRootNodeMap.entrySet()) {
+//  SubPlan subPlan =
+//  new SubPlan(planNodeToPlanFragment(subPlanEntry.getValue()), 
subPlanContext._subPlanIdToMetadataMap.get(0),
+//  new ArrayList<>());
+//  

Re: [PR] Update metadata.properties to have schema max-length [pinot]

2024-06-06 Thread via GitHub


Jackie-Jiang commented on code in PR #13187:
URL: https://github.com/apache/pinot/pull/13187#discussion_r1629874567


##
pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/index/metadata/ColumnMetadataImpl.java:
##
@@ -229,11 +229,17 @@ public static ColumnMetadataImpl 
fromPropertiesConfiguration(String column, Prop
 if (defaultNullValueString != null && storedType == DataType.STRING) {
   defaultNullValueString = 
CommonsConfigurationUtils.recoverSpecialCharacterInPropertyValue(defaultNullValueString);
 }
+int maxLength = config.getInt(Column.getKeyFor(column, 
Column.SCHEMA_MAX_LENGTH), FieldSpec.DEFAULT_MAX_LENGTH);
+String maxLengthExceedStrategyString =
+config.getString(Column.getKeyFor(column, 
Column.SCHEMA_MAX_LENGTH_EXCEED_STRATEGY), null);
+FieldSpec.MaxLengthExceedStrategy maxLengthExceedStrategy = 
maxLengthExceedStrategyString != null
+? 
FieldSpec.MaxLengthExceedStrategy.valueOf(maxLengthExceedStrategyString) : null;
 FieldSpec fieldSpec;
 switch (fieldType) {
   case DIMENSION:
 boolean isSingleValue = config.getBoolean(Column.getKeyFor(column, 
Column.IS_SINGLE_VALUED));
-fieldSpec = new DimensionFieldSpec(column, dataType, isSingleValue, 
defaultNullValueString);
+fieldSpec = new DimensionFieldSpec(column, dataType, isSingleValue, 
maxLength,
+defaultNullValueString, maxLengthExceedStrategy);
 break;
   case METRIC:
 fieldSpec = new MetricFieldSpec(column, dataType, 
defaultNullValueString);

Review Comment:
   This also applies to metric columns



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] Update metadata.properties to have schema max-length [pinot]

2024-06-06 Thread via GitHub


tibrewalpratik17 commented on code in PR #13187:
URL: https://github.com/apache/pinot/pull/13187#discussion_r1629870264


##
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/creator/impl/SegmentColumnarIndexCreator.java:
##
@@ -585,6 +585,19 @@ public static void 
addColumnMetadataInfo(PropertiesConfiguration properties, Str
 String.valueOf(columnIndexCreationInfo.getTotalNumberOfEntries()));
 properties.setProperty(getKeyFor(column, IS_AUTO_GENERATED),
 String.valueOf(columnIndexCreationInfo.isAutoGenerated()));
+FieldSpec.MaxLengthExceedStrategy maxLengthExceedStrategy = 
fieldSpec.getMaxLengthExceedStrategy();
+if (dataType.equals(DataType.STRING)) {

Review Comment:
   Sounds good to me! Updated accordingly! 
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] Update metadata.properties to have schema max-length [pinot]

2024-06-06 Thread via GitHub


Jackie-Jiang commented on code in PR #13187:
URL: https://github.com/apache/pinot/pull/13187#discussion_r1629848687


##
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/creator/impl/SegmentColumnarIndexCreator.java:
##
@@ -585,6 +585,19 @@ public static void 
addColumnMetadataInfo(PropertiesConfiguration properties, Str
 String.valueOf(columnIndexCreationInfo.getTotalNumberOfEntries()));
 properties.setProperty(getKeyFor(column, IS_AUTO_GENERATED),
 String.valueOf(columnIndexCreationInfo.isAutoGenerated()));
+FieldSpec.MaxLengthExceedStrategy maxLengthExceedStrategy = 
fieldSpec.getMaxLengthExceedStrategy();
+if (dataType.equals(DataType.STRING)) {

Review Comment:
   The overall goal should be to reconstruct the original field spec



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] Update metadata.properties to have schema max-length [pinot]

2024-06-06 Thread via GitHub


Jackie-Jiang commented on code in PR #13187:
URL: https://github.com/apache/pinot/pull/13187#discussion_r1629847762


##
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/creator/impl/SegmentColumnarIndexCreator.java:
##
@@ -585,6 +585,19 @@ public static void 
addColumnMetadataInfo(PropertiesConfiguration properties, Str
 String.valueOf(columnIndexCreationInfo.getTotalNumberOfEntries()));
 properties.setProperty(getKeyFor(column, IS_AUTO_GENERATED),
 String.valueOf(columnIndexCreationInfo.isAutoGenerated()));
+FieldSpec.MaxLengthExceedStrategy maxLengthExceedStrategy = 
fieldSpec.getMaxLengthExceedStrategy();
+if (dataType.equals(DataType.STRING)) {

Review Comment:
   We want to set it for all var-length types so that the schema can be 
properly re-constructed.
   For `SCHEMA_MAX_LENGTH_EXCEED_STRATEGY`, we want to add it when it is 
configured



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [I] Ingest from Pulsar: Update to Pulsar 3.3 [pinot]

2024-06-06 Thread via GitHub


Jackie-Jiang closed issue #13317: Ingest from Pulsar: Update to Pulsar 3.3
URL: https://github.com/apache/pinot/issues/13317


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] Bump pulsar.version from 3.2.3 to 3.3.0 [pinot]

2024-06-06 Thread via GitHub


Jackie-Jiang merged PR #13322:
URL: https://github.com/apache/pinot/pull/13322


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] Bump org.apache.maven.plugins:maven-checkstyle-plugin from 3.3.1 to 3.4.0 [pinot]

2024-06-06 Thread via GitHub


Jackie-Jiang merged PR #13321:
URL: https://github.com/apache/pinot/pull/13321


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] Bump software.amazon.awssdk:bom from 2.25.66 to 2.25.67 [pinot]

2024-06-06 Thread via GitHub


Jackie-Jiang merged PR #13323:
URL: https://github.com/apache/pinot/pull/13323


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] Vector data type in Pinot [pinot]

2024-06-06 Thread via GitHub


hpvd commented on PR #11262:
URL: https://github.com/apache/pinot/pull/11262#issuecomment-2152860652

   Since this is a really important feature and so much work has already gone 
into it,
   I dare to send a shy ping to the requested reviewers @walterddr 
@Jackie-Jiang 
   Thanks for all you work!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [I] Ingest from Pulsar: Update to Pulsar 3.3 [pinot]

2024-06-06 Thread via GitHub


hpvd commented on issue #13317:
URL: https://github.com/apache/pinot/issues/13317#issuecomment-2152291744

   dependabot already has done its job:
   ```
   All checks have passed
   22 successful checks
   ```
   only one manual approving review required to close this.
   https://github.com/apache/pinot/pull/13322


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] Update metadata.properties to have schema max-length [pinot]

2024-06-06 Thread via GitHub


tibrewalpratik17 commented on code in PR #13187:
URL: https://github.com/apache/pinot/pull/13187#discussion_r1629444612


##
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/creator/impl/SegmentColumnarIndexCreator.java:
##
@@ -585,6 +585,7 @@ public static void 
addColumnMetadataInfo(PropertiesConfiguration properties, Str
 String.valueOf(columnIndexCreationInfo.getTotalNumberOfEntries()));
 properties.setProperty(getKeyFor(column, IS_AUTO_GENERATED),
 String.valueOf(columnIndexCreationInfo.isAutoGenerated()));
+properties.setProperty(getKeyFor(column, SCHEMA_MAX_LENGTH), 
String.valueOf(fieldSpec.getMaxLength()));

Review Comment:
   hey @Jackie-Jiang updated this PR based on #13103 changes. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[PR] Bump software.amazon.awssdk:bom from 2.25.66 to 2.25.67 [pinot]

2024-06-06 Thread via GitHub


dependabot[bot] opened a new pull request, #13323:
URL: https://github.com/apache/pinot/pull/13323

   Bumps software.amazon.awssdk:bom from 2.25.66 to 2.25.67.
   
   
   [![Dependabot compatibility 
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=software.amazon.awssdk:bom=maven=2.25.66=2.25.67)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
   
   Dependabot will resolve any conflicts with this PR as long as you don't 
alter it yourself. You can also trigger a rebase manually by commenting 
`@dependabot rebase`.
   
   [//]: # (dependabot-automerge-start)
   [//]: # (dependabot-automerge-end)
   
   ---
   
   
   Dependabot commands and options
   
   
   You can trigger Dependabot actions by commenting on this PR:
   - `@dependabot rebase` will rebase this PR
   - `@dependabot recreate` will recreate this PR, overwriting any edits that 
have been made to it
   - `@dependabot merge` will merge this PR after your CI passes on it
   - `@dependabot squash and merge` will squash and merge this PR after your CI 
passes on it
   - `@dependabot cancel merge` will cancel a previously requested merge and 
block automerging
   - `@dependabot reopen` will reopen this PR if it is closed
   - `@dependabot close` will close this PR and stop Dependabot recreating it. 
You can achieve the same result by closing it manually
   - `@dependabot show  ignore conditions` will show all of 
the ignore conditions of the specified dependency
   - `@dependabot ignore this major version` will close this PR and stop 
Dependabot creating any more for this major version (unless you reopen the PR 
or upgrade to it yourself)
   - `@dependabot ignore this minor version` will close this PR and stop 
Dependabot creating any more for this minor version (unless you reopen the PR 
or upgrade to it yourself)
   - `@dependabot ignore this dependency` will close this PR and stop 
Dependabot creating any more for this dependency (unless you reopen the PR or 
upgrade to it yourself)
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[PR] Bump pulsar.version from 3.2.3 to 3.3.0 [pinot]

2024-06-06 Thread via GitHub


dependabot[bot] opened a new pull request, #13322:
URL: https://github.com/apache/pinot/pull/13322

   Bumps `pulsar.version` from 3.2.3 to 3.3.0.
   Updates `org.apache.pulsar:pulsar-client` from 3.2.3 to 3.3.0
   
   Release notes
   Sourced from https://github.com/apache/pulsar/releases;>org.apache.pulsar:pulsar-client's
 releases.
   
   v3.3.0
   PIP
   
   PIP-315: Configurable max delay limit for delayed delivery (https://redirect.github.com/apache/pulsar/issues/21798;>#21798)
   PIP-321  Introduce allowed-cluster at the namespace level 
(https://redirect.github.com/apache/pulsar/issues/21648;>#21648)
   PIP-324: Alpine Docker images (https://redirect.github.com/apache/pulsar/issues/21716;>#21716)
   PIP-325: Add command to abort transaction (https://redirect.github.com/apache/pulsar/issues/21731;>#21731)
   PIP-326: Create a BOM to ease dependency management (https://redirect.github.com/apache/pulsar/issues/21747;>#21747)
   PIP-329: Strategy for maintaining the latest tag to Pulsar docker images 
(https://redirect.github.com/apache/pulsar/issues/21872;>#21872)
   PIP-330: getMessagesById gets all messages (https://redirect.github.com/apache/pulsar/issues/21873;>#21873)
   PIP 342: Support OpenTelemetry metrics in Pulsar client (https://redirect.github.com/apache/pulsar/issues/22178;>#22178)
   PIP-343: Use picocli instead of jcommander (https://redirect.github.com/apache/pulsar/issues/22181;>#22181)
   PIP-344 Correct the behavior of the public API 
pulsarClient.getPartitionsForTopic(topicName) (https://redirect.github.com/apache/pulsar/issues/22182;>#22182)
   PIP-335: Oxia metadata plugin (https://redirect.github.com/apache/pulsar/issues/22009;>#22009)
   PIP-339: Introducing the --log-topic Option for Pulsar Sinks and Sources 
(https://redirect.github.com/apache/pulsar/issues/22071;>#22071)
   
   Broker
   
   [admin][broker] Fix force delete subscription not working (https://redirect.github.com/apache/pulsar/issues/22423;>#22423)
   [cleanup][admin] Remove unused methods in PersistentTopicsBase (https://redirect.github.com/apache/pulsar/issues/22424;>#22424)
   [cleanup][broker] Remove unused NamespaceBundleFactory parameter when 
creating OwnershipCache (https://redirect.github.com/apache/pulsar/issues/22482;>#22482)
   [cleanup][broker]∂ fix doc for TransactionBuffer#isTxnAborted (https://redirect.github.com/apache/pulsar/issues/21956;>#21956)
   [cleanup][broker] fix return value of 
TransactionTimeoutTrackerImpl#addTransaction (https://redirect.github.com/apache/pulsar/issues/22020;>#22020)
   [cleanup][broker] remove useless code comment (https://redirect.github.com/apache/pulsar/issues/22459;>#22459)
   [cleanup][meta] Remove com.beust.jcommander.internal import (https://redirect.github.com/apache/pulsar/issues/22294;>#22294)
   [cleanup][ml] ManagedCursor clean up. (https://redirect.github.com/apache/pulsar/issues/22246;>#22246)
   [cleanup][test] remove useless TestAuthorizationProvider2 (https://redirect.github.com/apache/pulsar/issues/22595;>#22595)
   [feat][admin] Enable Gzip Compression by Default in Admin Client (https://redirect.github.com/apache/pulsar/issues/22464;>#22464)
   [feat][admin] PIP-330: getMessagesById gets all messages (https://redirect.github.com/apache/pulsar/issues/21918;>#21918)
   [feat][broker] Implementation of PIP-323: Complete Backlog Quota 
Telemetry (https://redirect.github.com/apache/pulsar/issues/21816;>#21816)
   [feat][broker] PIP-264: Add Java runtime metrics (https://redirect.github.com/apache/pulsar/issues/22616;>#22616)
   [feat][broker] PIP-264: Add topic messaging metrics (https://redirect.github.com/apache/pulsar/issues/22467;>#22467)
   [feat][misc] Add Pulsar BOM (Bill of Materials) (https://redirect.github.com/apache/pulsar/issues/21871;>#21871)
   [feat][misc] PIP-264: Implement topic lookup metrics using OpenTelemetry 
(https://redirect.github.com/apache/pulsar/issues/22058;>#22058)
   [feat][misc] PIP-320: Add OpenTelemetry scaffolding (https://redirect.github.com/apache/pulsar/issues/22010;>#22010)
   [fix] Fix Reader can be stuck from transaction aborted messages. (https://redirect.github.com/apache/pulsar/issues/22610;>#22610)
   [fix] Fixed implicit conversions from long - int (https://redirect.github.com/apache/pulsar/issues/22055;>#22055)
   [fix] Include swagger annotations in shaded client lib (https://redirect.github.com/apache/pulsar/issues/22570;>#22570)
   [fix] Restored method as deprecated in AbstractMetadataStore (https://redirect.github.com/apache/pulsar/issues/21950;>#21950)
   [fix] Test was leaving client instance to null (https://redirect.github.com/apache/pulsar/issues/22631;>#22631)
   [fix][admin] Clearly define REST API on Open API (https://redirect.github.com/apache/pulsar/issues/22783;>#22783)
   [fix][admin] Clearly define REST API on Open API for Namesaces@v2 (https://redirect.github.com/apache/pulsar/issues/22775;>#22775)
   [fix][admin] Clearly define REST API on Open API for Topics 

[PR] Bump org.apache.maven.plugins:maven-checkstyle-plugin from 3.3.1 to 3.4.0 [pinot]

2024-06-06 Thread via GitHub


dependabot[bot] opened a new pull request, #13321:
URL: https://github.com/apache/pinot/pull/13321

   Bumps 
[org.apache.maven.plugins:maven-checkstyle-plugin](https://github.com/apache/maven-checkstyle-plugin)
 from 3.3.1 to 3.4.0.
   
   Commits
   
   https://github.com/apache/maven-checkstyle-plugin/commit/3af735f04cc3d6dd5e785d626043ecbe46a63395;>3af735f
 [maven-release-plugin] prepare release maven-checkstyle-plugin-3.4.0
   https://github.com/apache/maven-checkstyle-plugin/commit/e72bd83039bdee65126e36eebe58a175cbd39fc5;>e72bd83
 [MCHECKSTYLE-448] Upgrade to Parent 42 and Maven 3.6.3
   https://github.com/apache/maven-checkstyle-plugin/commit/61def027fbe275572c496cd0ba7429b47e209258;>61def02
 Add missing dependencies
   https://github.com/apache/maven-checkstyle-plugin/commit/142304a4d93d41a2f95e5f577387ca9de004807a;>142304a
 [MCHECKSTYLE-450] Checkstyle rule link format results in 404
   https://github.com/apache/maven-checkstyle-plugin/commit/1ad603398fb3e8db919dcd6c15338ecce9ad;>1ad6033
 [MCHECKSTYLE-449] Add support for SARIF output format
   https://github.com/apache/maven-checkstyle-plugin/commit/a29a2943756552b5fc2a27ec36085de323d43d66;>a29a294
 Bump maven-gh-actions-shared to v4
   https://github.com/apache/maven-checkstyle-plugin/commit/cf708a3f8ceea2226b506b5c351b31c971afc52b;>cf708a3
 [MCHECKSTYLE-447] Bump org.codehaus.plexus:plexus-resources from 1.1.0 to 
1.3...
   https://github.com/apache/maven-checkstyle-plugin/commit/34fcf3ea87a91bbe1a7511a6f393ca59066a30a9;>34fcf3e
 Bump org.codehaus.mojo:build-helper-maven-plugin from 3.4.0 to 3.5.0 (https://redirect.github.com/apache/maven-checkstyle-plugin/issues/127;>#127)
   https://github.com/apache/maven-checkstyle-plugin/commit/05df96b987b69b6137faccfe57108378752614a9;>05df96b
 Bump org.apache.commons:commons-lang3 from 3.12.0 to 3.14.0 (https://redirect.github.com/apache/maven-checkstyle-plugin/issues/128;>#128)
   https://github.com/apache/maven-checkstyle-plugin/commit/6521c2995ef17ddfacb654c6e5bc1bd8f5b65bd0;>6521c29
 Bump org.codehaus.plexus:plexus-component-metadata from 2.1.1 to 2.2.0 (https://redirect.github.com/apache/maven-checkstyle-plugin/issues/131;>#131)
   Additional commits viewable in https://github.com/apache/maven-checkstyle-plugin/compare/maven-checkstyle-plugin-3.3.1...maven-checkstyle-plugin-3.4.0;>compare
 view
   
   
   
   
   
   [![Dependabot compatibility 
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=org.apache.maven.plugins:maven-checkstyle-plugin=maven=3.3.1=3.4.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
   
   Dependabot will resolve any conflicts with this PR as long as you don't 
alter it yourself. You can also trigger a rebase manually by commenting 
`@dependabot rebase`.
   
   [//]: # (dependabot-automerge-start)
   [//]: # (dependabot-automerge-end)
   
   ---
   
   
   Dependabot commands and options
   
   
   You can trigger Dependabot actions by commenting on this PR:
   - `@dependabot rebase` will rebase this PR
   - `@dependabot recreate` will recreate this PR, overwriting any edits that 
have been made to it
   - `@dependabot merge` will merge this PR after your CI passes on it
   - `@dependabot squash and merge` will squash and merge this PR after your CI 
passes on it
   - `@dependabot cancel merge` will cancel a previously requested merge and 
block automerging
   - `@dependabot reopen` will reopen this PR if it is closed
   - `@dependabot close` will close this PR and stop Dependabot recreating it. 
You can achieve the same result by closing it manually
   - `@dependabot show  ignore conditions` will show all of 
the ignore conditions of the specified dependency
   - `@dependabot ignore this major version` will close this PR and stop 
Dependabot creating any more for this major version (unless you reopen the PR 
or upgrade to it yourself)
   - `@dependabot ignore this minor version` will close this PR and stop 
Dependabot creating any more for this minor version (unless you reopen the PR 
or upgrade to it yourself)
   - `@dependabot ignore this dependency` will close this PR and stop 
Dependabot creating any more for this dependency (unless you reopen the PR or 
upgrade to it yourself)
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] [bugfix] Handling null value for kafka client id suffix [pinot]

2024-06-06 Thread via GitHub


eaugene commented on PR #13279:
URL: https://github.com/apache/pinot/pull/13279#issuecomment-2151603091

   Thanks for catching this @tibrewalpratik17  . 
   _TIL :_ null in string concatenation is converted to "null"  in java . 
[Ref](https://docs.oracle.com/javase/specs/jls/se11/html/jls-5.html#jls-5.1.11) 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] Multi stage metrics [pinot]

2024-06-06 Thread via GitHub


gortiz commented on code in PR #13035:
URL: https://github.com/apache/pinot/pull/13035#discussion_r1628920654


##
pinot-common/src/main/java/org/apache/pinot/common/metrics/ServerTimer.java:
##
@@ -54,7 +54,14 @@ public enum ServerTimer implements AbstractMetrics.Timer {
   UPSERT_REMOVE_EXPIRED_PRIMARY_KEYS_TIME_MS("milliseconds", false,
   "Total time taken to delete expired primary keys based on metadataTTL or 
deletedKeysTTL"),
   GRPC_QUERY_EXECUTION_MS("milliseconds", false, "Total execution time of a 
successful query over gRPC"),
-  UPSERT_SNAPSHOT_TIME_MS("milliseconds", false, "Total time taken to take 
upsert table snapshot");
+  UPSERT_SNAPSHOT_TIME_MS("milliseconds", false, "Total time taken to take 
upsert table snapshot"),
+
+  // Multi-stage
+  HASH_JOIN_CPU_TIME_BUILDING_HASH_TABLE_MS("millis", true),
+  MULTI_STAGE_SERIALIZATION_CPU_TIME_MS("millis", true),
+  MULTI_STAGE_DESERIALIZATION_CPU_TIME_MS("millis", true),
+  RECEIVE_DOWNSTREAM_CPU_TIME_MS("millis", true),

Review Comment:
   Also, added javadoc in all new metrics



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] Multi stage metrics [pinot]

2024-06-06 Thread via GitHub


gortiz commented on code in PR #13035:
URL: https://github.com/apache/pinot/pull/13035#discussion_r1628906266


##
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/plan/MultiStageQueryStats.java:
##
@@ -284,14 +284,18 @@ public void mergeUpstream(List otherStats) {
 myStats.merge(dis);
   }
 } catch (IOException ex) {
-  LOGGER.warn("Error deserializing stats on stage {}. Considering the 
new stats empty", i, ex);
+  LOGGER.warn("Error deserializing stats on stage " + i + ". 
Considering the new stats empty", ex);
 } catch (IllegalArgumentException | IllegalStateException ex) {
-  LOGGER.warn("Error merging stats on stage {}. Ignoring the new 
stats", i, ex);
+  LOGGER.warn("Error merging stats on stage " + i + ". Ignoring the 
new stats", ex);
 }
   }
 }
   }
 
+  public List getClosedStats() {
+return Collections.unmodifiableList(_closedStats);

Review Comment:
   A unmodifiable list is very light (just an indirection) and TBH I prefer to 
be a bit paranoic here so we don't end up modifying it in the future, which 
would be difficult to catch.
   
   This method is only called once per query on the broker. I don't think 
creating a single small object (that can be probably scalar replaced) would be 
problematic.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] Multi stage metrics [pinot]

2024-06-06 Thread via GitHub


gortiz commented on code in PR #13035:
URL: https://github.com/apache/pinot/pull/13035#discussion_r1628900459


##
pinot-common/src/main/java/org/apache/pinot/common/metrics/ServerTimer.java:
##
@@ -54,7 +54,14 @@ public enum ServerTimer implements AbstractMetrics.Timer {
   UPSERT_REMOVE_EXPIRED_PRIMARY_KEYS_TIME_MS("milliseconds", false,
   "Total time taken to delete expired primary keys based on metadataTTL or 
deletedKeysTTL"),
   GRPC_QUERY_EXECUTION_MS("milliseconds", false, "Total execution time of a 
successful query over gRPC"),
-  UPSERT_SNAPSHOT_TIME_MS("milliseconds", false, "Total time taken to take 
upsert table snapshot");
+  UPSERT_SNAPSHOT_TIME_MS("milliseconds", false, "Total time taken to take 
upsert table snapshot"),
+
+  // Multi-stage
+  HASH_JOIN_CPU_TIME_BUILDING_HASH_TABLE_MS("millis", true),
+  MULTI_STAGE_SERIALIZATION_CPU_TIME_MS("millis", true),
+  MULTI_STAGE_DESERIALIZATION_CPU_TIME_MS("millis", true),
+  RECEIVE_DOWNSTREAM_CPU_TIME_MS("millis", true),

Review Comment:
   Renamed as `RECEIVE_DOWNSTREAM_WAIT_CPU_TIME_MS` and 
`RECEIVE_UPSTREAM_WAIT_CPU_TIME_MS`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] Multi stage metrics [pinot]

2024-06-06 Thread via GitHub


gortiz commented on code in PR #13035:
URL: https://github.com/apache/pinot/pull/13035#discussion_r1628890607


##
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/plan/MultiStageQueryStats.java:
##
@@ -256,7 +256,7 @@ public void mergeUpstream(MultiStageQueryStats otherStats) {
   myStats.merge(otherStatsForStage);
 }
   } catch (IllegalArgumentException | IllegalStateException ex) {
-LOGGER.warn("Error merging stats on stage {}. Ignoring the new stats", 
i, ex);
+LOGGER.warn("Error merging stats on stage " + i + ". Ignoring the new 
stats", ex);

Review Comment:
   You are right. I though the older case would not print the exception 
correctly, but it does.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] [Bug fix] Fix URI construction so that AddSchema command line tool works [pinot]

2024-06-06 Thread via GitHub


codecov-commenter commented on PR #13320:
URL: https://github.com/apache/pinot/pull/13320#issuecomment-2151554203

   ## 
[Codecov](https://app.codecov.io/gh/apache/pinot/pull/13320?dropdown=coverage=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 Report
   All modified and coverable lines are covered by tests :white_check_mark:
   > Project coverage is 0.00%. Comparing base 
[(`59551e4`)](https://app.codecov.io/gh/apache/pinot/commit/59551e45224f1535c4863fd577622b37366ccc97?dropdown=coverage=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 to head 
[(`54b1cc2`)](https://app.codecov.io/gh/apache/pinot/commit/54b1cc2c91ad6adf018da81fe4d43eda5a710bab?dropdown=coverage=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache).
   > Report is 559 commits behind head on master.
   
   
   Additional details and impacted files
   
   
   ```diff
   @@  Coverage Diff  @@
   ## master   #13320   +/-   ##
   =
   - Coverage 61.75%0.00%   -61.76% 
   =
 Files  2436 2469   +33 
 Lines133233   135995 +2762 
 Branches  2063621049  +413 
   =
   - Hits  822740-82274 
   - Misses44911   135995+91084 
   + Partials   60480 -6048 
   ```
   
   | 
[Flag](https://app.codecov.io/gh/apache/pinot/pull/13320/flags?src=pr=flags_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 | Coverage Δ | |
   |---|---|---|
   | 
[custom-integration1](https://app.codecov.io/gh/apache/pinot/pull/13320/flags?src=pr=flag_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 | `?` | |
   | 
[integration](https://app.codecov.io/gh/apache/pinot/pull/13320/flags?src=pr=flag_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 | `0.00% <ø> (-0.01%)` | :arrow_down: |
   | 
[integration1](https://app.codecov.io/gh/apache/pinot/pull/13320/flags?src=pr=flag_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 | `?` | |
   | 
[integration2](https://app.codecov.io/gh/apache/pinot/pull/13320/flags?src=pr=flag_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 | `0.00% <ø> (ø)` | |
   | 
[java-11](https://app.codecov.io/gh/apache/pinot/pull/13320/flags?src=pr=flag_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 | `?` | |
   | 
[java-21](https://app.codecov.io/gh/apache/pinot/pull/13320/flags?src=pr=flag_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 | `0.00% <ø> (-61.63%)` | :arrow_down: |
   | 
[skip-bytebuffers-false](https://app.codecov.io/gh/apache/pinot/pull/13320/flags?src=pr=flag_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 | `0.00% <ø> (-61.75%)` | :arrow_down: |
   | 
[skip-bytebuffers-true](https://app.codecov.io/gh/apache/pinot/pull/13320/flags?src=pr=flag_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 | `?` | |
   | 
[temurin](https://app.codecov.io/gh/apache/pinot/pull/13320/flags?src=pr=flag_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 | `0.00% <ø> (-61.76%)` | :arrow_down: |
   | 
[unittests](https://app.codecov.io/gh/apache/pinot/pull/13320/flags?src=pr=flag_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 | `?` | |
   | 
[unittests1](https://app.codecov.io/gh/apache/pinot/pull/13320/flags?src=pr=flag_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 | `?` | |
   | 
[unittests2](https://app.codecov.io/gh/apache/pinot/pull/13320/flags?src=pr=flag_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 | `?` | |
   
   Flags with carried forward coverage won't be shown. [Click 
here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache#carryforward-flags-in-the-pull-request-comment)
 to find out more.
   
   
   
   
   [:umbrella: View full report in Codecov by 
Sentry](https://app.codecov.io/gh/apache/pinot/pull/13320?dropdown=coverage=pr=continue_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache).
   
   :loudspeaker: Have feedback on the report? [Share it 
here](https://about.codecov.io/codecov-pr-comment-feedback/?utm_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For 

[PR] [Bugfix] Fix URI construction so that AddSchema command line tool works [pinot]

2024-06-06 Thread via GitHub


suyashpatel98 opened a new pull request, #13320:
URL: https://github.com/apache/pinot/pull/13320

   Fix URI construction so that AddSchema command line tool works when override 
flag is set to true

Verified that when the override flag is set to true the schema is 
overridden successfully

 $ bin/pinot-admin.sh AddSchema \
 -schemaFile 
/Users/suyash/Desktop/work/open-source/workspace/testData/transcript_schema.json
 \
 -controllerHost localhost \
 -controllerPort 9000 \
 -override=true \
 -exec
   
   2024/06/05 23:22:22.613 INFO [AddSchemaCommand] [main] Executing command: 
AddSchema -controllerProtocol http -controllerHost localhost -controllerPort 
9000 -schemaFile 
/Users/suyash/Desktop/work/open-source/workspace/testData/transcript_schema.json
 -override true _force false -user null -password [hidden] -exec


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] Interning for OnHeapByteDictionary [pinot]

2024-06-05 Thread via GitHub


vvivekiyer merged PR #12342:
URL: https://github.com/apache/pinot/pull/12342


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] Clean Google Dependencies [pinot]

2024-06-05 Thread via GitHub


abhioncbr commented on code in PR #13297:
URL: https://github.com/apache/pinot/pull/13297#discussion_r1628677484


##
pom.xml:
##
@@ -224,12 +224,12 @@
 
 
 26.40.0
-1.23.0
-2.10.1
-33.1.0-jre
-1.44.1
-3.25.2
-1.61.1
+3.25.2 
+1.61.1  

Review Comment:
   Imported `grpc-bom` and `protobuf-pom` into the project. 
   
   In this way, the version during compilation and runtime should be the same. 
However, while updating `google-bom,` we need to make sure that the `grpc` and 
`protobuf ` versions is supported or not.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [I] Realtime servers reporting lag for partitions they don't own after rebalancing w/o including consuming segments [pinot]

2024-06-05 Thread via GitHub


Jackie-Jiang commented on issue #11448:
URL: https://github.com/apache/pinot/issues/11448#issuecomment-2151192409

   1. What is the typical freshness lag during segment commit? If we don't want 
to read IS for every segment commit, we need to configure a timeout longer than 
the common freshness lag so that it doesn't always timeout.
   2. We should look at IS to know if server still owns the segment
   
   After a second thought, I feel I missed an important point. After 
rebalancing without consuming segments, when the current consuming segment 
commits, does it remain on the current server, or is moved to the new server? 
If it moves to the new server, it should trigger the CONSUMING -> OFFLINE state 
transition, which does not need to wait for the 10 minutes timeout.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] Update getValidDocIdsMetadataFromServer to make call in batches to servers and other bug fixes [pinot]

2024-06-05 Thread via GitHub


klsince commented on code in PR #13314:
URL: https://github.com/apache/pinot/pull/13314#discussion_r1628544264


##
pinot-controller/src/main/java/org/apache/pinot/controller/util/CompletionServiceHelper.java:
##
@@ -129,13 +129,21 @@ private CompletionServiceResponse collectResponse(String 
tableNameWithType, int
 int statusCode = 
multiHttpRequestResponse.getResponse().getStatusLine().getStatusCode();
 if (statusCode >= 300) {
   String reason = 
multiHttpRequestResponse.getResponse().getStatusLine().getReasonPhrase();
-  LOGGER.error("Server: {} returned error: {}, reason: {}", instance, 
statusCode, reason);
+  LOGGER.error("Server: {} returned error: {}, reason: {} for uri: 
{}", instance, statusCode, reason, uri);
   completionServiceResponse._failedResponseCount++;
   continue;
 }
 String responseString = 
EntityUtils.toString(multiHttpRequestResponse.getResponse().getEntity());
-completionServiceResponse._httpResponses
-.put(multiRequestPerServer ? uri.toString() : instance, 
responseString);
+String key = multiRequestPerServer ? uri.toString() : instance;
+// If there are multiple requests to the same server with the same URI 
but different payloads,
+// we append a count value to the key to ensure each response is 
uniquely identified.
+// Otherwise, the map will store only the last response, overwriting 
previous ones.
+if (multiRequestPerServer) {
+  int count = 
completionServiceResponse._instanceToRequestCount.getOrDefault(key, 0) + 1;
+  completionServiceResponse._instanceToRequestCount.put(key, count);

Review Comment:
   can do `compute(key, (k, v) -> (v == null) ? 1 : v + 1);` but not a blocking 
comment



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [I] Error while performing schema evolution [pinot]

2024-06-05 Thread via GitHub


suyashpatel98 commented on issue #13312:
URL: https://github.com/apache/pinot/issues/13312#issuecomment-2151061084

   @Jackie-Jiang Please assign this to me. I will raise the PR soon


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [I] Realtime servers reporting lag for partitions they don't own after rebalancing w/o including consuming segments [pinot]

2024-06-05 Thread via GitHub


jadami10 commented on issue #11448:
URL: https://github.com/apache/pinot/issues/11448#issuecomment-2151049470

   I wish there was a smaller ZK node we could look at. But given that we (and 
likely others) have use cases that want to meet SLC thresholds < 30s, then 
that's roughly how quickly we'd want the ingestion lag metric to reflect that a 
partition has been rebalanced away.
   
   1. is that possible? Can we just do this in the rebalance case?
   2. should we be looking at IS or EV in the case the a rebalance fails and 
things are in an inconsistent state for example.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] Clean Google Dependencies [pinot]

2024-06-05 Thread via GitHub


Jackie-Jiang commented on code in PR #13297:
URL: https://github.com/apache/pinot/pull/13297#discussion_r1628489765


##
pom.xml:
##
@@ -1395,7 +1389,7 @@
   
 net.openhft
 posix
-2.26ea0
+2.25ea0

Review Comment:
   Is this change intended?



##
pom.xml:
##
@@ -224,12 +224,12 @@
 
 
 26.40.0
-1.23.0
-2.10.1
-33.1.0-jre
-1.44.1
-3.25.2
-1.61.1
+3.25.2 
+1.61.1  

Review Comment:
   Is there a way to extract these 2 versions from the BOM?



##
pom.xml:
##
@@ -1646,7 +1640,7 @@
 
   org.apache.maven.plugins
   maven-javadoc-plugin
-  3.7.0
+  3.6.3

Review Comment:
   Is this change intended?



##
pom.xml:
##
@@ -224,12 +224,12 @@
 
 
 26.40.0
-1.23.0
-2.10.1
-33.1.0-jre
-1.44.1
-3.25.2
-1.61.1
+3.25.2 
+1.61.1  
+3.0.2
+2.27.1

Review Comment:
   This doesn't match the current version



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [I] Realtime servers reporting lag for partitions they don't own after rebalancing w/o including consuming segments [pinot]

2024-06-05 Thread via GitHub


Jackie-Jiang commented on issue #11448:
URL: https://github.com/apache/pinot/issues/11448#issuecomment-2151037009

   Server doesn't read IS on its own. Controller periodically scan all IS/EV 
and send messages to servers.
   In order to detect that the next segment doesn't come to this server, server 
will need to read the IS for that table. Basically that translates to one IS 
read per segment commit if we do it in the synchronize fashion.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [I] Realtime servers reporting lag for partitions they don't own after rebalancing w/o including consuming segments [pinot]

2024-06-05 Thread via GitHub


jadami10 commented on issue #11448:
URL: https://github.com/apache/pinot/issues/11448#issuecomment-2151032441

   @Jackie-Jiang, I wasn't able to track this down, but do servers consistently 
parse changes between idealstate/external view? Is there a way to do it when 
that happens and have it all a `removePartition` function on 
`IngestionDelayTracker.java`?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [I] Realtime servers reporting lag for partitions they don't own after rebalancing w/o including consuming segments [pinot]

2024-06-05 Thread via GitHub


Jackie-Jiang commented on issue #11448:
URL: https://github.com/apache/pinot/issues/11448#issuecomment-2151027072

   The current solution should be able to eventually remove the lag, but could 
take up to 15 minutes (10 minutes timeout + 5 minutes scheduling delay).
   There are 2 potential enhancements:
   1. Make timeout and scheduling delay configurable (both cluster level and 
table level)
   2. Add a mode to directly pull ideal state so that it doesn't need to do 
async verification
   
   @priyen @jadami10 which solution do you think suits your use case better? Or 
maybe both of them are needed?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] Update getValidDocIdsMetadataFromServer to make call in batches to servers and other bug fixes [pinot]

2024-06-05 Thread via GitHub


klsince commented on code in PR #13314:
URL: https://github.com/apache/pinot/pull/13314#discussion_r1628481078


##
pinot-controller/src/main/java/org/apache/pinot/controller/util/ServerSegmentMetadataReader.java:
##
@@ -235,8 +236,18 @@ public List 
getValidDocIdsMetadataFromServer(String tab
   }
 }
   }
-  
serverURLsAndBodies.add(generateValidDocIdsMetadataURL(tableNameWithType, 
segmentsToQuery, validDocIdsType,
-  serverToEndpoints.get(serverToSegments.getKey(;
+
+  // Number of segments to query per server request. If a table has a lot 
of segments, then we might send a
+  // huge payload to pinot-server in request. Batching the requests will 
help in reducing the payload size.

Review Comment:
   use Lists.partition() for short?



##
pinot-controller/src/main/java/org/apache/pinot/controller/api/resources/PinotTableRestletResource.java:
##
@@ -972,7 +972,10 @@ public String getTableAggregateValidDocIdsMetadata(
   @ApiParam(value = "A list of segments", allowMultiple = true) 
@QueryParam("segmentNames")
   List segmentNames,
   @ApiParam(value = "Valid doc ids type") @QueryParam("validDocIdsType")
-  @DefaultValue("SNAPSHOT") ValidDocIdsType validDocIdsType, @Context 
HttpHeaders headers) {
+  @DefaultValue("SNAPSHOT") ValidDocIdsType validDocIdsType,
+  @ApiParam(value = "Number of segments in a batch per server request")
+  @QueryParam("numSegmentsBatchPerServerRequest") @DefaultValue("500") int 
numSegmentsBatchPerServerRequest,

Review Comment:
   nit: numSegmentsPerServerRequest or serverRequestBatchSize



##
pinot-controller/src/main/java/org/apache/pinot/controller/util/CompletionServiceHelper.java:
##
@@ -129,13 +130,20 @@ private CompletionServiceResponse collectResponse(String 
tableNameWithType, int
 int statusCode = 
multiHttpRequestResponse.getResponse().getStatusLine().getStatusCode();
 if (statusCode >= 300) {
   String reason = 
multiHttpRequestResponse.getResponse().getStatusLine().getReasonPhrase();
-  LOGGER.error("Server: {} returned error: {}, reason: {}", instance, 
statusCode, reason);
+  LOGGER.error("Server: {} returned error: {}, reason: {} for uri: 
{}", instance, statusCode, reason, uri);
   completionServiceResponse._failedResponseCount++;
   continue;
 }
 String responseString = 
EntityUtils.toString(multiHttpRequestResponse.getResponse().getEntity());
-completionServiceResponse._httpResponses
-.put(multiRequestPerServer ? uri.toString() : instance, 
responseString);
+String key = multiRequestPerServer ? uri.toString() : instance;
+// there can be a scenario where all the requests to a particular 
server had the same uri but the
+// payload might be different. In that scenario, we should append a 
random string to the key so that
+// we send all the responses back to the caller otherwise in the map, 
the last response will overwrite
+if (multiRequestPerServer && 
completionServiceResponse._httpResponses.containsKey(key)) {
+  LOGGER.warn("Appending random string to http response key name: {}", 
key);
+  key = key + "__" + RandomStringUtils.randomAlphanumeric(10);

Review Comment:
   use a counter to avoid conflict, to make it easier to tell how many requests 
created for the same uri



##
pinot-server/src/main/java/org/apache/pinot/server/api/resources/TablesResource.java:
##
@@ -657,9 +657,7 @@ private List> 
processValidDocIdsMetadata(String tableNameWit
 }
 try {
   if (!missingSegments.isEmpty()) {
-throw new WebApplicationException(
-String.format("Table %s has missing segments: %s)", 
tableNameWithType, segments),
-Response.Status.NOT_FOUND);
+LOGGER.warn("Table {} has missing segments {}", tableNameWithType, 
missingSegments);

Review Comment:
   can add a comment on why no need to abort



##
pinot-plugins/pinot-minion-tasks/pinot-minion-builtin-tasks/src/main/java/org/apache/pinot/plugin/minion/tasks/upsertcompaction/UpsertCompactionTaskGenerator.java:
##
@@ -58,6 +58,7 @@ public class UpsertCompactionTaskGenerator extends 
BaseTaskGenerator {
   private static final String DEFAULT_BUFFER_PERIOD = "7d";
   private static final double DEFAULT_INVALID_RECORDS_THRESHOLD_PERCENT = 0.0;
   private static final long DEFAULT_INVALID_RECORDS_THRESHOLD_COUNT = 0;
+  private static final int DEFAULT_SEGMENT_BATCH_SIZE_FOR_QUERYING_SERVER = 
500;

Review Comment:
   nit: simply `DEFAULT_`NUM_SEGMENTS_BATCH_PER_SERVER_REQUEST



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:

Re: [PR] Custom configuration property reader for segment metadata files [pinot]

2024-06-05 Thread via GitHub


klsince merged PR #12440:
URL: https://github.com/apache/pinot/pull/12440


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] Fix Logging Location for CPU-Based Query Killing [pinot]

2024-06-05 Thread via GitHub


jasperjiaguo merged PR #13318:
URL: https://github.com/apache/pinot/pull/13318


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] Fix Logging Location for CPU-Based Query Killing [pinot]

2024-06-05 Thread via GitHub


codecov-commenter commented on PR #13318:
URL: https://github.com/apache/pinot/pull/13318#issuecomment-2150896993

   ## 
[Codecov](https://app.codecov.io/gh/apache/pinot/pull/13318?dropdown=coverage=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 Report
   Attention: Patch coverage is `0%` with `3 lines` in your changes missing 
coverage. Please review.
   > Project coverage is 61.99%. Comparing base 
[(`59551e4`)](https://app.codecov.io/gh/apache/pinot/commit/59551e45224f1535c4863fd577622b37366ccc97?dropdown=coverage=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 to head 
[(`6d7323b`)](https://app.codecov.io/gh/apache/pinot/commit/6d7323b47bce84ffa4dfe07881af39ac17dabae9?dropdown=coverage=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache).
   > Report is 556 commits behind head on master.
   
   | 
[Files](https://app.codecov.io/gh/apache/pinot/pull/13318?dropdown=coverage=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 | Patch % | Lines |
   |---|---|---|
   | 
[...re/accounting/PerQueryCPUMemAccountantFactory.java](https://app.codecov.io/gh/apache/pinot/pull/13318?src=pr=tree=pinot-core%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fpinot%2Fcore%2Faccounting%2FPerQueryCPUMemAccountantFactory.java_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9hY2NvdW50aW5nL1BlclF1ZXJ5Q1BVTWVtQWNjb3VudGFudEZhY3RvcnkuamF2YQ==)
 | 0.00% | [3 Missing :warning: 
](https://app.codecov.io/gh/apache/pinot/pull/13318?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 |
   
   Additional details and impacted files
   
   
   ```diff
   @@ Coverage Diff  @@
   ## master   #13318  +/-   ##
   
   + Coverage 61.75%   61.99%   +0.23% 
   + Complexity  207  198   -9 
   
 Files  2436 2540 +104 
 Lines133233   139591+6358 
 Branches  2063621597 +961 
   
   + Hits  8227486533+4259 
   - Misses4491146563+1652 
   - Partials   6048 6495 +447 
   ```
   
   | 
[Flag](https://app.codecov.io/gh/apache/pinot/pull/13318/flags?src=pr=flags_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 | Coverage Δ | |
   |---|---|---|
   | 
[custom-integration1](https://app.codecov.io/gh/apache/pinot/pull/13318/flags?src=pr=flag_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 | `<0.01% <0.00%> (-0.01%)` | :arrow_down: |
   | 
[integration](https://app.codecov.io/gh/apache/pinot/pull/13318/flags?src=pr=flag_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 | `<0.01% <0.00%> (-0.01%)` | :arrow_down: |
   | 
[integration1](https://app.codecov.io/gh/apache/pinot/pull/13318/flags?src=pr=flag_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 | `<0.01% <0.00%> (-0.01%)` | :arrow_down: |
   | 
[integration2](https://app.codecov.io/gh/apache/pinot/pull/13318/flags?src=pr=flag_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 | `0.00% <0.00%> (ø)` | |
   | 
[java-11](https://app.codecov.io/gh/apache/pinot/pull/13318/flags?src=pr=flag_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 | `61.98% <0.00%> (+0.27%)` | :arrow_up: |
   | 
[java-21](https://app.codecov.io/gh/apache/pinot/pull/13318/flags?src=pr=flag_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 | `27.76% <0.00%> (-33.87%)` | :arrow_down: |
   | 
[skip-bytebuffers-false](https://app.codecov.io/gh/apache/pinot/pull/13318/flags?src=pr=flag_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 | `61.98% <0.00%> (+0.24%)` | :arrow_up: |
   | 
[skip-bytebuffers-true](https://app.codecov.io/gh/apache/pinot/pull/13318/flags?src=pr=flag_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 | `27.75% <0.00%> (+0.02%)` | :arrow_up: |
   | 
[temurin](https://app.codecov.io/gh/apache/pinot/pull/13318/flags?src=pr=flag_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 | `61.99% <0.00%> (+0.23%)` | :arrow_up: |
   | 
[unittests](https://app.codecov.io/gh/apache/pinot/pull/13318/flags?src=pr=flag_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 | `61.98% <0.00%> (+0.23%)` | :arrow_up: |
   | 
[unittests1](https://app.codecov.io/gh/apache/pinot/pull/13318/flags?src=pr=flag_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 | `46.50% <0.00%> (-0.39%)` | :arrow_down: |
  

[I] Data ingestion CPU efficiency improvements [pinot]

2024-06-05 Thread via GitHub


lnbest0707-uber opened a new issue, #13319:
URL: https://github.com/apache/pinot/issues/13319

   Pinot data ingestion from Kafka is following the 1 thread per Kafka 
partition mechanism. The scaling up is relying on increasing number of Kafka 
topic partitions. However, due to the nature of ingestion computation load, 
Kafka broker usually has a far higher traffic volume limit per partition than 
Pinot.
   For example, with same type of hardware, Kafka could afford traffic over 
8MB/s/partition but Pinot if doing complex transformation and index building 
(e.g. SchemaConformingTransformer & text index) can only afford <2 
MB/s/partition. This makes the Kafka partition expansion not able to be always 
in sync with Pinot's system load.
   In reality, we are observing that in a Pinot server with tens of cores, only 
20% are busy with ingesting and others relatively idle.
   
   Hence, there's requirement to improve the computation efficiency and do 
parallel (at least part of) single partition message processing.
   
![image](https://github.com/apache/pinot/assets/106711887/a6e6390b-ddfc-48f6-97b7-0959dad88bfc)
   From the attached pic, there are a few components to be improved:
   
   - gzip compression -> to zstd with proper level
   - transformers -> using batch and parallel processing
   - indexing -> TBD
   - Kafka polling -> batch polling


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[PR] fix logging [pinot]

2024-06-05 Thread via GitHub


PraveenVora opened a new pull request, #13318:
URL: https://github.com/apache/pinot/pull/13318

   Instructions:
   1. The PR has to be tagged with at least one of the following labels (*):
  1. `feature`
  2. `bugfix`
  3. `performance`
  4. `ui`
  5. `backward-incompat`
  6. `release-notes` (**)
   2. Remove these instructions before publishing the PR.

   (*) Other labels to consider:
   - `testing`
   - `dependencies`
   - `docker`
   - `kubernetes`
   - `observability`
   - `security`
   - `code-style`
   - `extension-point`
   - `refactor`
   - `cleanup`
   
   (**) Use `release-notes` label for scenarios like:
   - New configuration options
   - Deprecation of configurations
   - Signature changes to public methods/interfaces
   - New plugins added or old plugins removed
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] [Backfill] allow externally partitioned segment uploads for upsert tables [pinot]

2024-06-05 Thread via GitHub


klsince commented on code in PR #13107:
URL: https://github.com/apache/pinot/pull/13107#discussion_r1628157120


##
pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/creator/name/UploadedRealtimeSegmentNameGenerator.java:
##
@@ -0,0 +1,62 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.segment.spi.creator.name;
+
+import com.google.common.base.Joiner;
+import com.google.common.base.Preconditions;
+import javax.annotation.Nullable;
+
+
+/**
+ * Implementation for generating segment names of the format 
UploadedRealtimeSegmentName:
+ * 
uploaded__{tableName}__{partitionId}__{sequenceId}__{creationTime}__{optionalSuffix}
+ *
+ * This naming convention is adopted to represent uploaded segments to a 
realtime table. The semantic is similar
+ * to LLCSegmentName. Scenarios where this naming convention can be preferred 
is:
+ *  Generating segments from a batch workload
+ *  Minion based segment transformations
+ */
+public class UploadedRealtimeSegmentNameGenerator implements 
SegmentNameGenerator {
+
+  private static final String SEGMENT_NAME_PREFIX = "uploaded";
+  private static final String DELIMITER = "__";
+  private final String _tableName;
+  private final int _partitionId;
+  // creation time must be in long and milliseconds since epoch to be 
consistent with creation.meta time for valid
+  // comparison in segment replace flow.
+  private final long _creationTimeMillis;
+  @Nullable
+  private final String _suffix;
+
+  public UploadedRealtimeSegmentNameGenerator(String tableName, int 
partitionId, long creationTimeMillis,
+  String suffix) {
+Preconditions.checkState(creationTimeMillis > 0, "Creation time must be 
positive");

Review Comment:
   bump ^ as this might be missed.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] Bump net.openhft:chronicle-core from 2.25ea15 to 2.26ea0 [pinot]

2024-06-05 Thread via GitHub


Jackie-Jiang commented on PR #13274:
URL: https://github.com/apache/pinot/pull/13274#issuecomment-2150569516

   @dependabot rebase


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] Disabling checkstyle plugin and Spotless plugin for Java version > 21.0 [pinot]

2024-06-05 Thread via GitHub


Jackie-Jiang closed pull request #13252: Disabling checkstyle plugin and 
Spotless plugin for Java version > 21.0
URL: https://github.com/apache/pinot/pull/13252


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] Multi stage metrics [pinot]

2024-06-05 Thread via GitHub


Jackie-Jiang commented on code in PR #13035:
URL: https://github.com/apache/pinot/pull/13035#discussion_r1628117693


##
pinot-common/src/main/java/org/apache/pinot/common/metrics/ServerTimer.java:
##
@@ -54,7 +54,14 @@ public enum ServerTimer implements AbstractMetrics.Timer {
   UPSERT_REMOVE_EXPIRED_PRIMARY_KEYS_TIME_MS("milliseconds", false,
   "Total time taken to delete expired primary keys based on metadataTTL or 
deletedKeysTTL"),
   GRPC_QUERY_EXECUTION_MS("milliseconds", false, "Total execution time of a 
successful query over gRPC"),
-  UPSERT_SNAPSHOT_TIME_MS("milliseconds", false, "Total time taken to take 
upsert table snapshot");
+  UPSERT_SNAPSHOT_TIME_MS("milliseconds", false, "Total time taken to take 
upsert table snapshot"),
+
+  // Multi-stage
+  HASH_JOIN_CPU_TIME_BUILDING_HASH_TABLE_MS("millis", true),

Review Comment:
   (minor) Consider renaming to `HASH_JOIN_BUILD_TABLE_CPU_TIME_MS` for 
consistency



##
pinot-common/src/main/java/org/apache/pinot/common/metrics/ServerMeter.java:
##
@@ -118,7 +118,15 @@ public enum ServerMeter implements AbstractMetrics.Meter {
   LARGE_QUERY_RESPONSES_SENT("largeResponses", false),
   TOTAL_THREAD_CPU_TIME_MILLIS("millis", false),
   LARGE_QUERY_RESPONSE_SIZE_EXCEPTIONS("exceptions", false),
-  STREAM_DATA_LOSS("streamDataLoss", false);
+  STREAM_DATA_LOSS("streamDataLoss", false),
+
+  // Multi-stage
+  HASH_JOIN_TIMES_MAX_ROWS_REACHED("times", true),
+  AGGREGATE_TIMES_NUM_GROUPS_LIMIT_REACHED("times", true),
+  MULTI_STAGE_IN_MEMORY_MESSAGES("messages", true),
+  MULTI_STAGE_RAW_MESSAGES("messages", true),
+  MULTI_STAGE_RAW_BYTES("bytes", true),
+  MAX_ROWS_IN_WINDOW_REACHED("times", true),;

Review Comment:
   (minor) Consider renaming it to `WINDOW_TIMES_MAX_ROWS_REACHED` for 
consistency? Also remove the extra comma



##
pinot-common/src/main/java/org/apache/pinot/common/metrics/ServerTimer.java:
##
@@ -54,7 +54,14 @@ public enum ServerTimer implements AbstractMetrics.Timer {
   UPSERT_REMOVE_EXPIRED_PRIMARY_KEYS_TIME_MS("milliseconds", false,
   "Total time taken to delete expired primary keys based on metadataTTL or 
deletedKeysTTL"),
   GRPC_QUERY_EXECUTION_MS("milliseconds", false, "Total execution time of a 
successful query over gRPC"),
-  UPSERT_SNAPSHOT_TIME_MS("milliseconds", false, "Total time taken to take 
upsert table snapshot");
+  UPSERT_SNAPSHOT_TIME_MS("milliseconds", false, "Total time taken to take 
upsert table snapshot"),
+
+  // Multi-stage
+  HASH_JOIN_CPU_TIME_BUILDING_HASH_TABLE_MS("millis", true),
+  MULTI_STAGE_SERIALIZATION_CPU_TIME_MS("millis", true),
+  MULTI_STAGE_DESERIALIZATION_CPU_TIME_MS("millis", true),
+  RECEIVE_DOWNSTREAM_CPU_TIME_MS("millis", true),
+  RECEIVE_UPSTREAM_CPU_WAIT_MS("millis", true),;

Review Comment:
   (minor) remove the extra comma



##
pinot-common/src/main/java/org/apache/pinot/common/metrics/ServerTimer.java:
##
@@ -54,7 +54,14 @@ public enum ServerTimer implements AbstractMetrics.Timer {
   UPSERT_REMOVE_EXPIRED_PRIMARY_KEYS_TIME_MS("milliseconds", false,
   "Total time taken to delete expired primary keys based on metadataTTL or 
deletedKeysTTL"),
   GRPC_QUERY_EXECUTION_MS("milliseconds", false, "Total execution time of a 
successful query over gRPC"),
-  UPSERT_SNAPSHOT_TIME_MS("milliseconds", false, "Total time taken to take 
upsert table snapshot");
+  UPSERT_SNAPSHOT_TIME_MS("milliseconds", false, "Total time taken to take 
upsert table snapshot"),
+
+  // Multi-stage
+  HASH_JOIN_CPU_TIME_BUILDING_HASH_TABLE_MS("millis", true),
+  MULTI_STAGE_SERIALIZATION_CPU_TIME_MS("millis", true),
+  MULTI_STAGE_DESERIALIZATION_CPU_TIME_MS("millis", true),
+  RECEIVE_DOWNSTREAM_CPU_TIME_MS("millis", true),

Review Comment:
   Receive downstream is confusing. IIUC these 2 are the wait time for mailbox 
receive node. Some javadoc can help understand this, or consider renaming them 
to `MAILBOX_RECEIVE_DOWNSTREAM_WAIT_CPU_TIME_MS` and 
`MAILBOX_RECEIVE_UPSTREAM_WAIT_CPU_TIME_MS`



##
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/plan/MultiStageQueryStats.java:
##
@@ -256,7 +256,7 @@ public void mergeUpstream(MultiStageQueryStats otherStats) {
   myStats.merge(otherStatsForStage);
 }
   } catch (IllegalArgumentException | IllegalStateException ex) {
-LOGGER.warn("Error merging stats on stage {}. Ignoring the new stats", 
i, ex);
+LOGGER.warn("Error merging stats on stage " + i + ". Ignoring the new 
stats", ex);

Review Comment:
   This seems anti-pattern. Do you want to avoid regexp match?



##
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/plan/MultiStageQueryStats.java:
##
@@ -284,14 +284,18 @@ public void mergeUpstream(List otherStats) {
 myStats.merge(dis);
   }
 } catch (IOException ex) {
-  LOGGER.warn("Error deserializing stats on stage {}. Considering the 
new 

Re: [PR] Clean Google Dependencies [pinot]

2024-06-05 Thread via GitHub


abhioncbr closed pull request #13297: Clean Google Dependencies
URL: https://github.com/apache/pinot/pull/13297


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [I] Error while performing schema evolution [pinot]

2024-06-05 Thread via GitHub


aishwaryathondapu commented on issue #13312:
URL: https://github.com/apache/pinot/issues/13312#issuecomment-2150541038

   yes. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] Bump software.amazon.awssdk:bom from 2.25.64 to 2.25.66 [pinot]

2024-06-05 Thread via GitHub


Jackie-Jiang merged PR #13316:
URL: https://github.com/apache/pinot/pull/13316


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] Fix few metric rules which were affected by the database prefix handling [pinot]

2024-06-05 Thread via GitHub


shounakmk219 commented on PR #13290:
URL: https://github.com/apache/pinot/pull/13290#issuecomment-2150188180

   ## Controller metrics validations:
   
   1.
   https://github.com/apache/pinot/assets/25409127/cb1a6438-8797-4de0-99f1-cff12ec7bd51;>
   
   2.
   https://github.com/apache/pinot/assets/25409127/9f2570be-1ecb-487f-bf2b-943fd3ffcb01;>
   
   ## Broker metrics validations
   
   1.
   https://github.com/apache/pinot/assets/25409127/85fbb98d-e27e-4341-bd30-0d7a3d72facf;>
   
   2.
   https://github.com/apache/pinot/assets/25409127/8f23d296-54e5-43d6-8da2-a987d8d476bb;>
   
   ## Server metrics validation
   
   1.
   https://github.com/apache/pinot/assets/25409127/15aa2187-aedb-414b-8537-387ef5b32d87;>
   
   2.
   https://github.com/apache/pinot/assets/25409127/9abad935-6f1e-42ac-859b-9be7342138ef;>
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] Fix few metric rules which were affected by the database prefix handling [pinot]

2024-06-05 Thread via GitHub


shounakmk219 commented on code in PR #13290:
URL: https://github.com/apache/pinot/pull/13290#discussion_r1627830416


##
docker/images/pinot/etc/jmx_prometheus_javaagent/configs/server.yml:
##
@@ -73,6 +66,14 @@ rules:
 table: "$1$3"
 tableType: "$4"
 partition: "$5"
+- pattern: 
"\"org\\.apache\\.pinot\\.common\\.metrics\"<>(\\w+)"

Review Comment:
   Yeah. `realtimeIngestionDelayMs` is handled by moving the metric rule 
earlier present on line 16 to line 175.
   Will add screenshots on validation as well



##
docker/images/pinot/etc/jmx_prometheus_javaagent/configs/server.yml:
##
@@ -171,6 +172,14 @@ rules:
   name: "pinot_server_grpc$1_$2"
   cache: true
 
+- pattern: 
"\"org\\.apache\\.pinot\\.common\\.metrics\"<>(\\w+)"
+  name: "pinot_server_$5_$6"
+  cache: true
+  labels:
+database: "$2"
+table: "$1$3"

Review Comment:
   replied above



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] Fix few metric rules which were affected by the database prefix handling [pinot]

2024-06-05 Thread via GitHub


shounakmk219 commented on code in PR #13290:
URL: https://github.com/apache/pinot/pull/13290#discussion_r1627823043


##
docker/images/pinot/etc/jmx_prometheus_javaagent/configs/broker.yml:
##
@@ -112,6 +118,12 @@ rules:
 - pattern: 
"\"org\\.apache\\.pinot\\.common\\.metrics\"<>(\\w+)"
   name: "pinot_broker_routingTableUpdateTime_$1"
   cache: true
+- pattern: 
"\"org\\.apache\\.pinot\\.common\\.metrics\"<>(\\w+)"
+  name: "pinot_broker_adaptiveServerSelectorType_$1"

Review Comment:
   Its does not have any extracted info to add as a label. More of a static 
config info metric



##
docker/images/pinot/etc/jmx_prometheus_javaagent/configs/broker.yml:
##
@@ -112,6 +118,12 @@ rules:
 - pattern: 
"\"org\\.apache\\.pinot\\.common\\.metrics\"<>(\\w+)"
   name: "pinot_broker_routingTableUpdateTime_$1"
   cache: true
+- pattern: 
"\"org\\.apache\\.pinot\\.common\\.metrics\"<>(\\w+)"
+  name: "pinot_broker_adaptiveServerSelectorType_$1"
+  cache: true
+- pattern: 
"\"org\\.apache\\.pinot\\.common\\.metrics\"<>(\\w+)"
+  name: "pinot_broker_adaptiveServerSelectorType_$1_$2"

Review Comment:
   replied above



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] Fix few metric rules which were affected by the database prefix handling [pinot]

2024-06-05 Thread via GitHub


shounakmk219 commented on code in PR #13290:
URL: https://github.com/apache/pinot/pull/13290#discussion_r1627809416


##
docker/images/pinot/etc/jmx_prometheus_javaagent/configs/broker.yml:
##
@@ -59,6 +59,12 @@ rules:
   labels:
 database: "$2"
 table: "$1$3"
+- pattern: 
"\"org\\.apache\\.pinot\\.common\\.metrics\"<>(\\w+)"
+  name: "pinot_broker_requestSize_$4"
+  cache: true
+  labels:
+database: "$2"
+table: "$1$3"

Review Comment:
   $1 matches the whole `databaseName.` part hence no need to delimit 
explicitly.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] Bump software.amazon.awssdk:bom from 2.25.64 to 2.25.65 [pinot]

2024-06-05 Thread via GitHub


dependabot[bot] closed pull request #13311: Bump software.amazon.awssdk:bom 
from 2.25.64 to 2.25.65
URL: https://github.com/apache/pinot/pull/13311


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] Bump software.amazon.awssdk:bom from 2.25.64 to 2.25.65 [pinot]

2024-06-05 Thread via GitHub


dependabot[bot] commented on PR #13311:
URL: https://github.com/apache/pinot/pull/13311#issuecomment-2149608954

   Superseded by #13316.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[PR] Bump software.amazon.awssdk:bom from 2.25.64 to 2.25.66 [pinot]

2024-06-05 Thread via GitHub


dependabot[bot] opened a new pull request, #13316:
URL: https://github.com/apache/pinot/pull/13316

   Bumps software.amazon.awssdk:bom from 2.25.64 to 2.25.66.
   
   
   [![Dependabot compatibility 
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=software.amazon.awssdk:bom=maven=2.25.64=2.25.66)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
   
   Dependabot will resolve any conflicts with this PR as long as you don't 
alter it yourself. You can also trigger a rebase manually by commenting 
`@dependabot rebase`.
   
   [//]: # (dependabot-automerge-start)
   [//]: # (dependabot-automerge-end)
   
   ---
   
   
   Dependabot commands and options
   
   
   You can trigger Dependabot actions by commenting on this PR:
   - `@dependabot rebase` will rebase this PR
   - `@dependabot recreate` will recreate this PR, overwriting any edits that 
have been made to it
   - `@dependabot merge` will merge this PR after your CI passes on it
   - `@dependabot squash and merge` will squash and merge this PR after your CI 
passes on it
   - `@dependabot cancel merge` will cancel a previously requested merge and 
block automerging
   - `@dependabot reopen` will reopen this PR if it is closed
   - `@dependabot close` will close this PR and stop Dependabot recreating it. 
You can achieve the same result by closing it manually
   - `@dependabot show  ignore conditions` will show all of 
the ignore conditions of the specified dependency
   - `@dependabot ignore this major version` will close this PR and stop 
Dependabot creating any more for this major version (unless you reopen the PR 
or upgrade to it yourself)
   - `@dependabot ignore this minor version` will close this PR and stop 
Dependabot creating any more for this minor version (unless you reopen the PR 
or upgrade to it yourself)
   - `@dependabot ignore this dependency` will close this PR and stop 
Dependabot creating any more for this dependency (unless you reopen the PR or 
upgrade to it yourself)
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] [partial-upsert] configure early release of _partitionGroupConsumerSemaphore in RealtimeSegmentDataManager [pinot]

2024-06-05 Thread via GitHub


rohityadav1993 commented on code in PR #13256:
URL: https://github.com/apache/pinot/pull/13256#discussion_r1627386271


##
pinot-spi/src/main/java/org/apache/pinot/spi/config/table/UpsertConfig.java:
##
@@ -94,6 +94,9 @@ public enum ConsistencyMode {
   @JsonPropertyDescription("Whether to drop out-of-order record")
   private boolean _dropOutOfOrderRecord;
 
+  @JsonPropertyDescription("Whether to pause partial upsert table's partition 
ingestion during commit")
+  private boolean _pausePartialUpsertPartitionIngestionDuringCommit;

Review Comment:
   Making default as enabled sounds good. This is a behaviour change, I also 
think it makes sense to make this a cluster level property as well so it can be 
toggled at bulk.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] Fix the NPE from IS update metrics [pinot]

2024-06-05 Thread via GitHub


gortiz commented on PR #13313:
URL: https://github.com/apache/pinot/pull/13313#issuecomment-2149310502

   I've created a NOOP registry in https://github.com/apache/pinot/pull/13032, 
so this kind of issues shouldn't be problematic in the future.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] Add metrics to count joins and window functions [pinot]

2024-06-05 Thread via GitHub


gortiz merged PR #13032:
URL: https://github.com/apache/pinot/pull/13032


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] Multi stage metrics [pinot]

2024-06-05 Thread via GitHub


codecov-commenter commented on PR #13035:
URL: https://github.com/apache/pinot/pull/13035#issuecomment-2149272592

   ## 
[Codecov](https://app.codecov.io/gh/apache/pinot/pull/13035?dropdown=coverage=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 Report
   Attention: Patch coverage is `0%` with `75 lines` in your changes missing 
coverage. Please review.
   > Project coverage is 0.00%. Comparing base 
[(`59551e4`)](https://app.codecov.io/gh/apache/pinot/commit/59551e45224f1535c4863fd577622b37366ccc97?dropdown=coverage=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 to head 
[(`14d0b58`)](https://app.codecov.io/gh/apache/pinot/commit/14d0b58237f6e7d2c066cecdfb5dbd3efa61be9a?dropdown=coverage=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache).
   > Report is 554 commits behind head on master.
   
   | 
[Files](https://app.codecov.io/gh/apache/pinot/pull/13035?dropdown=coverage=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 | Patch % | Lines |
   |---|---|---|
   | 
[...not/query/runtime/operator/MultiStageOperator.java](https://app.codecov.io/gh/apache/pinot/pull/13035?src=pr=tree=pinot-query-runtime%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fpinot%2Fquery%2Fruntime%2Foperator%2FMultiStageOperator.java_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache#diff-cGlub3QtcXVlcnktcnVudGltZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvcXVlcnkvcnVudGltZS9vcGVyYXRvci9NdWx0aVN0YWdlT3BlcmF0b3IuamF2YQ==)
 | 0.00% | [38 Missing :warning: 
](https://app.codecov.io/gh/apache/pinot/pull/13035?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 |
   | 
[...ot/query/runtime/operator/MailboxSendOperator.java](https://app.codecov.io/gh/apache/pinot/pull/13035?src=pr=tree=pinot-query-runtime%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fpinot%2Fquery%2Fruntime%2Foperator%2FMailboxSendOperator.java_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache#diff-cGlub3QtcXVlcnktcnVudGltZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvcXVlcnkvcnVudGltZS9vcGVyYXRvci9NYWlsYm94U2VuZE9wZXJhdG9yLmphdmE=)
 | 0.00% | [15 Missing :warning: 
](https://app.codecov.io/gh/apache/pinot/pull/13035?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 |
   | 
[...pinot/query/runtime/plan/MultiStageQueryStats.java](https://app.codecov.io/gh/apache/pinot/pull/13035?src=pr=tree=pinot-query-runtime%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fpinot%2Fquery%2Fruntime%2Fplan%2FMultiStageQueryStats.java_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache#diff-cGlub3QtcXVlcnktcnVudGltZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvcXVlcnkvcnVudGltZS9wbGFuL011bHRpU3RhZ2VRdWVyeVN0YXRzLmphdmE=)
 | 0.00% | [9 Missing :warning: 
](https://app.codecov.io/gh/apache/pinot/pull/13035?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 |
   | 
[...a/org/apache/pinot/common/metrics/ServerMeter.java](https://app.codecov.io/gh/apache/pinot/pull/13035?src=pr=tree=pinot-common%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fpinot%2Fcommon%2Fmetrics%2FServerMeter.java_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vbWV0cmljcy9TZXJ2ZXJNZXRlci5qYXZh)
 | 0.00% | [7 Missing :warning: 
](https://app.codecov.io/gh/apache/pinot/pull/13035?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 |
   | 
[...a/org/apache/pinot/common/metrics/ServerTimer.java](https://app.codecov.io/gh/apache/pinot/pull/13035?src=pr=tree=pinot-common%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fpinot%2Fcommon%2Fmetrics%2FServerTimer.java_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vbWV0cmljcy9TZXJ2ZXJUaW1lci5qYXZh)
 | 0.00% | [6 Missing :warning: 
](https://app.codecov.io/gh/apache/pinot/pull/13035?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 |
   
   Additional details and impacted files
   
   
   ```diff
   @@  Coverage Diff  @@
   ## master   #13035   +/-   ##
   =
   - Coverage 61.75%0.00%   -61.76% 
   =
 Files  2436 2463   +27 
 Lines133233   135899 +2666 
 Branches  2063621043  +407 
   =
   - Hits  822740-82274 
   - Misses44911   135899+90988 
   + Partials   60480 -6048 
   ```
   
   | 

Re: [PR] Short circuit SubPlanFragmenter because we don't support multiple sub-plans yet [pinot]

2024-06-05 Thread via GitHub


gortiz commented on code in PR #13306:
URL: https://github.com/apache/pinot/pull/13306#discussion_r1627085061


##
pinot-query-planner/src/main/java/org/apache/pinot/query/planner/logical/PinotLogicalQueryPlanner.java:
##
@@ -51,70 +48,72 @@ private PinotLogicalQueryPlanner() {
* Converts a Calcite {@link RelRoot} into a Pinot {@link SubPlan}.
*/
   public static SubPlan makePlan(RelRoot relRoot) {
-PlanNode rootNode = relNodeToStageNode(relRoot.rel);
-QueryPlanMetadata metadata =
-new 
QueryPlanMetadata(RelToPlanNodeConverter.getTableNamesFromRelRoot(relRoot.rel), 
relRoot.fields);
+PlanNode rootNode = relNodeToPlanNode(relRoot.rel);
+PlanFragment rootFragment = planNodeToPlanFragment(rootNode);
+return new SubPlan(rootFragment,
+new 
SubPlanMetadata(RelToPlanNodeConverter.getTableNamesFromRelRoot(relRoot.rel), 
relRoot.fields), List.of());
 
+// TODO: Currently we don't support multiple sub-plans. Revisit the 
following logic when we add the support.
 // Fragment the stage tree into multiple SubPlans.
-SubPlanFragmenter.Context subPlanContext = new SubPlanFragmenter.Context();
-subPlanContext._subPlanIdToRootNodeMap.put(0, rootNode);
-subPlanContext._subPlanIdToMetadataMap.put(0, new 
SubPlanMetadata(metadata.getTableNames(), metadata.getFields()));
-rootNode.visit(SubPlanFragmenter.INSTANCE, subPlanContext);
-
-Map subPlanMap = new HashMap<>();
-for (Map.Entry subPlanEntry : 
subPlanContext._subPlanIdToRootNodeMap.entrySet()) {
-  int subPlanId = subPlanEntry.getKey();
-  PlanNode subPlanRoot = subPlanEntry.getValue();
-
-  // Fragment the SubPlan into multiple PlanFragments.
-  PlanFragmenter fragmenter = new PlanFragmenter();
-  PlanFragmenter.Context fragmenterContext = fragmenter.createContext();
-  subPlanRoot = subPlanRoot.visit(fragmenter, fragmenterContext);
-  Int2ObjectOpenHashMap planFragmentMap = 
fragmenter.getPlanFragmentMap();
-  Int2ObjectOpenHashMap childPlanFragmentIdsMap = 
fragmenter.getChildPlanFragmentIdsMap();
-
-  // Sub plan root needs to send final results back to the Broker
-  // TODO: Should be SINGLETON (currently SINGLETON has to be local, so 
use BROADCAST_DISTRIBUTED instead)
-  MailboxSendNode subPlanRootSenderNode =
-  new MailboxSendNode(subPlanRoot.getPlanFragmentId(), 
subPlanRoot.getDataSchema(), 0,
-  RelDistribution.Type.BROADCAST_DISTRIBUTED, 
PinotRelExchangeType.getDefaultExchangeType(), null, null,
-  false, false);
-  subPlanRootSenderNode.addInput(subPlanRoot);
-  PlanFragment planFragment1 = new PlanFragment(1, subPlanRootSenderNode, 
new ArrayList<>());
-  planFragmentMap.put(1, planFragment1);
-  for (Int2ObjectMap.Entry entry : 
childPlanFragmentIdsMap.int2ObjectEntrySet()) {
-PlanFragment planFragment = planFragmentMap.get(entry.getIntKey());
-List childPlanFragments = planFragment.getChildren();
-IntListIterator childPlanFragmentIdIterator = 
entry.getValue().iterator();
-while (childPlanFragmentIdIterator.hasNext()) {
-  
childPlanFragments.add(planFragmentMap.get(childPlanFragmentIdIterator.nextInt()));
-}
-  }
-  MailboxReceiveNode rootReceiveNode =
-  new MailboxReceiveNode(0, subPlanRoot.getDataSchema(), 
subPlanRoot.getPlanFragmentId(),
-  RelDistribution.Type.BROADCAST_DISTRIBUTED, 
PinotRelExchangeType.getDefaultExchangeType(), null, null,
-  false, false, subPlanRootSenderNode);
-  PlanFragment rootPlanFragment = new PlanFragment(0, rootReceiveNode, 
Collections.singletonList(planFragment1));
-  SubPlan subPlan = new SubPlan(rootPlanFragment, 
subPlanContext._subPlanIdToMetadataMap.get(0), new ArrayList<>());
-  subPlanMap.put(subPlanId, subPlan);
-}
-for (Map.Entry> subPlanToChildrenEntry : 
subPlanContext._subPlanIdToChildrenMap.entrySet()) {
-  int subPlanId = subPlanToChildrenEntry.getKey();
-  List subPlanChildren = subPlanToChildrenEntry.getValue();
-  for (int subPlanChild : subPlanChildren) {
-
subPlanMap.get(subPlanId).getChildren().add(subPlanMap.get(subPlanChild));
-  }
-}
-return subPlanMap.get(0);
+//SubPlanFragmenter.Context subPlanContext = new 
SubPlanFragmenter.Context();
+//subPlanContext._subPlanIdToRootNodeMap.put(0, rootNode);
+//subPlanContext._subPlanIdToMetadataMap.put(0,
+//new 
SubPlanMetadata(RelToPlanNodeConverter.getTableNamesFromRelRoot(relRoot.rel), 
relRoot.fields));
+//rootNode.visit(SubPlanFragmenter.INSTANCE, subPlanContext);
+//
+//Map subPlanMap = new HashMap<>();
+//for (Map.Entry subPlanEntry : 
subPlanContext._subPlanIdToRootNodeMap.entrySet()) {
+//  SubPlan subPlan =
+//  new SubPlan(planNodeToPlanFragment(subPlanEntry.getValue()), 
subPlanContext._subPlanIdToMetadataMap.get(0),
+//  new ArrayList<>());
+//  

Re: [PR] Update getValidDocIdsMetadataFromServer to make call in batches to servers [pinot]

2024-06-04 Thread via GitHub


tibrewalpratik17 commented on code in PR #13314:
URL: https://github.com/apache/pinot/pull/13314#discussion_r1626674964


##
pinot-controller/src/main/java/org/apache/pinot/controller/util/ServerSegmentMetadataReader.java:
##
@@ -61,6 +61,7 @@
  */
 public class ServerSegmentMetadataReader {
   private static final Logger LOGGER = 
LoggerFactory.getLogger(ServerSegmentMetadataReader.class);
+  private static final int DEFAULT_SEGMENT_BATCH_SIZE_FOR_QUERYING_SERVER = 
500;

Review Comment:
   Discussed offline with @klsince , we want to expose this as a configurable 
param in upsert-compaction-task and in the controller GET call API as well. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] Add table level metric for segment upload error [pinot]

2024-06-04 Thread via GitHub


codecov-commenter commented on PR #13315:
URL: https://github.com/apache/pinot/pull/13315#issuecomment-2148502498

   ## 
[Codecov](https://app.codecov.io/gh/apache/pinot/pull/13315?dropdown=coverage=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 Report
   Attention: Patch coverage is `0%` with `1 line` in your changes missing 
coverage. Please review.
   > Project coverage is 0.00%. Comparing base 
[(`59551e4`)](https://app.codecov.io/gh/apache/pinot/commit/59551e45224f1535c4863fd577622b37366ccc97?dropdown=coverage=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 to head 
[(`75ae464`)](https://app.codecov.io/gh/apache/pinot/commit/75ae464a775b9fd17c931897c4dfd48f36d21a68?dropdown=coverage=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache).
   > Report is 554 commits behind head on master.
   
   | 
[Files](https://app.codecov.io/gh/apache/pinot/pull/13315?dropdown=coverage=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 | Patch % | Lines |
   |---|---|---|
   | 
[...ces/PinotSegmentUploadDownloadRestletResource.java](https://app.codecov.io/gh/apache/pinot/pull/13315?src=pr=tree=pinot-controller%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fpinot%2Fcontroller%2Fapi%2Fresources%2FPinotSegmentUploadDownloadRestletResource.java_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache#diff-cGlub3QtY29udHJvbGxlci9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29udHJvbGxlci9hcGkvcmVzb3VyY2VzL1Bpbm90U2VnbWVudFVwbG9hZERvd25sb2FkUmVzdGxldFJlc291cmNlLmphdmE=)
 | 0.00% | [1 Missing :warning: 
](https://app.codecov.io/gh/apache/pinot/pull/13315?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 |
   
   Additional details and impacted files
   
   
   ```diff
   @@  Coverage Diff  @@
   ## master   #13315   +/-   ##
   =
   - Coverage 61.75%0.00%   -61.76% 
   =
 Files  2436 2463   +27 
 Lines133233   135835 +2602 
 Branches  2063621037  +401 
   =
   - Hits  822740-82274 
   - Misses44911   135835+90924 
   + Partials   60480 -6048 
   ```
   
   | 
[Flag](https://app.codecov.io/gh/apache/pinot/pull/13315/flags?src=pr=flags_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 | Coverage Δ | |
   |---|---|---|
   | 
[custom-integration1](https://app.codecov.io/gh/apache/pinot/pull/13315/flags?src=pr=flag_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 | `?` | |
   | 
[integration](https://app.codecov.io/gh/apache/pinot/pull/13315/flags?src=pr=flag_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 | `0.00% <0.00%> (-0.01%)` | :arrow_down: |
   | 
[integration1](https://app.codecov.io/gh/apache/pinot/pull/13315/flags?src=pr=flag_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 | `?` | |
   | 
[integration2](https://app.codecov.io/gh/apache/pinot/pull/13315/flags?src=pr=flag_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 | `0.00% <0.00%> (ø)` | |
   | 
[java-11](https://app.codecov.io/gh/apache/pinot/pull/13315/flags?src=pr=flag_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 | `?` | |
   | 
[java-21](https://app.codecov.io/gh/apache/pinot/pull/13315/flags?src=pr=flag_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 | `0.00% <0.00%> (-61.63%)` | :arrow_down: |
   | 
[skip-bytebuffers-false](https://app.codecov.io/gh/apache/pinot/pull/13315/flags?src=pr=flag_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 | `0.00% <0.00%> (-61.75%)` | :arrow_down: |
   | 
[skip-bytebuffers-true](https://app.codecov.io/gh/apache/pinot/pull/13315/flags?src=pr=flag_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 | `0.00% <0.00%> (-27.73%)` | :arrow_down: |
   | 
[temurin](https://app.codecov.io/gh/apache/pinot/pull/13315/flags?src=pr=flag_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 | `0.00% <0.00%> (-61.76%)` | :arrow_down: |
   | 
[unittests](https://app.codecov.io/gh/apache/pinot/pull/13315/flags?src=pr=flag_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 | `?` | |
   | 
[unittests1](https://app.codecov.io/gh/apache/pinot/pull/13315/flags?src=pr=flag_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 | `?` | |
   | 

Re: [PR] Bump org.apache.maven.plugins:maven-jxr-plugin from 3.3.2 to 3.4.0 [pinot]

2024-06-04 Thread via GitHub


Jackie-Jiang merged PR #13310:
URL: https://github.com/apache/pinot/pull/13310


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [I] Error while performing schema evolution [pinot]

2024-06-04 Thread via GitHub


aishwaryathondapu commented on issue #13312:
URL: https://github.com/apache/pinot/issues/13312#issuecomment-2148488572

   yes. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] Update getValidDocIdsMetadataFromServer to make call in batches to servers [pinot]

2024-06-04 Thread via GitHub


codecov-commenter commented on PR #13314:
URL: https://github.com/apache/pinot/pull/13314#issuecomment-2148478178

   ## 
[Codecov](https://app.codecov.io/gh/apache/pinot/pull/13314?dropdown=coverage=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 Report
   Attention: Patch coverage is `0%` with `8 lines` in your changes missing 
coverage. Please review.
   > Project coverage is 27.76%. Comparing base 
[(`59551e4`)](https://app.codecov.io/gh/apache/pinot/commit/59551e45224f1535c4863fd577622b37366ccc97?dropdown=coverage=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 to head 
[(`c7372cd`)](https://app.codecov.io/gh/apache/pinot/commit/c7372cd58965a4d95c2862b90f4235f9b6f4084d?dropdown=coverage=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache).
   > Report is 553 commits behind head on master.
   
   | 
[Files](https://app.codecov.io/gh/apache/pinot/pull/13314?dropdown=coverage=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 | Patch % | Lines |
   |---|---|---|
   | 
[...t/controller/util/ServerSegmentMetadataReader.java](https://app.codecov.io/gh/apache/pinot/pull/13314?src=pr=tree=pinot-controller%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fpinot%2Fcontroller%2Futil%2FServerSegmentMetadataReader.java_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache#diff-cGlub3QtY29udHJvbGxlci9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29udHJvbGxlci91dGlsL1NlcnZlclNlZ21lbnRNZXRhZGF0YVJlYWRlci5qYXZh)
 | 0.00% | [8 Missing :warning: 
](https://app.codecov.io/gh/apache/pinot/pull/13314?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 |
   
   Additional details and impacted files
   
   
   ```diff
   @@  Coverage Diff  @@
   ## master   #13314   +/-   ##
   =
   - Coverage 61.75%   27.76%   -33.99% 
   + Complexity  207  198-9 
   =
 Files  2436 2538  +102 
 Lines133233   139519 +6286 
 Branches  2063621594  +958 
   =
   - Hits  8227438743-43531 
   - Misses4491197811+52900 
   + Partials   6048 2965 -3083 
   ```
   
   | 
[Flag](https://app.codecov.io/gh/apache/pinot/pull/13314/flags?src=pr=flags_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 | Coverage Δ | |
   |---|---|---|
   | 
[custom-integration1](https://app.codecov.io/gh/apache/pinot/pull/13314/flags?src=pr=flag_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 | `<0.01% <0.00%> (-0.01%)` | :arrow_down: |
   | 
[integration](https://app.codecov.io/gh/apache/pinot/pull/13314/flags?src=pr=flag_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 | `<0.01% <0.00%> (-0.01%)` | :arrow_down: |
   | 
[integration1](https://app.codecov.io/gh/apache/pinot/pull/13314/flags?src=pr=flag_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 | `<0.01% <0.00%> (-0.01%)` | :arrow_down: |
   | 
[integration2](https://app.codecov.io/gh/apache/pinot/pull/13314/flags?src=pr=flag_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 | `0.00% <0.00%> (ø)` | |
   | 
[java-11](https://app.codecov.io/gh/apache/pinot/pull/13314/flags?src=pr=flag_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 | `<0.01% <0.00%> (-61.71%)` | :arrow_down: |
   | 
[java-21](https://app.codecov.io/gh/apache/pinot/pull/13314/flags?src=pr=flag_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 | `27.76% <0.00%> (-33.86%)` | :arrow_down: |
   | 
[skip-bytebuffers-false](https://app.codecov.io/gh/apache/pinot/pull/13314/flags?src=pr=flag_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 | `<0.01% <0.00%> (-61.75%)` | :arrow_down: |
   | 
[skip-bytebuffers-true](https://app.codecov.io/gh/apache/pinot/pull/13314/flags?src=pr=flag_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 | `27.76% <0.00%> (+0.04%)` | :arrow_up: |
   | 
[temurin](https://app.codecov.io/gh/apache/pinot/pull/13314/flags?src=pr=flag_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 | `27.76% <0.00%> (-33.99%)` | :arrow_down: |
   | 
[unittests](https://app.codecov.io/gh/apache/pinot/pull/13314/flags?src=pr=flag_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 | `27.76% <0.00%> (-33.99%)` | :arrow_down: |
   | 
[unittests1](https://app.codecov.io/gh/apache/pinot/pull/13314/flags?src=pr=flag_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 | `?` | |
   | 

[PR] Add table level metric for segment upload error [pinot]

2024-06-04 Thread via GitHub


tibrewalpratik17 opened a new pull request, #13315:
URL: https://github.com/apache/pinot/pull/13315

   label:
   `observability` 
   
   Adding a table-level metric to track number of segment upload errors. Right 
now, the metric is only emitted at controller level.
   
   cc @ankitsultana @Jackie-Jiang 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[PR] Update getValidDocIdsMetadataFromServer to make call in batches to servers [pinot]

2024-06-04 Thread via GitHub


tibrewalpratik17 opened a new pull request, #13314:
URL: https://github.com/apache/pinot/pull/13314

   label:
   `bugfix`
   `upsert`
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] add metrics tracking lucene near real-time refresh delay [pinot]

2024-06-04 Thread via GitHub


Jackie-Jiang commented on PR #13307:
URL: https://github.com/apache/pinot/pull/13307#issuecomment-2148349411

   cc @swaminathanmanish 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] Bump software.amazon.awssdk:bom from 2.25.64 to 2.25.65 [pinot]

2024-06-04 Thread via GitHub


Jackie-Jiang commented on PR #13311:
URL: https://github.com/apache/pinot/pull/13311#issuecomment-2148347141

   @dependabot rebase


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] Bump org.apache.maven.plugins:maven-jxr-plugin from 3.3.2 to 3.4.0 [pinot]

2024-06-04 Thread via GitHub


Jackie-Jiang commented on PR #13310:
URL: https://github.com/apache/pinot/pull/13310#issuecomment-2148346830

   @dependabot rebase


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] Bump org.checkerframework:checker-qual from 3.43.0 to 3.44.0 [pinot]

2024-06-04 Thread via GitHub


Jackie-Jiang merged PR #13309:
URL: https://github.com/apache/pinot/pull/13309


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] Fix the NPE from IS update metrics [pinot]

2024-06-04 Thread via GitHub


Jackie-Jiang merged PR #13313:
URL: https://github.com/apache/pinot/pull/13313


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



Re: [PR] Fix the NPE from IS update metrics [pinot]

2024-06-04 Thread via GitHub


codecov-commenter commented on PR #13313:
URL: https://github.com/apache/pinot/pull/13313#issuecomment-2148186208

   ## 
[Codecov](https://app.codecov.io/gh/apache/pinot/pull/13313?dropdown=coverage=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 Report
   Attention: Patch coverage is `0%` with `8 lines` in your changes missing 
coverage. Please review.
   > Project coverage is 0.00%. Comparing base 
[(`59551e4`)](https://app.codecov.io/gh/apache/pinot/commit/59551e45224f1535c4863fd577622b37366ccc97?dropdown=coverage=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 to head 
[(`2ec9f75`)](https://app.codecov.io/gh/apache/pinot/commit/2ec9f75f82322b68e458852059376371779522e5?dropdown=coverage=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache).
   > Report is 551 commits behind head on master.
   
   | 
[Files](https://app.codecov.io/gh/apache/pinot/pull/13313?dropdown=coverage=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 | Patch % | Lines |
   |---|---|---|
   | 
[...g/apache/pinot/common/utils/helix/HelixHelper.java](https://app.codecov.io/gh/apache/pinot/pull/13313?src=pr=tree=pinot-common%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fpinot%2Fcommon%2Futils%2Fhelix%2FHelixHelper.java_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vdXRpbHMvaGVsaXgvSGVsaXhIZWxwZXIuamF2YQ==)
 | 0.00% | [8 Missing :warning: 
](https://app.codecov.io/gh/apache/pinot/pull/13313?src=pr=tree_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 |
   
   Additional details and impacted files
   
   
   ```diff
   @@  Coverage Diff  @@
   ## master   #13313   +/-   ##
   =
   - Coverage 61.75%0.00%   -61.76% 
   =
 Files  2436 2463   +27 
 Lines133233   135819 +2586 
 Branches  2063621037  +401 
   =
   - Hits  822740-82274 
   - Misses44911   135819+90908 
   + Partials   60480 -6048 
   ```
   
   | 
[Flag](https://app.codecov.io/gh/apache/pinot/pull/13313/flags?src=pr=flags_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 | Coverage Δ | |
   |---|---|---|
   | 
[custom-integration1](https://app.codecov.io/gh/apache/pinot/pull/13313/flags?src=pr=flag_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 | `?` | |
   | 
[integration](https://app.codecov.io/gh/apache/pinot/pull/13313/flags?src=pr=flag_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 | `0.00% <0.00%> (-0.01%)` | :arrow_down: |
   | 
[integration1](https://app.codecov.io/gh/apache/pinot/pull/13313/flags?src=pr=flag_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 | `?` | |
   | 
[integration2](https://app.codecov.io/gh/apache/pinot/pull/13313/flags?src=pr=flag_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 | `0.00% <0.00%> (ø)` | |
   | 
[java-11](https://app.codecov.io/gh/apache/pinot/pull/13313/flags?src=pr=flag_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 | `?` | |
   | 
[java-21](https://app.codecov.io/gh/apache/pinot/pull/13313/flags?src=pr=flag_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 | `0.00% <0.00%> (-61.63%)` | :arrow_down: |
   | 
[skip-bytebuffers-false](https://app.codecov.io/gh/apache/pinot/pull/13313/flags?src=pr=flag_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 | `0.00% <0.00%> (-61.75%)` | :arrow_down: |
   | 
[skip-bytebuffers-true](https://app.codecov.io/gh/apache/pinot/pull/13313/flags?src=pr=flag_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 | `0.00% <0.00%> (-27.73%)` | :arrow_down: |
   | 
[temurin](https://app.codecov.io/gh/apache/pinot/pull/13313/flags?src=pr=flag_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 | `0.00% <0.00%> (-61.76%)` | :arrow_down: |
   | 
[unittests](https://app.codecov.io/gh/apache/pinot/pull/13313/flags?src=pr=flag_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 | `?` | |
   | 
[unittests1](https://app.codecov.io/gh/apache/pinot/pull/13313/flags?src=pr=flag_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 | `?` | |
   | 
[unittests2](https://app.codecov.io/gh/apache/pinot/pull/13313/flags?src=pr=flag_medium=referral_source=github_content=comment_campaign=pr+comments_term=apache)
 | `?` | |
   
   Flags with carried forward coverage won't be shown. 

Re: [I] Use Foreign Memory API, introduced in Java 22 [pinot]

2024-06-04 Thread via GitHub


aditya0811 commented on issue #12809:
URL: https://github.com/apache/pinot/issues/12809#issuecomment-2148179476

   @gortiz 
   1) Shall we start implementing Buffer using Foreign Memory by adding 
necessary files in ` pinot-plugins/pinot-foreign-memory/src/main/java22`? Or is 
there any other discovery you would recommend before proceeding ahead. Maybe on 
the lines of `how to publish the code in the pipeline`.
   
   2) If no discovery required, I will refer the code you shared earlier. I am 
yet to figure out few things regarding existing Buffer use case for Pinot. Can 
I use this space for any questions, or we want to move this discussion to maybe 
slack thread.
   
   3) After (1) & (2), I would need your inputs regarding the way we would test 
this. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



  1   2   3   4   5   6   7   8   9   10   >