klsince commented on code in PR #13107:
URL: https://github.com/apache/pinot/pull/13107#discussion_r1605390671
##########
pinot-common/src/main/java/org/apache/pinot/common/metadata/segment/SegmentPartitionMetadata.java:
##########
@@ -48,6 +53,21 @@ public SegmentPartitionMetadata(
@Nonnull @JsonProperty("columnPartitionMap") Map<String,
ColumnPartitionMetadata> columnPartitionMap) {
Preconditions.checkNotNull(columnPartitionMap);
_columnPartitionMap = columnPartitionMap;
+ _uploadedSegmentPartitionId = -1;
+ }
+
+ /**
+ * Constructor for the class.
+ *
+ * @param columnPartitionMap Column name to ColumnPartitionMetadata map.
+ */
+ @JsonCreator
+ public SegmentPartitionMetadata(
+ @Nullable @JsonProperty("columnPartitionMap") Map<String,
ColumnPartitionMetadata> columnPartitionMap,
+ @Nullable @JsonProperty(value = "uploadedSegmentPartitionId",
defaultValue = "-1")
Review Comment:
I actually dropped {sequenceId} intentionally, otherwise, the segment name
becomes LLC format, and would cause the complexities you mentioned in another
alternative proposal, like the checks on start/end offsets. In fact, the
endOffset of latest LLC segment (the one with max seqId) is used to resume
stream consumption, and we probably don't want to affect that logic with
uploaded segments.
> Do we enforce naming conventions? ... ... add a SegmentNameGenerator
implementation ...
I don't think so. We may need one just for uploading segments to upsert
tables, due to its special requirements on partition id and to break comparison
ties. And good point for the need of a new name generator.
btw, from what I learnt while reviewing this PR, there seems a design choice
for RT segments that the {partitionId} encoded in the segment name is the
source of truth, rather than the one kept in ZK metadata as there might no
partition info at all in ZK metadata. So following on that design principle,
I'd prefer to keep encoding the {partitionId} in segment name for uploaded
segments as well. In this way, we can 1) avoid the cost of reading ZK metadata
every time we need partitionId for the uploaded segments; 2) avoid changes on
persistent segment metadata in ZK or on disk, which might make things a bit
easier when to consider upgrade/downgrade.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]