Jackie-Jiang commented on a change in pull request #8224:
URL: https://github.com/apache/pinot/pull/8224#discussion_r811505110
##########
File path:
pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/creator/ColumnStatistics.java
##########
@@ -98,5 +99,7 @@ default int getMaxRowLengthInBytes() {
int getNumPartitions();
+ Map<String, String> getFunctionConfig();
Review comment:
```suggestion
Map<String, String> getPartitionFunctionConfig();
```
##########
File path:
pinot-spi/src/main/java/org/apache/pinot/spi/config/table/ColumnPartitionConfig.java
##########
@@ -46,6 +51,15 @@ public String getFunctionName() {
return _functionName;
}
+ /**
+ * Returns the partition function configuration for the column.
+ *
+ * @return Partition function configuration.
+ */
+ public Map<String, String> getFunctionConfig() {
Review comment:
Annotate it with `@Nullabel`
```suggestion
@Nullable
public Map<String, String> getFunctionConfig() {
```
##########
File path:
pinot-broker/src/main/java/org/apache/pinot/broker/routing/segmentpruner/PartitionSegmentPruner.java
##########
@@ -126,7 +126,8 @@ private PartitionInfo
extractPartitionInfoFromSegmentZKMetadataZNRecord(String s
}
return new
PartitionInfo(PartitionFunctionFactory.getPartitionFunction(columnPartitionMetadata.getFunctionName(),
- columnPartitionMetadata.getNumPartitions()),
columnPartitionMetadata.getPartitions());
+ columnPartitionMetadata.getNumPartitions(),
columnPartitionMetadata.getFunctionConfig()),
+ columnPartitionMetadata.getPartitions());
Review comment:
(code format) Can you please setup the [Pinot
Style](https://docs.pinot.apache.org/developers/developers-and-contributors/code-setup#intellij)
and reformat the changes?
##########
File path:
pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/partition/BoundedColumnValuePartitionFunction.java
##########
@@ -0,0 +1,92 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.segment.spi.partition;
+
+import com.google.common.base.Preconditions;
+import java.util.Map;
+
+
+/**
+ * Implementation of {@link PartitionFunction} which partitions based
configured column values.
+ *
+ * "columnPartitionMap": {
+ * "subject": {
+ * "functionName": "BoundedColumnValue",
+ * "functionConfig": {
+ * "columnValues": "Maths|English|Chemistry"
+ * }
+ * }
+ * }
+ * With this partition config on column "subject", partitionId would be 1 for
Maths, 2 for English and so on.
+ * partitionId would be "0" for all other values which is not configured but
may occur.
+ */
+public class BoundedColumnValuePartitionFunction implements PartitionFunction {
+ private static final int DEFAULT_PARTITION_ID = 0;
+ private static final String NAME = "BoundedColumnValue";
+ private static final String COLUMN_VALUES = "columnValues";
+ private final Map<String, String> _functionConfig;
+ private final String[] _values;
+ private final int _noOfValues;
+
+ public BoundedColumnValuePartitionFunction(Map<String, String>
functionConfig) {
+ Preconditions.checkArgument(functionConfig != null &&
functionConfig.size() > 0,
+ "functionConfig should be present, specified", functionConfig);
+ Preconditions.checkState(functionConfig.get(COLUMN_VALUES) != null,
"columnValues must be configured");
+ _functionConfig = functionConfig;
+ _values = functionConfig.get(COLUMN_VALUES).split("\\|");
+ _noOfValues = _values.length;
+ }
+
+ @Override
+ public int getPartition(Object valueIn) {
+ String value = valueIn instanceof String ? (String) valueIn :
String.valueOf(valueIn);
Review comment:
(minor) Same as `value.toString()`
##########
File path:
pinot-spi/src/main/java/org/apache/pinot/spi/config/table/ColumnPartitionConfig.java
##########
@@ -21,20 +21,25 @@
import com.fasterxml.jackson.annotation.JsonCreator;
import com.fasterxml.jackson.annotation.JsonProperty;
import com.google.common.base.Preconditions;
+import java.util.Map;
import org.apache.pinot.spi.config.BaseJsonConfig;
public class ColumnPartitionConfig extends BaseJsonConfig {
private final String _functionName;
private final int _numPartitions;
+ private final Map<String, String> _functionConfig;
@JsonCreator
public ColumnPartitionConfig(@JsonProperty(value = "functionName", required
= true) String functionName,
- @JsonProperty(value = "numPartitions", required = true) int
numPartitions) {
+ @JsonProperty(value = "numPartitions", required = false) int
numPartitions,
Review comment:
Suggest not changing this to optional because it is used for segment
management. Within the `BoundedColumnValuePartitionFunction`, we can validate
whether it matches the column values provided
##########
File path:
pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/partition/BoundedColumnValuePartitionFunction.java
##########
@@ -0,0 +1,92 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.segment.spi.partition;
+
+import com.google.common.base.Preconditions;
+import java.util.Map;
+
+
+/**
+ * Implementation of {@link PartitionFunction} which partitions based
configured column values.
+ *
+ * "columnPartitionMap": {
+ * "subject": {
+ * "functionName": "BoundedColumnValue",
+ * "functionConfig": {
+ * "columnValues": "Maths|English|Chemistry"
+ * }
+ * }
+ * }
+ * With this partition config on column "subject", partitionId would be 1 for
Maths, 2 for English and so on.
+ * partitionId would be "0" for all other values which is not configured but
may occur.
+ */
+public class BoundedColumnValuePartitionFunction implements PartitionFunction {
+ private static final int DEFAULT_PARTITION_ID = 0;
+ private static final String NAME = "BoundedColumnValue";
+ private static final String COLUMN_VALUES = "columnValues";
+ private final Map<String, String> _functionConfig;
+ private final String[] _values;
+ private final int _noOfValues;
+
+ public BoundedColumnValuePartitionFunction(Map<String, String>
functionConfig) {
+ Preconditions.checkArgument(functionConfig != null &&
functionConfig.size() > 0,
+ "functionConfig should be present, specified", functionConfig);
+ Preconditions.checkState(functionConfig.get(COLUMN_VALUES) != null,
"columnValues must be configured");
+ _functionConfig = functionConfig;
+ _values = functionConfig.get(COLUMN_VALUES).split("\\|");
Review comment:
Suggest making the delimiter configurable to support values with `|`
##########
File path:
pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/partition/BoundedColumnValuePartitionFunction.java
##########
@@ -0,0 +1,92 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.segment.spi.partition;
+
+import com.google.common.base.Preconditions;
+import java.util.Map;
+
+
+/**
+ * Implementation of {@link PartitionFunction} which partitions based
configured column values.
+ *
+ * "columnPartitionMap": {
+ * "subject": {
+ * "functionName": "BoundedColumnValue",
+ * "functionConfig": {
+ * "columnValues": "Maths|English|Chemistry"
+ * }
+ * }
+ * }
+ * With this partition config on column "subject", partitionId would be 1 for
Maths, 2 for English and so on.
+ * partitionId would be "0" for all other values which is not configured but
may occur.
+ */
+public class BoundedColumnValuePartitionFunction implements PartitionFunction {
+ private static final int DEFAULT_PARTITION_ID = 0;
+ private static final String NAME = "BoundedColumnValue";
+ private static final String COLUMN_VALUES = "columnValues";
+ private final Map<String, String> _functionConfig;
+ private final String[] _values;
Review comment:
Make it a map to accelerate the lookup?
##########
File path:
pinot-spi/src/main/java/org/apache/pinot/spi/config/table/ColumnPartitionConfig.java
##########
@@ -21,20 +21,25 @@
import com.fasterxml.jackson.annotation.JsonCreator;
import com.fasterxml.jackson.annotation.JsonProperty;
import com.google.common.base.Preconditions;
+import java.util.Map;
import org.apache.pinot.spi.config.BaseJsonConfig;
public class ColumnPartitionConfig extends BaseJsonConfig {
private final String _functionName;
private final int _numPartitions;
+ private final Map<String, String> _functionConfig;
@JsonCreator
public ColumnPartitionConfig(@JsonProperty(value = "functionName", required
= true) String functionName,
Review comment:
Let's add another constructor without `functionConfig`, so that we don't
need to change the existing usage of this method
##########
File path:
pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/partition/BoundedColumnValuePartitionFunction.java
##########
@@ -0,0 +1,92 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.segment.spi.partition;
+
+import com.google.common.base.Preconditions;
+import java.util.Map;
+
+
+/**
+ * Implementation of {@link PartitionFunction} which partitions based
configured column values.
+ *
+ * "columnPartitionMap": {
+ * "subject": {
+ * "functionName": "BoundedColumnValue",
+ * "functionConfig": {
+ * "columnValues": "Maths|English|Chemistry"
+ * }
+ * }
+ * }
+ * With this partition config on column "subject", partitionId would be 1 for
Maths, 2 for English and so on.
Review comment:
Should we consider putting other values as the last partition in case
all values are already configured, and no value goes into partition 0?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]