Jackie-Jiang commented on code in PR #13086:
URL: https://github.com/apache/pinot/pull/13086#discussion_r1623331211
##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/recordtransformer/RecordTransformer.java:
##########
@@ -18,15 +18,79 @@
*/
package org.apache.pinot.segment.local.recordtransformer;
+import com.fasterxml.jackson.databind.JsonNode;
+import java.io.IOException;
import java.io.Serializable;
+import java.util.ArrayList;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
import javax.annotation.Nullable;
+import org.apache.pinot.plugin.record.enricher.RecordEnricherRegistry;
+import org.apache.pinot.spi.config.table.TableConfig;
+import org.apache.pinot.spi.config.table.ingestion.EnrichmentConfig;
+import org.apache.pinot.spi.config.table.ingestion.IngestionConfig;
import org.apache.pinot.spi.data.readers.GenericRow;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
/**
* The record transformer which takes a {@link GenericRow} and transform it
based on some custom rules.
*/
public interface RecordTransformer extends Serializable {
+ final boolean GROOVY_DISABLED = false;
Review Comment:
Suggest keeping the interface simple. We need to add `getInputColumns()`
from `RecordEnricher`, other method seems unnecessary. Take a look at
`CompositeTransformer` for record transformer handling
##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/creator/impl/SegmentIndexCreationDriverImpl.java:
##########
@@ -113,6 +112,29 @@ public class SegmentIndexCreationDriverImpl implements
SegmentIndexCreationDrive
private long _totalStatsCollectorTime = 0;
private boolean _continueOnError;
+ public static void persistCreationMeta(File indexDir, long crc, long
creationTime)
Review Comment:
These changes are not related to this PR. I guess you might enabled
`Rearrange code` option in IDE during auto reformat. Can you revert these
unrelated changes? There are quite some unrelated auto reformat in several files
##########
pinot-connectors/pinot-flink-connector/src/main/java/org/apache/pinot/connector/flink/sink/FlinkSegmentWriter.java:
##########
@@ -78,7 +77,7 @@ public class FlinkSegmentWriter implements SegmentWriter {
private String _outputDirURI;
private Schema _schema;
private Set<String> _fieldsToRead;
- private RecordEnricherPipeline _recordEnricherPipeline;
+ private RecordTransformer _recordEnricherPipeline;
Review Comment:
After changing `RecordEnricher` into `RecordTransformer`, we should be able
to put the `RecordEnricher` in front of the existing `RecordTransformer`s as
part of the default `CompositeTransformer`
##########
pinot-segment-local/src/main/java/org/apache/pinot/plugin/record/enricher/RecordEnricherRegistry.java:
##########
@@ -16,45 +16,29 @@
* specific language governing permissions and limitations
* under the License.
*/
-package org.apache.pinot.spi.recordenricher;
+package org.apache.pinot.plugin.record.enricher;
-import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import java.util.ServiceLoader;
-import org.apache.pinot.spi.config.table.ingestion.EnrichmentConfig;
+import org.apache.pinot.segment.local.recordtransformer.RecordTransformer;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
public class RecordEnricherRegistry {
Review Comment:
I think we should be able to remove this registry and follow the existing
way of creating `RecordTransformer`. Same for other places where 2 pipelines
exist
##########
pinot-core/src/main/java/org/apache/pinot/core/data/manager/realtime/RealtimeSegmentDataManager.java:
##########
@@ -275,7 +275,7 @@ public void deleteSegmentFile() {
private final int _partitionGroupId;
private final PartitionGroupConsumptionStatus
_partitionGroupConsumptionStatus;
final String _clientId;
- private final RecordEnricherPipeline _recordEnricherPipeline;
+ private final RecordTransformer _recordEnricherPipeline;
Review Comment:
Same here, we should be able to integrate it into the `TransformPipeline`
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]