luchunliang commented on a change in pull request #2267:
URL: https://github.com/apache/incubator-inlong/pull/2267#discussion_r790131573



##########
File path: inlong-sort/sort-core/src/main/java/org/apache/inlong/sort/flink/deserialization/DeserializationSchema.java
##########
@@ -122,7 +145,23 @@ public void processElement(
                     context.output(METRIC_DATA_OUTPUT_TAG, metricData);
                 }
 
-                collector.collect(recordTransformer.toSerializedRecord(sinkRecord));
+                SerializedRecord serializedSinkRecord = recordTransformer.toSerializedRecord(sinkRecord);
+
+                if (auditImp != null) {
+                    Pair<String, String> groupIdAndStreamId = inLongGroupIdAndStreamIdMap.getOrDefault(
+                            serializedRecord.getDataFlowId(),
+                            Pair.of("", ""));
+
+                    auditImp.add(
+                            Constants.METRIC_AUDIT_ID_FOR_INPUT,
+                            groupIdAndStreamId.getLeft(),
+                            groupIdAndStreamId.getRight(),
+                            sinkRecord.getTimestampMillis(),

Review comment:
       Please check whether sinkRecord.getTimestampMillis() is the generation time
of the Pulsar/Tube message, the logged time of the user data, or the current time.
These two code paths populate the timestamp from different sources:

       public DeserializationResult<SerializedRecord> deserialize(@SuppressWarnings("rawtypes") Message message)
               throws IOException {
           final byte[] data = message.getData();
           return DeserializationResult.of(new SerializedRecord(dataFlowId, message.getEventTime(), data), data.length);
       }

               deserializer.flatMap(mixedRow, new CallbackCollector<>((row -> {
                   // each tid might be associated with multiple data flows
                   for (long dataFlowId : dataFlowIds) {
                       collector.collect(new Record(dataFlowId, System.currentTimeMillis(), row));
                   }
               })));

##########
File path: inlong-sort/sort-core/src/main/java/org/apache/inlong/sort/flink/deserialization/DeserializationSchema.java
##########
@@ -181,6 +228,8 @@ public void removeDataFlow(DataFlowInfo dataFlowInfo) throws Exception {
                 multiTenancyDeserializer.removeDataFlow(dataFlowInfo);
                 fieldMappingTransformer.removeDataFlow(dataFlowInfo);
                 recordTransformer.removeDataFlow(dataFlowInfo);
+
+                inLongGroupIdAndStreamIdMap.remove(dataFlowInfo.getId());

Review comment:
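       Removing the mapping from dataFlowId to inlongGroupId/streamId immediately
when the configuration removes the DataFlowInfo may cause some audit data to be
missed: records of that data flow that are processed after the removal can no
longer resolve their group and stream IDs.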
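
       One possible way to avoid that gap is to defer the removal for a grace
period, so that records still in flight can resolve their IDs. A minimal sketch,
assuming a hypothetical helper class GracefulIdMap (not part of InLong), a
caller-supplied clock value, and an arbitrary grace period; the real fix may
look quite different:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Hypothetical helper (an assumption for illustration, not InLong code):
 * keeps a removed dataFlowId mapping readable until a grace period elapses.
 */
final class GracefulIdMap {
    // dataFlowId -> {groupId, streamId}
    private final Map<Long, String[]> live = new ConcurrentHashMap<>();
    // dataFlowId -> time (millis) at which removal was requested
    private final Map<Long, Long> removedAt = new ConcurrentHashMap<>();
    private final long graceMillis;

    GracefulIdMap(long graceMillis) {
        this.graceMillis = graceMillis;
    }

    void put(long dataFlowId, String groupId, String streamId) {
        // Re-adding a data flow cancels any pending removal.
        removedAt.remove(dataFlowId);
        live.put(dataFlowId, new String[] {groupId, streamId});
    }

    /** Marks the mapping as removed but keeps it available for lookups. */
    void markRemoved(long dataFlowId, long nowMillis) {
        if (live.containsKey(dataFlowId)) {
            removedAt.put(dataFlowId, nowMillis);
        }
    }

    /** Returns {groupId, streamId}, or two empty strings if unknown or expired. */
    String[] lookup(long dataFlowId, long nowMillis) {
        evictExpired(nowMillis);
        return live.getOrDefault(dataFlowId, new String[] {"", ""});
    }

    // Drops mappings whose grace period has fully elapsed.
    private void evictExpired(long nowMillis) {
        removedAt.entrySet().removeIf(e -> {
            if (nowMillis - e.getValue() >= graceMillis) {
                live.remove(e.getKey());
                return true;
            }
            return false;
        });
    }
}
```

       With this shape, removeDataFlow would call markRemoved(...) instead of
remove(...), and the audit path would call lookup(...) with the current time.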

##########
File path: inlong-sort/sort-core/src/main/java/org/apache/inlong/sort/flink/hive/HiveMultiTenantWriter.java
##########
@@ -126,6 +147,20 @@ public void processElement(SerializedRecord serializedRecord, Context context,
 
             hiveWriter.processElement(recordTransformer.toRecord(serializedRecord).getRow(),
                     proxyContext.setContext(context), collector);
+
+            if (auditImp != null) {
+                Pair<String, String> groupIdAndStreamId = inLongGroupIdAndStreamIdMap.getOrDefault(
+                        serializedRecord.getDataFlowId(),
+                        Pair.of("", ""));
+
+                auditImp.add(
+                        Constants.METRIC_AUDIT_ID_FOR_OUTPUT,
+                        groupIdAndStreamId.getLeft(),
+                        groupIdAndStreamId.getRight(),
+                        serializedRecord.getTimestampMillis(),

Review comment:
       ditto



