cmx-ops commented on issue #7740:
URL: https://github.com/apache/seatunnel/issues/7740#issuecomment-2373280984

   > > > Hi, I recently updated the DynamicCompile documentation; you can take a look:
   > > > https://github.com/apache/seatunnel/pull/7730/files
   > > 
   > > 
   > > In other words, my code is incorrect and cannot be written this way:
   > > DynamicCompile only supports simple transformations and does not support
   > > aggregation or deduplication operations.
   > 
   > Hi, based on my understanding, DynamicCompile does not support filtering data.
   > @jackyyyyyssss please correct me if I am wrong.
   > 
   > If you want to filter out the duplicated data, you can do it like this:
   > 
   > 1. Use the `DynamicCompile` transform to add a flag column.
   > 2. Use the `Sql` transform to filter out the rows whose flag marks them as duplicated.
   
   I tried the DynamicCompile transform approach you mentioned, intending to add
   a new field to each row as a marker for duplicated data, but I could not get
   it to work, because I cannot define a shared collection to store the rows
   that have already been seen.
   Here is my code:
   ```java
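       import java.util.HashSet;
       import java.util.Set;
       import org.apache.seatunnel.api.table.catalog.CatalogTable;
       import org.apache.seatunnel.api.table.catalog.Column;
       import org.apache.seatunnel.api.table.catalog.PhysicalColumn;
       import org.apache.seatunnel.api.table.type.BasicType;
       import org.apache.seatunnel.transform.common.SeaTunnelRowAccessor;

       // Composite keys of the rows seen so far.
       private Set<String> uniqueRows = new HashSet<>();

       // Declares the extra output column that carries the duplicate flag.
       public Column[] getInlineOutputColumns(CatalogTable inputCatalogTable) {
           PhysicalColumn destColumn =
                   PhysicalColumn.of(
                           "duplicate",
                           BasicType.STRING_TYPE,
                           10,
                           true,
                           "",
                           "");
           return new Column[]{
                   destColumn
           };
       }

       // Flags a row as "duplicate" if its first three fields were seen before, otherwise "no".
       public Object[] getInlineOutputFieldValues(SeaTunnelRowAccessor inputRow) {
           Object field0 = inputRow.getField(0);
           Object field1 = inputRow.getField(1);
           Object field2 = inputRow.getField(2);

           String compositeKey = field0 + "#" + field1 + "#" + field2;
           boolean isNew = uniqueRows.add(compositeKey);
           Object[] fieldValues = new Object[1];
           fieldValues[0] = isNew ? "no" : "duplicate";
           return fieldValues;
       }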
   ```
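
For reference, the flagging logic from the snippet above can be tried on its own, outside SeaTunnel. This is only a minimal sketch: `DuplicateFlagger` and `flag` are hypothetical names, not part of any SeaTunnel API, and whether DynamicCompile accepts a field like this is exactly the open question here. It assumes all rows pass through one instance in a single JVM; with parallel tasks, each task would hold its own set and duplicates across tasks would go unnoticed.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper (not a SeaTunnel class): remembers the key fields of every
// row it has seen and reports whether the current row repeats an earlier one.
class DuplicateFlagger {
    // Thread-safe set of composite keys seen so far by this instance.
    private final Set<String> seenKeys = ConcurrentHashMap.newKeySet();

    // Returns "duplicate" if the same key fields were seen before, otherwise "no".
    String flag(Object... keyFields) {
        StringBuilder key = new StringBuilder();
        for (Object field : keyFields) {
            // Length-prefix each value so that ("a#b", "c") and ("a", "b#c")
            // do not collapse into the same composite key.
            String text = String.valueOf(field);
            key.append(text.length()).append(':').append(text).append('#');
        }
        // Set.add returns true only when the key was not present yet.
        return seenKeys.add(key.toString()) ? "no" : "duplicate";
    }

    public static void main(String[] args) {
        DuplicateFlagger flagger = new DuplicateFlagger();
        System.out.println(flagger.flag(1, "alice", "2024-09-25")); // prints "no"
        System.out.println(flagger.flag(1, "alice", "2024-09-25")); // prints "duplicate"
        System.out.println(flagger.flag(2, "bob", "2024-09-25"));   // prints "no"
    }
}
```

The length prefix is a small change from the plain `field0 + "#" + field1 + "#" + field2` key: it keeps two different rows from producing the same key when a value itself contains `#`.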
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
