godfreyhe commented on PR #19286:
URL: https://github.com/apache/flink/pull/19286#issuecomment-1145020743

   > @AHeise I am not sure that the proposed alternative approach makes things 
much easier. I personally find the implementation of the projection 
specification as an array of arrays somewhat confusing and would prefer not to 
spread this abstraction to the lower level interfaces, if possible. I find the 
code of the `CsvRowDataDeserializationSchema` that just operates on input (read 
from source) and output (converted/projected) data types easier to understand. 
That way someone who needs to maintain this class does not need to decipher 
what exactly ` int[][] projections` means. @godfreyhe please let me know if you 
also think it is worth pursuing the proposed alternative approach to what is 
currently implemented in the PR.
   
   Thanks @afedulov for the contribution. I find there is no any difference of 
the json node after deserialized from the message between changing the 
`csvSchema` or not. See the debug info with original schema:
   
![image](https://user-images.githubusercontent.com/8777671/171669068-db8c58f0-0b0f-4501-9b88-6147067b299b.png)
   
   Do you have any performance test to verify the change?
   
   One more thing about  the code in `optimizeCsvRead` method, I find 
`filteredFieldsNames` and `filteredFieldsIndices` are not needed. The code can 
be `
                   final VarCharType varCharType = new VarCharType();
                   final List<RowType.RowField> optimizedRowFields =
                           fields.stream()
                                   .map(
                                           field -> {
                                               if 
(rowResultTypeFields.contains(field.getName())) {
                                                   return field;
                                               } else {
                                                   return new RowType.RowField(
                                                           field.getName(),
                                                           varCharType,
                                                           
field.getDescription().orElse(null));
                                               }
                                           })
                                   .collect(Collectors.toList());
   `
     
   
   
    


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to