fhueske commented on a change in pull request #6710: [FLINK-10134] UTF-16 
support for TextInputFormat bug fixed
URL: https://github.com/apache/flink/pull/6710#discussion_r223268822
 
 

 ##########
 File path: 
flink-core/src/main/java/org/apache/flink/api/common/io/FileInputFormat.java
 ##########
 @@ -601,41 +602,44 @@ public LocatableInputSplitAssigner 
getInputSplitAssigner(FileInputSplit[] splits
                if (unsplittable) {
                        int splitNum = 0;
                        for (final FileStatus file : files) {
+                               String bomCharsetName = getBomCharset(file);
 
 Review comment:
   Yes, I'm aware of that. It would also be required for every split unless we 
cache the BOM per file.
   OTOH, if we do it in the JM, the job cannot start until a single thread has 
looked at the first bytes of each file.
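
   To make the "cache the BOM per file" option concrete, here is a rough sketch 
of what I have in mind. `BomCharsetCache`, `charsetFor`, and the `headReader` 
function are hypothetical names, not existing Flink API, and the real code 
would read the first bytes through the Flink `FileSystem` rather than through 
an injected function.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical helper (not part of Flink): detects a BOM once per file and caches it. */
class BomCharsetCache {

    // Key: file path. Value: charset name implied by the BOM, or empty if there is none.
    private final Map<String, Optional<String>> cache = new ConcurrentHashMap<>();

    // Assumption: returns the first few bytes of the file at the given path.
    private final Function<String, byte[]> headReader;

    BomCharsetCache(Function<String, byte[]> headReader) {
        this.headReader = headReader;
    }

    /** Reads the file head only on the first request for a path; later splits hit the cache. */
    Optional<String> charsetFor(String path) {
        return cache.computeIfAbsent(path, p -> detect(headReader.apply(p)));
    }

    /** Maps the leading bytes to a charset name. UTF-32 is checked before UTF-16
     *  because the UTF-32LE BOM starts with the UTF-16LE BOM. */
    static Optional<String> detect(byte[] head) {
        if (startsWith(head, 0x00, 0x00, 0xFE, 0xFF)) {
            return Optional.of("UTF-32BE");
        }
        if (startsWith(head, 0xFF, 0xFE, 0x00, 0x00)) {
            return Optional.of("UTF-32LE");
        }
        if (startsWith(head, 0xEF, 0xBB, 0xBF)) {
            return Optional.of("UTF-8");
        }
        if (startsWith(head, 0xFE, 0xFF)) {
            return Optional.of("UTF-16BE");
        }
        if (startsWith(head, 0xFF, 0xFE)) {
            return Optional.of("UTF-16LE");
        }
        return Optional.empty();
    }

    private static boolean startsWith(byte[] data, int... prefix) {
        if (data == null || data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}
```

   With something like this on the task side, each file head is read at most 
once per cache instance, no matter how many splits of that file the task 
processes. The cache is local to one JVM, so different TMs would still each 
read the head of a file they get splits for.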

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
