JingsongLi commented on a change in pull request #12573:
URL: https://github.com/apache/flink/pull/12573#discussion_r437953848



##########
File path: 
flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/connectors/hive/read/HiveTableFileInputFormat.java
##########
@@ -52,16 +53,26 @@ public HiveTableFileInputFormat(
 
        @Override
        public void open(FileInputSplit fileSplit) throws IOException {
-               URI uri = fileSplit.getPath().toUri();
                HiveTableInputSplit split = new HiveTableInputSplit(
                                fileSplit.getSplitNumber(),
-                               new FileSplit(new Path(uri), fileSplit.getStart(), fileSplit.getLength(), (String[]) null),
+                               toHadoopFileSplit(fileSplit),
                                inputFormat.getJobConf(),
-                               hiveTablePartition
-               );
+                               hiveTablePartition);
                inputFormat.open(split);
        }
 
+       @VisibleForTesting
+       static FileSplit toHadoopFileSplit(FileInputSplit fileSplit) throws IOException {
+               URI uri = fileSplit.getPath().toUri();
+               long length = fileSplit.getLength();
+               // Hadoop FileSplit should not have -1 length.
+               if (length == -1) {

Review comment:
       Actually, there is a comment in `FileInputSplit`: `the number of bytes in the file to process (-1 is flag for "read whole file")`
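
       So a length of `-1` from Flink is a sentinel, not a real byte count, and it must be resolved before constructing a Hadoop `FileSplit`. As a rough illustration of that convention (a hypothetical helper, not Flink's actual implementation; the method name and parameters are invented for this sketch):

```java
// Hypothetical sketch: resolving Flink's "-1 means read whole file" flag
// into a concrete length, as a Hadoop FileSplit requires.
public class SplitLengthExample {

    /**
     * Returns a non-negative split length. If the Flink split reports -1
     * ("read whole file"), falls back to the bytes remaining in the file
     * from the split's start offset.
     */
    static long resolveSplitLength(long splitStart, long splitLength, long fileLength) {
        if (splitLength == -1) {
            // -1 is Flink's flag for "read whole file"; Hadoop needs a real length.
            return fileLength - splitStart;
        }
        return splitLength;
    }

    public static void main(String[] args) {
        // A split over a whole 1024-byte file, expressed with the -1 flag.
        System.out.println(resolveSplitLength(0, -1, 1024));
        // A normal split with an explicit length passes through unchanged.
        System.out.println(resolveSplitLength(512, 256, 1024));
    }
}
```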




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]
