lirui-apache commented on a change in pull request #12573:
URL: https://github.com/apache/flink/pull/12573#discussion_r437930910



##########
File path: flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/connectors/hive/read/HiveTableFileInputFormat.java
##########
@@ -52,16 +53,26 @@ public HiveTableFileInputFormat(
 
        @Override
        public void open(FileInputSplit fileSplit) throws IOException {
-               URI uri = fileSplit.getPath().toUri();
                HiveTableInputSplit split = new HiveTableInputSplit(
                                fileSplit.getSplitNumber(),
-                               new FileSplit(new Path(uri), fileSplit.getStart(), fileSplit.getLength(), (String[]) null),
+                               toHadoopFileSplit(fileSplit),
                                inputFormat.getJobConf(),
-                               hiveTablePartition
-               );
+                               hiveTablePartition);
                inputFormat.open(split);
        }
 
+       @VisibleForTesting
+       static FileSplit toHadoopFileSplit(FileInputSplit fileSplit) throws IOException {
+               URI uri = fileSplit.getPath().toUri();
+               long length = fileSplit.getLength();
+               // Hadoop FileSplit should not have -1 length.
+               if (length == -1) {

Review comment:
       Can we update the doc of `FileInputSplit::getLength()` to indicate that `length == -1` means reading all data from the file? I'd feel more comfortable with this change if it were guaranteed by the API contract.
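
       For context, here is a minimal sketch of how the `-1` case could be resolved under that contract, assuming `length == -1` is defined to mean "read all remaining data from the file". This is illustrative only, not the PR's actual implementation; the `FileSystem` lookup and the wrapper class name are assumptions:

```java
import java.io.IOException;
import java.net.URI;

import org.apache.flink.core.fs.FileInputSplit;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileSplit;

// Illustrative sketch only -- not the code under review.
class SplitConversionSketch {

	// Convert a Flink FileInputSplit into a Hadoop FileSplit, resolving a
	// -1 length to "everything from the split's start to the end of the file".
	static FileSplit toHadoopFileSplit(FileInputSplit fileSplit) throws IOException {
		URI uri = fileSplit.getPath().toUri();
		Path hadoopPath = new Path(uri);
		long length = fileSplit.getLength();
		// Hadoop FileSplit should not have -1 length.
		if (length == -1) {
			// Hypothetical resolution: query the file system for the file
			// size and read from the split's start offset to end of file.
			FileSystem fs = hadoopPath.getFileSystem(new Configuration());
			length = fs.getFileStatus(hadoopPath).getLen() - fileSplit.getStart();
		}
		return new FileSplit(hadoopPath, fileSplit.getStart(), length, (String[]) null);
	}
}
```

       Note this resolution costs an extra `FileSystem` call per split; documenting the `-1` semantics on `FileInputSplit::getLength()` would at least make it well-defined by the API contract rather than an implementation-specific assumption.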




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

