Zoltán Borók-Nagy created IMPALA-9723:
-----------------------------------------
             Summary: Read files created by Hive Streaming Ingestion V2
                 Key: IMPALA-9723
                 URL: https://issues.apache.org/jira/browse/IMPALA-9723
             Project: IMPALA
          Issue Type: Sub-task
            Reporter: Zoltán Borók-Nagy


Impala should be able to read files created by Hive Streaming Ingestion V2.

Hive Streaming only writes full ACID ORC files. Such files might contain rows that Impala shouldn't read based on its validWriteIdList.

Also, Hive Streaming might append to the end of such files. In that case it writes a "side file" next to the data file; the side file records the last committed end of the data file (its name is the data file's name + ___flush_length). Impala should take that into consideration when it reads such files. Everything after the "flush length" must be ignored.

OrcAcidUtils.getLastFlushLength(fileSystem, filePath) can be used to determine the committed file size.
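
To make the "flush length" rule above concrete, here is a minimal sketch in plain Java (java.nio on a local file system, not Hadoop's FileSystem API). It is not the Impala or Hive implementation. It assumes the side file is a sequence of 8-byte big-endian longs of which the last complete one is the committed length, and that a missing side file means the whole data file is committed; both match my understanding of OrcAcidUtils.getLastFlushLength but should be verified against Hive. The class and method names (FlushLengthSketch, sideFileFor, lastFlushLength, readableLength) are hypothetical, and the suffix is copied from the issue text as written.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class FlushLengthSketch {
    // Suffix as it appears in the issue text; the exact value is an assumption
    // and should be checked against the constant defined in Hive's OrcAcidUtils.
    static final String SIDE_FILE_SUFFIX = "___flush_length";

    // The side file sits next to the data file: <data file name> + suffix.
    static Path sideFileFor(Path dataFile) {
        return dataFile.resolveSibling(dataFile.getFileName() + SIDE_FILE_SUFFIX);
    }

    // Returns the last complete 8-byte length recorded in the side file.
    // Assumptions: Long.MAX_VALUE when there is no side file (nothing to cut off),
    // -1 when the side file exists but holds no complete entry.
    static long lastFlushLength(Path dataFile) throws IOException {
        Path side = sideFileFor(dataFile);
        if (!Files.exists(side)) {
            return Long.MAX_VALUE;
        }
        long result = -1;
        try (DataInputStream in = new DataInputStream(Files.newInputStream(side))) {
            while (true) {
                try {
                    result = in.readLong(); // big-endian, 8 bytes
                } catch (EOFException e) {
                    break; // end of file, or a partially written trailing entry
                }
            }
        }
        return result;
    }

    // Number of bytes a reader may trust: everything after it must be ignored.
    static long readableLength(Path dataFile) throws IOException {
        long physical = Files.size(dataFile);
        long flushed = lastFlushLength(dataFile);
        if (flushed < 0) {
            return 0; // assumption: an empty side file means nothing is committed yet
        }
        return Math.min(physical, flushed);
    }
}
```

A reader would call readableLength(dataFile) once before opening the ORC file and treat that value, rather than the physical file size, as the end of the file when locating the ORC footer.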