Sergey Shelukhin created HIVE-17423:
---------------------------------------

             Summary: LLAP Parquet caching - support file ID in splits
                 Key: HIVE-17423
                 URL: https://issues.apache.org/jira/browse/HIVE-17423
             Project: Hive
          Issue Type: Bug
            Reporter: Sergey Shelukhin


To get LLAP cache data, one needs a file ID, which is either an HDFS inode ID or a composite of path, modification time, and size. For ORC, these IDs can be embedded into splits, because, for the inode ID in particular, it is possible to get the IDs as part of the normal file enumeration that split generation performs anyway. If the IDs are missing from the splits, they need to be obtained for every file on the fragment side.

We should explore adding file IDs to Parquet splits when the cache is enabled.
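
For illustration only, here is a rough sketch of the idea in plain Java. It assumes a split can carry an optional file ID that is either a Long inode ID or a composite of path, modification time, and size. Every name in it (FileIdSketch, SyntheticFileId, ListedFile, SplitWithFileId, makeSplit, resolveOnFragment) is hypothetical and is not taken from the Hive code base; the actual change may look quite different.

```java
import java.util.Objects;
import java.util.function.Function;

// Hypothetical sketch: none of these types are Hive's real classes.
final class FileIdSketch {

    /** Composite fallback ID: path + modification time + size. */
    static final class SyntheticFileId {
        final String path;
        final long modTime;
        final long length;

        SyntheticFileId(String path, long modTime, long length) {
            this.path = path;
            this.modTime = modTime;
            this.length = length;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof SyntheticFileId)) return false;
            SyntheticFileId other = (SyntheticFileId) o;
            return modTime == other.modTime && length == other.length
                && path.equals(other.path);
        }

        @Override
        public int hashCode() {
            return Objects.hash(path, modTime, length);
        }
    }

    /** Stand-in for one entry of the file enumeration done during split generation. */
    static final class ListedFile {
        final String path;
        final long modTime;
        final long length;
        final Long inodeId; // null when the file system exposes no inode ID

        ListedFile(String path, long modTime, long length, Long inodeId) {
            this.path = path;
            this.modTime = modTime;
            this.length = length;
            this.inodeId = inodeId;
        }
    }

    /** A split that may carry the file ID needed for cache lookups. */
    static final class SplitWithFileId {
        final String path;
        final long start;
        final long length;
        final Object fileId; // Long, SyntheticFileId, or null if not embedded

        SplitWithFileId(String path, long start, long length, Object fileId) {
            this.path = path;
            this.start = start;
            this.length = length;
            this.fileId = fileId;
        }
    }

    /** Prefer the inode ID; otherwise build the composite ID. */
    static Object fileIdFor(ListedFile f) {
        return f.inodeId != null
            ? (Object) f.inodeId
            : new SyntheticFileId(f.path, f.modTime, f.length);
    }

    /** Embed the ID only when the cache is enabled, reusing the listing result. */
    static SplitWithFileId makeSplit(ListedFile f, long start, long length,
                                     boolean cacheEnabled) {
        Object id = cacheEnabled ? fileIdFor(f) : null;
        return new SplitWithFileId(f.path, start, length, id);
    }

    /** Fragment side: do the per-file lookup only if the split carries no ID. */
    static Object resolveOnFragment(SplitWithFileId split,
                                    Function<String, Object> perFileLookup) {
        return split.fileId != null ? split.fileId : perFileLookup.apply(split.path);
    }
}
```

In this sketch the ID comes from the file listing that split generation already has in hand, so the fragment side falls back to a per-file lookup only when the split carries no ID.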