[ 
https://issues.apache.org/jira/browse/IMPALA-8454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16829592#comment-16829592
 ] 

ASF subversion and git services commented on IMPALA-8454:
---------------------------------------------------------

Commit 5ced9160bd65e5a72d739a7a5c548add2dbc4b84 in impala's branch 
refs/heads/master from Todd Lipcon
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=5ced916 ]

IMPALA-8454 (part 2): Initial support for recursive file listing within a 
partition

This adds support to FileMetadataLoader to recursively list a directory
and create file descriptors. The changes are as follows:

* FileMetadataLoader can now take a 'recursive' argument to trigger the
  new behavior. All the non-test code paths still use non-recursive
  listing (i.e. this new feature isn't exposed for real tables yet).

* FileSystemUtil has some functionality for recursive directory listing.
  There are a few notes there around potential optimizations for S3 vs
  HDFS.

* Renamed the 'file_name' field to 'relative_path' for FileDescriptor
  and HDFS splits, since a file descriptor's path may now span more
  than a single path component.
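
As a rough illustration of the behavior described in the list above, the
sketch below lists a partition directory either non-recursively or
recursively and returns each file's path relative to the partition root.
It is not the Impala implementation: it uses java.nio.file on a local
directory instead of the Hadoop FileSystem API, and the class name, the
method names, and the 'delta_1_1' directory name are all hypothetical,
chosen only for the example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical sketch; not part of Impala's FileMetadataLoader or FileSystemUtil.
public class RecursiveListingSketch {

  // Returns the sorted paths of regular files under partitionDir, expressed
  // relative to partitionDir. Subdirectories are descended into only when
  // 'recursive' is true; otherwise they are skipped.
  static List<String> listRelativePaths(Path partitionDir, boolean recursive)
      throws IOException {
    List<String> result = new ArrayList<>();
    collect(partitionDir, partitionDir, recursive, result);
    Collections.sort(result);
    return result;
  }

  private static void collect(Path root, Path dir, boolean recursive,
      List<String> out) throws IOException {
    // Files.list returns a lazily populated stream that must be closed.
    try (Stream<Path> entries = Files.list(dir)) {
      for (Path entry : (Iterable<Path>) entries::iterator) {
        if (Files.isDirectory(entry)) {
          if (recursive) collect(root, entry, true, out);
        } else {
          // A relative path may span several components, e.g. "delta_1_1/bucket_0".
          out.add(root.relativize(entry).toString().replace('\\', '/'));
        }
      }
    }
  }

  public static void main(String[] args) throws IOException {
    Path part = Files.createTempDirectory("part");
    Files.createDirectories(part.resolve("delta_1_1"));
    Files.createFile(part.resolve("delta_1_1").resolve("bucket_0"));
    Files.createFile(part.resolve("top_level_file"));

    System.out.println(listRelativePaths(part, false));
    System.out.println(listRelativePaths(part, true));
  }
}
```

With the non-recursive call only the file directly inside the partition
directory is returned; the recursive call also returns the file in the
subdirectory, which is why a single file name is no longer enough to
identify a file and a relative path is needed.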

The new functionality is only unit tested at the moment. Later, this
functionality will be tied into the actual table code paths to solve
issues with Hive interop, and end-to-end tests will be added.

Change-Id: I9b151d7abb8443c0d9de0a0d82a9f13e07ad5109
Reviewed-on: http://gerrit.cloudera.org:8080/12991
Tested-by: Todd Lipcon <t...@apache.org>
Reviewed-by: Todd Lipcon <t...@apache.org>


> Recursively list files within transactional tables
> --------------------------------------------------
>
>                 Key: IMPALA-8454
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8454
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Catalog
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Major
>
> For transactional tables, the data files are not directly within the 
> partition directories, but instead are stored within subdirectories 
> corresponding to writeIds, compactions, etc. To support this, we need to be 
> able to recursively load file lists within partition directories.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

