[
https://issues.apache.org/jira/browse/HADOOP-12666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15324806#comment-15324806
]
Chris Nauroth commented on HADOOP-12666:
----------------------------------------
bq. This is the kind of thing best done as feature branch: it can be stabilised
before being merged with anything.
[[email protected]], please allow me to take the responsibility for this, and
I'll learn from it. I was working based on established precedent for prior
file system implementations. The prior history is that s3n, s3a, swift and
wasb all entered trunk without development on feature branches. I considered
feature branches unnecessary for those, because they were new, isolated modules
with no impact on the rest of the tree, so I followed the same path here for
ADL.
Established precedent doesn't always mean good practice though. My thinking on
this has evolved recently, and I'm leaning towards using feature branches more
often.
bq. Certainly I'd like to see backports to branch-2 holding back...
Yes, this is the plan. I may have caused confusion earlier with my accidental
commit to branch-2. That has been reverted.
I'd like to recap the plan that Chris D and I landed on. The comments on this
JIRA are lengthy, so it would have been easy to miss this:
{quote}
HADOOP-13037 will remove the dependency on WebHDFS, largely rewriting this
client. The buffering in PrivateAzureDataLakeFileSystem should also be
rewritten. It's implementing something like demand-paging, but some of the
control flow would be more powerful, and more understandable, if it were
layered more conventionally. Configuring the client is also very complex. I
tried the directions, but only arrived at a working client with Vishwajeet's
help.
The target version is 2.9, but we should hold off on backporting this until
it's easier to use and maintain. I would like to commit the result of review
from Chris Nauroth, Lei (Eddy) Xu, Tony Wu, Aaron Fabbri, and Sean Mackrory to
trunk. It'll be easier to fix up the patch in targeted JIRAs. Committing the
contract tests in HADOOP-12875 would also be helpful. This would be with the
caveats from HDFS-9938: this module may be removed if it impedes WebHDFS
development. Further, it should be easier to configure before we include it in
a release. Is this an acceptable path forward?
{quote}
To summarize, the plan was to commit HADOOP-12666 and HADOOP-12875
close together. Then, the contract tests would serve as an effective check
against regressions when the work is done on HADOOP-13037. You have -1'd the
current revision of HADOOP-12875, so the contributors will need to work through
your feedback before HADOOP-12875 can be committed.
> Support Microsoft Azure Data Lake - as a file system in Hadoop
> --------------------------------------------------------------
>
> Key: HADOOP-12666
> URL: https://issues.apache.org/jira/browse/HADOOP-12666
> Project: Hadoop Common
> Issue Type: New Feature
> Components: fs, fs/azure, tools
> Reporter: Vishwajeet Dusane
> Assignee: Vishwajeet Dusane
> Fix For: 3.0.0-alpha1
>
> Attachments: Create_Read_Hadoop_Adl_Store_Semantics.pdf,
> HADOOP-12666-002.patch, HADOOP-12666-003.patch, HADOOP-12666-004.patch,
> HADOOP-12666-005.patch, HADOOP-12666-006.patch, HADOOP-12666-007.patch,
> HADOOP-12666-008.patch, HADOOP-12666-009.patch, HADOOP-12666-010.patch,
> HADOOP-12666-011.patch, HADOOP-12666-012.patch, HADOOP-12666-013.patch,
> HADOOP-12666-014.patch, HADOOP-12666-015.patch, HADOOP-12666-016.patch,
> HADOOP-12666-1.patch
>
> Original Estimate: 336h
> Time Spent: 336h
> Remaining Estimate: 0h
>
> h2. Description
> This JIRA describes a new file system implementation for accessing Microsoft
> Azure Data Lake Store (ADL) from within Hadoop. This would enable existing
> Hadoop applications such as MR, Hive, HBase, etc., to use ADL store as
> input or output.
>
> ADL is ultra-high capacity, optimized for massive throughput, with rich
> management and security features. More details are available at
> https://azure.microsoft.com/en-us/services/data-lake-store/