[jira] [Commented] (HADOOP-12666) Support Microsoft Azure Data Lake - as a file system in Hadoop

Chris Nauroth (JIRA) Fri, 08 Apr 2016 12:39:51 -0700

    [ https://issues.apache.org/jira/browse/HADOOP-12666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15232787#comment-15232787 ]


Chris Nauroth commented on HADOOP-12666:
----------------------------------------

[~vishwajeet.dusane], thank you for patch v009.  It looks like this has mostly 
addressed my prior feedback.

I see one more mishandled {{InterruptedException}} inside 
{{BatchByteArrayInputStream}}.  That would be good to clean up.
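
For reference, here is a minimal sketch of the pattern I usually look for when an {{InterruptedException}} is caught on an I/O path: restore the thread's interrupt status and surface the condition as an {{InterruptedIOException}} rather than swallowing it.  This is not the code from the patch.  The class name, the {{takeNextChunk}} helper, and the use of a {{BlockingQueue}} are hypothetical stand-ins for whatever blocking call the stream makes internally.

```java
import java.io.IOException;
import java.io.InterruptedIOException;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; not taken from BatchByteArrayInputStream.
class InterruptAwareRead {

  /**
   * Waits up to timeoutMs for the next buffered chunk.
   * The queue stands in for any blocking hand-off from a background reader.
   */
  public static byte[] takeNextChunk(BlockingQueue<byte[]> queue, long timeoutMs)
      throws IOException {
    try {
      byte[] chunk = queue.poll(timeoutMs, TimeUnit.MILLISECONDS);
      if (chunk == null) {
        throw new IOException("Timed out waiting for data after " + timeoutMs + " ms");
      }
      return chunk;
    } catch (InterruptedException e) {
      // Catching InterruptedException clears the interrupt status, so restore it
      // for callers further up the stack.
      Thread.currentThread().interrupt();
      // Report the interruption as an I/O failure instead of swallowing it.
      InterruptedIOException ioe =
          new InterruptedIOException("Interrupted while waiting for data");
      ioe.initCause(e);
      throw ioe;
    }
  }

  public static void main(String[] args) throws Exception {
    BlockingQueue<byte[]> queue = new LinkedBlockingQueue<>();
    Thread.currentThread().interrupt(); // simulate an interrupt arriving before the read
    try {
      takeNextChunk(queue, 1000);
    } catch (InterruptedIOException e) {
      System.out.println("interrupt status preserved: "
          + Thread.currentThread().isInterrupted());
    }
  }
}
```

The important part is that the interruption stays visible to the caller, both through the exception type and through the restored interrupt status.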

We still have an open question about {{LOG_VERSION}}.  Here is an earlier 
comment from me:

{quote}
Can you please elaborate on what {{LOG_VERSION}} is supposed to mean? Does 
"code instrumentation version" mean that this is trying to capture the version 
of the Hadoop software that is running, or is there some special significance 
to the hard-coded "1.2.1" value? If it's meant to indicate the Hadoop software 
version, then the Hadoop {{VersionInfo}} class I mentioned would be a good fit.
{quote}

This was your response:

{quote}
LOG_VERSION is telemetry information used in Adl back-end only.
{quote}

This didn't entirely address the question for me.  Does this mean that the 
{{LOG_VERSION}} is an opaque value used by telemetry only for tracking the 
"client version" that sent the request?  If so, then my suggestion to use the 
Hadoop {{VersionInfo}} class would be more accurate.  If not, then can you help 
me better understand the significance of the current hard-coded 1.2.1 value?  
Does the service telemetry have some kind of dependency on that specific value?

One thing I would want to avoid is a situation where this value needs to be 
maintained in lock step with something in the back-end implementation.

> Support Microsoft Azure Data Lake - as a file system in Hadoop
> --------------------------------------------------------------
>
>                 Key: HADOOP-12666
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12666
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: fs, fs/azure, tools
>            Reporter: Vishwajeet Dusane
>            Assignee: Vishwajeet Dusane
>         Attachments: Create_Read_Hadoop_Adl_Store_Semantics.pdf, 
> HADOOP-12666-002.patch, HADOOP-12666-003.patch, HADOOP-12666-004.patch, 
> HADOOP-12666-005.patch, HADOOP-12666-006.patch, HADOOP-12666-007.patch, 
> HADOOP-12666-008.patch, HADOOP-12666-009.patch, HADOOP-12666-1.patch
>
>   Original Estimate: 336h
>          Time Spent: 336h
>  Remaining Estimate: 0h
>
> h2. Description
> This JIRA describes a new file system implementation for accessing Microsoft 
> Azure Data Lake Store (ADL) from within Hadoop. This would enable existing 
> Hadoop applications such as MR, Hive, HBase, etc. to use the ADL store as 
> input or output.
>  
> ADL is an ultra-high-capacity store, optimized for massive throughput, with 
> rich management and security features. More details are available at 
> https://azure.microsoft.com/en-us/services/data-lake-store/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
