[ 
https://issues.apache.org/jira/browse/HADOOP-15407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16599829#comment-16599829
 ] 

Thomas Marquardt commented on HADOOP-15407:
-------------------------------------------

[[email protected]], the 403 errors are happening because your storage account 
key was updated.  Please try the new key I sent you.

Also, make sure you have the latest sources and refer to the "Testing the Azure 
ABFS Client" section of testing_azure.md.  The config was updated recently by 
HADOOP-15663.

In a nutshell, add the following to src/test/resources/azure-auth-keys.xml:

{noformat}
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration xmlns:xi="http://www.w3.org/2001/XInclude">
  <property>
    <name>fs.azure.abfs.account.name</name>
    <value>{ACCOUNT_NAME}.dfs.core.windows.net</value>
  </property>

  <property>
    <name>fs.azure.account.key.{ACCOUNT_NAME}.dfs.core.windows.net</name>
    <value>{ACCOUNT_ACCESS_KEY}</value>
  </property>

  <property>
    <name>fs.azure.wasb.account.name</name>
    <value>{ACCOUNT_NAME}.blob.core.windows.net</value>
  </property>
  
  <property>
    <name>fs.azure.account.key.{ACCOUNT_NAME}.blob.core.windows.net</name>
    <value>{ACCOUNT_ACCESS_KEY}</value>
  </property>

  <property>
    <name>fs.contract.test.fs.abfs</name>
    <value>abfs://{CONTAINER_NAME}@{ACCOUNT_NAME}.dfs.core.windows.net</value>
    <description>A file system URI to be used by the contract tests.</description>
  </property>

  <property>
    <name>fs.contract.test.fs.wasb</name>
    <value>wasb://{CONTAINER_NAME}@{ACCOUNT_NAME}.blob.core.windows.net</value>
    <description>A file system URI to be used by the contract tests.</description>
  </property>
</configuration>
{noformat}
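
If it helps to avoid typos when substituting the placeholders, here is a minimal sketch (not part of the Hadoop sources) that generates the same six properties from one account name, key, and container. The function name build_auth_keys and the sample values are hypothetical. The sketch leaves out the XML declaration, the stylesheet line, the xmlns:xi attribute, and the description elements, on the assumption that the tests do not need them.

```python
import xml.etree.ElementTree as ET


def build_auth_keys(account: str, key: str, container: str) -> str:
    """Return azure-auth-keys.xml content for one storage account.

    Hypothetical helper: it only fills in the placeholders from the
    template above and does not contact Azure or validate the key.
    """
    abfs_host = f"{account}.dfs.core.windows.net"
    wasb_host = f"{account}.blob.core.windows.net"
    props = [
        ("fs.azure.abfs.account.name", abfs_host),
        (f"fs.azure.account.key.{abfs_host}", key),
        ("fs.azure.wasb.account.name", wasb_host),
        (f"fs.azure.account.key.{wasb_host}", key),
        ("fs.contract.test.fs.abfs", f"abfs://{container}@{abfs_host}"),
        ("fs.contract.test.fs.wasb", f"wasb://{container}@{wasb_host}"),
    ]
    root = ET.Element("configuration")
    for name, value in props:
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    # ET.indent only exists on Python 3.9+; skip pretty-printing otherwise.
    if hasattr(ET, "indent"):
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Sample values only; substitute your own account, key, and container.
    print(build_auth_keys("myaccount", "PLACEHOLDER_KEY", "mycontainer"))
```

Redirect the output to src/test/resources/azure-auth-keys.xml, and keep that file out of version control since it contains the account key.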


> Support Windows Azure Storage - Blob file system in Hadoop
> ----------------------------------------------------------
>
>                 Key: HADOOP-15407
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15407
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: fs/azure
>    Affects Versions: 3.2.0
>            Reporter: Esfandiar Manii
>            Assignee: Da Zhou
>            Priority: Blocker
>         Attachments: HADOOP-15407-001.patch, HADOOP-15407-002.patch, 
> HADOOP-15407-003.patch, HADOOP-15407-004.patch, HADOOP-15407-008.patch, 
> HADOOP-15407-HADOOP-15407-008.patch, HADOOP-15407-HADOOP-15407.006.patch, 
> HADOOP-15407-HADOOP-15407.007.patch, HADOOP-15407-HADOOP-15407.008.patch
>
>
> *Description*
>  This JIRA adds a new file system implementation, ABFS, for running Big Data
> and Analytics workloads against Azure Storage. This is a complete rewrite of
> the previous WASB driver with a heavy focus on optimizing both performance
> and cost.
>
>  *High level design*
>  At a high level, the code here extends the FileSystem class to provide an
> implementation for accessing blobs in Azure Storage. The scheme abfs is used
> for access over HTTP, and abfss for access over HTTPS. The following URI
> format is used to address individual paths:
>
> abfs[s]://<filesystem>@<account>.dfs.core.windows.net/<path>
>
>  ABFS is intended as a replacement for WASB. WASB is not deprecated but is in
> pure maintenance mode, and customers should upgrade to ABFS once it reaches
> General Availability later in CY18.
>  Benefits of ABFS include:
>  · Higher scale (capacity, throughput, and IOPS) for Big Data and Analytics
> workloads by allowing higher limits on storage accounts
>  · Removing any ramp-up time with Storage backend partitioning; blocks are
> now automatically sharded across partitions in the Storage backend
>    · This avoids the need for temporary/intermediate files, which increase
> cost (and framework complexity around committing jobs/tasks)
>  · Enabling much higher read and write throughput on single files (tens of
> Gbps by default)
>  · Still retaining all of the Azure Blob features customers are familiar
> with and expect, and gaining the benefits of future Blob features as well
>  ABFS incorporates Hadoop Filesystem metrics to monitor file system
> throughput and operations. Ambari metrics are not currently implemented for
> ABFS, but will be available soon.
>
>  *Credits and history*
>  Credit for this work goes to (hope I don't forget anyone): Shane Mainali,
> Thomas Marquardt, Zichen Sun, Georgi Chalakov, Esfandiar Manii, Amit Singh,
> Dana Kaban, Da Zhou, Junhua Gu, Saher Ahwal, Saurabh Pant, and James Baker.
>
>  *Test*
>  ABFS has gone through many test procedures including Hadoop file system
> contract tests, unit testing, functional testing, and manual testing. All the
> JUnit tests provided with the driver can run either sequentially or in
> parallel, which reduces testing time.
>  Besides unit tests, we have used ABFS as the default file system in Azure
> HDInsight. Azure HDInsight will very soon offer ABFS as a storage option.
> (HDFS is also used, but not as the default file system.) Various customer and
> test workloads have been run against clusters with such configurations for
> quite some time. Benchmarks such as Tera*, TPC-DS, Spark Streaming, Spark
> SQL, and others have been run for scenario, performance, and functional
> testing. Third parties and customers have also tested ABFS.
>  The current version reflects the version of the code tested and used in our
> production environment.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
