Thanks Bryan,
Working with the configuration you sent, what I needed to change was to
set fs.defaultFS to the wasb URL that we're working from.
Unfortunately this is a less than ideal solution, since we'll be pulling
files from multiple wasb URLs and ingesting them into an Accumulo
datastore, and I'm fairly certain that changing the defaultFS would
interfere with our local HDFS/Accumulo install. In addition, we're trying
to maintain all of this configuration with Ambari, which from what I can
tell only supports one core-site configuration file.
Is the only solution here to maintain multiple core-site.xml files, or is
there another way we can configure this?
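From what I can tell from the Hadoop docs, the account key properties are
per storage account, so a single core-site.xml might be able to cover
several stores as long as paths are given fully qualified. I haven't
verified this with NiFi's HDFS processors, and the account names below are
made up, but something like:

```xml
<configuration>
  <!-- one key property per storage account (account names here are hypothetical) -->
  <property>
    <name>fs.azure.account.key.firstaccount.blob.core.windows.net</name>
    <value>KEY_FOR_FIRST_ACCOUNT</value>
  </property>
  <property>
    <name>fs.azure.account.key.secondaccount.blob.core.windows.net</name>
    <value>KEY_FOR_SECOND_ACCOUNT</value>
  </property>
  <property>
    <name>fs.wasb.impl</name>
    <value>org.apache.hadoop.fs.azure.NativeAzureFileSystem</value>
  </property>
</configuration>
```

with the processor's directory given as a fully qualified URL, e.g.
wasb://somecontainer@firstaccount.blob.core.windows.net/some/dir, leaving
fs.defaultFS pointed at our local HDFS. Does that sound plausible?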
Thanks,
Austin
On 03/28/2017 01:41 PM, Bryan Bende wrote:
Austin,
Can you provide the full error message and stacktrace for the
IllegalArgumentException from nifi-app.log?
When you start the processor it creates a FileSystem instance based on
the config files provided to the processor, which in turn causes all
of the corresponding classes to load.
I'm not that familiar with Azure, but if "Azure blob store" is WASB,
then I have successfully done the following...
In core-site.xml:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>wasb://YOUR_CONTAINER@YOUR_ACCOUNT.blob.core.windows.net/</value>
</property>
<property>
<name>fs.azure.account.key.YOUR_ACCOUNT.blob.core.windows.net</name>
<value>YOUR_KEY</value>
</property>
<property>
<name>fs.AbstractFileSystem.wasb.impl</name>
<value>org.apache.hadoop.fs.azure.Wasb</value>
</property>
<property>
<name>fs.wasb.impl</name>
<value>org.apache.hadoop.fs.azure.NativeAzureFileSystem</value>
</property>
<property>
<name>fs.azure.skip.metrics</name>
<value>true</value>
</property>
</configuration>
In the 'Additional Classpath Resources' property of an HDFS processor,
point to a directory containing:
azure-storage-2.0.0.jar
commons-codec-1.6.jar
commons-lang3-3.3.2.jar
commons-logging-1.1.1.jar
guava-11.0.2.jar
hadoop-azure-2.7.3.jar
httpclient-4.2.5.jar
httpcore-4.2.4.jar
jackson-core-2.2.3.jar
jsr305-1.3.9.jar
slf4j-api-1.7.5.jar
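If it helps, here's a rough sketch of collecting those jars from an HDP
install into a directory for the processor (the source path is an
assumption and the jar versions on your install will differ, so adjust for
your layout):

```shell
#!/bin/sh
# Assumed location of the Hadoop client jars on an HDP install.
SRC=/usr/hdp/current/hadoop-client/lib
# Directory that 'Additional Classpath Resources' will point at.
DEST=./nifi-azure-classpath

mkdir -p "$DEST"
# Copy whichever version of each required jar the install provides;
# report any that are missing so they can be fetched separately.
for j in azure-storage commons-codec commons-lang3 commons-logging \
         guava hadoop-azure httpclient httpcore jackson-core jsr305 slf4j-api; do
  cp "$SRC/$j"-*.jar "$DEST"/ 2>/dev/null || echo "not found under $SRC: $j"
done
```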
Thanks,
Bryan
On Tue, Mar 28, 2017 at 1:15 PM, Austin Heyne <[email protected]> wrote:
Hi all,
Thanks for all the help you've given me so far. Today I'm trying to pull
files from an Azure blob store. I've done some reading on this, and from
previous tickets [1] and guides [2] it seems the recommended approach is to
place the jars required for the HDFS Azure protocol in 'Additional
Classpath Resources' and the Hadoop core-site and hdfs-site configs in
'Hadoop Configuration Resources'. I have my local HDFS properly configured
to access wasb URLs; I'm able to ls, copy to and from, etc. without problem.
Using the same HDFS config files, and trying both all the jars in my
hadoop-client/lib directory (HDP) and the jars recommended in [1], I'm
still seeing the "java.lang.IllegalArgumentException: Wrong FS: " error in
my NiFi logs and am unable to pull files from Azure blob storage.
Interestingly, it seems the processor is spinning up way too fast: the
errors appear in the log as soon as I start it, and I'm not sure how it
could be loading all of those jars that quickly.
Does anyone have any experience with this or recommendations to try?
Thanks,
Austin
[1] https://issues.apache.org/jira/browse/NIFI-1922
[2]
https://community.hortonworks.com/articles/71916/connecting-to-azure-data-lake-from-a-nifi-dataflow.html