zenfenan created NIFI-4826:
------------------------------
Summary: ListAzureBlobStorage doesn't write azure.blobname properly
Key: NIFI-4826
URL: https://issues.apache.org/jira/browse/NIFI-4826
Project: Apache NiFi
Issue Type: Improvement
Components: Extensions
Affects Versions: 1.5.0, 1.4.0, 1.3.0, 1.2.0
Reporter: zenfenan
ListAzureBlobStorage currently derives azure.blobname by taking the substring
of the blob's primary URI starting at primaryUri.lastIndexOf('/') + 1. For
example, if the blob is at the path
"mystorageaccountname.blob.core.windows.net/container-name/path/to/the/blob",
it writes azure.blobname as "blob". When a blob sits under a nested directory
structure like this one, that causes trouble for downstream processors such as
FetchAzureBlobStorage, which expect the full blob name, i.e.
"path/to/the/blob". Given just "blob", the fetch fails.
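
The truncation described above can be reproduced with plain string handling.
This is only a minimal sketch of the described behavior, not the processor's
actual code; the class and method names are hypothetical and no Azure SDK
calls are made.

```java
// Hypothetical illustration of the behavior described in this issue.
class BlobNameDemo {
    // Keep only what follows the last '/' of the primary URI.
    static String lastSegment(String primaryUri) {
        return primaryUri.substring(primaryUri.lastIndexOf('/') + 1);
    }

    public static void main(String[] args) {
        String uri =
            "mystorageaccountname.blob.core.windows.net/container-name/path/to/the/blob";
        // The directory part "path/to/the/" is lost.
        System.out.println(lastSegment(uri)); // prints "blob"
    }
}
```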
A workaround available right now is to use "ExecuteScript" to extract the
substring of the primary URI that follows the prefix
"https://"+storageAccountName+"/"+containerName+"/". A better approach would be
to use the CloudBlob.getName() API provided by the Azure SDK. It should be a
minor change, since the processor already uses this SDK and this class.
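
The prefix-stripping logic of the workaround could look roughly like the
following. It is a sketch in plain Java rather than an actual ExecuteScript
body; the class, method, and parameter names are hypothetical, and it assumes
the host part of the URI (here called accountHost) is known in full.

```java
// Hypothetical sketch of the workaround's string logic; not NiFi code.
class BlobNameWorkaround {
    // Returns everything after "https://<accountHost>/<containerName>/".
    static String fullBlobName(String primaryUri, String accountHost,
                               String containerName) {
        String prefix = "https://" + accountHost + "/" + containerName + "/";
        if (!primaryUri.startsWith(prefix)) {
            throw new IllegalArgumentException(
                "URI does not start with " + prefix);
        }
        return primaryUri.substring(prefix.length());
    }

    public static void main(String[] args) {
        String uri = "https://mystorageaccountname.blob.core.windows.net"
            + "/container-name/path/to/the/blob";
        System.out.println(fullBlobName(
            uri,
            "mystorageaccountname.blob.core.windows.net",
            "container-name")); // prints "path/to/the/blob"
    }
}
```

The result keeps the directory part of the name, which is the value
FetchAzureBlobStorage is described as expecting.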