[jira] [Commented] (NIFI-4360) Add support for Azure Data Lake Store (ADLS)

2021-05-18 Thread Bill SAndman (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17347109#comment-17347109
 ] 

Bill SAndman commented on NIFI-4360:


I believe this Jira can be closed in light of NIFI-7259 and its associated Jiras.

> Add support for Azure Data Lake Store (ADLS)
> 
>
> Key: NIFI-4360
> URL: https://issues.apache.org/jira/browse/NIFI-4360
> Project: Apache NiFi
>  Issue Type: New Feature
>  Components: Extensions
>Reporter: Milan Chandna
>Assignee: Milan Chandna
>Priority: Major
>  Labels: adls, azure, hdfs
>   Original Estimate: 336h
>  Time Spent: 10m
>  Remaining Estimate: 335h 50m
>
> Currently, ingress to and egress from an ADLS account is possible only through the HDFS 
> processors.
> This issue proposes separate processors that interact with ADLS accounts 
> directly.
> The benefits include:
> - Simpler configuration.
> - Helping users who are not familiar with HDFS.
> - Helping users who already access ADLS accounts directly.
> - Using the ADLS SDK rather than the HDFS client, which removes one layer.
> This can be achieved by adding dedicated ADLS processors.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (NIFI-4360) Add support for Azure Data Lake Store (ADLS)

2018-09-26 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/NIFI-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16628876#comment-16628876
 ] 

ASF GitHub Bot commented on NIFI-4360:
--

Github user mohitgargk commented on the issue:

https://github.com/apache/nifi/pull/2158
  
Hi @milanchandna, I was wondering whether you are still working on this. I need 
this feature for my use case and was going down the same path as you. It seems you are 
almost there. If you are still working on it, could you please sync your 
fork and rebase your commits?




[jira] [Commented] (NIFI-4360) Add support for Azure Data Lake Store (ADLS)

2018-03-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408541#comment-16408541
 ] 

ASF GitHub Bot commented on NIFI-4360:
--

Github user milanchandna commented on the issue:

https://github.com/apache/nifi/pull/2158
  
@jzonthemtn @pvillard31 Please review.




[jira] [Commented] (NIFI-4360) Add support for Azure Data Lake Store (ADLS)

2018-03-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408514#comment-16408514
 ] 

ASF GitHub Bot commented on NIFI-4360:
--

Github user milanchandna commented on the issue:

https://github.com/apache/nifi/pull/2158
  
Can I please get some feedback here? Thanks.




[jira] [Commented] (NIFI-4360) Add support for Azure Data Lake Store (ADLS)

2017-11-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16253324#comment-16253324
 ] 

ASF GitHub Bot commented on NIFI-4360:
--

Github user milanchandna commented on the issue:

https://github.com/apache/nifi/pull/2158
  
@joewitt, @jzonthemtn, @pvillard31 - Hey, did you get a chance to look 
at this?




[jira] [Commented] (NIFI-4360) Add support for Azure Data Lake Store (ADLS)

2017-10-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16196630#comment-16196630
 ] 

ASF GitHub Bot commented on NIFI-4360:
--

Github user milanchandna commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2158#discussion_r143404820
  
--- Diff: 
nifi-nar-bundles/nifi-azure-bundle/nifi-azure-processors/src/main/java/org/apache/nifi/processors/adls/ListADLSFile.java
 ---
@@ -0,0 +1,289 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.nifi.processors.adls;
+
+import com.microsoft.azure.datalake.store.ADLStoreClient;
+import com.microsoft.azure.datalake.store.DirectoryEntry;
+import com.microsoft.azure.datalake.store.DirectoryEntryType;
+import org.apache.nifi.annotation.behavior.*;
+import org.apache.nifi.annotation.documentation.CapabilityDescription;
+import org.apache.nifi.annotation.documentation.Tags;
+import org.apache.nifi.components.PropertyDescriptor;
+import org.apache.nifi.components.state.Scope;
+import org.apache.nifi.components.state.StateMap;
+import org.apache.nifi.distributed.cache.client.DistributedMapCacheClient;
+import org.apache.nifi.flowfile.FlowFile;
+import org.apache.nifi.flowfile.attributes.CoreAttributes;
+import org.apache.nifi.processor.ProcessContext;
+import org.apache.nifi.processor.ProcessSession;
+import org.apache.nifi.processor.ProcessorInitializationContext;
+import org.apache.nifi.processor.exception.ProcessException;
+import org.apache.nifi.processor.util.StandardValidators;
+
+import java.io.IOException;
+import java.nio.file.Path;
+import java.nio.file.Paths;
+import java.util.*;
+import java.util.concurrent.TimeUnit;
+
+import static org.apache.nifi.processors.adls.ADLSConstants.ACCOUNT_NAME;
+import static org.apache.nifi.processors.adls.ADLSConstants.REL_SUCCESS;
+
+@TriggerSerially
+@TriggerWhenEmpty
+@InputRequirement(InputRequirement.Requirement.INPUT_FORBIDDEN)
+@Tags({"hadoop", "ADLS", "get", "list", "ingest", "ingress", "source", "filesystem"})
+@CapabilityDescription("Retrieves a listing of files from ADLS. For each file that is "
++ "listed in ADLS, this processor creates a FlowFile that represents the ADLS file to be fetched in conjunction with FetchADLSFile. This Processor is "
++  "designed to run on Primary Node only in a cluster. If the primary node changes, the new Primary Node will pick up where the previous node left "
++  "off without duplicating all of the data.")
+@WritesAttributes({
+@WritesAttribute(attribute="filename", description="The name of the file that was read from ADLS"),
+@WritesAttribute(attribute="path", description="The path is set to the path of the file on ADLS"),
+@WritesAttribute(attribute="adls.owner", description="The user that owns the file"),
+@WritesAttribute(attribute="adls.group", description="The group that owns the file"),
+@WritesAttribute(attribute="adls.lastModified", description="The timestamp of when the file was last modified, as milliseconds since midnight Jan 1, 1970 UTC"),
+@WritesAttribute(attribute="adls.length", description="The number of bytes in the file"),
+@WritesAttribute(attribute="adls.expirytime", description="The number of bytes in the file"),
+@WritesAttribute(attribute="adls.permissions", description="The permissions for the file in ADLS. This is formatted as 3 characters for the owner, "
++ "3 for the group, and 3 for other users. For example rw-rw-r--")
+})
+@Stateful(scopes = Scope.CLUSTER, description = "After performing a listing of ADLS files, the latest timestamp of all the files transferred is stored. "
++ "This allows the Processor to list only files that have been added or modified after "
++ "this date the next time that the Processor is run, without having to 
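
The @Stateful description quoted above (store the latest timestamp seen, then list only files added or modified after it on the next run) could be sketched roughly as below. This is a plain-Java illustration, not the code from the pull request: the `IncrementalListing` class, its `Entry` type, and both method names are hypothetical, and no NiFi or ADLS SDK types are used.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch only: plain Java, no NiFi or ADLS SDK types.
final class IncrementalListing {

    /** Hypothetical stand-in for a remote file entry (not the SDK's DirectoryEntry). */
    static final class Entry {
        final String name;
        final long lastModifiedMillis;

        Entry(String name, long lastModifiedMillis) {
            this.name = name;
            this.lastModifiedMillis = lastModifiedMillis;
        }
    }

    /** Returns the entries modified strictly after the previously stored timestamp. */
    static List<Entry> newerThan(List<Entry> entries, long lastSeenMillis) {
        List<Entry> result = new ArrayList<>();
        for (Entry entry : entries) {
            if (entry.lastModifiedMillis > lastSeenMillis) {
                result.add(entry);
            }
        }
        return result;
    }

    /** Returns the newest timestamp among the listed entries, or the previous value if none were listed. */
    static long latestTimestamp(List<Entry> listed, long previousMillis) {
        long latest = previousMillis;
        for (Entry entry : listed) {
            latest = Math.max(latest, entry.lastModifiedMillis);
        }
        return latest;
    }

    public static void main(String[] args) {
        List<Entry> remote = Arrays.asList(
                new Entry("a.txt", 100L),
                new Entry("b.txt", 200L),
                new Entry("c.txt", 300L));
        long stored = 100L; // assumed value persisted after the previous run
        List<Entry> fresh = newerThan(remote, stored);
        for (Entry entry : fresh) {
            System.out.println(entry.name);
        }
        stored = latestTimestamp(fresh, stored); // the value to persist for the next run
        System.out.println(stored);
    }
}
```

The strict comparison is a simplification: a file that arrives later with exactly the stored timestamp would be skipped, so a real processor would need some way to handle that case.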

[jira] [Commented] (NIFI-4360) Add support for Azure Data Lake Store (ADLS)

2017-10-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16196631#comment-16196631
 ] 

ASF GitHub Bot commented on NIFI-4360:
--

Github user milanchandna commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2158#discussion_r143404854
  
--- Diff: 
nifi-nar-bundles/nifi-azure-bundle/nifi-azure-processors/src/main/java/org/apache/nifi/processors/adls/PutADLSFile.java
 ---
@@ -0,0 +1,267 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.nifi.processors.adls;
+
+import com.microsoft.azure.datalake.store.ADLFileOutputStream;
+import com.microsoft.azure.datalake.store.ADLStoreClient;
+import com.microsoft.azure.datalake.store.IfExists;
+import com.microsoft.azure.datalake.store.acl.AclEntry;
+import org.apache.nifi.annotation.behavior.*;
+import org.apache.nifi.annotation.documentation.CapabilityDescription;
+import org.apache.nifi.annotation.documentation.Tags;
+import org.apache.nifi.components.AllowableValue;
+import org.apache.nifi.components.PropertyDescriptor;
+import org.apache.nifi.flowfile.FlowFile;
+import org.apache.nifi.flowfile.attributes.CoreAttributes;
+import org.apache.nifi.processor.ProcessContext;
+import org.apache.nifi.processor.ProcessSession;
+import org.apache.nifi.processor.ProcessorInitializationContext;
+import org.apache.nifi.processor.exception.ProcessException;
+import org.apache.nifi.processor.util.StandardValidators;
+import org.apache.nifi.util.StopWatch;
+import org.apache.nifi.util.StringUtils;
+
+import java.io.IOException;
+import java.nio.file.Path;
+import java.nio.file.Paths;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.List;
+import java.util.concurrent.TimeUnit;
+import java.util.stream.Collectors;
+
+import static org.apache.nifi.processors.adls.ADLSConstants.*;
+
+@InputRequirement(InputRequirement.Requirement.INPUT_REQUIRED)
+@Tags({"hadoop", "ADLS", "put", "copy", "filesystem", "restricted"})
--- End diff --

done.




[jira] [Commented] (NIFI-4360) Add support for Azure Data Lake Store (ADLS)

2017-10-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16196627#comment-16196627
 ] 

ASF GitHub Bot commented on NIFI-4360:
--

Github user milanchandna commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2158#discussion_r143404758
  
--- Diff: 
nifi-nar-bundles/nifi-azure-bundle/nifi-azure-processors/src/main/java/org/apache/nifi/processors/adls/ADLSConstants.java
 ---
@@ -0,0 +1,90 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.nifi.processors.adls;
+
+import org.apache.nifi.components.PropertyDescriptor;
+import org.apache.nifi.processor.Relationship;
+import org.apache.nifi.processor.util.StandardValidators;
+
+
+public class ADLSConstants {
+
+public static final int CHUNK_SIZE_IN_BYTES = 400;
+
+public static final PropertyDescriptor PATH_NAME = new PropertyDescriptor.Builder()
+.name("Path")
+.description("Path for file in Azure Data Lake, e.g. /adlshome/")
+.required(true)
+.expressionLanguageSupported(true)
+.addValidator(StandardValidators.NON_EMPTY_VALIDATOR)
+.build();
+
+public static final PropertyDescriptor RETRY_ON_FAIL = new PropertyDescriptor.Builder()
+.name("Overwrite policy")
+.description("How many times to retry if read fails per chunk read, defaults to 3")
+.required(true)
+//TODO add Integer validator
--- End diff --

The property wasn't being used anymore, so I removed it.




[jira] [Commented] (NIFI-4360) Add support for Azure Data Lake Store (ADLS)

2017-10-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16196628#comment-16196628
 ] 

ASF GitHub Bot commented on NIFI-4360:
--

Github user milanchandna commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2158#discussion_r143404785
  
--- Diff: 
nifi-nar-bundles/nifi-azure-bundle/nifi-azure-processors/src/main/java/org/apache/nifi/processors/adls/ListADLSFile.java
 ---
@@ -0,0 +1,289 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.nifi.processors.adls;
+
+import com.microsoft.azure.datalake.store.ADLStoreClient;
+import com.microsoft.azure.datalake.store.DirectoryEntry;
+import com.microsoft.azure.datalake.store.DirectoryEntryType;
+import org.apache.nifi.annotation.behavior.*;
+import org.apache.nifi.annotation.documentation.CapabilityDescription;
+import org.apache.nifi.annotation.documentation.Tags;
+import org.apache.nifi.components.PropertyDescriptor;
+import org.apache.nifi.components.state.Scope;
+import org.apache.nifi.components.state.StateMap;
+import org.apache.nifi.distributed.cache.client.DistributedMapCacheClient;
+import org.apache.nifi.flowfile.FlowFile;
+import org.apache.nifi.flowfile.attributes.CoreAttributes;
+import org.apache.nifi.processor.ProcessContext;
+import org.apache.nifi.processor.ProcessSession;
+import org.apache.nifi.processor.ProcessorInitializationContext;
+import org.apache.nifi.processor.exception.ProcessException;
+import org.apache.nifi.processor.util.StandardValidators;
+
+import java.io.IOException;
+import java.nio.file.Path;
+import java.nio.file.Paths;
+import java.util.*;
+import java.util.concurrent.TimeUnit;
+
+import static org.apache.nifi.processors.adls.ADLSConstants.ACCOUNT_NAME;
+import static org.apache.nifi.processors.adls.ADLSConstants.REL_SUCCESS;
+
+@TriggerSerially
+@TriggerWhenEmpty
+@InputRequirement(InputRequirement.Requirement.INPUT_FORBIDDEN)
+@Tags({"hadoop", "ADLS", "get", "list", "ingest", "ingress", "source", "filesystem"})
--- End diff --

done.




[jira] [Commented] (NIFI-4360) Add support for Azure Data Lake Store (ADLS)

2017-10-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16187500#comment-16187500
 ] 

ASF GitHub Bot commented on NIFI-4360:
--

Github user milanchandna commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2158#discussion_r142037960
  
--- Diff: 
nifi-nar-bundles/nifi-azure-bundle/nifi-azure-processors/src/main/java/org/apache/nifi/processors/adls/PutADLSFile.java
 ---
@@ -0,0 +1,282 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.nifi.processors.adls;
+
+import com.microsoft.azure.datalake.store.ADLFileOutputStream;
+import com.microsoft.azure.datalake.store.ADLStoreClient;
+import com.microsoft.azure.datalake.store.IfExists;
+import com.microsoft.azure.datalake.store.acl.AclEntry;
+import org.apache.nifi.annotation.behavior.InputRequirement;
+import org.apache.nifi.annotation.behavior.ReadsAttribute;
+import org.apache.nifi.annotation.behavior.ReadsAttributes;
+import org.apache.nifi.annotation.behavior.WritesAttribute;
+import org.apache.nifi.annotation.behavior.WritesAttributes;
+import org.apache.nifi.annotation.behavior.Restricted;
+import org.apache.nifi.annotation.documentation.CapabilityDescription;
+import org.apache.nifi.annotation.documentation.SeeAlso;
+import org.apache.nifi.annotation.documentation.Tags;
+import org.apache.nifi.components.AllowableValue;
+import org.apache.nifi.components.PropertyDescriptor;
+import org.apache.nifi.flowfile.FlowFile;
+import org.apache.nifi.flowfile.attributes.CoreAttributes;
+import org.apache.nifi.processor.ProcessContext;
+import org.apache.nifi.processor.ProcessSession;
+import org.apache.nifi.processor.ProcessorInitializationContext;
+import org.apache.nifi.processor.exception.ProcessException;
+import org.apache.nifi.processor.util.StandardValidators;
+import org.apache.nifi.util.StopWatch;
+import org.apache.nifi.util.StringUtils;
+
+import java.io.IOException;
+import java.nio.file.Path;
+import java.nio.file.Paths;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.List;
+import java.util.concurrent.TimeUnit;
+import java.util.stream.Collectors;
+
+import static org.apache.nifi.processors.adls.ADLSConstants.ERR_ACL_ENTRY;
+import static org.apache.nifi.processors.adls.ADLSConstants.ERR_FLOWFILE_CORE_ATTR_FILENAME;
+import static org.apache.nifi.processors.adls.ADLSConstants.REL_FAILURE;
+import static org.apache.nifi.processors.adls.ADLSConstants.FILE_NAME_ATTRIBUTE;
+import static org.apache.nifi.processors.adls.ADLSConstants.CHUNK_SIZE_IN_BYTES;
+import static org.apache.nifi.processors.adls.ADLSConstants.ADLS_FILE_PATH_ATTRIBUTE;
+import static org.apache.nifi.processors.adls.ADLSConstants.REL_SUCCESS;
+
+@InputRequirement(InputRequirement.Requirement.INPUT_REQUIRED)
+@Tags({"azure", "hadoop", "ADLS", "put", "egress", "copy", "filesystem", "restricted"})
+@CapabilityDescription("Write FlowFile data to Azure Data Lake Store (ADLS)")
+@SeeAlso({ListADLSFile.class, FetchADLSFile.class})
+@ReadsAttributes({
+@ReadsAttribute(attribute = "filename", description = "The name of the file written to ADLS comes from the value of this attribute."),
+})
+@WritesAttributes({
+@WritesAttribute(attribute = "filename", description = "The name of the file written to ADLS is stored in this attribute."),
+@WritesAttribute(attribute = "absolute.adls.path", description = "The absolute path to the file on ADLS is stored in this attribute.")
+})
+@Restricted("Provides operator the ability to write to any file that NiFi has access to in ADLS.")
+public class PutADLSFile extends ADLSAbstractProcessor {
+
+private static final String PART_FILE_EXTENSION = ".nifipart";
+private static final String REPLACE_RESOLUTION = "replace";
+private static final 

[jira] [Commented] (NIFI-4360) Add support for Azure Data Lake Store (ADLS)

2017-09-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16176476#comment-16176476
 ] 

ASF GitHub Bot commented on NIFI-4360:
--

Github user milanchandna commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2158#discussion_r140503820
  
--- Diff: 
nifi-nar-bundles/nifi-azure-bundle/nifi-azure-processors/src/test/java/org/apache/nifi/processors/adls/TestPutADLSFile.java
 ---
@@ -0,0 +1,522 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.nifi.processors.adls;
+
+import com.microsoft.azure.datalake.store.ADLStoreClient;
+import com.microsoft.azure.datalake.store.ADLStoreOptions;
+import okhttp3.mockwebserver.MockResponse;
+import okhttp3.mockwebserver.MockWebServer;
+import okhttp3.mockwebserver.QueueDispatcher;
+import okhttp3.mockwebserver.RecordedRequest;
+import org.apache.nifi.processor.Processor;
+import org.apache.nifi.reporting.InitializationException;
+import org.apache.nifi.util.TestRunner;
+import org.apache.nifi.util.TestRunners;
+import org.hamcrest.CoreMatchers;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Test;
+
+import java.io.ByteArrayInputStream;
+import java.io.IOException;
+import java.io.InputStream;
+import java.nio.file.Paths;
+import java.util.HashMap;
+import java.util.Map;
+import java.util.concurrent.TimeUnit;
+
+
+public class TestPutADLSFile {
+
+private MockWebServer server;
+private ADLStoreClient client;
+private Processor processor;
+private TestRunner runner;
+Map<String, String> fileAttributes;
+
+@Before
+public void init() throws IOException, InitializationException {
+server = new MockWebServer();
+QueueDispatcher dispatcher = new QueueDispatcher();
+dispatcher.setFailFast(new MockResponse().setResponseCode(400));
+server.setDispatcher(dispatcher);
+server.start();
+String accountFQDN = server.getHostName() + ":" + server.getPort();
+String dummyToken = "testDummyAadToken";
+
+client = ADLStoreClient.createClient(accountFQDN, dummyToken);
+client.setOptions(new ADLStoreOptions().setInsecureTransport());
+
+processor = new PutADLSWithMockClient(client);
+
+runner = TestRunners.newTestRunner(processor);
+
+runner.setProperty(ADLSConstants.ACCOUNT_NAME, accountFQDN);
+runner.setProperty(ADLSConstants.CLIENT_ID, "foobar");
+runner.setProperty(ADLSConstants.CLIENT_SECRET, "foobar");
+runner.setProperty(ADLSConstants.AUTH_TOKEN_ENDPOINT, "foobar");
+runner.setProperty(PutADLSFile.DIRECTORY, "/sample/");
+runner.setProperty(PutADLSFile.CONFLICT_RESOLUTION, PutADLSFile.FAIL_RESOLUTION_AV);
+runner.removeProperty(PutADLSFile.UMASK);
+runner.removeProperty(PutADLSFile.ACL);
+
+fileAttributes = new HashMap<>();
+fileAttributes.put("filename", "sample.txt");
+fileAttributes.put("path", "/root/");
+}
+
+@Test
+public void testPutConflictFail() throws IOException, InterruptedException {
+
+//for create call
+server.enqueue(new MockResponse().setResponseCode(200));
+//for append call(internal call while writing to stream)
+server.enqueue(new MockResponse().setResponseCode(200));
+//for rename call(since its fail conflict)
+server.enqueue(new MockResponse().setResponseCode(200).setBody("{\"boolean\":true}"));
+//for delete call(removing temp file)
+server.enqueue(new MockResponse().setResponseCode(200));
+
+//String bodySimpleFileContent = new String(Files.readAllBytes(Paths.get(bodySimpleFileName)));
--- End diff --

yes, removed.
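
The mocked calls in the test quoted above (create, append, rename, delete of a temp file), together with the ".nifipart" extension that appears in the PutADLSFile diff elsewhere in this thread, suggest a write-to-temp-then-rename pattern. A rough sketch of that pattern on the local filesystem is below. It is an illustration under assumptions, not the processor's code: the `PartFileWrite` class and `putFailOnConflict` method are hypothetical, `java.nio.file` stands in for the ADLS client, and whether the pull request performs exactly these steps is not visible in the quoted diff.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

// Sketch only: the local filesystem stands in for the remote store.
final class PartFileWrite {

    static final String PART_FILE_EXTENSION = ".nifipart";

    /**
     * Writes the content to a temporary sibling file and then renames it to the final name.
     * Returns true if the final file was created; if a file with that name already exists,
     * the temporary file is deleted and false is returned ("fail" conflict resolution).
     */
    static boolean putFailOnConflict(Path directory, String filename, byte[] content) throws IOException {
        Path temp = directory.resolve(filename + PART_FILE_EXTENSION);
        Path target = directory.resolve(filename);
        Files.write(temp, content); // create the temp file and write the data
        try {
            Files.move(temp, target); // no REPLACE_EXISTING, so an existing target is a conflict
            return true;
        } catch (FileAlreadyExistsException conflict) {
            Files.delete(temp); // clean up the temp file when the rename is refused
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        Path directory = Files.createTempDirectory("adls-sketch");
        byte[] content = "hello".getBytes(StandardCharsets.UTF_8);
        System.out.println(putFailOnConflict(directory, "sample.txt", content));
        System.out.println(putFailOnConflict(directory, "sample.txt", content));
    }
}
```

Writing under a temporary name first means a reader of the target directory never sees a half-written file under the final name.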



[jira] [Commented] (NIFI-4360) Add support for Azure Data Lake Store (ADLS)

2017-09-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16176475#comment-16176475
 ] 

ASF GitHub Bot commented on NIFI-4360:
--

Github user milanchandna commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2158#discussion_r140503615
  
--- Diff: 
nifi-nar-bundles/nifi-azure-bundle/nifi-azure-processors/src/main/java/org/apache/nifi/processors/adls/PutADLSFile.java
 ---
@@ -0,0 +1,267 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.nifi.processors.adls;
+
+import com.microsoft.azure.datalake.store.ADLFileOutputStream;
+import com.microsoft.azure.datalake.store.ADLStoreClient;
+import com.microsoft.azure.datalake.store.IfExists;
+import com.microsoft.azure.datalake.store.acl.AclEntry;
+import org.apache.nifi.annotation.behavior.*;
+import org.apache.nifi.annotation.documentation.CapabilityDescription;
+import org.apache.nifi.annotation.documentation.Tags;
+import org.apache.nifi.components.AllowableValue;
+import org.apache.nifi.components.PropertyDescriptor;
+import org.apache.nifi.flowfile.FlowFile;
+import org.apache.nifi.flowfile.attributes.CoreAttributes;
+import org.apache.nifi.processor.ProcessContext;
+import org.apache.nifi.processor.ProcessSession;
+import org.apache.nifi.processor.ProcessorInitializationContext;
+import org.apache.nifi.processor.exception.ProcessException;
+import org.apache.nifi.processor.util.StandardValidators;
+import org.apache.nifi.util.StopWatch;
+import org.apache.nifi.util.StringUtils;
+
+import java.io.IOException;
+import java.nio.file.Path;
+import java.nio.file.Paths;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.List;
+import java.util.concurrent.TimeUnit;
+import java.util.stream.Collectors;
+
+import static org.apache.nifi.processors.adls.ADLSConstants.*;
+
+@InputRequirement(InputRequirement.Requirement.INPUT_REQUIRED)
+@Tags({"hadoop", "ADLS", "put", "copy", "filesystem", "restricted"})
+@CapabilityDescription("Write FlowFile data to Azure Data Lake Store (ADLS)")
+@ReadsAttributes({
+@ReadsAttribute(attribute = "filename", description = "The name of the file written to ADLS comes from the value of this attribute."),
+@ReadsAttribute(attribute = "filepath", description = "The relative path to the file on ADLS comes from the value of this attribute.")
+})
+@WritesAttributes({
+@WritesAttribute(attribute = "filename", description = "The name of the file written to ADLS is stored in this attribute."),
+@WritesAttribute(attribute = "absolute.adls.path", description = "The absolute path to the file on ADLS is stored in this attribute.")
+})
+@Restricted("Provides operator the ability to write to any file that NiFi has access to in ADLS.")
+public class PutADLSFile extends ADLSAbstractProcessor {
+
+private static final String PART_FILE_EXTENSION = ".nifipart";
+private static final String REPLACE_RESOLUTION = "replace";
+private static final String FAIL_RESOLUTION = "fail";
+private static final String APPEND_RESOLUTION = "append";
+
+protected static final AllowableValue REPLACE_RESOLUTION_AV = new AllowableValue(REPLACE_RESOLUTION,
+REPLACE_RESOLUTION, "Replaces the existing file if any.");
+protected static final AllowableValue FAIL_RESOLUTION_AV = new AllowableValue(FAIL_RESOLUTION, FAIL_RESOLUTION,
+"Penalizes the flow file and routes it to failure.");
+protected static final AllowableValue APPEND_RESOLUTION_AV = new AllowableValue(APPEND_RESOLUTION, APPEND_RESOLUTION,
+"Appends to the existing file if any, creates a new file otherwise.");
+
+// properties
+protected static final PropertyDescriptor CONFLICT_RESOLUTION = new PropertyDescriptor.Builder()
+
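
The three conflict-resolution values described in the diff above (replace, fail, append) can be illustrated with a minimal sketch. It is a stand-in under stated assumptions, not the PR's implementation: it works on the local filesystem through `java.nio.file` instead of the ADLS SDK, and the class `ConflictResolutionSketch`, the `Policy` enum, and the `put` and `demo` helpers are hypothetical names. The step of writing to a `.nifipart` file and then renaming it is only inferred from the `PART_FILE_EXTENSION` constant and may not match what the processor actually does.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

// Hypothetical local-filesystem stand-in; none of this is the ADLS SDK or the PR's code.
class ConflictResolutionSketch {

    enum Policy { REPLACE, FAIL, APPEND }

    // Returns true if the content was written, false if FAIL hit an existing file.
    static boolean put(Path target, byte[] content, Policy policy) {
        try {
            if (policy == Policy.APPEND) {
                // Append to the existing file, or create it if it is missing.
                Files.write(target, content, StandardOpenOption.CREATE, StandardOpenOption.APPEND);
                return true;
            }
            // Write to a temporary part file first, then rename it into place.
            Path part = target.resolveSibling(target.getFileName() + ".nifipart");
            Files.write(part, content);
            try {
                if (policy == Policy.REPLACE) {
                    Files.move(part, target, StandardCopyOption.REPLACE_EXISTING);
                } else {
                    // Without REPLACE_EXISTING the move fails when the target exists.
                    Files.move(part, target);
                }
                return true;
            } catch (FileAlreadyExistsException e) {
                Files.deleteIfExists(part); // clean up the temporary file
                return false;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Runs all three policies against one file in a fresh temporary directory.
    static String demo() {
        try {
            Path target = Files.createTempDirectory("adls-sketch").resolve("sample.txt");
            boolean first = put(target, bytes("a"), Policy.FAIL);   // file is new
            boolean second = put(target, bytes("b"), Policy.FAIL);  // file already exists
            put(target, bytes("c"), Policy.REPLACE);
            put(target, bytes("d"), Policy.APPEND);
            String finalContent = new String(Files.readAllBytes(target), StandardCharsets.UTF_8);
            return first + " " + second + " " + finalContent;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In this sketch the "fail" case reports the conflict through a boolean; in the processor, per the allowable-value description, the flow file would instead be penalized and routed to failure.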

[jira] [Commented] (NIFI-4360) Add support for Azure Data Lake Store (ADLS)

2017-09-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16176472#comment-16176472
 ] 

ASF GitHub Bot commented on NIFI-4360:
--

Github user milanchandna commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2158#discussion_r140503030
  
--- Diff: 
nifi-nar-bundles/nifi-azure-bundle/nifi-azure-processors/src/main/java/org/apache/nifi/processors/adls/ADLSConstants.java
 ---
@@ -0,0 +1,90 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.nifi.processors.adls;
+
+import org.apache.nifi.components.PropertyDescriptor;
+import org.apache.nifi.processor.Relationship;
+import org.apache.nifi.processor.util.StandardValidators;
+
+
+public class ADLSConstants {
+
+public static final int CHUNK_SIZE_IN_BYTES = 400;
+
+public static final PropertyDescriptor PATH_NAME = new PropertyDescriptor.Builder()
+.name("Path")
+.description("Path for file in Azure Data Lake, e.g. /adlshome/")
+.required(true)
+.expressionLanguageSupported(true)
+.addValidator(StandardValidators.NON_EMPTY_VALIDATOR)
+.build();
+
+public static final PropertyDescriptor RETRY_ON_FAIL = new PropertyDescriptor.Builder()
+.name("Overwrite policy")
+.description("How many times to retry if read fails per chunk read, defaults to 3")
+.required(true)
+//TODO add Integer validator
+.defaultValue("3")
+.build();
+
+public static final PropertyDescriptor ACCOUNT_NAME = new PropertyDescriptor.Builder()
+.name("Account FQDN")
+.description("Azure account fully qualified domain name eg: accountname.azuredatalakestore.net")
+.required(true)
+.addValidator(StandardValidators.NON_EMPTY_VALIDATOR)
+.build();
+
+public static final PropertyDescriptor CLIENT_ID = new PropertyDescriptor.Builder()
+.name("Client ID")
+.description("Azure client ID")
+.required(true)
+.addValidator(StandardValidators.NON_EMPTY_VALIDATOR)
+.build();
+
+public static final PropertyDescriptor CLIENT_SECRET = new PropertyDescriptor.Builder()
+.name("Client secret")
+.description("Azure client secret")
+.required(true)
+.sensitive(true)
+.addValidator(StandardValidators.NON_EMPTY_VALIDATOR)
+.build();
+
+public static final PropertyDescriptor AUTH_TOKEN_ENDPOINT = new PropertyDescriptor.Builder()
+.name("Auth token endpoint")
+.description("Azure client secret")
--- End diff --

Removed it; it wasn't being used.


> Add support for Azure Data Lake Store (ADLS)
> 
>
> Key: NIFI-4360
> URL: https://issues.apache.org/jira/browse/NIFI-4360
> Project: Apache NiFi
>  Issue Type: New Feature
>  Components: Extensions
>Reporter: Milan Chandna
>Assignee: Milan Chandna
>  Labels: adls, azure, hdfs
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> Currently ingress and egress in ADLS account is possible only using HDFS 
> processors.
> Opening this feature to support separate processors for interaction with ADLS 
> accounts directly.
> Benefits are many like 
> - simple configuration.
> - Helping users not familiar with HDFS 
> - Helping users who currently are accessing ADLS accounts directly.
> - using the ADLS SDK rather than HDFS client, one lesser layer to go through.
> Can be achieved by adding separate ADLS processors.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (NIFI-4360) Add support for Azure Data Lake Store (ADLS)

2017-09-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16176249#comment-16176249
 ] 

ASF GitHub Bot commented on NIFI-4360:
--

Github user pvillard31 commented on the issue:

https://github.com/apache/nifi/pull/2158
  
@milanchandna - you can go into ``nifi-nar-bundles/nifi-azure-bundle`` and 
build only that module by running ``mvn clean install -Pcontrib-check`` from 
that directory.




[jira] [Commented] (NIFI-4360) Add support for Azure Data Lake Store (ADLS)

2017-09-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16176246#comment-16176246
 ] 

ASF GitHub Bot commented on NIFI-4360:
--

Github user milanchandna commented on the issue:

https://github.com/apache/nifi/pull/2158
  
Hey @pvillard31, thanks for the comments; I am looking into them.
When I run the full build, I currently get a failure in 
TestMinimalLockingWriteAheadLog:

testRecoverFileThatHasTrailingNULBytesAndTruncation(org.wali.TestMinimalLockingWriteAheadLog)

Because of this, I cannot reliably detect failures in my own module.
Any suggestions on how to skip the failure above?

-Milan.




[jira] [Commented] (NIFI-4360) Add support for Azure Data Lake Store (ADLS)

2017-09-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16176178#comment-16176178
 ] 

ASF GitHub Bot commented on NIFI-4360:
--

Github user pvillard31 commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2158#discussion_r140449455
  
--- Diff: 
nifi-nar-bundles/nifi-azure-bundle/nifi-azure-processors/src/test/java/org/apache/nifi/processors/adls/TestPutADLSFile.java
 ---
@@ -0,0 +1,522 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.nifi.processors.adls;
+
+import com.microsoft.azure.datalake.store.ADLStoreClient;
+import com.microsoft.azure.datalake.store.ADLStoreOptions;
+import okhttp3.mockwebserver.MockResponse;
+import okhttp3.mockwebserver.MockWebServer;
+import okhttp3.mockwebserver.QueueDispatcher;
+import okhttp3.mockwebserver.RecordedRequest;
+import org.apache.nifi.processor.Processor;
+import org.apache.nifi.reporting.InitializationException;
+import org.apache.nifi.util.TestRunner;
+import org.apache.nifi.util.TestRunners;
+import org.hamcrest.CoreMatchers;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Test;
+
+import java.io.ByteArrayInputStream;
+import java.io.IOException;
+import java.io.InputStream;
+import java.nio.file.Paths;
+import java.util.HashMap;
+import java.util.Map;
+import java.util.concurrent.TimeUnit;
+
+
+public class TestPutADLSFile {
+
+private MockWebServer server;
+private ADLStoreClient client;
+private Processor processor;
+private TestRunner runner;
+Map fileAttributes;
+
+@Before
+public void init() throws IOException, InitializationException {
+server = new MockWebServer();
+QueueDispatcher dispatcher = new QueueDispatcher();
+dispatcher.setFailFast(new MockResponse().setResponseCode(400));
+server.setDispatcher(dispatcher);
+server.start();
+String accountFQDN = server.getHostName() + ":" + server.getPort();
+String dummyToken = "testDummyAadToken";
+
+client = ADLStoreClient.createClient(accountFQDN, dummyToken);
+client.setOptions(new ADLStoreOptions().setInsecureTransport());
+
+processor = new PutADLSWithMockClient(client);
+
+runner = TestRunners.newTestRunner(processor);
+
+runner.setProperty(ADLSConstants.ACCOUNT_NAME, accountFQDN);
+runner.setProperty(ADLSConstants.CLIENT_ID, "foobar");
+runner.setProperty(ADLSConstants.CLIENT_SECRET, "foobar");
+runner.setProperty(ADLSConstants.AUTH_TOKEN_ENDPOINT, "foobar");
+runner.setProperty(PutADLSFile.DIRECTORY, "/sample/");
+runner.setProperty(PutADLSFile.CONFLICT_RESOLUTION, PutADLSFile.FAIL_RESOLUTION_AV);
+runner.removeProperty(PutADLSFile.UMASK);
+runner.removeProperty(PutADLSFile.ACL);
+
+fileAttributes = new HashMap<>();
+fileAttributes.put("filename", "sample.txt");
+fileAttributes.put("path", "/root/");
+}
+
+@Test
+public void testPutConflictFail() throws IOException, InterruptedException {
+
+//for create call
+server.enqueue(new MockResponse().setResponseCode(200));
+//for append call(internal call while writing to stream)
+server.enqueue(new MockResponse().setResponseCode(200));
+//for rename call(since its fail conflict)
+server.enqueue(new MockResponse().setResponseCode(200).setBody("{\"boolean\":true}"));
+//for delete call(removing temp file)
+server.enqueue(new MockResponse().setResponseCode(200));
+
+//String bodySimpleFileContent = new String(Files.readAllBytes(Paths.get(bodySimpleFileName)));
--- End diff --

Should this commented-out line be removed?



[jira] [Commented] (NIFI-4360) Add support for Azure Data Lake Store (ADLS)

2017-09-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16176177#comment-16176177
 ] 

ASF GitHub Bot commented on NIFI-4360:
--

Github user pvillard31 commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2158#discussion_r140449340
  
--- Diff: 
nifi-nar-bundles/nifi-azure-bundle/nifi-azure-processors/src/test/java/org/apache/nifi/processors/adls/TestPutADLSFile.java
 ---
@@ -0,0 +1,522 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.nifi.processors.adls;
+
+import com.microsoft.azure.datalake.store.ADLStoreClient;
+import com.microsoft.azure.datalake.store.ADLStoreOptions;
+import okhttp3.mockwebserver.MockResponse;
+import okhttp3.mockwebserver.MockWebServer;
+import okhttp3.mockwebserver.QueueDispatcher;
+import okhttp3.mockwebserver.RecordedRequest;
+import org.apache.nifi.processor.Processor;
+import org.apache.nifi.reporting.InitializationException;
+import org.apache.nifi.util.TestRunner;
+import org.apache.nifi.util.TestRunners;
+import org.hamcrest.CoreMatchers;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Test;
+
+import java.io.ByteArrayInputStream;
+import java.io.IOException;
+import java.io.InputStream;
+import java.nio.file.Paths;
+import java.util.HashMap;
+import java.util.Map;
+import java.util.concurrent.TimeUnit;
+
+
+public class TestPutADLSFile {
+
+private MockWebServer server;
+private ADLStoreClient client;
+private Processor processor;
+private TestRunner runner;
+Map fileAttributes;
+
+@Before
+public void init() throws IOException, InitializationException {
+server = new MockWebServer();
+QueueDispatcher dispatcher = new QueueDispatcher();
+dispatcher.setFailFast(new MockResponse().setResponseCode(400));
+server.setDispatcher(dispatcher);
+server.start();
+String accountFQDN = server.getHostName() + ":" + server.getPort();
+String dummyToken = "testDummyAadToken";
+
+client = ADLStoreClient.createClient(accountFQDN, dummyToken);
+client.setOptions(new ADLStoreOptions().setInsecureTransport());
+
+processor = new PutADLSWithMockClient(client);
+
+runner = TestRunners.newTestRunner(processor);
+
+runner.setProperty(ADLSConstants.ACCOUNT_NAME, accountFQDN);
+runner.setProperty(ADLSConstants.CLIENT_ID, "foobar");
+runner.setProperty(ADLSConstants.CLIENT_SECRET, "foobar");
+runner.setProperty(ADLSConstants.AUTH_TOKEN_ENDPOINT, "foobar");
+runner.setProperty(PutADLSFile.DIRECTORY, "/sample/");
+runner.setProperty(PutADLSFile.CONFLICT_RESOLUTION, PutADLSFile.FAIL_RESOLUTION_AV);
+runner.removeProperty(PutADLSFile.UMASK);
+runner.removeProperty(PutADLSFile.ACL);
+
+fileAttributes = new HashMap<>();
+fileAttributes.put("filename", "sample.txt");
+fileAttributes.put("path", "/root/");
+}
+
+@Test
+public void testPutConflictFail() throws IOException, InterruptedException {
+
+//for create call
+server.enqueue(new MockResponse().setResponseCode(200));
+//for append call(internal call while writing to stream)
+server.enqueue(new MockResponse().setResponseCode(200));
+//for rename call(since its fail conflict)
+server.enqueue(new MockResponse().setResponseCode(200).setBody("{\"boolean\":true}"));
+//for delete call(removing temp file)
+server.enqueue(new MockResponse().setResponseCode(200));
+
+//String bodySimpleFileContent = new String(Files.readAllBytes(Paths.get(bodySimpleFileName)));
+String bodySimpleFileContent = "sample content to be send to 
external adls 

[jira] [Commented] (NIFI-4360) Add support for Azure Data Lake Store (ADLS)

2017-09-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16176174#comment-16176174
 ] 

ASF GitHub Bot commented on NIFI-4360:
--

Github user pvillard31 commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2158#discussion_r140448400
  
--- Diff: 
nifi-nar-bundles/nifi-azure-bundle/nifi-azure-processors/src/main/java/org/apache/nifi/processors/adls/ListADLSFile.java
 ---
@@ -0,0 +1,289 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.nifi.processors.adls;
+
+import com.microsoft.azure.datalake.store.ADLStoreClient;
+import com.microsoft.azure.datalake.store.DirectoryEntry;
+import com.microsoft.azure.datalake.store.DirectoryEntryType;
+import org.apache.nifi.annotation.behavior.*;
+import org.apache.nifi.annotation.documentation.CapabilityDescription;
+import org.apache.nifi.annotation.documentation.Tags;
+import org.apache.nifi.components.PropertyDescriptor;
+import org.apache.nifi.components.state.Scope;
+import org.apache.nifi.components.state.StateMap;
+import org.apache.nifi.distributed.cache.client.DistributedMapCacheClient;
+import org.apache.nifi.flowfile.FlowFile;
+import org.apache.nifi.flowfile.attributes.CoreAttributes;
+import org.apache.nifi.processor.ProcessContext;
+import org.apache.nifi.processor.ProcessSession;
+import org.apache.nifi.processor.ProcessorInitializationContext;
+import org.apache.nifi.processor.exception.ProcessException;
+import org.apache.nifi.processor.util.StandardValidators;
+
+import java.io.IOException;
+import java.nio.file.Path;
+import java.nio.file.Paths;
+import java.util.*;
+import java.util.concurrent.TimeUnit;
+
+import static org.apache.nifi.processors.adls.ADLSConstants.ACCOUNT_NAME;
+import static org.apache.nifi.processors.adls.ADLSConstants.REL_SUCCESS;
+
+@TriggerSerially
+@TriggerWhenEmpty
+@InputRequirement(InputRequirement.Requirement.INPUT_FORBIDDEN)
+@Tags({"hadoop", "ADLS", "get", "list", "ingest", "ingress", "source", "filesystem"})
+@CapabilityDescription("Retrieves a listing of files from ADLS. For each file that is "
++ "listed in ADLS, this processor creates a FlowFile that represents the ADLS file to be fetched in conjunction with FetchADLSFile. This Processor is "
++  "designed to run on Primary Node only in a cluster. If the primary node changes, the new Primary Node will pick up where the previous node left "
++  "off without duplicating all of the data.")
+@WritesAttributes({
+@WritesAttribute(attribute="filename", description="The name of the file that was read from ADLS"),
+@WritesAttribute(attribute="path", description="The path is set to the path of the file on ADLS"),
+@WritesAttribute(attribute="adls.owner", description="The user that owns the file"),
+@WritesAttribute(attribute="adls.group", description="The group that owns the file"),
+@WritesAttribute(attribute="adls.lastModified", description="The timestamp of when the file was last modified, as milliseconds since midnight Jan 1, 1970 UTC"),
+@WritesAttribute(attribute="adls.length", description="The number of bytes in the file"),
+@WritesAttribute(attribute="adls.expirytime", description="The number of bytes in the file"),
+@WritesAttribute(attribute="adls.permissions", description="The permissions for the file in ADLS. This is formatted as 3 characters for the owner, "
++ "3 for the group, and 3 for other users. For example rw-rw-r--")
+})
+@Stateful(scopes = Scope.CLUSTER, description = "After performing a listing of ADLS files, the latest timestamp of all the files transferred is stored. "
++ "This allows the Processor to list only files that have been added or modified after "
++ "this date the next time that the Processor is run, without 
having to store 
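
The `@Stateful` description above outlines timestamp-based incremental listing: remember the latest modification time already listed, and on the next run emit only entries newer than it. A minimal sketch of that idea follows. It is not the PR's code: `IncrementalListingSketch`, `Entry`, and `listNew` are hypothetical names, an in-memory field stands in for cluster-scoped state, and a plain list stands in for the remote directory listing.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of timestamp-based incremental listing; not the PR's code.
class IncrementalListingSketch {

    // Stand-in for a remote directory entry (the real SDK type is not used here).
    static final class Entry {
        final String path;
        final long lastModified; // milliseconds since the epoch

        Entry(String path, long lastModified) {
            this.path = path;
            this.lastModified = lastModified;
        }
    }

    // Stand-in for the timestamp a processor would keep in cluster-scoped state.
    private long latestListed = -1L;

    // Returns only entries modified after the stored timestamp, then advances it.
    List<Entry> listNew(List<Entry> remote) {
        List<Entry> fresh = new ArrayList<>();
        long newest = latestListed;
        for (Entry e : remote) {
            if (e.lastModified > latestListed) {
                fresh.add(e);
                newest = Math.max(newest, e.lastModified);
            }
        }
        latestListed = newest;
        return fresh;
    }

    public static void main(String[] args) {
        IncrementalListingSketch lister = new IncrementalListingSketch();
        List<Entry> remote = new ArrayList<>();
        remote.add(new Entry("/sample/a.txt", 100L));
        remote.add(new Entry("/sample/b.txt", 200L));
        System.out.println(lister.listNew(remote).size()); // first run lists both entries
        remote.add(new Entry("/sample/c.txt", 300L));
        System.out.println(lister.listNew(remote).size()); // second run lists only c.txt
    }
}
```

One limitation of this simplified version: an entry that arrives later with exactly the same timestamp as the stored value is skipped, so a real implementation would need extra care around ties.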

[jira] [Commented] (NIFI-4360) Add support for Azure Data Lake Store (ADLS)

2017-09-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16176170#comment-16176170
 ] 

ASF GitHub Bot commented on NIFI-4360:
--

Github user pvillard31 commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2158#discussion_r140446942
  
--- Diff: 
nifi-nar-bundles/nifi-azure-bundle/nifi-azure-processors/src/main/java/org/apache/nifi/processors/adls/ADLSConstants.java
 ---
@@ -0,0 +1,90 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.nifi.processors.adls;
+
+import org.apache.nifi.components.PropertyDescriptor;
+import org.apache.nifi.processor.Relationship;
+import org.apache.nifi.processor.util.StandardValidators;
+
+
+public class ADLSConstants {
+
+public static final int CHUNK_SIZE_IN_BYTES = 400;
+
+public static final PropertyDescriptor PATH_NAME = new PropertyDescriptor.Builder()
+.name("Path")
--- End diff --

Could you use both ``.name()`` and ``.displayName()``?
The name cannot be changed once the processor is released, because that would 
break backward compatibility with existing workflows. The display name can 
still be changed, as it is only used in logs and the UI. We usually give the 
name field a script-friendly value (no whitespace, lower case, etc.). For instance: 
adls-overwrite-policy.
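
The reason for keeping the name stable can be sketched as follows. This is a simplified model, not NiFi code: `PropertySketch` is a hypothetical stand-in for a property descriptor, and the sketch assumes that a saved flow stores property values keyed by the property name.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the idea; PropertySketch is NOT NiFi's PropertyDescriptor.
class NameVersusDisplayName {

    static final class PropertySketch {
        final String name;        // stable identifier, assumed to be the key in saved flows
        final String displayName; // human-readable label for the UI and logs

        PropertySketch(String name, String displayName) {
            this.name = name;
            this.displayName = displayName;
        }

        // Looks up the configured value in a saved flow that is keyed by property name.
        String resolve(Map<String, String> savedFlow) {
            return savedFlow.get(name);
        }
    }

    public static void main(String[] args) {
        // A flow saved by an earlier release of the processor.
        Map<String, String> savedFlow = new HashMap<>();
        savedFlow.put("adls-overwrite-policy", "fail");

        PropertySketch relabelled = new PropertySketch("adls-overwrite-policy", "Conflict resolution");
        PropertySketch renamed = new PropertySketch("adls-conflict-resolution", "Conflict resolution");

        System.out.println(relabelled.resolve(savedFlow)); // value is still found
        System.out.println(renamed.resolve(savedFlow));    // lookup misses: the key changed
    }
}
```

Changing only the display name leaves the lookup key untouched, whereas changing the name orphans the value stored by existing flows.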




[jira] [Commented] (NIFI-4360) Add support for Azure Data Lake Store (ADLS)

2017-09-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16176171#comment-16176171
 ] 

ASF GitHub Bot commented on NIFI-4360:
--

Github user pvillard31 commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2158#discussion_r140447052
  
--- Diff: 
nifi-nar-bundles/nifi-azure-bundle/nifi-azure-processors/src/main/java/org/apache/nifi/processors/adls/ADLSConstants.java
 ---
@@ -0,0 +1,90 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.nifi.processors.adls;
+
+import org.apache.nifi.components.PropertyDescriptor;
+import org.apache.nifi.processor.Relationship;
+import org.apache.nifi.processor.util.StandardValidators;
+
+
+public class ADLSConstants {
+
+public static final int CHUNK_SIZE_IN_BYTES = 400;
+
+public static final PropertyDescriptor PATH_NAME = new PropertyDescriptor.Builder()
+.name("Path")
+.description("Path for file in Azure Data Lake, e.g. /adlshome/")
+.required(true)
+.expressionLanguageSupported(true)
+.addValidator(StandardValidators.NON_EMPTY_VALIDATOR)
+.build();
+
+public static final PropertyDescriptor RETRY_ON_FAIL = new PropertyDescriptor.Builder()
+.name("Overwrite policy")
+.description("How many times to retry if read fails per chunk read, defaults to 3")
+.required(true)
+//TODO add Integer validator
+.defaultValue("3")
+.build();
+
+public static final PropertyDescriptor ACCOUNT_NAME = new PropertyDescriptor.Builder()
+.name("Account FQDN")
+.description("Azure account fully qualified domain name eg: accountname.azuredatalakestore.net")
+.required(true)
+.addValidator(StandardValidators.NON_EMPTY_VALIDATOR)
+.build();
+
+public static final PropertyDescriptor CLIENT_ID = new PropertyDescriptor.Builder()
+.name("Client ID")
+.description("Azure client ID")
+.required(true)
+.addValidator(StandardValidators.NON_EMPTY_VALIDATOR)
+.build();
+
+public static final PropertyDescriptor CLIENT_SECRET = new PropertyDescriptor.Builder()
+.name("Client secret")
+.description("Azure client secret")
+.required(true)
+.sensitive(true)
+.addValidator(StandardValidators.NON_EMPTY_VALIDATOR)
+.build();
+
+public static final PropertyDescriptor AUTH_TOKEN_ENDPOINT = new PropertyDescriptor.Builder()
+.name("Auth token endpoint")
+.description("Azure client secret")
--- End diff --

This description is copy/pasted from the client secret property.




[jira] [Commented] (NIFI-4360) Add support for Azure Data Lake Store (ADLS)

2017-09-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16176175#comment-16176175
 ] 

ASF GitHub Bot commented on NIFI-4360:
--

Github user pvillard31 commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2158#discussion_r140447693
  
--- Diff: 
nifi-nar-bundles/nifi-azure-bundle/nifi-azure-processors/src/main/java/org/apache/nifi/processors/adls/ListADLSFile.java
 ---
@@ -0,0 +1,289 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.nifi.processors.adls;
+
+import com.microsoft.azure.datalake.store.ADLStoreClient;
+import com.microsoft.azure.datalake.store.DirectoryEntry;
+import com.microsoft.azure.datalake.store.DirectoryEntryType;
+import org.apache.nifi.annotation.behavior.*;
+import org.apache.nifi.annotation.documentation.CapabilityDescription;
+import org.apache.nifi.annotation.documentation.Tags;
+import org.apache.nifi.components.PropertyDescriptor;
+import org.apache.nifi.components.state.Scope;
+import org.apache.nifi.components.state.StateMap;
+import org.apache.nifi.distributed.cache.client.DistributedMapCacheClient;
+import org.apache.nifi.flowfile.FlowFile;
+import org.apache.nifi.flowfile.attributes.CoreAttributes;
+import org.apache.nifi.processor.ProcessContext;
+import org.apache.nifi.processor.ProcessSession;
+import org.apache.nifi.processor.ProcessorInitializationContext;
+import org.apache.nifi.processor.exception.ProcessException;
+import org.apache.nifi.processor.util.StandardValidators;
+
+import java.io.IOException;
+import java.nio.file.Path;
+import java.nio.file.Paths;
+import java.util.*;
+import java.util.concurrent.TimeUnit;
+
+import static org.apache.nifi.processors.adls.ADLSConstants.ACCOUNT_NAME;
+import static org.apache.nifi.processors.adls.ADLSConstants.REL_SUCCESS;
+
+@TriggerSerially
+@TriggerWhenEmpty
+@InputRequirement(InputRequirement.Requirement.INPUT_FORBIDDEN)
+@Tags({"hadoop", "ADLS", "get", "list", "ingest", "ingress", "source", "filesystem"})
--- End diff --

Could you add "azure"?




[jira] [Commented] (NIFI-4360) Add support for Azure Data Lake Store (ADLS)

2017-09-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16176176#comment-16176176
 ] 

ASF GitHub Bot commented on NIFI-4360:
--

Github user pvillard31 commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2158#discussion_r140448507
  
--- Diff: 
nifi-nar-bundles/nifi-azure-bundle/nifi-azure-processors/src/main/java/org/apache/nifi/processors/adls/PutADLSFile.java
 ---
@@ -0,0 +1,267 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.nifi.processors.adls;
+
+import com.microsoft.azure.datalake.store.ADLFileOutputStream;
+import com.microsoft.azure.datalake.store.ADLStoreClient;
+import com.microsoft.azure.datalake.store.IfExists;
+import com.microsoft.azure.datalake.store.acl.AclEntry;
+import org.apache.nifi.annotation.behavior.*;
+import org.apache.nifi.annotation.documentation.CapabilityDescription;
+import org.apache.nifi.annotation.documentation.Tags;
+import org.apache.nifi.components.AllowableValue;
+import org.apache.nifi.components.PropertyDescriptor;
+import org.apache.nifi.flowfile.FlowFile;
+import org.apache.nifi.flowfile.attributes.CoreAttributes;
+import org.apache.nifi.processor.ProcessContext;
+import org.apache.nifi.processor.ProcessSession;
+import org.apache.nifi.processor.ProcessorInitializationContext;
+import org.apache.nifi.processor.exception.ProcessException;
+import org.apache.nifi.processor.util.StandardValidators;
+import org.apache.nifi.util.StopWatch;
+import org.apache.nifi.util.StringUtils;
+
+import java.io.IOException;
+import java.nio.file.Path;
+import java.nio.file.Paths;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.List;
+import java.util.concurrent.TimeUnit;
+import java.util.stream.Collectors;
+
+import static org.apache.nifi.processors.adls.ADLSConstants.*;
+
+@InputRequirement(InputRequirement.Requirement.INPUT_REQUIRED)
+@Tags({"hadoop", "ADLS", "put", "copy", "filesystem", "restricted"})
--- End diff --

Same comment here




[jira] [Commented] (NIFI-4360) Add support for Azure Data Lake Store (ADLS)

2017-09-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16176172#comment-16176172
 ] 

ASF GitHub Bot commented on NIFI-4360:
--

Github user pvillard31 commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2158#discussion_r140446472
  
--- Diff: 
nifi-nar-bundles/nifi-azure-bundle/nifi-azure-processors/src/main/java/org/apache/nifi/processors/adls/ADLSConstants.java
 ---
@@ -0,0 +1,90 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.nifi.processors.adls;
+
+import org.apache.nifi.components.PropertyDescriptor;
+import org.apache.nifi.processor.Relationship;
+import org.apache.nifi.processor.util.StandardValidators;
+
+
+public class ADLSConstants {
+
+public static final int CHUNK_SIZE_IN_BYTES = 400;
+
+public static final PropertyDescriptor PATH_NAME = new PropertyDescriptor.Builder()
+.name("Path")
+.description("Path for file in Azure Data Lake, e.g. /adlshome/")
+.required(true)
+.expressionLanguageSupported(true)
+.addValidator(StandardValidators.NON_EMPTY_VALIDATOR)
+.build();
+
+public static final PropertyDescriptor RETRY_ON_FAIL = new PropertyDescriptor.Builder()
+.name("Overwrite policy")
+.description("How many times to retry if read fails per chunk read, defaults to 3")
+.required(true)
+//TODO add Integer validator
--- End diff --

TODO?
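
The TODO in the quoted diff leaves the retry property without any validation. As a minimal, stand-alone sketch of the check an integer validator would perform (the class `RetryCount` and its `parse` method are hypothetical and not part of the PR; inside NiFi itself, one of the built-in integer validators in `StandardValidators` would presumably be attached with `addValidator` instead):

```java
/** Hypothetical helper, not part of the PR: validates a retry-count property value. */
final class RetryCount {

    // Mirrors the default of "3" stated in the property description.
    static final int DEFAULT_RETRIES = 3;

    private RetryCount() {
    }

    /** Returns the parsed retry count, or the default when no value is given. */
    static int parse(String raw) {
        if (raw == null || raw.trim().isEmpty()) {
            return DEFAULT_RETRIES;
        }
        int value;
        try {
            value = Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("Retry count must be an integer: " + raw, e);
        }
        if (value < 0) {
            throw new IllegalArgumentException("Retry count must not be negative: " + value);
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(parse("3"));
        System.out.println(parse(null));
    }
}
```

Rejecting bad input at configuration time keeps a non-numeric value from surfacing later as a failure in the middle of a chunked read.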




[jira] [Commented] (NIFI-4360) Add support for Azure Data Lake Store (ADLS)

2017-09-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16176173#comment-16176173
 ] 

ASF GitHub Bot commented on NIFI-4360:
--

Github user pvillard31 commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2158#discussion_r140449021
  
--- Diff: 
nifi-nar-bundles/nifi-azure-bundle/nifi-azure-processors/src/main/java/org/apache/nifi/processors/adls/PutADLSFile.java
 ---
@@ -0,0 +1,267 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.nifi.processors.adls;
+
+import com.microsoft.azure.datalake.store.ADLFileOutputStream;
+import com.microsoft.azure.datalake.store.ADLStoreClient;
+import com.microsoft.azure.datalake.store.IfExists;
+import com.microsoft.azure.datalake.store.acl.AclEntry;
+import org.apache.nifi.annotation.behavior.*;
+import org.apache.nifi.annotation.documentation.CapabilityDescription;
+import org.apache.nifi.annotation.documentation.Tags;
+import org.apache.nifi.components.AllowableValue;
+import org.apache.nifi.components.PropertyDescriptor;
+import org.apache.nifi.flowfile.FlowFile;
+import org.apache.nifi.flowfile.attributes.CoreAttributes;
+import org.apache.nifi.processor.ProcessContext;
+import org.apache.nifi.processor.ProcessSession;
+import org.apache.nifi.processor.ProcessorInitializationContext;
+import org.apache.nifi.processor.exception.ProcessException;
+import org.apache.nifi.processor.util.StandardValidators;
+import org.apache.nifi.util.StopWatch;
+import org.apache.nifi.util.StringUtils;
+
+import java.io.IOException;
+import java.nio.file.Path;
+import java.nio.file.Paths;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.List;
+import java.util.concurrent.TimeUnit;
+import java.util.stream.Collectors;
+
+import static org.apache.nifi.processors.adls.ADLSConstants.*;
+
+@InputRequirement(InputRequirement.Requirement.INPUT_REQUIRED)
+@Tags({"hadoop", "ADLS", "put", "copy", "filesystem", "restricted"})
+@CapabilityDescription("Write FlowFile data to Azure Data Lake Store (ADLS)")
+@ReadsAttributes({
+@ReadsAttribute(attribute = "filename", description = "The name of the file written to ADLS comes from the value of this attribute."),
+@ReadsAttribute(attribute = "filepath", description = "The relative path to the file on ADLS comes from the value of this attribute.")
+})
+@WritesAttributes({
+@WritesAttribute(attribute = "filename", description = "The name of the file written to ADLS is stored in this attribute."),
+@WritesAttribute(attribute = "absolute.adls.path", description = "The absolute path to the file on ADLS is stored in this attribute.")
+})
+@Restricted("Provides operator the ability to write to any file that NiFi has access to in ADLS.")
+public class PutADLSFile extends ADLSAbstractProcessor {
+
+private static final String PART_FILE_EXTENSION = ".nifipart";
+private static final String REPLACE_RESOLUTION = "replace";
+private static final String FAIL_RESOLUTION = "fail";
+private static final String APPEND_RESOLUTION = "append";
+
+protected static final AllowableValue REPLACE_RESOLUTION_AV = new AllowableValue(REPLACE_RESOLUTION,
+REPLACE_RESOLUTION, "Replaces the existing file if any.");
+protected static final AllowableValue FAIL_RESOLUTION_AV = new AllowableValue(FAIL_RESOLUTION, FAIL_RESOLUTION,
+"Penalizes the flow file and routes it to failure.");
+protected static final AllowableValue APPEND_RESOLUTION_AV = new AllowableValue(APPEND_RESOLUTION, APPEND_RESOLUTION,
+"Appends to the existing file if any, creates a new file otherwise.");
+
+// properties
+protected static final PropertyDescriptor CONFLICT_RESOLUTION = new PropertyDescriptor.Builder()
+

[jira] [Commented] (NIFI-4360) Add support for Azure Data Lake Store (ADLS)

2017-09-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16176143#comment-16176143
 ] 

ASF GitHub Bot commented on NIFI-4360:
--

Github user milanchandna commented on the issue:

https://github.com/apache/nifi/pull/2158
  
Removed the test resource files (there were license issues) and resolved the 
dependencies on them. The test cases also contained environment-specific 
characters; I got rid of those. Please check now.




[jira] [Commented] (NIFI-4360) Add support for Azure Data Lake Store (ADLS)

2017-09-21 Thread Joseph Witt (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16175433#comment-16175433
 ] 

Joseph Witt commented on NIFI-4360:
---

java version "1.8.0_141"
Java(TM) SE Runtime Environment (build 1.8.0_141-b15)
Java HotSpot(TM) 64-Bit Server VM (build 25.141-b15, mixed mode)

Apache Maven 3.5.0 (ff8f5e7444045639af65f6095c62210b5713f426; 
2017-04-03T15:39:06-04:00)
Maven home: /Users/jwitt/Applications/apache-maven-3.5.0
Java version: 1.8.0_141, vendor: Oracle Corporation
Java home: /Library/Java/JavaVirtualMachines/jdk1.8.0_141.jdk/Contents/Home/jre
Default locale: en_US, platform encoding: UTF-8
OS name: "mac os x", version: "10.12.6", arch: "x86_64", family: "mac"'

This is the environmental info for where I ran the build/tests.  We need the 
tests to work on mac/win/lin or skip certain tests in environments they're not 
built for.
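
The failures reported in this thread expect URL-encoded backslashes (`%5C`) where the request actually contains `/`, which suggests the paths were assembled with the local file-system separator. A minimal sketch of building ADLS paths independently of the operating system (the helper `AdlsPaths.join` is hypothetical and not part of the PR):

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of the PR: joins ADLS path segments with '/'. */
final class AdlsPaths {

    private AdlsPaths() {
    }

    // ADLS paths always use '/', so the separator is hard-coded rather than
    // taken from the local file system (File.separator is '\' on Windows).
    static String join(String... segments) {
        List<String> parts = new ArrayList<>();
        for (String segment : segments) {
            // Treat both separators as delimiters so Windows-style input is normalized.
            for (String piece : segment.split("[/\\\\]+")) {
                if (!piece.isEmpty()) {
                    parts.add(piece);
                }
            }
        }
        return "/" + String.join("/", parts);
    }

    public static void main(String[] args) {
        System.out.println(join("sample", "sample.txt.nifipart"));
        System.out.println(join("\\sample\\", "sample.txt.nifipart"));
    }
}
```

With a helper like this, test assertions can compare against a literal such as `/sample/sample.txt.nifipart` on mac, Windows, and Linux alike.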



[jira] [Commented] (NIFI-4360) Add support for Azure Data Lake Store (ADLS)

2017-09-21 Thread Joseph Witt (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16175431#comment-16175431
 ] 

Joseph Witt commented on NIFI-4360:
---

Understood on the test files being desirable.  I'd like to avoid storing 
multi-MB test files.  Better to have them generated for the tests and destroyed 
afterwards.  Even if the text seems really benign, the rules for tracking the 
LICENSE and NOTICE also apply to test artifacts.  These would need to be 
licensed in our source LICENSE, and even if they're Apache-licensed we'd cite 
them.  It isn't clear to me that they are actually ASF licensed, and a quick 
Google search suggests they are not original works of this PR.  So, let's just 
make this simple and not pull in test files from elsewhere.
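
One way to generate such a file for a test and destroy it afterwards is sketched below. This is only an illustration under assumptions: the class `GeneratedTestFile`, the file prefix, and the 4 MB size are hypothetical and do not come from the PR.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper, not part of the PR: creates throwaway test data on demand. */
final class GeneratedTestFile {

    private GeneratedTestFile() {
    }

    /** Writes a temporary file of exactly sizeInBytes bytes filled with a repeating a-z pattern. */
    static Path create(long sizeInBytes) {
        try {
            Path file = Files.createTempFile("adls-test-", ".bin");
            byte[] block = new byte[8192];
            for (int i = 0; i < block.length; i++) {
                block[i] = (byte) ('a' + (i % 26));
            }
            try (OutputStream out = Files.newOutputStream(file)) {
                long remaining = sizeInBytes;
                while (remaining > 0) {
                    int n = (int) Math.min(block.length, remaining);
                    out.write(block, 0, n);
                    remaining -= n;
                }
            }
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = create(4L * 1024 * 1024);
        try {
            System.out.println(Files.size(file));
        } finally {
            // Destroy the generated file so nothing is left behind or checked in.
            Files.deleteIfExists(file);
        }
    }
}
```

Because the content is synthesized by the test itself, nothing needs to be cited in LICENSE or NOTICE and no multi-MB binary has to live in the repository.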

Thanks



[jira] [Commented] (NIFI-4360) Add support for Azure Data Lake Store (ADLS)

2017-09-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16175423#comment-16175423
 ] 

ASF GitHub Bot commented on NIFI-4360:
--

Github user joewitt commented on the issue:

https://github.com/apache/nifi/pull/2158
  
Lots of test failures.

Tests run: 19, Failures: 3, Errors: 0, Skipped: 0, Time elapsed: 0.847 sec <<< FAILURE! - in org.apache.nifi.processors.adls.TestPutADLSFile
testPutConflictReplace(org.apache.nifi.processors.adls.TestPutADLSFile)  Time elapsed: 0.102 sec  <<< FAILURE!
java.lang.AssertionError:
Expected: a string containing "%5Csample%5Csample.txt.nifipart"
 but: was "/webhdfs/v1/sample/sample.txt.nifipart?op=CREATE=DATA=true=true=3834110d-e3b1-41cc-9ff4-022da542ae4b=3834110d-e3b1-41cc-9ff4-022da542ae4b=2016-11-01"
at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
at org.junit.Assert.assertThat(Assert.java:956)
at org.junit.Assert.assertThat(Assert.java:923)
at org.apache.nifi.processors.adls.TestPutADLSFile.testPutConflictReplace(TestPutADLSFile.java:161)

testPutConflictAppend(org.apache.nifi.processors.adls.TestPutADLSFile)  Time elapsed: 0.01 sec  <<< FAILURE!
java.lang.AssertionError:
Expected: a string containing "%5Csample%5Csample.txt.nifipart"
 but: was "/webhdfs/v1/sample/sample.txt.nifipart?op=CREATE=DATA=true=true=0c697d9a-fa07-47b7-adfd-877b90e15717=0c697d9a-fa07-47b7-adfd-877b90e15717=2016-11-01"
at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
at org.junit.Assert.assertThat(Assert.java:956)
at org.junit.Assert.assertThat(Assert.java:923)
at org.apache.nifi.processors.adls.TestPutADLSFile.testPutConflictAppend(TestPutADLSFile.java:215)

testPutConflictFail(org.apache.nifi.processors.adls.TestPutADLSFile)  Time elapsed: 0 sec  <<< FAILURE!
java.lang.AssertionError:
Expected: a string containing "%5Csample%5Csample.txt.nifipart"
 but: was "/webhdfs/v1/sample/sample.txt.nifipart?op=CREATE=DATA=true=true=a6fe439c-b2f1-41c2-ba5d-e7d75836d53b=a6fe439c-b2f1-41c2-ba5d-e7d75836d53b=2016-11-01"
at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
at org.junit.Assert.assertThat(Assert.java:956)
at org.junit.Assert.assertThat(Assert.java:923)
at org.apache.nifi.processors.adls.TestPutADLSFile.testPutConflictFail(TestPutADLSFile.java:108)

Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.249 sec - in org.apache.nifi.ranger.authorization.TestRangerBasePluginWithPolicies
Running org.apache.nifi.ranger.authorization.TestRangerNiFiAuthorizer

Results :

Failed tests:
  TestListADLSFile.testFlowFileAttributes:233 expected:<[\]test> but was:<[/]test>
  TestPutADLSFile.testPutConflictAppend:215
Expected: a string containing "%5Csample%5Csample.txt.nifipart"
 but: was "/webhdfs/v1/sample/sample.txt.nifipart?op=CREATE=DATA=true=true=0c697d9a-fa07-47b7-adfd-877b90e15717=0c697d9a-fa07-47b7-adfd-877b90e15717=2016-11-01"
  TestPutADLSFile.testPutConflictFail:108
Expected: a string containing "%5Csample%5Csample.txt.nifipart"
 but: was "/webhdfs/v1/sample/sample.txt.nifipart?op=CREATE=DATA=true=true=a6fe439c-b2f1-41c2-ba5d-e7d75836d53b=a6fe439c-b2f1-41c2-ba5d-e7d75836d53b=2016-11-01"
  TestPutADLSFile.testPutConflictReplace:161
Expected: a string containing "%5Csample%5Csample.txt.nifipart"
 but: was "/webhdfs/v1/sample/sample.txt.nifipart?op=CREATE=DATA=true=true=3834110d-e3b1-41cc-9ff4-022da542ae4b=3834110d-e3b1-41cc-9ff4-022da542ae4b=2016-11-01"






[jira] [Commented] (NIFI-4360) Add support for Azure Data Lake Store (ADLS)

2017-09-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16175402#comment-16175402
 ] 

ASF GitHub Bot commented on NIFI-4360:
--

Github user milanchandna commented on the issue:

https://github.com/apache/nifi/pull/2158
  
Thanks @joewitt

These test resource files contain Lorem Ipsum text.
Still, I will create my own if required, but I don't want to delete them as 
they are required for an important test case.

Thanks.




[jira] [Commented] (NIFI-4360) Add support for Azure Data Lake Store (ADLS)

2017-09-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16175379#comment-16175379
 ] 

ASF GitHub Bot commented on NIFI-4360:
--

Github user joewitt commented on the issue:

https://github.com/apache/nifi/pull/2158
  
@milanchandna I am checking into the dependency tree and L&N.  Looks good so 
far.

However, please get rid of the davinci and davinci4MB files.  Their origin is 
unclear, and they look like they come from common ADLS test files.  We 
have to cite them as source material in such cases.  We're better off just 
making up our own original test files OR removing them altogether.

Thanks




[jira] [Commented] (NIFI-4360) Add support for Azure Data Lake Store (ADLS)

2017-09-21 Thread Atul Sikaria (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16175292#comment-16175292
 ] 

Atul Sikaria commented on NIFI-4360:


+1 for the ADLS interaction of the code.



[jira] [Commented] (NIFI-4360) Add support for Azure Data Lake Store (ADLS)

2017-09-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16169738#comment-16169738
 ] 

ASF GitHub Bot commented on NIFI-4360:
--

GitHub user milanchandna opened a pull request:

https://github.com/apache/nifi/pull/2158

NIFI-4360 Adding support for ADLS Processors. Feature includes List, …

…Get, Put processors. Until now, ADLS interaction was possible only through 
HDFS processors. Now users can ingress and egress their data directly using 
ADLS processors, which is much simpler and removes one layer to go through. 
This will also help users who are not familiar with Hadoop configuration.

Thank you for submitting a contribution to Apache NiFi.

In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:

### For all changes:
- [x] Is there a JIRA ticket associated with this PR? Is it referenced 
 in the commit message?

- [x] Does your PR title start with NIFI-XXXX where XXXX is the JIRA number 
you are trying to resolve? Pay particular attention to the hyphen "-" character.

- [x] Has your PR been rebased against the latest commit within the target 
branch (typically master)?

- [x] Is your initial contribution a single, squashed commit?

### For code changes:
- [ ] Have you ensured that the full suite of tests is executed via mvn 
-Pcontrib-check clean install at the root nifi folder? - There were failures in 
other modules 
testRecoverFileThatHasTrailingNULBytesAndTruncation(org.wali.TestMinimalLockingWriteAheadLog)
- [x] Have you written or updated unit tests to verify your changes?
- [x] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)? 
- [x] If applicable, have you updated the LICENSE file, including the main 
LICENSE file under nifi-assembly?
- [ ] If applicable, have you updated the NOTICE file, including the main 
NOTICE file found under nifi-assembly?
- [ ] If adding new Properties, have you added .displayName in addition to 
.name (programmatic access) for each of the new properties?

### For documentation related changes:
- [x] Have you ensured that format looks appropriate for the output in 
which it is rendered?

### Note:
Please ensure that once the PR is submitted, you check travis-ci for build 
issues and submit an update to your PR as soon as possible.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/milanchandna/nifi NIFI-4360

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/nifi/pull/2158.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2158


commit 4f449c55c99868bd8b96bc370186503f4928c11f
Author: Milan Chandna 
Date:   2017-09-18T07:00:17Z

NIFI-4360 Adding support for ADLS Processors. Feature includes List, Get, 
Put processors.



