[jira] [Commented] (NIFI-5289) NoClassDefFoundError for org.junit.Assert When Using nifi-mock
[ https://issues.apache.org/jira/browse/NIFI-5289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509215#comment-16509215 ]

ASF GitHub Bot commented on NIFI-5289:
--------------------------------------

Github user MartinPayne commented on the issue:

    https://github.com/apache/nifi/pull/2780

    @joewitt Compile scope is the correct scope in this case. If JUnit 4 were only used in the tests for NiFi Mock, test scope would be correct. However, the [NiFi Mock code uses JUnit 4 as a compile-time dependency](https://github.com/apache/nifi/blob/f8466cb16d6723ddc3bf5f0e7f8ce8a47d27cbe5/nifi-mock/src/main/java/org/apache/nifi/util/StandardProcessorTestRunner.java#L74), so JUnit 4 needs to be brought into consuming projects as a transitive dependency.

    As per the [Maven dependency scope table](https://maven.apache.org/guides/introduction/introduction-to-dependency-mechanism.html#Dependency_Scope), making it a compile-scope dependency means it is added to the test classpath of a consuming project if that project declares NiFi Mock with test scope. It would only get included all over the place if consuming projects declared NiFi Mock with compile or runtime scope. If that were the case here, NiFi Mock would also be getting included all over the place.

> NoClassDefFoundError for org.junit.Assert When Using nifi-mock
> --------------------------------------------------------------
>
>                 Key: NIFI-5289
>                 URL: https://issues.apache.org/jira/browse/NIFI-5289
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Core Framework
>    Affects Versions: 1.6.0
>            Reporter: Martin Payne
>            Priority: Minor
>
> When using the NiFi Mock framework but not using JUnit 4, tests fail with a
> NoClassDefFoundError for org.junit.Assert. This is because nifi-mock sets the
> scope of junit to "provided", which means it's not pulled into consuming
> projects as a transitive dependency. It should be set to "compile" so that
> users don't have to set an explicit JUnit dependency in their projects.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
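[Editor's illustration] The scope resolution described in the comment above can be sketched with a minimal, hypothetical POM fragment. This is not the actual nifi-mock POM; coordinates are abbreviated and versions omitted.

```xml
<!-- In nifi-mock's own POM: junit declared with compile scope
     (the proposed fix; it was previously "provided"). -->
<dependency>
  <groupId>junit</groupId>
  <artifactId>junit</artifactId>
  <scope>compile</scope>
</dependency>

<!-- In a hypothetical consuming project's POM: nifi-mock declared
     with test scope. Per Maven's scope table, a compile-scoped
     dependency of a test-scoped dependency resolves to test scope,
     so junit lands on the consumer's test classpath transitively. -->
<dependency>
  <groupId>org.apache.nifi</groupId>
  <artifactId>nifi-mock</artifactId>
  <scope>test</scope>
</dependency>
```

With the earlier "provided" scope, junit was not propagated at all, which matches the NoClassDefFoundError reported in this issue for projects that use JUnit 5 and therefore have no JUnit 4 on their test classpath.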
[jira] [Commented] (NIFI-5289) NoClassDefFoundError for org.junit.Assert When Using nifi-mock
[ https://issues.apache.org/jira/browse/NIFI-5289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509210#comment-16509210 ]

Martin Payne commented on NIFI-5289:
------------------------------------

[~mike.thomsen] I was writing some tests for a custom processor. We use JUnit 5, so JUnit 4 was not on the test classpath. I would expect the same behaviour with other test frameworks which aren't JUnit 4, too.

> NoClassDefFoundError for org.junit.Assert When Using nifi-mock
> --------------------------------------------------------------
>
>                 Key: NIFI-5289
>                 URL: https://issues.apache.org/jira/browse/NIFI-5289
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Core Framework
>    Affects Versions: 1.6.0
>            Reporter: Martin Payne
>            Priority: Minor
>
> When using the NiFi Mock framework but not using JUnit 4, tests fail with a
> NoClassDefFoundError for org.junit.Assert. This is because nifi-mock sets the
> scope of junit to "provided", which means it's not pulled into consuming
> projects as a transitive dependency. It should be set to "compile" so that
> users don't have to set an explicit JUnit dependency in their projects.
[jira] [Commented] (NIFI-5214) Add a REST lookup service
[ https://issues.apache.org/jira/browse/NIFI-5214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509110#comment-16509110 ]

ASF GitHub Bot commented on NIFI-5214:
--------------------------------------

Github user ijokarumawak commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2723#discussion_r194604410

--- Diff: nifi-nar-bundles/nifi-standard-services/nifi-lookup-services-bundle/nifi-lookup-services/src/main/java/org/apache/nifi/lookup/RestLookupService.java ---
@@ -0,0 +1,435 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.nifi.lookup;
+
+import com.burgstaller.okhttp.AuthenticationCacheInterceptor;
+import com.burgstaller.okhttp.CachingAuthenticatorDecorator;
+import com.burgstaller.okhttp.digest.CachingAuthenticator;
+import com.burgstaller.okhttp.digest.DigestAuthenticator;
+import okhttp3.Credentials;
+import okhttp3.MediaType;
+import okhttp3.OkHttpClient;
+import okhttp3.Request;
+import okhttp3.RequestBody;
+import okhttp3.Response;
+import org.apache.nifi.annotation.behavior.DynamicProperties;
+import org.apache.nifi.annotation.behavior.DynamicProperty;
+import org.apache.nifi.annotation.documentation.CapabilityDescription;
+import org.apache.nifi.annotation.documentation.Tags;
+import org.apache.nifi.annotation.lifecycle.OnDisabled;
+import org.apache.nifi.annotation.lifecycle.OnEnabled;
+import org.apache.nifi.attribute.expression.language.PreparedQuery;
+import org.apache.nifi.attribute.expression.language.Query;
+import org.apache.nifi.components.PropertyDescriptor;
+import org.apache.nifi.components.Validator;
+import org.apache.nifi.controller.AbstractControllerService;
+import org.apache.nifi.controller.ConfigurationContext;
+import org.apache.nifi.expression.ExpressionLanguageScope;
+import org.apache.nifi.processor.util.StandardValidators;
+import org.apache.nifi.proxy.ProxyConfiguration;
+import org.apache.nifi.proxy.ProxyConfigurationService;
+import org.apache.nifi.proxy.ProxySpec;
+import org.apache.nifi.record.path.FieldValue;
+import org.apache.nifi.record.path.RecordPath;
+import org.apache.nifi.record.path.validation.RecordPathValidator;
+import org.apache.nifi.schema.access.SchemaNotFoundException;
+import org.apache.nifi.serialization.MalformedRecordException;
+import org.apache.nifi.serialization.RecordReader;
+import org.apache.nifi.serialization.RecordReaderFactory;
+import org.apache.nifi.serialization.SimpleRecordSchema;
+import org.apache.nifi.serialization.record.MapRecord;
+import org.apache.nifi.serialization.record.Record;
+import org.apache.nifi.serialization.record.RecordSchema;
+import org.apache.nifi.ssl.SSLContextService;
+import org.apache.nifi.util.StringUtils;
+
+import javax.net.ssl.SSLContext;
+import java.io.IOException;
+import java.io.InputStream;
+import java.net.Proxy;
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+import java.util.Set;
+import java.util.concurrent.ConcurrentHashMap;
+import java.util.regex.Pattern;
+import java.util.stream.Collectors;
+
+import static org.apache.commons.lang3.StringUtils.trimToEmpty;
+
+@Tags({ "rest", "lookup", "json", "xml", "http" })
+@CapabilityDescription("Use a REST service to enrich records.")
+@DynamicProperties({
+    @DynamicProperty(name = "*", value = "*", description = "All dynamic properties are added as HTTP headers with the name " +
+        "as the header name and the value as the header value.")
+})
+public class RestLookupService extends AbstractControllerService implements LookupService {
+    static final PropertyDescriptor URL = new PropertyDescriptor.Builder()
+        .name("rest-lookup-url")
+        .displayName("URL")
+        .description("The URL for the REST endpoint. Expression language is evaluated against the lookup key/value pairs, " +
+            "not flowfile attributes or variable registry.")
[jira] [Updated] (NIFI-5252) Allow arbitrary headers in PutEmail processor
[ https://issues.apache.org/jira/browse/NIFI-5252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dustin Rodrigues updated NIFI-5252:
-----------------------------------
    Status: Patch Available  (was: Open)

> Allow arbitrary headers in PutEmail processor
> ---------------------------------------------
>
>                 Key: NIFI-5252
>                 URL: https://issues.apache.org/jira/browse/NIFI-5252
>             Project: Apache NiFi
>          Issue Type: Improvement
>            Reporter: Dustin Rodrigues
>            Priority: Minor
>
[jira] [Commented] (NIFI-5252) Allow arbitrary headers in PutEmail processor
[ https://issues.apache.org/jira/browse/NIFI-5252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509093#comment-16509093 ]

ASF GitHub Bot commented on NIFI-5252:
--------------------------------------

GitHub user dtrodrigues opened a pull request:

    https://github.com/apache/nifi/pull/2787

    NIFI-5252 - support arbitrary headers in PutEmail processor

    Thank you for submitting a contribution to Apache NiFi.

    In order to streamline the review of the contribution we ask you
    to ensure the following steps have been taken:

    ### For all changes:
    - [ ] Is there a JIRA ticket associated with this PR? Is it referenced in the commit message?
    - [ ] Does your PR title start with NIFI-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
    - [ ] Has your PR been rebased against the latest commit within the target branch (typically master)?
    - [ ] Is your initial contribution a single, squashed commit?

    ### For code changes:
    - [ ] Have you ensured that the full suite of tests is executed via `mvn -Pcontrib-check clean install` at the root nifi folder?
    - [ ] Have you written or updated unit tests to verify your changes?
    - [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)?
    - [ ] If applicable, have you updated the LICENSE file, including the main LICENSE file under nifi-assembly?
    - [ ] If applicable, have you updated the NOTICE file, including the main NOTICE file found under nifi-assembly?
    - [ ] If adding new Properties, have you added .displayName in addition to .name (programmatic access) for each of the new properties?

    ### For documentation related changes:
    - [ ] Have you ensured that format looks appropriate for the output in which it is rendered?

    ### Note:
    Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/dtrodrigues/nifi NIFI-5252

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/nifi/pull/2787.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2787

commit 250736cc14ffb6c44925fe606b5de67d7a53638a
Author: Dustin Rodrigues
Date:   2018-06-12T02:00:28Z

    NIFI-5252 - support arbitrary headers in PutEmail processor

> Allow arbitrary headers in PutEmail processor
> ---------------------------------------------
>
>                 Key: NIFI-5252
>                 URL: https://issues.apache.org/jira/browse/NIFI-5252
>             Project: Apache NiFi
>          Issue Type: Improvement
>            Reporter: Dustin Rodrigues
>            Priority: Minor
>
[jira] [Commented] (NIFI-5213) Allow AvroReader with explicit schema to read files with embedded schema
[ https://issues.apache.org/jira/browse/NIFI-5213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509022#comment-16509022 ]

ASF GitHub Bot commented on NIFI-5213:
--------------------------------------

Github user mattyb149 commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2718#discussion_r194590103

--- Diff: nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/avro/AvroReaderWithExplicitSchema.java ---
@@ -17,33 +17,61 @@
 package org.apache.nifi.avro;

+import java.io.ByteArrayInputStream;
+import java.io.ByteArrayOutputStream;
 import java.io.EOFException;
 import java.io.IOException;
 import java.io.InputStream;
+import java.io.SequenceInputStream;

 import org.apache.avro.Schema;
+import org.apache.avro.file.DataFileStream;
 import org.apache.avro.generic.GenericDatumReader;
 import org.apache.avro.generic.GenericRecord;
 import org.apache.avro.io.BinaryDecoder;
 import org.apache.avro.io.DatumReader;
 import org.apache.avro.io.DecoderFactory;
-import org.apache.nifi.schema.access.SchemaNotFoundException;
+import org.apache.commons.io.input.TeeInputStream;
 import org.apache.nifi.serialization.MalformedRecordException;
 import org.apache.nifi.serialization.record.RecordSchema;

 public class AvroReaderWithExplicitSchema extends AvroRecordReader {
     private final InputStream in;
     private final RecordSchema recordSchema;
     private final DatumReader datumReader;
-    private final BinaryDecoder decoder;
+    private BinaryDecoder decoder;
     private GenericRecord genericRecord;
+    private DataFileStream dataFileStream;

-    public AvroReaderWithExplicitSchema(final InputStream in, final RecordSchema recordSchema, final Schema avroSchema) throws IOException, SchemaNotFoundException {
+    public AvroReaderWithExplicitSchema(final InputStream in, final RecordSchema recordSchema, final Schema avroSchema) throws IOException {
         this.in = in;
         this.recordSchema = recordSchema;
-        datumReader = new GenericDatumReader(avroSchema);
-        decoder = DecoderFactory.get().binaryDecoder(in, null);
+        datumReader = new GenericDatumReader<>(avroSchema);
+        ByteArrayOutputStream baos = new ByteArrayOutputStream();
+        TeeInputStream teeInputStream = new TeeInputStream(in, baos);
+        // Try to parse as a DataFileStream; if it works, glue the streams back together and delegate calls to the DataFileStream
+        try {
+            dataFileStream = new DataFileStream<>(teeInputStream, new GenericDatumReader<>());
+        } catch (IOException ioe) {
+            // Carry on, hopefully a raw Avro file
+            // Need to be able to re-read the bytes read so far, and the InputStream passed in doesn't support reset. Use the TeeInputStream in
+            // conjunction with SequenceInputStream to glue the two streams back together for future reading
+            ByteArrayInputStream bais = new ByteArrayInputStream(baos.toByteArray());
+            SequenceInputStream sis = new SequenceInputStream(bais, in);
+            decoder = DecoderFactory.get().binaryDecoder(sis, null);
+        }
+        if (dataFileStream != null) {
+            // Verify the schemas are the same
+            Schema embeddedSchema = dataFileStream.getSchema();
+            if (!embeddedSchema.equals(avroSchema)) {
+                throw new IOException("Explicit schema does not match embedded schema");
--- End diff --

    I thought schema evolution was supported in other ways, such as including optional (possibly missing) fields to support a transition to/from additional/deleted fields, but I admit I don't have my mind wrapped around the whole thing. In this case it was driven by the Avro API: if the file has a schema, there is a much more fluent API to read the records than if it does not. That is not for the case when someone wants to impose a schema on a file that already has a schema; I'm not sure that's a case for schema evolution (i.e. the embedded schema is not correct?). The alternate API is for "raw" Avro files that don't have an embedded schema and instead need an external one for processing.

    TBH I don't know how to parse an Avro file that has an embedded schema with their API by imposing an external one. This was the middle ground to allow it as long as the external schema matched the embedded one. Thoughts?

> Allow AvroReader with explicit schema to read files with embedded schema
> ------------------------------------------------------------------------
>
>                 Key: NIFI-5213
>                 URL:
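[Editor's illustration] The "glue the streams back together" technique discussed in the review comments above can be isolated into a small, self-contained sketch. This is not the NiFi code: a plain copy loop stands in for commons-io's TeeInputStream so only the JDK is needed, and the names (`StreamGlueSketch`, `sideBuffer`) are invented for the example. It shows how bytes consumed by a failed speculative parse are captured and then replayed via SequenceInputStream so the whole stream can be re-read from the start.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.SequenceInputStream;

public class StreamGlueSketch {
    public static void main(String[] args) throws IOException {
        byte[] payload = "not-an-avro-container".getBytes();
        InputStream in = new ByteArrayInputStream(payload);

        // Stand-in for commons-io TeeInputStream: read speculatively while
        // copying every byte read into a side buffer.
        ByteArrayOutputStream sideBuffer = new ByteArrayOutputStream();
        for (int i = 0; i < 4; i++) {          // "speculative parse" consumes 4 bytes
            sideBuffer.write(in.read());
        }
        // ...suppose the parse failed here (e.g. no embedded schema was found).

        // Glue the already-consumed bytes back in front of the unread remainder.
        InputStream glued = new SequenceInputStream(
                new ByteArrayInputStream(sideBuffer.toByteArray()), in);

        String replayed = new String(glued.readAllBytes());
        if (!replayed.equals("not-an-avro-container")) {
            throw new AssertionError("stream was not reassembled: " + replayed);
        }
        System.out.println(replayed);
    }
}
```

The same shape appears in the constructor above: the tee captures what `DataFileStream` consumed, and on failure the `SequenceInputStream` hands the binary decoder a stream that starts from byte zero again.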
[jira] [Commented] (NIFI-5213) Allow AvroReader with explicit schema to read files with embedded schema
[ https://issues.apache.org/jira/browse/NIFI-5213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509019#comment-16509019 ]

ASF GitHub Bot commented on NIFI-5213:
--------------------------------------

Github user mattyb149 commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2718#discussion_r194589636

--- Diff: nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/avro/AvroReaderWithExplicitSchema.java ---
@@ -17,33 +17,61 @@
 package org.apache.nifi.avro;

+import java.io.ByteArrayInputStream;
+import java.io.ByteArrayOutputStream;
 import java.io.EOFException;
 import java.io.IOException;
 import java.io.InputStream;
+import java.io.SequenceInputStream;

 import org.apache.avro.Schema;
+import org.apache.avro.file.DataFileStream;
 import org.apache.avro.generic.GenericDatumReader;
 import org.apache.avro.generic.GenericRecord;
 import org.apache.avro.io.BinaryDecoder;
 import org.apache.avro.io.DatumReader;
 import org.apache.avro.io.DecoderFactory;
-import org.apache.nifi.schema.access.SchemaNotFoundException;
+import org.apache.commons.io.input.TeeInputStream;
 import org.apache.nifi.serialization.MalformedRecordException;
 import org.apache.nifi.serialization.record.RecordSchema;

 public class AvroReaderWithExplicitSchema extends AvroRecordReader {
     private final InputStream in;
     private final RecordSchema recordSchema;
     private final DatumReader datumReader;
-    private final BinaryDecoder decoder;
+    private BinaryDecoder decoder;
     private GenericRecord genericRecord;
+    private DataFileStream dataFileStream;

-    public AvroReaderWithExplicitSchema(final InputStream in, final RecordSchema recordSchema, final Schema avroSchema) throws IOException, SchemaNotFoundException {
+    public AvroReaderWithExplicitSchema(final InputStream in, final RecordSchema recordSchema, final Schema avroSchema) throws IOException {
         this.in = in;
         this.recordSchema = recordSchema;
-        datumReader = new GenericDatumReader(avroSchema);
-        decoder = DecoderFactory.get().binaryDecoder(in, null);
+        datumReader = new GenericDatumReader<>(avroSchema);
+        ByteArrayOutputStream baos = new ByteArrayOutputStream();
+        TeeInputStream teeInputStream = new TeeInputStream(in, baos);
+        // Try to parse as a DataFileStream; if it works, glue the streams back together and delegate calls to the DataFileStream
+        try {
+            dataFileStream = new DataFileStream<>(teeInputStream, new GenericDatumReader<>());
--- End diff --

    Sounds about right :) I was playing around with gluing these together and as soon as it "worked" I stopped touching it. Will take a closer look, thanks!

> Allow AvroReader with explicit schema to read files with embedded schema
> ------------------------------------------------------------------------
>
>                 Key: NIFI-5213
>                 URL: https://issues.apache.org/jira/browse/NIFI-5213
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Extensions
>            Reporter: Matt Burgess
>            Assignee: Matt Burgess
>            Priority: Minor
>
> AvroReader allows the choice of schema access strategy from such options as
> Use Embedded Schema, Use Schema Name, Use Schema Text, etc. If the incoming
> Avro files will have embedded schemas, then Use Embedded Schema is best
> practice for the Avro Reader. However it is not intuitive that if the same
> schema that is embedded in the file is specified by name (using a schema
> registry) or explicitly via Schema Text, that errors can occur. This has been
> noticed in QueryRecord for example, and the error is also not intuitive or
> descriptive (it is often an ArrayIndexOutOfBoundsException).
> To provide a better user experience, it would be an improvement for
> AvroReader to be able to successfully process Avro files with embedded
> schemas, even when the Schema Access Strategy is not "Use Embedded Schema".
> Of course, the explicit schema would have to match the embedded schema, or an
> error would be reported (and rightfully so).
[GitHub] nifi pull request #2718: NIFI-5213: Allow AvroReader to process files w embe...
Github user mattyb149 commented on a diff in the pull request: https://github.com/apache/nifi/pull/2718#discussion_r194589636 --- Diff: nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/avro/AvroReaderWithExplicitSchema.java --- @@ -17,33 +17,61 @@ package org.apache.nifi.avro; +import java.io.ByteArrayInputStream; +import java.io.ByteArrayOutputStream; import java.io.EOFException; import java.io.IOException; import java.io.InputStream; +import java.io.SequenceInputStream; import org.apache.avro.Schema; +import org.apache.avro.file.DataFileStream; import org.apache.avro.generic.GenericDatumReader; import org.apache.avro.generic.GenericRecord; import org.apache.avro.io.BinaryDecoder; import org.apache.avro.io.DatumReader; import org.apache.avro.io.DecoderFactory; -import org.apache.nifi.schema.access.SchemaNotFoundException; +import org.apache.commons.io.input.TeeInputStream; import org.apache.nifi.serialization.MalformedRecordException; import org.apache.nifi.serialization.record.RecordSchema; public class AvroReaderWithExplicitSchema extends AvroRecordReader { private final InputStream in; private final RecordSchema recordSchema; private final DatumReader datumReader; -private final BinaryDecoder decoder; +private BinaryDecoder decoder; private GenericRecord genericRecord; +private DataFileStream dataFileStream; -public AvroReaderWithExplicitSchema(final InputStream in, final RecordSchema recordSchema, final Schema avroSchema) throws IOException, SchemaNotFoundException { +public AvroReaderWithExplicitSchema(final InputStream in, final RecordSchema recordSchema, final Schema avroSchema) throws IOException { this.in = in; this.recordSchema = recordSchema; -datumReader = new GenericDatumReader(avroSchema); -decoder = DecoderFactory.get().binaryDecoder(in, null); +datumReader = new GenericDatumReader<>(avroSchema); +ByteArrayOutputStream baos = new 
ByteArrayOutputStream(); +TeeInputStream teeInputStream = new TeeInputStream(in, baos); +// Try to parse as a DataFileStream, if it works, glue the streams back together and delegate calls to the DataFileStream +try { +dataFileStream = new DataFileStream<>(teeInputStream, new GenericDatumReader<>()); --- End diff -- Sounds about right :) I was playing around with gluing these together and as soon as it "worked" I stopped touching it. Will take a closer look, thanks! ---
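The "tee, then glue the streams back together" trick discussed in the diff above can be sketched with plain java.io. The real reader uses commons-io's TeeInputStream and Avro's DataFileStream; this hypothetical StreamGlueDemo only shows the stream mechanics: bytes consumed while probing the header are captured on the side, then prepended back in front of the unread remainder with SequenceInputStream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class StreamGlueDemo {
    public static void main(String[] args) throws IOException {
        // Stand-in for the incoming content; "Obj\u0001" is the Avro data
        // file magic that DataFileStream would look for.
        byte[] content = "Obj\u0001rest-of-file".getBytes(StandardCharsets.UTF_8);
        InputStream original = new ByteArrayInputStream(content);

        // "Tee": remember every byte consumed while probing the header
        // (commons-io's TeeInputStream does this transparently).
        ByteArrayOutputStream sideChannel = new ByteArrayOutputStream();
        byte[] probe = new byte[4];
        int read = original.read(probe);
        sideChannel.write(probe, 0, read);
        boolean looksLikeAvroContainer = read == 4
                && probe[0] == 'O' && probe[1] == 'b' && probe[2] == 'j' && probe[3] == 1;

        // "Glue": prepend the already-consumed bytes back in front of the
        // unread remainder so a downstream reader sees the whole stream.
        InputStream glued = new SequenceInputStream(
                new ByteArrayInputStream(sideChannel.toByteArray()), original);

        ByteArrayOutputStream reassembled = new ByteArrayOutputStream();
        int b;
        while ((b = glued.read()) != -1) {
            reassembled.write(b);
        }
        System.out.println(looksLikeAvroContainer);
        System.out.println(Arrays.equals(content, reassembled.toByteArray()));
    }
}
```

If the probe fails (the content is not an Avro container), the glued stream still starts at byte zero, so falling back to the plain binary decoder loses nothing.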
[GitHub] nifi pull request #2718: NIFI-5213: Allow AvroReader to process files w embe...
Github user mattyb149 commented on a diff in the pull request: https://github.com/apache/nifi/pull/2718#discussion_r194589527 --- Diff: nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/test/java/org/apache/nifi/avro/TestAvroReaderWithExplicitSchema.java --- @@ -0,0 +1,62 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.nifi.avro; + +import org.apache.avro.Schema; +import org.apache.nifi.serialization.SimpleRecordSchema; +import org.apache.nifi.serialization.record.RecordSchema; +import org.junit.Test; + +import java.io.File; +import java.io.FileInputStream; +import java.io.IOException; + +public class TestAvroReaderWithExplicitSchema { + +@Test +public void testAvroExplicitReaderWithSchemalessFile() throws Exception { +File avroFileWithEmbeddedSchema = new File("src/test/resources/avro/avro_schemaless.avro"); +FileInputStream fileInputStream = new FileInputStream(avroFileWithEmbeddedSchema); +Schema dataSchema = new Schema.Parser().parse(new File("src/test/resources/avro/avro_schemaless.avsc")); +RecordSchema recordSchema = new SimpleRecordSchema(dataSchema.toString(), AvroTypeUtil.AVRO_SCHEMA_FORMAT, null); + +AvroReaderWithExplicitSchema avroReader = new AvroReaderWithExplicitSchema(fileInputStream, recordSchema, dataSchema); +avroReader.nextAvroRecord(); --- End diff -- Probably, I think I did too much copy-paste in the tests instead of validating the individual things ---
[jira] [Commented] (NIFI-5213) Allow AvroReader with explicit schema to read files with embedded schema
[ https://issues.apache.org/jira/browse/NIFI-5213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509018#comment-16509018 ] ASF GitHub Bot commented on NIFI-5213: -- Github user mattyb149 commented on a diff in the pull request: https://github.com/apache/nifi/pull/2718#discussion_r194589527 --- Diff: nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/test/java/org/apache/nifi/avro/TestAvroReaderWithExplicitSchema.java --- @@ -0,0 +1,62 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.nifi.avro; + +import org.apache.avro.Schema; +import org.apache.nifi.serialization.SimpleRecordSchema; +import org.apache.nifi.serialization.record.RecordSchema; +import org.junit.Test; + +import java.io.File; +import java.io.FileInputStream; +import java.io.IOException; + +public class TestAvroReaderWithExplicitSchema { + +@Test +public void testAvroExplicitReaderWithSchemalessFile() throws Exception { +File avroFileWithEmbeddedSchema = new File("src/test/resources/avro/avro_schemaless.avro"); +FileInputStream fileInputStream = new FileInputStream(avroFileWithEmbeddedSchema); +Schema dataSchema = new Schema.Parser().parse(new File("src/test/resources/avro/avro_schemaless.avsc")); +RecordSchema recordSchema = new SimpleRecordSchema(dataSchema.toString(), AvroTypeUtil.AVRO_SCHEMA_FORMAT, null); + +AvroReaderWithExplicitSchema avroReader = new AvroReaderWithExplicitSchema(fileInputStream, recordSchema, dataSchema); +avroReader.nextAvroRecord(); --- End diff -- Probably, I think I did too much copy-paste in the tests instead of validating the individual things > Allow AvroReader with explicit schema to read files with embedded schema > > > Key: NIFI-5213 > URL: https://issues.apache.org/jira/browse/NIFI-5213 > Project: Apache NiFi > Issue Type: Improvement > Components: Extensions >Reporter: Matt Burgess >Assignee: Matt Burgess >Priority: Minor > > AvroReader allows the choice of schema access strategy from such options as > Use Embedded Schema, Use Schema Name, Use Schema Text, etc. If the incoming > Avro files will have embedded schemas, then Use Embedded Schema is best > practice for the Avro Reader. However it is not intuitive that if the same > schema that is embedded in the file is specified by name (using a schema > registry) or explicitly via Schema Text, that errors can occur. This has been > noticed in QueryRecord for example, and the error is also not intuitive or > descriptive (it is often an ArrayIndexOutOfBoundsException). 
> To provide a better user experience, it would be an improvement for > AvroReader to be able to successfully process Avro files with embedded > schemas, even when the Schema Access Strategy is not "Use Embedded Schema". > Of course, the explicit schema would have to match the embedded schema, or an > error would be reported (and rightfully so). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (NIFI-5176) NiFi needs to be buildable on Java 9
[ https://issues.apache.org/jira/browse/NIFI-5176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Storck updated NIFI-5176: -- Description: While retaining a source/target compatibility of 1.8, NiFi needs to be buildable on Java 9. The following issues have been encountered while attempting to run a Java 1.8-built NiFi on Java 9: ||Issue||Solution|| |Groovy compiler not parsing groovy code correctly on Java 9|Updated maven-compiler-plugin to 3.7.0, and included dependencies for groovy-eclipse-compiler:2.9.3-01 and groovy-eclipse-batch:2.4.15-01| |Antlr3 issue with ambiguous method calls|Explicit cast to ValidationContext needed in TestHL7Query.java| |jaxb2-maven-plugin not compatible with Java 9|Switched to maven-jaxb-plugin| |hbase-client:1.1.2 depends on jdk.tools:jdk.tools:1.7|Excluded this dependency *(needs testing)*| |nifi-enrich-processors uses package com.sun.jndi.dns, which does not exist| | was: While retaining a source/target compatibility of 1.8, NiFi needs to be buildable on Java 9.
The following issues have been encountered while attempting to run a Java 1.8-built NiFi on Java 9: ||Issue||Solution|| |Groovy compiler not parsing groovy code correctly on Java 9|Updated maven-compiler-plugin to 3.7.0, and included dependencies for groovy-eclipse-compiler:2.9.3-01 and groovy-eclipse-batch:2.4.15-01| |Antlr3 issue with ambiguous method calls|Explicit cast to ValidationContext| |jaxb2-maven-plugin not compatible with Java 9|Switched to maven-jaxb-plugin| |hbase-client:1.1.2 depends on jdk.tools:jdk.tools:1.7|Excluded this dependency *(needs testing)*| |nifi-enrich-processors uses package com.sun.jndi.dns, which does not exist| | > NiFi needs to be buildable on Java 9 > > > Key: NIFI-5176 > URL: https://issues.apache.org/jira/browse/NIFI-5176 > Project: Apache NiFi > Issue Type: Sub-task >Reporter: Jeff Storck >Assignee: Jeff Storck >Priority: Major > > While retaining a source/target compatibility of 1.8, NiFi needs to be > buildable on Java 9. > The following issues have been encountered while attempting to run a Java > 1.8-built NiFi on Java 9: > ||Issue||Solution|| > |Groovy compiler not parsing groovy code correctly on Java 9|Updated > maven-compiler-plugin to 3.7.0, and included dependencies for > groovy-eclipse-compiler:2.9.3-01 and groovy-eclipse-batch:2.4.15-01| > |Antlr3 issue with ambiguous method calls|Explicit cast to ValidationContext > needed in TestHL7Query.java| > |jaxb2-maven-plugin not compatible with Java 9|Switched to maven-jaxb-plugin| > |hbase-client:1.1.2 depends on jdk.tools:jdk.tools:1.7|Excluded this > dependency *(needs testing)*| > |nifi-enrich-processors uses package com.sun.jndi.dns, which does not exist| | -- This message was sent by Atlassian JIRA (v7.6.3#76005)
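The Groovy-compiler fix in the table above corresponds to a maven-compiler-plugin configuration roughly like the following pom.xml fragment. This is a sketch assembled from the plugin and dependency versions named in the issue, not the exact NiFi build file:

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-compiler-plugin</artifactId>
  <version>3.7.0</version>
  <configuration>
    <!-- Retain source/target compatibility of 1.8 while building on Java 9 -->
    <source>1.8</source>
    <target>1.8</target>
    <compilerId>groovy-eclipse-compiler</compilerId>
  </configuration>
  <dependencies>
    <dependency>
      <groupId>org.codehaus.groovy</groupId>
      <artifactId>groovy-eclipse-compiler</artifactId>
      <version>2.9.3-01</version>
    </dependency>
    <dependency>
      <groupId>org.codehaus.groovy</groupId>
      <artifactId>groovy-eclipse-batch</artifactId>
      <version>2.4.15-01</version>
    </dependency>
  </dependencies>
</plugin>
```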
[jira] [Updated] (NIFI-5176) NiFi needs to be buildable on Java 9
[ https://issues.apache.org/jira/browse/NIFI-5176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Storck updated NIFI-5176: -- Description: While retaining a source/target compatibility of 1.8, NiFi needs to be buildable on Java 9. The following issues have been encountered while attempting to run a Java 1.8-built NiFi on Java 9: ||Issue||Solution|| |Groovy compiler not parsing groovy code correctly on Java 9|Updated maven-compiler-plugin to 3.7.0, and included dependencies for groovy-eclipse-compiler:2.9.3-01 and groovy-eclipse-batch:2.4.15-01| |Antlr3 issue with ambiguous method calls|Explicit cast to ValidationContext| |jaxb2-maven-plugin not compatible with Java 9|Switched to maven-jaxb-plugin| |hbase-client:1.1.2 depends on jdk.tools:jdk.tools:1.7|Excluded this dependency *(needs testing)*| |nifi-enrich-processors uses package com.sun.jndi.dns, which does not exist| | was: While retaining a source/target compatibility of 1.8, NiFi needs to be buildable on Java 9. The following issues have been encountered while attempting to run a Java 1.8-built NiFi on Java 9: ||Issue||Solution|| |Groovy compiler not parsing groovy code correctly on Java 9|Updated maven-compiler-plugin to 3.7.0, and included dependencies for groovy-eclipse-compiler:2.9.3-01 and groovy-eclipse-batch:2.4.15-01| |Antlr isn't able to process grammars| | > NiFi needs to be buildable on Java 9 > > > Key: NIFI-5176 > URL: https://issues.apache.org/jira/browse/NIFI-5176 > Project: Apache NiFi > Issue Type: Sub-task >Reporter: Jeff Storck >Assignee: Jeff Storck >Priority: Major > > While retaining a source/target compatibility of 1.8, NiFi needs to be > buildable on Java 9.
> The following issues have been encountered while attempting to run a Java > 1.8-built NiFi on Java 9: > ||Issue||Solution|| > |Groovy compiler not parsing groovy code correctly on Java 9|Updated > maven-compiler-plugin to 3.7.0, and included dependencies for > groovy-eclipse-compiler:2.9.3-01 and groovy-eclipse-batch:2.4.15-01| > |Antlr3 issue with ambiguous method calls|Explicit cast to ValidationContext| > |jaxb2-maven-plugin not compatible with Java 9|Switched to maven-jaxb-plugin| > |hbase-client:1.1.2 depends on jdk.tools:jdk.tools:1.7|Excluded this > dependency *(needs testing)*| > |nifi-enrich-processors uses package com.sun.jndi.dns, which does not exist| | -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (NIFI-5296) Add EL Support with Variable Registry scope on SSL context service
[ https://issues.apache.org/jira/browse/NIFI-5296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508990#comment-16508990 ] Andy LoPresto edited comment on NIFI-5296 at 6/12/18 12:23 AM: --- [~pvillard] can you describe the use case where you feel VR EL support is necessary for these fields? I am guessing you want to be able to specify a keystore and truststore path in the VR and allow an SSLContextService to reference it. I have a couple concerns with this approach: # It allows any user with variable permissions to see the path to the keystore and truststore. In the current setup, the controller service can be deployed with literal paths by an administrator/trusted user, and the service can be referenced by dependent components without exposing the actual file system paths to a less-trusted user (given the proper absence of permissions on the controller service). Currently, a user who does not have view permission to the {{StandardSSLContextService}} cannot see the explicit path at all, even if they have view permission to the referencing {{InvokeHTTP}} processor (for example). In this new scenario, they would be able to see the explicit path in the Variables window, and see the UUID of the referencing SSLCS (see [https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#unauthorized-referencing-components]) # It allows the keystore and truststore path to change without visibility on the configuration dialog. Given EL evaluation, the processor will evaluate not only VR-specific variables, but also system properties and OS-level environment variables. 
This means that a malicious or incidental occurrence could change the path used to locate the keystore and truststore without the NiFi user being aware The original work that introduced the unit tests you reference was done in [NIFI-4274|https://issues.apache.org/jira/browse/NIFI-4274] because of a [Stack Overflow question|https://stackoverflow.com/a/45575232/70465] where the documentation (which said EL was not supported) and the behavior (incorrectly evaluating EL) did not match. I appreciate the work you put into this PR. If you have a specific use case that really requires this behavior, let's discuss it. If this is just to bring these properties inline with the baseline EL evaluation, I would ask that we do not implement this at this time. There is some ongoing work/discussion about how to handle the sensitive properties in conjunction with the VR, Registry, flow versioning, setting from external sources, etc. and I believe it requires a cohesive approach with substantial threat modeling to avoid introducing serious issues into NiFi. We have traditionally held to a "secure but severe" policy where some features are absent because of the strict principle that "sensitive properties are always protected/blocked". It may be time to re-evaluate that, but introducing these changes piecemeal is dangerous in my opinion. was (Author: alopresto): [~pvillard] can you describe the use case where you feel VR EL support is necessary for these fields? I am guessing you want to be able to specify a keystore and truststore path in the VR and allow an SSLContextService to reference it. I have a couple concerns with this approach: # It allows any user with variable permissions to see the path to the keystore and truststore. 
In the current setup, the controller service can be deployed with literal paths by an administrator/trusted user, and the service can be referenced by dependent components without exposing the actual file system paths to a less-trusted user (given the proper absence of permissions on the controller service). Currently, a user who does not have view permission to the {{StandardSSLContextService}} cannot see the explicit path at all, even if they have view permission to the referencing {{InvokeHTTP}} processor (for example). In this new scenario, they would be able to see the explicit path in the Variables window, and see the UUID of the referencing SSLCS (see [https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#unauthorized-referencing-components]) # It allows the keystore and truststore path to change without visibility on the configuration dialog. Given EL evaluation, the processor will evaluate not only VR-specific variables, but also system properties and OS-level environment variables. This means that a malicious or incidental occurrence could change the path used to locate the keystore and truststore without the NiFi user being aware # The original work that introduced the unit tests you reference was done in [NIFI-4274|https://issues.apache.org/jira/browse/NIFI-4274] because of a [Stack Overflow question|https://stackoverflow.com/a/45575232/70465] where the documentation (which said EL was not supported) and the behavior (incorrectly evaluating EL) did not match. I
[jira] [Commented] (NIFI-5296) Add EL Support with Variable Registry scope on SSL context service
[ https://issues.apache.org/jira/browse/NIFI-5296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508990#comment-16508990 ] Andy LoPresto commented on NIFI-5296: - [~pvillard] can you describe the use case where you feel VR EL support is necessary for these fields? I am guessing you want to be able to specify a keystore and truststore path in the VR and allow an SSLContextService to reference it. I have a couple concerns with this approach: # It allows any user with variable permissions to see the path to the keystore and truststore. In the current setup, the controller service can be deployed with literal paths by an administrator/trusted user, and the service can be referenced by dependent components without exposing the actual file system paths to a less-trusted user (given the proper absence of permissions on the controller service). Currently, a user who does not have view permission to the {{StandardSSLContextService}} cannot see the explicit path at all, even if they have view permission to the referencing {{InvokeHTTP}} processor (for example). In this new scenario, they would be able to see the explicit path in the Variables window, and see the UUID of the referencing SSLCS (see [https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#unauthorized-referencing-components]) # It allows the keystore and truststore path to change without visibility on the configuration dialog. Given EL evaluation, the processor will evaluate not only VR-specific variables, but also system properties and OS-level environment variables. 
This means that a malicious or incidental occurrence could change the path used to locate the keystore and truststore without the NiFi user being aware # The original work that introduced the unit tests you reference was done in [NIFI-4274|https://issues.apache.org/jira/browse/NIFI-4274] because of a [Stack Overflow question|https://stackoverflow.com/a/45575232/70465] where the documentation (which said EL was not supported) and the behavior (incorrectly evaluating EL) did not match. I appreciate the work you put into this PR. If you have a specific use case that really requires this behavior, let's discuss it. If this is just to bring these properties inline with the baseline EL evaluation, I would ask that we do not implement this at this time. There is some ongoing work/discussion about how to handle the sensitive properties in conjunction with the VR, Registry, flow versioning, setting from external sources, etc. and I believe it requires a cohesive approach with substantial threat modeling to avoid introducing serious issues into NiFi. We have traditionally held to a "secure but severe" policy where some features are absent because of the strict principle that "sensitive properties are always protected/blocked". It may be time to re-evaluate that, but introducing these changes piecemeal is dangerous in my opinion. > Add EL Support with Variable Registry scope on SSL context service > -- > > Key: NIFI-5296 > URL: https://issues.apache.org/jira/browse/NIFI-5296 > Project: Apache NiFi > Issue Type: Improvement > Components: Extensions >Reporter: Pierre Villard >Assignee: Pierre Villard >Priority: Major > > Add EL support on Truststore and Keystore filename properties with Variable > Registry scope. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
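Concern #2 above, that EL evaluation falls back to JVM system properties and OS environment variables, can be illustrated with a toy resolver. This VariableFallbackDemo is a deliberately simplified stand-in for the fallback order the comment describes, not NiFi's actual Expression Language engine:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class VariableFallbackDemo {
    // Toy resolver: Variable Registry first, then JVM system properties,
    // then the OS environment. NOT NiFi's EL implementation.
    static String resolve(String expression, Map<String, String> registry) {
        Matcher m = Pattern.compile("\\$\\{([^}]+)\\}").matcher(expression);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String name = m.group(1);
            String value = registry.get(name);
            if (value == null) {
                value = System.getProperty(name); // silent fallback #1
            }
            if (value == null) {
                value = System.getenv(name);      // silent fallback #2
            }
            m.appendReplacement(out, Matcher.quoteReplacement(value == null ? "" : value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> registry = new HashMap<>();
        registry.put("keystore.path", "/opt/certs/keystore.jks");

        // Administrator intent: the path comes from the Variable Registry.
        System.out.println(resolve("${keystore.path}", registry));

        // But a variable undefined in the registry quietly resolves from the
        // JVM instead, changing the effective path with no change visible in
        // the flow configuration.
        System.setProperty("keystore.path.override", "/tmp/untrusted.jks");
        System.out.println(resolve("${keystore.path.override}", registry));
    }
}
```

The second lookup succeeds even though no administrator ever defined that variable in the registry, which is the "path changes without visibility on the configuration dialog" scenario.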
[jira] [Commented] (NIFI-5213) Allow AvroReader with explicit schema to read files with embedded schema
[ https://issues.apache.org/jira/browse/NIFI-5213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508944#comment-16508944 ] ASF GitHub Bot commented on NIFI-5213: -- Github user MikeThomsen commented on a diff in the pull request: https://github.com/apache/nifi/pull/2718#discussion_r194575493 --- Diff: nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/test/java/org/apache/nifi/avro/TestAvroReaderWithExplicitSchema.java --- @@ -0,0 +1,62 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.nifi.avro; + +import org.apache.avro.Schema; +import org.apache.nifi.serialization.SimpleRecordSchema; +import org.apache.nifi.serialization.record.RecordSchema; +import org.junit.Test; + +import java.io.File; +import java.io.FileInputStream; +import java.io.IOException; + +public class TestAvroReaderWithExplicitSchema { + +@Test +public void testAvroExplicitReaderWithSchemalessFile() throws Exception { +File avroFileWithEmbeddedSchema = new File("src/test/resources/avro/avro_schemaless.avro"); +FileInputStream fileInputStream = new FileInputStream(avroFileWithEmbeddedSchema); +Schema dataSchema = new Schema.Parser().parse(new File("src/test/resources/avro/avro_schemaless.avsc")); +RecordSchema recordSchema = new SimpleRecordSchema(dataSchema.toString(), AvroTypeUtil.AVRO_SCHEMA_FORMAT, null); + +AvroReaderWithExplicitSchema avroReader = new AvroReaderWithExplicitSchema(fileInputStream, recordSchema, dataSchema); +avroReader.nextAvroRecord(); --- End diff -- As-is, this wouldn't test anything that I can tell with that method because all of the errors are caught in that method. Should have some assertions around it IMO. > Allow AvroReader with explicit schema to read files with embedded schema > > > Key: NIFI-5213 > URL: https://issues.apache.org/jira/browse/NIFI-5213 > Project: Apache NiFi > Issue Type: Improvement > Components: Extensions >Reporter: Matt Burgess >Assignee: Matt Burgess >Priority: Minor > > AvroReader allows the choice of schema access strategy from such options as > Use Embedded Schema, Use Schema Name, Use Schema Text, etc. If the incoming > Avro files will have embedded schemas, then Use Embedded Schema is best > practice for the Avro Reader. However it is not intuitive that if the same > schema that is embedded in the file is specified by name (using a schema > registry) or explicitly via Schema Text, that errors can occur. 
This has been > noticed in QueryRecord for example, and the error is also not intuitive or > descriptive (it is often an ArrayIndexOutOfBoundsException). > To provide a better user experience, it would be an improvement for > AvroReader to be able to successfully process Avro files with embedded > schemas, even when the Schema Access Strategy is not "Use Embedded Schema". > Of course, the explicit schema would have to match the embedded schema, or an > error would be reported (and rightfully so). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (NIFI-5213) Allow AvroReader with explicit schema to read files with embedded schema
[ https://issues.apache.org/jira/browse/NIFI-5213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508942#comment-16508942 ] ASF GitHub Bot commented on NIFI-5213: -- Github user MikeThomsen commented on a diff in the pull request: https://github.com/apache/nifi/pull/2718#discussion_r194575528 --- Diff: nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/test/java/org/apache/nifi/avro/TestAvroReaderWithExplicitSchema.java --- @@ -0,0 +1,62 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.nifi.avro; + +import org.apache.avro.Schema; +import org.apache.nifi.serialization.SimpleRecordSchema; +import org.apache.nifi.serialization.record.RecordSchema; +import org.junit.Test; + +import java.io.File; +import java.io.FileInputStream; +import java.io.IOException; + +public class TestAvroReaderWithExplicitSchema { + +@Test +public void testAvroExplicitReaderWithSchemalessFile() throws Exception { +File avroFileWithEmbeddedSchema = new File("src/test/resources/avro/avro_schemaless.avro"); +FileInputStream fileInputStream = new FileInputStream(avroFileWithEmbeddedSchema); +Schema dataSchema = new Schema.Parser().parse(new File("src/test/resources/avro/avro_schemaless.avsc")); +RecordSchema recordSchema = new SimpleRecordSchema(dataSchema.toString(), AvroTypeUtil.AVRO_SCHEMA_FORMAT, null); + +AvroReaderWithExplicitSchema avroReader = new AvroReaderWithExplicitSchema(fileInputStream, recordSchema, dataSchema); +avroReader.nextAvroRecord(); +} + +@Test +public void testAvroExplicitReaderWithEmbeddedSchemaFile() throws Exception { +File avroFileWithEmbeddedSchema = new File("src/test/resources/avro/avro_embed_schema.avro"); +FileInputStream fileInputStream = new FileInputStream(avroFileWithEmbeddedSchema); +Schema dataSchema = new Schema.Parser().parse(new File("src/test/resources/avro/avro_schemaless.avsc")); +RecordSchema recordSchema = new SimpleRecordSchema(dataSchema.toString(), AvroTypeUtil.AVRO_SCHEMA_FORMAT, null); + +AvroReaderWithExplicitSchema avroReader = new AvroReaderWithExplicitSchema(fileInputStream, recordSchema, dataSchema); +avroReader.nextAvroRecord(); --- End diff -- Same deal. 
> Allow AvroReader with explicit schema to read files with embedded schema > > > Key: NIFI-5213 > URL: https://issues.apache.org/jira/browse/NIFI-5213 > Project: Apache NiFi > Issue Type: Improvement > Components: Extensions >Reporter: Matt Burgess >Assignee: Matt Burgess >Priority: Minor > > AvroReader allows the choice of schema access strategy from such options as > Use Embedded Schema, Use Schema Name, Use Schema Text, etc. If the incoming > Avro files will have embedded schemas, then Use Embedded Schema is best > practice for the Avro Reader. However it is not intuitive that if the same > schema that is embedded in the file is specified by name (using a schema > registry) or explicitly via Schema Text, that errors can occur. This has been > noticed in QueryRecord for example, and the error is also not intuitive or > descriptive (it is often an ArrayIndexOutOfBoundsException). > To provide a better user experience, it would be an improvement for > AvroReader to be able to successfully process Avro files with embedded > schemas, even when the Schema Access Strategy is not "Use Embedded Schema". > Of course, the explicit schema would have to match the embedded schema, or an > error would be reported (and rightfully so). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (NIFI-5213) Allow AvroReader with explicit schema to read files with embedded schema
[ https://issues.apache.org/jira/browse/NIFI-5213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508943#comment-16508943 ] ASF GitHub Bot commented on NIFI-5213: -- Github user MikeThomsen commented on a diff in the pull request: https://github.com/apache/nifi/pull/2718#discussion_r194575701 --- Diff: nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/test/java/org/apache/nifi/avro/TestAvroReaderWithExplicitSchema.java --- @@ -0,0 +1,62 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.nifi.avro; + +import org.apache.avro.Schema; +import org.apache.nifi.serialization.SimpleRecordSchema; +import org.apache.nifi.serialization.record.RecordSchema; +import org.junit.Test; + +import java.io.File; +import java.io.FileInputStream; +import java.io.IOException; + +public class TestAvroReaderWithExplicitSchema { + +@Test +public void testAvroExplicitReaderWithSchemalessFile() throws Exception { +File avroFileWithEmbeddedSchema = new File("src/test/resources/avro/avro_schemaless.avro"); +FileInputStream fileInputStream = new FileInputStream(avroFileWithEmbeddedSchema); +Schema dataSchema = new Schema.Parser().parse(new File("src/test/resources/avro/avro_schemaless.avsc")); +RecordSchema recordSchema = new SimpleRecordSchema(dataSchema.toString(), AvroTypeUtil.AVRO_SCHEMA_FORMAT, null); + +AvroReaderWithExplicitSchema avroReader = new AvroReaderWithExplicitSchema(fileInputStream, recordSchema, dataSchema); +avroReader.nextAvroRecord(); +} + +@Test +public void testAvroExplicitReaderWithEmbeddedSchemaFile() throws Exception { +File avroFileWithEmbeddedSchema = new File("src/test/resources/avro/avro_embed_schema.avro"); +FileInputStream fileInputStream = new FileInputStream(avroFileWithEmbeddedSchema); +Schema dataSchema = new Schema.Parser().parse(new File("src/test/resources/avro/avro_schemaless.avsc")); +RecordSchema recordSchema = new SimpleRecordSchema(dataSchema.toString(), AvroTypeUtil.AVRO_SCHEMA_FORMAT, null); + +AvroReaderWithExplicitSchema avroReader = new AvroReaderWithExplicitSchema(fileInputStream, recordSchema, dataSchema); +avroReader.nextAvroRecord(); +} + +@Test(expected = IOException.class) --- End diff -- I assume that comes from this, right? ``` if (dataFileStream != null) { return dataFileStream.hasNext() ? 
dataFileStream.next() : null; } ``` > Allow AvroReader with explicit schema to read files with embedded schema > > > Key: NIFI-5213 > URL: https://issues.apache.org/jira/browse/NIFI-5213 > Project: Apache NiFi > Issue Type: Improvement > Components: Extensions >Reporter: Matt Burgess >Assignee: Matt Burgess >Priority: Minor > > AvroReader allows the choice of schema access strategy from such options as > Use Embedded Schema, Use Schema Name, Use Schema Text, etc. If the incoming > Avro files will have embedded schemas, then Use Embedded Schema is best > practice for the Avro Reader. However it is not intuitive that if the same > schema that is embedded in the file is specified by name (using a schema > registry) or explicitly via Schema Text, that errors can occur. This has been > noticed in QueryRecord for example, and the error is also not intuitive or > descriptive (it is often an ArrayIndexOutOfBoundsException). > To provide a better user experience, it would be an improvement for > AvroReader to be able to successfully process Avro files with embedded > schemas, even when the Schema Access Strategy is not "Use Embedded Schema". > Of course, the explicit schema would have to match the embedded
[jira] [Commented] (NIFI-5213) Allow AvroReader with explicit schema to read files with embedded schema
[ https://issues.apache.org/jira/browse/NIFI-5213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508941#comment-16508941 ]

ASF GitHub Bot commented on NIFI-5213:
--------------------------------------

Github user MikeThomsen commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2718#discussion_r194576777

--- Diff: nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/avro/AvroReaderWithExplicitSchema.java ---

```diff
@@ -17,33 +17,61 @@
 package org.apache.nifi.avro;

+import java.io.ByteArrayInputStream;
+import java.io.ByteArrayOutputStream;
 import java.io.EOFException;
 import java.io.IOException;
 import java.io.InputStream;
+import java.io.SequenceInputStream;
 import org.apache.avro.Schema;
+import org.apache.avro.file.DataFileStream;
 import org.apache.avro.generic.GenericDatumReader;
 import org.apache.avro.generic.GenericRecord;
 import org.apache.avro.io.BinaryDecoder;
 import org.apache.avro.io.DatumReader;
 import org.apache.avro.io.DecoderFactory;
-import org.apache.nifi.schema.access.SchemaNotFoundException;
+import org.apache.commons.io.input.TeeInputStream;
 import org.apache.nifi.serialization.MalformedRecordException;
 import org.apache.nifi.serialization.record.RecordSchema;

 public class AvroReaderWithExplicitSchema extends AvroRecordReader {
     private final InputStream in;
     private final RecordSchema recordSchema;
     private final DatumReader datumReader;
-    private final BinaryDecoder decoder;
+    private BinaryDecoder decoder;
     private GenericRecord genericRecord;
+    private DataFileStream dataFileStream;

-    public AvroReaderWithExplicitSchema(final InputStream in, final RecordSchema recordSchema, final Schema avroSchema) throws IOException, SchemaNotFoundException {
+    public AvroReaderWithExplicitSchema(final InputStream in, final RecordSchema recordSchema, final Schema avroSchema) throws IOException {
         this.in = in;
         this.recordSchema = recordSchema;
-        datumReader = new GenericDatumReader(avroSchema);
-        decoder = DecoderFactory.get().binaryDecoder(in, null);
+        datumReader = new GenericDatumReader<>(avroSchema);
+        ByteArrayOutputStream baos = new ByteArrayOutputStream();
+        TeeInputStream teeInputStream = new TeeInputStream(in, baos);
+        // Try to parse as a DataFileStream, if it works, glue the streams back together and delegate calls to the DataFileStream
+        try {
+            dataFileStream = new DataFileStream<>(teeInputStream, new GenericDatumReader<>());
```
--- End diff --

I don't see where `decoder` is initialized outside of the try block, but it's used in `nextAvroRecord`. Shouldn't you initialize it here so it's guaranteed to be properly initialized when `nextAvroRecord` is called or are you just relying on the first if statement in that method as a null check?

> Allow AvroReader with explicit schema to read files with embedded schema
> ------------------------------------------------------------------------
>
> Key: NIFI-5213
> URL: https://issues.apache.org/jira/browse/NIFI-5213
> Project: Apache NiFi
> Issue Type: Improvement
> Components: Extensions
> Reporter: Matt Burgess
> Assignee: Matt Burgess
> Priority: Minor
>
> AvroReader allows the choice of schema access strategy from such options as
> Use Embedded Schema, Use Schema Name, Use Schema Text, etc. If the incoming
> Avro files will have embedded schemas, then Use Embedded Schema is best
> practice for the Avro Reader. However it is not intuitive that if the same
> schema that is embedded in the file is specified by name (using a schema
> registry) or explicitly via Schema Text, that errors can occur. This has been
> noticed in QueryRecord for example, and the error is also not intuitive or
> descriptive (it is often an ArrayIndexOutOfBoundsException).
> To provide a better user experience, it would be an improvement for
> AvroReader to be able to successfully process Avro files with embedded
> schemas, even when the Schema Access Strategy is not "Use Embedded Schema".
> Of course, the explicit schema would have to match the embedded schema, or an
> error would be reported (and rightfully so).

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[GitHub] nifi pull request #2718: NIFI-5213: Allow AvroReader to process files w embe...
Github user MikeThomsen commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2718#discussion_r194575701

--- Diff: nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/test/java/org/apache/nifi/avro/TestAvroReaderWithExplicitSchema.java ---

```diff
@@ -0,0 +1,62 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.nifi.avro;
+
+import org.apache.avro.Schema;
+import org.apache.nifi.serialization.SimpleRecordSchema;
+import org.apache.nifi.serialization.record.RecordSchema;
+import org.junit.Test;
+
+import java.io.File;
+import java.io.FileInputStream;
+import java.io.IOException;
+
+public class TestAvroReaderWithExplicitSchema {
+
+    @Test
+    public void testAvroExplicitReaderWithSchemalessFile() throws Exception {
+        File avroFileWithEmbeddedSchema = new File("src/test/resources/avro/avro_schemaless.avro");
+        FileInputStream fileInputStream = new FileInputStream(avroFileWithEmbeddedSchema);
+        Schema dataSchema = new Schema.Parser().parse(new File("src/test/resources/avro/avro_schemaless.avsc"));
+        RecordSchema recordSchema = new SimpleRecordSchema(dataSchema.toString(), AvroTypeUtil.AVRO_SCHEMA_FORMAT, null);
+
+        AvroReaderWithExplicitSchema avroReader = new AvroReaderWithExplicitSchema(fileInputStream, recordSchema, dataSchema);
+        avroReader.nextAvroRecord();
+    }
+
+    @Test
+    public void testAvroExplicitReaderWithEmbeddedSchemaFile() throws Exception {
+        File avroFileWithEmbeddedSchema = new File("src/test/resources/avro/avro_embed_schema.avro");
+        FileInputStream fileInputStream = new FileInputStream(avroFileWithEmbeddedSchema);
+        Schema dataSchema = new Schema.Parser().parse(new File("src/test/resources/avro/avro_schemaless.avsc"));
+        RecordSchema recordSchema = new SimpleRecordSchema(dataSchema.toString(), AvroTypeUtil.AVRO_SCHEMA_FORMAT, null);
+
+        AvroReaderWithExplicitSchema avroReader = new AvroReaderWithExplicitSchema(fileInputStream, recordSchema, dataSchema);
+        avroReader.nextAvroRecord();
+    }
+
+    @Test(expected = IOException.class)
```
--- End diff --

I assume that comes from this, right?

```
if (dataFileStream != null) {
    return dataFileStream.hasNext() ? dataFileStream.next() : null;
}
```

---
[GitHub] nifi pull request #2718: NIFI-5213: Allow AvroReader to process files w embe...
Github user MikeThomsen commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2718#discussion_r194575528

--- Diff: nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/test/java/org/apache/nifi/avro/TestAvroReaderWithExplicitSchema.java ---

```diff
@@ -0,0 +1,62 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.nifi.avro;
+
+import org.apache.avro.Schema;
+import org.apache.nifi.serialization.SimpleRecordSchema;
+import org.apache.nifi.serialization.record.RecordSchema;
+import org.junit.Test;
+
+import java.io.File;
+import java.io.FileInputStream;
+import java.io.IOException;
+
+public class TestAvroReaderWithExplicitSchema {
+
+    @Test
+    public void testAvroExplicitReaderWithSchemalessFile() throws Exception {
+        File avroFileWithEmbeddedSchema = new File("src/test/resources/avro/avro_schemaless.avro");
+        FileInputStream fileInputStream = new FileInputStream(avroFileWithEmbeddedSchema);
+        Schema dataSchema = new Schema.Parser().parse(new File("src/test/resources/avro/avro_schemaless.avsc"));
+        RecordSchema recordSchema = new SimpleRecordSchema(dataSchema.toString(), AvroTypeUtil.AVRO_SCHEMA_FORMAT, null);
+
+        AvroReaderWithExplicitSchema avroReader = new AvroReaderWithExplicitSchema(fileInputStream, recordSchema, dataSchema);
+        avroReader.nextAvroRecord();
+    }
+
+    @Test
+    public void testAvroExplicitReaderWithEmbeddedSchemaFile() throws Exception {
+        File avroFileWithEmbeddedSchema = new File("src/test/resources/avro/avro_embed_schema.avro");
+        FileInputStream fileInputStream = new FileInputStream(avroFileWithEmbeddedSchema);
+        Schema dataSchema = new Schema.Parser().parse(new File("src/test/resources/avro/avro_schemaless.avsc"));
+        RecordSchema recordSchema = new SimpleRecordSchema(dataSchema.toString(), AvroTypeUtil.AVRO_SCHEMA_FORMAT, null);
+
+        AvroReaderWithExplicitSchema avroReader = new AvroReaderWithExplicitSchema(fileInputStream, recordSchema, dataSchema);
+        avroReader.nextAvroRecord();
```
--- End diff --

Same deal.

---
[GitHub] nifi pull request #2718: NIFI-5213: Allow AvroReader to process files w embe...
Github user MikeThomsen commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2718#discussion_r194575493

--- Diff: nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/test/java/org/apache/nifi/avro/TestAvroReaderWithExplicitSchema.java ---

```diff
@@ -0,0 +1,62 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.nifi.avro;
+
+import org.apache.avro.Schema;
+import org.apache.nifi.serialization.SimpleRecordSchema;
+import org.apache.nifi.serialization.record.RecordSchema;
+import org.junit.Test;
+
+import java.io.File;
+import java.io.FileInputStream;
+import java.io.IOException;
+
+public class TestAvroReaderWithExplicitSchema {
+
+    @Test
+    public void testAvroExplicitReaderWithSchemalessFile() throws Exception {
+        File avroFileWithEmbeddedSchema = new File("src/test/resources/avro/avro_schemaless.avro");
+        FileInputStream fileInputStream = new FileInputStream(avroFileWithEmbeddedSchema);
+        Schema dataSchema = new Schema.Parser().parse(new File("src/test/resources/avro/avro_schemaless.avsc"));
+        RecordSchema recordSchema = new SimpleRecordSchema(dataSchema.toString(), AvroTypeUtil.AVRO_SCHEMA_FORMAT, null);
+
+        AvroReaderWithExplicitSchema avroReader = new AvroReaderWithExplicitSchema(fileInputStream, recordSchema, dataSchema);
+        avroReader.nextAvroRecord();
```
--- End diff --

As-is, this wouldn't test anything that I can tell with that method because all of the errors are caught in that method. Should have some assertions around it IMO.

---
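The review point above generalizes: a test that only invokes a reader whose method swallows errors and returns null at end of input can never fail. A minimal stdlib-only sketch of the difference (the stand-in `nextRecord` and its iterator input are hypothetical, not NiFi's actual API):

```java
import java.util.Iterator;
import java.util.List;

public class RecordReadAssertions {
    // Stand-in for a record reader whose next() returns null at end of input
    // and does not throw on its own; mirrors the shape being reviewed.
    static String nextRecord(Iterator<String> records) {
        return records.hasNext() ? records.next() : null;
    }

    public static void main(String[] args) {
        List<String> input = List.of("rec1", "rec2");

        // Weak: invocation only -- this "test" cannot fail.
        nextRecord(input.iterator());

        // Stronger: assert on each returned record and on end-of-stream.
        Iterator<String> it = input.iterator();
        if (!"rec1".equals(nextRecord(it))) throw new AssertionError("first record wrong");
        if (!"rec2".equals(nextRecord(it))) throw new AssertionError("second record wrong");
        if (nextRecord(it) != null) throw new AssertionError("expected end of stream");
        System.out.println("all assertions passed");
    }
}
```

In JUnit the stronger form would be `assertEquals`/`assertNull` calls around each `nextAvroRecord()` invocation, which is what the comment suggests adding.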
[GitHub] nifi pull request #2718: NIFI-5213: Allow AvroReader to process files w embe...
Github user MikeThomsen commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2718#discussion_r194576777

--- Diff: nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/avro/AvroReaderWithExplicitSchema.java ---

```diff
@@ -17,33 +17,61 @@
 package org.apache.nifi.avro;

+import java.io.ByteArrayInputStream;
+import java.io.ByteArrayOutputStream;
 import java.io.EOFException;
 import java.io.IOException;
 import java.io.InputStream;
+import java.io.SequenceInputStream;
 import org.apache.avro.Schema;
+import org.apache.avro.file.DataFileStream;
 import org.apache.avro.generic.GenericDatumReader;
 import org.apache.avro.generic.GenericRecord;
 import org.apache.avro.io.BinaryDecoder;
 import org.apache.avro.io.DatumReader;
 import org.apache.avro.io.DecoderFactory;
-import org.apache.nifi.schema.access.SchemaNotFoundException;
+import org.apache.commons.io.input.TeeInputStream;
 import org.apache.nifi.serialization.MalformedRecordException;
 import org.apache.nifi.serialization.record.RecordSchema;

 public class AvroReaderWithExplicitSchema extends AvroRecordReader {
     private final InputStream in;
     private final RecordSchema recordSchema;
     private final DatumReader datumReader;
-    private final BinaryDecoder decoder;
+    private BinaryDecoder decoder;
     private GenericRecord genericRecord;
+    private DataFileStream dataFileStream;

-    public AvroReaderWithExplicitSchema(final InputStream in, final RecordSchema recordSchema, final Schema avroSchema) throws IOException, SchemaNotFoundException {
+    public AvroReaderWithExplicitSchema(final InputStream in, final RecordSchema recordSchema, final Schema avroSchema) throws IOException {
         this.in = in;
         this.recordSchema = recordSchema;
-        datumReader = new GenericDatumReader(avroSchema);
-        decoder = DecoderFactory.get().binaryDecoder(in, null);
+        datumReader = new GenericDatumReader<>(avroSchema);
+        ByteArrayOutputStream baos = new ByteArrayOutputStream();
+        TeeInputStream teeInputStream = new TeeInputStream(in, baos);
+        // Try to parse as a DataFileStream, if it works, glue the streams back together and delegate calls to the DataFileStream
+        try {
+            dataFileStream = new DataFileStream<>(teeInputStream, new GenericDatumReader<>());
```
--- End diff --

I don't see where `decoder` is initialized outside of the try block, but it's used in `nextAvroRecord`. Shouldn't you initialize it here so it's guaranteed to be properly initialized when `nextAvroRecord` is called or are you just relying on the first if statement in that method as a null check?

---
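The null check the reviewer alludes to is the usual dispatch for a two-mode reader: one nullable field both selects the decoding path and guards the other field. A stdlib-only sketch of that pattern (the class and field names here are hypothetical stand-ins, not NiFi's classes):

```java
import java.util.Iterator;
import java.util.List;

// Two-mode reader: the container iterator doubles as the mode flag, so the
// fallback decoder only has to be initialized when the container probe failed.
public class TwoModeReader {
    private final Iterator<String> containerIterator; // non-null only if probing succeeded
    private final Iterator<String> fallbackDecoder;   // set only in the failure branch

    public TwoModeReader(Iterator<String> containerIterator, Iterator<String> fallbackDecoder) {
        this.containerIterator = containerIterator;
        this.fallbackDecoder = fallbackDecoder;
    }

    // Analogue of nextAvroRecord(): the null check routes every call, so the
    // fallback decoder is never touched on the container path.
    public String nextRecord() {
        if (containerIterator != null) {
            return containerIterator.hasNext() ? containerIterator.next() : null;
        }
        return fallbackDecoder.hasNext() ? fallbackDecoder.next() : null;
    }
}
```

Whether that implicit invariant (exactly one of the two fields is initialized) is acceptable, or the decoder should be initialized eagerly, is precisely the judgment call the review comment raises.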
[GitHub] nifi issue #2785: NIFI-5296 - Add EL Support with Variable Registry scope on...
Github user alopresto commented on the issue: https://github.com/apache/nifi/pull/2785 Reviewing... ---
[jira] [Commented] (NIFI-5296) Add EL Support with Variable Registry scope on SSL context service
[ https://issues.apache.org/jira/browse/NIFI-5296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508931#comment-16508931 ] ASF GitHub Bot commented on NIFI-5296: -- Github user alopresto commented on the issue: https://github.com/apache/nifi/pull/2785 Reviewing... > Add EL Support with Variable Registry scope on SSL context service > -- > > Key: NIFI-5296 > URL: https://issues.apache.org/jira/browse/NIFI-5296 > Project: Apache NiFi > Issue Type: Improvement > Components: Extensions >Reporter: Pierre Villard >Assignee: Pierre Villard >Priority: Major > > Add EL support on Truststore and Keystore filename properties with Variable > Registry scope. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (NIFI-5298) Fix typo in FDS README.md
[ https://issues.apache.org/jira/browse/NIFI-5298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508928#comment-16508928 ]

ASF GitHub Bot commented on NIFI-5298:
--------------------------------------

GitHub user alopresto opened a pull request:

    https://github.com/apache/nifi-fds/pull/6

NIFI-5298 Fixed typo in README.md.

Thank you for submitting a contribution to Apache NiFi Flow Design System. In order to streamline the review of the contribution we ask you to ensure the following steps have been taken:

### For all changes:
- [x] Is there a JIRA ticket associated with this PR? Is it referenced in the commit message?
- [x] Does your PR title start with either NIFI-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
- [x] Has your PR been rebased against the latest commit within the target branch (typically master)?
- [x] Is your initial contribution a single, squashed commit?

### For code changes:
- [ ] Have you written or updated unit tests to verify your changes?
- [ ] Have you ensured that a full build and that the full suite of unit tests is executed via npm run clean:install at the root nifi-fds folder?
- [ ] Have you written or updated the Apache NiFi Flow Design System demo application to demonstrate any new functionality, provide examples of usage, and to verify your changes via npm start at the nifi-fds/target folder?
- [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)?
- [ ] If applicable, have you updated the LICENSE file, including the main LICENSE file under nifi-fds?
- [ ] If applicable, have you updated the NOTICE file, including the main NOTICE file found under nifi-fds?

### For documentation related changes:
- [x] Have you ensured that format looks appropriate for the output in which it is rendered?

### Note:
Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/alopresto/nifi-fds NIFI-5298

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/nifi-fds/pull/6.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #6

commit a8f734547fad2fa73015a1d5d1bdd44a5181df68
Author: Andy LoPresto
Date: 2018-06-11T23:07:43Z

    NIFI-5298 Fixed typo in README.md.

> Fix typo in FDS README.md
> -------------------------
>
> Key: NIFI-5298
> URL: https://issues.apache.org/jira/browse/NIFI-5298
> Project: Apache NiFi
> Issue Type: Task
> Components: Documentation Website, FDS
> Affects Versions: 0.1.0
> Reporter: Andy LoPresto
> Assignee: Andy LoPresto
> Priority: Trivial
> Labels: documentation, typo
>
> Fix a typo in the NiFi FDS README.md.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[GitHub] nifi-fds pull request #6: NIFI-5298 Fixed typo in README.md.
GitHub user alopresto opened a pull request:

    https://github.com/apache/nifi-fds/pull/6

NIFI-5298 Fixed typo in README.md.

Thank you for submitting a contribution to Apache NiFi Flow Design System. In order to streamline the review of the contribution we ask you to ensure the following steps have been taken:

### For all changes:
- [x] Is there a JIRA ticket associated with this PR? Is it referenced in the commit message?
- [x] Does your PR title start with either NIFI-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
- [x] Has your PR been rebased against the latest commit within the target branch (typically master)?
- [x] Is your initial contribution a single, squashed commit?

### For code changes:
- [ ] Have you written or updated unit tests to verify your changes?
- [ ] Have you ensured that a full build and that the full suite of unit tests is executed via npm run clean:install at the root nifi-fds folder?
- [ ] Have you written or updated the Apache NiFi Flow Design System demo application to demonstrate any new functionality, provide examples of usage, and to verify your changes via npm start at the nifi-fds/target folder?
- [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)?
- [ ] If applicable, have you updated the LICENSE file, including the main LICENSE file under nifi-fds?
- [ ] If applicable, have you updated the NOTICE file, including the main NOTICE file found under nifi-fds?

### For documentation related changes:
- [x] Have you ensured that format looks appropriate for the output in which it is rendered?

### Note:
Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/alopresto/nifi-fds NIFI-5298

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/nifi-fds/pull/6.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #6

commit a8f734547fad2fa73015a1d5d1bdd44a5181df68
Author: Andy LoPresto
Date: 2018-06-11T23:07:43Z

    NIFI-5298 Fixed typo in README.md.

---
[jira] [Created] (NIFI-5298) Fix typo in FDS README.md
Andy LoPresto created NIFI-5298: --- Summary: Fix typo in FDS README.md Key: NIFI-5298 URL: https://issues.apache.org/jira/browse/NIFI-5298 Project: Apache NiFi Issue Type: Task Components: Documentation Website, FDS Affects Versions: 0.1.0 Reporter: Andy LoPresto Assignee: Andy LoPresto Fix a typo in the NiFi FDS README.md. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (NIFI-5213) Allow AvroReader with explicit schema to read files with embedded schema
[ https://issues.apache.org/jira/browse/NIFI-5213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508920#comment-16508920 ]

ASF GitHub Bot commented on NIFI-5213:
--------------------------------------

Github user MikeThomsen commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2718#discussion_r194572498

--- Diff: nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/avro/AvroReaderWithExplicitSchema.java ---

```diff
@@ -17,33 +17,61 @@
 package org.apache.nifi.avro;

+import java.io.ByteArrayInputStream;
+import java.io.ByteArrayOutputStream;
 import java.io.EOFException;
 import java.io.IOException;
 import java.io.InputStream;
+import java.io.SequenceInputStream;
 import org.apache.avro.Schema;
+import org.apache.avro.file.DataFileStream;
 import org.apache.avro.generic.GenericDatumReader;
 import org.apache.avro.generic.GenericRecord;
 import org.apache.avro.io.BinaryDecoder;
 import org.apache.avro.io.DatumReader;
 import org.apache.avro.io.DecoderFactory;
-import org.apache.nifi.schema.access.SchemaNotFoundException;
+import org.apache.commons.io.input.TeeInputStream;
 import org.apache.nifi.serialization.MalformedRecordException;
 import org.apache.nifi.serialization.record.RecordSchema;

 public class AvroReaderWithExplicitSchema extends AvroRecordReader {
     private final InputStream in;
     private final RecordSchema recordSchema;
     private final DatumReader datumReader;
-    private final BinaryDecoder decoder;
+    private BinaryDecoder decoder;
     private GenericRecord genericRecord;
+    private DataFileStream dataFileStream;

-    public AvroReaderWithExplicitSchema(final InputStream in, final RecordSchema recordSchema, final Schema avroSchema) throws IOException, SchemaNotFoundException {
+    public AvroReaderWithExplicitSchema(final InputStream in, final RecordSchema recordSchema, final Schema avroSchema) throws IOException {
         this.in = in;
         this.recordSchema = recordSchema;
-        datumReader = new GenericDatumReader(avroSchema);
-        decoder = DecoderFactory.get().binaryDecoder(in, null);
+        datumReader = new GenericDatumReader<>(avroSchema);
+        ByteArrayOutputStream baos = new ByteArrayOutputStream();
+        TeeInputStream teeInputStream = new TeeInputStream(in, baos);
+        // Try to parse as a DataFileStream, if it works, glue the streams back together and delegate calls to the DataFileStream
+        try {
+            dataFileStream = new DataFileStream<>(teeInputStream, new GenericDatumReader<>());
+        } catch (IOException ioe) {
+            // Carry on, hopefully a raw Avro file
+            // Need to be able to re-read the bytes read so far, and the InputStream passed in doesn't support reset. Use the TeeInputStream in
+            // conjunction with SequenceInputStream to glue the two streams back together for future reading
+            ByteArrayInputStream bais = new ByteArrayInputStream(baos.toByteArray());
+            SequenceInputStream sis = new SequenceInputStream(bais, in);
+            decoder = DecoderFactory.get().binaryDecoder(sis, null);
+        }
+        if (dataFileStream != null) {
+            // Verify the schemas are the same
+            Schema embeddedSchema = dataFileStream.getSchema();
+            if (!embeddedSchema.equals(avroSchema)) {
+                throw new IOException("Explicit schema does not match embedded schema");
```
--- End diff --

@mattyb149 How does it handle schema evolution in this case? It's possible that the Kafka producer has `Corporate Schema v1` and NiFi is configured with `Corporate Schema v2` and v2 gracefully allows an upgrade from v1 via Avro schema evolution rules. Or am I missing something about that being not really a thing WRT the Record API?

> Allow AvroReader with explicit schema to read files with embedded schema
> ------------------------------------------------------------------------
>
> Key: NIFI-5213
> URL: https://issues.apache.org/jira/browse/NIFI-5213
> Project: Apache NiFi
> Issue Type: Improvement
> Components: Extensions
> Reporter: Matt Burgess
> Assignee: Matt Burgess
> Priority: Minor
>
> AvroReader allows the choice of schema access strategy from such options as
> Use Embedded Schema, Use Schema Name, Use Schema Text, etc. If the incoming
> Avro files will have embedded schemas, then Use Embedded Schema is best
> practice for the Avro Reader. However it is not intuitive that if the same
> schema that is embedded in the file is specified by name (using a schema
> registry) or explicitly via Schema Text, that errors can occur.
[GitHub] nifi pull request #2718: NIFI-5213: Allow AvroReader to process files w embe...
Github user MikeThomsen commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2718#discussion_r194572498

--- Diff: nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/avro/AvroReaderWithExplicitSchema.java ---

```diff
@@ -17,33 +17,61 @@
 package org.apache.nifi.avro;

+import java.io.ByteArrayInputStream;
+import java.io.ByteArrayOutputStream;
 import java.io.EOFException;
 import java.io.IOException;
 import java.io.InputStream;
+import java.io.SequenceInputStream;
 import org.apache.avro.Schema;
+import org.apache.avro.file.DataFileStream;
 import org.apache.avro.generic.GenericDatumReader;
 import org.apache.avro.generic.GenericRecord;
 import org.apache.avro.io.BinaryDecoder;
 import org.apache.avro.io.DatumReader;
 import org.apache.avro.io.DecoderFactory;
-import org.apache.nifi.schema.access.SchemaNotFoundException;
+import org.apache.commons.io.input.TeeInputStream;
 import org.apache.nifi.serialization.MalformedRecordException;
 import org.apache.nifi.serialization.record.RecordSchema;

 public class AvroReaderWithExplicitSchema extends AvroRecordReader {
     private final InputStream in;
     private final RecordSchema recordSchema;
     private final DatumReader datumReader;
-    private final BinaryDecoder decoder;
+    private BinaryDecoder decoder;
     private GenericRecord genericRecord;
+    private DataFileStream dataFileStream;

-    public AvroReaderWithExplicitSchema(final InputStream in, final RecordSchema recordSchema, final Schema avroSchema) throws IOException, SchemaNotFoundException {
+    public AvroReaderWithExplicitSchema(final InputStream in, final RecordSchema recordSchema, final Schema avroSchema) throws IOException {
         this.in = in;
         this.recordSchema = recordSchema;
-        datumReader = new GenericDatumReader(avroSchema);
-        decoder = DecoderFactory.get().binaryDecoder(in, null);
+        datumReader = new GenericDatumReader<>(avroSchema);
+        ByteArrayOutputStream baos = new ByteArrayOutputStream();
+        TeeInputStream teeInputStream = new TeeInputStream(in, baos);
+        // Try to parse as a DataFileStream, if it works, glue the streams back together and delegate calls to the DataFileStream
+        try {
+            dataFileStream = new DataFileStream<>(teeInputStream, new GenericDatumReader<>());
+        } catch (IOException ioe) {
+            // Carry on, hopefully a raw Avro file
+            // Need to be able to re-read the bytes read so far, and the InputStream passed in doesn't support reset. Use the TeeInputStream in
+            // conjunction with SequenceInputStream to glue the two streams back together for future reading
+            ByteArrayInputStream bais = new ByteArrayInputStream(baos.toByteArray());
+            SequenceInputStream sis = new SequenceInputStream(bais, in);
+            decoder = DecoderFactory.get().binaryDecoder(sis, null);
+        }
+        if (dataFileStream != null) {
+            // Verify the schemas are the same
+            Schema embeddedSchema = dataFileStream.getSchema();
+            if (!embeddedSchema.equals(avroSchema)) {
+                throw new IOException("Explicit schema does not match embedded schema");
```
--- End diff --

@mattyb149 How does it handle schema evolution in this case? It's possible that the Kafka producer has `Corporate Schema v1` and NiFi is configured with `Corporate Schema v2` and v2 gracefully allows an upgrade from v1 via Avro schema evolution rules. Or am I missing something about that being not really a thing WRT the Record API?

---
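The tee-and-rejoin trick in the constructor above (probe the stream with one parser; on failure, replay the consumed bytes in front of the remainder for a fallback parser) can be reproduced with only `java.io` classes. A sketch under the assumption that the probe consumes an unknown number of bytes before failing; the hand-rolled `TeeStream` stands in for commons-io's `TeeInputStream`, and the "Obj" magic-header probe is a hypothetical stand-in for `DataFileStream` construction:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.SequenceInputStream;

public class ProbeAndRejoin {
    // Minimal stand-in for commons-io TeeInputStream: copies every byte read
    // from the source into a side buffer.
    static class TeeStream extends FilterInputStream {
        final ByteArrayOutputStream copy = new ByteArrayOutputStream();
        TeeStream(InputStream in) { super(in); }
        @Override public int read() throws IOException {
            int b = super.read();
            if (b != -1) copy.write(b);
            return b;
        }
        @Override public int read(byte[] buf, int off, int len) throws IOException {
            int n = super.read(buf, off, len);
            if (n > 0) copy.write(buf, off, n);
            return n;
        }
    }

    // Probe the stream; if the probe fails, glue the already-consumed bytes
    // back in front of the remaining stream so a fallback parser sees the
    // input from the beginning. Same shape as the AvroReader constructor.
    static InputStream probeThenRejoin(InputStream in) throws IOException {
        TeeStream tee = new TeeStream(in);
        try {
            probeMagicHeader(tee);          // throws if not the container format
            return tee;                     // container mode: keep the probed stream
        } catch (IOException notContainer) {
            ByteArrayInputStream replay = new ByteArrayInputStream(tee.copy.toByteArray());
            return new SequenceInputStream(replay, in);  // fallback sees all bytes again
        }
    }

    // Hypothetical probe: consumes 3 bytes and requires them to be "Obj".
    static void probeMagicHeader(InputStream in) throws IOException {
        byte[] magic = new byte[3];
        int n = in.read(magic, 0, 3);
        if (n != 3 || magic[0] != 'O' || magic[1] != 'b' || magic[2] != 'j') {
            throw new IOException("not a container file");
        }
    }

    // Helper for exercising the pattern on an in-memory stream.
    static String demo(String input) {
        try {
            InputStream rejoined = probeThenRejoin(new ByteArrayInputStream(input.getBytes()));
            return new String(rejoined.readAllBytes());
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }
}
```

On the failure path, `demo("hello")` returns the full input because the replayed prefix and the live remainder are concatenated; on the success path the probed bytes stay consumed, matching how the real constructor hands the stream to `DataFileStream`.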
[jira] [Updated] (NIFI-5297) Add EL Support with Variable Registry scope in ScanAttribute
[ https://issues.apache.org/jira/browse/NIFI-5297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Thomsen updated NIFI-5297: --- Resolution: Fixed Fix Version/s: 1.7.0 Status: Resolved (was: Patch Available) > Add EL Support with Variable Registry scope in ScanAttribute > > > Key: NIFI-5297 > URL: https://issues.apache.org/jira/browse/NIFI-5297 > Project: Apache NiFi > Issue Type: Improvement > Components: Extensions >Reporter: Pierre Villard >Assignee: Pierre Villard >Priority: Trivial > Fix For: 1.7.0 > > > Add EL support with Variable Registry scope for the Dictionary File property > in the ScanAttribute processor. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (NIFI-5297) Add EL Support with Variable Registry scope in ScanAttribute
[ https://issues.apache.org/jira/browse/NIFI-5297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508910#comment-16508910 ] ASF GitHub Bot commented on NIFI-5297: -- Github user asfgit closed the pull request at: https://github.com/apache/nifi/pull/2786 > Add EL Support with Variable Registry scope in ScanAttribute > > > Key: NIFI-5297 > URL: https://issues.apache.org/jira/browse/NIFI-5297 > Project: Apache NiFi > Issue Type: Improvement > Components: Extensions >Reporter: Pierre Villard >Assignee: Pierre Villard >Priority: Trivial > > Add EL support with Variable Registry scope for the Dictionary File property > in the ScanAttribute processor. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (NIFI-5297) Add EL Support with Variable Registry scope in ScanAttribute
[ https://issues.apache.org/jira/browse/NIFI-5297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508909#comment-16508909 ] ASF subversion and git services commented on NIFI-5297: --- Commit ee18ead16c7b0697560e27c63cf1dd06b1c38c4f in nifi's branch refs/heads/master from [~pvillard] [ https://git-wip-us.apache.org/repos/asf?p=nifi.git;h=ee18ead ] NIFI-5297 - EL support in ScanAttribute This closes #2786 Signed-off-by: Mike Thomsen > Add EL Support with Variable Registry scope in ScanAttribute > > > Key: NIFI-5297 > URL: https://issues.apache.org/jira/browse/NIFI-5297 > Project: Apache NiFi > Issue Type: Improvement > Components: Extensions >Reporter: Pierre Villard >Assignee: Pierre Villard >Priority: Trivial > > Add EL support with Variable Registry scope for the Dictionary File property > in the ScanAttribute processor. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (NIFI-5292) Rename existing ElasticSearch client service impl to specify it is for 5.X
[ https://issues.apache.org/jira/browse/NIFI-5292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508807#comment-16508807 ] ASF GitHub Bot commented on NIFI-5292: -- Github user MikeThomsen commented on the issue: https://github.com/apache/nifi/pull/2782 @markap14 can you review? @joewitt can you review the L&N? I took a stab at updating the v6 client's NOTICE, but am not sure if it's right. > Rename existing ElasticSearch client service impl to specify it is for 5.X > -- > > Key: NIFI-5292 > URL: https://issues.apache.org/jira/browse/NIFI-5292 > Project: Apache NiFi > Issue Type: Improvement >Affects Versions: 1.7.0 >Reporter: Mike Thomsen >Assignee: Mike Thomsen >Priority: Major > Labels: Migration > > The current version of the impl is 5.X, but has a generic name that will be > confusing down the road. > Add an ES 6.X client service as well. > > Migration note: Anyone using the existing client service component will have > to create a new one that corresponds to the version of ElasticSearch they are > using. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (NIFI-5292) Rename existing ElasticSearch client service impl to specify it is for 5.X
[ https://issues.apache.org/jira/browse/NIFI-5292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508802#comment-16508802 ] ASF GitHub Bot commented on NIFI-5292: -- Github user MikeThomsen commented on the issue: https://github.com/apache/nifi/pull/2782 Looked over the transitive dependencies and HdrHistogram ([license info; appears public domain](https://github.com/HdrHistogram/HdrHistogram/blob/master/LICENSE.txt)) and SnakeYaml (ASL, unknown copyright) appear to be the only two needed to be added to the NOTICE for the v6 version. > Rename existing ElasticSearch client service impl to specify it is for 5.X > -- > > Key: NIFI-5292 > URL: https://issues.apache.org/jira/browse/NIFI-5292 > Project: Apache NiFi > Issue Type: Improvement >Affects Versions: 1.7.0 >Reporter: Mike Thomsen >Assignee: Mike Thomsen >Priority: Major > Labels: Migration > > The current version of the impl is 5.X, but has a generic name that will be > confusing down the road. > Add an ES 6.X client service as well. > > Migration note: Anyone using the existing client service component will have > to create a new one that corresponds to the version of ElasticSearch they are > using. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (NIFI-5292) Rename existing ElasticSearch client service impl to specify it is for 5.X
[ https://issues.apache.org/jira/browse/NIFI-5292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508716#comment-16508716 ] Mike Thomsen commented on NIFI-5292: [~pvillard] Ok. I added a label and put a description here. Hopefully that'll come up at release time. > Rename existing ElasticSearch client service impl to specify it is for 5.X > -- > > Key: NIFI-5292 > URL: https://issues.apache.org/jira/browse/NIFI-5292 > Project: Apache NiFi > Issue Type: Improvement >Affects Versions: 1.7.0 >Reporter: Mike Thomsen >Assignee: Mike Thomsen >Priority: Major > Labels: Migration > > The current version of the impl is 5.X, but has a generic name that will be > confusing down the road. > Add an ES 6.X client service as well. > > Migration note: Anyone using the existing client service component will have > to create a new one that corresponds to the version of ElasticSearch they are > using. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (NIFI-5292) Rename existing ElasticSearch client service impl to specify it is for 5.X
[ https://issues.apache.org/jira/browse/NIFI-5292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Thomsen updated NIFI-5292: --- Description: The current version of the impl is 5.X, but has a generic name that will be confusing down the road. Add an ES 6.X client service as well. Migration note: Anyone using the existing client service component will have to create a new one that corresponds to the version of ElasticSearch they are using. was: The current version of the impl is 5.X, but has a generic name that will be confusing down the road. Add an ES 6.X client service as well. > Rename existing ElasticSearch client service impl to specify it is for 5.X > -- > > Key: NIFI-5292 > URL: https://issues.apache.org/jira/browse/NIFI-5292 > Project: Apache NiFi > Issue Type: Improvement >Affects Versions: 1.7.0 >Reporter: Mike Thomsen >Assignee: Mike Thomsen >Priority: Major > Labels: Migration > > The current version of the impl is 5.X, but has a generic name that will be > confusing down the road. > Add an ES 6.X client service as well. > > Migration note: Anyone using the existing client service component will have > to create a new one that corresponds to the version of ElasticSearch they are > using. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (NIFI-5292) Rename existing ElasticSearch client service impl to specify it is for 5.X
[ https://issues.apache.org/jira/browse/NIFI-5292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Thomsen updated NIFI-5292: --- Affects Version/s: 1.7.0 > Rename existing ElasticSearch client service impl to specify it is for 5.X > -- > > Key: NIFI-5292 > URL: https://issues.apache.org/jira/browse/NIFI-5292 > Project: Apache NiFi > Issue Type: Improvement >Affects Versions: 1.7.0 >Reporter: Mike Thomsen >Assignee: Mike Thomsen >Priority: Major > Labels: Migration > > The current version of the impl is 5.X, but has a generic name that will be > confusing down the road. > Add an ES 6.X client service as well. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (NIFI-5292) Rename existing ElasticSearch client service impl to specify it is for 5.X
[ https://issues.apache.org/jira/browse/NIFI-5292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Thomsen updated NIFI-5292: --- Labels: Migration (was: ) > Rename existing ElasticSearch client service impl to specify it is for 5.X > -- > > Key: NIFI-5292 > URL: https://issues.apache.org/jira/browse/NIFI-5292 > Project: Apache NiFi > Issue Type: Improvement >Affects Versions: 1.7.0 >Reporter: Mike Thomsen >Assignee: Mike Thomsen >Priority: Major > Labels: Migration > > The current version of the impl is 5.X, but has a generic name that will be > confusing down the road. > Add an ES 6.X client service as well. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (NIFI-5287) LookupRecord should supply flowfile attributes to the lookup service
[ https://issues.apache.org/jira/browse/NIFI-5287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508672#comment-16508672 ] ASF GitHub Bot commented on NIFI-5287: -- Github user MikeThomsen commented on the issue: https://github.com/apache/nifi/pull/2777 @markap14 @ijokarumawak updated based on the last comment. > LookupRecord should supply flowfile attributes to the lookup service > > > Key: NIFI-5287 > URL: https://issues.apache.org/jira/browse/NIFI-5287 > Project: Apache NiFi > Issue Type: Improvement >Reporter: Mike Thomsen >Assignee: Mike Thomsen >Priority: Major > > -LookupRecord should supply the flowfile attributes to the lookup service. It > should be done as follows:- > # -Provide a regular expression to choose which attributes are used.- > # -The chosen attributes should be the foundation of the coordinates map used > for the lookup.- > # -If a configured key collides with a flowfile attribute, it should > override the flowfile attribute in the coordinate map.- > Mark had the right idea: > > I would propose an alternative approach, which would be to add a new method > to the interface that has a default implementation: > {{default Optional<T> lookup(Map<String, Object> coordinates, Map<String, > String> context) throws LookupFailureException \{ return lookup(coordinates); > } }} > Where {{context}} is used for the FlowFile attributes (I'm referring to it as > {{context}} instead of {{attributes}} because there may well be a case where > we want to provide some other value that is not specifically a FlowFile > attribute). Here is why I am suggesting this: > * It provides a clean interface that properly separates the data's > coordinates from FlowFile attributes. > * It prevents any collisions between FlowFile attribute names and > coordinates. 
> * It maintains backward compatibility, and we know that it won't change the > behavior of existing services or processors/components using those services - > even those that may have been implemented by others outside of the Apache > realm. > * If attributes are passed in by a Processor, those attributes will be > ignored anyway unless the Controller Service is specifically updated to make > use of those attributes, such as via Expression Language. In such a case, the > Controller Service can simply be updated at that time to make use of the new > method instead of the existing method. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (NIFI-5287) LookupRecord should supply flowfile attributes to the lookup service
[ https://issues.apache.org/jira/browse/NIFI-5287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Thomsen updated NIFI-5287: --- Description: -LookupRecord should supply the flowfile attributes to the lookup service. It should be done as follows:- # -Provide a regular expression to choose which attributes are used.- # -The chosen attributes should be the foundation of the coordinates map used for the lookup.- # -If a configured key collides with a flowfile attribute, it should override the flowfile attribute in the coordinate map.- Mark had the right idea: I would propose an alternative approach, which would be to add a new method to the interface that has a default implementation: {{default Optional<T> lookup(Map<String, Object> coordinates, Map<String, String> context) throws LookupFailureException \{ return lookup(coordinates); } }} Where {{context}} is used for the FlowFile attributes (I'm referring to it as {{context}} instead of {{attributes}} because there may well be a case where we want to provide some other value that is not specifically a FlowFile attribute). Here is why I am suggesting this: * It provides a clean interface that properly separates the data's coordinates from FlowFile attributes. * It prevents any collisions between FlowFile attribute names and coordinates. * It maintains backward compatibility, and we know that it won't change the behavior of existing services or processors/components using those services - even those that may have been implemented by others outside of the Apache realm. * If attributes are passed in by a Processor, those attributes will be ignored anyway unless the Controller Service is specifically updated to make use of those attributes, such as via Expression Language. In such a case, the Controller Service can simply be updated at that time to make use of the new method instead of the existing method. was: LookupRecord should supply the flowfile attributes to the lookup service. 
It should be done as follows: # Provide a regular expression to choose which attributes are used. # The chosen attributes should be the foundation of the coordinates map used for the lookup. # If a configured key collides with a flowfile attribute, it should override the flowfile attribute in the coordinate map. > LookupRecord should supply flowfile attributes to the lookup service > > > Key: NIFI-5287 > URL: https://issues.apache.org/jira/browse/NIFI-5287 > Project: Apache NiFi > Issue Type: Improvement >Reporter: Mike Thomsen >Assignee: Mike Thomsen >Priority: Major > > -LookupRecord should supply the flowfile attributes to the lookup service. It > should be done as follows:- > # -Provide a regular expression to choose which attributes are used.- > # -The chosen attributes should be the foundation of the coordinates map used > for the lookup.- > # -If a configured key collides with a flowfile attribute, it should > override the flowfile attribute in the coordinate map.- > Mark had the right idea: > > I would propose an alternative approach, which would be to add a new method > to the interface that has a default implementation: > {{default Optional<T> lookup(Map<String, Object> coordinates, Map<String, > String> context) throws LookupFailureException \{ return lookup(coordinates); > } }} > Where {{context}} is used for the FlowFile attributes (I'm referring to it as > {{context}} instead of {{attributes}} because there may well be a case where > we want to provide some other value that is not specifically a FlowFile > attribute). Here is why I am suggesting this: > * It provides a clean interface that properly separates the data's > coordinates from FlowFile attributes. > * It prevents any collisions between FlowFile attribute names and > coordinates. > * It maintains backward compatibility, and we know that it won't change the > behavior of existing services or processors/components using those services - > even those that may have been implemented by others outside of the Apache > realm. 
> * If attributes are passed in by a Processor, those attributes will be > ignored anyway unless the Controller Service is specifically updated to make > use of those attributes, such as via Expression Language. In such a case, the > Controller Service can simply be updated at that time to make use of the new > method instead of the existing method. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
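The backward-compatible default method quoted above can be sketched as a small self-contained Java example. This is a simplified illustration, not the actual NiFi `LookupService` interface: the checked `LookupFailureException` is omitted, and `StaticLookupService` and its values are hypothetical.

```java
import java.util.Map;
import java.util.Optional;

// Simplified sketch of the proposal: a new overload with a default
// implementation that delegates to the existing method, so services
// written before the change keep compiling and behaving identically.
interface LookupService<T> {
    Optional<T> lookup(Map<String, Object> coordinates);

    // New method: "context" carries FlowFile attributes (or other values).
    // The default implementation ignores it, preserving backward compatibility.
    default Optional<T> lookup(Map<String, Object> coordinates,
                               Map<String, String> context) {
        return lookup(coordinates);
    }
}

// Hypothetical legacy service that only knows the original method.
class StaticLookupService implements LookupService<String> {
    @Override
    public Optional<String> lookup(Map<String, Object> coordinates) {
        return "id".equals(coordinates.get("key"))
                ? Optional.of("value-1")
                : Optional.empty();
    }
}
```

A processor such as LookupRecord can then always call the two-argument form; the attributes are only acted upon by services that explicitly override it.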
[jira] [Commented] (NIFI-5292) Rename existing ElasticSearch client service impl to specify it is for 5.X
[ https://issues.apache.org/jira/browse/NIFI-5292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508641#comment-16508641 ] Pierre Villard commented on NIFI-5292: -- [~mike.thomsen] - I believe the 'migration' label has already been used in the past to tag a JIRA specifically requiring actions during the RM process. However, I don't think this has been formalized. It'd be nice to add a paragraph about that in the release guide ([https://nifi.apache.org/release-guide.html]). > Rename existing ElasticSearch client service impl to specify it is for 5.X > -- > > Key: NIFI-5292 > URL: https://issues.apache.org/jira/browse/NIFI-5292 > Project: Apache NiFi > Issue Type: Improvement >Reporter: Mike Thomsen >Assignee: Mike Thomsen >Priority: Major > > The current version of the impl is 5.X, but has a generic name that will be > confusing down the road. > Add an ES 6.X client service as well. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (NIFI-5297) Add EL Support with Variable Registry scope in ScanAttribute
[ https://issues.apache.org/jira/browse/NIFI-5297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508622#comment-16508622 ] ASF GitHub Bot commented on NIFI-5297: -- GitHub user pvillard31 opened a pull request: https://github.com/apache/nifi/pull/2786 NIFI-5297 - EL support in ScanAttribute Thank you for submitting a contribution to Apache NiFi. In order to streamline the review of the contribution we ask you to ensure the following steps have been taken: ### For all changes: - [ ] Is there a JIRA ticket associated with this PR? Is it referenced in the commit message? - [ ] Does your PR title start with NIFI-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character. - [ ] Has your PR been rebased against the latest commit within the target branch (typically master)? - [ ] Is your initial contribution a single, squashed commit? ### For code changes: - [ ] Have you ensured that the full suite of tests is executed via mvn -Pcontrib-check clean install at the root nifi folder? - [ ] Have you written or updated unit tests to verify your changes? - [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)? - [ ] If applicable, have you updated the LICENSE file, including the main LICENSE file under nifi-assembly? - [ ] If applicable, have you updated the NOTICE file, including the main NOTICE file found under nifi-assembly? - [ ] If adding new Properties, have you added .displayName in addition to .name (programmatic access) for each of the new properties? ### For documentation related changes: - [ ] Have you ensured that format looks appropriate for the output in which it is rendered? ### Note: Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible. 
You can merge this pull request into a Git repository by running: $ git pull https://github.com/pvillard31/nifi NIFI-5297 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/nifi/pull/2786.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2786 commit 589e631ec4f038b0cd4919305671a25b0ba436c0 Author: Pierre Villard Date: 2018-06-11T19:39:19Z NIFI-5297 - EL support in ScanAttribute > Add EL Support with Variable Registry scope in ScanAttribute > > > Key: NIFI-5297 > URL: https://issues.apache.org/jira/browse/NIFI-5297 > Project: Apache NiFi > Issue Type: Improvement > Components: Extensions >Reporter: Pierre Villard >Assignee: Pierre Villard >Priority: Trivial > > Add EL support with Variable Registry scope for the Dictionary File property > in the ScanAttribute processor. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (NIFI-5297) Add EL Support with Variable Registry scope in ScanAttribute
[ https://issues.apache.org/jira/browse/NIFI-5297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pierre Villard updated NIFI-5297: - Status: Patch Available (was: Open) > Add EL Support with Variable Registry scope in ScanAttribute > > > Key: NIFI-5297 > URL: https://issues.apache.org/jira/browse/NIFI-5297 > Project: Apache NiFi > Issue Type: Improvement > Components: Extensions >Reporter: Pierre Villard >Assignee: Pierre Villard >Priority: Trivial > > Add EL support with Variable Registry scope for the Dictionary File property > in the ScanAttribute processor. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (NIFI-5292) Rename existing ElasticSearch client service impl to specify it is for 5.X
[ https://issues.apache.org/jira/browse/NIFI-5292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508615#comment-16508615 ] Mike Thomsen commented on NIFI-5292: [~pvillard] is there an established way of doing that that will be easy to pick up in the release notes and migration guide? > Rename existing ElasticSearch client service impl to specify it is for 5.X > -- > > Key: NIFI-5292 > URL: https://issues.apache.org/jira/browse/NIFI-5292 > Project: Apache NiFi > Issue Type: Improvement >Reporter: Mike Thomsen >Assignee: Mike Thomsen >Priority: Major > > The current version of the impl is 5.X, but has a generic name that will be > confusing down the road. > Add an ES 6.X client service as well. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (NIFI-5297) Add EL Support with Variable Registry scope in ScanAttribute
Pierre Villard created NIFI-5297: Summary: Add EL Support with Variable Registry scope in ScanAttribute Key: NIFI-5297 URL: https://issues.apache.org/jira/browse/NIFI-5297 Project: Apache NiFi Issue Type: Improvement Components: Extensions Reporter: Pierre Villard Assignee: Pierre Villard Add EL support with Variable Registry scope for the Dictionary File property in the ScanAttribute processor. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
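In NiFi terms, "Variable Registry scope" means a property's ${...} references are resolved only against the environment/variable registry, never against per-FlowFile attributes. A minimal sketch of that resolution behavior follows; the class, method names, and registry contents are hypothetical illustrations, not NiFi's actual Expression Language engine.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical resolver: substitutes ${name} using only the variable
// registry. FlowFile attributes are deliberately not consulted, which is
// what VARIABLE_REGISTRY scope guarantees for a property like
// ScanAttribute's Dictionary File.
class VariableRegistryResolver {
    private static final Pattern EL_REF = Pattern.compile("\\$\\{([^}]+)}");
    private final Map<String, String> registry;

    VariableRegistryResolver(Map<String, String> registry) {
        this.registry = registry;
    }

    String resolve(String propertyValue) {
        Matcher m = EL_REF.matcher(propertyValue);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            // Undefined variables resolve to the empty string.
            String value = registry.getOrDefault(m.group(1), "");
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

With a registry entry such as dict.dir=/opt/dicts, a Dictionary File value of ${dict.dir}/terms.txt would resolve to /opt/dicts/terms.txt regardless of the FlowFile being processed.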
[jira] [Commented] (NIFI-5296) Add EL Support with Variable Registry scope on SSL context service
[ https://issues.apache.org/jira/browse/NIFI-5296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508594#comment-16508594 ] ASF GitHub Bot commented on NIFI-5296: -- Github user pvillard31 commented on the issue: https://github.com/apache/nifi/pull/2785 @alopresto - could you have a look? I want to be sure I'm not doing something that would not be allowed for some reason (I removed unit tests specifically checking that EL was not allowed, I hope I'm not missing a point here). > Add EL Support with Variable Registry scope on SSL context service > -- > > Key: NIFI-5296 > URL: https://issues.apache.org/jira/browse/NIFI-5296 > Project: Apache NiFi > Issue Type: Improvement > Components: Extensions >Reporter: Pierre Villard >Assignee: Pierre Villard >Priority: Major > > Add EL support on Truststore and Keystore filename properties with Variable > Registry scope. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (NIFI-5296) Add EL Support with Variable Registry scope on SSL context service
[ https://issues.apache.org/jira/browse/NIFI-5296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pierre Villard updated NIFI-5296: - Status: Patch Available (was: Open) > Add EL Support with Variable Registry scope on SSL context service > -- > > Key: NIFI-5296 > URL: https://issues.apache.org/jira/browse/NIFI-5296 > Project: Apache NiFi > Issue Type: Improvement > Components: Extensions >Reporter: Pierre Villard >Assignee: Pierre Villard >Priority: Major > > Add EL support on Truststore and Keystore filename properties with Variable > Registry scope. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (NIFI-5296) Add EL Support with Variable Registry scope on SSL context service
[ https://issues.apache.org/jira/browse/NIFI-5296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508591#comment-16508591 ] ASF GitHub Bot commented on NIFI-5296: -- GitHub user pvillard31 opened a pull request: https://github.com/apache/nifi/pull/2785 NIFI-5296 - Add EL Support with Variable Registry scope on SSL contex… …t service Thank you for submitting a contribution to Apache NiFi. In order to streamline the review of the contribution we ask you to ensure the following steps have been taken: ### For all changes: - [ ] Is there a JIRA ticket associated with this PR? Is it referenced in the commit message? - [ ] Does your PR title start with NIFI-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character. - [ ] Has your PR been rebased against the latest commit within the target branch (typically master)? - [ ] Is your initial contribution a single, squashed commit? ### For code changes: - [ ] Have you ensured that the full suite of tests is executed via mvn -Pcontrib-check clean install at the root nifi folder? - [ ] Have you written or updated unit tests to verify your changes? - [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)? - [ ] If applicable, have you updated the LICENSE file, including the main LICENSE file under nifi-assembly? - [ ] If applicable, have you updated the NOTICE file, including the main NOTICE file found under nifi-assembly? - [ ] If adding new Properties, have you added .displayName in addition to .name (programmatic access) for each of the new properties? ### For documentation related changes: - [ ] Have you ensured that format looks appropriate for the output in which it is rendered? ### Note: Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible. 
You can merge this pull request into a Git repository by running: $ git pull https://github.com/pvillard31/nifi NIFI-5296 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/nifi/pull/2785.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2785 commit c38e53f295d04055351dcb76f539e1338b229c30 Author: Pierre Villard Date: 2018-06-11T19:18:40Z NIFI-5296 - Add EL Support with Variable Registry scope on SSL context service > Add EL Support with Variable Registry scope on SSL context service > -- > > Key: NIFI-5296 > URL: https://issues.apache.org/jira/browse/NIFI-5296 > Project: Apache NiFi > Issue Type: Improvement > Components: Extensions >Reporter: Pierre Villard >Assignee: Pierre Villard >Priority: Major > > Add EL support on Truststore and Keystore filename properties with Variable > Registry scope. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (NIFI-4907) Provenance authorization refactoring
[ https://issues.apache.org/jira/browse/NIFI-4907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508555#comment-16508555 ]

ASF GitHub Bot commented on NIFI-4907:
--

Github user markobean commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2703#discussion_r194513579

    --- Diff: nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-framework-core/src/main/java/org/apache/nifi/controller/FlowController.java ---
    @@ -4919,6 +4925,22 @@ private void updateRemoteProcessGroups() {
             return new ArrayList<>(provenanceRepository.getEvents(firstEventId, maxRecords));
         }
    +    public AuthorizationResult checkConnectableAuthorization(final String componentId) {
    --- End diff --

    Correct. This was moved to ControllerFacade.java. I will remove it from FlowController.java.

> Provenance authorization refactoring
> ------------------------------------
>
>                 Key: NIFI-4907
>                 URL: https://issues.apache.org/jira/browse/NIFI-4907
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Core Framework
>    Affects Versions: 1.5.0
>            Reporter: Mark Bean
>            Assignee: Mark Bean
>            Priority: Major
>
> Currently, the 'view the data' component policy is too tightly coupled with Provenance queries. The 'query provenance' policy should be the only policy required for viewing Provenance query results. Both 'view the component' and 'view the data' policies should be used to refine the appropriate visibility of event details - but not the event itself.
>
> 1) Component Visibility
> The authorization of Provenance events is inconsistent with the behavior of the graph. For example, if a user does not have the 'view the component' policy, the graph shows this component as a "black box" (no details such as name, UUID, etc.). However, when querying Provenance, this component will show up including the Component Type and the Component Name. This is in effect a violation of the policy. These component details should be obscured in the displayed Provenance event if the user does not have the appropriate 'view the component' policy.
>
> 2) Data Visibility
> For a Provenance query, all events should be visible as long as the user performing the query belongs to the 'query provenance' global policy. As mentioned above, some information about the component may be obscured depending on the 'view the component' policy, but the event itself should be visible. Additionally, details of the event (clicking the View Details "i" icon) should only be accessible if the user belongs to the 'view the data' policy for the affected component. If the user is not in the appropriate 'view the data' policy, a popup warning should be displayed indicating the reason details are not visible, with more specific detail than the current "Contact the system administrator".
>
> 3) Lineage Graphs
> As with the Provenance table view recommendation above, the lineage graph should display all events. Currently, if the lineage graph includes an event belonging to a component for which the user does not have 'view the data', it is shown on the graph as "UNKNOWN". As with Data Visibility mentioned above, the graph should indicate the event type as long as the user is in the 'view the component' policy. Subsequent "View Details" on the event should only be visible if the user is in the 'view the data' policy.
>
> In summary, for Provenance query results and lineage graphs, all events should be shown. Component Name and Component Type information should be conditionally visible depending on the corresponding 'view the component' component policy. Event details including Provenance event type and FlowFile information should be conditionally available depending on the corresponding 'view the data' component policy. Inability to display event details should provide feedback to the user indicating the reason.
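The redaction behavior debated in this thread can be sketched in plain Java. This is an illustrative stand-in, not the actual NiFi code (the real logic lives in `ControllerFacade.createProvenanceEventDto`, and the map-based DTO here is hypothetical): when the user lacks the 'view the component' policy, the event stays visible but its component name and type are obscured.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the 'view the component' redaction discussed above.
// All class and field names are illustrative, not the actual NiFi classes.
public class ComponentRedactionSketch {
    enum Result { Approved, Denied }

    // Obscure component details, but keep the event itself visible.
    static Map<String, String> redact(Map<String, String> eventDto, Result componentAuth) {
        Map<String, String> out = new HashMap<>(eventDto);
        if (componentAuth == Result.Denied) {
            out.put("componentName", out.get("componentId")); // show only the UUID
            out.put("componentType", "Processor");            // generic placeholder type
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> dto = new HashMap<>();
        dto.put("componentId", "1234-abcd");
        dto.put("componentName", "FetchSFTP");
        dto.put("componentType", "FetchSFTP");
        System.out.println(redact(dto, Result.Denied).get("componentName"));
    }
}
```

Note this mirrors the diff's open question ("is this always a Processor?"): the placeholder type is a guess unless the authorizer also reports the component kind.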
[jira] [Commented] (NIFI-4907) Provenance authorization refactoring
[ https://issues.apache.org/jira/browse/NIFI-4907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508548#comment-16508548 ]

ASF GitHub Bot commented on NIFI-4907:
--

Github user mcgilman commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2703#discussion_r194512038

    --- Diff: nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-web-api/src/main/java/org/apache/nifi/web/controller/ControllerFacade.java ---
    @@ -1389,104 +1420,119 @@ private ProvenanceEventDTO createProvenanceEventDto(final ProvenanceEventRecord
             // sets the component details if it can find the component still in the flow
             setComponentDetails(dto);
    -        // only include all details if not summarizing
    -        if (!summarize) {
    -            // convert the attributes
    -            final Comparator<AttributeDTO> attributeComparator = new Comparator<AttributeDTO>() {
    -                @Override
    -                public int compare(AttributeDTO a1, AttributeDTO a2) {
    -                    return Collator.getInstance(Locale.US).compare(a1.getName(), a2.getName());
    -                }
    -            };
    +        //try {
    +        //AuthorizationResult result = flowController.checkConnectableAuthorization(event.getComponentId());
    +        AuthorizationResult result = checkConnectableAuthorization(event.getComponentId());
    +        if (Result.Denied.equals(result.getResult())) {
    +            dto.setComponentType("Processor"); // is this always a Processor?
    +            dto.setComponentName(dto.getComponentId());
    +            dto.setEventType("UNKNOWN");
    --- End diff --

    Yes, I agree. The event type should be controlled by the new provenance event policy. It is not controlled by the component policy that protects the component name and component type.
[jira] [Commented] (NIFI-4907) Provenance authorization refactoring
[ https://issues.apache.org/jira/browse/NIFI-4907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508544#comment-16508544 ]

ASF GitHub Bot commented on NIFI-4907:
--

Github user markobean commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2703#discussion_r194510876

    --- Diff: nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-web-api/src/main/java/org/apache/nifi/web/controller/ControllerFacade.java ---
    @@ -1389,104 +1420,119 @@ private ProvenanceEventDTO createProvenanceEventDto(final ProvenanceEventRecord
             // sets the component details if it can find the component still in the flow
             setComponentDetails(dto);
    -        // only include all details if not summarizing
    -        if (!summarize) {
    -            // convert the attributes
    -            final Comparator<AttributeDTO> attributeComparator = new Comparator<AttributeDTO>() {
    -                @Override
    -                public int compare(AttributeDTO a1, AttributeDTO a2) {
    -                    return Collator.getInstance(Locale.US).compare(a1.getName(), a2.getName());
    -                }
    -            };
    +        //try {
    +        //AuthorizationResult result = flowController.checkConnectableAuthorization(event.getComponentId());
    +        AuthorizationResult result = checkConnectableAuthorization(event.getComponentId());
    +        if (Result.Denied.equals(result.getResult())) {
    +            dto.setComponentType("Processor"); // is this always a Processor?
    +            dto.setComponentName(dto.getComponentId());
    +            dto.setEventType("UNKNOWN");
    --- End diff --

    If we choose to _not_ redact event type, that makes life easier. Currently, it displays "UNKNOWN" in the table (when 'view provenance' is enabled and 'view the component' is not). But the event type IS displayed in the lineage graph. We need to get to consistency one way or the other on this. I'm leaning towards allowing the event type info to be visible, since this is a characteristic of provenance (i.e. 'view provenance') and not a characteristic of 'view the component'.
[jira] [Created] (NIFI-5296) Add EL Support with Variable Registry scope on SSL context service
Pierre Villard created NIFI-5296:
------------------------------------

             Summary: Add EL Support with Variable Registry scope on SSL context service
                 Key: NIFI-5296
                 URL: https://issues.apache.org/jira/browse/NIFI-5296
             Project: Apache NiFi
          Issue Type: Improvement
          Components: Extensions
            Reporter: Pierre Villard
            Assignee: Pierre Villard

Add EL support on Truststore and Keystore filename properties with Variable Registry scope.
[jira] [Commented] (NIFI-5292) Rename existing ElasticSearch client service impl to specify it is for 5.X
[ https://issues.apache.org/jira/browse/NIFI-5292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508523#comment-16508523 ]

Pierre Villard commented on NIFI-5292:
--

Can we add a note somewhere (or add a label to this JIRA) to mention this change in the release notes / upgrade path once 1.7.0 is released?

> Rename existing ElasticSearch client service impl to specify it is for 5.X
> --------------------------------------------------------------------------
>
>                 Key: NIFI-5292
>                 URL: https://issues.apache.org/jira/browse/NIFI-5292
>             Project: Apache NiFi
>          Issue Type: Improvement
>            Reporter: Mike Thomsen
>            Assignee: Mike Thomsen
>            Priority: Major
>
> The current version of the impl is 5.X, but has a generic name that will be confusing down the road.
> Add an ES 6.X client service as well.
[jira] [Commented] (NIFI-4907) Provenance authorization refactoring
[ https://issues.apache.org/jira/browse/NIFI-4907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508510#comment-16508510 ]

ASF GitHub Bot commented on NIFI-4907:
--

Github user mcgilman commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2703#discussion_r194499578

    --- Diff: nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-web-api/src/main/java/org/apache/nifi/web/controller/ControllerFacade.java ---
    @@ -1389,104 +1420,119 @@ private ProvenanceEventDTO createProvenanceEventDto(final ProvenanceEventRecord
             // sets the component details if it can find the component still in the flow
             setComponentDetails(dto);
    -        // only include all details if not summarizing
    -        if (!summarize) {
    -            // convert the attributes
    -            final Comparator<AttributeDTO> attributeComparator = new Comparator<AttributeDTO>() {
    -                @Override
    -                public int compare(AttributeDTO a1, AttributeDTO a2) {
    -                    return Collator.getInstance(Locale.US).compare(a1.getName(), a2.getName());
    -                }
    -            };
    +        //try {
    +        //AuthorizationResult result = flowController.checkConnectableAuthorization(event.getComponentId());
    +        AuthorizationResult result = checkConnectableAuthorization(event.getComponentId());
    +        if (Result.Denied.equals(result.getResult())) {
    +            dto.setComponentType("Processor"); // is this always a Processor?
    +            dto.setComponentName(dto.getComponentId());
    +            dto.setEventType("UNKNOWN");
    +        }
    -        final SortedSet<AttributeDTO> attributes = new TreeSet<>(attributeComparator);
    +        //authorizeData(event);
    +        final AuthorizationResult dataResult = checkAuthorizationForData(event); //(authorizer, RequestAction.READ, user, event.getAttributes());
    -        final Map<String, String> updatedAttrs = event.getUpdatedAttributes();
    -        final Map<String, String> previousAttrs = event.getPreviousAttributes();
    +        // only include all details if not summarizing and approved
    +        if (!summarize && Result.Approved.equals(dataResult.getResult())) {
    --- End diff --

    If the user is not authorized for the data of a component, we should still be able to return a non-summary. In this case, we should just leave out any of the data fields in the ProvenanceEventDTO. I would consider these fields data fields, as they are associated with either attributes, content, or replay (all of which require data policies to execute).

    ```
    private Collection<AttributeDTO> attributes;
    private Boolean contentEqual;
    private Boolean inputContentAvailable;
    private String inputContentClaimSection;
    private String inputContentClaimContainer;
    private String inputContentClaimIdentifier;
    private Long inputContentClaimOffset;
    private String inputContentClaimFileSize;
    private Long inputContentClaimFileSizeBytes;
    private Boolean outputContentAvailable;
    private String outputContentClaimSection;
    private String outputContentClaimContainer;
    private String outputContentClaimIdentifier;
    private Long outputContentClaimOffset;
    private String outputContentClaimFileSize;
    private Long outputContentClaimFileSizeBytes;
    private Boolean replayAvailable;
    private String replayExplanation;
    private String sourceConnectionIdentifier;
    ```
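A plain-Java sketch of the field-level rule mcgilman describes (the map-based DTO and field names here are illustrative stand-ins, not the actual ProvenanceEventDTO): every response keeps the non-data fields, and the data fields are populated only when the 'view the data' check is approved.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: populate "data" fields only when the data policy is approved.
// Field names echo the DTO fields listed above, but this is not NiFi code.
public class DataFieldSketch {
    enum Result { Approved, Denied }

    static Map<String, Object> buildDto(Result dataAuth) {
        Map<String, Object> dto = new HashMap<>();
        // Non-data fields: visible to anyone with 'query provenance'.
        dto.put("eventType", "SEND");
        dto.put("eventTime", 1528700000000L);
        if (dataAuth == Result.Approved) {
            // Data fields: attributes, content claims, replay info.
            dto.put("attributes", Map.of("filename", "test.txt"));
            dto.put("replayAvailable", Boolean.TRUE);
        }
        return dto;
    }
}
```

The design point is that denial degrades the event rather than hiding it: the row still appears in the table, just without attribute or content details.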
[jira] [Commented] (NIFI-4907) Provenance authorization refactoring
[ https://issues.apache.org/jira/browse/NIFI-4907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508505#comment-16508505 ]

ASF GitHub Bot commented on NIFI-4907:
--

Github user mcgilman commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2703#discussion_r194495379

    --- Diff: nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-framework-core/src/main/java/org/apache/nifi/controller/FlowController.java ---
    @@ -4919,6 +4925,22 @@ private void updateRemoteProcessGroups() {
             return new ArrayList<>(provenanceRepository.getEvents(firstEventId, maxRecords));
         }
    +    public AuthorizationResult checkConnectableAuthorization(final String componentId) {
    --- End diff --

    I don't believe this is called.
[jira] [Commented] (NIFI-4907) Provenance authorization refactoring
[ https://issues.apache.org/jira/browse/NIFI-4907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508507#comment-16508507 ]

ASF GitHub Bot commented on NIFI-4907:
--

Github user mcgilman commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2703#discussion_r194498155

    --- Diff: nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-web-api/src/main/java/org/apache/nifi/web/controller/ControllerFacade.java ---
    @@ -1389,104 +1420,119 @@ private ProvenanceEventDTO createProvenanceEventDto(final ProvenanceEventRecord
             // sets the component details if it can find the component still in the flow
             setComponentDetails(dto);
    -        // only include all details if not summarizing
    -        if (!summarize) {
    -            // convert the attributes
    -            final Comparator<AttributeDTO> attributeComparator = new Comparator<AttributeDTO>() {
    -                @Override
    -                public int compare(AttributeDTO a1, AttributeDTO a2) {
    -                    return Collator.getInstance(Locale.US).compare(a1.getName(), a2.getName());
    -                }
    -            };
    +        //try {
    +        //AuthorizationResult result = flowController.checkConnectableAuthorization(event.getComponentId());
    +        AuthorizationResult result = checkConnectableAuthorization(event.getComponentId());
    +        if (Result.Denied.equals(result.getResult())) {
    +            dto.setComponentType("Processor"); // is this always a Processor?
    +            dto.setComponentName(dto.getComponentId());
    +            dto.setEventType("UNKNOWN");
    +        }
    -        final SortedSet<AttributeDTO> attributes = new TreeSet<>(attributeComparator);
    +        //authorizeData(event);
    +        final AuthorizationResult dataResult = checkAuthorizationForData(event); //(authorizer, RequestAction.READ, user, event.getAttributes());
    --- End diff --

    We only need to authorize for the data if the event is a non-summary. For instance, when we're pulling back 1000 summaries to load the provenance table, we don't need to check any data policies.
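The optimization mcgilman suggests — skip the data-policy check entirely when returning summaries — could be sketched like this (hypothetical names throughout; `checkDataPolicy` stands in for the per-event `checkAuthorizationForData` call, which is assumed to have a nontrivial cost):

```java
// Sketch: consult the 'view the data' policy only for non-summary requests,
// so building a 1000-row provenance table performs no data-policy checks.
public class SummaryShortCircuit {
    enum Result { Approved, Denied }

    static int checksPerformed = 0; // instrumentation to show the short-circuit

    // Hypothetical stand-in for checkAuthorizationForData(event).
    static Result checkDataPolicy(String componentId) {
        checksPerformed++;
        return Result.Denied;
    }

    static boolean includeDetails(boolean summarize, String componentId) {
        if (summarize) {
            return false; // summaries never carry data fields, so no check is needed
        }
        return checkDataPolicy(componentId) == Result.Approved;
    }
}
```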
[jira] [Commented] (NIFI-4907) Provenance authorization refactoring
[ https://issues.apache.org/jira/browse/NIFI-4907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508508#comment-16508508 ]

ASF GitHub Bot commented on NIFI-4907:
--

Github user mcgilman commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2703#discussion_r194496260

    --- Diff: nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-web-api/src/main/java/org/apache/nifi/web/controller/ControllerFacade.java ---
    @@ -1389,104 +1420,119 @@ private ProvenanceEventDTO createProvenanceEventDto(final ProvenanceEventRecord
             // sets the component details if it can find the component still in the flow
             setComponentDetails(dto);

    -        // only include all details if not summarizing
    -        if (!summarize) {
    -            // convert the attributes
    -            final Comparator<AttributeDTO> attributeComparator = new Comparator<AttributeDTO>() {
    -                @Override
    -                public int compare(AttributeDTO a1, AttributeDTO a2) {
    -                    return Collator.getInstance(Locale.US).compare(a1.getName(), a2.getName());
    -                }
    -            };
    +        //try {
    +        //AuthorizationResult result = flowController.checkConnectableAuthorization(event.getComponentId());
    +        AuthorizationResult result = checkConnectableAuthorization(event.getComponentId());
    +        if (Result.Denied.equals(result.getResult())) {
    +            dto.setComponentType("Processor"); // is this always a Processor?
    +            dto.setComponentName(dto.getComponentId());
    +            dto.setEventType("UNKNOWN");
    --- End diff --

    Do you think that we need to redact the event type when the user does not have permissions to the component policy? I would have considered this field under the new provenance event policy.

> Provenance authorization refactoring
>
>                 Key: NIFI-4907
>                 URL: https://issues.apache.org/jira/browse/NIFI-4907
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Core Framework
>    Affects Versions: 1.5.0
>            Reporter: Mark Bean
>            Assignee: Mark Bean
>            Priority: Major
>
> Currently, the 'view the data' component policy is too tightly coupled with
> Provenance queries. The 'query provenance' policy should be the only policy
> required for viewing Provenance query results. Both 'view the component' and
> 'view the data' policies should be used to refine the appropriate visibility
> of event details - but not the event itself.
> 1) Component Visibility
> The authorization of Provenance events is inconsistent with the behavior of
> the graph. For example, if a user does not have 'view the component' policy,
> the graph shows this component as a "black box" (no details such as name,
> UUID, etc.) However, when querying Provenance, this component will show up
> including the Component Type and the Component Name. This is in effect a
> violation of the policy. These component details should be obscured in the
> Provenance event displayed if user does not have the appropriate 'view the
> component' policy.
> 2) Data Visibility
> For a Provenance query, all events should be visible as long as the user
> performing the query belongs to the 'query provenance' global policy. As
> mentioned above, some information about the component may be obscured
> depending on 'view the component' policy, but the event itself should be
> visible. Additionally, details of the event (clicking the View Details "i"
> icon) should only be accessible if the user belongs to the 'view the data'
> policy for the affected component. If the user is not in the appropriate
> 'view the data' policy, a popup warning should be displayed indicating the
> reason details are not visible with more specific detail than the current
> "Contact the system administrator".
> 3) Lineage Graphs
> As with the Provenance table view recommendation above, the lineage graph
> should display all events. Currently, if the lineage graph includes an event
> belonging to a component which the user does not have 'view the data', it is
> shown on the graph as "UNKNOWN". As with Data Visibility mentioned above, the
> graph should indicate the event type as long as the user is in the 'view the
> component'. Subsequent "View Details" on the event should only be visible if
> the user is in the 'view the data' policy.
> In summary, for Provenance query results and lineage graphs, all events
> should be shown. Component Name and Component Type information should be
> conditionally visible depending on the corresponding component policy 'view
> the component' policy. Event details including Provenance event type and
> FlowFile information should be conditionally available
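The component-visibility redaction debated in this review can be modeled as a small pure function: when the 'view the component' check is denied, the DTO's identifying fields fall back to placeholders (the name is replaced by the component UUID, the type by a generic label, and the event type by "UNKNOWN"), mirroring the diff quoted above. The sketch below is illustrative only; `Result`, `EventView`, and `visibleDetails` are made-up stand-ins, not NiFi's actual ControllerFacade code.

```java
public class ComponentRedaction {
    // Simplified stand-in for NiFi's AuthorizationResult outcome.
    public enum Result { APPROVED, DENIED }

    // Minimal stand-in for the provenance event DTO fields under discussion.
    public static final class EventView {
        public final String componentName;
        public final String componentType;
        public final String eventType;

        public EventView(String componentName, String componentType, String eventType) {
            this.componentName = componentName;
            this.componentType = componentType;
            this.eventType = eventType;
        }
    }

    // When 'view the component' is denied, obscure identifying details:
    // the name falls back to the UUID, the type to a generic label, and
    // (as questioned in the review) the event type to "UNKNOWN".
    public static EventView visibleDetails(Result viewComponent, String uuid,
                                           String name, String type, String eventType) {
        if (viewComponent == Result.DENIED) {
            return new EventView(uuid, "Processor", "UNKNOWN");
        }
        return new EventView(name, type, eventType);
    }

    public static void main(String[] args) {
        EventView v = visibleDetails(Result.DENIED, "7c84501d-uuid", "FetchFile", "Processor", "FETCH");
        System.out.println(v.componentName + " / " + v.componentType + " / " + v.eventType);
        // -> 7c84501d-uuid / Processor / UNKNOWN
    }
}
```

Whether the event type should also be redacted (mcgilman's question above) is exactly the choice between returning `"UNKNOWN"` and passing `eventType` through in the denied branch.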
[jira] [Commented] (NIFI-4907) Provenance authorization refactoring
[ https://issues.apache.org/jira/browse/NIFI-4907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508506#comment-16508506 ] ASF GitHub Bot commented on NIFI-4907: -- Github user mcgilman commented on a diff in the pull request: https://github.com/apache/nifi/pull/2703#discussion_r194495873 --- Diff: nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-web-api/src/main/java/org/apache/nifi/web/controller/ControllerFacade.java (same hunk as quoted above) --- End diff -- Why not check the authorization within `setComponentDetails`? In there you already have the components to authorize and you'll know the corresponding type. > Provenance authorization refactoring > > Key: NIFI-4907 > URL: https://issues.apache.org/jira/browse/NIFI-4907 > Project: Apache NiFi > Issue Type: Bug > Components: Core Framework > Affects Versions: 1.5.0 > Reporter: Mark Bean > Assignee: Mark Bean > Priority: Major > > Currently, the 'view the data' component policy is too tightly coupled with > Provenance queries. The 'query provenance' policy should be the only policy > required for viewing Provenance query results. Both 'view the component' and > 'view the data' policies should be used to refine the appropriate visibility > of event details - but not the event itself. 
> 1) Component Visibility > The authorization of Provenance events is inconsistent with the behavior of > the graph. For example, if a user does not have 'view the component' policy, > the graph shows this component as a "black box" (no details such as name, > UUID, etc.) However, when querying Provenance, this component will show up > including the Component Type and the Component Name. This is in effect a > violation of the policy. These component details should be obscured in the > Provenance event displayed if user does not have the appropriate 'view the > component' policy. > 2) Data Visibility > For a Provenance query, all events should be visible as long as the user > performing the query belongs to the 'query provenance' global policy. As > mentioned above, some information about the component may be obscured > depending on 'view the component' policy, but the event itself should be > visible. Additionally, details of the event (clicking the View Details "i" > icon) should only be accessible if the user belongs to the 'view the data' > policy for the affected component. If the user is not in the appropriate > 'view the data' policy, a popup warning should be displayed indicating the > reason details are not visible with more specific detail than the current > "Contact the system administrator". > 3) Lineage Graphs > As with the Provenance table view recommendation above, the lineage graph > should display all events. Currently, if the lineage graph includes an event > belonging to a component which the user does not have 'view the data', it is > shown on the graph as "UNKNOWN". As with Data Visibility mentioned above, the > graph should indicate the event type as long as the user is in the 'view the > component'. Subsequent "View Details" on the event should only be visible if > the user is in the 'view the data' policy. > In summary, for Provenance query results and lineage graphs, all events > should be shown. 
Component Name and Component Type information should be > conditionally visible depending on the corresponding component policy 'view > the component' policy. Event details including Provenance event type and > FlowFile information should be conditionally available depending on the > corresponding component policy 'view the data'. Inability to display event > details should provide feedback to the user indicating the reason. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (NIFI-4907) Provenance authorization refactoring
[ https://issues.apache.org/jira/browse/NIFI-4907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508509#comment-16508509 ] ASF GitHub Bot commented on NIFI-4907: -- Github user mcgilman commented on a diff in the pull request: https://github.com/apache/nifi/pull/2703#discussion_r194503331
--- Diff: nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-web-api/src/main/java/org/apache/nifi/web/controller/ControllerFacade.java (hunk as quoted above, continuing:) ---
-            final SortedSet<AttributeDTO> attributes = new TreeSet<>(attributeComparator);
+        //authorizeData(event);
+        final AuthorizationResult dataResult = checkAuthorizationForData(event); //(authorizer, RequestAction.READ, user, event.getAttributes());
--- End diff --
Also, it appears that we're checking the checkAuthorizationForData is verifying READ to the data of the corresponding component. This check is already done as part of the checkAuthorizationForReplay method. It appears that is the only place the replay authorization check is performed. 
It likely makes sense to refactor some of this so that we're only checking permissions for READ to the data of the corresponding component once. The remainder of the replay authorization check only needs to be performed when we're populating the data fields (READ to the data of the corresponding component is approved). See below.
[GitHub] nifi pull request #2703: NIFI-4907: add 'view provenance' component policy
Github user mcgilman commented on a diff in the pull request: https://github.com/apache/nifi/pull/2703#discussion_r194495379
--- Diff: nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-framework-core/src/main/java/org/apache/nifi/controller/FlowController.java ---
@@ -4919,6 +4925,22 @@ private void updateRemoteProcessGroups() {
         return new ArrayList<>(provenanceRepository.getEvents(firstEventId, maxRecords));
     }
+    public AuthorizationResult checkConnectableAuthorization(final String componentId) {
--- End diff --
I don't believe this is called.
---
[GitHub] nifi pull request #2703: NIFI-4907: add 'view provenance' component policy
Github user mcgilman commented on a diff in the pull request: https://github.com/apache/nifi/pull/2703#discussion_r194498155
--- Diff: nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-web-api/src/main/java/org/apache/nifi/web/controller/ControllerFacade.java (hunk as quoted above, at the checkAuthorizationForData call) ---
--- End diff --
We only need to authorize for the data if the event is a non-summary. For instance, when we're pulling back 1000 summaries to load the provenance table we don't need to check any data policies.
---
[GitHub] nifi pull request #2703: NIFI-4907: add 'view provenance' component policy
Github user mcgilman commented on a diff in the pull request: https://github.com/apache/nifi/pull/2703#discussion_r194499578
--- Diff: nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-web-api/src/main/java/org/apache/nifi/web/controller/ControllerFacade.java (hunk as quoted above, continuing:) ---
-            final SortedSet<AttributeDTO> attributes = new TreeSet<>(attributeComparator);
+        //authorizeData(event);
+        final AuthorizationResult dataResult = checkAuthorizationForData(event); //(authorizer, RequestAction.READ, user, event.getAttributes());
-            final Map<String, String> updatedAttrs = event.getUpdatedAttributes();
-            final Map<String, String> previousAttrs = event.getPreviousAttributes();
+        // only include all details if not summarizing and approved
+        if (!summarize && Result.Approved.equals(dataResult.getResult())) {
--- End diff --
If the user is not authorized for the data of a component we should still be able to return a non-summary. In this case, we should just be leaving out any of the data fields in the ProvenanceEventDto.
I would consider these fields data fields, as they are associated with either attributes, content, or replay (all of which require data policies to execute).
```
private Collection<AttributeDTO> attributes;
private Boolean contentEqual;
private Boolean inputContentAvailable;
private String inputContentClaimSection;
private String inputContentClaimContainer;
private String inputContentClaimIdentifier;
private Long inputContentClaimOffset;
private String inputContentClaimFileSize;
private Long inputContentClaimFileSizeBytes;
private Boolean outputContentAvailable;
private String outputContentClaimSection;
private String outputContentClaimContainer;
private String outputContentClaimIdentifier;
private Long outputContentClaimOffset;
private String outputContentClaimFileSize;
private Long outputContentClaimFileSizeBytes;
private Boolean replayAvailable;
private String replayExplanation;
private String sourceConnectionIdentifier;
```
---
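Taken together, the last few comments describe a single decision: data-level fields (attributes, content claims, replay info) are populated only for non-summary requests where the 'view the data' check approves, while the event itself is returned in every case. A minimal illustrative gate, assuming a simplified `Result` enum rather than NiFi's actual `AuthorizationResult` type:

```java
public class ProvenanceDetailGate {
    // Simplified stand-in for NiFi's authorization outcome.
    public enum Result { APPROVED, DENIED }

    // Decide whether the data fields should be populated on the DTO.
    // Summary rows never carry them (so no data policy check is needed
    // when loading 1000 summaries for the provenance table), and full
    // views carry them only when 'view the data' approves. The event
    // itself is still returned either way.
    public static boolean includeDataFields(boolean summarize, Result viewData) {
        return !summarize && viewData == Result.APPROVED;
    }

    public static void main(String[] args) {
        System.out.println(includeDataFields(true, Result.APPROVED));  // false: summaries skip data fields
        System.out.println(includeDataFields(false, Result.DENIED));   // false: event visible, data fields omitted
        System.out.println(includeDataFields(false, Result.APPROVED)); // true: full details
    }
}
```

This matches the review suggestion that the `!summarize` test alone should gate the data fields, rather than refusing to return the non-summary DTO when the data check is denied.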
[jira] [Updated] (NIFI-5270) my ftp password is "${password}" so nifi's LISTFtp won't use it.
[ https://issues.apache.org/jira/browse/NIFI-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy LoPresto updated NIFI-5270: Component/s: Core UI Core Framework > my ftp password is "${password}" so nifi's LISTFtp won't use it. > > Key: NIFI-5270 > URL: https://issues.apache.org/jira/browse/NIFI-5270 > Project: Apache NiFi > Issue Type: Bug > Components: Core Framework, Core UI > Affects Versions: 1.6.0 > Reporter: eric twilegar > Priority: Major > Labels: expression-language, passwords, registry, security, variable > > I'm joking of course, but if that was your password the processor would fail > as it would consider it an expression and not a password. > In all seriousness though, we really need something like an "isPasswordExpression" > checkbox for all controllers. This would also allow nifi registry to not > consider them secrets, so you don't have to cut and paste ${ftp_password} > after deploying a version. Maybe just adding passwordExpression vs sharing > the property is a better idea. > > I didn't test whether you can escape the password in some way, so there is a > chance this isn't a bug. > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (NIFI-5270) my ftp password is "${password}" so nifi's LISTFtp won't use it.
[ https://issues.apache.org/jira/browse/NIFI-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy LoPresto updated NIFI-5270: Labels: expression-language passwords registry security variable (was: ) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (NIFI-5270) my ftp password is "${password}" so nifi's LISTFtp won't use it.
[ https://issues.apache.org/jira/browse/NIFI-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy LoPresto updated NIFI-5270: Affects Version/s: 1.6.0 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (NIFI-5270) my ftp password is "${password}" so nifi's LISTFtp won't use it.
[ https://issues.apache.org/jira/browse/NIFI-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508486#comment-16508486 ] Andy LoPresto commented on NIFI-5270: - Hi Eric, There has been some on-going discussion of this (pre-dating the NiFi Registry effort) and how it would relate to the Variable Registry. The effort has been paused a bit while other priorities have come up. I think the last discussion I recall had landed on "password guidance should explicitly prohibit literal passwords of the format {{'${xxx}'}}" as this was backward compatible with the existing {{PropertyDescriptor}} definitions and did not require additional work. Now may be a good time to re-evaluate that decision and perform new work for a future release.
https://issues.apache.org/jira/browse/NIFI-2653
https://issues.apache.org/jira/browse/NIFI-3046
https://issues.apache.org/jira/browse/NIFI-3110
https://issues.apache.org/jira/browse/NIFI-3311
https://issues.apache.org/jira/browse/NIFI-3439
https://issues.apache.org/jira/browse/NIFI-4557
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
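The bug report hinges on how a property value like `${password}` is recognized as an Expression Language expression rather than a literal. A rough illustration of that recognition step, using a plain regex rather than NiFi's real EL parser (which also handles nesting and the `$$` escape), with the class and method names invented for this sketch:

```java
import java.util.regex.Pattern;

public class ExpressionCheck {
    // Rough approximation: a value containing "${...}" is treated as an
    // Expression Language expression, not as a literal password. NiFi's
    // actual parser is more involved (nested expressions, $$ escaping).
    private static final Pattern EL = Pattern.compile("\\$\\{[^}]*\\}");

    public static boolean looksLikeExpression(String value) {
        return value != null && EL.matcher(value).find();
    }

    public static void main(String[] args) {
        System.out.println(looksLikeExpression("${password}")); // true: evaluated as an expression
        System.out.println(looksLikeExpression("hunter2"));     // false: plain literal
        // NiFi supports $$ as an escape for a literal dollar sign;
        // this sketch deliberately ignores that subtlety.
        System.out.println(looksLikeExpression("$${password}"));
    }
}
```

The "isPasswordExpression" checkbox proposed in the issue would effectively bypass this recognition for sensitive properties, so a literal `${password}` could be stored as-is.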
[jira] [Commented] (NIFI-4906) Add GetHdfsFileInfo Processor
[ https://issues.apache.org/jira/browse/NIFI-4906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508396#comment-16508396 ] ASF GitHub Bot commented on NIFI-4906: -- Github user bbende commented on a diff in the pull request: https://github.com/apache/nifi/pull/2639#discussion_r194430127
--- Diff: nifi-nar-bundles/nifi-hadoop-bundle/nifi-hdfs-processors/src/main/java/org/apache/nifi/processors/hadoop/GetHDFSFileInfo.java ---
@@ -0,0 +1,803 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.nifi.processors.hadoop;
+
+import java.io.IOException;
+import java.security.PrivilegedExceptionAction;
+import java.util.ArrayList;
+import java.util.Collection;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.LinkedList;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import java.util.concurrent.TimeUnit;
+import java.util.regex.Pattern;
+
+import org.apache.commons.lang3.StringUtils;
+import org.apache.hadoop.fs.FileStatus;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.permission.FsAction;
+import org.apache.hadoop.fs.permission.FsPermission;
+import org.apache.hadoop.security.UserGroupInformation;
+import org.apache.nifi.annotation.behavior.InputRequirement;
+import org.apache.nifi.annotation.behavior.InputRequirement.Requirement;
+import org.apache.nifi.annotation.behavior.TriggerSerially;
+import org.apache.nifi.annotation.behavior.TriggerWhenEmpty;
+import org.apache.nifi.annotation.behavior.WritesAttribute;
+import org.apache.nifi.annotation.behavior.WritesAttributes;
+import org.apache.nifi.annotation.documentation.CapabilityDescription;
+import org.apache.nifi.annotation.documentation.SeeAlso;
+import org.apache.nifi.annotation.documentation.Tags;
+import org.apache.nifi.components.AllowableValue;
+import org.apache.nifi.components.PropertyDescriptor;
+import org.apache.nifi.components.PropertyValue;
+import org.apache.nifi.components.ValidationContext;
+import org.apache.nifi.components.ValidationResult;
+import org.apache.nifi.expression.ExpressionLanguageScope;
+import org.apache.nifi.flowfile.FlowFile;
+import org.apache.nifi.processor.ProcessContext;
+import org.apache.nifi.processor.ProcessSession;
+import org.apache.nifi.processor.ProcessorInitializationContext;
+import org.apache.nifi.processor.Relationship;
+import org.apache.nifi.processor.exception.ProcessException;
+import org.apache.nifi.processor.util.StandardValidators;
+import org.apache.nifi.processors.hadoop.GetHDFSFileInfo.HDFSFileInfoRequest.Groupping;
+
+@TriggerSerially
+@TriggerWhenEmpty
+@InputRequirement(Requirement.INPUT_ALLOWED)
+@Tags({"hadoop", "HDFS", "get", "list", "ingest", "source", "filesystem"})
+@CapabilityDescription("Retrieves a listing of files and directories from HDFS. "
+        + "This processor creates a FlowFile(s) that represents the HDFS file/dir with relevant information. "
+        + "Main purpose of this processor to provide functionality similar to HDFS Client, i.e. count, du, ls, test, etc. "
+        + "Unlike ListHDFS, this processor is stateless, supports incoming connections and provides information on a dir level. "
+)
+@WritesAttributes({
+    @WritesAttribute(attribute="hdfs.objectName", description="The name of the file/dir found on HDFS."),
+    @WritesAttribute(attribute="hdfs.path", description="The path is set to the absolute path of the object's parent directory on HDFS. "
+        + "For example, if an object is a directory 'foo', under directory '/bar' then 'hdfs.objectName' will have value 'foo', and 'hdfs.path' will be '/bar'"),
+    @WritesAttribute(attribute="hdfs.type", description="The type of an object. Possible values: directory, file, link"),
+    @WritesAttribute(attribute="hdfs.owner",
[jira] [Commented] (NIFI-4906) Add GetHdfsFileInfo Processor
[ https://issues.apache.org/jira/browse/NIFI-4906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508397#comment-16508397 ] ASF GitHub Bot commented on NIFI-4906: -- Github user bbende commented on a diff in the pull request: https://github.com/apache/nifi/pull/2639#discussion_r194424610 --- Diff: nifi-nar-bundles/nifi-hadoop-bundle/nifi-hdfs-processors/src/main/java/org/apache/nifi/processors/hadoop/GetHDFSFileInfo.java --- @@ -0,0 +1,803 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.nifi.processors.hadoop; + +import java.io.IOException; +import java.security.PrivilegedExceptionAction; +import java.util.ArrayList; +import java.util.Collection; +import java.util.HashMap; +import java.util.HashSet; +import java.util.LinkedList; +import java.util.List; +import java.util.Map; +import java.util.Set; +import java.util.concurrent.TimeUnit; +import java.util.regex.Pattern; + +import org.apache.commons.lang3.StringUtils; +import org.apache.hadoop.fs.FileStatus; +import org.apache.hadoop.fs.FileSystem; +import org.apache.hadoop.fs.Path; +import org.apache.hadoop.fs.permission.FsAction; +import org.apache.hadoop.fs.permission.FsPermission; +import org.apache.hadoop.security.UserGroupInformation; +import org.apache.nifi.annotation.behavior.InputRequirement; +import org.apache.nifi.annotation.behavior.InputRequirement.Requirement; +import org.apache.nifi.annotation.behavior.TriggerSerially; +import org.apache.nifi.annotation.behavior.TriggerWhenEmpty; +import org.apache.nifi.annotation.behavior.WritesAttribute; +import org.apache.nifi.annotation.behavior.WritesAttributes; +import org.apache.nifi.annotation.documentation.CapabilityDescription; +import org.apache.nifi.annotation.documentation.SeeAlso; +import org.apache.nifi.annotation.documentation.Tags; +import org.apache.nifi.components.AllowableValue; +import org.apache.nifi.components.PropertyDescriptor; +import org.apache.nifi.components.PropertyValue; +import org.apache.nifi.components.ValidationContext; +import org.apache.nifi.components.ValidationResult; +import org.apache.nifi.expression.ExpressionLanguageScope; +import org.apache.nifi.flowfile.FlowFile; +import org.apache.nifi.processor.ProcessContext; +import org.apache.nifi.processor.ProcessSession; +import org.apache.nifi.processor.ProcessorInitializationContext; +import org.apache.nifi.processor.Relationship; +import org.apache.nifi.processor.exception.ProcessException; +import 
org.apache.nifi.processor.util.StandardValidators; +import org.apache.nifi.processors.hadoop.GetHDFSFileInfo.HDFSFileInfoRequest.Groupping; + +@TriggerSerially +@TriggerWhenEmpty --- End diff -- @TriggerWhenEmpty makes the framework trigger the processor even when an incoming connection is empty. Is that necessary in this case? You typically use it when you need to perform some kind of check or clean-up even when no flow files are available.
> Add GetHdfsFileInfo Processor
> -
>
> Key: NIFI-4906
> URL: https://issues.apache.org/jira/browse/NIFI-4906
> Project: Apache NiFi
> Issue Type: New Feature
> Components: Extensions
> Reporter: Ed Berezitsky
> Assignee: Ed Berezitsky
> Priority: Major
> Labels: patch, pull-request-available
> Attachments: NiFi-GetHDFSFileInfo.pdf, gethdfsfileinfo.patch
>
> Add *GetHdfsFileInfo* Processor to be able to get stats from a file system. This processor should support recursive scan, getting information of directories and files.
> _File-level info required_: name, path, length, modified timestamp, last access timestamp, owner, group, permissions.
> _Directory-level info required_: name, path, sum of lengths of files under a dir, count of files under a dir, modified
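The scheduling behavior the reviewer describes can be modeled outside NiFi. The sketch below is a toy model, not NiFi's actual scheduler code: it only shows the decision @TriggerWhenEmpty changes, namely that a processor with incoming connections is normally triggered only when a queue holds at least one FlowFile, while the annotation lets it run regardless so it can do periodic checks or clean-up.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Toy model (not NiFi's real scheduler) of the effect of @TriggerWhenEmpty.
// Without the annotation, a processor with incoming connections is only
// scheduled when work is queued; with it, the framework triggers the
// processor even when every incoming queue is empty.
public class TriggerWhenEmptyDemo {
    static boolean shouldTrigger(Queue<String> incoming, boolean triggerWhenEmpty) {
        return triggerWhenEmpty || !incoming.isEmpty();
    }

    public static void main(String[] args) {
        Queue<String> empty = new ArrayDeque<>();
        Queue<String> nonEmpty = new ArrayDeque<>();
        nonEmpty.add("flowfile-1");

        System.out.println(shouldTrigger(empty, false));    // false: nothing queued, skip
        System.out.println(shouldTrigger(empty, true));     // true: clean-up still runs
        System.out.println(shouldTrigger(nonEmpty, false)); // true: work is queued
    }
}
```

This is why the reviewer questions the annotation: if GetHDFSFileInfo has no clean-up work to do on an empty queue, the extra triggers are wasted scheduling.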
[jira] [Commented] (NIFI-4906) Add GetHdfsFileInfo Processor
[ https://issues.apache.org/jira/browse/NIFI-4906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508401#comment-16508401 ] ASF GitHub Bot commented on NIFI-4906: -- Github user bbende commented on a diff in the pull request: https://github.com/apache/nifi/pull/2639#discussion_r194456502 --- Diff: nifi-nar-bundles/nifi-hadoop-bundle/nifi-hdfs-processors/src/main/java/org/apache/nifi/processors/hadoop/GetHDFSFileInfo.java --- @@ -0,0 +1,803 @@ [quoted license header and imports omitted; identical to the excerpt above] +@TriggerSerially +@TriggerWhenEmpty +@InputRequirement(Requirement.INPUT_ALLOWED) +@Tags({"hadoop", "HDFS", "get", "list", "ingest", "source", "filesystem"}) +@CapabilityDescription("Retrieves a listing of files and directories from HDFS. " ++ "This processor creates a FlowFile(s) that represents the HDFS file/dir with relevant information. " ++ "Main purpose of this processor to provide functionality similar to HDFS Client, i.e. count, du, ls, test, etc. " ++ "Unlike ListHDFS, this processor is stateless, supports incoming connections and provides information on a dir level. " +) +@WritesAttributes({ +@WritesAttribute(attribute="hdfs.objectName", description="The name of the file/dir found on HDFS."), +@WritesAttribute(attribute="hdfs.path", description="The path is set to the absolute path of the object's parent directory on HDFS. " ++ "For example, if an object is a directory 'foo', under directory '/bar' then 'hdfs.objectName' will have value 'foo', and 'hdfs.path' will be '/bar'"), +@WritesAttribute(attribute="hdfs.type", description="The type of an object. Possible values: directory, file, link"), +@WritesAttribute(attribute="hdfs.owner",
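The hdfs.objectName / hdfs.path pair described in the @WritesAttributes block above is just a path split into its file name and parent directory. A minimal illustration (not processor code; it uses java.nio.file on a POSIX-style path for clarity):

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Illustrates the attribute split from the @CapabilityDescription example:
// for directory 'foo' under '/bar', hdfs.objectName is "foo" and
// hdfs.path is the absolute path of the parent directory, "/bar".
public class HdfsPathAttributesDemo {
    public static void main(String[] args) {
        Path p = Paths.get("/bar/foo");
        String objectName = p.getFileName().toString(); // "foo"
        String parentPath = p.getParent().toString();   // "/bar"
        System.out.println(objectName + " " + parentPath);
    }
}
```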
[jira] [Commented] (NIFI-4906) Add GetHdfsFileInfo Processor
[ https://issues.apache.org/jira/browse/NIFI-4906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508392#comment-16508392 ] ASF GitHub Bot commented on NIFI-4906: -- Github user bbende commented on a diff in the pull request: https://github.com/apache/nifi/pull/2639#discussion_r194421999 --- Diff: nifi-nar-bundles/nifi-hadoop-bundle/nifi-hdfs-processors/src/main/java/org/apache/nifi/processors/hadoop/GetHDFSFileInfo.java --- [quoted diff omitted; identical to the excerpt above]
[jira] [Commented] (NIFI-4906) Add GetHdfsFileInfo Processor
[ https://issues.apache.org/jira/browse/NIFI-4906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508395#comment-16508395 ] ASF GitHub Bot commented on NIFI-4906: -- Github user bbende commented on a diff in the pull request: https://github.com/apache/nifi/pull/2639#discussion_r194443340 --- Diff: nifi-nar-bundles/nifi-hadoop-bundle/nifi-hdfs-processors/src/main/java/org/apache/nifi/processors/hadoop/GetHDFSFileInfo.java --- [quoted diff omitted; identical to the excerpt above]
[jira] [Commented] (NIFI-4906) Add GetHdfsFileInfo Processor
[ https://issues.apache.org/jira/browse/NIFI-4906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508400#comment-16508400 ] ASF GitHub Bot commented on NIFI-4906: -- Github user bbende commented on a diff in the pull request: https://github.com/apache/nifi/pull/2639#discussion_r194446685 --- Diff: nifi-nar-bundles/nifi-hadoop-bundle/nifi-hdfs-processors/src/main/java/org/apache/nifi/processors/hadoop/GetHDFSFileInfo.java --- [quoted diff omitted; identical to the excerpt above]
[jira] [Commented] (NIFI-4906) Add GetHdfsFileInfo Processor
[ https://issues.apache.org/jira/browse/NIFI-4906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508391#comment-16508391 ] ASF GitHub Bot commented on NIFI-4906: -- Github user bbende commented on a diff in the pull request: https://github.com/apache/nifi/pull/2639#discussion_r194421861 --- Diff: nifi-nar-bundles/nifi-hadoop-bundle/nifi-hdfs-processors/src/main/java/org/apache/nifi/processors/hadoop/GetHDFSFileInfo.java --- [quoted diff omitted; identical to the excerpt above]
[jira] [Commented] (NIFI-4906) Add GetHdfsFileInfo Processor
[ https://issues.apache.org/jira/browse/NIFI-4906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508398#comment-16508398 ] ASF GitHub Bot commented on NIFI-4906: -- Github user bbende commented on a diff in the pull request: https://github.com/apache/nifi/pull/2639#discussion_r194437894 --- Diff: nifi-nar-bundles/nifi-hadoop-bundle/nifi-hdfs-processors/src/main/java/org/apache/nifi/processors/hadoop/GetHDFSFileInfo.java --- [quoted diff omitted; identical to the excerpt above]
[jira] [Commented] (NIFI-4906) Add GetHdfsFileInfo Processor
[ https://issues.apache.org/jira/browse/NIFI-4906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508399#comment-16508399 ]

ASF GitHub Bot commented on NIFI-4906:
--------------------------------------

Github user bbende commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2639#discussion_r194428165

    --- Diff: nifi-nar-bundles/nifi-hadoop-bundle/nifi-hdfs-processors/src/main/java/org/apache/nifi/processors/hadoop/GetHDFSFileInfo.java ---
    @@ -0,0 +1,803 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements. See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License. You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.nifi.processors.hadoop;
    +
    +import java.io.IOException;
    +import java.security.PrivilegedExceptionAction;
    +import java.util.ArrayList;
    +import java.util.Collection;
    +import java.util.HashMap;
    +import java.util.HashSet;
    +import java.util.LinkedList;
    +import java.util.List;
    +import java.util.Map;
    +import java.util.Set;
    +import java.util.concurrent.TimeUnit;
    +import java.util.regex.Pattern;
    +
    +import org.apache.commons.lang3.StringUtils;
    +import org.apache.hadoop.fs.FileStatus;
    +import org.apache.hadoop.fs.FileSystem;
    +import org.apache.hadoop.fs.Path;
    +import org.apache.hadoop.fs.permission.FsAction;
    +import org.apache.hadoop.fs.permission.FsPermission;
    +import org.apache.hadoop.security.UserGroupInformation;
    +import org.apache.nifi.annotation.behavior.InputRequirement;
    +import org.apache.nifi.annotation.behavior.InputRequirement.Requirement;
    +import org.apache.nifi.annotation.behavior.TriggerSerially;
    +import org.apache.nifi.annotation.behavior.TriggerWhenEmpty;
    +import org.apache.nifi.annotation.behavior.WritesAttribute;
    +import org.apache.nifi.annotation.behavior.WritesAttributes;
    +import org.apache.nifi.annotation.documentation.CapabilityDescription;
    +import org.apache.nifi.annotation.documentation.SeeAlso;
    +import org.apache.nifi.annotation.documentation.Tags;
    +import org.apache.nifi.components.AllowableValue;
    +import org.apache.nifi.components.PropertyDescriptor;
    +import org.apache.nifi.components.PropertyValue;
    +import org.apache.nifi.components.ValidationContext;
    +import org.apache.nifi.components.ValidationResult;
    +import org.apache.nifi.expression.ExpressionLanguageScope;
    +import org.apache.nifi.flowfile.FlowFile;
    +import org.apache.nifi.processor.ProcessContext;
    +import org.apache.nifi.processor.ProcessSession;
    +import org.apache.nifi.processor.ProcessorInitializationContext;
    +import org.apache.nifi.processor.Relationship;
    +import org.apache.nifi.processor.exception.ProcessException;
    +import org.apache.nifi.processor.util.StandardValidators;
    +import org.apache.nifi.processors.hadoop.GetHDFSFileInfo.HDFSFileInfoRequest.Groupping;
    +
    +@TriggerSerially
    +@TriggerWhenEmpty
    +@InputRequirement(Requirement.INPUT_ALLOWED)
    +@Tags({"hadoop", "HDFS", "get", "list", "ingest", "source", "filesystem"})
    +@CapabilityDescription("Retrieves a listing of files and directories from HDFS. "
    +    + "This processor creates a FlowFile(s) that represents the HDFS file/dir with relevant information. "
    +    + "Main purpose of this processor to provide functionality similar to HDFS Client, i.e. count, du, ls, test, etc. "
    +    + "Unlike ListHDFS, this processor is stateless, supports incoming connections and provides information on a dir level. "
    +)
    +@WritesAttributes({
    +    @WritesAttribute(attribute="hdfs.objectName", description="The name of the file/dir found on HDFS."),
    +    @WritesAttribute(attribute="hdfs.path", description="The path is set to the absolute path of the object's parent directory on HDFS. "
    +        + "For example, if an object is a directory 'foo', under directory '/bar' then 'hdfs.objectName' will have value 'foo', and 'hdfs.path' will be '/bar'"),
    +    @WritesAttribute(attribute="hdfs.type", description="The type of an object. Possible values: directory, file, link"),
    +    @WritesAttribute(attribute="hdfs.owner",
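The `@WritesAttribute` descriptions in the diff above specify that `hdfs.objectName` holds the final path component and `hdfs.path` holds the absolute path of the object's parent directory (for `/bar/foo`: name `foo`, path `/bar`). A minimal sketch of that split in plain Java, outside NiFi — the class and method names here are hypothetical, not part of the PR:

```java
// Hedged illustration of the documented hdfs.path / hdfs.objectName split.
// splitHdfsPath is a hypothetical helper, not a method from the PR.
public class HdfsPathAttributes {

    // Returns {parentPath, objectName} for an absolute HDFS path.
    public static String[] splitHdfsPath(String absolutePath) {
        int idx = absolutePath.lastIndexOf('/');
        // A top-level object like "/foo" has the root "/" as its parent.
        String parent = (idx <= 0) ? "/" : absolutePath.substring(0, idx);
        String name = absolutePath.substring(idx + 1);
        return new String[] { parent, name };
    }

    public static void main(String[] args) {
        String[] parts = splitHdfsPath("/bar/foo");
        // Per the @WritesAttribute example: hdfs.path=/bar, hdfs.objectName=foo
        System.out.println("hdfs.path=" + parts[0]
                + " hdfs.objectName=" + parts[1]);
    }
}
```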
[jira] [Commented] (NIFI-4906) Add GetHdfsFileInfo Processor
[ https://issues.apache.org/jira/browse/NIFI-4906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508394#comment-16508394 ]

ASF GitHub Bot commented on NIFI-4906:
--------------------------------------

Github user bbende commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2639#discussion_r194427937

    --- Diff: nifi-nar-bundles/nifi-hadoop-bundle/nifi-hdfs-processors/src/main/java/org/apache/nifi/processors/hadoop/GetHDFSFileInfo.java ---
[jira] [Commented] (NIFI-4906) Add GetHdfsFileInfo Processor
[ https://issues.apache.org/jira/browse/NIFI-4906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508393#comment-16508393 ]

ASF GitHub Bot commented on NIFI-4906:
--------------------------------------

Github user bbende commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2639#discussion_r194421928

    --- Diff: nifi-nar-bundles/nifi-hadoop-bundle/nifi-hdfs-processors/src/main/java/org/apache/nifi/processors/hadoop/GetHDFSFileInfo.java ---
[GitHub] nifi pull request #2639: NIFI-4906 Add GetHDFSFileInfo
Github user bbende commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2639#discussion_r194437894

    --- Diff: nifi-nar-bundles/nifi-hadoop-bundle/nifi-hdfs-processors/src/main/java/org/apache/nifi/processors/hadoop/GetHDFSFileInfo.java ---
[GitHub] nifi pull request #2639: NIFI-4906 Add GetHDFSFileInfo
Github user bbende commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2639#discussion_r194421999

    --- Diff: nifi-nar-bundles/nifi-hadoop-bundle/nifi-hdfs-processors/src/main/java/org/apache/nifi/processors/hadoop/GetHDFSFileInfo.java ---
[GitHub] nifi pull request #2639: NIFI-4906 Add GetHDFSFileInfo
Github user bbende commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2639#discussion_r194421861

    --- Diff: nifi-nar-bundles/nifi-hadoop-bundle/nifi-hdfs-processors/src/main/java/org/apache/nifi/processors/hadoop/GetHDFSFileInfo.java ---
[GitHub] nifi pull request #2639: NIFI-4906 Add GetHDFSFileInfo
Github user bbende commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2639#discussion_r194430127

    --- Diff: nifi-nar-bundles/nifi-hadoop-bundle/nifi-hdfs-processors/src/main/java/org/apache/nifi/processors/hadoop/GetHDFSFileInfo.java ---
[GitHub] nifi pull request #2639: NIFI-4906 Add GetHDFSFileInfo
Github user bbende commented on a diff in the pull request: https://github.com/apache/nifi/pull/2639#discussion_r194428165

--- Diff: nifi-nar-bundles/nifi-hadoop-bundle/nifi-hdfs-processors/src/main/java/org/apache/nifi/processors/hadoop/GetHDFSFileInfo.java ---
@@ -0,0 +1,803 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.nifi.processors.hadoop;
+
+import java.io.IOException;
+import java.security.PrivilegedExceptionAction;
+import java.util.ArrayList;
+import java.util.Collection;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.LinkedList;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import java.util.concurrent.TimeUnit;
+import java.util.regex.Pattern;
+
+import org.apache.commons.lang3.StringUtils;
+import org.apache.hadoop.fs.FileStatus;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.permission.FsAction;
+import org.apache.hadoop.fs.permission.FsPermission;
+import org.apache.hadoop.security.UserGroupInformation;
+import org.apache.nifi.annotation.behavior.InputRequirement;
+import org.apache.nifi.annotation.behavior.InputRequirement.Requirement;
+import org.apache.nifi.annotation.behavior.TriggerSerially;
+import org.apache.nifi.annotation.behavior.TriggerWhenEmpty;
+import org.apache.nifi.annotation.behavior.WritesAttribute;
+import org.apache.nifi.annotation.behavior.WritesAttributes;
+import org.apache.nifi.annotation.documentation.CapabilityDescription;
+import org.apache.nifi.annotation.documentation.SeeAlso;
+import org.apache.nifi.annotation.documentation.Tags;
+import org.apache.nifi.components.AllowableValue;
+import org.apache.nifi.components.PropertyDescriptor;
+import org.apache.nifi.components.PropertyValue;
+import org.apache.nifi.components.ValidationContext;
+import org.apache.nifi.components.ValidationResult;
+import org.apache.nifi.expression.ExpressionLanguageScope;
+import org.apache.nifi.flowfile.FlowFile;
+import org.apache.nifi.processor.ProcessContext;
+import org.apache.nifi.processor.ProcessSession;
+import org.apache.nifi.processor.ProcessorInitializationContext;
+import org.apache.nifi.processor.Relationship;
+import org.apache.nifi.processor.exception.ProcessException;
+import org.apache.nifi.processor.util.StandardValidators;
+import org.apache.nifi.processors.hadoop.GetHDFSFileInfo.HDFSFileInfoRequest.Groupping;
+
+@TriggerSerially
+@TriggerWhenEmpty
+@InputRequirement(Requirement.INPUT_ALLOWED)
+@Tags({"hadoop", "HDFS", "get", "list", "ingest", "source", "filesystem"})
+@CapabilityDescription("Retrieves a listing of files and directories from HDFS. "
+        + "This processor creates a FlowFile(s) that represents the HDFS file/dir with relevant information. "
+        + "Main purpose of this processor to provide functionality similar to HDFS Client, i.e. count, du, ls, test, etc. "
+        + "Unlike ListHDFS, this processor is stateless, supports incoming connections and provides information on a dir level. "
+)
+@WritesAttributes({
+    @WritesAttribute(attribute="hdfs.objectName", description="The name of the file/dir found on HDFS."),
+    @WritesAttribute(attribute="hdfs.path", description="The path is set to the absolute path of the object's parent directory on HDFS. "
+            + "For example, if an object is a directory 'foo', under directory '/bar' then 'hdfs.objectName' will have value 'foo', and 'hdfs.path' will be '/bar'"),
+    @WritesAttribute(attribute="hdfs.type", description="The type of an object. Possible values: directory, file, link"),
+    @WritesAttribute(attribute="hdfs.owner", description="The user that owns the object in HDFS"),
+    @WritesAttribute(attribute="hdfs.group", description="The group that owns the object in HDFS"),
+    @WritesAttribute(attribute="hdfs.lastModified", description="The timestamp
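The `@WritesAttribute` annotations quoted in the diff document the `hdfs.*` attribute map that the processor writes onto each outgoing FlowFile. A minimal, self-contained sketch of what assembling that map might look like (the `buildHdfsAttributes` helper and its parameters are hypothetical; only the attribute names and the 'foo'/'/bar' example come from the quoted docs):

```java
import java.util.HashMap;
import java.util.Map;

public class HdfsAttributesSketch {

    // Hypothetical helper: builds the FlowFile attribute map that the
    // @WritesAttribute docs describe for a single HDFS object.
    static Map<String, String> buildHdfsAttributes(String objectName, String parentPath,
            String type, String owner, String group, long lastModified) {
        Map<String, String> attrs = new HashMap<>();
        attrs.put("hdfs.objectName", objectName);          // e.g. "foo"
        attrs.put("hdfs.path", parentPath);                // e.g. "/bar"
        attrs.put("hdfs.type", type);                      // directory, file, or link
        attrs.put("hdfs.owner", owner);                    // owning user
        attrs.put("hdfs.group", group);                    // owning group
        attrs.put("hdfs.lastModified", Long.toString(lastModified));
        return attrs;
    }

    public static void main(String[] args) {
        // Directory 'foo' under '/bar', per the example in the quoted docs.
        Map<String, String> attrs = buildHdfsAttributes(
                "foo", "/bar", "directory", "hdfs", "supergroup", 1528800000000L);
        System.out.println(attrs.get("hdfs.objectName") + " in " + attrs.get("hdfs.path"));
    }
}
```

In the real processor these values would come from a Hadoop `FileStatus` rather than being passed in directly.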
[GitHub] nifi pull request #2639: NIFI-4906 Add GetHDFSFileInfo
Github user bbende commented on a diff in the pull request: https://github.com/apache/nifi/pull/2639#discussion_r194421928

--- Diff: nifi-nar-bundles/nifi-hadoop-bundle/nifi-hdfs-processors/src/main/java/org/apache/nifi/processors/hadoop/GetHDFSFileInfo.java ---
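The quoted `@CapabilityDescription` says the processor is stateless and "provides information on a dir level", i.e. it walks a directory tree and emits information for each entry. The shape of that recursion can be sketched with a local-filesystem analogue (this uses `java.nio.file` instead of Hadoop's `FileSystem.listStatus`, so it is an illustration of the traversal pattern, not the processor's actual HDFS code):

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class ListingSketch {

    // Local analogue of a recursive HDFS listing: record each entry in a
    // directory, then recurse into subdirectories.
    static List<String> listRecursively(Path dir) throws IOException {
        List<String> entries = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path entry : stream) {
                entries.add(entry.getFileName().toString());
                if (Files.isDirectory(entry)) {
                    // dir-level recursion, as in the processor's tree walk
                    entries.addAll(listRecursively(entry));
                }
            }
        }
        return entries;
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("listing-sketch");
        Files.createDirectories(root.resolve("bar/foo"));
        Files.createFile(root.resolve("bar/data.txt"));
        // Prints all names found under the temp root (order not guaranteed).
        System.out.println(listRecursively(root));
    }
}
```

The stateless design the description contrasts with ListHDFS means each trigger performs this walk fresh, rather than remembering previously seen entries.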