Re: New "hierarchical" controller service system confusing

2017-08-01 Thread Russell Bateman

Thanks for the suggestion, Matt. I created NIFI-4251.

On 08/01/2017 09:39 AM, Matt Gilman wrote:

Russell,

Thanks for the suggestion on the improved Controller Service UX. Would 
you mind filing a JIRA for this improvement?


In 1.x, the user can see the components referencing a Controller 
Service in 1 of 3 places. The references are shown in the read-only 
details dialog, the service configuration dialog, and the 
enable/disable service dialog. The references will include Processors, 
Reporting Tasks, and Controller Services (and components that 
reference those services and so on).


Thanks

Matt


On Mon, Jul 31, 2017 at 12:45 PM, Russell Bateman wrote:


Friends,

I find the new (well, since 0.7.x -> 1.x) way of associating
controller services based on the process group a bit disorienting.
When one follows the right arrow from a consuming processor and
lands on the Process Group Configuration page where the services
are configured, one is obliged to click on the General tab to
figure out whether the controller service is at the root (the case
for services whose flows were upgraded from the 0.7.x world) or
for a specific process group (the case for configuration
accomplished post-upgrade/native 1.x).

I realize that this will be a dwindling problem, but there's
enough white space in this dialog to justify putting the Process
Group Name to the right of the dialog title (Process Group
Configuration) while on the Controller Services tab, even without
the momentary confusion of origin (that is, a flow from 0.7.x vs.
a "native" 1.x flow). So, I consider that this enhancement holds
regardless of whether upgraded flows are present or not.

At the same time, I'm tempted to ask: when looking at the
Controller Services tab/page, how does one figure out which
processor instance(s) are consuming the particular controller
service being viewed?

Russ






Re: New "hierarchical" controller service system confusing

2017-08-01 Thread Matt Gilman
Russell,

Thanks for the suggestion on the improved Controller Service UX. Would you
mind filing a JIRA for this improvement?

In 1.x, the user can see the components referencing a Controller Service in
1 of 3 places. The references are shown in the read-only details dialog,
the service configuration dialog, and the enable/disable service dialog.
The references will include Processors, Reporting Tasks, and Controller
Services (and components that reference those services and so on).
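
As a sketch of how the same information could be pulled programmatically
(assuming an unsecured 1.x instance on localhost:8080 and a made-up service
id; GET /controller-services/{id}/references is part of the 1.x REST API):

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.HttpURLConnection;
    import java.net.URL;

    public class ServiceReferences {
        public static void main(String[] args) throws Exception {
            // The UUID below is a placeholder; substitute the id of the
            // controller service whose referencing components you want.
            URL url = new URL("http://localhost:8080/nifi-api/controller-services/"
                    + "00000000-0000-1000-0000-000000000000/references");
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setRequestMethod("GET");
            try (BufferedReader in = new BufferedReader(
                    new InputStreamReader(conn.getInputStream()))) {
                String line;
                while ((line = in.readLine()) != null) {
                    System.out.println(line); // JSON listing the references
                }
            }
        }
    }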

Thanks

Matt


On Mon, Jul 31, 2017 at 12:45 PM, Russell Bateman wrote:

> Friends,
>
> I find the new (well, since 0.7.x -> 1.x) way of associating controller
> services based on the process group a bit disorienting. When one follows
> the right arrow from a consuming processor and lands on the Process Group
> Configuration page where the services are configured, one is obliged to
> click on the General tab to figure out whether the controller service is
> at the root (the case for services whose flows were upgraded from the 0.7.x
> world) or for a specific process group (the case for configuration
> accomplished post-upgrade/native 1.x).
>
> I realize that this will be a dwindling problem, but there's enough white
> space in this dialog to justify putting the Process Group Name to the right
> of the dialog title (Process Group Configuration) while on the Controller
> Services tab, even without the momentary confusion of origin (that is, a
> flow from 0.7.x vs. a "native" 1.x flow). So, I consider that this
> enhancement holds regardless of whether upgraded flows are present or not.
>
> At the same time, I'm tempted to ask: when looking at the Controller
> Services tab/page, how does one figure out which processor instance(s)
> are consuming the particular controller service being viewed?
>
> Russ
>


Re: NiFi Docker

2017-08-01 Thread Aldrin Piri
Sounds great.  Will keep an eye out for it.  Thanks!

On Tue, Aug 1, 2017 at 10:50 AM, ddewaele  wrote:

> Great. I also have some ideas about this. I'll log a JIRA and elaborate on
> those.
>
> We can then see how to move this forward. (I'm willing to do a pull request
> for this.)
>
>
>
> --
> View this message in context:
> http://apache-nifi-users-list.2361937.n4.nabble.com/NiFi-Docker-tp2562p2576.html
> Sent from the Apache NiFi Users List mailing list archive at Nabble.com.
>


Re: NiFi Docker

2017-08-01 Thread ddewaele
Great. I also have some ideas about this. I'll log a JIRA and elaborate on
those.

We can then see how to move this forward. (I'm willing to do a pull request
for this.)



--
View this message in context: 
http://apache-nifi-users-list.2361937.n4.nabble.com/NiFi-Docker-tp2562p2576.html
Sent from the Apache NiFi Users List mailing list archive at Nabble.com.


Re: Query syntax error in GetMongo processor?

2017-08-01 Thread Jason Tarasovic
Does your cluster have different users per database? Or, to put it a
different way, does the user you are authenticating as have at least
read/write privileges on every database?

The connection is created when the processor is scheduled, so if you are
authenticating against a specific database, then that database needs to be
in the connection string. If you're authenticating as a user that can
authenticate against the admin DB, then you don't need to know the database
a priori.
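
For example, a minimal sketch with the 3.x MongoDB Java driver (host,
credentials, and names are placeholders taken from this thread):

    import com.mongodb.MongoClient;
    import com.mongodb.MongoClientURI;
    import com.mongodb.client.MongoDatabase;
    import org.bson.Document;

    public class MongoAuthSketch {
        public static void main(String[] args) {
            // With a database in the URI path, the driver authenticates the
            // credentials against that database ("DataServiceAudit" here);
            // with no path (and no authSource option) it falls back to "admin".
            MongoClientURI uri = new MongoClientURI(
                    "mongodb://nifi:secret@34.227.51.144/DataServiceAudit");
            try (MongoClient client = new MongoClient(uri)) {
                MongoDatabase db = client.getDatabase("DataServiceAudit");
                // The equivalent of GetMongo's Query property, as plain JSON.
                Document first = db.getCollection("Audit")
                        .find(new Document("Call", "Marko"))
                        .first();
                System.out.println(first);
            }
        }
    }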

That's my (probably over-simplistic) understanding anyways.

-Jason

On Tue, Aug 1, 2017 at 8:38 AM, James McMahon  wrote:

> Follow up:
> 1- Changing the query to be simple JSON, without any db.find etc., resolved
> that issue re: the invalid query. Thanks again to Pierre V. for pointing
> this out.
>
> 2- To eliminate the authentication problem I had to include the database
> name in the URI. So this
> Mongo URI mongodb://nifi:[my pwd]@34.227.51.144
> had to be this
> Mongo URI mongodb://nifi:[my pwd]@34.227.51.144/DataServiceAudit
> I had assumed that because NiFi specifically calls for Mongo Database Name
> in the second parameter of GetMongo, the database name must be left off
> the Mongo URI. To my mind, if the database name has to be in the URI
> anyway, why have it as a separate parameter at all? But clearly that is
> not the case.
>
> It worked with these changes.
>
> On Sun, Jul 30, 2017 at 12:31 PM, James McMahon wrote:
>
>> Hello. I cannot get a simple test query to work against MongoDB from
>> GetMongo. Here is what I attempt to do:
>>
>> Mongo URI mongodb://nifi:[my pwd]@34.227.51.144
>> Mongo Database Name DataServiceAudit
>> Mongo Collection Name Audit
>> SSL Context Service  No value set
>> Client Auth  NONE
>> Query  db.find({'Call':'Marko'})
>>
>> I leave the collection out (i.e., db.Audit.find...) because I have
>> expressly stated the collection name in the processor.
>> I am trying to get this working in its simplest form before I tackle the
>> more complex case, and so am not using SSL at this time.
>>
>> I am using NiFi 1.3.0 in this case.
>>
>> In the UI the processor indicates this problem:
>> 'Query' validated against 'db.find({'Call':'Marko'})' is invalid because
>> org.bson.json.JSONParseException
>>
>> Why does it throw that error?
>>
>> If I remove the Query entirely, I can get the processor to enter a run
>> state. However then in the nifi-app.log it says there is an authentication
>> issue:
>>
>> 2017-07-30 16:27:27,606 INFO [pool-10-thread-1]
>> o.a.n.c.r.WriteAheadFlowFileRepository Successfully checkpointed
>> FlowFile Repository with 2313 records in 122 milliseconds
>> 2017-07-30 16:27:41,147 ERROR [Timer-Driven Process Thread-7]
>> o.a.nifi.processors.mongodb.GetMongo 
>> GetMongo[id=100c11db-1fb9-1138-1050-0f63cda85d11]
>> Failed to execute query null due to com.mongodb.MongoTimeoutException:
>> Timed out after 3 ms while waiting for a server that matches
>> ReadPreferenceServerSelector{readPreference=primary}. Client view of
>> cluster state is {type=UNKNOWN, servers=[{address=34.227.51.144:27017,
>> type=UNKNOWN, state=CONNECTING, 
>> exception={com.mongodb.MongoSecurityException:
>> Exception authenticating MongoCredential{mechanism=null,
>> userName='nifi', source='admin', password=,
>> mechanismProperties={}}}, caused by {com.mongodb.MongoCommandException:
>> Command failed with error 18: 'Authentication failed.' on server
>> 34.227.51.144:27017. The full response is { "ok" : 0.0, "errmsg" :
>> "Authentication failed.", "code" : 18, "codeName" : "AuthenticationFailed"
>> }}}]: com.mongodb.MongoTimeoutException: Timed out after 3 ms while
>> waiting for a server that matches 
>> ReadPreferenceServerSelector{readPreference=primary}.
>> Client view of cluster state is {type=UNKNOWN, servers=[{address=
>> 34.227.51.144:27017, type=UNKNOWN, state=CONNECTING,
>> exception={com.mongodb.MongoSecurityException: Exception authenticating
>> MongoCredential{mechanism=null, userName='nifi', source='admin',
>> password=, mechanismProperties={}}}, caused by
>> {com.mongodb.MongoCommandException: Command failed with error 18:
>> 'Authentication failed.' on server 34.227.51.144:27017. The full
>> response is { "ok" : 0.0, "errmsg" : "Authentication failed.", "code" : 18,
>> "codeName" : "AuthenticationFailed" }}}]
>> com.mongodb.MongoTimeoutException: Timed out after 3 ms while
>> waiting for a server that matches 
>> ReadPreferenceServerSelector{readPreference=primary}.
>> Client view of cluster state is {type=UNKNOWN, servers=[{address=
>> 34.227.51.144:27017, type=UNKNOWN, state=CONNECTING,
>> exception={com.mongodb.MongoSecurityException: Exception authenticating
>> MongoCredential{mechanism=null, userName='nifi', source='admin',
>> password=, mechanismProperties={}}}, caused by
>> {com.mongodb.MongoCommandException: 

Re: [EXT] NiFi 1.3: Simplest way possible of creating CSV files from SQL queries

2017-08-01 Thread Bryan Bende
I would think we could modify ExecuteSQL to have a property for an optional
RecordWriter. It would still create the Avro schema from the DB schema as
it does today, and then if the RecordWriter was specified it would use that
to write records, otherwise it would write the Avro as it does today. This
way you could go straight to CSV.

Without that, as Joe mentioned with the improvements in master (1.4.0
snapshot) you should be able to have an AvroReader using a "Schema Access
Strategy" of "Embedded Avro Schema" which is coming from ExecuteSQL, and
then a CsvRecordSetWriter with "Schema Access Strategy" of "Inherit Record
Schema". So you wouldn't need a schema registry at all and wouldn't need
the UpdateAttribute.
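
Concretely, the reader/writer pairing for the conversion processor would
look something like this (a sketch only; strategy names as described above,
against the 1.4.0 snapshot):

    ConvertRecord
      Record Reader    AvroReader
                         Schema Access Strategy: Embedded Avro Schema
      Record Writer    CSVRecordSetWriter
                         Schema Access Strategy: Inherit Record Schema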

-Bryan


On Tue, Aug 1, 2017 at 9:02 AM, Joe Witt  wrote:

> There are some great points here.  Am on a phone right now so will be very
> brief.  The current 1.4.0 snapshot on master has some nice improvements for
> ease of use with record handling.  This includes writers inheriting schema
> from the reader and others.
>
> Thanks
>
> On Aug 1, 2017 1:20 AM, "Peter Wicks (pwicks)"  wrote:
>
>> I hate to respond with “me too”, but I haven’t seen a response and this
>> kind of simplification is of interest to me.
>>
>>
>>
>> The PutDatabaseRecord processor already does something similar, and I have
>> only needed the AvroReader controller service without a schema registry.
>>
>>
>>
>>
>>
>> *From:* Márcio Faria [mailto:faria.mar...@ymail.com]
>> *Sent:* Tuesday, July 25, 2017 11:09 AM
>> *To:* Users 
>> *Subject:* [EXT] NiFi 1.3: Simplest way possible of creating CSV files
>> from SQL queries
>>
>>
>>
>> Hi,
>>
>>
>>
>> I'm looking for the simplest way possible of creating CSV files from SQL
>> queries using Apache NiFi 1.3.
>>
>>
>>
>> The flow I currently have (the files are to be SFTP'ed to a remote
>> server):
>>
>>
>>
>> ExecuteSQL -> UpdateAttribute -> ConversionRecord [3 CSs] -> PutSFTP
>>
>>
>>
>> The concept of SchemaRegistry is new to me, but if I understood it
>> correctly, in order for the ConversionRecord to work properly it is
>> necessary to have 3 Controller Services ([3 CSs]) associated with it:
>>
>>- AvroSchemaRegistry, with the schema defined in Avro Schema (JSON);
>>- AvroReader, referring to the above schema;
>>- CSVRecordSetWriter, also referring to the same schema.
>>
>>
>>
>> It seems there are many benefits in using the schema registry, including
>> versioning, validation, etc, but in my example a simpler configuration
>> would be welcome.
>>
>>
>>
>> Isn't the schema already defined by ExecuteSQL? Can I have the
>> ConversionRecord alone with no *dedicated* SchemaRegistry (property),
>> AvroReader (instance), or CSVRecordSetWriter (instance)? Of course, we'd
>> still need to specify the output is a CSV, so perhaps a shared
>> CSVRecordSetWriter that also gets its schema from the flow file would still
>> be useful.
>>
>>
>>
>> By the way, would the Schema Access Strategy named "Use Embedded
>> Avro Schema" be part of a simpler solution? How?
>>
>>
>>
>> In the same vein, what about having the schema-name property optionally
>> defined by the ExecuteSQL itself, so we don't have to depend on the
>> UpdateAttribute component?
>>
>>
>>
>> In summary, I'm wondering if it's possible to have 3 (+ 1 generic)
>> components instead of 6 per query:
>>
>>
>>
>> ExecuteSQL -> ConversionRecord [CSVRecordSetWriter] -> PutSFTP
>>
>>
>>
>> That would make a difference when defining multiple conversions from SQL
>> to CSV, or other equivalent flows.
>>
>>
>>
>> In addition, consider that someone might want to have maximum
>> flexibility, meaning that it would be totally acceptable to change the
>> query and get a different layout for the resulting CSV file, without having
>> to change any SchemaRegistry, Reader, or Writer.
>>
>>
>>
>> I've found a few tickets out there covering a similar topic. In
>> particular, [1] mentions the difficulty with more complex Avro data types.
>> But I don't see that being a blocker when the data source is an
>> old-fashioned SQL query.
>>
>>
>>
>> Recommendations?
>>
>>
>>
>> P.S.1 Maybe templates would save the effort, but since Controller
>> Services are "global", I'm still wondering whether having so many parts
>> would make managing lots of flows more difficult than it needs to be.
>>
>>
>>
>> P.S.2 Will my 1st flow perform well? I'm wondering if another advantage
>> of using SchemaRegistry etc. is that it prevents the creation of too many
>> records at once.
>>
>>
>>
>> Thank you,
>>
>>
>>
>> Marcio
>>
>>
>>
>> [1] NIFI-1372 Create ConvertAvroToCSV - ASF JIRA
>>
>


Re: Query syntax error in GetMongo processor?

2017-08-01 Thread James McMahon
Follow up:
1- Changing the query to be simple JSON, without any db.find etc., resolved
that issue re: the invalid query. Thanks again to Pierre V. for pointing
this out.

2- To eliminate the authentication problem I had to include the database
name in the URI. So this
Mongo URI mongodb://nifi:[my pwd]@34.227.51.144
had to be this
Mongo URI mongodb://nifi:[my pwd]@34.227.51.144/DataServiceAudit
I had assumed that because NiFi specifically calls for Mongo Database Name
in the second parameter of GetMongo, the database name must be left off the
Mongo URI. To my mind, if the database name has to be in the URI anyway, why
have it as a separate parameter at all? But clearly that is not the case.

It worked with these changes.
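
Putting the two fixes together, the working configuration looks like this
(password elided; values as above):

    Mongo URI              mongodb://nifi:[my pwd]@34.227.51.144/DataServiceAudit
    Mongo Database Name    DataServiceAudit
    Mongo Collection Name  Audit
    SSL Context Service    No value set
    Client Auth            NONE
    Query                  {'Call':'Marko'}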

On Sun, Jul 30, 2017 at 12:31 PM, James McMahon wrote:

> Hello. I cannot get a simple test query to work against MongoDB from
> GetMongo. Here is what I attempt to do:
>
> Mongo URI mongodb://nifi:[my pwd]@34.227.51.144
> Mongo Database Name DataServiceAudit
> Mongo Collection Name Audit
> SSL Context Service  No value set
> Client Auth  NONE
> Query  db.find({'Call':'Marko'})
>
> I leave the collection out (i.e., db.Audit.find...) because I have expressly
> stated the collection name in the processor.
> I am trying to get this working in its simplest form before I tackle the
> more complex case, and so am not using SSL at this time.
>
> I am using NiFi 1.3.0 in this case.
>
> In the UI the processor indicates this problem:
> 'Query' validated against 'db.find({'Call':'Marko'})' is invalid because
> org.bson.json.JSONParseException
>
> Why does it throw that error?
>
> If I remove the Query entirely, I can get the processor to enter a run
> state. However then in the nifi-app.log it says there is an authentication
> issue:
>
> 2017-07-30 16:27:27,606 INFO [pool-10-thread-1] 
> o.a.n.c.r.WriteAheadFlowFileRepository
> Successfully checkpointed FlowFile Repository with 2313 records in 122
> milliseconds
> 2017-07-30 16:27:41,147 ERROR [Timer-Driven Process Thread-7]
> o.a.nifi.processors.mongodb.GetMongo 
> GetMongo[id=100c11db-1fb9-1138-1050-0f63cda85d11]
> Failed to execute query null due to com.mongodb.MongoTimeoutException:
> Timed out after 3 ms while waiting for a server that matches
> ReadPreferenceServerSelector{readPreference=primary}. Client view of
> cluster state is {type=UNKNOWN, servers=[{address=34.227.51.144:27017,
> type=UNKNOWN, state=CONNECTING, exception={com.mongodb.MongoSecurityException:
> Exception authenticating MongoCredential{mechanism=null, userName='nifi',
> source='admin', password=, mechanismProperties={}}}, caused by
> {com.mongodb.MongoCommandException: Command failed with error 18:
> 'Authentication failed.' on server 34.227.51.144:27017. The full response
> is { "ok" : 0.0, "errmsg" : "Authentication failed.", "code" : 18,
> "codeName" : "AuthenticationFailed" }}}]: com.mongodb.MongoTimeoutException:
> Timed out after 3 ms while waiting for a server that matches
> ReadPreferenceServerSelector{readPreference=primary}. Client view of
> cluster state is {type=UNKNOWN, servers=[{address=34.227.51.144:27017,
> type=UNKNOWN, state=CONNECTING, exception={com.mongodb.MongoSecurityException:
> Exception authenticating MongoCredential{mechanism=null, userName='nifi',
> source='admin', password=, mechanismProperties={}}}, caused by
> {com.mongodb.MongoCommandException: Command failed with error 18:
> 'Authentication failed.' on server 34.227.51.144:27017. The full response
> is { "ok" : 0.0, "errmsg" : "Authentication failed.", "code" : 18,
> "codeName" : "AuthenticationFailed" }}}]
> com.mongodb.MongoTimeoutException: Timed out after 3 ms while waiting
> for a server that matches 
> ReadPreferenceServerSelector{readPreference=primary}.
> Client view of cluster state is {type=UNKNOWN, servers=[{address=
> 34.227.51.144:27017, type=UNKNOWN, state=CONNECTING,
> exception={com.mongodb.MongoSecurityException: Exception authenticating
> MongoCredential{mechanism=null, userName='nifi', source='admin',
> password=, mechanismProperties={}}}, caused by 
> {com.mongodb.MongoCommandException:
> Command failed with error 18: 'Authentication failed.' on server
> 34.227.51.144:27017. The full response is { "ok" : 0.0, "errmsg" :
> "Authentication failed.", "code" : 18, "codeName" : "AuthenticationFailed"
> }}}]
> at com.mongodb.connection.BaseCluster.createTimeoutException(BaseCluster.java:369)
> at com.mongodb.connection.BaseCluster.selectServer(BaseCluster.java:101)
> at com.mongodb.binding.ClusterBinding$ClusterBindingConnectionSource.<init>(ClusterBinding.java:75)
> at com.mongodb.binding.ClusterBinding$ClusterBindingConnectionSource.<init>(ClusterBinding.java:71)
> at com.mongodb.binding.ClusterBinding.getReadConnectionSource(ClusterBinding.java:63)
> at 

RE: [EXT] NiFi 1.3: Simplest way possible of creating CSV files from SQL queries

2017-08-01 Thread Joe Witt
There are some great points here.  Am on a phone right now so will be very
brief.  The current 1.4.0 snapshot on master has some nice improvements for
ease of use with record handling.  This includes writers inheriting schema
from the reader and others.

Thanks

On Aug 1, 2017 1:20 AM, "Peter Wicks (pwicks)"  wrote:

> I hate to respond with “me too”, but I haven’t seen a response and this
> kind of simplification is of interest to me.
>
>
>
> The PutDatabaseRecord processor already does something similar, and I have
> only needed the AvroReader controller service without a schema registry.
>
>
>
>
>
> *From:* Márcio Faria [mailto:faria.mar...@ymail.com]
> *Sent:* Tuesday, July 25, 2017 11:09 AM
> *To:* Users 
> *Subject:* [EXT] NiFi 1.3: Simplest way possible of creating CSV files
> from SQL queries
>
>
>
> Hi,
>
>
>
> I'm looking for the simplest way possible of creating CSV files from SQL
> queries using Apache NiFi 1.3.
>
>
>
> The flow I currently have (the files are to be SFTP'ed to a remote server):
>
>
>
> ExecuteSQL -> UpdateAttribute -> ConversionRecord [3 CSs] -> PutSFTP
>
>
>
> The concept of SchemaRegistry is new to me, but if I understood it
> correctly, in order for the ConversionRecord to work properly it is
> necessary to have 3 Controller Services ([3 CSs]) associated with it:
>
>- AvroSchemaRegistry, with the schema defined in Avro Schema (JSON);
>- AvroReader, referring to the above schema;
>- CSVRecordSetWriter, also referring to the same schema.
>
>
>
> It seems there are many benefits in using the schema registry, including
> versioning, validation, etc, but in my example a simpler configuration
> would be welcome.
>
>
>
> Isn't the schema already defined by ExecuteSQL? Can I have the
> ConversionRecord alone with no *dedicated* SchemaRegistry (property),
> AvroReader (instance), or CSVRecordSetWriter (instance)? Of course, we'd
> still need to specify the output is a CSV, so perhaps a shared
> CSVRecordSetWriter that also gets its schema from the flow file would still
> be useful.
>
>
>
> By the way, would the Schema Access Strategy named "Use Embedded
> Avro Schema" be part of a simpler solution? How?
>
>
>
> In the same vein, what about having the schema-name property optionally
> defined by the ExecuteSQL itself, so we don't have to depend on the
> UpdateAttribute component?
>
>
>
> In summary, I'm wondering if it's possible to have 3 (+ 1 generic)
> components instead of 6 per query:
>
>
>
> ExecuteSQL -> ConversionRecord [CSVRecordSetWriter] -> PutSFTP
>
>
>
> That would make a difference when defining multiple conversions from SQL
> to CSV, or other equivalent flows.
>
>
>
> In addition, consider that someone might want to have maximum flexibility,
> meaning that it would be totally acceptable to change the query and get a
> different layout for the resulting CSV file, without having to change any
> SchemaRegistry, Reader, or Writer.
>
>
>
> I've found a few tickets out there covering a similar topic. In
> particular, [1] mentions the difficulty with more complex Avro data types.
> But I don't see that being a blocker when the data source is an
> old-fashioned SQL query.
>
>
>
> Recommendations?
>
>
>
> P.S.1 Maybe templates would save the effort, but since Controller Services
> are "global", I'm still wondering whether having so many parts would make
> managing lots of flows more difficult than it needs to be.
>
>
>
> P.S.2 Will my 1st flow perform well? I'm wondering if another advantage of
> using SchemaRegistry etc. is that it prevents the creation of too many
> records at once.
>
>
>
> Thank you,
>
>
>
> Marcio
>
>
>
> [1] NIFI-1372 Create ConvertAvroToCSV - ASF JIRA
>


RE: [EXT] NiFi 1.3: Simplest way possible of creating CSV files from SQL queries

2017-08-01 Thread Peter Wicks (pwicks)
I hate to respond with “me too”, but I haven’t seen a response and this kind of 
simplification is of interest to me.

The PutDatabaseRecord processor already does something similar, and I have only
needed the AvroReader controller service without a schema registry.


From: Márcio Faria [mailto:faria.mar...@ymail.com]
Sent: Tuesday, July 25, 2017 11:09 AM
To: Users 
Subject: [EXT] NiFi 1.3: Simplest way possible of creating CSV files from SQL 
queries

Hi,

I'm looking for the simplest way possible of creating CSV files from SQL 
queries using Apache NiFi 1.3.

The flow I currently have (the files are to be SFTP'ed to a remote server):

ExecuteSQL -> UpdateAttribute -> ConversionRecord [3 CSs] -> PutSFTP

The concept of SchemaRegistry is new to me, but if I understood it correctly,
in order for the ConversionRecord to work properly it is necessary to have 3
Controller Services ([3 CSs]) associated with it:

  *   AvroSchemaRegistry, with the schema defined in Avro Schema (JSON);
  *   AvroReader, referring to the above schema;
  *   CSVRecordSetWriter, also referring to the same schema.
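
For anyone new to the format, an AvroSchemaRegistry entry is just an Avro
schema in JSON. As a sketch, a two-column query result might be described
like this (the field names are invented for illustration):

    {
      "type": "record",
      "name": "QueryResult",
      "fields": [
        {"name": "id",   "type": ["null", "long"],   "default": null},
        {"name": "name", "type": ["null", "string"], "default": null}
      ]
    }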

It seems there are many benefits in using the schema registry, including 
versioning, validation, etc, but in my example a simpler configuration would be 
welcome.

Isn't the schema already defined by ExecuteSQL? Can I have the ConversionRecord 
alone with no dedicated SchemaRegistry (property), AvroReader (instance), or
CSVRecordSetWriter (instance)? Of course, we'd still need to specify the output 
is a CSV, so perhaps a shared CSVRecordSetWriter that also gets its schema from 
the flow file would still be useful.

By the way, would the Schema Access Strategy named "Use Embedded Avro Schema" 
be part of a simpler solution? How?

In the same vein, what about having the schema-name property optionally defined 
by the ExecuteSQL itself, so we don't have to depend on the UpdateAttribute 
component?

In summary, I'm wondering if it's possible to have 3 (+ 1 generic) components 
instead of 6 per query:

ExecuteSQL -> ConversionRecord [CSVRecordSetWriter] -> PutSFTP

That would make a difference when defining multiple conversions from SQL to 
CSV, or other equivalent flows.

In addition, consider that someone might want to have maximum flexibility, 
meaning that it would be totally acceptable to change the query and get a 
different layout for the resulting CSV file, without having to change any 
SchemaRegistry, Reader, or Writer.

I've found a few tickets out there covering a similar topic. In particular, [1] 
mentions the difficulty with more complex Avro data types. But I don't see that 
being a blocker when the data source is an old-fashioned SQL query.

Recommendations?

P.S.1 Maybe templates would save the effort, but since Controller Services are
"global", I'm still wondering whether having so many parts would make managing
lots of flows more difficult than it needs to be.

P.S.2 Will my 1st flow perform well? I'm wondering if another advantage of
using SchemaRegistry etc. is that it prevents the creation of too many records
at once.

Thank you,

Marcio

[1] NIFI-1372 Create ConvertAvroToCSV - ASF JIRA