Re: Validate CSV/Records by name instead of position

2019-09-18 Thread Eric Chaves
Awesome, thanks for the detailed steps!

On Wed, Sep 18, 2019 at 11:41, Jerry Vinokurov
wrote:

> This certainly works. You can create a schema registry and define an Avro
> schema listing your fields. Then make sure that when you set up the reader,
> it's configured to read the header so that it knows which fields go where
> in the record, set up the mode of the schema access to read from the
> registry you created, and then set the name of the actual schema, which is
> a property on the schema registry. This will correctly validate your CSV
> for you.
>
> On Wed, Sep 18, 2019 at 9:59 AM Eric Chaves  wrote:
>
>> Hi folks,
>>
>> Is it possible to validate fields/columns in a Record or CSV by name
>> instead of by position? For example, I have a record with two mandatory
>> fields and some optional fields, but they may be in a different position in
>> each ingested file. Should I use a script, or is there already a processor
>> that could help me with this?
>>
>> Regards,
>>
>
>
> --
> http://www.google.com/profiles/grapesmoker
>


Re: Too many open files

2019-09-18 Thread Jean-Sebastien Vachon
I managed to find the culprit: the init script I was using was doing
something weird.

I added MAX_FD=5 to my nifi-env.sh file and everything seems to be fine now.

thanks

From: Abdou B 
Sent: Wednesday, September 18, 2019 10:42 AM
To: users@nifi.apache.org 
Subject: Re: Too many open files

Hello,

It seems to me that for some distributions you need to modify those values in
the cluster management tool.
For example, in HDF with Ambari, you need to change the
nifi_user_nofile_limit parameter for the change to take effect.

Best regards
Abdou

On Wed, Sep 18, 2019 at 16:34, Jean-Sebastien Vachon
<jsvac...@brizodata.com> wrote:
Does not seem to help... processes are still limited to 4096 fds

From: Jean-Sebastien Vachon <jsvac...@brizodata.com>
Sent: Wednesday, September 18, 2019 10:31 AM
To: users@nifi.apache.org
Subject: Re: Too many open files

Oops, I just saw the following:

Your distribution may require an edit to /etc/security/limits.d/90-nproc.conf 
by adding:
* soft nproc 1

I will try this

From: Jean-Sebastien Vachon <jsvac...@brizodata.com>
Sent: Wednesday, September 18, 2019 10:30 AM
To: users@nifi.apache.org
Subject: Too many open files

Hi all,

I've started to see "Too many open files" error messages in NiFi. I checked
https://nifi.apache.org/quickstart.html for the recommended values to fix
this and made the required changes to /etc/security/limits.conf, exited my
shell and restarted NiFi. When I check the limits of the Java processes, they
are still at 4096.

I've added the following to /etc/security/limits.conf:
*  hard  nofile  5
*  soft  nofile  5

but the processes show this:

 cat /proc/26861/limits
...
Max open files            4096                 4096                 files
...

Any idea where this 4096 comes from? I tried grepping in the init scripts, the
NiFi configuration and nifi-env.sh, but could not find it anywhere.

thanks


Re: Too many open files

2019-09-18 Thread Abdou B
Hello,

It seems to me that for some distributions you need to modify those values
in the cluster management tool.
For example, in HDF with Ambari, you need to change the
nifi_user_nofile_limit parameter for the change to take effect.

Best regards
Abdou

On Wed, Sep 18, 2019 at 16:34, Jean-Sebastien Vachon
wrote:

> Does not seem to help... processes are still limited to 4096 fds
> --
> *From:* Jean-Sebastien Vachon 
> *Sent:* Wednesday, September 18, 2019 10:31 AM
> *To:* users@nifi.apache.org 
> *Subject:* Re: Too many open files
>
> Oops, I just saw the following:
>
> Your distribution may require an edit to
> */etc/security/limits.d/90-nproc.conf* by adding:
> * soft nproc 1
>
> I will try this
> --
> *From:* Jean-Sebastien Vachon 
> *Sent:* Wednesday, September 18, 2019 10:30 AM
> *To:* users@nifi.apache.org 
> *Subject:* Too many open files
>
> Hi all,
>
> I've started to see "Too many open files" error messages in NiFi. I
> checked https://nifi.apache.org/quickstart.html for the recommended
> values to fix this and made the required changes to
> /etc/security/limits.conf, exited my shell and restarted NiFi. When I check
> the limits of the Java processes, they are still at 4096.
>
> I've added the following to /etc/security/limits.conf:
> *  hard  nofile  5
> *  soft  nofile  5
>
> but the processes show this:
>
>  cat /proc/26861/limits
> ...
> Max open files            4096                 4096                 files
> ...
>
> Any idea where this 4096 comes from? I tried grepping in the init scripts,
> the NiFi configuration and nifi-env.sh, but could not find it anywhere.
>
> thanks
>


Re: Validate CSV/Records by name instead of position

2019-09-18 Thread Jerry Vinokurov
This certainly works. You can create a schema registry and define an Avro
schema listing your fields. Then make sure that when you set up the reader,
it's configured to read the header so that it knows which fields go where
in the record, set up the mode of the schema access to read from the
registry you created, and then set the name of the actual schema, which is
a property on the schema registry. This will correctly validate your CSV
for you.
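
To make the idea of name-based validation concrete, here is a minimal sketch in plain Python. It is not NiFi code and not how the reader or schema registry is implemented; it only illustrates why reading the header lets required fields be checked regardless of column order. The field names (id, email, nickname) and the function name are hypothetical, chosen for illustration.

```python
import csv
import io

# Hypothetical schema: field name -> whether the field is mandatory.
# These names are illustrative only; they do not come from the thread.
SCHEMA = {"id": True, "email": True, "nickname": False}


def validate_csv_by_name(text, schema):
    """Return a list of error strings; an empty list means the CSV is valid."""
    # DictReader uses the first line as the header, so every value is
    # looked up by column name rather than by position.
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []

    errors = [
        f"missing required column: {name}"
        for name, required in schema.items()
        if required and name not in header
    ]
    if errors:
        return errors

    # Line numbers assume one record per line (line 1 is the header).
    for line_no, row in enumerate(reader, start=2):
        for name, required in schema.items():
            if required and not (row.get(name) or "").strip():
                errors.append(f"line {line_no}: empty required field: {name}")
    return errors
```

Two files with the same columns in a different order both pass, because the lookup goes through the header names.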

On Wed, Sep 18, 2019 at 9:59 AM Eric Chaves  wrote:

> Hi folks,
>
> Is it possible to validate fields/columns in a Record or CSV by name
> instead of by position? For example, I have a record with two mandatory
> fields and some optional fields, but they may be in a different position in
> each ingested file. Should I use a script, or is there already a processor
> that could help me with this?
>
> Regards,
>


-- 
http://www.google.com/profiles/grapesmoker


Re: Too many open files

2019-09-18 Thread Jean-Sebastien Vachon
Does not seem to help... processes are still limited to 4096 fds

From: Jean-Sebastien Vachon 
Sent: Wednesday, September 18, 2019 10:31 AM
To: users@nifi.apache.org 
Subject: Re: Too many open files

Oops, I just saw the following:

Your distribution may require an edit to /etc/security/limits.d/90-nproc.conf 
by adding:
* soft nproc 1

I will try this

From: Jean-Sebastien Vachon 
Sent: Wednesday, September 18, 2019 10:30 AM
To: users@nifi.apache.org 
Subject: Too many open files

Hi all,

I've started to see "Too many open files" error messages in NiFi. I checked
https://nifi.apache.org/quickstart.html for the recommended values to fix
this and made the required changes to /etc/security/limits.conf, exited my
shell and restarted NiFi. When I check the limits of the Java processes, they
are still at 4096.

I've added the following to /etc/security/limits.conf:
*  hard  nofile  5
*  soft  nofile  5

but the processes show this:

 cat /proc/26861/limits
...
Max open files            4096                 4096                 files
...

Any idea where this 4096 comes from? I tried grepping in the init scripts, the
NiFi configuration and nifi-env.sh, but could not find it anywhere.

thanks


Re: Too many open files

2019-09-18 Thread Jean-Sebastien Vachon
Oops, I just saw the following:

Your distribution may require an edit to /etc/security/limits.d/90-nproc.conf 
by adding:
* soft nproc 1

I will try this

From: Jean-Sebastien Vachon 
Sent: Wednesday, September 18, 2019 10:30 AM
To: users@nifi.apache.org 
Subject: Too many open files

Hi all,

I've started to see "Too many open files" error messages in NiFi. I checked
https://nifi.apache.org/quickstart.html for the recommended values to fix
this and made the required changes to /etc/security/limits.conf, exited my
shell and restarted NiFi. When I check the limits of the Java processes, they
are still at 4096.

I've added the following to /etc/security/limits.conf:
*  hard  nofile  5
*  soft  nofile  5

but the processes show this:

 cat /proc/26861/limits
...
Max open files            4096                 4096                 files
...

Any idea where this 4096 comes from? I tried grepping in the init scripts, the
NiFi configuration and nifi-env.sh, but could not find it anywhere.

thanks


Too many open files

2019-09-18 Thread Jean-Sebastien Vachon
Hi all,

I've started to see "Too many open files" error messages in NiFi. I checked
https://nifi.apache.org/quickstart.html for the recommended values to fix
this and made the required changes to /etc/security/limits.conf, exited my
shell and restarted NiFi. When I check the limits of the Java processes, they
are still at 4096.

I've added the following to /etc/security/limits.conf:
*  hard  nofile  5
*  soft  nofile  5

but the processes show this:

 cat /proc/26861/limits
...
Max open files            4096                 4096                 files
...

Any idea where this 4096 comes from? I tried grepping in the init scripts, the
NiFi configuration and nifi-env.sh, but could not find it anywhere.

thanks
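
The check described above (reading /proc/<pid>/limits for a running process) can be scripted. The following is a minimal sketch, assuming a Linux /proc filesystem; the function names are hypothetical and it only reports the limit, it does not change it. Values are returned as strings because the kernel may print "unlimited".

```python
def parse_max_open_files(limits_text):
    """Return (soft, hard) from the 'Max open files' row, or None if absent.

    limits_text is the content of /proc/<pid>/limits.
    """
    label = "Max open files"
    for line in limits_text.splitlines():
        if line.startswith(label):
            # The remainder of the row is: soft limit, hard limit, units.
            fields = line[len(label):].split()
            return fields[0], fields[1]
    return None


def read_max_open_files(pid):
    """Read the open-file limits of a running process (Linux only)."""
    with open(f"/proc/{pid}/limits") as fh:
        return parse_max_open_files(fh.read())
```

Calling read_max_open_files on the PID of the NiFi Java process after a restart shows whether a change to limits.conf or to the startup script actually reached that process.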


Validate CSV/Records by name instead of position

2019-09-18 Thread Eric Chaves
Hi folks,

Is it possible to validate fields/columns in a Record or CSV by name
instead of by position? For example, I have a record with two mandatory
fields and some optional fields, but they may be in a different position in
each ingested file. Should I use a script, or is there already a processor
that could help me with this?

Regards,


Re: Re: NiFi active thread count is no more than 10 ?

2019-09-18 Thread wangl...@geekplus.com.cn

Thanks very much, Bryan. Changing the overall timer-driven thread pool value
works.



wangl...@geekplus.com.cn
 
From: Bryan Bende
Date: 2019-09-18 20:53
To: users
Subject: Re: NiFi active thread count is no more than 10 ?
The overall timer-driven thread pool defaults to 10 (configured from
the controller settings in top right menu).
 
So even if a processor has 100 concurrent tasks, there are still only
10 threads available.
 
On Wed, Sep 18, 2019 at 8:20 AM Joe Witt  wrote:
>
> Hello
>
> The 100 threads for the controller overall is the maximum number of threads 
> that could run concurrently. On a 16 core system and a flow which is very I/O 
> bound this is definitely achievable.  Generally you want to look at some 
> multiple of the number of physical cores, such as 2, 4, 8, etc., but in the end
> it isn't that important.
>
> So why do you only see at most 10 or so threads in active use?  Generally 
> this means your flow isn't demanding or configured to do more.
>
> How many processors do you have?  How many tasks does each have?  How much 
> data is flowing through the system when it reaches 10?  Are backlogs growing 
> at that time?  What are the run schedules?  How does the load average on the 
> system look?  How does IO utilization/iowait look during those times?
>
> Thanks
> Joe
>
> On Wed, Sep 18, 2019 at 8:05 AM wangl...@geekplus.com.cn 
>  wrote:
>>
>>
>> My NiFi server has 16 cores. I also configured some processors' concurrent
>> tasks to 100.
>> But why is the active thread count shown in the NiFi web UI never more than 10?
>>
>>
>> 
>> wangl...@geekplus.com.cn


Re: NiFi active thread count is no more than 10 ?

2019-09-18 Thread Bryan Bende
The overall timer-driven thread pool defaults to 10 (configured from
the controller settings in top right menu).

So even if a processor has 100 concurrent tasks, there are still only
10 threads available.
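
The effect can be reproduced outside NiFi with any fixed-size thread pool. Below is a minimal sketch using Python's standard ThreadPoolExecutor as a stand-in for the timer-driven pool; it is an analogy, not NiFi's scheduler, and the function name and sleep-based workload are assumptions for illustration.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def peak_concurrency(pool_size, requested_tasks, work_seconds=0.01):
    """Submit requested_tasks jobs to a pool of pool_size threads and
    return the highest number of jobs observed running at the same time."""
    lock = threading.Lock()
    running = 0
    peak = 0

    def job():
        nonlocal running, peak
        with lock:
            running += 1
            peak = max(peak, running)
        time.sleep(work_seconds)  # stand-in for I/O-bound work
        with lock:
            running -= 1

    # Leaving the with-block waits for all submitted jobs to finish.
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        for _ in range(requested_tasks):
            pool.submit(job)
    return peak
```

With pool_size=10 and requested_tasks=100 the returned peak never exceeds 10: the extra tasks wait in the queue, which is the same reason the active thread count stays at 10 until the pool size is raised.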

On Wed, Sep 18, 2019 at 8:20 AM Joe Witt  wrote:
>
> Hello
>
> The 100 threads for the controller overall is the maximum number of threads 
> that could run concurrently. On a 16 core system and a flow which is very I/O 
> bound this is definitely achievable.  Generally you want to look at some 
> multiple of the number of physical cores, such as 2, 4, 8, etc., but in the end
> it isn't that important.
>
> So why do you only see at most 10 or so threads in active use?  Generally 
> this means your flow isn't demanding or configured to do more.
>
> How many processors do you have?  How many tasks does each have?  How much 
> data is flowing through the system when it reaches 10?  Are backlogs growing 
> at that time?  What are the run schedules?  How does the load average on the 
> system look?  How does IO utilization/iowait look during those times?
>
> Thanks
> Joe
>
> On Wed, Sep 18, 2019 at 8:05 AM wangl...@geekplus.com.cn 
>  wrote:
>>
>>
>> My NiFi server has 16 cores. I also configured some processors' concurrent
>> tasks to 100.
>> But why is the active thread count shown in the NiFi web UI never more than 10?
>>
>>
>> 
>> wangl...@geekplus.com.cn


Re: NiFi active thread count is no more than 10 ?

2019-09-18 Thread Joe Witt
Hello

The 100 threads for the controller overall is the maximum number of threads
that could run concurrently. On a 16 core system and a flow which is very
I/O bound this is definitely achievable.  Generally you want to look at
some multiple of the number of physical cores, such as 2, 4, 8, etc., but in
the end it isn't that important.

So why do you only see at most 10 or so threads in active use?  Generally
this means your flow isn't demanding or configured to do more.

How many processors do you have?  How many tasks does each have?  How much
data is flowing through the system when it reaches 10?  Are backlogs
growing at that time?  What are the run schedules?  How does the load
average on the system look?  How does IO utilization/iowait look during
those times?

Thanks
Joe

On Wed, Sep 18, 2019 at 8:05 AM wangl...@geekplus.com.cn <
wangl...@geekplus.com.cn> wrote:

>
> My NiFi server has 16 cores. I also configured some processors' concurrent
> tasks to 100.
> But why is the active thread count shown in the NiFi web UI never more than
> 10?
>
>
> --
> wangl...@geekplus.com.cn
>


NiFi active thread count is no more than 10 ?

2019-09-18 Thread wangl...@geekplus.com.cn

My NiFi server has 16 cores. I also configured some processors' concurrent
tasks to 100.
But why is the active thread count shown in the NiFi web UI never more than 10?




wangl...@geekplus.com.cn