Re: Configure duration to push to RPG in AWS (IoT flow)

2017-05-01 Thread Lee Laim
Hi Varsha,

Based on your description, it might be worth checking the 'run scheduling' to 
make sure the processors are scheduled properly.  I've changed the scheduling 
for debug purposes, and have forgotten to change it back to '0 seconds'.

Lee 


> On May 1, 2017, at 10:38 PM, Raveendran, Varsha wrote:
> 
> Hi Aldrin,
>  
> We changed our flow to the following and still see no improvement.
>  
> RPi with one instance of MiNiFi
> -   A ListenTCP processor
> -   RPG connecting to EC2
>  
> EC2 with one NiFi server
> -   Input Port
> -   SplitText, ReplaceText – to transform the data into CQL statements
> -   PutCassandra
>  
> Without MiNiFi, we are able to transfer data at a rate of 100 kB/s to the 
> NiFi server.
>  
> From the TCP client we are sending 4000 packets, each 100 bytes in size. From 
> the MiNiFi logs, we observe that the values are sent in batches with a time 
> gap of at least 60 s. Is that the normal behavior of MiNiFi?
> Can we not have a continuous stream of data going to EC2? Another point that 
> I want to mention is that the device has only 512 MB of RAM. Could this be a 
> factor?
>  
> We have observed that CPU utilization goes up to 99% at times for the MiNiFi 
> process.
> Any pointers in optimizing the flow would help.
>  
> Regards,
> Varsha
>  
>  
> From: Aldrin Piri [mailto:aldrinp...@gmail.com] 
> Sent: Tuesday, May 02, 2017 1:25 AM
> To: users@nifi.apache.org
> Subject: Re: Configure duration to push to RPG in AWS (IoT flow)
>  
> Hi Varsha,
>  
> When you are talking about flows, are these separate instances of MiNiFi or 
> different parts of your overall configuration?  Is there a reason the 
> collected data is written to and then read from disk?  I/O costs are 
> amplified when running off of SD cards which can be quite slow.  Not that 
> this accounts for the 50s/event but does provide a point for consideration.  
> Have you been able to test network transfer rates to your EC2 NiFi instance 
> outside of MiNiFi?  
>  
> What is the rate of ingest to the TCP processor, both in volume and in 
> number of events? 
>  
> Would you be able to share your configuration?
>  
> I have run similar flows on RPis and equivalents and had good throughput, so 
> it seems like something has gone slightly awry.
>  
> On Sat, Apr 29, 2017 at 5:30 AM, Raveendran, Varsha wrote:
> Hello All,
>  
> I have a MiNiFi flow running on a Raspberry Pi. The flow has a ListenTCP that 
> pushes the events from a client to a folder using PutFile. Another flow takes 
> files from a folder using ListFile and FetchFile and pushes them to a remote 
> process group pointing to a NiFi server on EC2. From the logs on the 
> Raspberry Pi we observed that the queue between FetchFile and RPG is never 
> empty. The NiFi server in EC2 receives messages every 50 seconds. This is 
> slow for a real-time application that is expecting at least 1 flow file every 
> second.
>  
> Any thoughts on how to improve the performance in MiNiFi to push messages 
> faster to the Input port on EC2?
>  
> Regards,
> Varsha
> 
> Registered Office: 130 Pandurang Budhkar Marg, Worli, Mumbai – 400018; 
> Corporate Identity number: L28920MH1957PLC010839; Tel.: +91 (22) 3967 7000; 
> Fax: +91 22 3967 7500;
> Contact / Email: www.siemens.co.in/contact; Website: www.siemens.co.in. Sales 
> Offices: Ahmedabad, Bengaluru, Bhopal, Bhubaneswar, Chandigarh, Chennai, 
> Coimbatore, Gurgaon, Hyderabad, Jaipur, Jamshedpur, Kharghar, Kolkata, 
> Lucknow, Kochi, Mumbai, Nagpur, Navi Mumbai, New Delhi,
>  


RE: Configure duration to push to RPG in AWS (IoT flow)

2017-05-01 Thread Raveendran, Varsha
Hi Aldrin,

We changed our flow to the following and still see no improvement.

RPi with one instance of MiNiFi
-   A ListenTCP processor
-   RPG connecting to EC2

EC2 with one NiFi server
-   Input Port
-   SplitText, ReplaceText – to transform the data into CQL statements
-   PutCassandra

Without MiNiFi, we are able to transfer data at a rate of 100 kB/s to the NiFi 
server.

From the TCP client we are sending 4000 packets, each 100 bytes in size. From 
the MiNiFi logs, we observe that the values are sent in batches with a time gap 
of at least 60 s. Is that the normal behavior of MiNiFi?
Can we not have a continuous stream of data going to EC2? Another point that I 
want to mention is that the device has only 512 MB of RAM. Could this be a factor?
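
For scale, the figures quoted above can be put side by side. The sketch below is only back-of-the-envelope arithmetic, assuming 1 kB = 1000 bytes and ignoring TCP and site-to-site protocol overhead; the function names are illustrative and not part of MiNiFi or NiFi.

```python
def total_payload_bytes(packet_count, packet_size_bytes):
    """Total bytes sent by the TCP client."""
    return packet_count * packet_size_bytes


def transfer_seconds(total_bytes, bytes_per_second):
    """Time needed to move total_bytes at a steady rate."""
    return total_bytes / bytes_per_second


if __name__ == "__main__":
    # Figures from the message: 4000 packets of 100 bytes, 100 kB/s without MiNiFi.
    total = total_payload_bytes(4000, 100)
    seconds = transfer_seconds(total, 100_000)  # assumes 1 kB = 1000 bytes
    print(f"payload: {total} bytes, about {seconds:.1f} s at 100 kB/s")
```

At the measured direct rate the whole test payload would move in a few seconds, which suggests the 60 s gaps between batches are not explained by raw link bandwidth alone.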

We have observed that CPU utilization goes up to 99% at times for the MiNiFi 
process.
Any pointers in optimizing the flow would help.

Regards,
Varsha


From: Aldrin Piri [mailto:aldrinp...@gmail.com]
Sent: Tuesday, May 02, 2017 1:25 AM
To: users@nifi.apache.org
Subject: Re: Configure duration to push to RPG in AWS (IoT flow)

Hi Varsha,

When you are talking about flows, are these separate instances of MiNiFi or 
different parts of your overall configuration?  Is there a reason the collected 
data is written to and then read from disk?  I/O costs are amplified when 
running off of SD cards which can be quite slow.  Not that this accounts for 
the 50s/event but does provide a point for consideration.  Have you been able 
to test network transfer rates to your EC2 NiFi instance outside of MiNiFi?

What is the rate of ingest to the TCP processor, both in volume and in number 
of events?

Would you be able to share your configuration?

I have run similar flows on RPis and equivalents and had good throughput, so 
it seems like something has gone slightly awry.

On Sat, Apr 29, 2017 at 5:30 AM, Raveendran, Varsha 
<varsha.raveend...@siemens.com> wrote:
Hello All,

I have a MiNiFi flow running on a Raspberry Pi. The flow has a ListenTCP that 
pushes the events from a client to a folder using PutFile. Another flow takes 
files from a folder using ListFile and FetchFile and pushes them to a remote 
process group pointing to a NiFi server on EC2. From the logs on the Raspberry 
Pi we observed that the queue between FetchFile and RPG is never empty. The 
NiFi server in EC2 receives messages every 50 seconds. This is slow for a 
real-time application that is expecting at least 1 flow file every second.

Any thoughts on how to improve the performance in MiNiFi to push messages 
faster to the Input port on EC2?

Regards,
Varsha



Re: Configure duration to push to RPG in AWS (IoT flow)

2017-05-01 Thread Aldrin Piri
Hi Varsha,

When you are talking about flows, are these separate instances of MiNiFi or
different parts of your overall configuration?  Is there a reason the
collected data is written to and then read from disk?  I/O costs are
amplified when running off of SD cards which can be quite slow.  Not that
this accounts for the 50s/event but does provide a point for
consideration.  Have you been able to test network transfer rates to your
EC2 NiFi instance outside of MiNiFi?
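
One way to approach the transfer-rate check suggested above is a small timing script run from the Pi. This is a minimal sketch, not a MiNiFi or NiFi tool: the helper names are hypothetical, the demo targets a throwaway local sink rather than a real EC2 host, and the figure it reports is how fast the sender can hand data to the socket, which is only a rough proxy for end-to-end throughput.

```python
import socket
import threading
import time


def start_local_sink():
    """Start a throwaway TCP server on localhost that counts received bytes."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]
    result = {"bytes": 0}
    done = threading.Event()

    def run():
        conn, _ = server.accept()
        with conn:
            while True:
                chunk = conn.recv(65536)
                if not chunk:  # peer closed the connection
                    break
                result["bytes"] += len(chunk)
        server.close()
        done.set()

    threading.Thread(target=run, daemon=True).start()
    return port, result, done


def measure_tcp_send_rate(host, port, payload, repeats):
    """Send payload `repeats` times on one connection; return bytes per second."""
    total = len(payload) * repeats
    start = time.monotonic()
    with socket.create_connection((host, port)) as sock:
        for _ in range(repeats):
            sock.sendall(payload)
    elapsed = time.monotonic() - start
    return total / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    port, result, done = start_local_sink()
    packet = b"x" * 99 + b"\n"  # 100-byte, newline-terminated test message
    rate = measure_tcp_send_rate("127.0.0.1", port, packet, 4000)
    done.wait(timeout=10)
    print(f"sent {result['bytes']} bytes at roughly {rate:.0f} bytes/s")
```

To test the real path, the host and port passed to `measure_tcp_send_rate` would be replaced with a listener on the EC2 side that is reachable from the Pi.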

What is the rate of ingest to the TCP processor, both in volume and in number
of events?

Would you be able to share your configuration?

I have run similar flows on RPis and equivalents and had good throughput, so
it seems like something has gone slightly awry.

On Sat, Apr 29, 2017 at 5:30 AM, Raveendran, Varsha <
varsha.raveend...@siemens.com> wrote:

> Hello All,
>
>
>
> I have a MiNiFi flow running on a Raspberry Pi. The flow has a ListenTCP
> that pushes the events from a client to a folder using PutFile. Another
> flow takes files from a folder using ListFile and FetchFile and pushes them
> to a remote process group pointing to a NiFi server on EC2. From the logs
> on the Raspberry Pi we observed that the queue between FetchFile and RPG is
> never empty. The NiFi server in EC2 receives messages every 50 seconds.
> This is slow for a real-time application that is expecting at least 1 flow
> file every second.
>
>
>
> Any thoughts on how to improve the performance in MiNiFi to push messages
> faster to the Input port on EC2?
>
>
>
> Regards,
>
> Varsha


Re: csv output processor?

2017-05-01 Thread Andrew Grande
Something interesting is potentially coming in 1.2.0: the new pairs of
RecordReaders/RecordSetWriters. I did see CSV format support in there, so it
may be worth taking another look.

Andrew

On Fri, Apr 28, 2017, 4:55 PM Frank Maritato wrote:

> Is there a NiFi processor that will output a bunch of flowfile attributes
> as delimited text (i.e. CSV)? I had checked Google a while ago and the
> suggestion was to use ReplaceText. The problem with this is that we have to
> add all the field delimiters, quotes and escaping of characters to each
> attribute manually.
>
> Thanks!
> --
> Frank Maritato
> Data Architect
>
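
To illustrate the delimiter, quoting and escaping work described in the question, here is a minimal sketch using Python's standard `csv` module outside of NiFi. It is not a NiFi processor and does not use any NiFi API; the attribute names and the `attributes_to_csv_line` helper are made up for the example.

```python
import csv
import io


def attributes_to_csv_line(attributes, field_names):
    """Render selected attributes as one CSV line, quoting only where needed."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, quoting=csv.QUOTE_MINIMAL, lineterminator="\n")
    # Missing attributes become empty fields so the column count stays fixed.
    writer.writerow([attributes.get(name, "") for name in field_names])
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical attribute values containing a comma and embedded quotes.
    attrs = {"filename": "a,b.txt", "note": 'say "hi"', "size": "10"}
    print(attributes_to_csv_line(attrs, ["filename", "note", "size"]), end="")
```

The writer adds quotes around fields that contain the delimiter or a quote character and doubles embedded quotes, which is the part that is tedious to reproduce by hand in a ReplaceText expression.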