Regarding ConsumeIMAP Processor.

2016-09-20 Thread prabhu Mahendran
Hi,

I am new to NiFi. I have just used the ConsumeIMAP processor to retrieve
an attachment from a mail server.

When I use it I am able to download the attachment, but the resulting document
contains MIME type information along with the email data, as in the
screenshot below.


I need to extract the exact data only but this data comes with some MIME
information.

Can anyone please help me to extract data only or remove the MIME
information from file?

Thanks,
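
A minimal sketch of what "removing the MIME information" amounts to, assuming the FlowFile content is the complete raw message (headers plus MIME parts) as fetched from the server. It uses only Python's standard `email` package; the function names `extract_attachments` and `build_sample` are made up for this illustration. Inside NiFi the same step would normally be done by a processor rather than a script (the email bundle may ship an ExtractEmailAttachments processor, depending on your version).

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def extract_attachments(raw_message: bytes) -> dict:
    """Return {filename: decoded bytes} for each attachment in a raw MIME message."""
    msg = message_from_bytes(raw_message, policy=policy.default)
    attachments = {}
    for index, part in enumerate(msg.iter_attachments()):
        # Fall back to a generated name when the part carries no filename.
        name = part.get_filename() or f"attachment-{index}"
        # decode=True undoes the base64 / quoted-printable transfer encoding.
        attachments[name] = part.get_payload(decode=True)
    return attachments


def build_sample() -> bytes:
    """Build a small message with one attachment, standing in for real mail."""
    msg = EmailMessage()
    msg["Subject"] = "report"
    msg["From"] = "sender@example.com"
    msg["To"] = "receiver@example.com"
    msg.set_content("See attached.")
    msg.add_attachment(
        b"id,value\n1,42\n",
        maintype="application",
        subtype="octet-stream",
        filename="data.csv",
    )
    return msg.as_bytes()
```

Calling `extract_attachments(build_sample())` yields only the attachment bytes keyed by filename, without the surrounding headers or boundaries.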


Does NiFi support multiple queries

2016-09-20 Thread Karthik Ramakrishnan
Hello -

I was wondering whether NiFi can support multiple queries in the same PutSQL
processor. For example, if an attribute is set to 'update', will PutSQL
run the defined update query, and the next time, when it is 'insert', run
the insert query? Or should we add two separate processors and
make the decision in a RouteOnAttribute processor? Any thoughts would be
welcome!!

TIA!!

-- 
Thanks,
*Karthik Ramakrishnan*
*Data Services Intern*
*Copart Inc.*
*Contact : +1 (469) 951-8854*
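
A rough sketch of the decision being asked about, written outside NiFi. PutSQL executes whatever SQL statement is in the FlowFile content, so one processor can run inserts and updates as long as each FlowFile carries the right statement. The snippet below only illustrates choosing a statement from an attribute value; the `STATEMENTS` mapping, the `items` table, and the function names are assumptions for the example, and SQLite stands in for the real database.

```python
import sqlite3

# Hypothetical mapping from an attribute value to a parameterized statement.
STATEMENTS = {
    "insert": "INSERT INTO items (id, name) VALUES (:id, :name)",
    "update": "UPDATE items SET name = :name WHERE id = :id",
}


def apply_record(conn, operation, record):
    """Run the statement selected by `operation`; unknown values raise ValueError."""
    try:
        sql = STATEMENTS[operation]
    except KeyError:
        raise ValueError(f"unsupported operation: {operation!r}") from None
    conn.execute(sql, record)


def demo():
    """Insert one row, update it, and return the table contents."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
    apply_record(conn, "insert", {"id": 1, "name": "first"})
    apply_record(conn, "update", {"id": 1, "name": "renamed"})
    return conn.execute("SELECT id, name FROM items").fetchall()
```

Whether the choice is made by routing to two processors or by generating a different statement per FlowFile is a flow design decision; the sketch only shows that the selection itself is a simple lookup.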


RE: Requesting Obscene FlowFile Batch Sizes

2016-09-20 Thread Peter Wicks (pwicks)
Andy/Bryan,

Thanks for all of the detail, it’s been helpful.
I actually did an experiment this morning where I modified the processor to 
force it to keep calling `get` until it had all 1 million FlowFiles.  Since I 
was calling it sequentially it was able to move files out of swap and into 
active on each request. I was able to retrieve them and process them through, 
which was great until… NiFi tried to move them through provenance.  At that 
point NiFi ran out of memory and fell over (stopped responding).  Right before 
NiFi ran out of memory I received several bulletins related to Provenance being 
written to too quickly, and that it was being slowed down.

I found another solution to my mass insert and got it up and running. Using a 
Teradata JDBC proprietary flag called FastLoadCSV, and a new custom processor, 
I was able to pass in a CSV file to my JDBC driver and get the same result.  In 
this scenario there was just a single FlowFile and everything went smoothly.

Thanks again!

Peter Wicks
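
The experiment described above (keep asking the queue until the full batch has been collected) can be pictured with a toy model. This is not NiFi code: `SwappingQueue`, `active_limit`, and `drain_batch` are invented for the illustration, and the swap-in happens synchronously here, whereas the thread says NiFi uses a background thread.

```python
from collections import deque


class SwappingQueue:
    """Toy model: only `active_limit` items are in memory; the rest are swapped out."""

    def __init__(self, items, active_limit):
        self._swapped = deque(items)
        self._active = deque()
        self._active_limit = active_limit
        self._swap_in()

    def _swap_in(self):
        # Refill the in-memory portion up to the limit.
        while self._swapped and len(self._active) < self._active_limit:
            self._active.append(self._swapped.popleft())

    def poll(self, max_items):
        """Return at most `max_items`, but never more than is currently active."""
        count = min(max_items, len(self._active))
        batch = [self._active.popleft() for _ in range(count)]
        self._swap_in()
        return batch


def drain_batch(queue, wanted):
    """Poll repeatedly until `wanted` items are collected or the queue is empty."""
    collected = []
    while len(collected) < wanted:
        chunk = queue.poll(wanted - len(collected))
        if not chunk:
            break
        collected.extend(chunk)
    return collected
```

A single `poll` is capped by the active limit, which mirrors the 10k ceiling reported earlier in the thread; the loop gets past the cap at the cost of holding the whole batch in memory, which is where the provenance and heap problems showed up.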



From: Bryan Bende [mailto:bbe...@gmail.com]
Sent: Tuesday, September 20, 2016 3:38 PM
To: users@nifi.apache.org
Subject: Re: Requesting Obscene FlowFile Batch Sizes

Andy,

That was my thinking. An easy test might be to bump the threshold up to 100k 
(increase heap if needed) and see if it starts grabbing 100k every time.

If it does, then I would think it is swapping related. You would then need to 
figure out whether you really want to get all 1 million in a single batch, and 
whether there's enough heap to support that.

-Bryan

On Tue, Sep 20, 2016 at 5:29 PM, Andy LoPresto wrote:
Bryan,

That’s a good point. Would running with a larger Java heap and higher swap 
threshold allow Peter to get larger batches out?

Andy LoPresto
alopre...@apache.org
alopresto.apa...@gmail.com
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69

On Sep 20, 2016, at 1:41 PM, Bryan Bende wrote:

Peter,

Does 10k happen to be your swap threshold in nifi.properties by any chance (it 
defaults to 20k I believe)?

I suspect the behavior you are seeing could be due to the way swapping works, 
but Mark or others could probably confirm.

I found this thread where Mark explained how swapping works with a background 
thread, and I believe it still works this way:
http://apache-nifi.1125220.n5.nabble.com/Nifi-amp-Spark-receiver-performance-configuration-td524.html

-Bryan

On Tue, Sep 20, 2016 at 10:22 AM, Peter Wicks (pwicks) wrote:
I’m using JSONToSQL, followed by PutSQL.  I’m using Teradata, which supports a 
special JDBC mode called FastLoad, designed for a minimum of 100,000 rows of 
data per batch.

What I’m finding is that when PutSQL requests a new batch of FlowFiles from the 
queue, which has over 1 million rows in it, with a batch size of 100, it 
always returns a maximum of 10k.  How can I get my obscenely sized batch 
request to return all the FlowFiles I’m asking for?

Thanks,
  Peter





Re: Download item from queue - what permission is required?

2016-09-20 Thread Andre
Matt,

Thank you for looking at this. I was finding it particularly weird I
couldn't find a way of downloading the content. :-)

Cheers

On Wed, Sep 21, 2016 at 2:24 AM, Matt Gilman 
wrote:

> I think I see the issue, and someone else just submitted a similar JIRA [1]
> that is caused by the same bug. When using an authentication mechanism that
> relies on API tokens, download requests are processed using a one-time
> password token (since the token becomes part of the URL). These tokens are
> only honored for certain endpoints, and that list does not appear correct.
>
> As a workaround, you could use client certificates, download via a curl
> command, or use View, as it is not subject to the same endpoint check (when
> not clustered).
>
> Matt
>
> [1] https://issues.apache.org/jira/browse/NIFI-2797
>
> On Tue, Sep 20, 2016 at 12:02 PM, Peter Wicks (pwicks) 
> wrote:
>
>> Andre/Matt,
>>
>>
>>
>> Sorry, my memory was wrong. My experience matches Andre’s, it only errors
>> when I click Download; View is fine.
>>
>>
>>
>> We are running a customized build of 1.0 and I made the assumption that
>> this was an issue caused by a bad merge on our part and wasn’t paying it
>> much attention. I have not submitted a JIRA ticket.
>>
>>
>>
>> We are not clustered, running Kerberos for authentication.
>>
>>
>>
>> Thanks,
>>
>>   Peter
>>
>>
>>
>>
>>
>> *From:* Matt Gilman [mailto:matt.c.gil...@gmail.com]
>> *Sent:* Tuesday, September 20, 2016 9:55 AM
>> *To:* users@nifi.apache.org
>> *Subject:* Re: Download item from queue - what permission is required?
>>
>>
>>
>> Downloading and viewing should require the same permissions. If you're seeing
>> otherwise, please file a JIRA with the details. Is the instance clustered,
>> what permissions do you have set on the source component, etc.?
>>
>>
>>
>> Andre,
>>
>>
>>
>> 'View the data' is the correct policy that you need to configure. Is
>> your instance clustered, or is there anything proxying user requests? Any
>> endpoint that transfers 'data' (or 'metadata' like flow file
>> attributes) requires that every link in the chain has the 'view the
>> data' policy enabled. This ensures that every system between the user and
>> NiFi is authorized to have the data.
>>
>>
>>
>> Let me know if that helps.
>>
>>
>>
>> Matt
>>
>>
>>
>> On Tue, Sep 20, 2016 at 11:41 AM, Andre  wrote:
>>
>> Peter,
>>
>>
>>
>> Quite curious as I am able to view the flowfile but unable to download it.
>>
>> This seems like something we should either document (how to set it up
>> properly) or fix in the next release.
>>
>>
>>
>> Have you already raised a JIRA?
>>
>>
>>
>>
>>
>> On Wed, Sep 21, 2016 at 12:30 AM, Peter Wicks (pwicks) 
>> wrote:
>>
>> No help here, except to share that I’ve also seen this error.  I’ve been
>> working around it by downloading the FlowFile instead of viewing it.
>>
>>
>>
>> *From:* Andre [mailto:andre-li...@fucs.org]
>> *Sent:* Monday, September 19, 2016 11:18 PM
>> *To:* users@nifi.apache.org
>> *Subject:* Download item from queue - what permission is required?
>>
>>
>>
>> Hi there,
>>
>>
>>
>>
>>
>> I am puzzled by one of the 1.0.0 features. I had some flowfiles in the queue
>> and, as customary, I did a list queue.
>>
>>
>>
>> Flowfile was in there, attributes in perfect shape. Yet when I try to
>> download the data of the flowfile (i.e. click the download button) it
>> reports I don't have permissions.
>>
>>
>>
>> I would assume the permissions required would be "view the data"?
>>
>>
>>
>>
>>
>> Cheers
>>
>>
>>
>>
>>
>
>


Re: Requesting Obscene FlowFile Batch Sizes

2016-09-20 Thread Bryan Bende
Andy,

That was my thinking. An easy test might be to bump the threshold up to
100k (increase heap if needed) and see if it starts grabbing 100k every
time.

If it does, then I would think it is swapping related. You would then need to
figure out whether you really want to get all 1 million in a single batch, and
whether there's enough heap to support that.

-Bryan

On Tue, Sep 20, 2016 at 5:29 PM, Andy LoPresto  wrote:

> Bryan,
>
> That’s a good point. Would running with a larger Java heap and higher swap
> threshold allow Peter to get larger batches out?
>
> Andy LoPresto
> alopre...@apache.org
> alopresto.apa...@gmail.com
> PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69
>
> On Sep 20, 2016, at 1:41 PM, Bryan Bende  wrote:
>
> Peter,
>
> Does 10k happen to be your swap threshold in nifi.properties by any chance
> (it defaults to 20k I believe)?
>
> I suspect the behavior you are seeing could be due to the way swapping
> works, but Mark or others could probably confirm.
>
> I found this thread where Mark explained how swapping works with a
> background thread, and I believe it still works this way:
> http://apache-nifi.1125220.n5.nabble.com/Nifi-amp-Spark-receiver-performance-configuration-td524.html
>
> -Bryan
>
> On Tue, Sep 20, 2016 at 10:22 AM, Peter Wicks (pwicks) 
> wrote:
>
>> I’m using JSONToSQL, followed by PutSQL.  I’m using Teradata, which
>> supports a special JDBC mode called FastLoad, designed for a minimum of
>> 100,000 rows of data per batch.
>>
>>
>>
>> What I’m finding is that when PutSQL requests a new batch of FlowFiles
>> from the queue, which has over 1 million rows in it, with a batch size of
>> 100, it always returns a maximum of 10k.  How can I get my obscenely
>> sized batch request to return all the FlowFiles I’m asking for?
>>
>>
>>
>> Thanks,
>>
>>   Peter
>>
>
>
>


Re: Requesting Obscene FlowFile Batch Sizes

2016-09-20 Thread Andy LoPresto
Bryan,

That’s a good point. Would running with a larger Java heap and higher swap 
threshold allow Peter to get larger batches out?

Andy LoPresto
alopre...@apache.org
alopresto.apa...@gmail.com
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69

> On Sep 20, 2016, at 1:41 PM, Bryan Bende  wrote:
> 
> Peter,
> 
> Does 10k happen to be your swap threshold in nifi.properties by any chance 
> (it defaults to 20k I believe)?
> 
> I suspect the behavior you are seeing could be due to the way swapping works, 
> but Mark or others could probably confirm.
> 
> I found this thread where Mark explained how swapping works with a background 
> thread, and I believe it still works this way:
> http://apache-nifi.1125220.n5.nabble.com/Nifi-amp-Spark-receiver-performance-configuration-td524.html
>  
> 
> 
> -Bryan
> 
> On Tue, Sep 20, 2016 at 10:22 AM, Peter Wicks (pwicks) wrote:
> I’m using JSONToSQL, followed by PutSQL.  I’m using Teradata, which supports 
> a special JDBC mode called FastLoad, designed for a minimum of 100,000 rows 
> of data per batch.
> 
> 
> 
> What I’m finding is that when PutSQL requests a new batch of FlowFiles from 
> the queue, which has over 1 million rows in it, with a batch size of 100, 
> it always returns a maximum of 10k.  How can I get my obscenely sized batch 
> request to return all the FlowFiles I’m asking for?
> 
> 
> 
> Thanks,
> 
>   Peter
> 
> 





Re: Requesting Obscene FlowFile Batch Sizes

2016-09-20 Thread Bryan Bende
Peter,

Does 10k happen to be your swap threshold in nifi.properties by any chance
(it defaults to 20k I believe)?

I suspect the behavior you are seeing could be due to the way swapping
works, but Mark or others could probably confirm.

I found this thread where Mark explained how swapping works with a
background thread, and I believe it still works this way:
http://apache-nifi.1125220.n5.nabble.com/Nifi-amp-Spark-receiver-performance-configuration-td524.html

-Bryan

On Tue, Sep 20, 2016 at 10:22 AM, Peter Wicks (pwicks) 
wrote:

> I’m using JSONToSQL, followed by PutSQL.  I’m using Teradata, which
> supports a special JDBC mode called FastLoad, designed for a minimum of
> 100,000 rows of data per batch.
>
>
>
> What I’m finding is that when PutSQL requests a new batch of FlowFiles
> from the queue, which has over 1 million rows in it, with a batch size of
> 100, it always returns a maximum of 10k.  How can I get my obscenely
> sized batch request to return all the FlowFiles I’m asking for?
>
>
>
> Thanks,
>
>   Peter
>


Re: PutS3 object returns jvm out of memory or disk out of memory

2016-09-20 Thread Selvam Raman
I have 500+ HTTP requests, each of which returns a file of varying size
that will be stored in S3.

For each HTTP (OAI-PMH) request we get a file to put into S3.

So the content repository keeps growing with the file sizes. At some
point it suddenly reaches 4.6 GB, which is all the available disk space on my
machine.

I do not know why the content repository keeps the files even though I set
content.archive to false.

I don't know how to limit the content repository size. If I am
going to put 1 TB of data into S3, do I need 1 TB of content repository?
I am clueless.

Thanks,
Selvam R
+91-97877-87724
On Sep 20, 2016 5:22 PM, "Aldrin Piri"  wrote:

> Hi Selvam,
>
> As mentioned, please keep messages to the one list. Moving dev to bcc
> again.
>
> Archiving is only applicable for that content which has exited the flow
> and is not referenced by any FlowFiles currently in your processing graph,
> similar to garbage collection in Java.  For this particular instance,
> unless there is content already on disk, this would likely not provide a
> remedy.
>
> The image did not show for me in my mail client, but was able to locate it
> at a list archive:
> http://apache-nifi.1125220.n5.nabble.com/attachment/12226/0/image.png
>
> That error shows InvokeHTTP providing an error.  Could you clarify if this
> is happening just on that processor or also on the previously mentioned
> PutS3?
>
> Could you possibly provide a template of your flow for inspection or
> provide more details about what it is doing?  Are there connections with
> large queues?  Does a "df -h" show that your instance partition is
> exhausted?
>
> NiFi will continuously bring data into the system and depending on what
> you are doing, will continue until disk space is exhausted which seems to
> be the issue at hand.  NiFi provides facilities to aid in avoiding
> situations such as these inclusive of backpressure and FlowFile
> expiration.  Upon introducing content into a flow, NiFi holds onto this
> until it finishes its path through the flow or is expunged via expiration
> making it eligible for removal and/or archival from the backing content
> repository.
>
> Thanks!
>
> On Tue, Sep 20, 2016 at 12:05 PM, Selvam Raman  wrote:
>
>> In my case it is going out of disk space.
>>
>> i set nifi.content.repository.archive.enabled=false. (when i changed this
>> have restarted nifi cluster )
>>
>> But still i can see the processor keep on writing here on the disk.
>>
>> On Tue, Sep 20, 2016 at 4:34 PM, Joe Witt  wrote:
>>
>> > Hello
>> >
>> > Please only post to one list.  I have moved 'dev@nifi' to bcc.
>> >
>> > In the docs for this processor [1] you'll find reference to "Multipart
>> > Part Size".  Set that to a smaller value appropriate for your JVM
>> > memory settings.  For instance, if you have a default JVM heap size of
>> > 512MB you'll want something far smaller like 50MB.  At least I suspect
>> > this is the issue.
>> >
>> > [1] https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.aws.s3.PutS3Object/index.html
>> >
>> > On Tue, Sep 20, 2016 at 11:30 AM, Selvam Raman 
>> wrote:
>> > > HI,
>> > >
>> > > I am pushing data to s3  using puts3object. I have setup nifi 1.0 zero
>> > > master cluster.
>> > >
>> > > Ec2 instance having only 8GB of hard disk. Content repository writing
>> > till
>> > > 4.6 gb of data then it throws jvm out of memory error.
>> > >
>> > > I changed nifi.properties for nifi.content.archive to false. but still
>> > it is
>> > > keep on writing.
>> > >
>> > > please help me.
>> > >
>> > > --
>> > > Selvam Raman
>> > > "லஞ்சம் தவிர்த்து நெஞ்சம் நிமிர்த்து"
>> >
>>
>>
>>
>> --
>> Selvam Raman
>> "லஞ்சம் தவிர்த்து நெஞ்சம் நிமிர்த்து"
>>
>
>


Re: UI: feedback on the processor 'color' in NiFi 1.0

2016-09-20 Thread Andrew Grande
No need to go wild, changing processor colors should be enough, IMO. PG and
RPG are possible candidates, but they are different enough already, I guess.

What I heard quite often was to differentiate between regular processors,
incoming sources of data and out only (data producers?). Maybe even with a
shape?

Andrew

On Tue, Sep 20, 2016, 12:35 PM Rob Moran  wrote:

> Good points. I was thinking a label would be tied to the group of
> components to which it was applied, but that could also introduce problems
> as things move and are added to a flow.
>
> So would you all expect to be able to change the color of every component
> type, or just processors?
>
> Andrew - your comment about coloring terminators red is interesting as
> well. What are some other parts of a flow you might use color to identify?
> Along with backpressure, we could explore other ways to call these things
> out so users do not come up with their own methods. Perhaps there are layer
> options, like on a map (e.g., "show terrain" or "show traffic").
>
> Rob
>
> On Tue, Sep 20, 2016 at 11:23 AM, Andrew Grande 
> wrote:
>
>> I agree. Labels are great for grouping, beyond PGs. Processor colors
>> individually add value. E.g. flow terminator colored in red was a very
>> common pattern I used. Besides, labels are not grouped with components, so
>> moving things and re-arranging is a pain.
>>
>> Andrew
>>
>> On Tue, Sep 20, 2016, 11:21 AM Joe Skora  wrote:
>>
>>> Rob,
>>>
>>> The labelling functionality you described sounds very useful in
>>> general.  But, I miss the processor color too.
>>>
>>> I think labels are really useful for identifying groups of components
>>> and areas in the flow, but I worry that needing to use them in volume for
>>> processor coloring will increase the API and browser canvas load for
>>> elements that don't actually affect the flow.
>>>
>>> On Tue, Sep 20, 2016 at 10:40 AM, Rob Moran  wrote:
>>>
>>>> What if we promote the use of Labels as a way to highlight things. We
>>>> could add functionality to expand their usefulness as a way to highlight
>>>> things on the canvas. I believe that is their intended use.
>>>>
>>>> Today you can create a label and change its color to highlight single
>>>> or multiple components. Even better you can do it for any component (not
>>>> just processors).
>>>>
>>>> To expand on functionality, I'm imagining a context menu and palette
>>>> action to "Label" a selected component or components. This would prompt
>>>> a user to pick a background and add text which would place a label
>>>> around everything once it's applied.
>>>>
>>>> Rob
>>>>
>>>> On Mon, Sep 19, 2016 at 6:42 PM, Jeff wrote:
>>>>
>>>>> I was thinking, in addition to changing the color of the icon on the
>>>>> processor, that the color of the drop shadow could be changed as well.
>>>>> That would provide more contrast, but preserve readability, in my opinion.
>>>>>
>>>>> On Mon, Sep 19, 2016 at 6:39 PM Andrew Grande wrote:
>>>>>
>>>>>> Hi All,
>>>>>>
>>>>>> Rolling with UI feedback threads. This time I'd like to discuss how
>>>>>> NiFi 'lost' its ability to change processor boxes color. I.e. as you can
>>>>>> see from a screenshot attached, it does change color for the processor in
>>>>>> the flow overview panel, but the processor itself only changes the icon in
>>>>>> the top-left of the box. I came across a few users who definitely miss the
>>>>>> old way. I personally think changing the icon color for the processor
>>>>>> doesn't go far enough, especially when one is dealing with a flow of
>>>>>> several dozen processors, zooms in and out often. The overview helps, but
>>>>>> it's not the same.
>>>>>>
>>>>>> Proposal - can we restore how color selection for the processor
>>>>>> changed the actual background of the processor box on the canvas? Let the
>>>>>> user go wild with colors and deal with readability, but at least it's easy
>>>>>> to spot 'important' things this way. And with multi-tenant authorization it
>>>>>> becomes a poor-man's doc between teams, to an extent.
>>>>>>
>>>>>> Thanks for any feedback,
>>>>>> Andrew
>>>>>>
>>>>>
>>>>
>>>
>


Re: UI: feedback on the processor 'color' in NiFi 1.0

2016-09-20 Thread Rob Moran
Good points. I was thinking a label would be tied to the group of
components to which it was applied, but that could also introduce problems
as things move and are added to a flow.

So would you all expect to be able to change the color of every component
type, or just processors?

Andrew - your comment about coloring terminators red is interesting as
well. What are some other parts of a flow you might use color to identify?
Along with backpressure, we could explore other ways to call these things
out so users do not come up with their own methods. Perhaps there are layer
options, like on a map (e.g., "show terrain" or "show traffic").

Rob

On Tue, Sep 20, 2016 at 11:23 AM, Andrew Grande  wrote:

> I agree. Labels are great for grouping, beyond PGs. Processor colors
> individually add value. E.g. flow terminator colored in red was a very
> common pattern I used. Besides, labels are not grouped with components, so
> moving things and re-arranging is a pain.
>
> Andrew
>
> On Tue, Sep 20, 2016, 11:21 AM Joe Skora  wrote:
>
>> Rob,
>>
>> The labelling functionality you described sounds very useful in general.
>> But, I miss the processor color too.
>>
>> I think labels are really useful for identifying groups of components and
>> areas in the flow, but I worry that needing to use them in volume for
>> processor coloring will increase the API and browser canvas load for
>> elements that don't actually affect the flow.
>>
>> On Tue, Sep 20, 2016 at 10:40 AM, Rob Moran  wrote:
>>
>>> What if we promote the use of Labels as a way to highlight things. We
>>> could add functionality to expand their usefulness as a way to highlight
>>> things on the canvas. I believe that is their intended use.
>>>
>>> Today you can create a label and change its color to highlight single or
>>> multiple components. Even better you can do it for any component (not just
>>> processors).
>>>
>>> To expand on functionality, I'm imagining a context menu and palette
>>> action to "Label" a selected component or components. This would prompt
>>> a user to pick a background and add text which would place a label
>>> around everything once it's applied.
>>>
>>> Rob
>>>
>>> On Mon, Sep 19, 2016 at 6:42 PM, Jeff  wrote:
>>>
>>>> I was thinking, in addition to changing the color of the icon on the
>>>> processor, that the color of the drop shadow could be changed as well.
>>>> That would provide more contrast, but preserve readability, in my opinion.
>>>>
>>>> On Mon, Sep 19, 2016 at 6:39 PM Andrew Grande wrote:
>>>>
>>>>> Hi All,
>>>>>
>>>>> Rolling with UI feedback threads. This time I'd like to discuss how
>>>>> NiFi 'lost' its ability to change processor boxes color. I.e. as you can
>>>>> see from a screenshot attached, it does change color for the processor in
>>>>> the flow overview panel, but the processor itself only changes the icon in
>>>>> the top-left of the box. I came across a few users who definitely miss the
>>>>> old way. I personally think changing the icon color for the processor
>>>>> doesn't go far enough, especially when one is dealing with a flow of
>>>>> several dozen processors, zooms in and out often. The overview helps, but
>>>>> it's not the same.
>>>>>
>>>>> Proposal - can we restore how color selection for the processor
>>>>> changed the actual background of the processor box on the canvas? Let the
>>>>> user go wild with colors and deal with readability, but at least it's easy
>>>>> to spot 'important' things this way. And with multi-tenant authorization it
>>>>> becomes a poor-man's doc between teams, to an extent.
>>>>>
>>>>> Thanks for any feedback,
>>>>> Andrew
>>>>>
>>>>
>>>
>>


Re: Download item from queue - what permission is required?

2016-09-20 Thread Matt Gilman
I think I see the issue, and someone else just submitted a similar JIRA [1]
that is caused by the same bug. When using an authentication mechanism that
relies on API tokens, download requests are processed using a one-time
password token (since the token becomes part of the URL). These tokens are
only honored for certain endpoints, and that list does not appear correct.

As a workaround, you could use client certificates, download via a curl
command, or use View, as it is not subject to the same endpoint check (when
not clustered).

Matt

[1] https://issues.apache.org/jira/browse/NIFI-2797
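
The "download via a curl command" workaround amounts to requesting the FlowFile content over the REST API with the API token sent as a header instead of a one-time token in the URL. The sketch below only builds such a request. The endpoint path, the example host, and the bearer-token header are assumptions about the NiFi 1.x REST API and should be checked against the REST API documentation for your version; `build_content_request` is a name made up for this example.

```python
from urllib.parse import quote
from urllib.request import Request


def build_content_request(base_url, connection_id, flowfile_uuid, token):
    """Build (but do not send) a GET request for a queued FlowFile's content."""
    # Assumed endpoint layout; verify against your NiFi version's REST API docs.
    path = "/nifi-api/flowfile-queues/{}/flowfiles/{}/content".format(
        quote(connection_id, safe=""),
        quote(flowfile_uuid, safe=""),
    )
    return Request(
        base_url.rstrip("/") + path,
        headers={"Authorization": "Bearer " + token},
    )
```

Passing the returned object to `urllib.request.urlopen` would perform the download; on a secured instance the TLS settings need to match the server's certificate.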

On Tue, Sep 20, 2016 at 12:02 PM, Peter Wicks (pwicks) 
wrote:

> Andre/Matt,
>
>
>
> Sorry, my memory was wrong. My experience matches Andre’s, it only errors
> when I click Download; View is fine.
>
>
>
> We are running a customized build of 1.0 and I made the assumption that
> this was an issue caused by a bad merge on our part and wasn’t paying it
> much attention. I have not submitted a JIRA ticket.
>
>
>
> We are not clustered, running Kerberos for authentication.
>
>
>
> Thanks,
>
>   Peter
>
>
>
>
>
> *From:* Matt Gilman [mailto:matt.c.gil...@gmail.com]
> *Sent:* Tuesday, September 20, 2016 9:55 AM
> *To:* users@nifi.apache.org
> *Subject:* Re: Download item from queue - what permission is required?
>
>
>
> Downloading and viewing should require the same permissions. If you're seeing
> otherwise, please file a JIRA with the details. Is the instance clustered,
> what permissions do you have set on the source component, etc.?
>
>
>
> Andre,
>
>
>
> 'View the data' is the correct policy that you need to configure. Is
> your instance clustered, or is there anything proxying user requests? Any
> endpoint that transfers 'data' (or 'metadata' like flow file
> attributes) requires that every link in the chain has the 'view the
> data' policy enabled. This ensures that every system between the user and
> NiFi is authorized to have the data.
>
>
>
> Let me know if that helps.
>
>
>
> Matt
>
>
>
> On Tue, Sep 20, 2016 at 11:41 AM, Andre  wrote:
>
> Peter,
>
>
>
> Quite curious as I am able to view the flowfile but unable to download it.
>
> This seems like something we should either document (how to set it up
> properly) or fix in the next release.
>
>
>
> Have you already raised a JIRA?
>
>
>
>
>
> On Wed, Sep 21, 2016 at 12:30 AM, Peter Wicks (pwicks) 
> wrote:
>
> No help here, except to share that I’ve also seen this error.  I’ve been
> working around it by downloading the FlowFile instead of viewing it.
>
>
>
> *From:* Andre [mailto:andre-li...@fucs.org]
> *Sent:* Monday, September 19, 2016 11:18 PM
> *To:* users@nifi.apache.org
> *Subject:* Download item from queue - what permission is required?
>
>
>
> Hi there,
>
>
>
>
>
> I am puzzled by one of the 1.0.0 features. I had some flowfiles in the queue
> and, as customary, I did a list queue.
>
>
>
> Flowfile was in there, attributes in perfect shape. Yet when I try to
> download the data of the flowfile (i.e. click the download button) it
> reports I don't have permissions.
>
>
>
> I would assume the permissions required would be "view the data"?
>
>
>
>
>
> Cheers
>
>
>
>
>


Re: PutS3 object returns jvm out of memory

2016-09-20 Thread Aldrin Piri
Hi Selvam,

As mentioned, please keep messages to the one list. Moving dev to bcc
again.

Archiving is only applicable for that content which has exited the flow and
is not referenced by any FlowFiles currently in your processing graph,
similar to garbage collection in Java.  For this particular instance,
unless there is content already on disk, this would likely not provide a
remedy.

The image did not show for me in my mail client, but I was able to locate it
at a list archive:
http://apache-nifi.1125220.n5.nabble.com/attachment/12226/0/image.png

That error shows InvokeHTTP providing an error.  Could you clarify if this
is happening just on that processor or also on the previously mentioned
PutS3?

Could you possibly provide a template of your flow for inspection or
provide more details about what it is doing?  Are there connections with
large queues?  Does a "df -h" show that your instance partition is
exhausted?

NiFi will continuously bring data into the system and, depending on what you
are doing, will continue until disk space is exhausted, which seems to be
the issue at hand.  NiFi provides facilities to help avoid situations
such as these, including backpressure and FlowFile expiration.  Once
content is introduced into a flow, NiFi holds onto it until it finishes its
path through the flow or is expunged via expiration, at which point it becomes
eligible for removal and/or archival from the backing content repository.

Thanks!
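
For the "df -h" check mentioned above, a small script can report the same information for the partition that holds the content repository. This is a sketch using Python's standard `shutil.disk_usage`; the function names and the 10% default threshold are arbitrary choices for the example, and the path you pass should be wherever your content repository lives.

```python
import shutil


def free_fraction(path):
    """Return the fraction (0.0 to 1.0) of free space on the partition holding `path`."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


def is_nearly_full(path, min_free_fraction=0.1):
    """True when less than `min_free_fraction` of the partition is free."""
    return free_fraction(path) < min_free_fraction
```

A check like this only tells you that the disk is filling up; keeping it from filling is what backpressure and FlowFile expiration on the connections are for.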

On Tue, Sep 20, 2016 at 12:05 PM, Selvam Raman  wrote:

> In my case it is going out of disk space.
>
> i set nifi.content.repository.archive.enabled=false. (when i changed this
> have restarted nifi cluster )
>
> But still i can see the processor keep on writing here on the disk.
>
> On Tue, Sep 20, 2016 at 4:34 PM, Joe Witt  wrote:
>
> > Hello
> >
> > Please only post to one list.  I have moved 'dev@nifi' to bcc.
> >
> > In the docs for this processor [1] you'll find reference to "Multipart
> > Part Size".  Set that to a smaller value appropriate for your JVM
> > memory settings.  For instance, if you have a default JVM heap size of
> > 512MB you'll want something far smaller like 50MB.  At least I suspect
> > this is the issue.
> >
> > [1] https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.aws.s3.PutS3Object/index.html
> >
> > On Tue, Sep 20, 2016 at 11:30 AM, Selvam Raman  wrote:
> > > HI,
> > >
> > > I am pushing data to s3  using puts3object. I have setup nifi 1.0 zero
> > > master cluster.
> > >
> > > Ec2 instance having only 8GB of hard disk. Content repository writing
> > till
> > > 4.6 gb of data then it throws jvm out of memory error.
> > >
> > > I changed nifi.properties for nifi.content.archive to false. but still
> > it is
> > > keep on writing.
> > >
> > > please help me.
> > >
> > > --
> > > Selvam Raman
> > > "லஞ்சம் தவிர்த்து நெஞ்சம் நிமிர்த்து"
> >
>
>
>
> --
> Selvam Raman
> "லஞ்சம் தவிர்த்து நெஞ்சம் நிமிர்த்து"
>


RE: Download item from queue - what permission is required?

2016-09-20 Thread Peter Wicks (pwicks)
Andre/Matt,

Sorry, my memory was wrong. My experience matches Andre’s, it only errors when 
I click Download; View is fine.

We are running a customized build of 1.0 and I made the assumption that this 
was an issue caused by a bad merge on our part and wasn’t paying it much 
attention. I have not submitted a JIRA ticket.

We are not clustered, running Kerberos for authentication.

Thanks,
  Peter


From: Matt Gilman [mailto:matt.c.gil...@gmail.com]
Sent: Tuesday, September 20, 2016 9:55 AM
To: users@nifi.apache.org
Subject: Re: Download item from queue - what permission is required?

Downloading and viewing should require the same permissions. If you're seeing 
otherwise, please file a JIRA with the details. Is the instance clustered, what 
permissions do you have set on the source component, etc.?

Andre,

'View the data' is the correct policy that you need to configure. Is your 
instance clustered, or is there anything proxying user requests? Any endpoint 
that transfers 'data' (or 'metadata' like flow file attributes) requires 
that every link in the chain has the 'view the data' policy enabled. 
This ensures that every system between the user and NiFi is authorized to have 
the data.

Let me know if that helps.

Matt

On Tue, Sep 20, 2016 at 11:41 AM, Andre wrote:
Peter,

Quite curious as I am able to view the flowfile but unable to download it.

This seems like something we should either document (how to set it up properly) 
or fix in the next release.

Have you already raised a JIRA?


On Wed, Sep 21, 2016 at 12:30 AM, Peter Wicks (pwicks) wrote:
No help here, except to share that I’ve also seen this error.  I’ve been 
working around it by downloading the FlowFile instead of viewing it.

From: Andre [mailto:andre-li...@fucs.org]
Sent: Monday, September 19, 2016 11:18 PM
To: users@nifi.apache.org
Subject: Download item from queue - what permission is required?

Hi there,


I am puzzled by one of the 1.0.0 features. I had some flowfiles in the queue
and, as customary, I did a list queue.

Flowfile was in there, attributes in perfect shape. Yet when I try to download 
the data of the flowfile (i.e. click the download button) it reports I don't 
have permissions.

I would assume the permissions required would be "view the data"?


Cheers




Re: Download item from queue - what permission is required?

2016-09-20 Thread Matt Gilman
Downloading and viewing should require the same permissions. If you're seeing
otherwise, please file a JIRA with the details: is the instance clustered,
what permissions do you have set on the source component, etc.?

Andre,

The 'view the data' policy is the correct one to configure. Is your instance
clustered, or is there anything proxying user requests? Any endpoint that
transfers 'data' (or 'metadata' like flow file attributes) requires that
every link in the chain has the 'view the data' policy enabled. This ensures
that every system between the user and NiFi is authorized to have the data.
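
As a rough illustration of the rule described above (a toy model, not NiFi's actual authorizer code), the check can be thought of as requiring the policy for every identity in the request chain. The identity names and the `view_data_holders` set below are hypothetical:

```python
def chain_authorized(chain, view_data_holders):
    """Return True only if every identity in the request chain holds the policy.

    chain: identities in order, e.g. the end user followed by any proxy or
           cluster node that relays the request (hypothetical names).
    view_data_holders: identities granted 'view the data' on the component.
    """
    # An empty chain is treated as unauthorized rather than vacuously allowed.
    return bool(chain) and all(identity in view_data_holders for identity in chain)


holders = {"andre", "CN=node1"}
print(chain_authorized(["andre"], holders))              # direct request
print(chain_authorized(["andre", "CN=node1"], holders))  # relayed by an authorized node
print(chain_authorized(["andre", "CN=proxy"], holders))  # proxy lacks the policy
```

In this model a user who holds the policy can still be refused when a proxy or node in between does not, which is why the questions about clustering and proxying matter.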

Let me know if that helps.

Matt

On Tue, Sep 20, 2016 at 11:41 AM, Andre  wrote:

> Peter,
>
> Quite curious as I am able to view the flowfile but unable to download it.
>
> Seems like something we should either document (how to set it up properly)
> or fix in the next release.
>
> Have you already raised a JIRA?
>
>
> On Wed, Sep 21, 2016 at 12:30 AM, Peter Wicks (pwicks) 
> wrote:
>
>> No help here, except to share that I’ve also seen this error.  I’ve been
>> working around it by downloading the FlowFile instead of viewing it.
>>
>>
>>
>> *From:* Andre [mailto:andre-li...@fucs.org]
>> *Sent:* Monday, September 19, 2016 11:18 PM
>> *To:* users@nifi.apache.org
>> *Subject:* Download item from queue - what permission is required?
>>
>>
>>
>> Hi there,
>>
>>
>>
>>
>>
>> I am puzzled by one of the 1.0.0 features. I had some flowfiles in the queue
>> and, as customary, I did a list queue.
>>
>>
>>
>> Flowfile was in there, attributes in perfect shape. Yet when I try to
>> download the data of the flowfile (i.e. click the download button) it
>> reports I don't have permissions.
>>
>>
>>
>> I would assume the permissions required would be "view the data"?
>>
>>
>>
>>
>>
>> Cheers
>>
>
>


Re: Download item from queue - what permission is required?

2016-09-20 Thread Andre
Peter,

Quite curious as I am able to view the flowfile but unable to download it.

Seems like something we should either document (how to set it up properly) or
fix in the next release.

Have you already raised a JIRA?


On Wed, Sep 21, 2016 at 12:30 AM, Peter Wicks (pwicks) 
wrote:

> No help here, except to share that I’ve also seen this error.  I’ve been
> working around it by downloading the FlowFile instead of viewing it.
>
>
>
> *From:* Andre [mailto:andre-li...@fucs.org]
> *Sent:* Monday, September 19, 2016 11:18 PM
> *To:* users@nifi.apache.org
> *Subject:* Download item from queue - what permission is required?
>
>
>
> Hi there,
>
>
>
>
>
> I am puzzled by one of the 1.0.0 features. I had some flowfiles in the queue
> and, as customary, I did a list queue.
>
>
>
> Flowfile was in there, attributes in perfect shape. Yet when I try to
> download the data of the flowfile (i.e. click the download button) it
> reports I don't have permissions.
>
>
>
> I would assume the permissions required would be "view the data"?
>
>
>
>
>
> Cheers
>


Re: PutS3 object returns jvm out of memory

2016-09-20 Thread Joe Witt
Hello

Please only post to one list.  I have moved 'dev@nifi' to bcc.

In the docs for this processor [1] you'll find a reference to "Multipart
Part Size".  Set that to a smaller value appropriate for your JVM
memory settings.  For instance, if you have a default JVM heap size of
512MB you'll want something far smaller, like 50MB.  At least I suspect
this is the issue.

[1] 
https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.aws.s3.PutS3Object/index.html
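
To make the sizing arithmetic concrete, here is a small sketch. It assumes, as suspected above, that roughly one part is buffered in heap at a time; that is an assumption about the processor, not a confirmed detail. The function names are made up for this example, and the 5 GiB object and 1 GiB part size are arbitrary comparison values:

```python
MIB = 1024 * 1024


def part_count(object_size, part_size):
    """Number of multipart parts needed to upload object_size bytes."""
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    return -(-object_size // part_size)  # integer ceiling division


def heap_fraction(part_size, heap_size):
    """Share of the heap one buffered part would occupy (assumes one part in memory)."""
    return part_size / heap_size


heap = 512 * MIB            # the 512MB default heap mentioned above
obj = 5 * 1024 * MIB        # a hypothetical 5 GiB object
for part in (50 * MIB, 1024 * MIB):
    print(f"{part // MIB} MiB parts: {part_count(obj, part)} parts, "
          f"{heap_fraction(part, heap):.0%} of heap per part")
```

A part size larger than the heap gives a fraction above 100%, which is the situation the smaller setting is meant to avoid.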

On Tue, Sep 20, 2016 at 11:30 AM, Selvam Raman  wrote:
> Hi,
>
> I am pushing data to S3 using PutS3Object. I have set up a NiFi 1.0
> zero-master cluster.
>
> The EC2 instance has only 8 GB of disk. The content repository writes up to
> 4.6 GB of data and then NiFi throws a JVM out-of-memory error.
>
> I set nifi.content.archive to false in nifi.properties, but it still keeps
> writing.
>
> Please help me.
>
> --
> Selvam Raman
> "லஞ்சம் தவிர்த்து நெஞ்சம் நிமிர்த்து"


Re: PutS3 object returns jvm out of memory

2016-09-20 Thread Selvam Raman
This is the exact error.




On Tue, Sep 20, 2016 at 4:30 PM, Selvam Raman  wrote:

> Hi,
>
> I am pushing data to S3 using PutS3Object. I have set up a NiFi 1.0
> zero-master cluster.
>
> The EC2 instance has only 8 GB of disk. The content repository writes up to
> 4.6 GB of data and then NiFi throws a JVM out-of-memory error.
>
> I set nifi.content.archive to false in nifi.properties, but it still keeps
> writing.
>
> Please help me.
>
> --
> Selvam Raman
> "லஞ்சம் தவிர்த்து நெஞ்சம் நிமிர்த்து"
>



-- 
Selvam Raman
"லஞ்சம் தவிர்த்து நெஞ்சம் நிமிர்த்து"


Re: UI: flow status and counters feedback

2016-09-20 Thread Andrew Grande
I like the tooltip addition of yours.

For more interactive feedback on the canvas I can immediately think of 2
items.

1. Indicator for when backpressure was configured on a connection (although
it's now always added by default, maybe less useful).

2. Changing the color of a connection when backpressure has engaged could
go a long way. We could go further and use a gradient color based on how
close the connection backlog is to triggering the backpressure controls.
That would immediately highlight hotspots visually.
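
A minimal sketch of the gradient idea in item 2, assuming a simple linear green-to-red mapping of queue fill against the backpressure threshold. The function name and color scheme are invented for illustration and are not part of the NiFi UI:

```python
def backpressure_color(queued, threshold):
    """Map queue fill (0..threshold) to a hex color running from green to red."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    ratio = min(max(queued / threshold, 0.0), 1.0)  # clamp to [0, 1]
    red = int(255 * ratio)
    green = 255 - red
    return f"#{red:02x}{green:02x}00"


# An empty queue renders green, a queue at (or beyond) the threshold renders red.
for queued in (0, 5000, 10000, 20000):
    print(queued, backpressure_color(queued, 10000))
```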

Andrew

On Tue, Sep 20, 2016, 9:40 AM Rob Moran  wrote:

> Andrew,
>
> Thanks for the feedback on the status bar. Separation between items helps,
> but after your comments I realize how it can fail to feel like a single,
> cohesive group of items. We could probably tighten things up a bit.
>
> I think another part of this that could help would be to address some of
> the discussion around awareness of stats updating. Being able to call more
> attention (without being too intrusive) when stats change could help ease
> some of the burden of having to routinely scan the status bar to look for
> changes.
>
> Also related, I would like to see us get a tooltip that is seen when you
> hover anywhere on the status bar. That tooltip would provide more
> descriptive text about what each item means. It would help new users learn
> as well as provide detail and follow-on action when something is alerted.
>
> Let's see what others think and then I can work on filing a jira to
> capture thoughts.
>
> Rob
>
> On Mon, Sep 19, 2016 at 6:22 PM, Andrew Grande  wrote:
>
>> Hi All,
>>
>> I'd like to provide some feedback on the NiFi 1.0 UI now that I had a
>> chance to use it for a while, as well as pass along what I heard directly
>> from other end users.
>>
>> Attached is a screenshot of a status bar right above the main flow
>> canvas. The biggest difference from the 0.x UI is how much whitespace it
>> now has between elements. To a point where it's not possible to quickly
>> scan the state with a glance.
>>
>> Does anyone have other opinions? Can we adjust things slightly so they
>> are easier on the eye and have less horizontal friction?
>>
>> Thanks!
>> Andrew
>>
>>
>>
>


Re: Periodic delta pulls from a data source

2016-09-20 Thread Selvam Raman
Hi,

We are making OAI-PMH requests, invoked over HTTP.

On Tue, Sep 20, 2016 at 9:31 AM, Pierre Villard  wrote:

> Hi Selvam,
>
> Supposing that your source is a SQL-like source, you should have a look at
> the QueryDatabaseTable [1] processor. It offers a 'Maximum-value Columns'
> property that lets you specify the column containing an ID and/or
> timestamp. The processor will keep track of the maximum value for each
> column that has been returned since the processor started running. This can
> be used to retrieve only those rows that have been added/updated since the
> last retrieval. Note that some JDBC types such as bit/boolean are not
> conducive to maintaining maximum value, so columns of these types should
> not be listed in this property, and will result in error(s) during
> processing. If no columns are provided, all rows from the table will be
> considered, which could have a performance impact.
>
> [1] https://nifi.apache.org/docs/nifi-docs/
>
> Pierre
>
>
> 2016-09-20 8:50 GMT+02:00 Selvam Raman :
>
>>
>> > Hi,
>> >
>> > We have a requirement to pull data periodically from a data source. For
>> this to work we would like nifi to keep track of the last id or time stamp
>> that was pulled successfully so that the next pull starts from that point.
>> >
>> > Let me know if nifi supports this? If yes how do we configure?
>> >
>> > --
>> > Selvam Raman
>> > "லஞ்சம் தவிர்த்து நெஞ்சம் நிமிர்த்து"
>>
>
>


-- 
Selvam Raman
"லஞ்சம் தவிர்த்து நெஞ்சம் நிமிர்த்து"


Re: UI: feedback on the processor 'color' in NiFi 1.0

2016-09-20 Thread Andrew Grande
I agree. Labels are great for grouping, beyond PGs. Processor colors
individually add value. E.g. flow terminator colored in red was a very
common pattern I used. Besides, labels are not grouped with components, so
moving things and re-arranging is a pain.

Andrew

On Tue, Sep 20, 2016, 11:21 AM Joe Skora  wrote:

> Rob,
>
> The labelling functionality you described sounds very useful in general.
> But, I miss the processor color too.
>
> I think labels are really useful for identifying groups of components and
> areas in the flow, but I worry that needing to use them in volume for
> processor coloring will increase the API and browser canvas load for
> elements that don't actually affect the flow.
>
> On Tue, Sep 20, 2016 at 10:40 AM, Rob Moran  wrote:
>
>> What if we promote the use of Labels as a way to highlight things? We
>> could add functionality to expand their usefulness as a way to highlight
>> things on the canvas. I believe that is their intended use.
>>
>> Today you can create a label and change its color to highlight single or
>> multiple components. Even better you can do it for any component (not just
>> processors).
>>
>> To expand on functionality, I'm imagining a context menu and palette
>> action to "Label" a selected component or components. This would prompt
>> a user to pick a background and add text which would place a label
>> around everything once it's applied.
>>
>> Rob
>>
>> On Mon, Sep 19, 2016 at 6:42 PM, Jeff  wrote:
>>
>>> I was thinking, in addition to changing the color of the icon on the
>>> processor, that the color of the drop shadow could be changed as well.
>>> That would provide more contrast, but preserve readability, in my opinion.
>>>
>>> On Mon, Sep 19, 2016 at 6:39 PM Andrew Grande 
>>> wrote:
>>>
>>>> Hi All,
>>>>
>>>> Rolling with UI feedback threads. This time I'd like to discuss how
>>>> NiFi 'lost' its ability to change processor boxes color. I.e. as you can
>>>> see from a screenshot attached, it does change color for the processor in
>>>> the flow overview panel, but the processor itself only changes the icon in
>>>> the top-left of the box. I came across a few users who definitely miss the
>>>> old way. I personally think changing the icon color for the processor
>>>> doesn't go far enough, especially when one is dealing with a flow of
>>>> several dozen processors, zooms in and out often. The overview helps, but
>>>> it's not the same.
>>>>
>>>> Proposal - can we restore how color selection for the processor changed
>>>> the actual background of the processor box on the canvas? Let the user go
>>>> wild with colors and deal with readability, but at least it's easy to spot
>>>> 'important' things this way. And with multi-tenant authorization it becomes
>>>> a poor-man's doc between teams, to an extent.
>>>>
>>>> Thanks for any feedback,
>>>> Andrew
>>>>
>>>
>>
>


Re: UI: feedback on the processor 'color' in NiFi 1.0

2016-09-20 Thread Rob Moran
What if we promote the use of Labels as a way to highlight things? We could
add functionality to expand their usefulness as a way to highlight things
on the canvas. I believe that is their intended use.

Today you can create a label and change its color to highlight single or
multiple components. Even better you can do it for any component (not just
processors).

To expand on functionality, I'm imagining a context menu and palette action
to "Label" a selected component or components. This would prompt a user to
pick a background and add text which would place a label
around everything once it's applied.

Rob

On Mon, Sep 19, 2016 at 6:42 PM, Jeff  wrote:

> I was thinking, in addition to changing the color of the icon on the
> processor, that the color of the drop shadow could be changed as well.
> That would provide more contrast, but preserve readability, in my opinion.
>
> On Mon, Sep 19, 2016 at 6:39 PM Andrew Grande  wrote:
>
>> Hi All,
>>
>> Rolling with UI feedback threads. This time I'd like to discuss how NiFi
>> 'lost' its ability to change processor boxes color. I.e. as you can see
>> from a screenshot attached, it does change color for the processor in the
>> flow overview panel, but the processor itself only changes the icon in the
>> top-left of the box. I came across a few users who definitely miss the old
>> way. I personally think changing the icon color for the processor doesn't
>> go far enough, especially when one is dealing with a flow of several dozen
>> processors, zooms in and out often. The overview helps, but it's not the
>> same.
>>
>> Proposal - can we restore how color selection for the processor changed
>> the actual background of the processor box on the canvas? Let the user go
>> wild with colors and deal with readability, but at least it's easy to spot
>> 'important' things this way. And with multi-tenant authorization it becomes
>> a poor-man's doc between teams, to an extent.
>>
>> Thanks for any feedback,
>> Andrew
>>
>


RE: Download item from queue - what permission is required?

2016-09-20 Thread Peter Wicks (pwicks)
No help here, except to share that I’ve also seen this error.  I’ve been 
working around it by downloading the FlowFile instead of viewing it.

From: Andre [mailto:andre-li...@fucs.org]
Sent: Monday, September 19, 2016 11:18 PM
To: users@nifi.apache.org
Subject: Download item from queue - what permission is required?

Hi there,


I am puzzled by one of the 1.0.0 features. I had some flowfiles in the queue
and, as customary, I did a list queue.

Flowfile was in there, attributes in perfect shape. Yet when I try to download 
the data of the flowfile (i.e. click the download button) it reports I don't 
have permissions.

I would assume the permissions required would be "view the data"?


Cheers


Re: UI: flow status and counters feedback

2016-09-20 Thread Rob Moran
Andrew,

Thanks for the feedback on the status bar. Separation between items helps,
but after your comments I realize how it can fail to feel like a single,
cohesive group of items. We could probably tighten things up a bit.

I think another part of this that could help would be to address some of
the discussion around awareness of stats updating. Being able to call more
attention (without being too intrusive) when stats change could help ease
some of the burden of having to routinely scan the status bar to look for
changes.

Also related, I would like to see us get a tooltip that is seen when you
hover anywhere on the status bar. That tooltip would provide more
descriptive text about what each item means. It would help new users learn
as well as provide detail and follow-on action when something is alerted.

Let's see what others think and then I can work on filing a jira to capture
thoughts.

Rob

On Mon, Sep 19, 2016 at 6:22 PM, Andrew Grande  wrote:

> Hi All,
>
> I'd like to provide some feedback on the NiFi 1.0 UI now that I had a
> chance to use it for a while, as well as pass along what I heard directly
> from other end users.
>
> Attached is a screenshot of a status bar right above the main flow canvas.
> The biggest difference from the 0.x UI is how much whitespace it now has
> between elements. To a point where it's not possible to quickly scan the
> state with a glance.
>
> Does anyone have other opinions? Can we adjust things slightly so they are
> easier on the eye and have less horizontal friction?
>
> Thanks!
> Andrew
>
>
>


Re: Nifi Running mode

2016-09-20 Thread Matt Gilman
Selvam,

The specific endpoint is

http://{host}:{port}/nifi-api/flow/cluster/summary

This will return a ClusterSummaryDTO [1] (this is incorrect in the
documentation). I'm having trouble accessing JIRA right now but we'll get
that fixed in the next release.

Matt

[1]
https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-client-dto/src/main/java/org/apache/nifi/web/api/dto/ClusterSummaryDTO.java
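
For anyone who wants to script this check, here is a minimal sketch using only the Python standard library. It assumes an unsecured instance reachable over plain HTTP, and it assumes the JSON response wraps the DTO in a `clusterSummary` object with a boolean `clustered` field; please verify both names against the DTO source linked above before relying on them. The function names are made up for this example:

```python
import json
from urllib.request import urlopen


def fetch_cluster_summary(host, port):
    """GET the cluster summary endpoint and return the decoded JSON body."""
    url = f"http://{host}:{port}/nifi-api/flow/cluster/summary"
    with urlopen(url, timeout=10) as response:
        return json.load(response)


def running_mode(payload):
    """Return 'cluster' or 'standalone' from a decoded summary payload.

    The 'clusterSummary' and 'clustered' keys are assumptions about the
    response shape, not something confirmed in this thread.
    """
    summary = payload.get("clusterSummary", payload)
    return "cluster" if summary.get("clustered") else "standalone"
```

Usage would look like `running_mode(fetch_cluster_summary("localhost", 8080))`, where the host and port are placeholders for your own instance.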

On Tue, Sep 20, 2016 at 6:12 AM, Pierre Villard  wrote:

> Hi,
>
> Have a look at the REST API:
> https://nifi.apache.org/docs/nifi-docs/rest-api/index.html
>
> Pierre
>
> 2016-09-20 11:32 GMT+02:00 Selvam Raman :
>
>> Hi,
>>
>> How can I check NiFi's running mode (cluster or standalone)?
>> Is there a command to check it?
>>
>> Thanks,
>> Selvam Raman
>> "லஞ்சம் தவிர்த்து நெஞ்சம் நிமிர்த்து"
>>
>>
>>
>> --
>> Selvam Raman
>> "லஞ்சம் தவிர்த்து நெஞ்சம் நிமிர்த்து"
>>
>
>


Fwd: Nifi Running mode

2016-09-20 Thread Selvam Raman
Hi,

How can I check NiFi's running mode (cluster or standalone)?
Is there a command to check it?

Thanks,
Selvam Raman
"லஞ்சம் தவிர்த்து நெஞ்சம் நிமிர்த்து"



-- 
Selvam Raman
"லஞ்சம் தவிர்த்து நெஞ்சம் நிமிர்த்து"


Re: Periodic delta pulls from a data source

2016-09-20 Thread Pierre Villard
Hi Selvam,

Supposing that your source is a SQL-like source, you should have a look at
the QueryDatabaseTable [1] processor. It offers a 'Maximum-value Columns'
property that lets you specify the column containing an ID and/or
timestamp. The processor will keep track of the maximum value for each
column that has been returned since the processor started running. This can
be used to retrieve only those rows that have been added/updated since the
last retrieval. Note that some JDBC types such as bit/boolean are not
conducive to maintaining maximum value, so columns of these types should
not be listed in this property, and will result in error(s) during
processing. If no columns are provided, all rows from the table will be
considered, which could have a performance impact.

[1] https://nifi.apache.org/docs/nifi-docs/
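
To illustrate the maximum-value idea outside of NiFi, here is a small self-contained sketch using Python's sqlite3 module. It is not how QueryDatabaseTable is implemented; the `events` table, its columns, and the `pull_delta` helper are all hypothetical. The caller remembers the largest ID seen so far and asks only for rows above it:

```python
import sqlite3


def pull_delta(conn, last_max_id):
    """Fetch rows with id greater than last_max_id; return (rows, new_max_id)."""
    rows = conn.execute(
        "SELECT id, payload FROM events WHERE id > ? ORDER BY id",
        (last_max_id,),
    ).fetchall()
    # Keep the previous maximum when nothing new has arrived.
    new_max = rows[-1][0] if rows else last_max_id
    return rows, new_max


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany("INSERT INTO events VALUES (?, ?)", [(1, "a"), (2, "b")])

first, state = pull_delta(conn, 0)       # initial pull sees both rows
conn.execute("INSERT INTO events VALUES (3, 'c')")
second, state = pull_delta(conn, state)  # later pull sees only the new row
print(len(first), len(second), state)
```

The same pattern works with a timestamp column, provided its values only ever increase for new or updated rows.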

Pierre


2016-09-20 8:50 GMT+02:00 Selvam Raman :

>
> > Hi,
> >
> > We have a requirement to pull data periodically from a data source. For
> this to work we would like nifi to keep track of the last id or time stamp
> that was pulled successfully so that the next pull starts from that point.
> >
> > Let me know if nifi supports this? If yes how do we configure?
> >
> > --
> > Selvam Raman
> > "லஞ்சம் தவிர்த்து நெஞ்சம் நிமிர்த்து"
>