Re: NiFi ExecuteSQL error => can not be represented as java.sql.Timestamp

2017-11-08 Thread Koji Kawamura
Hi Mohit,

Thanks for sharing the update, glad to know that you found a solution!
(The exception message still looks strange to me though..)

Koji
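
For readers skimming the thread: the fix quoted below comes down to appending one query parameter to the JDBC connection URL. The following is a minimal Python sketch of that string handling only. The helper name with_jdbc_param is made up for illustration; the URL and the parameter are the ones from the quoted message. Newer MySQL Connector/J releases document this value as CONVERT_TO_NULL, so it is worth checking the documentation for the driver version in use.

```python
def with_jdbc_param(jdbc_url, key, value):
    """Append key=value to a JDBC URL, using '?' or '&' as appropriate.

    Hypothetical helper for illustration; it does no escaping or validation.
    """
    separator = "&" if "?" in jdbc_url else "?"
    return "{}{}{}={}".format(jdbc_url, separator, key, value)


# URL and parameter taken from the thread
url = with_jdbc_param(
    "jdbc:mysql://localhost:3306/nifi_test",
    "zeroDateTimeBehavior",
    "convertToNull",
)
print(url)
```

The resulting string would go into the Database Connection URL of the connection pool that ExecuteSQL uses.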

On Wed, Nov 8, 2017 at 4:29 PM,   wrote:
> Hi Koji,
>
> I was able to fix this issue by using the following JDBC connection URL:
>
> jdbc:mysql://localhost:3306/nifi_test?zeroDateTimeBehavior=convertToNull
>
> It was basically caused by zero-date values (0000-00-00 00:00:00) in a
> Timestamp column, which I was able to handle by converting them to null.
>
> Thanks,
> Mohit
>
> -Original Message-
> From: Koji Kawamura [mailto:ijokaruma...@gmail.com]
> Sent: 08 November 2017 05:00
> To: users@nifi.apache.org
> Subject: Re: NiFi ExecuteSQL error => can not be represented as 
> java.sql.Timestamp
>
> Hi Mohit,
>
> The exception looks as if the entire string ' 821725069 2161514622096
> ...  0-00 0   3' was converted to
> java.sql.Timestamp.
> Would you share your CREATE TABLE DDL statement, a few sample records, and
> the NiFi, MySQL and JDBC driver versions you're using?
>
> Thanks,
> Koji
>
> On Wed, Nov 8, 2017 at 1:14 AM,   wrote:
>> Hi all,
>>
>>
>>
>> I’m facing an issue while fetching records from a MySQL table with
>> Timestamp columns. The table has 4 timestamp columns. It works fine
>> when I change the data type to string.
>>
>>
>>
>> It throws the following exception :
>>
>>
>>
>> org.apache.nifi.processor.exception.ProcessException: java.sql.SQLException:
>> Value '   821725069 2161514622096
>>
>> 
>> 001[1]20 18248217
>> 233243264523
>> 2332442942490
>> 00233244294249  004217 2017-05-01 00:42:17[1]65 N 62001-404-282360 111
>> -00-00 00:00:00
>> 010792116551710 62001-404-28236
>>
>> -10.751025
>>
>> -10.751025 0.270833 6.729165 6.458332[1]96 0.2708331 00
>>
>> 
>> 404
>> 0023324429420[1]650   [1]10220
>> 24326452301CS1plus_V.1.0@ericsson.comcore_context_...@ericsson.com
>> 8-1007175176-0846083c 20170501
>> 233243264523 2017-05-01
>> 19:37:061CCNCDR44-ASCCN8_03-Blk65536Blk-2360-20170501-3841
>> 
>> 614[1]410
>> 010792116551710
>> 
>> 0-00 0   3 ' can not be represented as
>> java.sql.Timestamp
>>
>>     at org.apache.nifi.processors.standard.ExecuteSQL$2.process(ExecuteSQL.java:220)
>>     at org.apache.nifi.controller.repository.StandardProcessSession.write(StandardProcessSession.java:2570)
>>     at org.apache.nifi.processors.standard.ExecuteSQL.onTrigger(ExecuteSQL.java:206)
>>     at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
>>     at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1119)
>>     at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:147)
>>     at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:47)
>>     at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:128)
>>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>>     at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>>     at java.lang.Thread.run(Thread.java:748)
>>
>> Caused by: java.sql.SQLException: Value '   821725069
>> 2161514622096
>>
>> 
>> 001[1]20 18248217
>> 233243264523
>> 2332442942490
>> 00233244294249  004217 2017-05-01 00:42:17[1]65 N 62001-404-282360 111
>> -00-00 00:00:00
>> 010792116551710 62001-404-28236
>>
>> -10.751025
>>
>> -10.751025 0.270833 6.729165 6.458332[1]96 0.2708331 00
>>
>> 
>> 404
>> 0023324429420[1]650   [1]10220
>> 24326452301CS1plus_V.1.0@ericsson.comcore_context_...@ericsson.com
>> 8-1007175176-0846083c 20170501
>> 233243264523 2017-05-01
>> 19:37:061CCNCDR44-ASCCN8_03-Blk65536Blk-2360-20170501-3841
>> 
>> 614[1]410
>> 010792116551710
>> 
>> 0-00 0   3 ' can not be represented as
>> java.sql.Timestamp
>>
>>at
>> 

InvokeHttp does not support Basic Authentication username/password as expressions

2017-11-08 Thread Putta Challa
Hi,

We have a requirement to invoke 800 remote URLs, each with different login
info. Since the InvokeHttp Basic Authentication Username/Password properties
do not support Expression Language, I tried adding an “Authorization” property
with the value “Basic ”, but it does not work.

DEBUG shows the following output:

InvokeHTTP[id=97d93e57-015f-1000--01f5309a]
Response from remote service:

http://cal.rets.mlxinnovia.com/cal/Search?Class=VacantLand=1=COMPACT-DECODED=1000=%28MlsStatus%3D%7CA%2C%26%2336%3B%2CN%2CB%2CG%29=DMQL2=Property
cache-control: [private, no-cache="set-cookie"]
connection: keep-alive
content-length: 951
content-type: text/html;charset=utf-8
date: Wed, 08 Nov 2017 20:22:29 GMT
rets-server: RETSInnoVia/3.8.0
rets-session-id: 428B108C80634DC564EBD75F7AA522B5.rets01c
set-cookie: [JSESSIONID=428B108C80634DC564EBD75F7AA522B5.rets01c; Path=/cal/; HttpOnly, AWSELB=71714B431CF68CCA8B83582868EC56EA3C25F91600FB06AF15CA40C5856EC055C8338C41D81B242E93F3FE3E4CB6ECDF939B78BF58B75715045ED8B1F537D49B5980932BEC;PATH=/;MAX-AGE=60]
www-authenticate: Digest realm="cal.rets.mlxinnovia.com",nonce="4eccca657a96f34684c78670e4937ba3",opaque="809432a284b99548b2f26791b90e2504",qop="auth"

Note: static values for Basic Authentication username/password work.

Thanks,
Putta Challa
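
For reference, a Basic "Authorization" header value is the word "Basic", a space, and the base64 encoding of "username:password". Below is a minimal Python sketch of that encoding; the function name and the credentials are placeholders, not values from this thread. Note also that the debug output above contains "www-authenticate: Digest", which may mean this particular server is asking for Digest rather than Basic authentication, in which case a hand-built Basic header would likely not be accepted regardless of how it is set.

```python
import base64


def basic_auth_header(username, password):
    """Build the value of an HTTP Basic 'Authorization' header.

    Illustrative helper only; the credentials passed in are placeholders.
    """
    raw = "{}:{}".format(username, password).encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return "Basic " + token


# Placeholder credentials for demonstration
print(basic_auth_header("some_user", "some_password"))
```

A value built this way could be precomputed per URL and carried as a flow file attribute, assuming the target service accepts Basic authentication at all.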







Re: Output from PostHTTP

2017-11-08 Thread Matt Burgess
That's an interesting service, must expect smallish files :) You can
get the entire content into an attribute using ExtractText, match the
whole thing and put it in an attribute called "file", then set the
Attributes To Send property in InvokeHttp to include "file" and
anything else (except Content-Type, that's its own property).  If you
need to get the content of /home/nifi/test/dummyfilename.txt, you can
use FetchFile before the ExtractText, just give it a flow file
containing two attributes: "absolute.path" set to "/home/nifi/test/",
and "filename" set to "dummyfilename.txt".

Regards,
Matt

On Wed, Nov 8, 2017 at 11:12 AM, James McMahon  wrote:
> The service I am calling expects the content to be in an HTTP POST attribute
> called "file". Using the guidance from you above Matt (thanks very much for
> that), I have been able to post in attribute "file" some random text,
> "@/home/nifi/test/dummyfilename.txt". I see that as output from a ListenHTTP
> processor, so I have confirmed that I do get that. The last piece of the
> puzzle is to set that attribute to be the flowfile content. How do I set
> that attribute to be my flowfile content?
>
> The challenge I seem to be having is that the service is not a NiFi flow.
> How do I feed it the content body?
>
> On Wed, Nov 8, 2017 at 9:41 AM, Matt Burgess  wrote:
>>
>> Jim,
>>
>> The content of the flow file is the body of the outgoing POST, so you
>> could query provenance for the PostHttp processor, find the associated
>> flow file(s), and (if the content is still available in the content
>> repository) retrieve the content. Also the resolved URL for the POST
>> (after evaluating Expression Language, e.g.) is available in the
>> provenance event. This can all be done using the REST API. If you
>> don't need to review the contents "online", you can place a PutFile or
>> LogAttribute before the PostHttp, and effectively "download" the flow
>> file content as it will be presented to the PostHttp processor.
>>
>> Regards,
>> Matt
>>
>>
>> On Wed, Nov 8, 2017 at 9:08 AM, James McMahon 
>> wrote:
>> > But isn't the Http response what comes back to us? I like your thinking,
>> > but
>> > it is the outgoing post i need to review. The response won't help me
>> > with
>> > that. Am I missing the point?
>> >
>> > On Wed, Nov 8, 2017 at 8:52 AM, Mike Thomsen 
>> > wrote:
>> >>
>> >> Don't know, but you might want to try out InvokeHttp. I know it lets
>> >> you
>> >> tap into the output if you tell it to always output the HTTP response.
>> >>
>> >> On Wed, Nov 8, 2017 at 8:28 AM, James McMahon 
>> >> wrote:
>> >>>
>> >>> How can we tap into the workflow to see the output of the PostHTTP
>> >>> processor? What are options folks have used to do that?
>> >>>
>> >>> Thanks in advance. -Jim
>> >>
>> >>
>> >
>
>


Re: Output from PostHTTP

2017-11-08 Thread James McMahon
The service I am calling expects the content to be in an HTTP POST
attribute called "file". Using the guidance from you above Matt (thanks
very much for that), I have been able to post in attribute "file" some
random text, "@/home/nifi/test/dummyfilename.txt". I see that as output
from a ListenHTTP processor, so I have confirmed that I do get that. The
last piece of the puzzle is to set that attribute to be the flowfile
content. How do I set that attribute to be my flowfile content?

The challenge I seem to be having is that the service is not a NiFi flow.
How do I feed it the content body?

On Wed, Nov 8, 2017 at 9:41 AM, Matt Burgess  wrote:

> Jim,
>
> The content of the flow file is the body of the outgoing POST, so you
> could query provenance for the PostHttp processor, find the associated
> flow file(s), and (if the content is still available in the content
> repository) retrieve the content. Also the resolved URL for the POST
> (after evaluating Expression Language, e.g.) is available in the
> provenance event. This can all be done using the REST API. If you
> don't need to review the contents "online", you can place a PutFile or
> LogAttribute before the PostHttp, and effectively "download" the flow
> file content as it will be presented to the PostHttp processor.
>
> Regards,
> Matt
>
>
> On Wed, Nov 8, 2017 at 9:08 AM, James McMahon 
> wrote:
> > But isn't the Http response what comes back to us? I like your thinking,
> but
> > it is the outgoing post i need to review. The response won't help me with
> > that. Am I missing the point?
> >
> > On Wed, Nov 8, 2017 at 8:52 AM, Mike Thomsen 
> wrote:
> >>
> >> Don't know, but you might want to try out InvokeHttp. I know it lets you
> >> tap into the output if you tell it to always output the HTTP response.
> >>
> >> On Wed, Nov 8, 2017 at 8:28 AM, James McMahon 
> >> wrote:
> >>>
> >>> How can we tap into the workflow to see the output of the PostHTTP
> >>> processor? What are options folks have used to do that?
> >>>
> >>> Thanks in advance. -Jim
> >>
> >>
> >
>


Re: Incorrect PublishKafka_0_10 documentation?

2017-11-08 Thread Bryan Bende
James,

Sorry it was confusing to get this working.

What you described is correct: the "Kerberos Service Name" should be
the serviceName you would put in the JAAS file, which is typically
"kafka", and then the "Kerberos Principal" and "Kerberos Keytab" would
be the principal and keytab from the JAAS file.

I believe "Kerberos Principal" and "Kerberos Keytab" are optional
because you can alternatively set a JAAS file through the system
property, but if you provide these properties then NiFi creates one
dynamically for you.
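
To make the mapping between the three properties and a JAAS entry concrete, here is a rough Python sketch that renders a Kafka client JAAS section from a principal, a keytab path, and a service name. The function name, keytab path, and principal are placeholders, and the layout follows the usual Krb5LoginModule JAAS format rather than anything NiFi-specific; what NiFi generates internally may differ.

```python
def render_kafka_jaas(principal, keytab, service_name="kafka"):
    """Render a KafkaClient JAAS section as text.

    Illustrative only: the arguments are placeholders, and this is not
    the code NiFi uses to build its dynamic JAAS configuration.
    """
    lines = [
        "KafkaClient {",
        "  com.sun.security.auth.module.Krb5LoginModule required",
        "  useKeyTab=true",
        '  keyTab="%s"' % keytab,
        '  principal="%s"' % principal,
        # Short service name (e.g. "kafka"), not a full principal
        '  serviceName="%s";' % service_name,
        "};",
    ]
    return "\n".join(lines) + "\n"


# Placeholder values for demonstration
print(render_kafka_jaas("nifi@EXAMPLE.COM", "/path/to/nifi.keytab"))
```

The point of the sketch is that serviceName is the short name the brokers run under, while the principal and keytab identify the client.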

Feel free to create a JIRA or submit a PR to improve the documentation
of these properties.

Thanks,

Bryan


On Tue, Nov 7, 2017 at 3:13 PM, James Srinivasan
 wrote:
> I've been struggling to get NiFi working with Kerberos authenticated
> Kafka. According to the docs, the "Kerberos Service Name" property
> specifies:
>
> "The Kerberos principal name that Kafka runs as. This can be defined
> either in Kafka's JAAS config or in Kafka's config. Corresponds to
> Kafka's 'security.protocol' property.It is ignored unless one of the
> SASL options of the  are selected."
>
> First off, it doesn't correspond to Kafka's security.protocol property
> - it corresponds to the JAAS serviceName property. Second, I'm not
> sure it is a Kerberos principal name - in my (HDP) install, it is set
> to "kafka", and using the full Kerberos principal name
> ("kafka@MYDOMAIN.LOCAL") doesn't work. I would submit a PR, but I'm
> not 100% sure about the second bit.
>
> Long story short, for my install setting this to "kafka" worked, plus
> setting "Kerberos Principal" and "Kerberos Keytab" to suitable things,
> and "Security Protocol" to "SASL_PLAINTEXT". In our environment, we
> enforce explicit topic creation so having done that and granted
> producer and consumer access to the correct users, everything works
> nicely.
>
> James


Re: Replace Text

2017-11-08 Thread Austin Duncan
I am using ExecuteSQL, thanks, that's perfect.

On Wed, Nov 8, 2017 at 9:32 AM, Matt Burgess  wrote:

> Austin,
>
> If your data is not coming from something like ExecuteSQL (which Bryan
> mentioned) but you are defining a schema for it, there are a couple of
> options. First, what format is your data in? If CSV, you can configure
> a CSVReader to use your schema and ignore the header, effectively
> renaming the fields. That reader can be used in any downstream
> record-aware processor (don't bother using ConvertRecord to read and
> write to the same format, it is unnecessary).  If your data is in
> JSON, you can use the JoltTransformJSON processor to rename the
> fields. If your data is in XML, you can use TransformXML to rename the
> fields. If it is in some other format, please describe and we can get
> it figured out.
>
> Regards,
> Matt
>
>
> On Wed, Nov 8, 2017 at 9:21 AM, Bryan Bende  wrote:
> > Austin,
> >
> > Are you referring to Avro schemas created by ExecuteSQL?
> >
> > If so, there was a property added called "Normalize Table/Column
> > Names" which will convert non-compatible characters for you.
> >
> > -Bryan
> >
> >
> >
> > On Wed, Nov 8, 2017 at 9:10 AM, Austin Duncan 
> wrote:
> >> So avro schemas dont allow spaces. Is there a way for me to replace
> >> underscores with spaces efficiently? Right now I am using a bunch of
> replace
> >> text processors to replace the strings that contain underscores with
> strings
> >> with spaces. Is there a better way of doing this? It works now because
> my
> >> tables are not very big but i imagine there will be a situation in
> which I
> >> will have a table with a lot of columns that I would want to replace.
> >>
> >> --
> >> Austin Duncan
> >> Researcher
> >>
> >> PYA Analytics
> >> 2220 Sutherland Avenue
> >> Knoxville, TN 37919
> >> 423-260-4172
>



-- 
​Austin Duncan
*​Researcher​*

PYA Analytics
2220 Sutherland Avenue

Knoxville, TN 37919

423-260-4172



Re: Output from PostHTTP

2017-11-08 Thread Matt Burgess
Jim,

The content of the flow file is the body of the outgoing POST, so you
could query provenance for the PostHttp processor, find the associated
flow file(s), and (if the content is still available in the content
repository) retrieve the content. Also the resolved URL for the POST
(after evaluating Expression Language, e.g.) is available in the
provenance event. This can all be done using the REST API. If you
don't need to review the contents "online", you can place a PutFile or
LogAttribute before the PostHttp, and effectively "download" the flow
file content as it will be presented to the PostHttp processor.

Regards,
Matt


On Wed, Nov 8, 2017 at 9:08 AM, James McMahon  wrote:
> But isn't the Http response what comes back to us? I like your thinking, but
> it is the outgoing post i need to review. The response won't help me with
> that. Am I missing the point?
>
> On Wed, Nov 8, 2017 at 8:52 AM, Mike Thomsen  wrote:
>>
>> Don't know, but you might want to try out InvokeHttp. I know it lets you
>> tap into the output if you tell it to always output the HTTP response.
>>
>> On Wed, Nov 8, 2017 at 8:28 AM, James McMahon 
>> wrote:
>>>
>>> How can we tap into the workflow to see the output of the PostHTTP
>>> processor? What are options folks have used to do that?
>>>
>>> Thanks in advance. -Jim
>>
>>
>


Replace Text

2017-11-08 Thread Austin Duncan
So Avro schemas don't allow spaces. Is there a way for me to replace
underscores with spaces efficiently? Right now I am using a bunch of
ReplaceText processors to replace the strings that contain underscores
with strings with spaces. Is there a better way of doing this? It works now
because my tables are not very big, but I imagine there will be a situation
in which I will have a table with a lot of columns that I would want to
replace.

-- 
​Austin Duncan
*​Researcher​*

PYA Analytics
2220 Sutherland Avenue

Knoxville, TN 37919

423-260-4172



Re: Output from PostHTTP

2017-11-08 Thread James McMahon
But isn't the HTTP *response* what comes back to us? I like your thinking,
but it is the outgoing POST I need to review. The response won't help me
with that. Am I missing the point?

On Wed, Nov 8, 2017 at 8:52 AM, Mike Thomsen  wrote:

> Don't know, but you might want to try out InvokeHttp. I know it lets you
> tap into the output if you tell it to always output the HTTP response.
>
> On Wed, Nov 8, 2017 at 8:28 AM, James McMahon 
> wrote:
>
>> How can we tap into the workflow to see the output of the PostHTTP
>> processor? What are options folks have used to do that?
>>
>> Thanks in advance. -Jim
>>
>
>


Re: Found multiple policies exception

2017-11-08 Thread Kevin Doran
Hi Kumar,

 

Access Policies in NiFi 1.3.0 are defined with (resource, action) pairs, where 
"resource" is basically the path part of the resource URI (e.g., /controller, 
/policies), and "action" is either 'read' or 'write'.

 

For each policy defined (resource, action) must be unique. So if you want to 
grant a user or userGroup read or write access to a resource, rather than 
create a new policy, first check if that policy already exists, and if so, add 
the users/groups to that policy. If the policy for the (resource, action) pair 
you want to set does not exist, then create it.

 

Referencing the NiFi REST API documentation [1]:

To view all existing policies:

    GET /policies

To update an existing policy:

    PUT /policies/{policyId}
    # where policyId is returned by the server in the GET response

To create a new policy for a (resource, action) pair that does not already
exist:

    POST /policies
    # the created policy, including the server-set id, will be returned in
    # the response upon success

Note that in order to add tenants (i.e., users and userGroups) to a policy,
you must discover their ids as well. You can use:

    GET /tenants/users

    GET /tenants/user-groups

    GET /tenants/search-results
    # search by tenant identity, i.e., user name or group name

 

These tenants endpoints may be helpful in resolving the other error you noticed 
in the logs, which is a "user not found" exception. Make sure the user you are 
referencing is in the result set of GET /tenants/users and use the same entity 
id when you are adding a user to a policy.
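
The "check first, then update or create" logic described above can be sketched in Python as follows. This is only an illustration under stated assumptions: InMemoryPolicyStore is a made-up, in-memory stand-in for the policy endpoints, the dictionary fields are invented, and nothing here is the real NiFi client or its request format.

```python
import itertools


class InMemoryPolicyStore:
    """Hypothetical stand-in for the policy endpoints; not the NiFi API."""

    def __init__(self):
        self._policies = {}            # policy id -> policy dict
        self._ids = itertools.count(1)

    def list_policies(self):           # plays the role of "view all policies"
        return list(self._policies.values())

    def create_policy(self, resource, action, users):
        policy = {"id": str(next(self._ids)), "resource": resource,
                  "action": action, "users": set(users)}
        self._policies[policy["id"]] = policy
        return policy

    def update_policy(self, policy_id, users):
        self._policies[policy_id]["users"] = set(users)
        return self._policies[policy_id]


def grant(store, resource, action, user_id):
    """Add user_id to the existing (resource, action) policy, or create it."""
    for policy in store.list_policies():
        if policy["resource"] == resource and policy["action"] == action:
            # Policy already exists: extend its user list instead of
            # creating a second policy for the same pair.
            return store.update_policy(policy["id"], policy["users"] | {user_id})
    return store.create_policy(resource, action, {user_id})


store = InMemoryPolicyStore()
grant(store, "/controller", "write", "user-a")
grant(store, "/controller", "write", "user-b")
print(len(store.list_policies()))
```

Granting two users write access to /controller this way leaves a single policy with both users, which is the situation the "Found multiple policies" error indicates was violated.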

 

[1] https://nifi.apache.org/docs/nifi-docs/rest-api/index.html  

(this link is to 1.4.0 rest api docs, but the tenant and policy API endpoints 
are unchanged from 1.3.0 to my knowledge)

 

Hope this helps!

Kevin

 

From: kumar r 
Reply-To: 
Date: Wednesday, November 8, 2017 at 05:30
To: 
Subject: Found multiple policies exception

 

Hi,

I am using NiFi-1.3.0 secured with Kerberos. When i set a policy for a user, i 
am getting 

Found multiple policies for '/controller' with 'write'.

After checking log file, below exception occurs

org.apache.nifi.web.ResourceNotFoundException: Unable to find user with id 
'311656fb-3fef-303d-8b61-24d4a7d8aeb9'.. Returning Not Found response.
java.lang.IllegalStateException: Found multiple policies for '/controller' with 
'write'.. Returning Conflict response.

how to solve this? Is this NiFi issue? 

Thanks,

Kumar



Re: Output from PostHTTP

2017-11-08 Thread Mike Thomsen
Don't know, but you might want to try out InvokeHttp. I know it lets you
tap into the output if you tell it to always output the HTTP response.

On Wed, Nov 8, 2017 at 8:28 AM, James McMahon  wrote:

> How can we tap into the workflow to see the output of the PostHTTP
> processor? What are options folks have used to do that?
>
> Thanks in advance. -Jim
>


Re: Reading flowfile in a stream callback

2017-11-08 Thread James McMahon
Thank you Andy, thank you again Joe. I'll rethink my approach based on your
recommendations.  -Jim

On Fri, Nov 3, 2017 at 1:31 PM, Andy LoPresto  wrote:

> James,
>
> I am not a Python expert, so I’m glad other people could weigh in. As far
> as routing on content type, I agree with Joe’s sentiment that
> IdentifyMimeType and RouteOnAttribute are the correct solutions there. You
> can route on a range of input options (the actual type, detected charset,
> etc.).
>
> I would definitely avoid putting code to handle multiple disparate content
> types (text vs. video, etc.) in the same ExecuteScript processor. This will
> be harder to test, maintain, enhance, etc. You’ll eventually reach a Switch
> Statement of Doom. Instead, approach this as each ES processor is a black
> box like a Unix tool — it does one thing really well — and chain them
> together. This is the philosophy NiFi is built on and you’ll have much more
> success swimming with the current than fighting it.
>
>
> Andy LoPresto
> alopre...@apache.org
> *alopresto.apa...@gmail.com *
> PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69
>
> On Nov 3, 2017, at 6:05 AM, Joe Witt  wrote:
>
> Mime type detection can be difficult business but I trust Apache Tika
> to do a far better job than I ever could.  The result you show for
> JSON appears correct and I'd simply add that string to the list of
> routing attributes that I treat as text.  Or I'd key off the charset
> being provided, as that would tell me enough to know it is text
> or however I wanted to treat it.
>
> Thanks
>
> On Fri, Nov 3, 2017 at 8:24 AM, James McMahon 
> wrote:
>
> I've always found that IdentifyMimeType returns a wide, wide range of
> values
> for mime.type. There is often ambiguity that mime.type is a reliable
> indicator of the nature of the content. To illustrate, I've passed file.txt
> into Nifi that contains a string representation of json. I'd expect this to
> be handled as textual data, but mime.type gets set to
> application/json;charset=UTF-8.
>
> Perhaps I am misusing the attribute mime.type. How have you worked around
> this challenge Joe?
>
> On Fri, Nov 3, 2017 at 7:54 AM, Joe Witt  wrote:
>
>
> "How can discern binary or character content using conditional checks
> to be sure I handle the file properly?"
>
> Use NiFi and the existing processors where able and extend/script only
> where necessary/critical.  For the case you mention use
> IdentifyMimeType and route appropriate data to the appropriate script
> execution.
>
> Joe
>
> On Fri, Nov 3, 2017 at 7:04 AM, James McMahon 
> wrote:
>
> Andy, regarding the code sample you offered above - doesn't this put
> into text both the attributes metadata and the payload of the flowfile?
>
> If that is the case, how does one modify that to read in from the stream
> into variable text only the file payload?
>
> On Fri, Nov 3, 2017 at 5:48 AM, James McMahon 
> wrote:
>
>
> Thank you Andy. I'd like to ask just a few quick follow up questions.
>
> 1- My flow content may be textual characters, and it can also be binary
> -
> jpgs, pngs, and similar. How can discern binary or character content
> using
> conditional checks to be sure I handle the file properly? How would I
> alter
> this
>
> text = IOUtils.toString(inputStream, StandardCharsets.UTF_8)
>
> to read in the data from the stream as binary data in that case?
>
> 2- In the case where my data in the flowfile payload is binary, do I
> have
> another version of this
>
> outputStream.write(bytearray(reversedText.encode('utf-8')))
>
> that omits the encoding, like so:
>
> outputStream.write(bytearray(some_binary))  ?
>
> Thank you very much in advance. -Jim
>
> On Thu, Nov 2, 2017 at 8:26 PM, Andy LoPresto 
> wrote:
>
>
> James,
>
> The Python API should be the same as the Java FlowFile.java interface
> [1]. Matt Burgess’ blog has a good post about using Jython to do
> flowfile
> content manipulation. Something like:
>
> flowFile = session.get()
> if flowFile is not None:
>     flowFile = session.write(flowFile, PyStreamCallback())
>     session.transfer(flowFile, REL_SUCCESS)
>
> With PyStreamCallback declared as a class above that block in the
> script:
>
> import java.io
> from org.apache.commons.io import IOUtils
> from java.nio.charset import StandardCharsets
> from org.apache.nifi.processor.io import StreamCallback
>
> class PyStreamCallback(StreamCallback):
>     def __init__(self):
>         pass
>
>     def process(self, inputStream, outputStream):
>         # Read the whole flowfile content as UTF-8 text
>         text = IOUtils.toString(inputStream, StandardCharsets.UTF_8)
>         reversedText = text[::-1]
>         # Write the transformed content back as UTF-8 bytes
>         outputStream.write(bytearray(reversedText.encode('utf-8')))
>
> In Groovy, you can declare the StreamCallback as an inline closure to
> make this more compact, but I believe in Jython it needs to be a
> separate
> declaration. 
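
As a rough illustration of the routing advice earlier in this thread (treat a value such as application/json;charset=UTF-8 as text, or key off the presence of a charset), here is a small Python sketch. The function name and the list of text-like types are assumptions made for illustration; in a real flow this decision would normally be expressed in RouteOnAttribute on the mime.type attribute rather than in script code.

```python
# Assumed, illustrative list of non-"text/" types to treat as text
TEXT_LIKE_TYPES = {"application/json", "application/xml"}


def is_textual(mime_type):
    """Guess whether a mime.type value describes character data."""
    parts = [p.strip().lower() for p in mime_type.split(";")]
    base, params = parts[0], parts[1:]
    if base.startswith("text/") or base in TEXT_LIKE_TYPES:
        return True
    # A detected charset is a strong hint that the content is text
    return any(p.startswith("charset=") for p in params)


for value in ("application/json;charset=UTF-8", "image/png"):
    print(value, is_textual(value))
```

Content that fails this check would be routed to a binary-safe path that works on bytes and never decodes the stream as UTF-8.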

Output from PostHTTP

2017-11-08 Thread James McMahon
How can we tap into the workflow to see the output of the PostHTTP
processor? What are options folks have used to do that?

Thanks in advance. -Jim


Found multiple policies exception

2017-11-08 Thread kumar r
Hi,

I am using NiFi 1.3.0 secured with Kerberos. When I set a policy for a
user, I get:
Found multiple policies for '/controller' with 'write'.

After checking the log file, I see the exceptions below:

org.apache.nifi.web.ResourceNotFoundException: Unable to find user with id
'311656fb-3fef-303d-8b61-24d4a7d8aeb9'.. Returning Not Found response.
java.lang.IllegalStateException: Found multiple policies for '/controller'
with 'write'.. Returning Conflict response.

How can I solve this? Is this a NiFi issue?

Thanks,
Kumar


Re: RE: [EXT] Re: Polling Processors impact on Latency

2017-11-08 Thread Chirag Dewan
That's a great start, Andy and Peter. Thank you for such precise answers.
I will start tweaking the parameters you mentioned and try to reach an
optimal latency-resource configuration.
Thanks a lot for your help.
Chirag
On Tuesday 7 November 2017, 8:40:58 PM IST, Andy Christianson 
 wrote:  
 
 Chirag,

Peter's note about bored yield duration is right on. Some additional things I'd 
like to point out are:

1) You might get the lowest latency in a configuration where the processor runs 
continuously (bored yield duration 0). This is because with the code executing 
continuously, CPU caches should stay hot. The trade-off is wasted CPU cycles 
when the consumer is waiting for input more often than processing it.

2) We do have an event driven scheduling mode. For NiFi this still appears to 
be experimental:

"Event driven: When this mode is selected, the Processor will be triggered to 
run by an event, and that event occurs when FlowFiles enter Connections feeding 
this Processor. This mode is currently considered experimental and is not 
supported by all Processors. When this mode is selected, the ‘Run schedule’ 
option is not configurable, as the Processor is not triggered to run 
periodically but as the result of an event. Additionally, this is the only mode 
for which the ‘Concurrent tasks’ option can be set to 0. In this case, the 
number of threads is limited only by the size of the Event-Driven Thread Pool 
that the administrator has configured."

https://nifi.apache.org/docs/nifi-docs/html/user-guide.html

This mode is also implemented in MiNiFi - C++ and is done using condition 
variables. This type of event-driven scheduling should put you near the lower 
limit of latency without sacrificing much CPU, assuming a workload where the 
consumer is waiting for input more often than processing it.

I would suggest trying out different configurations and taking measurements, as 
the ideal config will depend a lot on your workload.
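
The trade-off between a bored-yield pause and latency can be illustrated with a toy polling loop in Python. This is a sketch of the general idea only: the function name and parameters are invented, and it is not how NiFi's scheduler is implemented.

```python
import queue
import time


def poll_for_item(work_queue, bored_yield_seconds, timeout_seconds):
    """Poll a queue, sleeping for bored_yield_seconds whenever it is empty.

    Returns (item, polls); item is None if nothing arrived before the timeout.
    """
    deadline = time.monotonic() + timeout_seconds
    polls = 0
    while True:
        polls += 1
        try:
            return work_queue.get_nowait(), polls
        except queue.Empty:
            if time.monotonic() >= deadline:
                return None, polls
            # "Bored": nothing to do, so yield before checking again.
            time.sleep(bored_yield_seconds)


q = queue.Queue()
q.put("flowfile-1")
print(poll_for_item(q, bored_yield_seconds=0.01, timeout_seconds=0.1))
```

In this toy model an item that arrives just after an empty check waits up to one yield interval before it is seen, while a yield of 0 removes that wait at the cost of spinning the CPU, which mirrors the trade-off described in point 1.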

Regards,

Andy I.C.



> Original Message 
>Subject: RE: [EXT] Re: Polling Processors impact on Latency
>Local Time: November 7, 2017 7:29 AM
>UTC Time: November 7, 2017 12:29 PM
>From: pwi...@micron.com
>To: users@nifi.apache.org 
>
>
>If you schedule the processor to run every 0 sec (the default) then in my 
>experience you won’t notice latency from polling at all. But I guess this 
>depends on your expectations, volume, and overall flow processing time.
>
>
>
>Yes, event driven may help, but from what I’ve read it’s more about reducing 
>server resource consumption than latency (could be wrong).
>
>
>
>As for a hard set limit, there is a configuration entry in nifi.properties 
>that seems relevant:
>
>
>
># If a component has no work to do (is "bored"), how long should we wait 
>before checking again for work?
>
>nifi.bored.yield.duration=10 millis
>
>
>
>Thanks,
>
>  Peter
>
>
>
>From: Chirag Dewan [mailto:chirag.dewa...@yahoo.in] 
>Sent: Tuesday, November 07, 2017 8:02 PM
>To: apere...@gmail.com; users@nifi.apache.org
>Subject: [EXT] Re: Polling Processors impact on Latency
>
>
>Thanks Andrew for the quick response.
>
>
>I am more concerned about the processors polling for flow files on the 
>connection between the processors?
>
>Thanks,
>
>Chirag
>
>
>
>>On Tue, 7 Nov 2017 at 5:24 PM, Andrew Grande
>> wrote:
>>Yes, polling increases latency in some cases. But no, NiFi is not just 
>>polling. It has all kinds of sources, and listening vs polling vs subscribing 
>>purely depends on the protocol of that given processor.
>>
>>Hope this helps,
>> Andrew
>>
>>
>>
>>On Tue, Nov 7, 2017, 1:39 AM Chirag Dewan  wrote:
>>>Hi All,
>>>
>>>I am a layman to NiFi. I am exploring NiFi as a data flow engine to be 
>>>integrated with my Flink processing engine. A brief history of our approach 
>>>: 
>>>
>>>We are trying to build a Streaming Data processing engine. We started off 
>>>with Flink as the sole core engine, which is responsible for 
>>>collection(through Flink Sources) as well as processing
>>> the data. 
>>>
>>>Soon we fumbled onto NiFi and the data flow world. 
>>>
>>>So far, my understanding is that the NiFi processors are polling processors 
>>>and not Pub-Sub processors. That makes me wonder, what's the impact of 
>>>polling on latency? I know I can configure my processors to trade off 
>>>latency against throughput, but is there a hard set limit on the latency 
>>>I can achieve using NiFi? 
>>>
>>>As I said, I am a layman as yet. Perhaps my understanding falls short here. Any 
>>>leads would be much appreciated. 
>>>
>>>P.S - Not diving much into Event Driven Processors. They look like something 
>>>which might clear my thoughts. But since they are marked experimental, would 
>>>be more interested in understanding
>>> the timer driven processors.
>>>
>>>Thanks,
>>>
>>>Chirag