RE: [EXTERNAL] - Re: GetTwitter to stream tweets instead pull

2017-05-04 Thread Jamie Wang
Hi Joey,

Thanks for the information. As you indicated, the name plays only a small part.
What convinced me it was pulling was the help documentation for GetTwitter,
which you can see here:
https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.twitter.GetTwitter/
The line "Pulls status changes from Twitter's streaming API" led me to believe
it is a pull rather than streaming. If possible, it would be a good idea to add
a line that explicitly documents that it is actually streaming. Thanks again for
your note, and my apologies for the late response.

Jamie

From: Joey Frazee [mailto:joey.fra...@icloud.com]
Sent: Wednesday, May 03, 2017 12:51 PM
To: users@nifi.apache.org
Subject: [EXTERNAL] - Re: GetTwitter to stream tweets instead pull

Jamie, can you explain a little bit more about what you’re looking for?

The GetTwitter processor is accessing the spritzer/decahose/firehose, what
have you, via Twitter’s Hosebird library. This library is indeed streaming the
Tweets from their sample and filter APIs in the usual way, with a persistent,
chunk-encoded HTTP connection to
https://stream.twitter.com/1.1/statuses/sample.json and
https://stream.twitter.com/1.1/statuses/filter.json.

I’ll admit the name might be a little confusing, since the Get might suggest
it’s hitting one of the REST resources under
https://api.twitter.com/1.1/statuses/ periodically instead of using a
long-term HTTP connection.
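
To illustrate the difference, here is a minimal Python sketch of how a client
consumes a long-lived streaming response: it reads the body line by line as
data arrives and parses each non-empty line as one JSON message. It assumes
newline-delimited JSON with blank keep-alive lines; the function name and the
sample data are made up, and an in-memory buffer stands in for the HTTP
connection. It is not how Hosebird is implemented.

```python
import io
import json
from typing import BinaryIO, Iterator


def iter_stream_messages(stream: BinaryIO) -> Iterator[dict]:
    """Yield one parsed JSON object per non-empty line of a response body."""
    for raw_line in stream:
        line = raw_line.strip()
        if not line:
            # Assumption: blank lines are keep-alives, so they are skipped.
            continue
        yield json.loads(line)


# Hypothetical stand-in for the body of a persistent connection (no network).
fake_body = io.BytesIO(
    b'{"id": 1, "text": "first"}\r\n'
    b'\r\n'
    b'{"id": 2, "text": "second"}\r\n'
)
messages = list(iter_stream_messages(fake_body))
```

A polling client would instead issue a new request on a timer and receive a
complete response each time; here the same connection keeps delivering
messages until it is closed.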

-joey

On May 3, 2017, at 2:00 PM, Jamie Wang wrote:

Hi,

I understand the built-in processor GetTwitter is a pull. Is there a
streaming-based processor available for getting Tweets? If not, any suggestions
on how I would go about building one?

Thanks
Jamie



Re: Issue with Groovy script

2017-05-04 Thread Matt Burgess
Mike,

To follow up on Andy's question, you will likely need more than just
the http-builder JAR; I don't believe it is shaded (i.e., a "fat JAR"). I
have "http-builder-0.7-all.zip" unzipped to a folder, and it has
http-builder-0.7.jar at the root level, plus a "dependencies"
folder. If you have something similar, you will want to add both
the JAR and the dependencies folder to the Module Directory property.

Regards,
Matt

On Thu, May 4, 2017 at 3:04 PM, Andy LoPresto wrote:
> Mike,
>
> When you say you’ve “included the http-builder jar as a dependency” do you
> mean you provided the location of the directory containing that JAR as the
> Module Path in the ExecuteScript processor?
>
> Andy LoPresto
> alopre...@apache.org
> alopresto.apa...@gmail.com
> PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69
>
> On May 4, 2017, at 1:58 PM, Mike Harding wrote:
>
> Hi all,
>
> I'm trying to run a simple Groovy script in the ExecuteScript processor to make
> an HTTP GET request (I understand there are processors for this, but I'm just
> exploring Groovy at the minute).
>
>> import groovyx.net.http.HTTPBuilder
>> flowFile = session.get()
>> def http = new HTTPBuilder('https://google.com')
>> def html = http.get(path : '/search', query : [q:'waffles'])
>> log.warn(html)
>> session.transfer(flowFile, REL_SUCCESS)
>
>
> I've included the http-builder jar as a dependency but I'm getting the error:
>
> 
>
> I'm not new to NiFi but new to using Groovy. I've tried import
> org.apache.http.* but that doesn't help. I'm assuming that the missing class
> library is a default library in Groovy?
>
> Any help much appreciated,
> Mike
>
>


Re: Issue with Groovy script

2017-05-04 Thread Andy LoPresto
Mike,

When you say you’ve “included the http-builder jar as a dependency” do you mean 
you provided the location of the directory containing that JAR as the Module 
Path in the ExecuteScript processor?

Andy LoPresto
alopre...@apache.org
alopresto.apa...@gmail.com
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69

> On May 4, 2017, at 1:58 PM, Mike Harding wrote:
> 
> Hi all,
> 
> I'm trying to run a simple Groovy script in the ExecuteScript processor to make
> an HTTP GET request (I understand there are processors for this, but I'm just
> exploring Groovy at the minute).
>
> import groovyx.net.http.HTTPBuilder
> flowFile = session.get()
> def http = new HTTPBuilder('https://google.com')
> def html = http.get(path : '/search', query : [q:'waffles'])
> log.warn(html)
> session.transfer(flowFile, REL_SUCCESS)
>
> I've included the http-builder jar as a dependency but I'm getting the error:
> 
> 
> 
> I'm not new to NiFi but new to using Groovy. I've tried import 
> org.apache.http.* but that doesn't help. I'm assuming that the missing class 
> library is a default library in Groovy?
> 
> Any help much appreciated,
> Mike





Issue with Groovy script

2017-05-04 Thread Mike Harding
Hi all,

I'm trying to run a simple Groovy script in the ExecuteScript processor to make
an HTTP GET request (I understand there are processors for this, but I'm just
exploring Groovy at the minute).

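import groovyx.net.http.HTTPBuilder
// session.get() returns null when no FlowFile is queued, so stop early
flowFile = session.get()
if (!flowFile) return
def http = new HTTPBuilder('https://google.com')
def html = http.get(path : '/search', query : [q:'waffles'])
// log.warn takes a String, so convert the parsed response explicitly
log.warn(html.toString())
session.transfer(flowFile, REL_SUCCESS)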


I've included the http-builder jar as a dependency but I'm getting the error:

[image: Inline images 1]

I'm not new to NiFi but new to using Groovy. I've tried import
org.apache.http.* but that doesn't help. I'm assuming that the missing
class library is a default library in Groovy?

Any help much appreciated,
Mike


Re: Issues with creating and connecting RPG programmatically

2017-05-04 Thread Matt Gilman
I've responded to the question on StackOverflow [1].

Thanks

Matt

[1]
http://stackoverflow.com/questions/43788780/nifi-issues-with-creating-and-connecting-rpg-programmatically/43789310#43789310

On Thu, May 4, 2017 at 12:49 PM, Pushkara R wrote:

> Hi,
>
> I am trying to write a function that takes two parameters, a processor ID
> in machine A and an input port in machine B, and creates an RPG in machine B
> which connects the two.
>
> Here is how I am doing that:
> 1. A POST to the */nifi-api/process-groups//remote-process-groups*
> endpoint to create an RPG and retrieve the ID of that RPG.
> 2. A POST to the */nifi-api/process-groups//connections* endpoint to create
> a connection between the processor and the input port. The processor ID and
> the ID of the input port are provided along with the list of relationships.
> 3. A final PUT to */nifi-api/remote-process-groups/* to enable
> transmission between the machines.
>
> The function always fails in step 2. The POST request returns a 409 with
> the error 'Unable to find specified destination' (though refreshing the
> canvas on machine 1 shows that the RPG has been created).
> However, when I manually run steps 2 and 3 afterwards, with the same
> RPG ID, the connection is created.
>
> I'm not sure whether this is a synchronization issue, but I want to
> figure it out because I would rather not separate steps 1, 2, and 3.
> Could somebody point out what the issue could be?
>
> Pushkar
>
> PS - the POST message for step 2 is the same whether the API is called
> from within the function or manually.
>


Issues with creating and connecting RPG programmatically

2017-05-04 Thread Pushkara R
Hi,

I am trying to write a function that takes two parameters, a processor ID
in machine A and an input port in machine B, and creates an RPG in machine B
which connects the two.

Here is how I am doing that:
1. A POST to the */nifi-api/process-groups//remote-process-groups*
endpoint to create an RPG and retrieve the ID of that RPG.
2. A POST to the */nifi-api/process-groups//connections* endpoint to create
a connection between the processor and the input port. The processor ID and
the ID of the input port are provided along with the list of relationships.
3. A final PUT to */nifi-api/remote-process-groups/* to enable
transmission between the machines.

The function always fails in step 2. The POST request returns a 409 with
the error 'Unable to find specified destination' (though refreshing the
canvas on machine 1 shows that the RPG has been created).
However, when I manually run steps 2 and 3 afterwards, with the same
RPG ID, the connection is created.

I'm not sure whether this is a synchronization issue, but I want to
figure it out because I would rather not separate steps 1, 2, and 3.
Could somebody point out what the issue could be?
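
If the 409 in step 2 is a timing issue, as suspected above, one generic
workaround is to retry the request with a growing delay. The sketch below is
in Python and assumes nothing about the NiFi API beyond the status code:
`send` is a hypothetical callable that performs the step 2 POST and returns
(status, body); the names, attempt count, and delays are illustrative only.

```python
import time
from typing import Callable, Tuple


def retry_on_conflict(
    send: Callable[[], Tuple[int, str]],
    attempts: int = 5,
    delay_s: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call send() until it returns something other than 409, or give up."""
    status, body = 0, ""
    for attempt in range(attempts):
        status, body = send()
        if status != 409:
            return status, body
        if attempt < attempts - 1:
            # Back off exponentially: delay_s, 2*delay_s, 4*delay_s, ...
            sleep(delay_s * (2 ** attempt))
    # Every attempt returned 409; hand the last response back to the caller.
    return status, body


# Hypothetical responses standing in for the real POST: two conflicts, then success.
responses = iter([
    (409, "Unable to find specified destination"),
    (409, "Unable to find specified destination"),
    (201, "created"),
])
result = retry_on_conflict(lambda: next(responses), sleep=lambda s: None)
```

Retrying only hides the delay; if the destination never becomes available,
the last 409 is returned so the caller can report it.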

Pushkar

PS - the POST message for step 2 is the same whether the API is called from
within the function or manually.


Re: Phantom node

2017-05-04 Thread Neil Derraugh
Our three ZooKeepers are external.  The healthy nodes are all able to ping
each other.  nifi.properties follows (with minor modifications to
hostnames/IPs).

# Core Properties #
nifi.version=1.1.2
nifi.flow.configuration.file=/mnt/mesos/sandbox/conf/flow.xml.gz
nifi.flow.configuration.archive.enabled=true
nifi.flow.configuration.archive.dir=/mnt/mesos/sandbox/conf/archive/
nifi.flow.configuration.archive.max.time=30 days
nifi.flow.configuration.archive.max.storage=500 MB
nifi.flowcontroller.autoResumeState=true
nifi.flowcontroller.graceful.shutdown.period=10 sec
nifi.flowservice.writedelay.interval=500 ms
nifi.administrative.yield.duration=30 sec
# If a component has no work to do (is "bored"), how long should we wait before checking again for work?
nifi.bored.yield.duration=10 millis
nifi.authorizer.configuration.file=/mnt/mesos/sandbox/conf/authorizers.xml
nifi.login.identity.provider.configuration.file=/mnt/mesos/sandbox/conf/login-identity-providers.xml
nifi.templates.directory=/mnt/mesos/sandbox/conf/templates
nifi.ui.banner.text=master - dev3
nifi.ui.autorefresh.interval=30 sec
nifi.nar.library.directory=/opt/nifi/lib
nifi.nar.library.directory.custom=/mnt/mesos/sandbox/lib
nifi.nar.working.directory=/mnt/mesos/sandbox/work/nar/
nifi.documentation.working.directory=/mnt/mesos/sandbox/work/docs/components

# State Management #

nifi.state.management.configuration.file=/mnt/mesos/sandbox/conf/state-management.xml
# The ID of the local state provider
nifi.state.management.provider.local=local-provider
# The ID of the cluster-wide state provider. This will be ignored if NiFi is not clustered but must be populated if running in a cluster.
nifi.state.management.provider.cluster=zk-provider
# Specifies whether or not this instance of NiFi should run an embedded ZooKeeper server
nifi.state.management.embedded.zookeeper.start=false
# Properties file that provides the ZooKeeper properties to use if <nifi.state.management.embedded.zookeeper.start> is set to true
nifi.state.management.embedded.zookeeper.properties=./conf/zookeeper.properties
# H2 Settings
nifi.database.directory=/mnt/mesos/sandbox/data/database_repository
nifi.h2.url.append=;LOCK_TIMEOUT=25000;WRITE_DELAY=0;AUTO_SERVER=FALSE
# FlowFile Repository
nifi.flowfile.repository.implementation=org.apache.nifi.controller.repository.WriteAheadFlowFileRepository
nifi.flowfile.repository.directory=/mnt/mesos/sandbox/data/flowfile_repository
nifi.flowfile.repository.partitions=256
nifi.flowfile.repository.checkpoint.interval=2 mins
nifi.flowfile.repository.always.sync=false
nifi.swap.manager.implementation=org.apache.nifi.controller.FileSystemSwapManager
nifi.queue.swap.threshold=2
nifi.swap.in.period=5 sec
nifi.swap.in.threads=1
nifi.swap.out.period=5 sec
nifi.swap.out.threads=4
# Content Repository
nifi.content.repository.implementation=org.apache.nifi.controller.repository.FileSystemRepository
nifi.content.claim.max.appendable.size=10 MB
nifi.content.claim.max.flow.files=100
nifi.content.repository.directory.default=/mnt/mesos/sandbox/data/content_repository
nifi.content.repository.archive.max.retention.period=12 hours
nifi.content.repository.archive.max.usage.percentage=50%
nifi.content.repository.archive.enabled=true
nifi.content.repository.always.sync=false
nifi.content.viewer.url=/nifi-content-viewer/
# Provenance Repository Properties
nifi.provenance.repository.implementation=org.apache.nifi.provenance.PersistentProvenanceRepository
# Persistent Provenance Repository Properties
nifi.provenance.repository.directory.default=/mnt/mesos/sandbox/data/provenance_repository
nifi.provenance.repository.max.storage.time=24 hours
nifi.provenance.repository.max.storage.size=1 GB
nifi.provenance.repository.rollover.time=30 secs
nifi.provenance.repository.rollover.size=100 MB
nifi.provenance.repository.query.threads=2
nifi.provenance.repository.index.threads=1
nifi.provenance.repository.compress.on.rollover=true
nifi.provenance.repository.always.sync=false
nifi.provenance.repository.journal.count=16
# Comma-separated list of fields. Fields that are not indexed will not be searchable. Valid fields are:
# EventType, FlowFileUUID, Filename, TransitURI, ProcessorID, AlternateIdentifierURI, Relationship, Details
nifi.provenance.repository.indexed.fields=EventType, FlowFileUUID, Filename, ProcessorID, Relationship
# FlowFile Attributes that should be indexed and made searchable.  Some examples to consider are filename, uuid, mime.type
nifi.provenance.repository.indexed.attributes=
# Large values for the shard size will result in more Java heap usage when searching the Provenance Repository
# but should provide better performance
nifi.provenance.repository.index.shard.size=500 MB
# Indicates the maximum length that a FlowFile attribute can be when retrieving a Provenance Event from
# the repository. If the length of any attribute exceeds this value, it will be truncated when the event is retrieved.
nifi.provenance.repository.max.attribute.length=65536
# Volatile 

Handling HTTP Cookies in NiFi

2017-05-04 Thread Mike Harding
Hi All,

There is an external web service that I need to authenticate with through
NiFi. The service provides only a POST endpoint to pass my credentials to,
which then responds with a cookie containing an auth token to be
used in subsequent requests.

I can perform a POST using InvokeHTTP, which returns successfully with a
'Set-Cookie' attribute in the flowfile, but I don't understand how I can store
this cookie somewhere in NiFi so it can be used in subsequent requests.

For example after I have authenticated I want to perform a GET request to
access some data.
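
For context on the cookie handoff described above, a minimal sketch in Python
(not NiFi configuration): it turns the value of a Set-Cookie response header
into the value a client would send back in a Cookie request header. The
function name and the sample token are made up for illustration; how the
resulting value gets attached to the later request in NiFi is left open here.

```python
from http.cookies import SimpleCookie


def cookie_header_from_set_cookie(set_cookie_value: str) -> str:
    """Keep only the name=value pairs; drop attributes such as Path or HttpOnly."""
    jar = SimpleCookie()
    jar.load(set_cookie_value)
    return "; ".join(f"{name}={morsel.value}" for name, morsel in jar.items())


# Hypothetical Set-Cookie value as it might come back from the login POST.
cookie_value = cookie_header_from_set_cookie("auth_token=abc123; Path=/; HttpOnly")
```

The subsequent GET would then carry a request header named Cookie with
`cookie_value` as its value.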

Any help much appreciated,

Mike


Publish with persistence using PublishAMQP?

2017-05-04 Thread James McMahon
New to using PublishAMQP and interested in applying best practices. I've
made a mistake in my initial use, in which messages I posted to RabbitMQ
were gone from my queues after I restarted the message broker. Goofy
mistake, but thankfully I am in development prior to production use.

I realize that I am not publishing persistently. I don't see any immediate
configuration option in the PublishAMQP processor to dictate persistent
publication to the exchange and bound queues.

What is the best practice folks employ to publish messages that persist?
Thanks very much in advance for any insights.   - Jim