NiFi can certainly be used for some data replication scenarios and
quite often is. If you can treat the source like a continuous data
source, and there is some way to keep state about what has already been
pulled and what has changed or still needs to be pulled, and it can just
keep running then
You'd only need to do that if you have strict ordering requirements like
reading directly from a transaction log and replicating it. If yes I'd
skip NiFi unless you're also doing other cases with it.
Sounds like Matt's path gets you going though so that might work out just
Thanks Andy. Appreciate your guidance.
On Thu, Oct 13, 2016 at 10:39 AM, Andy LoPresto
> Hi Rai,
> There are some excellent documents on the Apache NiFi site to help you
> learn. There is an Administrator Guide, a User Guide, a Developer
> Guide, a
Is there any book for Apache NiFi?
Also, does Hortonworks conduct training for NiFi?
I am getting the following exception in nifi-0.6.1:
kafka.common.MessageSizeTooLargeException: Found a message larger than the
maximum fetch size of this consumer. Increase the fetch size, or decrease
the maximum message size the broker will allow.
What is the max size? How can I increase
I have been trying to use the get and load processors for DynamoDB and I am
almost there. I am able to run the get processor and I see data is flowing
But I see the following error in my nifi-app.log file:
2016-10-13 18:02:38,823 ERROR [Timer-Driven Process Thread-9]
Kafka consumer properties can be found here:
GetKafka uses the old consumer so the consumer property is:
The default for that property is ~1M.
If possible, you should limit the replica.fetch.max.bytes on
The GetDynamoDB processor requires a hash key value to look up an item in
the table. The default setting is an Expression Language statement that
reads the hash key value from a flowfile attribute,
dynamodb.item.hash.key.value. But this is not required. You can change it
to any attribute
Thanks James. I am looking to iterate through the table so that it takes
hash key values one by one. Do I achieve it through the Expression
Language? If I write a script to do that, how do I pass it to my processor?
On Thu, Oct 13, 2016 at 1:42 PM, James Wing
Thanks for submitting the PR Stephane! I see that Andy has already stated
that he's reviewing. Thanks Andy!
On Thu, Oct 13, 2016 at 7:42 PM, Stéphane Maarek
> Investigated some more, opened a JIRA issue, closed it via
This is a request that has grown popular recently. NiFi was not initially
designed with environment promotion in mind, so it is something we are
currently investigating and trying to address.
The development/QA/production environment promotion process  (sometimes
Investigated some more, opened a JIRA issue, closed it via
On Fri, Oct 14, 2016 at 9:47 AM Stéphane Maarek
> Thanks, it helps! Good to know there is already a Java client I could use.
> Nonetheless I think it would
Stéphane asked a question on the PR but as it was already closed, I wanted to
reproduce it here for visibility and to see if other community members had
something to add:
good stuff. Quick question, what do you think of NiFi automating the build and
release of API clients in various
Great to hear, Marcio!
On Thu, Oct 13, 2016 at 9:26 PM Márcio Faria wrote:
> Many thanks. I'm now more confident NiFi could be a good fit for us.
> On Wednesday, October 12, 2016 9:06 PM, Jeff wrote:
> Hello Marcio,
Many thanks. I'm now more confident NiFi could be a good fit for us.
On Wednesday, October 12, 2016 9:06 PM, Jeff wrote:
You're asking on the right list!
Based on the scenario you described, I think NiFi would suit your needs. To
I agree with your assumption. It would be great to test that out and
provide some numbers but intuitively I agree.
I could envision certain scatter/gather data flows that could challenge
that sequential access assumption but honestly with how awesome disk
caching is in Linux these days in
Yes, you are correct that Apache NiFi uses Swagger. However, we are only
using it for keeping the documentation in sync. We use a Maven plugin that
inspects the Swagger annotations and generates a swagger.json. The
swagger.json is generated to nifi-web-api/target/swagger-ui/swagger.json
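
For anyone who wants to see what such a generated file exposes before building a client from it, here is a minimal sketch in Python. It assumes the file follows the standard Swagger 2.0 layout (a top-level "paths" object keyed by path, then by HTTP method). The helper names and the sample document are hypothetical, made up for illustration, and are not taken from NiFi.

```python
import json
from pathlib import Path

# HTTP verbs that Swagger 2.0 allows as operation keys inside a path item.
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options"}


def list_operations(spec):
    """Return sorted (METHOD, path, operationId) tuples from a Swagger 2.0 dict."""
    operations = []
    for path, item in spec.get("paths", {}).items():
        for method, details in item.items():
            if method.lower() not in HTTP_METHODS:
                continue  # skip non-operation keys such as "parameters"
            operations.append((method.upper(), path, details.get("operationId", "")))
    return sorted(operations)


def load_spec(location):
    """Read a swagger.json file from disk (location is supplied by the caller)."""
    return json.loads(Path(location).read_text(encoding="utf-8"))


# Hypothetical sample document, NOT the real NiFi API.
SAMPLE_SPEC = {
    "swagger": "2.0",
    "paths": {
        "/example/{id}": {
            "parameters": [],
            "get": {"operationId": "getExample"},
            "delete": {"operationId": "removeExample"},
        }
    },
}

if __name__ == "__main__":
    for method, path, operation_id in list_operations(SAMPLE_SPEC):
        print(method, path, operation_id)
```

To try it against a real build, pass the path of the generated file to load_spec and feed the result to list_operations.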
Thank you very much. That was a really great explanation.
I investigated the NiFi architecture, and it seems that most of the
read/write operations for flow file repo and provenance repo are random.
However, for content repo most of the read/write operations are sequential.
Thank you very much.
I would be more than happy to provide some benchmark results after the
On Thu, Oct 13, 2016 at 11:32 PM, Joe Witt wrote:
> I agree with your assumption. It would be great to test that out and
> provide some