I've got a PR open for a new bulk ingest processor, so I could easily add batching of the record ingest to that, plus something like your PR. I also think it might be useful to have an enforcement mechanism that prevents a single request from getting far too big. The last documentation I saw said roughly 32MB per payload. What do you think about that?
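To illustrate the enforcement idea, here is a minimal sketch (not the actual NiFi processor API) of splitting serialized records into bulk batches that stay under a configured byte cap. The class name `BulkBatcher`, the method, and the 20-byte limit in the example are all hypothetical; the point is just that oversized requests get prevented up front rather than rejected by Elasticsearch.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: group serialized records into batches whose total
// size never exceeds a configured cap (e.g. ~32MB per payload, as mentioned
// in the thread). A real processor would route oversized records to failure.
public class BulkBatcher {

    public static List<List<byte[]>> batch(List<byte[]> records, long maxBytes) {
        List<List<byte[]>> batches = new ArrayList<>();
        List<byte[]> current = new ArrayList<>();
        long size = 0;
        for (byte[] rec : records) {
            if (rec.length > maxBytes) {
                // A single record larger than the cap can never fit;
                // surface that instead of sending a doomed request.
                throw new IllegalArgumentException("record exceeds max payload size");
            }
            if (size + rec.length > maxBytes && !current.isEmpty()) {
                batches.add(current);   // close out the full batch
                current = new ArrayList<>();
                size = 0;
            }
            current.add(rec);
            size += rec.length;
        }
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }

    public static void main(String[] args) {
        List<byte[]> recs = new ArrayList<>();
        for (int i = 0; i < 5; i++) {
            recs.add(("{\"id\":" + i + "}").getBytes(StandardCharsets.UTF_8));
        }
        // Cap each batch at 20 bytes; each record above is 8 bytes,
        // so batches come out as [2, 2, 1] records.
        System.out.println(batch(recs, 20).size());
    }
}
```

Each bulk request body would then be built from one batch at a time, guaranteeing no request exceeds the documented limit.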
On Wed, Feb 20, 2019 at 11:22 AM Joe Percivall <[email protected]> wrote:

> Hey Mike,
>
> As a data point, we're ingesting into ES v6 using PutElasticsearchHttp and
> PutElasticsearchHttpRecord. We do almost no querying of anything in ES
> using NiFi. Continued improvement around ingesting into ES would be our
> core use-case.
>
> One item that frustrated me was the issue around failures in the record
> processor that I put up a PR here[1]. Another example of a potential
> improvement would be to not load the entire request body (and thus all the
> records/FF content) into memory when inserting into ES using those
> processors. Not 100% sure how you would go about doing that but would be an
> awesome improvement. Of course, any other improvements around performance
> would also be welcome.
>
> [1] https://github.com/apache/nifi/pull/3299
>
> Cheers,
> Joe
>
> On Wed, Feb 20, 2019 at 8:08 AM Mike Thomsen <[email protected]> wrote:
>
>> I'm looking for feedback from ElasticSearch users on how they use and how
>> they **want** to use ElasticSearch v5 and newer with NiFi.
>>
>> So please respond with some use cases and what you want, what frustrates
>> you, etc. so I can prioritize Jira tickets for the ElasticSearch REST API
>> bundle.
>>
>> (Note: basic JSON DSL queries are already supported via
>> JsonQueryElasticSearch. If you didn't know that, please try it out and drop
>> some feedback on what is needed to make it work for your use cases.)
>>
>> Thanks,
>>
>> Mike
>
> --
> *Joe Percivall*
> linkedin.com/in/Percivall
> e: [email protected]
