What you describe sounds like the processor working as designed and documented: it restarts the same query once it has reached the end of the paginated (scroll, search_after, or point-in-time) query.
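For illustration only, here is a minimal Python sketch of what a "run once, then stop" search_after loop looks like, as opposed to one that starts the query again after the last page. It is not NiFi's implementation. `paginate_once` and `make_fake_fetch` are hypothetical names, and the in-memory fetch function stands in for a real `_search` request so the sketch runs without a cluster. It also assumes the sort values are unique; with ties, a real query would need a tiebreaker field in the sort.

```python
from typing import Callable, Dict, Iterator, List

Hit = Dict[str, object]


def paginate_once(fetch_page: Callable[[dict], List[Hit]],
                  query: dict, page_size: int = 2) -> Iterator[Hit]:
    """Yield every hit exactly once, then stop (no restart)."""
    body = dict(query, size=page_size)
    while True:
        hits = fetch_page(body)
        if not hits:
            # An empty page means the result set is exhausted: stop here
            # instead of issuing the original query again.
            return
        yield from hits
        # search_after takes the sort values of the last hit on the page.
        body = dict(body, search_after=hits[-1]["sort"])


def make_fake_fetch(docs: List[dict]) -> Callable[[dict], List[Hit]]:
    """Hypothetical in-memory stand-in for a search request (assumption)."""
    ordered = sorted(docs, key=lambda d: d["Visit_DateTime"])

    def fetch_page(body: dict) -> List[Hit]:
        after = body.get("search_after")
        remaining = [d for d in ordered
                     if after is None or d["Visit_DateTime"] > after[0]]
        page = remaining[:body["size"]]
        return [{"_id": d["id"], "_source": d, "sort": [d["Visit_DateTime"]]}
                for d in page]

    return fetch_page


# Five made-up documents, sorted on the field used in the original question.
docs = [{"id": str(i), "Visit_DateTime": f"2023-08-{i:02d}"} for i in range(1, 6)]
query = {"query": {"match_all": {}}, "sort": [{"Visit_DateTime": "asc"}]}
ids = [hit["_id"] for hit in paginate_once(make_fake_fetch(docs), query)]
print(ids)
```

Each document ID appears once and the loop ends on the first empty page. That is the behaviour you are after; a processor that restarts would begin again from the first page at that point.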
It sounds like you want the PaginatedJsonQueryElasticsearch [1] processor instead. It executes the query given to it, either as the query property or as the body of an incoming FlowFile, outputs the results, and then stops.

[1] https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-elasticsearch-restapi-nar/1.23.0/org.apache.nifi.processors.elasticsearch.PaginatedJsonQueryElasticsearch/index.html

On 2023/08/16 07:57:43 Richard Beare wrote:
> Hi,
> I am using the SearchElasticSearch (1.20.0) processor to retrieve all
> documents (~20M) from an index, process and eventually return results to a
> new index, although for this test I'm retrieving and processing then
> discarding. I'm using opensearch.
>
> My problem is that the process restarts after completion - I discovered
> this, and docs confirm, after seeing warnings from my processing code
> (which reformats json ready for other work) being repeated for the same
> document ID.
>
> How do I configure the processor to stop after completing the first
> query?
>
> I've tried the following:
>
> Query: {"query" : {"match_all" :{}}}
>
> with pagination_type SCROLL
>
> I haven't found a combination of the properties that doesn't lead to
> repeated cycles through the index.
>
> I've also tried
> {"query" : {"match_all" :{}}, "sort" : [{"Visit_DateTime" : "asc"}]}
>
> and SEARCH_AFTER pagination type, with the same problem.
>
> What am I missing?
> Thanks
>