Sounds perfect. Thanks

On Thu, Aug 17, 2023 at 5:11 AM Chris Sampson <chr...@apache.org> wrote:
> What you describe sounds like the processor is working as designed and
> documented, i.e. it will restart the same query once it has reached the end
> of the paginated scroll (or search_after, or point-in-time) query.
>
> It sounds like you want to use the PaginatedJsonQueryElasticsearch [1]
> processor instead. This will execute the query given to it, either as the
> query property or the body of an incoming FlowFile, output the results,
> and then stop.
>
> [1]
> https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-elasticsearch-restapi-nar/1.23.0/org.apache.nifi.processors.elasticsearch.PaginatedJsonQueryElasticsearch/index.html
>
> On 2023/08/16 07:57:43 Richard Beare wrote:
> > Hi,
> > I am using the SearchElasticsearch (1.20.0) processor to retrieve all
> > documents (~20M) from an index, process them, and eventually return the
> > results to a new index, although for this test I'm retrieving and
> > processing, then discarding. I'm using OpenSearch.
> >
> > My problem is that the process restarts after completion. I discovered
> > this, and the docs confirm it, after seeing warnings from my processing
> > code (which reformats JSON ready for other work) being repeated for the
> > same document ID.
> >
> > How do I configure the processor to stop after completing the first
> > query?
> >
> > I've tried the following:
> >
> > Query: {"query" : {"match_all" :{}}}
> >
> > with pagination_type SCROLL
> >
> > I haven't found a combination of the properties that doesn't lead to
> > repeated cycles through the index.
> >
> > I've also tried {"query" : {"match_all" :{}}, "sort" :
> > [{"Visit_DateTime" : "asc"}]}
> >
> > and SEARCH_AFTER pagination type, with the same problem.
> >
> > What am I missing?
> > Thanks
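
For anyone finding this thread later: the run-once behaviour described in the reply (execute the query, page through the results, then stop) can be sketched outside NiFi roughly as below. This is only a minimal Python illustration of search_after-style pagination that ends after a single pass, not the processor's implementation. The names `search_after_once` and `make_fake_fetch`, the in-memory documents, and the `_id` tiebreaker in the sort values are all assumptions made for the example; a real `fetch_page` would call the search API with the cursor.

```python
from typing import Callable, Iterator, List, Optional


def search_after_once(
    fetch_page: Callable[[Optional[list]], List[dict]]
) -> Iterator[dict]:
    """Yield every hit once, then stop (no restart of the query)."""
    cursor = None  # no cursor on the first request
    while True:
        hits = fetch_page(cursor)
        if not hits:
            return  # an empty page means the result set is exhausted
        yield from hits
        # The sort values of the last hit become the next cursor.
        cursor = hits[-1]["sort"]


def make_fake_fetch(docs: List[dict], page_size: int):
    """Hypothetical stand-in for a search call, backed by a Python list."""
    ordered = sorted(docs, key=lambda d: (d["Visit_DateTime"], d["_id"]))

    def fetch_page(cursor: Optional[list]) -> List[dict]:
        remaining = [
            d for d in ordered
            if cursor is None
            or (d["Visit_DateTime"], d["_id"]) > tuple(cursor)
        ]
        return [
            {"_id": d["_id"], "_source": d,
             "sort": [d["Visit_DateTime"], d["_id"]]}
            for d in remaining[:page_size]
        ]

    return fetch_page


# Hypothetical sample data, sorted ascending on Visit_DateTime as in the query.
docs = [
    {"_id": f"doc{i}", "Visit_DateTime": f"2023-08-{10 + i:02d}T00:00:00"}
    for i in range(5)
]
seen_ids = [hit["_id"] for hit in search_after_once(make_fake_fetch(docs, 2))]
```

The loop ends when a page comes back empty, so each document ID is seen exactly once, which is the behaviour wanted in the original question.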