Sounds perfect. Thanks

On Thu, Aug 17, 2023 at 5:11 AM Chris Sampson <chr...@apache.org> wrote:

> What you describe sounds like the processor is working as designed &
> documented, i.e. it will restart the same query once it has reached the end
> of the paginated scroll (or search_after, or point-in-time) query.
>
> It sounds like you want the PaginatedJsonQueryElasticsearch [1] processor
> instead. This will execute the query given to it, either as the query
> property or the body of an incoming FlowFile, output the results, and then
> stop.
>
>
> [1]
> https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-elasticsearch-restapi-nar/1.23.0/org.apache.nifi.processors.elasticsearch.PaginatedJsonQueryElasticsearch/index.html
>
> On 2023/08/16 07:57:43 Richard Beare wrote:
> > Hi,
> > I am using the SearchElasticsearch (1.20.0) processor to retrieve all
> > documents (~20M) from an index, process them, and eventually write the
> > results to a new index, although for this test I'm retrieving and
> > processing, then discarding. I'm using OpenSearch.
> >
> > My problem is that the processor restarts the query after completion. I
> > discovered this (and the docs confirm it) after seeing warnings from my
> > processing code (which reformats JSON ready for other work) repeated for
> > the same document ID.
> >
> > How do I configure the processor to stop after completing the first
> > query?
> >
> > I've tried the following:
> >
> > Query: {"query" : {"match_all" :{}}}
> >
> > with pagination_type SCROLL
> >
> > I haven't found a combination of the properties that doesn't lead to
> > repeated cycles through the index.
> >
> > I've also tried:
> >
> > {"query" : {"match_all" :{}}, "sort" : [{"Visit_DateTime" : "asc"}]}
> >
> > and SEARCH_AFTER pagination type, with the same problem.
> >
> > What am I missing?
> > Thanks
> >
>
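
For anyone finding this thread later: the "run the query once, page through
the results, then stop" behaviour described above can be sketched outside
NiFi as below. This is only an illustration of search_after pagination, not
the processor's implementation. The names paginate_once and
fake_search_factory are hypothetical, the in-memory search function stands in
for a real _search call, and the sort values are assumed to be unique (with
duplicates you would need a tiebreaker field in the sort).

```python
from typing import Callable, Iterator

# Query from the thread, with a sort so that search_after can be used.
QUERY = {"query": {"match_all": {}}, "sort": [{"Visit_DateTime": "asc"}]}


def paginate_once(search: Callable[[dict], dict], query: dict,
                  page_size: int = 1000) -> Iterator[dict]:
    """Yield every hit exactly once using search_after, then stop."""
    body = dict(query, size=page_size)
    while True:
        hits = search(body)["hits"]["hits"]
        if not hits:
            return  # end of results: do not restart the query
        yield from hits
        # Resume after the sort values of the last hit on this page.
        body = dict(query, size=page_size, search_after=hits[-1]["sort"])


def fake_search_factory(docs: list) -> Callable[[dict], dict]:
    """Hypothetical in-memory stand-in for the _search endpoint."""
    ordered = sorted(docs, key=lambda d: d["Visit_DateTime"])

    def search(body: dict) -> dict:
        after = body.get("search_after")
        # Keep only documents that sort strictly after the previous page.
        remaining = [d for d in ordered
                     if after is None or [d["Visit_DateTime"]] > after]
        page = remaining[: body["size"]]
        return {"hits": {"hits": [
            {"_id": d["id"], "_source": d, "sort": [d["Visit_DateTime"]]}
            for d in page
        ]}}

    return search
```

To use it against a real cluster you would pass in your own function that
posts the body to the index's _search endpoint and returns the parsed JSON
response.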
