Possibly worth mentioning, although it might not fit your use case: if
the fields you're interested in are configured with docValues, you
could use streaming expressions (or handle thread-per-shard connections
to the /export handler directly) and get everything in a single shot,
without paging of any kind. (I'm actually working on something along
these lines now; though not quite ready for prime time, it's reliably
exporting 68 million records from 24 shards to a 24 GB compressed zip
archive in 23 minutes.)
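
For illustration only, here is a rough Python sketch of the
thread-per-shard idea. It is not the tool I mentioned above. The core
URLs, field names, and output file names are placeholders you would
replace with your own, and it assumes every field in fl and sort has
docValues, since /export requires that.

```python
import shutil
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlencode
from urllib.request import urlopen


def export_url(core_url, fields, sort_field="id", query="*:*"):
    """Build an /export request URL for one replica core.

    /export needs q, sort, and fl; the values here are placeholders.
    """
    params = {"q": query, "sort": f"{sort_field} asc", "fl": ",".join(fields)}
    return f"{core_url.rstrip('/')}/export?{urlencode(params)}"


def export_shard(core_url, fields, out_path):
    """Stream one core's full result set to a local file, no paging."""
    with urlopen(export_url(core_url, fields)) as resp, open(out_path, "wb") as out:
        shutil.copyfileobj(resp, out)
    return out_path


def export_all(core_urls, fields, out_dir="."):
    """Run one export thread per shard (one core URL per shard)."""
    with ThreadPoolExecutor(max_workers=max(1, len(core_urls))) as pool:
        futures = [
            pool.submit(export_shard, url, fields, f"{out_dir}/shard{i}.json")
            for i, url in enumerate(core_urls)
        ]
        return [f.result() for f in futures]
```

You would call export_all() with one replica core URL per shard, for
example something of the form
http://localhost:8983/solr/mycoll_shard1_replica_n1 (a made-up name).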

On Mon, Feb 10, 2020 at 6:39 PM Erick Erickson <erickerick...@gmail.com> wrote:
>
> Any field that’s unique per doc would do, but yeah, that’s usually an ID.
>
> Hmmm, I don’t see why separate queries for 0-f are necessary if you’re firing
> at individual replicas. Each replica should have multiple UUIDs that start 
> with 0-f.
>
> Unless I misunderstand and you’re just firing off, say, 16 threads at the entire
> collection rather than individual shards which would work too. But for individual
> shards I think you need to look for all possible IDs...
>
> Erick
>
> > On Feb 10, 2020, at 5:37 PM, Walter Underwood <wun...@wunderwood.org> wrote:
> >
> >
> >> On Feb 10, 2020, at 2:24 PM, Walter Underwood <wun...@wunderwood.org> wrote:
> >>
> >> Not sure if range queries work on a UUID field, ...
> >
> > A search for id:0* took 260 ms, so it looks like they work just fine. I’ll 
> > try separate queries for 0-f.
> >
> > wunder
> > Walter Underwood
> > wun...@wunderwood.org
> > http://observer.wunderwood.org/  (my blog)
> >
>
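
For reference, the prefix-query approach discussed in the quoted thread
could be sketched roughly like this in Python. The field name "id" and
the hex alphabet are assumptions based on lowercase UUID IDs, and the
core URLs are whatever your replicas are actually called. Per Erick's
point, when querying individual shards each shard gets all 16 prefixes.

```python
from itertools import product

HEX_DIGITS = "0123456789abcdef"


def prefix_queries(field="id", alphabet=HEX_DIGITS):
    """One prefix query per leading character, e.g. id:0* through id:f*."""
    return [f"{field}:{c}*" for c in alphabet]


def shard_prefix_jobs(core_urls, field="id"):
    """Pair every shard with every prefix, since each shard holds IDs
    starting with any of the 16 hex digits."""
    return list(product(core_urls, prefix_queries(field)))
```

Each (core URL, query) pair can then be handed to its own worker thread.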
