Thanks Yonik for the explanation.

Hi Erick,
I was using the /xport functionality, but it has not been stable (Solr
5.5.0). I started running into runtime exceptions (JSON parsing
exceptions) while reading the stream of Tuples. This started happening after
the size of my collection tripled and I started running queries
that return millions of documents (more than 10 million). I don't know
whether it is the query result size or the total number of docs in the
collection that is causing the instability.

org.noggit.JSONParser$ParseException: Expected ',' or '}':
char=5,position=110938 BEFORE='uuid":"0lG99s8vyaKB2I/
I","space":"uuid","timestamp":1 5' AFTER='DB6 474294954},{"uuid":"
0lG99sHT8P5e'

I won't be able to move to Solr 6.0 due to some constraints in our
production environment, so I am moving back to the cursor approach. Do you
have any other suggestions for me?

Thanks,
Chetas.

On Fri, Nov 4, 2016 at 10:17 PM, Erick Erickson <erickerick...@gmail.com>
wrote:

> Have you considered the /xport functionality?
>
> On Fri, Nov 4, 2016 at 5:56 PM, Yonik Seeley <ysee...@gmail.com> wrote:
> > No, you can't get cursor-marks ahead of time.
> > They are the serialized representation of the last sort values
> > encountered (hence not known ahead of time).
> >
> > -Yonik
> >
> >
> > On Fri, Nov 4, 2016 at 8:48 PM, Chetas Joshi <chetas.jo...@gmail.com> wrote:
> >> Hi,
> >>
> >> I am using the cursor approach to fetch results from Solr (5.5.0). Most of
> >> my queries return millions of results. Is there a way I can read the pages
> >> in parallel? Is there a way I can get all the cursors well in advance?
> >>
> >> Let's say my query returns 2M documents and I have set rows=100,000.
> >> Can I have multiple threads iterating over different pages like
> >> Thread1 -> docs 1 to 100K
> >> Thread2 -> docs 101K to 200K
> >> ......
> >> ......
> >>
> >> For this to happen, can I get all the cursorMarks for a given query so that
> >> I can leverage the following code in parallel?
> >>
> >> cursorQ.set(CursorMarkParams.CURSOR_MARK_PARAM, cursorMark)
> >> val rsp: QueryResponse = c.query(cursorQ)
> >>
> >> Thank you,
> >> Chetas.
>
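
Yonik's point in the quoted reply, that a cursorMark is the serialized
representation of the last sort values encountered, can be illustrated with a
small in-memory sketch. This is not SolrJ code and not how Solr encodes marks:
`CursorSketch`, `Page`, `fetchPage` and `fetchAll` are hypothetical names, the
"collection" is just a sorted Vector of keys, and the mark is simply the last
key returned ("*" is the start mark, as with CursorMarkParams.CURSOR_MARK_START
in SolrJ). It only shows why each request needs the mark from the previous
response, so the pages cannot be requested ahead of time.

```scala
object CursorSketch {
  // One page of results plus the mark to send with the next request.
  final case class Page(docs: Vector[String], nextCursorMark: String)

  // Hypothetical stand-in for a Solr query: returns up to `rows` keys that
  // sort strictly after `cursorMark`. Assumes keys are unique and sorted.
  def fetchPage(sorted: Vector[String], cursorMark: String, rows: Int): Page = {
    val remaining =
      if (cursorMark == "*") sorted else sorted.dropWhile(_ <= cursorMark)
    val docs = remaining.take(rows)
    // The next mark is derived from the last document actually returned.
    // An unchanged mark signals that the results are exhausted.
    Page(docs, docs.lastOption.getOrElse(cursorMark))
  }

  // Sequential walk: the mark for page N+1 is only known after page N arrives.
  def fetchAll(sorted: Vector[String], rows: Int): Vector[String] = {
    @annotation.tailrec
    def loop(mark: String, acc: Vector[String]): Vector[String] = {
      val page = fetchPage(sorted, mark, rows)
      if (page.nextCursorMark == mark) acc ++ page.docs
      else loop(page.nextCursorMark, acc ++ page.docs)
    }
    loop("*", Vector.empty)
  }
}
```

With SolrJ the same loop shape applies: send the current mark via
CursorMarkParams.CURSOR_MARK_PARAM, read the next one from
QueryResponse.getNextCursorMark, and stop when the mark no longer changes.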
