Thanks all for the feedback.

I have a PR for this with some performance benchmarks:
https://github.com/apache/solr/pull/4053

Considering the *potential* performance degradation, I made this
opt-in via a new request parameter, includeStoredFields.
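
As a rough sketch of what opting in could look like from a client: the
helper below only builds an /export URL. The parameter name comes from
the PR, but the "true" value, the host, the collection and the field
names are assumptions for illustration, and build_export_url is a
hypothetical helper, not part of any Solr client.

```python
from urllib.parse import urlencode


def build_export_url(base_url, collection, fields, sort, include_stored_fields=False):
    """Build a Solr /export URL; /export needs both a sort and an fl."""
    params = {
        "q": "*:*",
        "sort": sort,
        "fl": ",".join(fields),
    }
    if include_stored_fields:
        # Assumption: the new opt-in parameter takes a boolean-style value.
        params["includeStoredFields"] = "true"
    return f"{base_url}/{collection}/export?{urlencode(params)}"


# Hypothetical host, collection and field names.
url = build_export_url(
    "http://localhost:8983/solr",
    "techproducts",
    fields=["id", "description"],
    sort="id asc",
    include_stored_fields=True,
)
print(url)
```

Leaving the flag off keeps the request identical to what /export
accepts today, which is the point of making it opt-in.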

> if I looked closer, I would find that ExportWriter was
> engineered/designed around the assumptions of DocValues -- chiefly
> that data is first organized by field then document.  But stored
> fields are separated first. The access pattern and thus performance
> might suffer a lot, if true.

> Do you have any sense for how using stored vs. DV fields impacts the
> performance of /export?

David/Jason:
Whether stored fields are slower really depends on your documents
and fl. For smaller documents there is little difference. For larger
documents with a relatively small fl the difference can be
substantial, because the StoredFields lookup does a lot of work for
the few fields you actually asked for. My benchmark for this
scenario showed a 4-5X slowdown for stored fields compared to DV.

I was happy to do the benchmarks, but some of this is beside the point.
Some fields don't even support DV. What if I am trying to export them?
Also, "you should've used DV" does not help after the fact: the index
you are trying to get the data from may already be non-DV stored, and
there is little you can do about it. Finally, if you are planning to
extract *everything* (my case), it should not matter much either way.


> I have found myself switching to cursorMark in cases where I
> needed to work with fields which are non-DV stored, which is a
> few more lines of boilerplate code, and so this would be a welcome
> enhancement.

Rahul:
Yes, this is exactly the kind of pain I am trying to prevent :-)
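
For anyone who has not written it, the cursorMark boilerplate in
question looks roughly like the sketch below. The loop follows Solr's
documented cursor contract: start at "*", pass back nextCursorMark,
and stop when it no longer changes. fetch_page is a hypothetical
callable that would issue the actual /select request (with a sort that
includes the uniqueKey field) and return the parsed JSON response.

```python
def fetch_all_with_cursor(fetch_page, rows=500):
    """Yield every document by following Solr's cursorMark until it stops advancing.

    fetch_page(cursor_mark, rows) is a hypothetical callable that returns a
    parsed JSON response containing response.docs and nextCursorMark.
    """
    cursor = "*"  # Solr's documented starting cursor value
    while True:
        response = fetch_page(cursor, rows)
        yield from response["response"]["docs"]
        next_cursor = response["nextCursorMark"]
        if next_cursor == cursor:
            # The cursor did not move, so there are no more results.
            break
        cursor = next_cursor
```

With /export handling stored fields, the client would just read one
stream and none of this loop would be needed.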

From: [email protected] At: 12/09/25 17:41:57 UTC-5:00 To: [email protected]
Subject: Re: ExportWriter (Optionally) Supporting Stored Fields

I haven't looked at this in depth (big disclaimer there!) but I *suspect*
if I looked closer, I would find that ExportWriter was engineered/designed
around the assumptions of DocValues -- chiefly that data is first organized
by field then document.  But stored fields are separated first.  The access
pattern and thus performance might suffer a lot, if true.  Some parts of
Solr like simply returning normal search results and even highlighting have
tried to navigate that balancing act.

On Mon, Nov 17, 2025 at 11:02 AM Luke Kot-Zaniewski (BLOOMBERG/ 919 3RD A) <
[email protected]> wrote:

> Hi All,
>
> I've come to enjoy Solr's streaming support with export writer very much,
> so when I ran into a case where I needed to extract non-dv stored fields
> I was motivated to equip ExportWriter with this power. I have a patch that
> does this, or at least appears to do this. I would be happy to share but
> wanted to gauge interest and solicit any opinions on the mailing list
> first.
> I realize DVs are usually better but this is in case you don't have DVs.
>
> Thanks,
> Luke
>
>

