[
https://issues.apache.org/jira/browse/SOLR-1931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Erick Erickson updated SOLR-1931:
---------------------------------
Attachment: SOLR-1931-3x.patch
SOLR-1931-trunk.patch
Thanks Robert and Yonik for pointing me at the new 4x capabilities; they make a
huge difference. But you knew that.
The killer for 3.x was getting the document counts via a range query. I don't
think there's a good way to get the counts without paying the penalty, so
there's a new parameter, reportDocCounts.
Here's my latest and close-to-last cut at this, both for 3x and 4x.
The data set is 89M documents; times are in seconds.

3.5
   637  getting doc counts
3x with this patch
   552  getting doc counts
    53  Stats without doc counts, but histogram etc. No option to do
        this before.
4x, original
   450  or so as I remember, getting doc counts, histograms, etc.
4x with patch, histograms still work.
   158  Getting the doc counts the old way (span queries). I mean,
        you guys *said* ranges were going to be faster.
    39  Getting the doc counts with terms.getDocCount()
        (including histograms)
Here's my proposal. I'll probably commit this next weekend at the latest unless
there are objections:
1> I'll let these stew for a couple of days, and look them over again. Anyone
who wants to look too, please feel free.
2> Live with getting the doc counts in 4x including the deleted docs, and remove
the reportDocCounts parameter (it'll live in 3.6 and other 3x versions). I
think the performance is fine without carrying that kind of kludgy option
forward. I could be persuaded otherwise, but optimizing the index will take
care of the deleted-document counting problem if anyone really cares.
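
To make the trade-off in 2> concrete, here is a minimal sketch of why a stored
per-field statistic is cheap but still counts deleted documents, and why
optimizing makes the two numbers agree. This is a toy model under stated
assumptions, not Lucene's implementation: the class FieldDocCountSketch and its
methods are hypothetical names invented for illustration, and the postings are
plain int arrays.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

/** Toy model of one field's postings; hypothetical, not Lucene's data structures. */
final class FieldDocCountSketch {
    private final List<int[]> postings; // one doc-id array per term
    private final BitSet liveDocs;      // a set bit means the doc is not deleted
    private final int storedDocCount;   // statistic fixed when the postings are written

    FieldDocCountSketch(List<int[]> postings, BitSet liveDocs) {
        this.postings = postings;
        this.liveDocs = liveDocs;
        // Computed once, at "index time", ignoring later deletions.
        BitSet seen = new BitSet();
        for (int[] docs : postings) {
            for (int doc : docs) {
                seen.set(doc);
            }
        }
        this.storedDocCount = seen.cardinality();
    }

    /** Cheap: reads the stored statistic, so deleted docs are still counted. */
    int storedDocCount() {
        return storedDocCount;
    }

    /** Expensive: walks every posting and skips deleted docs. */
    int liveDocCount() {
        BitSet seen = new BitSet();
        for (int[] docs : postings) {
            for (int doc : docs) {
                if (liveDocs.get(doc)) {
                    seen.set(doc);
                }
            }
        }
        return seen.cardinality();
    }

    /** Models an optimize: postings are rewritten without the deleted docs. */
    FieldDocCountSketch optimized() {
        List<int[]> kept = new ArrayList<>();
        for (int[] docs : postings) {
            int[] live = Arrays.stream(docs).filter(liveDocs::get).toArray();
            if (live.length > 0) {
                kept.add(live);
            }
        }
        // Doc ids are not renumbered here; a real merge would renumber them.
        return new FieldDocCountSketch(kept, (BitSet) liveDocs.clone());
    }

    public static void main(String[] args) {
        BitSet live = new BitSet();
        live.set(0);
        live.set(1);
        live.set(3); // doc 2 has been deleted
        FieldDocCountSketch field = new FieldDocCountSketch(
                Arrays.asList(new int[] {0, 1, 2}, new int[] {2, 3}), live);
        System.out.println("stored statistic:      " + field.storedDocCount());
        System.out.println("live docs (full scan): " + field.liveDocCount());
        System.out.println("stored after optimize: " + field.optimized().storedDocCount());
    }
}
```

In this model the stored statistic only overcounts by the number of deleted
documents that have not yet been merged away, which is the discrepancy 2>
proposes to live with.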
> Schema Browser does not scale with large indexes
> ------------------------------------------------
>
> Key: SOLR-1931
> URL: https://issues.apache.org/jira/browse/SOLR-1931
> Project: Solr
> Issue Type: Improvement
> Components: web gui
> Affects Versions: 3.6, 4.0
> Reporter: Lance Norskog
> Assignee: Erick Erickson
> Priority: Minor
> Attachments: SOLR-1931-3x.patch, SOLR-1931-3x.patch,
> SOLR-1931-trunk.patch, SOLR-1931-trunk.patch
>
>
> The Schema Browser JSP by default causes the Luke handler to "scan the
> world". In large indexes this makes the UI useless.
> On an index with 64m documents & 8gb of disk space, the Schema Browser took 6
> minutes to open and hogged all disk I/O, making Solr useless.