[jira] [Updated] (SOLR-1931) Schema Browser does not scale with large indexes

Erick Erickson (Updated) (JIRA) Mon, 02 Jan 2012 18:27:12 -0800

     [ 
https://issues.apache.org/jira/browse/SOLR-1931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Erick Erickson updated SOLR-1931:
---------------------------------

    Attachment: SOLR-1931-3x.patch
                SOLR-1931-trunk.patch

Thanks, Robert and Yonik, for pointing me at the new 4x capabilities; they make a 
huge difference. But you knew that.

The killer for 3.x was getting the document counts via a range query. I don't 
think there's a good way to get the counts without paying that penalty, so 
there's a new parameter, reportDocCounts, to turn them on or off.

Here's my latest and close-to-last cut at this, both for 3x and 4x.

The data set is 89M documents; all times are in seconds.

3.5
637  Getting doc counts.

3x with this patch
552  Getting doc counts.
 53  Stats without doc counts, but with histograms etc.
     No option to do this before.

4x, original
450  Or so, as I remember, getting doc counts, histograms, etc.

4x with patch (histograms still work)
158  Getting the doc counts the old way (span queries). I mean,
     you guys *said* ranges were going to be faster.
 39  Getting the doc counts with terms.getDocCount()
     (including histograms).
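
For anyone who wants the ratios rather than the raw seconds, here is a rough sketch of the arithmetic in Python. The numbers are copied from the timings above (the 4x "original" figure is from memory, so treat it as approximate); the labels and the speedup() helper are just illustrative names, not anything in the patch.

```python
# Timings in seconds for the 89M-document data set, copied from the list above.
# The 4x "original" value is approximate ("450 or so").
TIMINGS = {
    "3.5, doc counts": 637,
    "3x patch, doc counts": 552,
    "3x patch, no doc counts": 53,
    "4x original, doc counts (approx.)": 450,
    "4x patch, doc counts the old way": 158,
    "4x patch, terms.getDocCount()": 39,
}


def speedup(baseline_seconds, improved_seconds):
    """Return how many times faster the improved run is than the baseline."""
    return baseline_seconds / improved_seconds


# Pairs of (baseline, improved) runs worth comparing.
COMPARISONS = [
    ("3.5, doc counts", "3x patch, doc counts"),
    ("3.5, doc counts", "3x patch, no doc counts"),
    ("4x original, doc counts (approx.)", "4x patch, doc counts the old way"),
    ("4x original, doc counts (approx.)", "4x patch, terms.getDocCount()"),
]

for base, new in COMPARISONS:
    ratio = speedup(TIMINGS[base], TIMINGS[new])
    print(f"{base} -> {new}: {ratio:.1f}x")
```

The short version: skipping doc counts on 3x and using terms.getDocCount() on 4x each come out at roughly an order of magnitude faster than the corresponding starting point.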
 
 
Here's my proposal. I'll probably commit this next weekend at the latest unless 
there are objections:

1> I'll let these stew for a couple of days, and look them over again. Anyone 
who wants to look too, please feel free.

2> Live with the doc counts in 4x including deleted docs, and remove the 
reportDocCounts parameter there (it'll live on in 3.6 and other 3x versions). I 
think the performance is fine without carrying that kind of kludgy option 
forward. I could be persuaded otherwise, but optimizing the index will take 
care of the deleted-document counting problem if anyone really cares.

                
> Schema Browser does not scale with large indexes
> ------------------------------------------------
>
>                 Key: SOLR-1931
>                 URL: https://issues.apache.org/jira/browse/SOLR-1931
>             Project: Solr
>          Issue Type: Improvement
>          Components: web gui
>    Affects Versions: 3.6, 4.0
>            Reporter: Lance Norskog
>            Assignee: Erick Erickson
>            Priority: Minor
>         Attachments: SOLR-1931-3x.patch, SOLR-1931-3x.patch, 
> SOLR-1931-trunk.patch, SOLR-1931-trunk.patch
>
>
> The Schema Browser JSP by default causes the Luke handler to "scan the 
> world". In large indexes this makes the UI useless.
> On an index with 64M documents and 8 GB of disk space, the Schema Browser 
> took 6 minutes to open and hogged all disk I/O, making Solr useless.
