[
https://issues.apache.org/jira/browse/LUCENE-3435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13104622#comment-13104622
]
Grant Ingersoll commented on LUCENE-3435:
-----------------------------------------
A good deal of it Mike and I worked out yesterday on IRC (well, mostly Mike
explained and I took copious notes). The disk storage stuff is based on LIA2.
It is a theoretical model rather than an empirical one, except that the
bytes/term calculation was based on indexing Wikipedia.
I would deem it a gross approximation of the state of trunk at this point in
time. My gut says the Lucene estimation is a little low, while Solr is fairly
close (since I suspect Solr's memory usage is dominated by caching). I imagine
there are things still unaccounted for. For instance, I haven't reverse
engineered the fieldValueCache memSize() method yet, and I don't have a good
sense of how much memory a highly concurrent system would consume through the
sheer number of Query objects instantiated, or through really large Queries
(say 5K terms). It is also not meant to be one-size-fits-all: Lucene/Solr
have a ton of tuning options that could change things significantly.
I did a few sanity checks against things I've seen in the past, and the
results seemed reasonable. There is, of course, no substitute for good
testing. In other words, caveat emptor.
> Create a Size Estimator model for Lucene and Solr
> -------------------------------------------------
>
> Key: LUCENE-3435
> URL: https://issues.apache.org/jira/browse/LUCENE-3435
> Project: Lucene - Java
> Issue Type: Task
> Components: core/other
> Affects Versions: 4.0
> Reporter: Grant Ingersoll
> Assignee: Grant Ingersoll
> Priority: Minor
>
> It is often handy to be able to estimate the amount of memory and disk space
> that both Lucene and Solr use, given certain assumptions. I intend to check
> in an Excel spreadsheet that allows people to estimate memory and disk usage
> for trunk. I propose to put it under dev-tools, as I don't think it should
> be official documentation just yet and, like the IDE stuff, we'll see how
> well it gets maintained.