[ https://issues.apache.org/jira/browse/SOLR-17775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17955820#comment-17955820 ]
Yura commented on SOLR-17775:
-----------------------------

[~dsmiley], the improvement was quite significant. This is especially noticeable if you need more than 100 rows.

I don’t think it adds much memory usage. It’s only for the retrieved document IDs, which are usually quite small (<1000).

The main gain is not in saving calls to {{getValues}}, but in retrieving values in order. Lucene data structures are optimized for iteration, not random seek. Internally, this approach uses {{DocIterator}} jumps from doc[N-1] to doc[N], instead of jumping from 0 to doc[N]. This is much cheaper.

> Optimize ValueSourceAugmenter
> -----------------------------
>
>                 Key: SOLR-17775
>                 URL: https://issues.apache.org/jira/browse/SOLR-17775
>             Project: Solr
>          Issue Type: Improvement
>          Components: search
>            Reporter: Yura
>            Priority: Minor
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> h3. Problem
> ValueSourceAugmenter currently calculates function values on-demand during
> transform(), performing expensive binary searches and reader lookups for each
> document individually.
> h3. Solution
> Pre-calculate function values for all result set documents during
> setContext() by:
> * Collecting and sorting document IDs from DocList
> * Sequential iteration through sorted documents to calculate values once per
> reader segment
> * Storing results in hash map for O(1) lookup during transform()
> * Fallback to on-demand calculation for documents outside the pre-calculated
> set (RTG cases)
> h3. Performance Benefit
> Replaces repeated "find document at position N" operations (binary search per
> document) with efficient "get next document" iteration (sequential processing
> within reader segments), significantly reducing lookup overhead.
> h3. Compatibility
> Maintains full backward compatibility through fallback mechanism for edge
> cases.
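
For readers following the thread, the idea described above can be sketched in plain Java without any Lucene or Solr dependency. This is only an illustration under stated assumptions, not the code from the pull request: {{ForwardOnlyValues}}, {{precompute}}, and {{valueFor}} are hypothetical names, a single array stands in for one reader segment, and the {{steps}} counter is a toy cost model in which every lookup outside the sorted pass restarts from the beginning.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

final class PrecomputeSketch {

    /** Hypothetical stand-in for a forward-only, per-segment value reader. */
    static final class ForwardOnlyValues {
        private final double[] values;
        private int position = -1;
        /** Toy cost model: number of documents stepped over so far. */
        public int steps = 0;

        public ForwardOnlyValues(double[] values) {
            this.values = values;
        }

        /** Rewind to the start, as a fresh lookup would have to. */
        public void reset() {
            position = -1;
        }

        /** Move forward to docId and return its value; backward seeks are not allowed. */
        public double advanceTo(int docId) {
            if (docId < position) {
                throw new IllegalStateException("cannot seek backwards");
            }
            steps += docId - position;
            position = docId;
            return values[docId];
        }
    }

    /** Sort the result-set doc IDs, walk them once in order, and cache each value. */
    public static Map<Integer, Double> precompute(int[] docIds, ForwardOnlyValues reader) {
        int[] sorted = docIds.clone();
        Arrays.sort(sorted);
        Map<Integer, Double> cache = new HashMap<>();
        reader.reset();
        for (int docId : sorted) {
            // Each jump goes from the previous doc to this one, never from 0.
            cache.put(docId, reader.advanceTo(docId));
        }
        return cache;
    }

    /** O(1) lookup for cached docs; on-demand fallback for anything else. */
    public static double valueFor(Map<Integer, Double> cache, int docId, ForwardOnlyValues reader) {
        Double cached = cache.get(docId);
        if (cached != null) {
            return cached;
        }
        reader.reset(); // fallback path pays the full seek from the start
        return reader.advanceTo(docId);
    }
}
```

In this toy model, precomputing docs 90, 10, and 50 costs one forward pass over the segment, whereas three independent lookups would each restart from the beginning. The fallback in {{valueFor}} mirrors the on-demand path the issue keeps for documents outside the precomputed set.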