[
https://issues.apache.org/jira/browse/SOLR-17942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ishan Chattopadhyaya resolved SOLR-17942.
-----------------------------------------
Resolution: Fixed
Thanks [~puneet22]!
> Raising the hardcoded limit of lucene parameter ramPerThreadHardLimitMB using
> reflection
> ----------------------------------------------------------------------------------------
>
> Key: SOLR-17942
> URL: https://issues.apache.org/jira/browse/SOLR-17942
> Project: Solr
> Issue Type: Task
> Reporter: Puneet Ahuja
> Assignee: Ishan Chattopadhyaya
> Priority: Major
> Labels: pull-request-available
> Fix For: 10.0
>
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> The parameter ramPerThreadHardLimitMB cannot be larger than 2GB in Lucene,
> which means a single thread cannot write segments larger than 2GB.
> Refer:
> [https://lucene.apache.org/core/9_9_0/core/org/apache/lucene/index/IndexWriterConfig.html#setRAMPerThreadHardLimitMB(int)]
> This issue proposes to make this parameter configurable above the 2GB limit,
> so that each thread can write a bigger segment. I plan to use reflection to
> bypass this hard-coded limit in Lucene.
>
> When indexing high dimensional vector data, each segment has its own HNSW
> graph. So more segments mean more graphs to search per shard and more graph
> rebuild work during merges. With this change, a single indexing thread can
> flush fewer, larger segments, which is generally more resource-efficient
> for vector-heavy workloads.
> Lucene issue: https://github.com/apache/lucene/issues/15296
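
The reflection approach proposed in the description above can be sketched roughly as follows. This is a minimal illustration, not the code from the actual patch: StandInConfig is a hypothetical stand-in for a config class whose setter enforces a hard-coded upper bound, and the field name, starting value, and the forceHardLimitMB helper are assumptions made for the example. The real Lucene field name and visibility should be checked against the Lucene version in use.

```java
import java.lang.reflect.Field;

public class RamLimitReflectionSketch {

    /** Hypothetical stand-in for a config class whose setter has a hard-coded bound. */
    static class StandInConfig {
        // Arbitrary starting value chosen for the example.
        private volatile int perThreadHardLimitMB = 1024;

        StandInConfig setRAMPerThreadHardLimitMB(int mb) {
            // The hard-coded range check that the issue wants to get around.
            if (mb <= 0 || mb >= 2048) {
                throw new IllegalArgumentException("limit must be > 0 and < 2048 MB");
            }
            this.perThreadHardLimitMB = mb;
            return this;
        }

        int getRAMPerThreadHardLimitMB() {
            return perThreadHardLimitMB;
        }
    }

    /** Writes the backing field directly, skipping the setter's range check. */
    static void forceHardLimitMB(StandInConfig config, int mb) {
        try {
            Field field = StandInConfig.class.getDeclaredField("perThreadHardLimitMB");
            field.setAccessible(true); // private field, so access checks must be suppressed
            field.setInt(config, mb);
        } catch (ReflectiveOperationException e) {
            // The field name or access rules differ from what this sketch assumes.
            throw new IllegalStateException("could not override per-thread hard limit", e);
        }
    }

    public static void main(String[] args) {
        StandInConfig config = new StandInConfig();
        try {
            config.setRAMPerThreadHardLimitMB(4096);
        } catch (IllegalArgumentException e) {
            System.out.println("setter rejected the value: " + e.getMessage());
        }
        forceHardLimitMB(config, 4096);
        System.out.println("limit after reflection: " + config.getRAMPerThreadHardLimitMB() + " MB");
    }
}
```

Writing the field this way only skips the validation in the setter. Whether the code that later reads the value behaves correctly above the original bound is a separate question, which is what the linked Lucene issue is about.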
--
This message was sent by Atlassian Jira
(v8.20.10#820010)