[
https://issues.apache.org/jira/browse/SOLR-17942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18047353#comment-18047353
]
Puneet Ahuja commented on SOLR-17942:
-------------------------------------
Hi [~janhoy],
Thanks for catching that. My branch name had a "/" in it (like
`puneet/branch-name`). When `writeChangelog` used the branch name to create
the YAML file, the "/" was treated as a path separator, so the file ended up
in a subdirectory `/changelog/unreleased/puneet/` instead of directly under
`/changelog/unreleased/`. I'll move the file to the correct location and raise
a PR.
The "Check changelog entry" workflow passed on GitHub even with the file in
the wrong place, so the validation could probably catch this in the future.
The gotcha with branch names containing "/" should be fixed, or at least
mentioned in the documentation for the new changelog process.
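To illustrate the kind of guard I have in mind, here is a minimal sketch. It is not the actual `writeChangelog` task or the "Check changelog entry" workflow. The function names and the `.yml` extension are assumptions made for the example; only the `changelog/unreleased` directory comes from the case above.

```python
from pathlib import PurePosixPath

# Directory from the comment above; everything else here is hypothetical.
UNRELEASED_DIR = PurePosixPath("changelog/unreleased")


def changelog_filename(branch: str) -> str:
    """Hypothetical helper: flatten a branch name into a single file name.

    Replaces "/" with "-" so the branch name can no longer introduce a
    subdirectory. The ".yml" extension is an assumption.
    """
    slug = branch.strip("/").replace("/", "-")
    return f"{slug}.yml"


def is_directly_under_unreleased(path: str) -> bool:
    """Hypothetical check: True only if the file's parent directory is
    exactly changelog/unreleased (no nested subdirectory)."""
    return PurePosixPath(path).parent == UNRELEASED_DIR
```

With something like the first function, `puneet/branch-name` would map to one flat file name. With something like the second, a validation step could reject `changelog/unreleased/puneet/branch-name.yml` while still accepting files placed directly in the directory.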
CC: [~ichattopadhyaya]
> Raising the hardcoded limit of lucene parameter ramPerThreadHardLimitMB using
> reflection
> ----------------------------------------------------------------------------------------
>
> Key: SOLR-17942
> URL: https://issues.apache.org/jira/browse/SOLR-17942
> Project: Solr
> Issue Type: Task
> Reporter: Puneet Ahuja
> Assignee: Ishan Chattopadhyaya
> Priority: Major
> Labels: pull-request-available
> Fix For: 10.0
>
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> The parameter ramPerThreadHardLimitMB cannot be larger than 2GB in Lucene,
> which means a single thread cannot write segments larger than 2GB.
> Refer:
> [https://lucene.apache.org/core/9_9_0/core/org/apache/lucene/index/IndexWriterConfig.html#setRAMPerThreadHardLimitMB(int)]
> This issue proposes to make this parameter configurable above the 2GB limit,
> so that each thread can write a bigger segment. I plan to use reflection to
> bypass this hard-coded limit in Lucene.
>
> When indexing high dimensional vector data, each segment has its own HNSW
> graph. So more segments mean more graphs to search per shard and more graph
> rebuild work during merges. With this change, a single indexing thread can
> flush fewer, larger segments, which is generally more resource-efficient
> for vector-heavy workloads.
> Lucene issue: https://github.com/apache/lucene/issues/15296