Hi all,

Great to have this discussion!
My votes are for 2 and 4!

Best,
Pandu

On 2023/05/16 08:50:24 Alessandro Benedetti wrote:
> Hi all,
> we have finalized all the options proposed by the community and we are
> ready to vote for the preferred one and then proceed with the
> implementation.
>
> *Option 1*
> Keep it as it is (dimension limit hardcoded to 1024)
> *Motivation*:
> We are close to improving on many fronts. Given the criticality of Lucene
> in computing infrastructure and the concerns raised by one of the most
> active stewards of the project, I think we should keep working toward
> improving the feature as is and move to up the limit after we can
> demonstrate improvement unambiguously.
>
> *Option 2*
> make the limit configurable, for example through a system property
> *Motivation*:
> The system administrator can enforce a limit its users need to respect that
> it's in line with whatever the admin decided to be acceptable for them.
> The default can stay the current one.
> This should open the doors for Apache Solr, Elasticsearch, OpenSearch, and
> any sort of plugin development
>
> *Option 3*
> Move the max dimension limit lower level to a HNSW specific implementation.
> Once there, this limit would not bind any other potential vector engine
> alternative/evolution.
> *Motivation:* There seem to be contradictory performance interpretations
> about the current HNSW implementation. Some consider its performance ok,
> some not, and it depends on the target data set and use case. Increasing
> the max dimension limit where it is currently (in top level
> FloatVectorValues) would not allow potential alternatives (e.g. for other
> use-cases) to be based on a lower limit.
>
> *Option 4*
> Make it configurable and move it to an appropriate place.
> In particular, a simple Integer.getInteger("lucene.hnsw.maxDimensions",
> 1024) should be enough.
> *Motivation*:
> Both are good and not mutually exclusive and could happen in any order.
> Someone suggested to perfect what the _default_ limit should be, but I've
> not seen an argument _against_ configurability. Especially in this way --
> a toggle that doesn't bind Lucene's APIs in any way.
>
> I'll keep this [VOTE] open for a week and then proceed to the
> implementation.
> --------------------------
> *Alessandro Benedetti*
> Director @ Sease Ltd.
> *Apache Lucene/Solr Committer*
> *Apache Solr PMC Member*
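
For reference, the system-property approach described in Option 4 of the quoted proposal could look roughly like the sketch below. The property name and the default of 1024 come from the proposal itself. The class `MaxDimensionConfig` and its methods are hypothetical names used only for illustration, and none of this is Lucene's actual implementation.

```java
// Hypothetical helper, not part of Lucene: it only illustrates Option 4.
final class MaxDimensionConfig {
  // Property name and default value are taken from the proposal text.
  static final String PROPERTY = "lucene.hnsw.maxDimensions";
  static final int DEFAULT_MAX_DIMENSIONS = 1024;

  private MaxDimensionConfig() {}

  // Reads the limit on every call so that a changed property is picked up.
  // A real implementation might cache the value in a static final field.
  // Integer.getInteger falls back to the default when the property is
  // unset or cannot be parsed as an integer.
  static int readMaxDimensions() {
    int limit = Integer.getInteger(PROPERTY, DEFAULT_MAX_DIMENSIONS);
    if (limit <= 0) {
      throw new IllegalStateException(PROPERTY + " must be positive, got " + limit);
    }
    return limit;
  }

  // Rejects vectors whose dimension is non-positive or above the limit.
  static void checkDimension(int dimension) {
    int limit = readMaxDimensions();
    if (dimension <= 0 || dimension > limit) {
      throw new IllegalArgumentException(
          "vector dimension " + dimension + " must be in [1, " + limit + "]");
    }
  }

  public static void main(String[] args) {
    System.out.println("max dimensions = " + readMaxDimensions());
    // Throws if the configured limit is below 768.
    checkDimension(768);
    System.out.println("768 accepted");
  }
}
```

An administrator would then raise the limit at JVM startup, for example with `java -Dlucene.hnsw.maxDimensions=2048 MaxDimensionConfig`, while the default stays at 1024 when the property is not set.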