+1. Thanks, Julie, for tackling this and for serving as release manager.

On Tue, Oct 18, 2022 at 2:51 PM Michael Wechner <michael.wech...@wyona.com> wrote:
> +1 :-)
>
> Thanks
>
> Michael
>
> On 18.10.22 at 19:52, Julie Tibshirani wrote:
> > Hi everyone,
> >
> > We recently discovered a severe bug in the 9.4 release in the kNN
> > vectors format: https://github.com/apache/lucene/issues/11858.
> > Explaining the problem: when ingesting a lot of data, or when
> > performing a force merge, segments can grow large. The format
> > validation code accidentally uses an int instead of a long to compute
> > the data size, so it can fail on these large segments. When format
> > validation fails, the segment is essentially lost and unusable. For
> > some client systems like Elasticsearch, it can send the whole index
> > into a "failed" state, blocking further writes or searches.
> >
> > I think this bug is sufficiently bad that we should perform a 9.4.1
> > release as soon as possible. The fix is just an update to the
> > read-side validation code; there won't be any effect on the data
> > format. This means it is safe to merge the fix into the existing 9.4
> > vectors format. The bug was introduced during the work to add
> > quantization (https://github.com/apache/lucene/pull/1054) and does not
> > affect versions before 9.4.
> >
> > Let me know what you think! I could serve as release manager. (We
> > should also follow up with a plan to prevent this from happening in
> > the future -- maybe we need to regularly run larger-scale benchmarks?)
> >
> > Julie
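
For anyone skimming the thread, the int-versus-long problem described in the quoted message can be illustrated with a minimal sketch. This is not the actual Lucene validation code: the class name, the method names, and the assumption that each vector component is a 4-byte float are all hypothetical and chosen only for illustration. It shows how a size computed in 32-bit arithmetic can silently wrap on a large segment, while widening to long before multiplying gives the true byte count.

```java
public class VectorDataSizeSketch {

    // Hypothetical buggy variant: every operand is an int, so the whole
    // product is computed in 32-bit arithmetic and wraps on overflow.
    static long sizeWithIntMath(int numVectors, int dimension) {
        int bytes = numVectors * dimension * Float.BYTES;
        return bytes;
    }

    // Hypothetical fixed variant: widening the first operand to long makes
    // the whole product a 64-bit computation.
    static long sizeWithLongMath(int numVectors, int dimension) {
        return (long) numVectors * dimension * Float.BYTES;
    }

    public static void main(String[] args) {
        // Illustrative segment: 10 million vectors with 128 dimensions each.
        int numVectors = 10_000_000;
        int dimension = 128;

        long wrapped = sizeWithIntMath(numVectors, dimension);
        long actual = sizeWithLongMath(numVectors, dimension);

        System.out.println("int arithmetic:  " + wrapped + " bytes");
        System.out.println("long arithmetic: " + actual + " bytes");

        // A validation check that compares the wrapped value against the
        // real on-disk length would reject a perfectly valid segment.
        System.out.println("sizes match: " + (wrapped == actual));
    }
}
```

With these inputs the true size is about 5.1 GB, which exceeds Integer.MAX_VALUE, so the int computation yields a smaller, incorrect value rather than an error. In code that must stay in int, Math.multiplyExact is one way to turn silent wraparound into an ArithmeticException.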