Raising the Value of MAX_DIMENSIONS of Vector Values

2022-08-07 Thread Marcus Eagan
Hi Lucene Team,

In general, I have advised very strongly against our team at MongoDB
modifying the Lucene source, except in scenarios where we have strong needs
for a particular customization. Ultimately, people can do what they would
like to do.

That being said, we have a number of customers preparing to use Lucene for
dense vector search. There are many language models that are optimized for
more than 1024 dimensions. I remember Michael Wechner's email about one
instance with OpenAI:

> I just tried to test the OpenAI model
> "text-similarity-davinci-001" with 12288 dimension


It seems that customers who attempt to use these models should not be
turned away. It could be sufficient to explain the issues. The only ones I
have identified are two expected ones, very slow indexing throughput and
high CPU usage, and a perhaps less well-defined risk of more numerical errors.
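
As a rough illustration of why throughput and CPU usage are the expected costs, here is a back-of-the-envelope sketch. It assumes vectors are stored as 32-bit floats and that one similarity comparison costs about one multiply-add per dimension; the function names are hypothetical and are not part of Lucene.

```python
# Back-of-the-envelope cost of a single dense vector.
# Assumption: each dimension is stored as a 32-bit float (4 bytes).
BYTES_PER_FLOAT32 = 4


def vector_bytes(dims: int) -> int:
    """Raw storage for one vector, ignoring any index overhead."""
    return dims * BYTES_PER_FLOAT32


def relative_cost(dims: int, baseline_dims: int) -> float:
    """How much more work one dot product takes than at the baseline,
    assuming one multiply-add per dimension."""
    return dims / baseline_dims


if __name__ == "__main__":
    for dims in (1024, 12288):
        kib = vector_bytes(dims) / 1024
        ratio = relative_cost(dims, 1024)
        print(f"{dims} dims: {kib:.0f} KiB per vector, "
              f"{ratio:.0f}x the per-comparison work of 1024 dims")
```

Under these assumptions a 12288-dimension vector takes 48 KiB and twelve times the arithmetic per comparison of a 1024-dimension one, before counting any graph-construction overhead.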

I opened an issue and a PR for the discussion as well. I would appreciate
guidance on where we think the warning should go. I feel like burying it in
a Javadoc is a less than ideal experience. It would be better to show a
warning on startup. In the PR, I increased the max limit by a factor of
twenty. We should let users use the system based on their needs, even if it
was not designed or optimized for the models they bring, because we need the
feedback and the data from the world.

Is there something I'm overlooking from a risk standpoint?

Best,
-- 
Marcus Eagan


Re: [HELP] Link your Apache Lucene Jira and GitHub account ids before Thursday August 4 midnight (in your local time)

2022-08-07 Thread Glen Newton
jira: gnewton
github: gnewton  (github.com/gnewton)

Thanks,
Glen



On Sat, 6 Aug 2022 at 14:11, Tomoko Uchida 
wrote:

> Hi everyone.
>
> I wanted to let you know that we'll extend the deadline until the date the
> migration is started (the date is not fixed yet).
> Please let us know your Jira/Github usernames if you don't see mapping(s)
> for your account in this file:
>
> https://github.com/apache/lucene-jira-archive/blob/main/migration/mappings-data/account-map.csv.20220722.verified
>
> Tomoko
>
>
> On Sun, Aug 7, 2022 at 1:36, Baris Kazar wrote:
>
> > Thank You Thank You
> > Best regards
> > 
> > From: Michael McCandless 
> > Sent: Saturday, August 6, 2022 11:29:25 AM
> > To: Baris Kazar 
> > Cc: java-u...@lucene.apache.org 
> > Subject: Re: [HELP] Link your Apache Lucene Jira and GitHub account ids
> > before Thursday August 4 midnight (in your local time)
> >
> > OK done:
> >
> > https://github.com/apache/lucene-jira-archive/commit/13fa4cb46a1a6d609448240e4f66c263da8b3fd1
> >
> > Mike McCandless
> >
> > http://blog.mikemccandless.com
> >
> >
> > On Sat, Aug 6, 2022 at 10:29 AM Baris Kazar wrote:
> > I think so.
> > Best regards
> > 
> > From: Michael McCandless <luc...@mikemccandless.com>
> > Sent: Saturday, August 6, 2022 10:12 AM
> > To: java-u...@lucene.apache.org
> > Cc: Baris Kazar <baris.ka...@oracle.com>
> > Subject: Re: [HELP] Link your Apache Lucene Jira and GitHub account ids
> > before Thursday August 4 midnight (in your local time)
> >
> > Thanks Baris,
> >
> > And your Jira ID is bkazar right?
> >
> > Mike
> >
> > On Sat, Aug 6, 2022 at 10:05 AM Baris Kazar wrote:
> > My github username is bmkazar
> > can You please register me?
> > Best regards
> > 
> > From: Michael McCandless <luc...@mikemccandless.com>
> > Sent: Saturday, August 6, 2022 6:05:51 AM
> > To: dev@lucene.apache.org
> > Cc: Lucene Users <java-u...@lucene.apache.org>; java-dev <java-...@lucene.apache.org>
> > Subject: Re: [HELP] Link your Apache Lucene Jira and GitHub account ids
> > before Thursday August 4 midnight (in your local time)
> >
> > Hi Adam, I added your linked accounts here:
> >
> >
> > https://github.com/apache/lucene-jira-archive/commit/c228cb184c073f4b96cd68d45a000cf390455b7c
> >
> > And Tomoko added Rushabh's linked accounts here:
> >
> >
> >
> > https://github.com/apache/lucene-jira-archive/commit/6f9501ec68792c1b287e93770f7a9dfd351b86fb
> >
> > Keep the linked accounts coming!
> >
> > Mike
> >
> > On Thu, Aug 4, 2022 at 7:02 PM Rushabh Shah
> > <rushabh.s...@salesforce.com.invalid> wrote:
> >
> > > Hi,
> > > My mapping is:
> > > JiraName,GitHubAccount,JiraDispName
> > > shahrs87, shahrs87, Rushabh Shah
> > >
> > > Thank you Tomoko and Mike for all of your hard work.
> > >
> > >
> > >
> > >
> > > On Sun, Jul 31, 2022 at 3:08 AM Michael McCandless <
> > > luc...@mikemccandless.com> wrote:
> > >
> > >> Hello Lucene users, contributors and developers,
> > >>
> > >> If you have used Lucene's Jira and you have a GitHub account as well,
> > >> please check whether your user id mapping is in this file:
> > >>
> > >> https://github.com/apache/lucene-jira-archive/blob/main/migration/mappings-data/account-map.csv.20220722.verified
> > >>
> > >> If not, please reply to this email and we will try to add you.
> > >>
> > >> Please forward this email to anyone you know might be impacted and who
> > >> might not be tracking the Lucene lists.
> > >>
> > >>
> > >> Full details:
> > >>
> > >> The Lucene project will soon migrate from Jira to GitHub for issue
> > >> tracking.
> > >>
> > >> There have been discussions, votes, a migration tool created / iterated
> > >> (thanks to Tomoko Uchida's incredibly hard work), all iterating on Lucene's
> > >> dev list.
> > >>
> > >> When we run 

Re: [HELP] Link your Apache Lucene Jira and GitHub account ids before Thursday August 4 midnight (in your local time)

2022-08-07 Thread Aditya Varun Chadha
JIRA: adichad
GitHub: adichad

Thank you!


Re: [HELP] Link your Apache Lucene Jira and GitHub account ids before Thursday August 4 midnight (in your local time)

2022-08-07 Thread Aditya Varun Chadha
Thanks Tomoko,
There is no activity in JIRA from me as far as I can recall. This is the
correct and only account though.

On Sun 7. Aug 2022 at 05:50, Tomoko Uchida 
wrote:

> Hi Aditya,
> I found a Jira user "adichad" but this account has no activities in Lucene
> Jira. See:
> https://issues.apache.org/jira/secure/ViewProfile.jspa?name=adichad
>
> I wonder if you have multiple Jira accounts and you use another account for
> Lucene? For example, there is a Jira user "abakle" - this has activities in
> LUCENE.
>
> Tomoko