Re: Best strategy migrate indexes

2022-11-07 Thread Michael Sokolov
The error you got, BufferedChecksumIndexInput(MMapIndexInput(path="tests_small_index-7.x-migrator\segments_1"))): 9 (needs to be between 6 and 7), indicates that the index you are reading was written by Lucene 9, so things are not set up the way you described (writing using Lucene 7). > Thanks TX

RE: [EXT] Re: Efficient sort on SortedDocValues

2022-11-07 Thread Solodin, Andrei (TR Technology)
Ah, of course. Thanks Mikhail. I realized it was a silly question that only made sense to me since my query was MatchAll docs. -Original Message- From: Mikhail Khludnev Sent: Monday, November 7, 2022 2:44 AM To: java-user@lucene.apache.org Subject: [EXT] Re: Efficient sort on

Re: Best strategy migrate indexes

2022-11-07 Thread Pablo Vázquez Blázquez
Thanks TX for your response. > I would check that the Luke version matches the Lucene version - if the two match, it shouldn't be possible to get issues like this. That is, the precise versions of Lucene each is using. Yes, I am using https://github.com/DmitryKey/luke/releases/tag/luke-7.1.0

Re: Best strategy migrate indexes

2022-11-07 Thread Trejkaz
The process itself sounds like it should work (it's basically a reindex so it should be safer than trying to migrate directly.) I would check that the Luke version matches the Lucene version - if the two match, it shouldn't be possible to get issues like this. That is, the precise versions of

Re: Best strategy migrate indexes

2022-11-07 Thread Pablo Vázquez Blázquez
Hi! I am trying to create a tool to read docs from a Lucene 5 index and generate Lucene 9 documents from them (with docValues). That might work, right? I am shading both Lucene 5 and Lucene 9 to avoid package conflicts. I am doing the following steps: - create IndexReader with lucene5 package

Re: Learning Lucene from ground up

2022-11-07 Thread Adrien Grand
+1 to MyCoy's suggestion. To answer your most immediate questions: - Lucene mostly loads metadata in memory at the time of opening a segment (dvm, tmd, fdm, vem, nvm, kdm files), other files are memory-mapped and Lucene relies on the filesystem cache to have their data efficiently available.
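
For readers unfamiliar with memory-mapping, here is a minimal sketch using only the JDK, not Lucene's own code. The class name MmapSketch and the method roundTrip are hypothetical and exist only for illustration. It writes a few bytes to a temporary file, maps the file read-only, and reads one byte back; the operating system pages the data in on demand and keeps it in the filesystem cache, which is the behaviour the message above relies on.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration class; not part of Lucene.
public class MmapSketch {

    /** Writes data to a temp file, memory-maps it read-only, and returns the byte at offset. */
    static byte roundTrip(byte[] data, int offset) {
        try {
            Path tmp = Files.createTempFile("mmap-sketch", ".bin");
            try {
                Files.write(tmp, data);
                try (FileChannel channel = FileChannel.open(tmp, StandardOpenOption.READ)) {
                    // No read() call copies the file into the heap here:
                    // the OS pages the data in when the buffer is accessed.
                    MappedByteBuffer buffer =
                            channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
                    return buffer.get(offset);
                }
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(new byte[] {10, 20, 30}, 1));
    }
}
```

Small metadata files, by contrast, would simply be read into ordinary heap objects when the segment is opened.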

Re: Efficient sort on SortedDocValues

2022-11-07 Thread Adrien Grand
Hi Andrei, The case that you are describing got optimized in Lucene 9.4.0 in the case when your field is also indexed with a StringField: https://github.com/apache/lucene/pull/1023. See annotation ER at http://people.apache.org/~mikemccand/lucenebench/TermMonthSort.html. The way it works is that

Re: Efficient sort on SortedDocValues

2022-11-07 Thread Mikhail Khludnev
Hello, Andrei. Docs are scored in-order (see Weight.scoreAll(), scoreRange()), simply because the underlying postings API is in-order. There are a few shortcuts/optimizations, but they only omit some iterations/segments, such as checking competitive scores and so on. On Sun, Nov 6, 2022 at 1:35 AM
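
To illustrate the point about in-order scoring, here is a rough sketch in plain Java. It is not Lucene's implementation, and the class InOrderTopK and its topK method are made up for this example. Documents are visited strictly by increasing doc ID, and a small min-heap keeps the best k scores; a document that cannot beat the weakest kept score is skipped, which is the kind of "competitive score" shortcut mentioned above.

```java
import java.util.PriorityQueue;

// Hypothetical illustration class; not part of Lucene.
public class InOrderTopK {

    /** Visits doc IDs 0..scores.length-1 in order and returns the k best scores, highest first. */
    static float[] topK(float[] scores, int k) {
        if (k <= 0) {
            return new float[0];
        }
        // Min-heap: the head is the weakest score currently kept.
        PriorityQueue<Float> heap = new PriorityQueue<>();
        for (int doc = 0; doc < scores.length; doc++) { // strictly in doc-ID order
            float score = scores[doc];
            if (heap.size() < k) {
                heap.add(score);
            } else if (score > heap.peek()) { // competitive: beats the weakest kept score
                heap.poll();
                heap.add(score);
            }
            // Otherwise the doc is not competitive and is skipped.
        }
        float[] result = new float[heap.size()];
        for (int i = result.length - 1; i >= 0; i--) {
            result[i] = heap.poll(); // polls ascending, so fill from the back
        }
        return result;
    }

    public static void main(String[] args) {
        float[] best = topK(new float[] {0.5f, 2.0f, 1.0f, 3.0f}, 2);
        System.out.println(best[0] + " " + best[1]);
    }
}
```

Note that the iteration order never changes; the shortcut only reduces the work done per document.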