Re: https://issues.apache.org/jira/browse/LUCENE-8448

2020-11-13 Thread baris . kazar
Great answer, thanks Michael. Yes, the difference was too much, > 1G.
Best regards
> On Nov 13, 2020, at 1:49 PM, Michael Sokolov wrote:
>
> You can't directly compare disk usage across two indexes, even with
> the same data. Try re-indexing one of your datasets, and you will see
> that the disk

Re: https://issues.apache.org/jira/browse/LUCENE-8448

2020-11-13 Thread Michael Sokolov
You can't directly compare disk usage across two indexes, even with the same data. Try re-indexing one of your datasets, and you will see that the disk size is not the same. Mostly this is due to the way segments are merged varying with some randomness from one run to another, although the size of

Re: https://issues.apache.org/jira/browse/LUCENE-8448

2020-11-13 Thread baris . kazar
Nothing changed between the two index generations except that the data changed a bit, as I described. What I am reporting is the size of the directory where all index files are stored, once Lucene is done generating the index. I don't know about deleted docs. How do you trace that? Yes, the queries

Re: https://issues.apache.org/jira/browse/LUCENE-8448

2020-11-13 Thread Erick Erickson
What does “final finished sizes” mean? After optimize, or just after finishing all indexing? The former is what counts here. And you provided no information on the number of deleted docs in the two cases. Is the number of deletedDocs the same (or close)? And does the q=*:* query return the same

Re: https://issues.apache.org/jira/browse/LUCENE-8448

2020-11-12 Thread baris . kazar
Hi,- Thanks. These are the final finished sizes in both cases.
Best regards
> On Nov 12, 2020, at 11:12 PM, Erick Erickson wrote:
>
> Yes, that issue is fixed. The “Resolution” tag is the key, it’s marked
> “fixed” and the version is 8.0
>
> As for your other question, index size is a very

Re: https://issues.apache.org/jira/browse/LUCENE-8448

2020-11-12 Thread Erick Erickson
are merged away.
Best, Erick
> On Nov 12, 2020, at 5:35 PM, baris.ka...@oracle.com wrote:
>
> https://issues.apache.org/jira/browse/LUCENE-8448
>
> Hi,-
>
> is this issue fixed please? Could You please help me figure i

Re: https://issues.apache.org/jira/browse/LUCENE-8448

2020-11-12 Thread baris . kazar
On a related issue, with version 7.7.2 I experienced this: in one case the data is all lower case (same number of docs as the next case, though), and in the other the data is camel case except for the last word, which is always in capital letters. But I used the lowercase filter in the indexer in both cases, so indexing is done with

https://issues.apache.org/jira/browse/LUCENE-8448

2020-11-12 Thread baris . kazar
https://issues.apache.org/jira/browse/LUCENE-8448 Hi,- Is this issue fixed, please? Could you please help me figure it out? Best regards