msokolov commented on PR #13153: URL: https://github.com/apache/lucene/pull/13153#issuecomment-2004334853
After disabling this for fields with positions, luceneutil perf looks pretty flat. I think it simply doesn't have any test cases that would exercise this. I wrote a small benchmark that indexes a lot of dense terms of different frequencies and then runs top-100 searches on them using randomly generated boolean queries designed to exploit those fields. This does seem to show a significant difference: an index size reduction of 25% and a 30% improvement in QPS. The test is artificial, there's a fair amount of variance (although I used a fixed random seed and multiple runs), and the index is pretty small.

But I think it shows promise, so I'll try to figure out how to get a Wikipedia-based test into luceneutil. I'm not sure what realistic data there is to support it in linedocs. Perhaps if we index Month and Year as docs-only fields we would see some impact on queries with those as filters?
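
For readers who want to picture the shape of such a synthetic benchmark, here is a minimal sketch. It is not the benchmark code referred to above and does not use Lucene at all: posting lists are modelled as `java.util.BitSet`s, and the class name `DenseTermBench`, the density values, the document count, and the query count are all hypothetical choices made for illustration. It only shows the three ingredients described in the comment: dense terms of different frequencies, randomly generated conjunctive queries from a fixed seed, and a top-100 cutoff.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;
import java.util.Random;

/** Toy stand-in for a dense-term benchmark; all names and constants here are hypothetical. */
final class DenseTermBench {

  /** Builds one posting list per term; term i matches each doc with probability densities[i]. */
  public static BitSet[] buildPostings(int numDocs, double[] densities, Random random) {
    BitSet[] postings = new BitSet[densities.length];
    for (int t = 0; t < densities.length; t++) {
      postings[t] = new BitSet(numDocs);
      for (int doc = 0; doc < numDocs; doc++) {
        if (random.nextDouble() < densities[t]) {
          postings[t].set(doc);
        }
      }
    }
    return postings;
  }

  /** Picks `clauses` distinct term ids at random (partial Fisher-Yates shuffle). */
  public static int[] randomConjunction(int numTerms, int clauses, Random random) {
    int[] ids = new int[numTerms];
    for (int i = 0; i < numTerms; i++) {
      ids[i] = i;
    }
    for (int i = 0; i < clauses; i++) {
      int j = i + random.nextInt(numTerms - i);
      int tmp = ids[i];
      ids[i] = ids[j];
      ids[j] = tmp;
    }
    return Arrays.copyOf(ids, clauses);
  }

  /** Intersects the posting lists of the query terms and returns the first topN doc ids. */
  public static List<Integer> topDocs(BitSet[] postings, int[] query, int topN) {
    BitSet hits = (BitSet) postings[query[0]].clone();
    for (int i = 1; i < query.length; i++) {
      hits.and(postings[query[i]]);
    }
    List<Integer> top = new ArrayList<>();
    for (int doc = hits.nextSetBit(0); doc >= 0 && top.size() < topN; doc = hits.nextSetBit(doc + 1)) {
      top.add(doc);
    }
    return top;
  }

  public static void main(String[] args) {
    Random random = new Random(42); // fixed seed so repeated runs see the same data and queries
    double[] densities = {0.9, 0.5, 0.25, 0.1, 0.05}; // assumed mix of dense and sparser terms
    BitSet[] postings = buildPostings(100_000, densities, random);

    int numQueries = 1_000;
    long returned = 0;
    long start = System.nanoTime();
    for (int q = 0; q < numQueries; q++) {
      int[] query = randomConjunction(densities.length, 2, random);
      returned += topDocs(postings, query, 100).size();
    }
    double seconds = (System.nanoTime() - start) / 1e9;
    System.out.printf("queries=%d returned=%d qps=%.1f%n", numQueries, returned, numQueries / seconds);
  }
}
```

A real measurement would replace the `BitSet` model with an actual index and searcher, and would need warm-up iterations and multiple runs to get the variance under control; the numbers printed by this toy say nothing about the PR itself.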