best practices for generating queries from users' questions?

2023-09-20 Thread qrdl kaggle
Given a knowledge base indexed by Lucene, users often phrase their searches as questions. Is there a good reference (code, paper, or doc) on how to translate those natural-language questions into an effective and accurate Lucene query?
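For what it's worth, one common baseline (not necessarily the best practice being asked about) is to drop question words and other stop words, escape Lucene's special characters, and OR the remaining terms together on one field. Below is a minimal sketch using only the Java standard library. The class name QuestionQueryBuilder, the tiny stop-word list, and the field name "text" are all assumptions for illustration. With Lucene on the classpath, its own QueryParser.escape and the analyzer's stop-word handling would be the more natural choice.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper: turns a natural-language question into a
// classic-syntax Lucene query string of the form "field:a OR field:b".
final class QuestionQueryBuilder {

    // Assumption: a deliberately tiny stop-word list, for illustration only.
    private static final Set<String> STOP_WORDS = Set.of(
        "a", "an", "the", "is", "are", "was", "were", "of", "in", "on", "to", "for",
        "what", "which", "who", "how", "why", "when", "where", "do", "does", "did");

    // Characters that have special meaning in Lucene's classic query syntax.
    private static final String SPECIAL = "+-&|!(){}[]^\"~*?:\\/";

    // Backslash-escape every special character in a single term.
    static String escape(String term) {
        StringBuilder sb = new StringBuilder();
        for (char c : term.toCharArray()) {
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    // Lowercase, split on whitespace, trim surrounding punctuation,
    // drop stop words and duplicates, then join the clauses with OR.
    static String toQuery(String question, String field) {
        List<String> clauses = new ArrayList<>();
        for (String raw : question.toLowerCase(Locale.ROOT).split("\\s+")) {
            String term = raw.replaceAll("^[^\\p{L}\\p{N}]+|[^\\p{L}\\p{N}]+$", "");
            if (term.isEmpty() || STOP_WORDS.contains(term)) {
                continue;
            }
            String clause = field + ":" + escape(term);
            if (!clauses.contains(clause)) {
                clauses.add(clause);
            }
        }
        return String.join(" OR ", clauses);
    }

    public static void main(String[] args) {
        // prints: text:capital OR text:france
        System.out.println(toQuery("What is the capital of France?", "text"));
    }
}
```

A pure OR query like this favors recall. Ranking then depends on the index's scoring, so it is a starting point to measure against rather than a tuned solution.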

forceMerge(1) leads to ~10% perf gains

2023-09-21 Thread qrdl kaggle
After testing on 4800 fairly complex queries, I see a performance gain of roughly 10% after calling indexWriter.forceMerge(1); indexWriter.commit(); the average went from 209 ms per query to 185 ms per query. Queries are quite complex, often 30 or so words, of the format OR text: It went from 214 to 14 files on the for