Re: Why use a different analyzer for "index" and "query"?

2020-09-10 Thread Tim Casey
People usually want to do some analysis during index time. This analysis should be considered 'expensive', compared to any single query run. You can think of it as indexing every day, over an 86,400-second day, vs a 200 ms query time. Normally, you want to index as honestly as possible. That is,

Re: Why use a different analyzer for "index" and "query"?

2020-09-10 Thread Walter Underwood
It is very common for us to do more processing in the index analysis chain. In general, we do that when we want additional terms in the index to be searchable. Some examples: * synonyms: If the book title is “EMT” add “Emergency Medical Technician”. * ngrams: For prefix matching, generate all

Re: Why use a different analyzer for "index" and "query"?

2020-09-10 Thread Erick Erickson
When you want to do something different at index and query time. There, an answer that’s almost, but not quite, completely useless while being accurate ;) A concrete example is synonyms, as have been mentioned. Say you have an index-time synonym definition of A,B,C. These three tokens will be

Re: Why use a different analyzer for "index" and "query"?

2020-09-10 Thread Stavros Macrakis
I gave an example of why you might want to analyze the corpus differently from the query just yesterday -- see https://lucene.472066.n3.nabble.com/Lowercase-ing-everything-but-acronyms-td4462899.html -s On Thu, Sep 10, 2020 at 11:19 AM Steven White wrote: > Hi everyone, > > In

Re: Why use a different analyzer for "index" and "query"?

2020-09-10 Thread Alexandre Rafalovitch
There are a lot of different use cases, and separate analyzers for indexing and querying are part of Solr's power. For example, you could apply n-grams at index time to generate multiple substrings. But you don't want to do that during the query, because otherwise you are matching on 'shared
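The n-gram case can be sketched in a few lines of Python. This is a toy model, not Solr code: the function names, the whitespace tokenizer, the prefix-only (edge) n-grams, and the "any shared term" matching rule in the last function are all assumptions made for illustration.

```python
def edge_ngrams(token, min_len=2, max_len=10):
    """Return the prefixes of token with lengths min_len..max_len."""
    upper = min(len(token), max_len)
    return [token[:n] for n in range(min_len, upper + 1)]


def index_analyzer(text):
    """Index side (assumed chain): lowercase, split on whitespace, expand to prefixes."""
    terms = set()
    for token in text.lower().split():
        terms.update(edge_ngrams(token))
    return terms


def query_analyzer(text):
    """Query side (assumed chain): lowercase and split only, no n-gram expansion."""
    return set(text.lower().split())


def matches(doc_text, query_text):
    """A document matches when every query term is among its indexed terms."""
    return query_analyzer(query_text) <= index_analyzer(doc_text)


def ngrams_at_query_time_match(doc_text, query_text):
    """What happens if the query is n-grammed too and any shared term counts as a hit."""
    return bool(index_analyzer(query_text) & index_analyzer(doc_text))


doc = "Solr search"
prefix_hit = matches(doc, "sol")                           # prefix of an indexed word
exact_miss = matches(doc, "solar")                         # not a prefix of any indexed word
false_positive = ngrams_at_query_time_match(doc, "solar")  # shares "so" and "sol" with "solr"
```

With asymmetric analysis, "sol" finds the document and "solar" does not. Once the query is expanded as well, "solar" and "solr" share the fragments "so" and "sol", so the unrelated query matches.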

Re: Why use a different analyzer for "index" and "query"?

2020-09-10 Thread Thomas Corthals
Hi Steve I have a real-world use case. We don't apply a synonym filter at index time, but we do apply a managed synonym filter at query time. This allows content managers to add new synonyms (or remove existing ones) "on the fly" without having to reindex any documents. Thomas Op do 10 sep.
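A rough Python sketch of why query-time synonyms avoid reindexing. It is a toy inverted index, not Solr's managed synonym filter: the function names, the sample documents, and the single-word synonym map are hypothetical.

```python
def analyze(text):
    """Shared base analysis (assumed): lowercase and split on whitespace."""
    return text.lower().split()


def build_index(docs):
    """Build a term -> set of doc ids map. No synonyms are applied here."""
    index = {}
    for doc_id, text in docs.items():
        for term in analyze(text):
            index.setdefault(term, set()).add(doc_id)
    return index


def search(index, query, synonyms):
    """OR each query term with its synonyms, then AND across query terms."""
    result = None
    for term in analyze(query):
        candidates = {term} | synonyms.get(term, set())
        hits = set()
        for candidate in candidates:
            hits |= index.get(candidate, set())
        result = hits if result is None else result & hits
    return result or set()


docs = {1: "laptop sleeve", 2: "notebook bag"}   # hypothetical documents
index = build_index(docs)                        # built once, never rebuilt below

synonyms = {}                                    # editable at any time
before = search(index, "laptop", synonyms)

synonyms["laptop"] = {"notebook"}                # a content manager adds a synonym
after = search(index, "laptop", synonyms)
```

The index is identical for both searches. Only the synonym map consulted at query time changes, so the new synonym takes effect on the next query.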

RE: Why use a different analyzer for "index" and "query"?

2020-09-10 Thread Dunham-Wilkie, Mike CITZ:EX
Hi Steven, I can think of one case. If we have an index of database table or column names, e.g., words like 'THIS_IS_A_TABLE_NAME', we may want to split the name at the underscores when indexing (as well as keep the original), since the individual parts might be significant and meaningful.
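The table-name case, as a small Python sketch. The index-side behavior follows the description above (split at underscores and keep the original). The query-side chain is an assumption, since it is not spelled out here: it only lowercases the input. All function names are hypothetical.

```python
def index_analyzer(name):
    """Index side: keep the whole lowercased name and add each underscore-separated part."""
    lowered = name.lower()
    parts = [part for part in lowered.split("_") if part]
    return {lowered, *parts}


def query_analyzer(text):
    """Query side (assumed): lowercase only, no splitting."""
    return {text.lower()}


def matches(indexed_name, query_text):
    """A name matches when every query term is among its indexed terms."""
    return query_analyzer(query_text) <= index_analyzer(indexed_name)


terms = index_analyzer("THIS_IS_A_TABLE_NAME")
part_hit = matches("THIS_IS_A_TABLE_NAME", "table")                 # one part of the name
full_hit = matches("THIS_IS_A_TABLE_NAME", "This_Is_A_Table_Name")  # the whole name
```

Both a single part such as "table" and the full name match, because the index holds the parts alongside the original while the query is left in one piece.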

Why use a different analyzer for "index" and "query"?

2020-09-10 Thread Steven White
Hi everyone, In Solr's schema, I have come across field types that use a different logic for "index" than for "query". To be clear, I'm talking about this block: Why would one want to not use the same logic for both and simply use:

Re: Unable to get test cases running in Intellij via maven / ant

2020-09-10 Thread Erick Erickson
I’ve had IntelliJ be uncooperative. Here are some things to try: 1> Just try again. First close the project, then “git clean -dxf” at the top level will get rid of _everything_ not in Git. Then “ant idea”. Then open/import. 2> Invalidate the IntelliJ caches (file>>invalidate caches) and restart

How to disable Zookeeper ACL

2020-09-10 Thread Amy Bai
Hi community, I enabled Zookeeper ACL according to https://lucene.apache.org/solr/guide/6_6/zookeeper-access-control.html. I wonder how to disable Zookeeper ACL?

Unable to get test cases running in Intellij via maven / ant

2020-09-10 Thread krishan goyal
Hi, I downloaded the Solr source from https://github.com/apache/lucene-solr and checked out branch_7_7. Configured IntelliJ using the steps on https://cwiki.apache.org/confluence/display/LUCENE/HowtoConfigureIntelliJ. Configured the project SDK too, as mentioned. Facing the following problems