Are docValues useful for FilterQueries?

2019-07-08 Thread Ashwin Ramesh
Hi everybody, I can't find concrete evidence on whether docValues are indeed useful for filter queries. One example of a field: This field will have a value between 0 and 1. The only use case for this field is to filter on a range / subset of values. There will be no scoring / querying on this

Re: Are docValues useful for FilterQueries?

2019-07-08 Thread Erick Erickson
DocValues are irrelevant for scoring. Here’s the way I think of it. When querying (and thus scoring), you have a term X. I need to know: what docs does it appear in? How many docs does it appear in? How often does the term appear in the entire corpus? These are questions the inverted index
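The three questions above can be sketched with a toy inverted index. This is only an illustration in Python, not how Lucene stores its postings; the sample documents and the helper name `build_inverted_index` are made up for the example.

```python
from collections import Counter, defaultdict


def build_inverted_index(docs):
    """Map each term to a Counter of {doc_id: occurrences in that doc}."""
    index = defaultdict(Counter)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term][doc_id] += 1
    return index


# Hypothetical three-document corpus, for illustration only.
docs = {
    1: "solr uses lucene",
    2: "lucene builds an inverted index",
    3: "solr solr solr",
}
index = build_inverted_index(docs)

postings = index["solr"]                  # term X = "solr"
docs_with_term = sorted(postings)         # what docs does it appear in?
doc_freq = len(postings)                  # how many docs does it appear in?
total_term_freq = sum(postings.values())  # how often in the entire corpus?

print(docs_with_term, doc_freq, total_term_freq)
```

The lookup direction is term to documents, which is what the three questions need.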

Understanding DebugQuery

2019-07-08 Thread Paresh Khandelwal
Hi All, I tried to get the debug information about the query for my INNER JOIN and ACROSS JOIN and am trying to understand it. See the query below - 1487 msec { "responseHeader":{ "status":0, "QTime":1487, "params":{ "q":"*:*",

How to read DebugQuery output

2019-07-08 Thread Paresh Khandelwal
Hi All, I tried to get the debug information about the query for my INNER JOIN and ACROSS JOIN and am trying to understand it. See the query below - 1487 msec { "responseHeader":{ "status":0, "QTime":1487, "params":{ "q":"*:*",

Re: Solr 6.6.0 - DIH - Multiple entities - Multiple DBs

2019-07-08 Thread Alexandre Rafalovitch
You may also want to look at the existing systems, such as https://nifi.apache.org/ Regards, Alex. On Mon, 8 Jul 2019 at 08:23, Joseph_Tucker wrote: > > Thanks again. > > I guess I'll have to start researching how to create such custom indexing > scripts and determine which language would be

Re: Facet Query performance

2019-07-08 Thread Shawn Heisey
On 7/8/2019 3:08 AM, Midas A wrote: > I have enabled docValues on the facet field but the query is still taking time. > How can I improve the query time? > docValues="true" multiValued="true" termVectors="true" /> > *Query: * There's very little information here -- only a single field definition and

Re: Solr 6.6.0 - DIH - Multiple entities - Multiple DBs

2019-07-08 Thread Jörn Franke
Ideally you use scripts that can run on the JVM/Java - this way you can always use the latest SolrJ client library as well as other relevant libraries (e.g. Tika for unstructured content). This does not have to be Java directly but can also be based on Scala or JVM script languages, such as

Re: Relevance by term position

2019-07-08 Thread Erick Erickson
To reinforce what Alex said, payloads have first-class Solr support as of Solr 6.6, see: https://lucidworks.com/post/solr-payloads/ > On Jul 8, 2019, at 7:15 AM, Jay Potharaju wrote: > > Thanks, use of payloads works for my use case. > Jay > >> On Jun 28, 2019, at 6:46 AM, Alexandre

Re: Relevance by term position

2019-07-08 Thread Jay Potharaju
Thanks, use of payloads works for my use case. Jay > On Jun 28, 2019, at 6:46 AM, Alexandre Rafalovitch wrote: > > This past thread may be relevant: > https://markmail.org/message/aau6bjllkpwcpmro > It suggests that using SpanFirst of XMLQueryParser will have automatic > boost for earlier

Facet Query performance

2019-07-08 Thread Midas A
Hi, I have enabled docValues on the facet field but the query is still taking time. How can I improve the query time? *Query: * http://X.X.X.X:

Creating HttpEntityEnclosingRequestBase with a repeatable entity

2019-07-08 Thread Tomer Shahar
Hi. I'm using solrj (7.3.1). I encountered an error for delete queries that fail on an unauthorized exception. I noticed other requests succeed. I managed to track it down to NTLM authentications. org.apache.http.impl.execchain.MainClientExec (line 315) will remove authentication headers

Solr 7.7 autoscaling trigger

2019-07-08 Thread Mark Thill
My scenario is:
- 60 GB collection
- 2 shards of ~30GB
- Each shard having 2 replicas so I have a backup
- So I have 4 nodes with each node holding a single core
My goal is to have autoscaling handle when I lose a node. So upon loss of a node the nodeLost event deletes the node.

Re: Facet Query performance

2019-07-08 Thread Shawn Heisey
On 7/8/2019 12:00 PM, Midas A wrote: > Number of Docs :50+ docs > Index Size: 300 GB > RAM: 256 GB > JVM: 32 GB Half a million documents producing an index size of 300GB suggests *very* large documents. That typically produces an index with fields that have very high cardinality, due to text
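A rough back-of-the-envelope check of the "very large documents" remark, as a sketch only. It assumes "half a million" means 500,000 documents and uses decimal units (1 GB = 1,000,000 KB). Index size is not the same as raw document size, so the result is an order-of-magnitude figure.

```python
# Figures taken from the message above; the exact document count is an assumption.
index_size_gb = 300
doc_count = 500_000  # "half a million"

# Decimal units: 1 GB = 1,000,000 KB.
avg_kb_per_doc = index_size_gb * 1_000_000 / doc_count

print(f"average index footprint per document: {avg_kb_per_doc:.0f} KB")
```

Under these assumptions each document accounts for roughly 600 KB of index.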

Creating HttpEntityEnclosingRequestBase with a repeatable entity

2019-07-08 Thread Avi Steiner
Hi. I'm using solrj (7.3.1). I encountered an error for delete queries that fail on an unauthorized exception. I noticed other requests succeed. I managed to track it down to NTLM authentications. org.apache.http.impl.execchain.MainClientExec (line 315) will remove authentication headers

Re: Facet Query performance

2019-07-08 Thread Midas A
Hi, how can I know whether docValues are being used or not? Please help me here. On Mon, Jul 8, 2019 at 2:38 PM Midas A wrote: > Hi , > > I have enabled docvalues on facet field but query is still taking time. > > How i can improve the Query time . > docValues="true" multiValued="true"

Re: Solr 6.6.0 - DIH - Multiple entities - Multiple DBs

2019-07-08 Thread Joseph_Tucker
Thanks again. I guess I'll have to start researching how to create such custom indexing scripts and determine which language would be best based on the environment I'm using (Azure in this case). Appreciate the help greatly. Charlie Hull-3 wrote > On 05/07/2019 14:33, Joseph_Tucker wrote: