Re: Processing query clause combinations at indexing time

2020-12-14 Thread Atri Sharma
+1

I would suggest that this be an independent project hosted on GitHub (there
have been similar projects in the past that have seen success that way).


Re: Processing query clause combinations at indexing time

2020-12-14 Thread David Smiley
Great optimization!

I'm dubious about it being a good contribution to Lucene itself, however,
because what you propose fits cleanly above Lucene.  Even at an ES/Solr
layer (which I know you don't use, but hypothetically speaking), I'm
dubious there as well.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


Processing query clause combinations at indexing time

2020-12-14 Thread Michael Froh
My team at work has a neat feature that we've built on top of Lucene that
has provided a substantial (20%+) increase in maximum QPS and some
reduction in query latency.

Basically, we run a training process that looks at historical queries to
find frequently co-occurring combinations of required clauses, say "+A +B
+C +D". Then at indexing time, if a document satisfies one of these known
combinations, we add a new term to the doc, like "opto:ABCD". At query
time, we can then replace the required clauses with a single TermQuery for
the "optimized" term.

It adds a little bit of extra work at indexing time and requires the
offline training step, but we've found that it yields a significant boost
at query time.

We're interested in open-sourcing this feature. Is it something worth
adding to Lucene? Since it doesn't require any core changes, maybe as a
module?
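
The index-time and query-time steps described above can be sketched roughly as follows. This is only an illustration, not the team's implementation: clauses are modeled as plain strings rather than Lucene queries, and the class name `ClauseComboSketch`, the helper names, and the `opto` field constant are all assumptions made for the example. In a real Lucene setup the rewritten clause would presumably be a `TermQuery` on the extra field.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: clause names stand in for real Lucene clauses.
public class ClauseComboSketch {
    // Assumed field name, following the "opto:ABCD" example in the message.
    static final String OPT_FIELD = "opto";

    // Canonical term for a combination: clause names sorted and concatenated.
    static String optTerm(Set<String> combo) {
        return OPT_FIELD + ":" + String.join("", new TreeSet<>(combo));
    }

    // Indexing time: for every known combination the document satisfies,
    // produce one extra term to add to the document.
    static List<String> extraTerms(Set<String> docTerms, List<Set<String>> combos) {
        List<String> extra = new ArrayList<>();
        for (Set<String> combo : combos) {
            if (docTerms.containsAll(combo)) {
                extra.add(optTerm(combo));
            }
        }
        return extra;
    }

    // Query time: if the required clauses contain a known combination,
    // replace those clauses with the single optimized term.
    // Only the first matching combination is applied in this sketch.
    static Set<String> rewrite(Set<String> required, List<Set<String>> combos) {
        Set<String> out = new TreeSet<>(required);
        for (Set<String> combo : combos) {
            if (required.containsAll(combo)) {
                out.removeAll(combo);
                out.add(optTerm(combo));
                break;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // The combination list would come from the offline training step.
        List<Set<String>> combos = List.of(Set.of("A", "B", "C", "D"));
        System.out.println(extraTerms(Set.of("A", "B", "C", "D", "E"), combos));
        System.out.println(rewrite(Set.of("A", "B", "C", "D", "X"), combos));
    }
}
```

The rewrite matches the same documents only if every document in the index was indexed with the same combination list that the query side uses, so a change to the trained combinations would presumably require reindexing or a fallback to the original clauses.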


Re: Problems creating collections on branch_8x due to SSL errors

2020-12-14 Thread Timothy Potter
I just merged a fix; please re-pull. See
https://issues.apache.org/jira/browse/SOLR-15046 for details.


Problems creating collections on branch_8x due to SSL errors

2020-12-14 Thread Joel Bernstein
I did a pull this morning and checked out branch_8x and then did the
following:

ant server
bin/solr start -c
bin/solr create -c test -s 1 -d _default

I get the following error in the logs. Jason Gerlowski confirmed he is
seeing it as well. Anyone know the cause of this? If not I'll create a
ticket.

2020-12-14 16:33:15.463 INFO (OverseerStateUpdate-72065849883426816-10.0.0.238:8983_solr-n_00) [   ] o.a.s.c.o.SliceMutator createReplica() {
  "operation":"ADDREPLICA",
  "collection":"test",
  "shard":"shard1",
  "core":"test_shard1_replica_n1",
  "state":"down",
  "node_name":"10.0.0.238:8983_solr",
  "type":"NRT",
  "waitForFinalState":"false"}

2020-12-14 16:33:15.736 ERROR (OverseerThreadFactory-18-thread-1-processing-n:10.0.0.238:8983_solr) [   ] o.a.s.c.a.c.OverseerCollectionMessageHandler Error from shard: https://10.0.0.238:8983/solr => org.apache.solr.client.solrj.SolrServerException: IOException occurred when talking to server at: https://10.0.0.238:8983/solr
    at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:695)
org.apache.solr.client.solrj.SolrServerException: IOException occurred when talking to server at: https://10.0.0.238:8983/solr
    at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:695) ~[?:?]
    at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:266) ~[?:?]
    at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:248) ~[?:?]
    at org.apache.solr.client.solrj.SolrClient.request(SolrClient.java:1290) ~[?:?]
    at org.apache.solr.handler.component.HttpShardHandlerFactory$1.request(HttpShardHandlerFactory.java:169) ~[?:?]
    at org.apache.solr.handler.component.ShardRequestor.call(ShardRequestor.java:130) ~[?:?]
    at org.apache.solr.handler.component.ShardRequestor.call(ShardRequestor.java:41) ~[?:?]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_271]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_271]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_271]
    at com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:180) ~[metrics-core-4.1.5.jar:4.1.5]
    at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:218) ~[?:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_271]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_271]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_271]
Caused by: javax.net.ssl.SSLException: Unsupported or unrecognized SSL message
    at sun.security.ssl.SSLSocketInputRecord.handleUnknownRecord(SSLSocketInputRecord.java:448) ~[?:1.8.0_271]
    at sun.security.ssl.SSLSocketInputRecord.decode(SSLSocketInputRecord.java:174) ~[?:1.8.0_271]
    at sun.security.ssl.SSLTransport.decode(SSLTransport.java:110) ~[?:1.8.0_271]
    at sun.security.ssl.SSLSocketImpl.decode(SSLSocketImpl.java:1279) ~[?:1.8.0_271]
    at sun.security.ssl.SSLSocketImpl.readHandshakeRecord(SSLSocketImpl.java:1188) ~[?:1.8.0_271]
    at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:401) ~[?:1.8.0_271]
    at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:373) ~[?:1.8.0_271]
    at org.apache.http.conn.ssl.SSLConnectionSocketFactory.createLayeredSocket(SSLConnectionSocketFactory.java:436) ~[?:?]
    at org.apache.http.conn.ssl.SSLConnectionSocketFactory.connectSocket(SSLConnectionSocketFactory.java:384) ~[?:?]
    at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:142) ~[?:?]
    at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:376) ~[?:?]
    at org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:393) ~[?:?]
    at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:236) ~[?:?]
    at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186) ~[?:?]
    at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89) ~[?:?]
    at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110) ~[?:?]
    at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185) ~[?:?]
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83) ~[?:?]
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56) ~[?:?]
    at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:571) ~[?:?]

Joel Bernstein