LTR performance issues

2018-05-07 Thread ilayaraja
LTR with grouping results in very high latency (3x) even when re-ranking only
the top 24 groups.

How is re-ranking implemented in Solr? Is it expected that it would result
in 3x more query time?

Need clarification on:
1. How many top groups are actually re-ranked? Is it exactly what we pass in
reRankDocs?
2. How many documents within each group are re-ranked? Can we control that
with group.limit or some other parameter?

What causes LTR to take more time when grouping is performed? Is it scoring
the documents again, or merging the re-ranked docs with the rest of the docs?

Is there any way to optimize this?
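
For reference, a minimal shape of the kind of request being described, where
rq supplies the LTR re-rank query and reRankDocs caps how many top results
are re-scored (the query, field, and model names below are made up for
illustration):

q=ipod&group=true&group.field=brand&group.limit=5
&rq={!ltr model=myModel reRankDocs=24}&fl=id,score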







-
--Ilay
--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Async exceptions during distributed update

2018-05-07 Thread Jay Potharaju
I have about 3-5 updates per second.


> On May 7, 2018, at 5:02 PM, Shawn Heisey  wrote:
> 
>> On 5/7/2018 5:05 PM, Jay Potharaju wrote:
>> There are some deletes by query. I have not had any issues with DBQ,
>> currently have 5.3 running in production.
> 
> Here's the big problem with DBQ.  Imagine this sequence of events with
> these timestamps:
> 
> 13:00:00: A commit for change visibility happens.
> 13:00:00: A segment merge is triggered by the commit.
> (It's a big merge that takes exactly 3 minutes.)
> 13:00:05: A deleteByQuery is sent.
> 13:00:15: An update to the index is sent.
> 13:00:25: An update to the index is sent.
> 13:00:35: An update to the index is sent.
> 13:00:45: An update to the index is sent.
> 13:00:55: An update to the index is sent.
> 13:01:05: An update to the index is sent.
> 13:01:15: An update to the index is sent.
> 13:01:25: An update to the index is sent.
> {time passes, more updates might be sent}
> 13:03:00: The merge finishes.
> 
> Here's what would happen in this scenario:  The DBQ and all of the
> update requests sent *after* the DBQ will block until the merge
> finishes.  That means that it's going to take up to three minutes for
> Solr to respond to those requests.  If the client that is sending the
> request is configured with a 60 second socket timeout, which inter-node
> requests made by Solr are by default, then it is going to experience a
> timeout error.  The request will probably complete successfully once the
> merge finishes, but the connection is gone, and the client has already
> received an error.
> 
> Now imagine what happens if an optimize (forced merge of the entire
> index) is requested on an index that's 50GB.  That optimize may take 2-3
> hours, possibly longer.  A deleteByQuery started on that index after the
> optimize begins (and any updates requested after the DBQ) will pause
> until the optimize is done.  A pause of 2 hours or more is a BIG problem.
> 
> This is why deleteByQuery is not recommended.
> 
> If the deleteByQuery were changed into a two-step process involving a
> query to retrieve ID values and then one or more deleteById requests,
> then none of that blocking would occur.  The deleteById operation can
> run at the same time as a segment merge, so neither it nor subsequent
> update requests will have the significant pause.  From what I
> understand, you can even do commits in this scenario and have changes be
> visible before the merge completes.  I haven't verified that this is the
> case.
> 
> Experienced devs: Can we fix this problem with DBQ?  On indexes with a
> uniqueKey, can DBQ be changed to use the two-step process I mentioned?
> 
> Thanks,
> Shawn
> 


Re: Async exceptions during distributed update

2018-05-07 Thread Jay Potharaju
Thanks for explaining that Shawn!
Emir, I use a PHP library called Solarium to do updates/deletes to Solr. The
request is sent to any of the available nodes in the cluster.

> On May 7, 2018, at 5:02 PM, Shawn Heisey  wrote:
> 
>> On 5/7/2018 5:05 PM, Jay Potharaju wrote:
>> There are some deletes by query. I have not had any issues with DBQ,
>> currently have 5.3 running in production.
> 
> Here's the big problem with DBQ.  Imagine this sequence of events with
> these timestamps:
> 
> 13:00:00: A commit for change visibility happens.
> 13:00:00: A segment merge is triggered by the commit.
> (It's a big merge that takes exactly 3 minutes.)
> 13:00:05: A deleteByQuery is sent.
> 13:00:15: An update to the index is sent.
> 13:00:25: An update to the index is sent.
> 13:00:35: An update to the index is sent.
> 13:00:45: An update to the index is sent.
> 13:00:55: An update to the index is sent.
> 13:01:05: An update to the index is sent.
> 13:01:15: An update to the index is sent.
> 13:01:25: An update to the index is sent.
> {time passes, more updates might be sent}
> 13:03:00: The merge finishes.
> 
> Here's what would happen in this scenario:  The DBQ and all of the
> update requests sent *after* the DBQ will block until the merge
> finishes.  That means that it's going to take up to three minutes for
> Solr to respond to those requests.  If the client that is sending the
> request is configured with a 60 second socket timeout, which inter-node
> requests made by Solr are by default, then it is going to experience a
> timeout error.  The request will probably complete successfully once the
> merge finishes, but the connection is gone, and the client has already
> received an error.
> 
> Now imagine what happens if an optimize (forced merge of the entire
> index) is requested on an index that's 50GB.  That optimize may take 2-3
> hours, possibly longer.  A deleteByQuery started on that index after the
> optimize begins (and any updates requested after the DBQ) will pause
> until the optimize is done.  A pause of 2 hours or more is a BIG problem.
> 
> This is why deleteByQuery is not recommended.
> 
> If the deleteByQuery were changed into a two-step process involving a
> query to retrieve ID values and then one or more deleteById requests,
> then none of that blocking would occur.  The deleteById operation can
> run at the same time as a segment merge, so neither it nor subsequent
> update requests will have the significant pause.  From what I
> understand, you can even do commits in this scenario and have changes be
> visible before the merge completes.  I haven't verified that this is the
> case.
> 
> Experienced devs: Can we fix this problem with DBQ?  On indexes with a
> uniqueKey, can DBQ be changed to use the two-step process I mentioned?
> 
> Thanks,
> Shawn
> 


Re: Async exceptions during distributed update

2018-05-07 Thread Shawn Heisey
On 5/7/2018 5:05 PM, Jay Potharaju wrote:
> There are some deletes by query. I have not had any issues with DBQ,
> currently have 5.3 running in production.

Here's the big problem with DBQ.  Imagine this sequence of events with
these timestamps:

13:00:00: A commit for change visibility happens.
13:00:00: A segment merge is triggered by the commit.
(It's a big merge that takes exactly 3 minutes.)
13:00:05: A deleteByQuery is sent.
13:00:15: An update to the index is sent.
13:00:25: An update to the index is sent.
13:00:35: An update to the index is sent.
13:00:45: An update to the index is sent.
13:00:55: An update to the index is sent.
13:01:05: An update to the index is sent.
13:01:15: An update to the index is sent.
13:01:25: An update to the index is sent.
{time passes, more updates might be sent}
13:03:00: The merge finishes.

Here's what would happen in this scenario:  The DBQ and all of the
update requests sent *after* the DBQ will block until the merge
finishes.  That means that it's going to take up to three minutes for
Solr to respond to those requests.  If the client that is sending the
request is configured with a 60 second socket timeout, which inter-node
requests made by Solr are by default, then it is going to experience a
timeout error.  The request will probably complete successfully once the
merge finishes, but the connection is gone, and the client has already
received an error.

Now imagine what happens if an optimize (forced merge of the entire
index) is requested on an index that's 50GB.  That optimize may take 2-3
hours, possibly longer.  A deleteByQuery started on that index after the
optimize begins (and any updates requested after the DBQ) will pause
until the optimize is done.  A pause of 2 hours or more is a BIG problem.

This is why deleteByQuery is not recommended.

If the deleteByQuery were changed into a two-step process involving a
query to retrieve ID values and then one or more deleteById requests,
then none of that blocking would occur.  The deleteById operation can
run at the same time as a segment merge, so neither it nor subsequent
update requests will have the significant pause.  From what I
understand, you can even do commits in this scenario and have changes be
visible before the merge completes.  I haven't verified that this is the
case.

Experienced devs: Can we fix this problem with DBQ?  On indexes with a
uniqueKey, can DBQ be changed to use the two-step process I mentioned?

Thanks,
Shawn
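
For anyone wanting to try that two-step approach, below is a minimal SolrJ
sketch. The collection URL, the delete criteria "status:expired", and the
uniqueKey field "id" are assumptions for illustration, and a real
implementation would page through large result sets (e.g., with cursorMark)
rather than rely on a single query.

import java.util.ArrayList;
import java.util.List;
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrDocument;

public class TwoStepDelete {
  public static void main(String[] args) throws Exception {
    SolrClient solr =
        new HttpSolrClient.Builder("http://localhost:8983/solr/test").build();

    // Step 1: fetch only the uniqueKey values matching the delete criteria.
    SolrQuery q = new SolrQuery("status:expired"); // hypothetical criteria
    q.setFields("id");
    q.setRows(1000); // page with cursorMark for larger result sets
    List<String> ids = new ArrayList<>();
    for (SolrDocument doc : solr.query(q).getResults()) {
      ids.add((String) doc.getFieldValue("id"));
    }

    // Step 2: deleteById can run alongside a segment merge, so it avoids
    // the blocking described above.
    if (!ids.isEmpty()) {
      solr.deleteById(ids);
    }
    solr.close();
  }
}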



Re: Async exceptions during distributed update

2018-05-07 Thread Emir Arnautović
How many concurrent updates can be sent? Do you always send updates to the
same node? Do you use solrj?

Emir

On Tue, May 8, 2018, 1:02 AM Jay Potharaju  wrote:

> The updates are pushed in real time not batched. No complex analysis and
> everything is committed using autocommit settings in solr.
>
> Thanks
> Jay Potharaju
>
>
> On Mon, May 7, 2018 at 4:00 PM, Emir Arnautović <
> emir.arnauto...@sematext.com> wrote:
>
> > How do you send documents? Large batches? Complex analysis? Do you send
> all
> > batches to the same node? How do you commit? Do you delete by query while
> > indexing?
> >
> > Emir
> >
> > On Tue, May 8, 2018, 12:30 AM Jay Potharaju 
> wrote:
> >
> > > I didn't see any OOM errors in the logs on either of the nodes. I saw
> GC
> > > pause of 1 second on the box that was throwing error ...but nothing on
> > the
> > > other node. Any other recommendations?
> > > Thanks
> > >
> > >
> > > Thanks
> > > Jay Potharaju
> > >
> > >
> > > On Mon, May 7, 2018 at 9:48 AM, Jay Potharaju 
> > > wrote:
> > >
> > > > Ah thanks for explaining that!
> > > >
> > > > Thanks
> > > > Jay Potharaju
> > > >
> > > >
> > > > On Mon, May 7, 2018 at 9:45 AM, Emir Arnautović <
> > > > emir.arnauto...@sematext.com> wrote:
> > > >
> > > >> Node A receives batch of documents to index. It forwards documents
> to
> > > >> shards that are on the node B. Node B is having issues with GC so it
> > > takes
> > > >> a while to respond. Node A sees it as read timeout and reports it in
> > > logs.
> > > >> So the issue is on node B not node A.
> > > >>
> > > >> Emir
> > > >> --
> > > >> Monitoring - Log Management - Alerting - Anomaly Detection
> > > >> Solr & Elasticsearch Consulting Support Training -
> > http://sematext.com/
> > > >>
> > > >>
> > > >>
> > > >> > On 7 May 2018, at 18:39, Jay Potharaju 
> > wrote:
> > > >> >
> > > >> > Yes, the nodes are well balanced. I am just using these boxes for
> > > >> indexing
> > > >> > the data and is not serving any traffic at this time.  The error
> > > >> indicates
> > > >> > it is having issues errors on the shards that are hosted on the
> box
> > > and
> > > >> not
> > > >> > on the other box.
> > > >> > I will check GC logs to see if there were any issues.
> > > >> > thanks
> > > >> >
> > > >> > Thanks
> > > >> > Jay Potharaju
> > > >> >
> > > >> >
> > > >> > On Mon, May 7, 2018 at 9:34 AM, Emir Arnautović <
> > > >> > emir.arnauto...@sematext.com> wrote:
> > > >> >
> > > >> >> Hi Jay,
> > > >> >> My first guess would be that there was some major GC on other box
> > so
> > > it
> > > >> >> did not respond on time. Are your nodes well balanced - do they
> > serve
> > > >> equal
> > > >> >> amount of data?
> > > >> >>
> > > >> >> Thanks,
> > > >> >> Emir
> > > >> >> --
> > > >> >> Monitoring - Log Management - Alerting - Anomaly Detection
> > > >> >> Solr & Elasticsearch Consulting Support Training -
> > > >> http://sematext.com/
> > > >> >>
> > > >> >>
> > > >> >>
> > > >> >>> On 7 May 2018, at 18:11, Jay Potharaju 
> > > wrote:
> > > >> >>>
> > > >> >>> Hi,
> > > >> >>> I am seeing the following lines in the error log. My setup has 2
> > > >> nodes in
> > > >> >>> the solrcloud cluster, each node has 3 shards with no
> replication.
> > > >> From
> > > >> >> the
> > > >> >>> error log it seems like all the shards on this box are throwing
> > > async
> > > >> >>> exception errors. Other node in the cluster does not have any
> > errors
> > > >> in
> > > >> >> the
> > > >> >>> logs. Any suggestions on how to tackle this error?
> > > >> >>>
> > > >> >>> Solr setup
> > > >> >>> Solr:6.6.3
> > > >> >>> 2Nodes: 3 shards each
> > > >> >>>
> > > >> >>>
> > > >> >>> ERROR org.apache.solr.servlet.HttpSolrCall
> > [test_shard3_replica1] ?
> > > >> >>>
> null:org.apache.solr.update.processor.DistributedUpdateProcessor$
> > > >> >> DistributedUpdatesAsyncException:
> > > >> >>> Async exception during distributed update: Read timed out
> > > >> >>> at
> > > >> >>>
> > > org.apache.solr.update.processor.DistributedUpdateProcessor.doFinish(
> > > >> >> DistributedUpdateProcessor.java:972)
> > > >> >>> at
> > > >> >>> org.apache.solr.update.processor.DistributedUpdateProcessor.
> > finish(
> > > >> >> DistributedUpdateProcessor.java:1911)
> > > >> >>> at
> > > >> >>> org.apache.solr.handler.ContentStreamHandlerBase.
> > handleRequestBody(
> > > >> >> ContentStreamHandlerBase.java:78)
> > > >> >>> at
> > > >> >>> org.apache.solr.handler.RequestHandlerBase.handleRequest(
> > > >> >> RequestHandlerBase.java:173)
> > > >> >>> at org.apache.solr.core.SolrCore.execute(SolrCore.java:2477)
> > > >> >>> at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.
> > > >> java:723)
> > > >> >>> at org.apache.solr.servlet.HttpSolrCall.call(
> > HttpSolrCall.java:529)
> > > >> >>> at
> > > >> >>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(
> > > >> >> SolrDispatchFilter.java:361)
> > > >> >>> at
> > > >> >>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(
> > > >> >> SolrDispatchFilter.java:3

Re: Async exceptions during distributed update

2018-05-07 Thread Jay Potharaju
There are some deletes by query. I have not had any issues with DBQ,
currently have 5.3 running in production.

Thanks
Jay Potharaju


On Mon, May 7, 2018 at 4:02 PM, Jay Potharaju  wrote:

> The updates are pushed in real time not batched. No complex analysis and
> everything is committed using autocommit settings in solr.
>
> Thanks
> Jay Potharaju
>
>
> On Mon, May 7, 2018 at 4:00 PM, Emir Arnautović <
> emir.arnauto...@sematext.com> wrote:
>
>> How do you send documents? Large batches? Complex analysis? Do you send
>> all
>> batches to the same node? How do you commit? Do you delete by query while
>> indexing?
>>
>> Emir
>>
>> On Tue, May 8, 2018, 12:30 AM Jay Potharaju 
>> wrote:
>>
>> > I didn't see any OOM errors in the logs on either of the nodes. I saw GC
>> > pause of 1 second on the box that was throwing error ...but nothing on
>> the
>> > other node. Any other recommendations?
>> > Thanks
>> >
>> >
>> > Thanks
>> > Jay Potharaju
>> >
>> >
>> > On Mon, May 7, 2018 at 9:48 AM, Jay Potharaju 
>> > wrote:
>> >
>> > > Ah thanks for explaining that!
>> > >
>> > > Thanks
>> > > Jay Potharaju
>> > >
>> > >
>> > > On Mon, May 7, 2018 at 9:45 AM, Emir Arnautović <
>> > > emir.arnauto...@sematext.com> wrote:
>> > >
>> > >> Node A receives batch of documents to index. It forwards documents to
>> > >> shards that are on the node B. Node B is having issues with GC so it
>> > takes
>> > >> a while to respond. Node A sees it as read timeout and reports it in
>> > logs.
>> > >> So the issue is on node B not node A.
>> > >>
>> > >> Emir
>> > >> --
>> > >> Monitoring - Log Management - Alerting - Anomaly Detection
>> > >> Solr & Elasticsearch Consulting Support Training -
>> http://sematext.com/
>> > >>
>> > >>
>> > >>
>> > >> > On 7 May 2018, at 18:39, Jay Potharaju 
>> wrote:
>> > >> >
>> > >> > Yes, the nodes are well balanced. I am just using these boxes for
>> > >> indexing
>> > >> > the data and is not serving any traffic at this time.  The error
>> > >> indicates
>> > >> > it is having issues errors on the shards that are hosted on the box
>> > and
>> > >> not
>> > >> > on the other box.
>> > >> > I will check GC logs to see if there were any issues.
>> > >> > thanks
>> > >> >
>> > >> > Thanks
>> > >> > Jay Potharaju
>> > >> >
>> > >> >
>> > >> > On Mon, May 7, 2018 at 9:34 AM, Emir Arnautović <
>> > >> > emir.arnauto...@sematext.com> wrote:
>> > >> >
>> > >> >> Hi Jay,
>> > >> >> My first guess would be that there was some major GC on other box
>> so
>> > it
>> > >> >> did not respond on time. Are your nodes well balanced - do they
>> serve
>> > >> equal
>> > >> >> amount of data?
>> > >> >>
>> > >> >> Thanks,
>> > >> >> Emir
>> > >> >> --
>> > >> >> Monitoring - Log Management - Alerting - Anomaly Detection
>> > >> >> Solr & Elasticsearch Consulting Support Training -
>> > >> http://sematext.com/
>> > >> >>
>> > >> >>
>> > >> >>
>> > >> >>> On 7 May 2018, at 18:11, Jay Potharaju 
>> > wrote:
>> > >> >>>
>> > >> >>> Hi,
>> > >> >>> I am seeing the following lines in the error log. My setup has 2
>> > >> nodes in
>> > >> >>> the solrcloud cluster, each node has 3 shards with no
>> replication.
>> > >> From
>> > >> >> the
>> > >> >>> error log it seems like all the shards on this box are throwing
>> > async
>> > >> >>> exception errors. Other node in the cluster does not have any
>> errors
>> > >> in
>> > >> >> the
>> > >> >>> logs. Any suggestions on how to tackle this error?
>> > >> >>>
>> > >> >>> Solr setup
>> > >> >>> Solr:6.6.3
>> > >> >>> 2Nodes: 3 shards each
>> > >> >>>
>> > >> >>>
>> > >> >>> ERROR org.apache.solr.servlet.HttpSolrCall
>> [test_shard3_replica1] ?
>> > >> >>> null:org.apache.solr.update.processor.DistributedUpdateProce
>> ssor$
>> > >> >> DistributedUpdatesAsyncException:
>> > >> >>> Async exception during distributed update: Read timed out
>> > >> >>> at
>> > >> >>>
>> > org.apache.solr.update.processor.DistributedUpdateProcessor.doFinish(
>> > >> >> DistributedUpdateProcessor.java:972)
>> > >> >>> at
>> > >> >>> org.apache.solr.update.processor.DistributedUpdateProcessor.
>> finish(
>> > >> >> DistributedUpdateProcessor.java:1911)
>> > >> >>> at
>> > >> >>> org.apache.solr.handler.ContentStreamHandlerBase.handleReque
>> stBody(
>> > >> >> ContentStreamHandlerBase.java:78)
>> > >> >>> at
>> > >> >>> org.apache.solr.handler.RequestHandlerBase.handleRequest(
>> > >> >> RequestHandlerBase.java:173)
>> > >> >>> at org.apache.solr.core.SolrCore.execute(SolrCore.java:2477)
>> > >> >>> at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.
>> > >> java:723)
>> > >> >>> at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:
>> 529)
>> > >> >>> at
>> > >> >>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(
>> > >> >> SolrDispatchFilter.java:361)
>> > >> >>> at
>> > >> >>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(
>> > >> >> SolrDispatchFilter.java:305)
>> > >> >>> at
>> > >> >>> org.eclipse.jetty.servlet.ServletHandler$CachedChain.
>> > >> >> doFilter(Servlet

Re: Async exceptions during distributed update

2018-05-07 Thread Jay Potharaju
The updates are pushed in real time, not batched. There is no complex analysis,
and everything is committed using the autocommit settings in Solr.

Thanks
Jay Potharaju
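
"Autocommit settings" refers to the <autoCommit> and <autoSoftCommit> blocks
in solrconfig.xml; a typical shape looks like the following (the values here
are illustrative, not taken from this thread):

<autoCommit>
  <maxTime>15000</maxTime>           <!-- hard commit every 15s: flushes to disk -->
  <openSearcher>false</openSearcher> <!-- don't open a new searcher on hard commit -->
</autoCommit>
<autoSoftCommit>
  <maxTime>1000</maxTime>            <!-- soft commit every 1s: makes changes visible -->
</autoSoftCommit>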


On Mon, May 7, 2018 at 4:00 PM, Emir Arnautović <
emir.arnauto...@sematext.com> wrote:

> How do you send documents? Large batches? Complex analysis? Do you send all
> batches to the same node? How do you commit? Do you delete by query while
> indexing?
>
> Emir
>
> On Tue, May 8, 2018, 12:30 AM Jay Potharaju  wrote:
>
> > I didn't see any OOM errors in the logs on either of the nodes. I saw GC
> > pause of 1 second on the box that was throwing error ...but nothing on
> the
> > other node. Any other recommendations?
> > Thanks
> >
> >
> > Thanks
> > Jay Potharaju
> >
> >
> > On Mon, May 7, 2018 at 9:48 AM, Jay Potharaju 
> > wrote:
> >
> > > Ah thanks for explaining that!
> > >
> > > Thanks
> > > Jay Potharaju
> > >
> > >
> > > On Mon, May 7, 2018 at 9:45 AM, Emir Arnautović <
> > > emir.arnauto...@sematext.com> wrote:
> > >
> > >> Node A receives batch of documents to index. It forwards documents to
> > >> shards that are on the node B. Node B is having issues with GC so it
> > takes
> > >> a while to respond. Node A sees it as read timeout and reports it in
> > logs.
> > >> So the issue is on node B not node A.
> > >>
> > >> Emir
> > >> --
> > >> Monitoring - Log Management - Alerting - Anomaly Detection
> > >> Solr & Elasticsearch Consulting Support Training -
> http://sematext.com/
> > >>
> > >>
> > >>
> > >> > On 7 May 2018, at 18:39, Jay Potharaju 
> wrote:
> > >> >
> > >> > Yes, the nodes are well balanced. I am just using these boxes for
> > >> indexing
> > >> > the data and is not serving any traffic at this time.  The error
> > >> indicates
> > >> > it is having issues errors on the shards that are hosted on the box
> > and
> > >> not
> > >> > on the other box.
> > >> > I will check GC logs to see if there were any issues.
> > >> > thanks
> > >> >
> > >> > Thanks
> > >> > Jay Potharaju
> > >> >
> > >> >
> > >> > On Mon, May 7, 2018 at 9:34 AM, Emir Arnautović <
> > >> > emir.arnauto...@sematext.com> wrote:
> > >> >
> > >> >> Hi Jay,
> > >> >> My first guess would be that there was some major GC on other box
> so
> > it
> > >> >> did not respond on time. Are your nodes well balanced - do they
> serve
> > >> equal
> > >> >> amount of data?
> > >> >>
> > >> >> Thanks,
> > >> >> Emir
> > >> >> --
> > >> >> Monitoring - Log Management - Alerting - Anomaly Detection
> > >> >> Solr & Elasticsearch Consulting Support Training -
> > >> http://sematext.com/
> > >> >>
> > >> >>
> > >> >>
> > >> >>> On 7 May 2018, at 18:11, Jay Potharaju 
> > wrote:
> > >> >>>
> > >> >>> Hi,
> > >> >>> I am seeing the following lines in the error log. My setup has 2
> > >> nodes in
> > >> >>> the solrcloud cluster, each node has 3 shards with no replication.
> > >> From
> > >> >> the
> > >> >>> error log it seems like all the shards on this box are throwing
> > async
> > >> >>> exception errors. Other node in the cluster does not have any
> errors
> > >> in
> > >> >> the
> > >> >>> logs. Any suggestions on how to tackle this error?
> > >> >>>
> > >> >>> Solr setup
> > >> >>> Solr:6.6.3
> > >> >>> 2Nodes: 3 shards each
> > >> >>>
> > >> >>>
> > >> >>> ERROR org.apache.solr.servlet.HttpSolrCall
> [test_shard3_replica1] ?
> > >> >>> null:org.apache.solr.update.processor.DistributedUpdateProcessor$
> > >> >> DistributedUpdatesAsyncException:
> > >> >>> Async exception during distributed update: Read timed out
> > >> >>> at
> > >> >>>
> > org.apache.solr.update.processor.DistributedUpdateProcessor.doFinish(
> > >> >> DistributedUpdateProcessor.java:972)
> > >> >>> at
> > >> >>> org.apache.solr.update.processor.DistributedUpdateProcessor.
> finish(
> > >> >> DistributedUpdateProcessor.java:1911)
> > >> >>> at
> > >> >>> org.apache.solr.handler.ContentStreamHandlerBase.
> handleRequestBody(
> > >> >> ContentStreamHandlerBase.java:78)
> > >> >>> at
> > >> >>> org.apache.solr.handler.RequestHandlerBase.handleRequest(
> > >> >> RequestHandlerBase.java:173)
> > >> >>> at org.apache.solr.core.SolrCore.execute(SolrCore.java:2477)
> > >> >>> at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.
> > >> java:723)
> > >> >>> at org.apache.solr.servlet.HttpSolrCall.call(
> HttpSolrCall.java:529)
> > >> >>> at
> > >> >>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(
> > >> >> SolrDispatchFilter.java:361)
> > >> >>> at
> > >> >>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(
> > >> >> SolrDispatchFilter.java:305)
> > >> >>> at
> > >> >>> org.eclipse.jetty.servlet.ServletHandler$CachedChain.
> > >> >> doFilter(ServletHandler.java:1691)
> > >> >>> at
> > >> >>> org.eclipse.jetty.servlet.ServletHandler.doHandle(
> > >> >> ServletHandler.java:582)
> > >> >>> at
> > >> >>> org.eclipse.jetty.server.handler.ScopedHandler.handle(
> > >> >> ScopedHandler.java:143)
> > >> >>> at
> > >> >>> org.eclipse.jetty.security.SecurityHandler.handle(
> > >> >> SecurityHandler.java:548)
> > 

Re: Async exceptions during distributed update

2018-05-07 Thread Emir Arnautović
How do you send documents? Large batches? Complex analysis? Do you send all
batches to the same node? How do you commit? Do you delete by query while
indexing?

Emir

On Tue, May 8, 2018, 12:30 AM Jay Potharaju  wrote:

> I didn't see any OOM errors in the logs on either of the nodes. I saw GC
> pause of 1 second on the box that was throwing error ...but nothing on the
> other node. Any other recommendations?
> Thanks
>
>
> Thanks
> Jay Potharaju
>
>
> On Mon, May 7, 2018 at 9:48 AM, Jay Potharaju 
> wrote:
>
> > Ah thanks for explaining that!
> >
> > Thanks
> > Jay Potharaju
> >
> >
> > On Mon, May 7, 2018 at 9:45 AM, Emir Arnautović <
> > emir.arnauto...@sematext.com> wrote:
> >
> >> Node A receives batch of documents to index. It forwards documents to
> >> shards that are on the node B. Node B is having issues with GC so it
> takes
> >> a while to respond. Node A sees it as read timeout and reports it in
> logs.
> >> So the issue is on node B not node A.
> >>
> >> Emir
> >> --
> >> Monitoring - Log Management - Alerting - Anomaly Detection
> >> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
> >>
> >>
> >>
> >> > On 7 May 2018, at 18:39, Jay Potharaju  wrote:
> >> >
> >> > Yes, the nodes are well balanced. I am just using these boxes for
> >> indexing
> >> > the data and is not serving any traffic at this time.  The error
> >> indicates
> >> > it is having issues errors on the shards that are hosted on the box
> and
> >> not
> >> > on the other box.
> >> > I will check GC logs to see if there were any issues.
> >> > thanks
> >> >
> >> > Thanks
> >> > Jay Potharaju
> >> >
> >> >
> >> > On Mon, May 7, 2018 at 9:34 AM, Emir Arnautović <
> >> > emir.arnauto...@sematext.com> wrote:
> >> >
> >> >> Hi Jay,
> >> >> My first guess would be that there was some major GC on other box so
> it
> >> >> did not respond on time. Are your nodes well balanced - do they serve
> >> equal
> >> >> amount of data?
> >> >>
> >> >> Thanks,
> >> >> Emir
> >> >> --
> >> >> Monitoring - Log Management - Alerting - Anomaly Detection
> >> >> Solr & Elasticsearch Consulting Support Training -
> >> http://sematext.com/
> >> >>
> >> >>
> >> >>
> >> >>> On 7 May 2018, at 18:11, Jay Potharaju 
> wrote:
> >> >>>
> >> >>> Hi,
> >> >>> I am seeing the following lines in the error log. My setup has 2
> >> nodes in
> >> >>> the solrcloud cluster, each node has 3 shards with no replication.
> >> From
> >> >> the
> >> >>> error log it seems like all the shards on this box are throwing
> async
> >> >>> exception errors. Other node in the cluster does not have any errors
> >> in
> >> >> the
> >> >>> logs. Any suggestions on how to tackle this error?
> >> >>>
> >> >>> Solr setup
> >> >>> Solr:6.6.3
> >> >>> 2Nodes: 3 shards each
> >> >>>
> >> >>>
> >> >>> ERROR org.apache.solr.servlet.HttpSolrCall  [test_shard3_replica1] ?
> >> >>> null:org.apache.solr.update.processor.DistributedUpdateProcessor$
> >> >> DistributedUpdatesAsyncException:
> >> >>> Async exception during distributed update: Read timed out
> >> >>> at
> >> >>>
> org.apache.solr.update.processor.DistributedUpdateProcessor.doFinish(
> >> >> DistributedUpdateProcessor.java:972)
> >> >>> at
> >> >>> org.apache.solr.update.processor.DistributedUpdateProcessor.finish(
> >> >> DistributedUpdateProcessor.java:1911)
> >> >>> at
> >> >>> org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(
> >> >> ContentStreamHandlerBase.java:78)
> >> >>> at
> >> >>> org.apache.solr.handler.RequestHandlerBase.handleRequest(
> >> >> RequestHandlerBase.java:173)
> >> >>> at org.apache.solr.core.SolrCore.execute(SolrCore.java:2477)
> >> >>> at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.
> >> java:723)
> >> >>> at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:529)
> >> >>> at
> >> >>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(
> >> >> SolrDispatchFilter.java:361)
> >> >>> at
> >> >>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(
> >> >> SolrDispatchFilter.java:305)
> >> >>> at
> >> >>> org.eclipse.jetty.servlet.ServletHandler$CachedChain.
> >> >> doFilter(ServletHandler.java:1691)
> >> >>> at
> >> >>> org.eclipse.jetty.servlet.ServletHandler.doHandle(
> >> >> ServletHandler.java:582)
> >> >>> at
> >> >>> org.eclipse.jetty.server.handler.ScopedHandler.handle(
> >> >> ScopedHandler.java:143)
> >> >>> at
> >> >>> org.eclipse.jetty.security.SecurityHandler.handle(
> >> >> SecurityHandler.java:548)
> >> >>> at
> >> >>> org.eclipse.jetty.server.session.SessionHandler.
> >> >> doHandle(SessionHandler.java:226)
> >> >>> at
> >> >>> org.eclipse.jetty.server.handler.ContextHandler.
> >> >> doHandle(ContextHandler.java:1180)
> >> >>> at org.eclipse.jetty.servlet.ServletHandler.doScope(
> >> >> ServletHandler.java:512)
> >> >>> at
> >> >>> org.eclipse.jetty.server.session.SessionHandler.
> >> >> doScope(SessionHandler.java:185)
> >> >>> at
> >> >>> org.eclipse.jetty.server.handler.ContextHandler.
> >> >> doScope(ContextHandler.java:1112)
> >>

Re: Async exceptions during distributed update

2018-05-07 Thread Jay Potharaju
I didn't see any OOM errors in the logs on either of the nodes. I saw a GC
pause of 1 second on the box that was throwing the error, but nothing on the
other node. Any other recommendations?
Thanks


Thanks
Jay Potharaju


On Mon, May 7, 2018 at 9:48 AM, Jay Potharaju  wrote:

> Ah thanks for explaining that!
>
> Thanks
> Jay Potharaju
>
>
> On Mon, May 7, 2018 at 9:45 AM, Emir Arnautović <
> emir.arnauto...@sematext.com> wrote:
>
>> Node A receives batch of documents to index. It forwards documents to
>> shards that are on the node B. Node B is having issues with GC so it takes
>> a while to respond. Node A sees it as read timeout and reports it in logs.
>> So the issue is on node B not node A.
>>
>> Emir
>> --
>> Monitoring - Log Management - Alerting - Anomaly Detection
>> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>>
>>
>>
>> > On 7 May 2018, at 18:39, Jay Potharaju  wrote:
>> >
>> > Yes, the nodes are well balanced. I am just using these boxes for
>> indexing
>> > the data and is not serving any traffic at this time.  The error
>> indicates
>> > it is having issues errors on the shards that are hosted on the box and
>> not
>> > on the other box.
>> > I will check GC logs to see if there were any issues.
>> > thanks
>> >
>> > Thanks
>> > Jay Potharaju
>> >
>> >
>> > On Mon, May 7, 2018 at 9:34 AM, Emir Arnautović <
>> > emir.arnauto...@sematext.com> wrote:
>> >
>> >> Hi Jay,
>> >> My first guess would be that there was some major GC on other box so it
>> >> did not respond on time. Are your nodes well balanced - do they serve
>> equal
>> >> amount of data?
>> >>
>> >> Thanks,
>> >> Emir
>> >> --
>> >> Monitoring - Log Management - Alerting - Anomaly Detection
>> >> Solr & Elasticsearch Consulting Support Training -
>> http://sematext.com/
>> >>
>> >>
>> >>
>> >>> On 7 May 2018, at 18:11, Jay Potharaju  wrote:
>> >>>
>> >>> Hi,
>> >>> I am seeing the following lines in the error log. My setup has 2
>> nodes in
>> >>> the solrcloud cluster, each node has 3 shards with no replication.
>> From
>> >> the
>> >>> error log it seems like all the shards on this box are throwing async
>> >>> exception errors. Other node in the cluster does not have any errors
>> in
>> >> the
>> >>> logs. Any suggestions on how to tackle this error?
>> >>>
>> >>> Solr setup
>> >>> Solr:6.6.3
>> >>> 2Nodes: 3 shards each
>> >>>
>> >>>
>> >>> ERROR org.apache.solr.servlet.HttpSolrCall  [test_shard3_replica1] ?
>> >>> null:org.apache.solr.update.processor.DistributedUpdateProcessor$
>> >> DistributedUpdatesAsyncException:
>> >>> Async exception during distributed update: Read timed out
>> >>> at
>> >>> org.apache.solr.update.processor.DistributedUpdateProcessor.doFinish(
>> >> DistributedUpdateProcessor.java:972)
>> >>> at
>> >>> org.apache.solr.update.processor.DistributedUpdateProcessor.finish(
>> >> DistributedUpdateProcessor.java:1911)
>> >>> at
>> >>> org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(
>> >> ContentStreamHandlerBase.java:78)
>> >>> at
>> >>> org.apache.solr.handler.RequestHandlerBase.handleRequest(
>> >> RequestHandlerBase.java:173)
>> >>> at org.apache.solr.core.SolrCore.execute(SolrCore.java:2477)
>> >>> at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.
>> java:723)
>> >>> at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:529)
>> >>> at
>> >>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(
>> >> SolrDispatchFilter.java:361)
>> >>> at
>> >>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(
>> >> SolrDispatchFilter.java:305)
>> >>> at
>> >>> org.eclipse.jetty.servlet.ServletHandler$CachedChain.
>> >> doFilter(ServletHandler.java:1691)
>> >>> at
>> >>> org.eclipse.jetty.servlet.ServletHandler.doHandle(
>> >> ServletHandler.java:582)
>> >>> at
>> >>> org.eclipse.jetty.server.handler.ScopedHandler.handle(
>> >> ScopedHandler.java:143)
>> >>> at
>> >>> org.eclipse.jetty.security.SecurityHandler.handle(
>> >> SecurityHandler.java:548)
>> >>> at
>> >>> org.eclipse.jetty.server.session.SessionHandler.
>> >> doHandle(SessionHandler.java:226)
>> >>> at
>> >>> org.eclipse.jetty.server.handler.ContextHandler.
>> >> doHandle(ContextHandler.java:1180)
>> >>> at org.eclipse.jetty.servlet.ServletHandler.doScope(
>> >> ServletHandler.java:512)
>> >>> at
>> >>> org.eclipse.jetty.server.session.SessionHandler.
>> >> doScope(SessionHandler.java:185)
>> >>> at
>> >>> org.eclipse.jetty.server.handler.ContextHandler.
>> >> doScope(ContextHandler.java:1112)
>> >>> at
>> >>> org.eclipse.jetty.server.handler.ScopedHandler.handle(
>> >> ScopedHandler.java:141)
>> >>> at
>> >>> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(
>> >> ContextHandlerCollection.java:213)
>> >>> at
>> >>> org.eclipse.jetty.server.handler.HandlerCollection.
>> >> handle(HandlerCollection.java:119)
>> >>> at
>> >>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(
>> >> HandlerWrapper.java:134)
>> >>> at
>> >>> org.eclipse.jetty.rewrite.handler.RewriteHandler.handl

Re: Determine Solr Core Creation Timestamp

2018-05-07 Thread Atita Arora
Hi Shawn,

I noticed the same and hence ruled out using it.
Further, while exploring the V2 API (as we're currently on Solr 6.6 and
will soon be on Solr 7.x), I came across the shards API, which reports
"property.index.version": "1525453818563"
for each of the shards. I wonder if I should be leveraging this, since it
seems to be the index version and I don't think this number should vary on
restart.

Any pointers?

Thanks,
Atita


On Mon, May 7, 2018 at 11:16 AM, Shawn Heisey  wrote:

> On 5/6/2018 3:09 PM, Atita Arora wrote:
>
>> I am working on a developing a utility which lets one monitor the
>> indexPipeline Status.
>> The indexing job runs in two forms where either it -
>> 1. Creates a new core OR
>> 2. Runs the delta on existing core.
>> To put down to simplest form I look into the DB timestamp when the
>> indexing
>> job was triggered and have a desire to read some stat / metric from Solr
>> (preferably an API) which reports a timestamp when the CORE was created /
>> modified.
>> My utility completely relies on the difference between timestamps from DB
>> &
>> Solr as these two timestamps are leveraged to determine health of
>> pipeline.
>>
>> I see the Master Version Timestamp under each shard which details the
>> version / Gen / Size.
>> Is that what I should be using ? How can I grab these from API ?
>> I tried using the metrics API:
>> http://localhost:8983/solr/admin/metrics?group=core&prefix=CORE
>> which reports CORE.startTime, but this timestamp changes whenever data is
>> being added to any core on this node.
>> Is there any other suggestion for some other way to determine the core
>> creation timestamp?
>>
>
> The startTime value is the time at which Solr started the core.  If that
> is getting updated frequently, then a reload operation is probably
> happening on the core.  Or, less likely, the Solr instance has been
> restarted.  I have checked a 6.6 system and on a core that is getting
> updates as frequently as once a minute, startTime is a couple of days ago,
> which was the last time that core was reloaded.
>
> I've been trying to figure out whether a Lucene index keeps track of the
> time it was created, but I haven't found anything yet.  If it doesn't, I do
> wonder whether there might be some kind of metadata that Solr could write
> to the index to record information like this.  Solr would always have the
> option of writing such metadata to an entirely different location within
> the instanceDir.  The index creation time is probably not the only
> information that would be useful to have available.
>
> Thanks,
> Shawn
>
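
As an aside, the metrics request above can be narrowed so the response
carries only the start-time metric for each core registry (assuming the
default host and port; the prefix parameter filters by metric name):

http://localhost:8983/solr/admin/metrics?group=core&prefix=CORE.startTime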
>


Must clause with filter queries

2018-05-07 Thread manuj singh
Hi all,
I am kind of confused about how the must clause (+) behaves with filter
queries. E.g., I have the query below:
q=*:*&fq=+{!frange cost=200 l=NOW-179DAYS u=NOW/DAY+1DAY incl=true
incu=false}date

So I am filtering documents that are less than 179 days old.
So e.g., if NOW is May 7th, 10.23 CST, 2018, I should only see documents
with date > Nov 9th, 10.23 CST, 2017.

However, with the above query I am also seeing documents from Nov 5th, 2017
(which seems like it is returning some docs from the filter cache, which is
weird, because the start of my date range uses NOW-179DAYS and NOW changes
on every request, so it shouldn't hit the filter cache: every new request
should have a different timestamp).

However, if I remove the + from the filter query, it seems to work fine.

I mostly suspect a filter cache issue, but I am not sure how to prove that.

Our auto soft commit is 500 ms, so every 0.5 seconds a new searcher should
open and the cache should be flushed.

Something is not right and I am not able to figure out what. Has anyone seen
this kind of issue before?

If I move the query from fq to q, it also works fine.

One more thing: when I enable debugQuery, I see the following in the parsed
query:


*"QParser": "LuceneQParser", "filter_queries": [ "+{!frange cost=200
l=NOW-179DAYS u=NOW/DAY+1DAY incl=true incu=false}date", "-_parent_:F" ],
"parsed_filter_queries": [
"+FunctionRangeQuery(ConstantScore(frange(date(date)):[NOW-179DAYS TO
NOW/DAY+1DAY}))", "-_parent_:false" ]*

So in the above I do not see the date getting resolved to an actual
timestamp.

However, if I change the query syntax to not use frange and local params, I
see the date resolving to the correct timestamp.

So for the following query:
q=*:*&fq=+date:[NOW-179DAYS TO NOW/DAY+1DAY]

I see the following in the debug output, with the actual timestamps:
"QParser": "LuceneQParser", "filter_queries": [ "date:[NOW-179DAYS TO
NOW/DAY+1DAY]", "-_parent_:F" ], "parsed_filter_queries": [
"date:[1510242067383
TO 152573760]", "-_parent_:false" ],


Not sure if it's just a red herring?
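
One experiment that may help prove or rule out the filter cache (a sketch,
not a confirmed fix): a lone must clause is equivalent to the bare clause,
so the + can be dropped, which also puts the local params back at the start
of the fq where Solr reads them, and cache=false can be added so each
request recomputes NOW instead of reusing a cached entry:

q=*:*&fq={!frange cache=false cost=200 l=NOW-179DAYS u=NOW/DAY+1DAY incl=true incu=false}date

If the stale Nov 5th documents disappear with this variant, a cache entry
keyed on the unresolved NOW-179DAYS string is the likely culprit.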


Re: Async exceptions during distributed update

2018-05-07 Thread Jay Potharaju
Ah thanks for explaining that!

Thanks
Jay Potharaju


On Mon, May 7, 2018 at 9:45 AM, Emir Arnautović <
emir.arnauto...@sematext.com> wrote:

> Node A receives batch of documents to index. It forwards documents to
> shards that are on the node B. Node B is having issues with GC so it takes
> a while to respond. Node A sees it as read timeout and reports it in logs.
> So the issue is on node B not node A.
>
> Emir
> --
> Monitoring - Log Management - Alerting - Anomaly Detection
> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>
>
>
> > On 7 May 2018, at 18:39, Jay Potharaju  wrote:
> >
> > Yes, the nodes are well balanced. I am just using these boxes for
> indexing
> > the data and is not serving any traffic at this time.  The error
> indicates
> > it is having issues errors on the shards that are hosted on the box and
> not
> > on the other box.
> > I will check GC logs to see if there were any issues.
> > thanks
> >
> > Thanks
> > Jay Potharaju
> >
> >
> > On Mon, May 7, 2018 at 9:34 AM, Emir Arnautović <
> > emir.arnauto...@sematext.com> wrote:
> >
> >> Hi Jay,
> >> My first guess would be that there was some major GC on other box so it
> >> did not respond on time. Are your nodes well balanced - do they serve
> equal
> >> amount of data?
> >>
> >> Thanks,
> >> Emir
> >> --
> >> Monitoring - Log Management - Alerting - Anomaly Detection
> >> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
> >>
> >>
> >>
> >>> On 7 May 2018, at 18:11, Jay Potharaju  wrote:
> >>>
> >>> Hi,
> >>> I am seeing the following lines in the error log. My setup has 2 nodes
> in
> >>> the solrcloud cluster, each node has 3 shards with no replication. From
> >> the
> >>> error log it seems like all the shards on this box are throwing async
> >>> exception errors. Other node in the cluster does not have any errors in
> >> the
> >>> logs. Any suggestions on how to tackle this error?
> >>>
> >>> Solr setup
> >>> Solr:6.6.3
> >>> 2Nodes: 3 shards each
> >>>
> >>>
> >>> ERROR org.apache.solr.servlet.HttpSolrCall  [test_shard3_replica1] ?
> >>> null:org.apache.solr.update.processor.DistributedUpdateProcessor$
> >> DistributedUpdatesAsyncException:
> >>> Async exception during distributed update: Read timed out
> >>> at
> >>> org.apache.solr.update.processor.DistributedUpdateProcessor.doFinish(
> >> DistributedUpdateProcessor.java:972)
> >>> at
> >>> org.apache.solr.update.processor.DistributedUpdateProcessor.finish(
> >> DistributedUpdateProcessor.java:1911)
> >>> at
> >>> org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(
> >> ContentStreamHandlerBase.java:78)
> >>> at
> >>> org.apache.solr.handler.RequestHandlerBase.handleRequest(
> >> RequestHandlerBase.java:173)
> >>> at org.apache.solr.core.SolrCore.execute(SolrCore.java:2477)
> >>> at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:723)
> >>> at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:529)
> >>> at
> >>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(
> >> SolrDispatchFilter.java:361)
> >>> at
> >>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(
> >> SolrDispatchFilter.java:305)
> >>> at
> >>> org.eclipse.jetty.servlet.ServletHandler$CachedChain.
> >> doFilter(ServletHandler.java:1691)
> >>> at
> >>> org.eclipse.jetty.servlet.ServletHandler.doHandle(
> >> ServletHandler.java:582)
> >>> at
> >>> org.eclipse.jetty.server.handler.ScopedHandler.handle(
> >> ScopedHandler.java:143)
> >>> at
> >>> org.eclipse.jetty.security.SecurityHandler.handle(
> >> SecurityHandler.java:548)
> >>> at
> >>> org.eclipse.jetty.server.session.SessionHandler.
> >> doHandle(SessionHandler.java:226)
> >>> at
> >>> org.eclipse.jetty.server.handler.ContextHandler.
> >> doHandle(ContextHandler.java:1180)
> >>> at org.eclipse.jetty.servlet.ServletHandler.doScope(
> >> ServletHandler.java:512)
> >>> at
> >>> org.eclipse.jetty.server.session.SessionHandler.
> >> doScope(SessionHandler.java:185)
> >>> at
> >>> org.eclipse.jetty.server.handler.ContextHandler.
> >> doScope(ContextHandler.java:1112)
> >>> at
> >>> org.eclipse.jetty.server.handler.ScopedHandler.handle(
> >> ScopedHandler.java:141)
> >>> at
> >>> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(
> >> ContextHandlerCollection.java:213)
> >>> at
> >>> org.eclipse.jetty.server.handler.HandlerCollection.
> >> handle(HandlerCollection.java:119)
> >>> at
> >>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(
> >> HandlerWrapper.java:134)
> >>> at
> >>> org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(
> >> RewriteHandler.java:335)
> >>> at
> >>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(
> >> HandlerWrapper.java:134)
> >>> at org.eclipse.jetty.server.Server.handle(Server.java:534)
> >>> at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
> >>> at
> >>> org.eclipse.jetty.server.HttpConnection.onFillable(
> >> HttpConnection.java:251)
> >>> at
> >>> org.eclipse.jetty.io.AbstractConnection$Re

Re: Async exceptions during distributed update

2018-05-07 Thread Emir Arnautović
Node A receives a batch of documents to index. It forwards documents to shards
that are on node B. Node B is having issues with GC, so it takes a while to
respond. Node A sees it as a read timeout and reports it in its logs. So the
issue is on node B, not node A.

Emir 
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/



> On 7 May 2018, at 18:39, Jay Potharaju  wrote:
> 
> Yes, the nodes are well balanced. I am just using these boxes for indexing
> the data and is not serving any traffic at this time.  The error indicates
> it is having issues errors on the shards that are hosted on the box and not
> on the other box.
> I will check GC logs to see if there were any issues.
> thanks
> 
> Thanks
> Jay Potharaju
> 
> 
> On Mon, May 7, 2018 at 9:34 AM, Emir Arnautović <
> emir.arnauto...@sematext.com> wrote:
> 
>> Hi Jay,
>> My first guess would be that there was some major GC on other box so it
>> did not respond on time. Are your nodes well balanced - do they serve equal
>> amount of data?
>> 
>> Thanks,
>> Emir
>> --
>> Monitoring - Log Management - Alerting - Anomaly Detection
>> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>> 
>> 
>> 
>>> On 7 May 2018, at 18:11, Jay Potharaju  wrote:
>>> 
>>> Hi,
>>> I am seeing the following lines in the error log. My setup has 2 nodes in
>>> the solrcloud cluster, each node has 3 shards with no replication. From
>> the
>>> error log it seems like all the shards on this box are throwing async
>>> exception errors. Other node in the cluster does not have any errors in
>> the
>>> logs. Any suggestions on how to tackle this error?
>>> 
>>> Solr setup
>>> Solr:6.6.3
>>> 2Nodes: 3 shards each
>>> 
>>> 
>>> ERROR org.apache.solr.servlet.HttpSolrCall  [test_shard3_replica1] ?
>>> null:org.apache.solr.update.processor.DistributedUpdateProcessor$
>> DistributedUpdatesAsyncException:
>>> Async exception during distributed update: Read timed out
>>> at
>>> org.apache.solr.update.processor.DistributedUpdateProcessor.doFinish(
>> DistributedUpdateProcessor.java:972)
>>> at
>>> org.apache.solr.update.processor.DistributedUpdateProcessor.finish(
>> DistributedUpdateProcessor.java:1911)
>>> at
>>> org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(
>> ContentStreamHandlerBase.java:78)
>>> at
>>> org.apache.solr.handler.RequestHandlerBase.handleRequest(
>> RequestHandlerBase.java:173)
>>> at org.apache.solr.core.SolrCore.execute(SolrCore.java:2477)
>>> at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:723)
>>> at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:529)
>>> at
>>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(
>> SolrDispatchFilter.java:361)
>>> at
>>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(
>> SolrDispatchFilter.java:305)
>>> at
>>> org.eclipse.jetty.servlet.ServletHandler$CachedChain.
>> doFilter(ServletHandler.java:1691)
>>> at
>>> org.eclipse.jetty.servlet.ServletHandler.doHandle(
>> ServletHandler.java:582)
>>> at
>>> org.eclipse.jetty.server.handler.ScopedHandler.handle(
>> ScopedHandler.java:143)
>>> at
>>> org.eclipse.jetty.security.SecurityHandler.handle(
>> SecurityHandler.java:548)
>>> at
>>> org.eclipse.jetty.server.session.SessionHandler.
>> doHandle(SessionHandler.java:226)
>>> at
>>> org.eclipse.jetty.server.handler.ContextHandler.
>> doHandle(ContextHandler.java:1180)
>>> at org.eclipse.jetty.servlet.ServletHandler.doScope(
>> ServletHandler.java:512)
>>> at
>>> org.eclipse.jetty.server.session.SessionHandler.
>> doScope(SessionHandler.java:185)
>>> at
>>> org.eclipse.jetty.server.handler.ContextHandler.
>> doScope(ContextHandler.java:1112)
>>> at
>>> org.eclipse.jetty.server.handler.ScopedHandler.handle(
>> ScopedHandler.java:141)
>>> at
>>> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(
>> ContextHandlerCollection.java:213)
>>> at
>>> org.eclipse.jetty.server.handler.HandlerCollection.
>> handle(HandlerCollection.java:119)
>>> at
>>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(
>> HandlerWrapper.java:134)
>>> at
>>> org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(
>> RewriteHandler.java:335)
>>> at
>>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(
>> HandlerWrapper.java:134)
>>> at org.eclipse.jetty.server.Server.handle(Server.java:534)
>>> at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
>>> at
>>> org.eclipse.jetty.server.HttpConnection.onFillable(
>> HttpConnection.java:251)
>>> at
>>> org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(
>> AbstractConnection.java:273)
>>> at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
>>> at
>>> org.eclipse.jetty.io.SelectChannelEndPoint$2.run(
>> SelectChannelEndPoint.java:93)
>>> at
>>> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(
>> QueuedThreadPool.java:671)
>>> at
>>> org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(
>> QueuedThreadP

Re: Async exceptions during distributed update

2018-05-07 Thread Jay Potharaju
Yes, the nodes are well balanced. I am just using these boxes for indexing
the data; they are not serving any traffic at this time. The error indicates
issues on the shards that are hosted on this box and not on the other box.
I will check the GC logs to see if there were any issues.
thanks

Thanks
Jay Potharaju


On Mon, May 7, 2018 at 9:34 AM, Emir Arnautović <
emir.arnauto...@sematext.com> wrote:

> Hi Jay,
> My first guess would be that there was some major GC on other box so it
> did not respond on time. Are your nodes well balanced - do they serve equal
> amount of data?
>
> Thanks,
> Emir
> --
> Monitoring - Log Management - Alerting - Anomaly Detection
> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>
>
>
> > On 7 May 2018, at 18:11, Jay Potharaju  wrote:
> >
> > Hi,
> > I am seeing the following lines in the error log. My setup has 2 nodes in
> > the solrcloud cluster, each node has 3 shards with no replication. From
> the
> > error log it seems like all the shards on this box are throwing async
> > exception errors. Other node in the cluster does not have any errors in
> the
> > logs. Any suggestions on how to tackle this error?
> >
> > Solr setup
> > Solr:6.6.3
> > 2Nodes: 3 shards each
> >
> >
> > ERROR org.apache.solr.servlet.HttpSolrCall  [test_shard3_replica1] ?
> > null:org.apache.solr.update.processor.DistributedUpdateProcessor$
> DistributedUpdatesAsyncException:
> > Async exception during distributed update: Read timed out
> > at
> > org.apache.solr.update.processor.DistributedUpdateProcessor.doFinish(
> DistributedUpdateProcessor.java:972)
> > at
> > org.apache.solr.update.processor.DistributedUpdateProcessor.finish(
> DistributedUpdateProcessor.java:1911)
> > at
> > org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(
> ContentStreamHandlerBase.java:78)
> > at
> > org.apache.solr.handler.RequestHandlerBase.handleRequest(
> RequestHandlerBase.java:173)
> > at org.apache.solr.core.SolrCore.execute(SolrCore.java:2477)
> > at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:723)
> > at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:529)
> > at
> > org.apache.solr.servlet.SolrDispatchFilter.doFilter(
> SolrDispatchFilter.java:361)
> > at
> > org.apache.solr.servlet.SolrDispatchFilter.doFilter(
> SolrDispatchFilter.java:305)
> > at
> > org.eclipse.jetty.servlet.ServletHandler$CachedChain.
> doFilter(ServletHandler.java:1691)
> > at
> > org.eclipse.jetty.servlet.ServletHandler.doHandle(
> ServletHandler.java:582)
> > at
> > org.eclipse.jetty.server.handler.ScopedHandler.handle(
> ScopedHandler.java:143)
> > at
> > org.eclipse.jetty.security.SecurityHandler.handle(
> SecurityHandler.java:548)
> > at
> > org.eclipse.jetty.server.session.SessionHandler.
> doHandle(SessionHandler.java:226)
> > at
> > org.eclipse.jetty.server.handler.ContextHandler.
> doHandle(ContextHandler.java:1180)
> > at org.eclipse.jetty.servlet.ServletHandler.doScope(
> ServletHandler.java:512)
> > at
> > org.eclipse.jetty.server.session.SessionHandler.
> doScope(SessionHandler.java:185)
> > at
> > org.eclipse.jetty.server.handler.ContextHandler.
> doScope(ContextHandler.java:1112)
> > at
> > org.eclipse.jetty.server.handler.ScopedHandler.handle(
> ScopedHandler.java:141)
> > at
> > org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(
> ContextHandlerCollection.java:213)
> > at
> > org.eclipse.jetty.server.handler.HandlerCollection.
> handle(HandlerCollection.java:119)
> > at
> > org.eclipse.jetty.server.handler.HandlerWrapper.handle(
> HandlerWrapper.java:134)
> > at
> > org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(
> RewriteHandler.java:335)
> > at
> > org.eclipse.jetty.server.handler.HandlerWrapper.handle(
> HandlerWrapper.java:134)
> > at org.eclipse.jetty.server.Server.handle(Server.java:534)
> > at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
> > at
> > org.eclipse.jetty.server.HttpConnection.onFillable(
> HttpConnection.java:251)
> > at
> > org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(
> AbstractConnection.java:273)
> > at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
> > at
> > org.eclipse.jetty.io.SelectChannelEndPoint$2.run(
> SelectChannelEndPoint.java:93)
> > at
> > org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(
> QueuedThreadPool.java:671)
> > at
> > org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(
> QueuedThreadPool.java:589)
> > at java.lang.Thread.run(Unknown Source)
> >
> >
> > Thanks
> > Jay
>
>


Re: Async exceptions during distributed update

2018-05-07 Thread Emir Arnautović
Hi Jay,
My first guess would be that there was some major GC on the other box, so it
did not respond in time. Are your nodes well balanced - do they serve an equal
amount of data?

Thanks,
Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/



> On 7 May 2018, at 18:11, Jay Potharaju  wrote:
> 
> Hi,
> I am seeing the following lines in the error log. My setup has 2 nodes in
> the solrcloud cluster, each node has 3 shards with no replication. From the
> error log it seems like all the shards on this box are throwing async
> exception errors. Other node in the cluster does not have any errors in the
> logs. Any suggestions on how to tackle this error?
> 
> Solr setup
> Solr:6.6.3
> 2Nodes: 3 shards each
> 
> 
> ERROR org.apache.solr.servlet.HttpSolrCall  [test_shard3_replica1] ?
> null:org.apache.solr.update.processor.DistributedUpdateProcessor$DistributedUpdatesAsyncException: Async exception during distributed update: Read timed out
> at org.apache.solr.update.processor.DistributedUpdateProcessor.doFinish(DistributedUpdateProcessor.java:972)
> at org.apache.solr.update.processor.DistributedUpdateProcessor.finish(DistributedUpdateProcessor.java:1911)
> at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:78)
> at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:173)
> at org.apache.solr.core.SolrCore.execute(SolrCore.java:2477)
> at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:723)
> at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:529)
> at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:361)
> at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:305)
> at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1691)
> at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
> at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
> at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
> at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
> at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
> at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
> at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
> at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
> at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
> at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
> at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
> at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
> at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
> at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
> at org.eclipse.jetty.server.Server.handle(Server.java:534)
> at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
> at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
> at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
> at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
> at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
> at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
> at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
> at java.lang.Thread.run(Unknown Source)
> 
> 
> Thanks
> Jay



Async exceptions during distributed update

2018-05-07 Thread Jay Potharaju
Hi,
I am seeing the following lines in the error log. My setup has 2 nodes in
the SolrCloud cluster; each node has 3 shards with no replication. From the
error log it seems like all the shards on this box are throwing async
exception errors. The other node in the cluster does not have any errors in
its logs. Any suggestions on how to tackle this error?

Solr setup
Solr: 6.6.3
2 nodes: 3 shards each


ERROR org.apache.solr.servlet.HttpSolrCall  [test_shard3_replica1] ?
null:org.apache.solr.update.processor.DistributedUpdateProcessor$DistributedUpdatesAsyncException: Async exception during distributed update: Read timed out
at org.apache.solr.update.processor.DistributedUpdateProcessor.doFinish(DistributedUpdateProcessor.java:972)
at org.apache.solr.update.processor.DistributedUpdateProcessor.finish(DistributedUpdateProcessor.java:1911)
at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:78)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:173)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:2477)
at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:723)
at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:529)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:361)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:305)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1691)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
at org.eclipse.jetty.server.Server.handle(Server.java:534)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
at java.lang.Thread.run(Unknown Source)


Thanks
Jay


Re: ampersand handling in solr cloud 7 in text_general field

2018-05-07 Thread Shawn Heisey

On 5/7/2018 8:45 AM, kumar gaurav wrote:

Hi Shawn

It is solr 7.3 .

On Sun, May 6, 2018 at 1:17 AM, Shawn Heisey  wrote:

The error in what you shared is incomplete.  Can you find any errors in
solr.log and provide the full error text for any of them that occurred
around the relevant timestamp?  Each one is going to be many lines long.


Getting the full text of the errors was more important than the version, 
and we still don't have that.  The version is needed to fully and 
correctly connect the stacktraces in the error to Solr's source code, if 
looking at the code becomes necessary.
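
If it helps, something like this can pull the full error text out of the log.
A sketch only - adjust the path to wherever your solr.log actually lives:

  # show each ERROR line plus the stacktrace lines that follow it
  grep -B 2 -A 40 ' ERROR ' server/logs/solr.log | less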


Thanks,
Shawn



Re: ampersand handling in solr cloud 7 in text_general field

2018-05-07 Thread kumar gaurav
Hi Shawn

It is solr 7.3 .

On Sun, May 6, 2018 at 1:17 AM, Shawn Heisey  wrote:

> On 5/5/2018 1:02 PM, kumar gaurav wrote:
>
>> I am facing a possible analysis error when indexing "&" (ampersand) in
>> text_general fields. It works fine if Solr is running in single-node mode,
>> and it also works fine in string fields. The exceptions seem to be coming
>> from the replicas, I believe.
>>
>> Could anybody please suggest whether anything needs special handling when
>> storing & (ampersand) in text_general fields? The exception is the following:
>>
>
> The error in what you shared is incomplete.  Can you find any errors in
> solr.log and provide the full error text for any of them that occurred
> around the relevant timestamp?  Each one is going to be many lines long.
>
> What version of Solr?
>
> Thanks,
> Shawn
>
>


Re: Howto disable PrintGCTimeStamps in Solr

2018-05-07 Thread Bernd Fehling
Hi Dominique,

thanks for asking - I figured it out this morning.
If -Xloggc= is set, the option -XX:+PrintGCTimeStamps is enabled by default
and cannot be disabled. It is hardwired inside the JVM itself.
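
For anyone who wants to verify this, here is a quick check (a sketch,
assuming a JDK 8 java on the PATH) showing HotSpot flipping the flag on as
soon as -Xloggc is given:

  java -Xloggc:/tmp/gc.log -XX:+PrintFlagsFinal -version | grep PrintGCTimeStamps
  # prints: bool PrintGCTimeStamps := true {manageable}
  # the ":=" means the value was changed from its default, even though the
  # flag was never passed on the command line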

Currently using Solr 6.4.2 with
Java HotSpot(TM) 64-Bit Server VM (25.121-b13) for linux-amd64 JRE 
(1.8.0_121-b13)


Regards,
Bernd


Am 07.05.2018 um 14:50 schrieb Dominique Bejean:
> Hi,
> 
> Which version of Solr are you using ?
> 
> Regards
> 
> Dominique
> 
> 
> Le ven. 4 mai 2018 à 09:13, Bernd Fehling 
> a écrit :
> 
>> Hi list,
>>
>> this sounds simple but I can't disable PrintGCTimeStamps in solr_gc
>> logging.
>> I tried GC_LOG_OPTS in the start scripts, and used --verbose reporting
>> during startup to make sure it is not set in the Solr start scripts.
>> But if Solr is up and running there are always timestamps in solr_gc.log,
>> and the file reports at the top, under "CommandLine flags:", that the option
>> -XX:+PrintGCTimeStamps has been set.
>> But where?
>>
>> Is it something passed down from Jetty?
>>
>> Regards,
>> Bernd
>>
>>
>>
>> --
> Dominique Béjean
> 06 08 46 12 43
> 


Re: Search Help

2018-05-07 Thread Shawn Heisey

On 5/7/2018 8:09 AM, natejasper wrote:

I'm setting up SOLR on an internal website for my company, and I would like
to know if anyone can recommend an analytics tool that would let me see what
users are searching for? Does the log in SOLR give me that information?


Unless the logging configuration is changed, the solr.log file will 
contain a record that includes everything you need to re-execute any query.
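
As a rough example (a sketch, not a polished analytics setup - it assumes the
default log location and that q is the first parameter in the logged
request), you can tally the most frequent search strings like this:

  grep -o 'params={q=[^&}]*' /var/solr/logs/solr.log \
    | sed 's/params={q=//' | sort | uniq -c | sort -rn | head -20

For real analytics you would feed these records into something purpose-built,
but this gives a quick first look.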


Thanks,
Shawn



Re: solr collection id field type long

2018-05-07 Thread Vincenzo D'Amore
Thanks :)

On Mon, May 7, 2018 at 4:18 PM, Shawn Heisey  wrote:

> On 5/7/2018 3:27 AM, Vincenzo D'Amore wrote:
>
>> So just to understand, why do we have this behaviour? Is there anything, a
>> mail thread or a ticket, that I could read?
>>
>
> https://issues.apache.org/jira/browse/SOLR-10829?attachmentOrder=desc
>
> Thanks,
> Shawn
>
>


-- 
Vincenzo D'Amore


Re: solr collection id field type long

2018-05-07 Thread Shawn Heisey

On 5/7/2018 3:27 AM, Vincenzo D'Amore wrote:

So just to understand, why do we have this behaviour? Is there anything, a
mail thread or a ticket, that I could read?


https://issues.apache.org/jira/browse/SOLR-10829?attachmentOrder=desc

Thanks,
Shawn



Search Help

2018-05-07 Thread natejasper
Hello all here,

I'm setting up SOLR on an internal website for my company, and I would like
to know if anyone can recommend an analytics tool that would let me see what
users are searching for? Does the log in SOLR give me that information?

Thanks for your time,



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Howto disable PrintGCTimeStamps in Solr

2018-05-07 Thread Dominique Bejean
Hi,

Which version of Solr are you using ?

Regards

Dominique


Le ven. 4 mai 2018 à 09:13, Bernd Fehling 
a écrit :

> Hi list,
>
> this sounds simple but I can't disable PrintGCTimeStamps in solr_gc
> logging.
> I tried GC_LOG_OPTS in the start scripts, and used --verbose reporting
> during startup to make sure it is not set in the Solr start scripts.
> But if Solr is up and running there are always timestamps in solr_gc.log,
> and the file reports at the top, under "CommandLine flags:", that the option
> -XX:+PrintGCTimeStamps has been set.
> But where?
>
> Is it something passed down from Jetty?
>
> Regards,
> Bernd
>
>
>
> --
Dominique Béjean
06 08 46 12 43


solr collection id field type long

2018-05-07 Thread Vincenzo D'Amore
Hi all,

I'm moving an old collection from Solr 4.8.1 to 7.3.0, where the "id" field
has the type solr.TrieLongField.

Given that solr.TrieLongField has been deprecated, I've changed it to the
newer LongPointField.

But when I tried to create the collection Solr returned the following
exception:

org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
Could not load conf for core collection1_shard1_replica_n1: Can't load
schema schema.xml: uniqueKey field (id) can not be configured to use a
Points based FieldType: plong

I've not found an alternative, so I've temporarily switched to "string".
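
For reference, this is roughly the workaround in schema.xml. A sketch only -
the extra id_l field and the copyField are hypothetical, just in case numeric
sorting or range queries on the id are still needed:

  <field name="id" type="string" indexed="true" stored="true" required="true"/>
  <!-- hypothetical numeric copy of the id, for sorting/range queries -->
  <field name="id_l" type="plong" indexed="true" stored="true" docValues="true"/>
  <copyField source="id" dest="id_l"/>
  <uniqueKey>id</uniqueKey>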

Looking around, I found this in the source code:

if (uniqueKeyField.getType().isPointField()) {
  String msg = UNIQUE_KEY + " field (" + uniqueKeyFieldName +
      ") can not be configured to use a Points based FieldType: " +
      uniqueKeyField.getType().getTypeName();
  log.error(msg);
  throw new SolrException(ErrorCode.SERVER_ERROR, msg);
}

Which explains clearly that I cannot do this :D ...

So just to understand, why do we have this behaviour? Is there anything, a
mail thread or a ticket, that I could read?

Thanks for your time,
Vincenzo


-- 
Vincenzo D'Amore


Re: Regarding LTR feature

2018-05-07 Thread prateek.agarwal
Hi Alessandro,

You're right that it doesn't have to be exactly accurate to the query-time
state, but our requirement is to have more solid control over the outputs
from Solr. For example, if we have 4 features, we want to adjust the weights
(say 40, 20, 20, 20) so that the feature contributions for a document sum to
100, which is only possible if we can scale the feature outputs between 0
and 1.
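
Concretely, something like this in the model definition is what we are aiming
for. This is only a sketch - the feature names, min/max bounds and weights
are made up, and in Solr's LTR the min/max of each MinMaxNormalizer would
have to come from our training data:

{
  "class": "org.apache.solr.ltr.model.LinearModel",
  "name": "weightedModel",
  "features": [
    { "name": "featureA",
      "norm": { "class": "org.apache.solr.ltr.norm.MinMaxNormalizer",
                "params": { "min": "0.0", "max": "50.0" } } },
    { "name": "featureB",
      "norm": { "class": "org.apache.solr.ltr.norm.MinMaxNormalizer",
                "params": { "min": "1.0", "max": "10.0" } } },
    { "name": "featureC",
      "norm": { "class": "org.apache.solr.ltr.norm.MinMaxNormalizer",
                "params": { "min": "0.0", "max": "1.0" } } },
    { "name": "featureD",
      "norm": { "class": "org.apache.solr.ltr.norm.MinMaxNormalizer",
                "params": { "min": "0.0", "max": "100.0" } } }
  ],
  "params": {
    "weights": { "featureA": 0.4, "featureB": 0.2,
                 "featureC": 0.2, "featureD": 0.2 }
  }
}

With every feature normalized to 0-1 this way, the weights behave like the
40/20/20/20 split described above.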

Secondly, I also have a doubt about the scaling function: why does it
consider all the documents that match the query, rather than only the
documents that remain after the fq filter?

Thanks a lot in advance.
Looking forward to hearing back from you soon.


Regards,
Prateek

On 2018/05/04 10:26:55, Alessandro Benedetti  wrote: 
> Hi Prateek,
> I would assume you have that feature at training time as well - can't you
> use the training set to establish the parameters for the normalizer at
> query time?
> 
> In the end, being a normalization, it doesn't have to be that accurate to
> the query-time state, but it must reflect the relations the model learnt
> from the training set.
> Let me know!
> 
> 
> 
> -
> ---
> Alessandro Benedetti
> Search Consultant, R&D Software Engineer, Director
> Sease Ltd. - www.sease.io
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
>