Managing leaders when recycling a cluster

2020-08-11 Thread Adam Woods
Hi,

We've recently gone through the process of upgrading Solr to 8.6 and
have implemented an automated rolling-update mechanism to let us make
changes to our cluster more easily in the future.

Our process for this looks like this:
1. Cluster has 3 nodes.
2. Scale out to 6 nodes.
3. Protect the cluster overseer from scale in.
4. Scale in to 5 nodes.
5. Scale in to 4 nodes.
6. Expose the cluster overseer to scale in.
7. Scale in to 3 nodes.

When scaling in, nodes are removed oldest first. Whenever we
scale in or out, we ensure that the cluster reaches a state where it has
the required number of active nodes, and each node contains an active
replica for each collection.

It appears to work quite well. We were scaling down more than one node at a
time previously, but we ran into this bug:
https://issues.apache.org/jira/browse/SOLR-11208. Scaling down one at a
time works around this for now.

We were wondering if we should be taking more care around managing the
leaders of our collections during this process. Should we move the
collection leaders across to the new nodes that were created as part of
step 2 before we start removing the old nodes?

It looks like this is possible: Solr provides the REBALANCELEADERS API,
which can be called after setting preferredLeader=true on the replicas.
Using this we could shift the leaders to the new nodes.
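
For example, a sketch with hypothetical collection, shard, and replica
names:

curl "http://localhost:8983/solr/admin/collections?action=ADDREPLICAPROP&collection=mycoll&shard=shard1&replica=core_node3&property=preferredLeader&property.value=true"
curl "http://localhost:8983/solr/admin/collections?action=REBALANCELEADERS&collection=mycoll"

The first call marks a replica on one of the new nodes as the preferred
leader for its shard; the second asks Solr to make the preferred leaders
the actual leaders.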

A thought I had while looking at the APIs available to set the
preferredLeader property was that the BALANCESHARDUNIQUE API would be
perfect for this scenario if it had the ability to limit the candidate
nodes to a specific set. Otherwise, our option is to do this balancing
logic ourselves and call the ADDREPLICAPROP API (as in the sketch above).

https://lucene.apache.org/solr/guide/8_6/cluster-node-management.html#balanceshardunique
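
Without such a node filter, the call itself would look something like this
(hypothetical collection name), spreading the preferredLeader property
evenly across all nodes hosting the collection:

curl "http://localhost:8983/solr/admin/collections?action=BALANCESHARDUNIQUE&collection=mycoll&property=preferredLeader"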

Cheers,
Adam


Re: Cannot add replica during backup

2020-08-11 Thread Ashwin Ramesh
Hey Matthew,

Unfortunately, our shard leaders are spread across multiple nodes, so a
single EBS volume wouldn't work. Did you manage to get around this issue
yourself?

Regards,

Ash

On Tue, Aug 11, 2020 at 9:00 PM matthew sporleder 
wrote:

> I can already tell you it is EFS that is slow. I had to switch to an EBS
> disk for backups on a different project because EFS couldn't keep up.
>
> > On Aug 10, 2020, at 9:43 PM, Ashwin Ramesh 
> wrote:
> >
> > Hey Aroop, the general process for our backup is:
> > - Connect all machines to an EFS drive (AWS's NFS service)
> > - Call the collections API to backup into EFS
> > - ZIP the directory once the backup is completed
> > - Copy the ZIP into an s3 bucket
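> >
> > (A sketch of that pipeline with hypothetical names, assuming the EFS
> > mount is at /mnt/efs:)
> >
> > curl "http://localhost:8983/solr/admin/collections?action=BACKUP&name=nightly&collection=mycoll&location=/mnt/efs/backups"
> > (cd /mnt/efs/backups && zip -r nightly.zip nightly)
> > aws s3 cp /mnt/efs/backups/nightly.zip s3://my-backup-bucket/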
> >
> > I'll probably have to see which part of the process is the slowest.
> >
> > On another note, can you simply remove the task from the ZK path to
> > continue the execution of tasks?
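> >
> > (Something like this, untested, with a hypothetical queue-node name,
> > via ZooKeeper's zkCli.sh:)
> >
> > ls /overseer/collection-queue-work
> > delete /overseer/collection-queue-work/qn-0000000123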
> >
> > Regards,
> >
> > Ash
> >
> >> On Tue, Aug 11, 2020 at 11:40 AM Aroop Ganguly
> >>  wrote:
> >>
> >> 12 hours is extreme, we take backups of 10TB worth of indexes in 15 mins
> >> using the collection backup api.
> >> How are you taking the backup?
> >>
> >> Do you actually see any backup progress, or are you just seeing the task
> >> in the overseer queue linger?
> >> I have seen restore tasks hanging in the queue forever despite the process
> >> completing in Solr 7.7, so I wouldn't be surprised if this happens with
> >> backup as well. I have also observed that unless that task is removed
> >> from the overseer-collection-queue, the next ones do not proceed.
> >>
> >> Also, adding replicas while a backup is running seems like overkill; why
> >> don't you just have the appropriate replication factor in the first place
> >> and set autoAddReplicas=true as insurance?
> >>
> >>> On Aug 10, 2020, at 6:32 PM, Ashwin Ramesh 
> >> wrote:
> >>>
> >>> Hi everybody,
> >>>
> >>> We are using Solr 7.6 (SolrCloud). We noticed that when the backup is
> >>> running, we cannot add any replicas to the collection. By the looks of
> >>> it, the job to add the replica is put into the Overseer queue, but it is
> >>> not being processed. Is this expected? And are there any workarounds?
> >>>
> >>> Our backups take about 12 hours. Maybe we should try to optimize that too.
> >>>
> >>> Regards,
> >>>
> >>> Ash
> >>>


Re: Survey on ManagedResources feature

2020-08-11 Thread Noble Paul
The endpoint is served by Restlet, so your rules are not going to be
honored. The rules work only if the endpoint is served by a Solr request
handler.

On Wed, Aug 12, 2020, 12:46 AM Jason Gerlowski 
wrote:

> Hey Noble,
>
> Can you explain what you mean when you say it's not secured?  Just for
> those of us who haven't been following the discussion so far?  On the
> surface of things users taking advantage of our RuleBasedAuth plugin
> can secure this API like they can any other HTTP API.  Or are you
> talking about some other security aspect here?
>
> Jason
>
> On Tue, Aug 11, 2020 at 9:55 AM Noble Paul  wrote:
> >
> > Hi all,
> > The end-point for Managed resources is not secured. So it needs to be
> > fixed/eliminated.
> >
> > I would like to know what is the level of adoption for that feature
> > and if it is a critical feature for users.
> >
> > Another possibility is to offer a replacement for the feature using a
> > different API
> >
> > Your feedback will help us decide on what a potential solution should be
> >
> > --
> > -
> > Noble Paul
>


Re: Incorrect Insecure Settings Check in CoreContainer

2020-08-11 Thread Jason Gerlowski
Yikes, yeah it's hard to argue with that.

I'm a little confused because I remember testing this, but maybe it
snuck in at the last minute?  In any case, I'll reopen that jira to
fix the check there.

Sorry guys.

Jason


On Wed, Aug 5, 2020 at 9:22 AM Jan Høydahl  wrote:
>
> This seems to have been introduced in 
> https://issues.apache.org/jira/browse/SOLR-13972 in 8.4
> That test seems to be inverted for sure.
>
> Jason?
>
> Jan
>
> > 5. aug. 2020 kl. 13:15 skrev Mark Todd1 :
> >
> >
> > I've configured SolrCloud (8.5) with both SSL and Authentication which is 
> > working correctly. However, I get the following warning in the logs
> >
> > Solr authentication is enabled, but SSL is off. Consider enabling SSL to 
> > protect user credentials and data with encryption
> >
> > Looking at the source code for Solr, there appears to be a bug:
> >
> > if (authenticationPlugin != null &&
> >     StringUtils.isNotEmpty(System.getProperty("solr.jetty.https.port"))) {
> >   log.warn("Solr authentication is enabled, but SSL is off.  Consider "
> >       + "enabling SSL to protect user credentials and data with encryption.");
> > }
> >
> > Rather than checking for an empty system property (which would indicate SSL
> > is off), it's checking for a populated one, which is what you get when SSL
> > is on.
> >
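> > Presumably the intended check is the inverse; a sketch, not the actual
> > patch:
> >
> > if (authenticationPlugin != null &&
> >     StringUtils.isEmpty(System.getProperty("solr.jetty.https.port"))) {
> >   // warn only when authentication is on but no HTTPS port is configured
> >   log.warn("Solr authentication is enabled, but SSL is off.  Consider "
> >       + "enabling SSL to protect user credentials and data with encryption.");
> > }
> >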
> > Should I raise this as a Jira bug?
> >
> > Mark Todd
> >
>


massive numbers of threads with name that includes commitScheduler

2020-08-11 Thread Schwartz, Tony
From time to time I see a massive number of threads that have commitScheduler
in the thread name.  When this happens, Solr is pegging the disk IO and
querying becomes unusable for a while.  I have many collections (240 shards).
It happens once in a while; I'm really not sure what is causing it or how to
prevent it.  Any thoughts on where I can look?  I have set the mergeScheduler
as such:

   <mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler">
     <int name="maxMergeCount">6</int>
     <int name="maxThreadCount">1</int>
   </mergeScheduler>
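
One quick way to confirm it is these threads (a sketch; substitute the
actual Solr PID):

jstack <solr-pid> | grep -c commitScheduler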

Thank you!

Tony Schwartz


Re: Backups in SolrCloud using snapshots of individual cores?

2020-08-11 Thread Bram Van Dam
On 11/08/2020 13:15, Erick Erickson wrote:
> CDCR is being deprecated. so I wouldn’t suggest it for the long term.

Ah yes, thanks for pointing that out. That makes Dominique's alternative
less attractive. I guess I'll stick to my original proposal!

Thanks Erick :-)

 - Bram


Re: Slow query response from SOLR 5.4.1

2020-08-11 Thread Jason Gerlowski
Hey Abhijit,

The information you provided isn't really enough for anyone else on
the mailing list to debug the problem.  If you'd like help, please
provide some more information.

Good places to start would be: what is the query, what does Solr tell
you when you add a "debug=timing" parameter to your request, what does
your Solr setup look like (num nodes, shards, replicas, other
collections/cores, QPS).  It's hard to say upfront what piece of info
will be the one that helps you get an answer to your question -
performance problems have a lot of varied causes.  But providing
_some_ of these things or other related details might help you get the
answer you're looking for.
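
For example, with a hypothetical collection name:

curl "http://localhost:8983/solr/mycollection/select?q=value&debug=timing"

The timing section of the response shows how long each search component
spent on the request, which usually points at the slow piece.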

Alternatively, if you've figured out the issue already, post the answer
on this thread to help anyone with a similar issue in the future.

Jason

On Tue, Aug 4, 2020 at 4:11 PM Abhijit Pawar  wrote:
>
> Hello,
>
> I am seeing a performance issue with querying on one of our Solr servers -
> instance version 5.4.1.
> The total number of documents indexed is 20K plus.
> The data returned for this particular query is as few as 22 documents,
> however it takes almost 2 minutes to get the results back.
>
> Is there a way to improve query performance - in general the query
> response time is slow.
>
> Most of my fields are both stored and indexed. I can make some fields
> index-only, however there are not many such fields.
>
> Can I do something in solrconfig.xml in terms of caching, or something else?
>
> Any suggestions?
>
> Thanks!!


Re: Survey on ManagedResources feature

2020-08-11 Thread Jason Gerlowski
Hey Noble,

Can you explain what you mean when you say it's not secured?  Just for
those of us who haven't been following the discussion so far?  On the
surface of things users taking advantage of our RuleBasedAuth plugin
can secure this API like they can any other HTTP API.  Or are you
talking about some other security aspect here?

Jason

On Tue, Aug 11, 2020 at 9:55 AM Noble Paul  wrote:
>
> Hi all,
> The end-point for Managed resources is not secured. So it needs to be
> fixed/eliminated.
>
> I would like to know what is the level of adoption for that feature
> and if it is a critical feature for users.
>
> Another possibility is to offer a replacement for the feature using a
> different API
>
> Your feedback will help us decide on what a potential solution should be
>
> --
> -
> Noble Paul


Re: Multiple "df" fields

2020-08-11 Thread Erick Erickson
Have you explored edismax?

> On Aug 11, 2020, at 10:34 AM, Alexandre Rafalovitch  
> wrote:
> 
> I can't remember if field aliasing works with df but it may be worth a try:
> 
> https://lucene.apache.org/solr/guide/8_1/the-extended-dismax-query-parser.html#field-aliasing-using-per-field-qf-overrides
> 
> Another example:
> https://github.com/arafalov/solr-indexing-book/blob/master/published/languages/conf/solrconfig.xml
> 
> Regards,
>Alex
> 
> On Tue., Aug. 11, 2020, 9:59 a.m. Edward Turner, 
> wrote:
> 
>> Hi all,
>> 
>> Is it possible to have multiple "df" fields? (We think the answer is no
>> because our experiments did not work when adding multiple "df" values to
>> solrconfig.xml -- but we just wanted to double check with those who know
>> better.) The reason we would like to do this is that we have two main field
>> types (with different analyzers) and we'd like queries without a field to
>> be searched over both of them. We could also use copyfields, but this would
>> require us to have a common analyzer, which isn't exactly what we want.
>> 
>> An alternative solution is to pre-process the query prior to sending it to
>> Solr, so that queries with no field are changed as follows:
>> 
>> q=value -> q=(field1:value OR field2:value)
>> 
>> ... however, we feel a bit uncomfortable doing this though via String
>> manipulation.
>> 
>> Is there an obvious way we should tackle this problem that we are missing
>> (e.g., which would be cleaner/safer and perhaps works at the Query object
>> level)?
>> 
>> Many thanks and best wishes,
>> 
>> Edd
>> 



Re: Multiple "df" fields

2020-08-11 Thread Alexandre Rafalovitch
I can't remember if field aliasing works with df but it may be worth a try:

https://lucene.apache.org/solr/guide/8_1/the-extended-dismax-query-parser.html#field-aliasing-using-per-field-qf-overrides

Another example:
https://github.com/arafalov/solr-indexing-book/blob/master/published/languages/conf/solrconfig.xml
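
An untested sketch of both approaches, using the field names from the
original question:

q=value&defType=edismax&qf=field1 field2

or, with aliasing, where "both" is a made-up virtual field:

q=both:value&defType=edismax&f.both.qf=field1 field2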

Regards,
Alex

On Tue., Aug. 11, 2020, 9:59 a.m. Edward Turner, 
wrote:

> Hi all,
>
> Is it possible to have multiple "df" fields? (We think the answer is no
> because our experiments did not work when adding multiple "df" values to
> solrconfig.xml -- but we just wanted to double check with those who know
> better.) The reason we would like to do this is that we have two main field
> types (with different analyzers) and we'd like queries without a field to
> be searched over both of them. We could also use copyfields, but this would
> require us to have a common analyzer, which isn't exactly what we want.
>
> An alternative solution is to pre-process the query prior to sending it to
> Solr, so that queries with no field are changed as follows:
>
> q=value -> q=(field1:value OR field2:value)
>
> ... however, we feel a bit uncomfortable doing this though via String
> manipulation.
>
> Is there an obvious way we should tackle this problem that we are missing
> (e.g., which would be cleaner/safer and perhaps works at the Query object
> level)?
>
> Many thanks and best wishes,
>
> Edd
>


Re: Multiple "df" fields

2020-08-11 Thread Edward Turner
Hi David,

We tried using copyfields, and we can get this to work, but it's not
exactly what we want because we need to use a common type. E.g.,


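(Reconstructed sketch of the schema snippet, which the list archiver
stripped; "simple" and "complex" stand for our two field types:)

<field name="id" type="simple" indexed="true" stored="true"/>
<field name="name" type="simple" indexed="true" stored="true"/>
<field name="organism" type="simple" indexed="true" stored="true"/>
<field name="content" type="complex" indexed="true" stored="false"/>

<copyField source="id" dest="content"/>
<copyField source="name" dest="content"/>
<copyField source="organism" dest="content"/>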
Then if our "df" is specified as the "content" field, we can search over
"id", "name" and "organism" in one swoop. However, "content" has a
different type to "id" and "name", and so our search results might be
different than if we had searched directly on "id" or "name".

e.g.,
q=id:value1 // hits id field, which uses the "simple" type
q=value1 // hits content field, which uses the "complex" type
... so results might differ between the two queries

I hope this clarifies our question?

Best,

Edd


Edward Turner


On Tue, 11 Aug 2020 at 15:03, David Hastings 
wrote:

> why not use a copyfield for indexing?
>
> On Tue, Aug 11, 2020 at 9:59 AM Edward Turner  wrote:
>
> > Hi all,
> >
> > Is it possible to have multiple "df" fields? (We think the answer is no
> > because our experiments did not work when adding multiple "df" values to
> > solrconfig.xml -- but we just wanted to double check with those who know
> > better.) The reason we would like to do this is that we have two main
> field
> > types (with different analyzers) and we'd like queries without a field to
> > be searched over both of them. We could also use copyfields, but this
> would
> > require us to have a common analyzer, which isn't exactly what we want.
> >
> > An alternative solution is to pre-process the query prior to sending it
> to
> > Solr, so that queries with no field are changed as follows:
> >
> > q=value -> q=(field1:value OR field2:value)
> >
> > ... however, we feel a bit uncomfortable doing this though via String
> > manipulation.
> >
> > Is there an obvious way we should tackle this problem that we are missing
> > (e.g., which would be cleaner/safer and perhaps works at the Query object
> > level)?
> >
> > Many thanks and best wishes,
> >
> > Edd
> >
>


Re: Multiple "df" fields

2020-08-11 Thread David Hastings
why not use a copyfield for indexing?

On Tue, Aug 11, 2020 at 9:59 AM Edward Turner  wrote:

> Hi all,
>
> Is it possible to have multiple "df" fields? (We think the answer is no
> because our experiments did not work when adding multiple "df" values to
> solrconfig.xml -- but we just wanted to double check with those who know
> better.) The reason we would like to do this is that we have two main field
> types (with different analyzers) and we'd like queries without a field to
> be searched over both of them. We could also use copyfields, but this would
> require us to have a common analyzer, which isn't exactly what we want.
>
> An alternative solution is to pre-process the query prior to sending it to
> Solr, so that queries with no field are changed as follows:
>
> q=value -> q=(field1:value OR field2:value)
>
> ... however, we feel a bit uncomfortable doing this though via String
> manipulation.
>
> Is there an obvious way we should tackle this problem that we are missing
> (e.g., which would be cleaner/safer and perhaps works at the Query object
> level)?
>
> Many thanks and best wishes,
>
> Edd
>


Multiple "df" fields

2020-08-11 Thread Edward Turner
Hi all,

Is it possible to have multiple "df" fields? (We think the answer is no
because our experiments did not work when adding multiple "df" values to
solrconfig.xml -- but we just wanted to double check with those who know
better.) The reason we would like to do this is that we have two main field
types (with different analyzers) and we'd like queries without a field to
be searched over both of them. We could also use copyfields, but this would
require us to have a common analyzer, which isn't exactly what we want.

An alternative solution is to pre-process the query prior to sending it to
Solr, so that queries with no field are changed as follows:

q=value -> q=(field1:value OR field2:value)

... however, we feel a bit uncomfortable doing this though via String
manipulation.

Is there an obvious way we should tackle this problem that we are missing
(e.g., which would be cleaner/safer and perhaps works at the Query object
level)?

Many thanks and best wishes,

Edd


Survey on ManagedResources feature

2020-08-11 Thread Noble Paul
Hi all,
The end-point for Managed resources is not secured. So it needs to be
fixed/eliminated.

I would like to know what is the level of adoption for that feature
and if it is a critical feature for users.

Another possibility is to offer a replacement for the feature using a
different API

Your feedback will help us decide on what a potential solution should be

-- 
-
Noble Paul


Study On Rejected Refactorings

2020-08-11 Thread Jevgenija Pantiuchina
Dear contributors,

As part of a research team from Università della Svizzera italiana 
(Switzerland) and University of Sannio (Italy), we have analyzed refactoring 
pull requests in apache/lucene-solr repository and are looking for developers 
for a short 5-10 min survey 
(https://usi.eu.qualtrics.com/jfe/form/SV_cO6Ayah0D6q4eSF). Would you please 
spare your time by answering some questions about refactoring-related 
contributions? We would greatly appreciate your input — it would help us 
understand how developers can improve the quality of refactoring contributions, 
and benefit the development process. The responses will be anonymized and 
handled confidentially! Thank you a lot!

If you consider this message to be spam, I'm very sorry! There will be no 
follow-up to bug you.



Re: Backups in SolrCloud using snapshots of individual cores?

2020-08-11 Thread Dominique Bejean
An idea could be to use the autoscaling API to add a PULL replica for
each shard, located on one or more low-resource, backup-dedicated nodes on
separate hardware.
However, we would need to exclude these "PULL backup replicas" from
searches; unfortunately, I am not aware of a way to do that.
For a better RPO, a TLOG replica would be better, but it could become an
NRT replica (i.e., be elected leader).

So, maybe one solution could be to create a new BACKUP replica type with
these characteristics:

   - According to RPO, options at creation time: based on PULL or TLOG
   sync mode
   - Search disabled


Dominique



On Tue, Aug 11, 2020 at 2:07 PM, Erick Erickson  wrote:

> Dominique:
>
> Alternatives are under discussion, there isn’t a recommendation yet.
>
> Erick
>
> > On Aug 11, 2020, at 7:49 AM, Dominique Bejean 
> wrote:
> >
> > I missed that !
> > Are you aware about an alternative ?
> >
> > Regards
> >
> > Dominique
> >
> >
> > On Tue, Aug 11, 2020 at 1:15 PM, Erick Erickson  wrote:
> >
> >> CDCR is being deprecated. so I wouldn’t suggest it for the long term.
> >>
> >>> On Aug 10, 2020, at 9:33 PM, Ashwin Ramesh 
> >> wrote:
> >>>
> >>> I would love an answer to this too!
> >>>
> >>> On Fri, Aug 7, 2020 at 12:18 AM Bram Van Dam 
> >> wrote:
> >>>
>  Hey folks,
> 
>  Been reading up about the various ways of creating backups. The whole
>  "shared filesystem for Solrcloud backups"-thing is kind of a no-go in
>  our environment, so I've been looking for ways around that, and here's
>  what I've come up with so far:
> 
>  1. Stop applications from writing to solr
> 
>  2. Commit everything
> 
>  3. Identify a single core for each shard in each collection
> 
>  4. Snapshot that core using CREATESNAPSHOT in the Collections API
> 
>  5. Once complete, re-enable application write access to Solr
> 
>  6. Create a backup from these snapshots using the replication
> handler's
>  backup function (replication?command=backup&commitName=mySnapshot)
> 
>  7. Put the backups somewhere safe
> 
>  8. Clean up snapshots
> 
> 
>  This seems ... too good to be true? I've seen so many threads about
> how
>  hard it is to create backups in SolrCloud on this mailing list over
> the
>  years, but this seems pretty straightforward? Am I missing some
>  glaringly obvious reason why this will fail catastrophically?
> 
>  Using Solr 7.7 in this case.
> 
>  Feedback much appreciated!
> 
>  Thanks,
> 
>  - Bram
> 
> >>>
>
>


Re: Backups in SolrCloud using snapshots of individual cores?

2020-08-11 Thread Erick Erickson
Dominique:

Alternatives are under discussion, there isn’t a recommendation yet.

Erick

> On Aug 11, 2020, at 7:49 AM, Dominique Bejean  
> wrote:
> 
> I missed that !
> Are you aware about an alternative ?
> 
> Regards
> 
> Dominique
> 
> 
> On Tue, Aug 11, 2020 at 1:15 PM, Erick Erickson  wrote:
> 
>> CDCR is being deprecated. so I wouldn’t suggest it for the long term.
>> 
>>> On Aug 10, 2020, at 9:33 PM, Ashwin Ramesh 
>> wrote:
>>> 
>>> I would love an answer to this too!
>>> 
>>> On Fri, Aug 7, 2020 at 12:18 AM Bram Van Dam 
>> wrote:
>>> 
 Hey folks,
 
 Been reading up about the various ways of creating backups. The whole
 "shared filesystem for Solrcloud backups"-thing is kind of a no-go in
 our environment, so I've been looking for ways around that, and here's
 what I've come up with so far:
 
 1. Stop applications from writing to solr
 
 2. Commit everything
 
 3. Identify a single core for each shard in each collection
 
 4. Snapshot that core using CREATESNAPSHOT in the Collections API
 
 5. Once complete, re-enable application write access to Solr
 
 6. Create a backup from these snapshots using the replication handler's
 backup function (replication?command=backup&commitName=mySnapshot)
 
 7. Put the backups somewhere safe
 
 8. Clean up snapshots
 
 
 This seems ... too good to be true? I've seen so many threads about how
 hard it is to create backups in SolrCloud on this mailing list over the
 years, but this seems pretty straightforward? Am I missing some
 glaringly obvious reason why this will fail catastrophically?
 
 Using Solr 7.7 in this case.
 
 Feedback much appreciated!
 
 Thanks,
 
 - Bram
 
>>> 
>> 
>> 



Re: Backups in SolrCloud using snapshots of individual cores?

2020-08-11 Thread Dominique Bejean
I missed that !
Are you aware about an alternative ?

Regards

Dominique


On Tue, Aug 11, 2020 at 1:15 PM, Erick Erickson  wrote:

> CDCR is being deprecated. so I wouldn’t suggest it for the long term.
>
> > On Aug 10, 2020, at 9:33 PM, Ashwin Ramesh 
> wrote:
> >
> > I would love an answer to this too!
> >
> > On Fri, Aug 7, 2020 at 12:18 AM Bram Van Dam 
> wrote:
> >
> >> Hey folks,
> >>
> >> Been reading up about the various ways of creating backups. The whole
> >> "shared filesystem for Solrcloud backups"-thing is kind of a no-go in
> >> our environment, so I've been looking for ways around that, and here's
> >> what I've come up with so far:
> >>
> >> 1. Stop applications from writing to solr
> >>
> >> 2. Commit everything
> >>
> >> 3. Identify a single core for each shard in each collection
> >>
> >> 4. Snapshot that core using CREATESNAPSHOT in the Collections API
> >>
> >> 5. Once complete, re-enable application write access to Solr
> >>
> >> 6. Create a backup from these snapshots using the replication handler's
> >> backup function (replication?command=backup&commitName=mySnapshot)
> >>
> >> 7. Put the backups somewhere safe
> >>
> >> 8. Clean up snapshots
> >>
> >>
> >> This seems ... too good to be true? I've seen so many threads about how
> >> hard it is to create backups in SolrCloud on this mailing list over the
> >> years, but this seems pretty straightforward? Am I missing some
> >> glaringly obvious reason why this will fail catastrophically?
> >>
> >> Using Solr 7.7 in this case.
> >>
> >> Feedback much appreciated!
> >>
> >> Thanks,
> >>
> >> - Bram
> >>
> >
>
>


Re: Backups in SolrCloud using snapshots of individual cores?

2020-08-11 Thread Erick Erickson
CDCR is being deprecated. so I wouldn’t suggest it for the long term.

> On Aug 10, 2020, at 9:33 PM, Ashwin Ramesh  wrote:
> 
> I would love an answer to this too!
> 
> On Fri, Aug 7, 2020 at 12:18 AM Bram Van Dam  wrote:
> 
>> Hey folks,
>> 
>> Been reading up about the various ways of creating backups. The whole
>> "shared filesystem for Solrcloud backups"-thing is kind of a no-go in
>> our environment, so I've been looking for ways around that, and here's
>> what I've come up with so far:
>> 
>> 1. Stop applications from writing to solr
>> 
>> 2. Commit everything
>> 
>> 3. Identify a single core for each shard in each collection
>> 
>> 4. Snapshot that core using CREATESNAPSHOT in the Collections API
>> 
>> 5. Once complete, re-enable application write access to Solr
>> 
>> 6. Create a backup from these snapshots using the replication handler's
>> backup function (replication?command=backup&commitName=mySnapshot)
>> 
>> 7. Put the backups somewhere safe
>> 
>> 8. Clean up snapshots
>> 
>> 
>> This seems ... too good to be true? I've seen so many threads about how
>> hard it is to create backups in SolrCloud on this mailing list over the
>> years, but this seems pretty straightforward? Am I missing some
>> glaringly obvious reason why this will fail catastrophically?
>> 
>> Using Solr 7.7 in this case.
>> 
>> Feedback much appreciated!
>> 
>> Thanks,
>> 
>> - Bram
>> 
> 



Re: Production Issue: TIMED_WAITING - Will net.ipv4.tcp_tw_reuse=1 help?

2020-08-11 Thread Doss
Hi Dominique,

Our issues are similar to the one discussed here.
https://github.com/eclipse/jetty.project/issues/4105

Your views on this.

Thanks,
Mohandoss.

On Tue, Aug 11, 2020 at 7:06 AM Doss  wrote:

> Hi Dominique,
>
> Thanks for the response.
>
> I don't think I would use a JVM version 14. OpenJDK 11 in my opinion is
> the best choice for LTS version.
>
> >> We will try changing it.
>
> You changed a lot of default values. Any specific reasons? It seems very
> aggressive!
>
> >> Our product team wants data to be reflected in near real time.
> >> mergePolicyFactory, mergeScheduler: this is based on our oldest Solr
> >> cluster, where tweaking these parameters gave good results.
>
> You have to analyze GC on all nodes !
>
> >> I checked the other nodes' GC and found no issues. I shared the GC log
> >> of the node which gets into trouble most frequently.
>
> Your heap is very big. Given the full GC frequency, I don't think you
> really need such a big heap for indexing only. Maybe you will once you
> start serving queries.
>
> >> Heap sizing is based on the select requests we are expecting. We expect
> >> around 10 to 15 million per day. We have plans to increase CPU before
> >> routing select traffic.
>
> Did you check your network performance?
>
> >> We did check the sar reports but were unable to find an issue; we use a
> >> 10 Gbps connection. Is there a Solr metrics API which will give
> >> network-related information? Please suggest other ways to dig further.
>
> Did you check the ZooKeeper logs?
>
> >> We never looked at the ZooKeeper logs; we will check and share. Is there
> >> any particular information to watch out for?
>
> Regards,
> Doss
>
>
> On Monday, August 10, 2020, Dominique Bejean 
> wrote:
>
>> Doss,
>>
>> See below.
>>
>> Dominique
>>
>>
>> On Mon, Aug 10, 2020 at 5:41 PM, Doss  wrote:
>>
>>> Hi Dominique,
>>>
>>> Thanks for your response. Find below the details, please do let me know
>>> if anything I missed.
>>>
>>>
>>> *- hardware architecture and sizing*
>>> >> CentOS 7, VMs, 4 CPUs, 66GB RAM, 16GB heap, 250GB SSD
>>>
>>>
>>> *- JVM version / settings*
>>> >> Red Hat, Inc. OpenJDK 64-Bit Server VM, version:"14.0.1 14.0.1+7" -
>>> Default Settings including GC
>>>
>>
>> I don't think I would use a JVM version 14. OpenJDK 11 in my opinion is
>> the best choice for LTS version.
>>
>>
>>>
>>> *- Solr settings*
>>> >> softCommit: 15000 (15 sec), autoCommit: 300000 (5 mins)
>>> >>
>>> >> <mergePolicyFactory class="org.apache.solr.index.TieredMergePolicyFactory">
>>> >>   <int name="maxMergeAtOnce">30</int>
>>> >>   <int name="segmentsPerTier">100</int>
>>> >>   <double name="?">30.0</double>
>>> >> </mergePolicyFactory>
>>> >>
>>> >> <mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler">
>>> >>   <int name="maxMergeCount">18</int>
>>> >>   <int name="maxThreadCount">6</int>
>>> >> </mergeScheduler>
>>>
>>
>> You changed a lot of default values. Any specific reasons? It seems very
>> aggressive!
>>
>>
>>>
>>>
>>> *- collections and queries information   *
>>> >> One collection, with 4 shards, 3 replicas, 3.5 million records, 150
>>> >> columns, mostly integer fields; average doc size is 350kb. Inserts/updates:
>>> >> 0.5 million spread across the whole day (peak time being 6PM to 10PM);
>>> >> selects not yet started. Once daily we do a delta import of certain
>>> >> multivalued fields with a good amount of data.
>>>
>>> *- gc logs or gceasy results*
>>>
>>> Easy GC Report says GC health is good, one server's gc report:
>>> https://drive.google.com/file/d/1C2SqEn0iMbUOXnTNlYi46Gq9kF_CmWss/view?usp=sharing
>>> CPU Load Pattern:
>>> https://drive.google.com/file/d/1rjRMWv5ritf5QxgbFxDa0kPzVlXdbySe/view?usp=sharing
>>>
>>>
>> You have to analyze GC on all nodes !
>> Your heap is very big. Given the full GC frequency, I don't think you
>> really need such a big heap for indexing only. Maybe you will once you
>> start serving queries.
>>
>> Did you check your network performance?
>> Did you check the ZooKeeper logs?
>>
>>
>>>
>>> Thanks,
>>> Doss.
>>>
>>>
>>>
>>> On Mon, Aug 10, 2020 at 7:39 PM Dominique Bejean <
>>> dominique.bej...@eolya.fr> wrote:
>>>
 Hi Doss,

 Seeing a lot of TIMED_WAITING connections is common in high-TCP-traffic
 infrastructure, as in a LAMP stack when the Apache server can no longer
 connect to the MySQL/MariaDB database.
 In that case, tweaking net.ipv4.tcp_tw_reuse is a possible solution (but
 never net.ipv4.tcp_tw_recycle, as you suggested in your previous post).
 This is well explained in this great article:
 https://vincent.bernat.ch/en/blog/2014-tcp-time-wait-state-linux
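
 (A sketch of checking and changing it; requires root:)

 sysctl net.ipv4.tcp_tw_reuse        # show the current value
 sysctl -w net.ipv4.tcp_tw_reuse=1   # allow reuse of TIME_WAIT sockets
                                     # for outgoing connections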

 However, in general and more specifically in your case, I would
 investigate
 the root cause of your issue and do not try to find a workaround.

 Can you provide more information about your use case (we know: a 3-node
 Solr cluster (8.3.1, NRT) + a 3-node ZooKeeper ensemble)?

- hardware architecture and sizing
- JVM version / settings
- Solr settings
- collections and queries information
- gc logs or gceasy results

 Regards

 Dominique



 On Mon, Aug 10, 2020 at 3:43 PM, Doss  wrote:
>>>

Re: Cannot add replica during backup

2020-08-11 Thread matthew sporleder
I can already tell you it is EFS that is slow. I had to switch to an EBS
disk for backups on a different project because EFS couldn't keep up.

> On Aug 10, 2020, at 9:43 PM, Ashwin Ramesh  wrote:
> 
> Hey Aroop, the general process for our backup is:
> - Connect all machines to an EFS drive (AWS's NFS service)
> - Call the collections API to backup into EFS
> - ZIP the directory once the backup is completed
> - Copy the ZIP into an s3 bucket
> 
> I'll probably have to see which part of the process is the slowest.
> 
> On another note, can you simply remove the task from the ZK path to
> continue the execution of tasks?
> 
> Regards,
> 
> Ash
> 
>> On Tue, Aug 11, 2020 at 11:40 AM Aroop Ganguly
>>  wrote:
>> 
>> 12 hours is extreme, we take backups of 10TB worth of indexes in 15 mins
>> using the collection backup api.
>> How are you taking the backup?
>> 
>> Do you actually see any backup progress, or are you just seeing the task
>> in the overseer queue linger?
>> I have seen restore tasks hanging in the queue forever despite the process
>> completing in Solr 7.7, so I wouldn't be surprised if this happens with
>> backup as well. I have also observed that unless that task is removed from
>> the overseer-collection-queue, the next ones do not proceed.
>> 
>> Also, adding replicas while a backup is running seems like overkill; why
>> don't you just have the appropriate replication factor in the first place
>> and set autoAddReplicas=true as insurance?
>> 
>>> On Aug 10, 2020, at 6:32 PM, Ashwin Ramesh 
>> wrote:
>>> 
>>> Hi everybody,
>>> 
>>> We are using Solr 7.6 (SolrCloud). We noticed that when the backup is
>>> running, we cannot add any replicas to the collection. By the looks of it,
>>> the job to add the replica is put into the Overseer queue, but it is not
>>> being processed. Is this expected? And are there any workarounds?
>>>
>>> Our backups take about 12 hours. Maybe we should try to optimize that too.
>>> 
>>> Regards,
>>> 
>>> Ash
>>> 
>> 
>> 
> 


Re: Solrcloud tlog are not deleted

2020-08-11 Thread Dominique Bejean
Hi,

Did you disable the CDCR buffer?
/solr/<collection>/cdcr?action=DISABLEBUFFER

You can check with "cdcr?action=STATUS"
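
For example, with a hypothetical collection name:

curl "http://localhost:8983/solr/mycoll/cdcr?action=DISABLEBUFFER"
curl "http://localhost:8983/solr/mycoll/cdcr?action=STATUS"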

Regards

Dominique


On Tue, Aug 11, 2020 at 10:57 AM, Michel Bamouni  wrote:

> Hello,
>
>
> We had set up synchronization between our Solr instances in 2 datacenters
> using CDCR.
> Until now everything worked fine, but after an upgrade from Solr 7.3 to
> Solr 7.7 we are facing an issue.
> Indeed, our tlog files are not deleted, even though we see the new values
> on both Solr clusters.
> It is as if the hard commit doesn't occur.
> In our solrconfig.xml file, we had configured the autocommit as below:
>
>
> <autoCommit>
>   <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
>   <openSearcher>false</openSearcher>
> </autoCommit>
>
>
> and the autoSoftCommit looks like this:
>
> <autoSoftCommit>
>   <maxTime>${solr.autoSoftCommit.maxTime:-1}</maxTime>
> </autoSoftCommit>
>
>
> If someone has already met this issue, I would appreciate your feedback.
>
>
> Best regards,
>
>
> Michel
>
>


Re: Backups in SolrCloud using snapshots of individual cores?

2020-08-11 Thread Dominique Bejean
Hi,

This procedure looks fine, but it is a little complex to automate.

Why not consider a backup based on CDCR for SolrCloud, or on replication
for standalone Solr?

For SolrCloud, CDCR can be configured with source and target collections in
the same SolrCloud cluster. The target collection can have its shards
located on dedicated nodes and its replication factor set to 1.

You need to be careful to locate the target nodes on separate hardware (VM
and storage), and ideally in separate geographical locations.

You will be able to achieve a very good RPO and RTO.
If the RTO requirement is not high, the nodes dedicated to the backup
destination can have little CPU and RAM.
If the RTO requirement is high, we can imagine the backup collection
becoming the live collection very quickly instead of running a restore, or
serving in a degraded, search-only mode during the restore.

Regards.

Dominique



On Thu, Aug 6, 2020 at 4:18 PM, Bram Van Dam  wrote:

> Hey folks,
>
> Been reading up about the various ways of creating backups. The whole
> "shared filesystem for Solrcloud backups"-thing is kind of a no-go in
> our environment, so I've been looking for ways around that, and here's
> what I've come up with so far:
>
> 1. Stop applications from writing to solr
>
> 2. Commit everything
>
> 3. Identify a single core for each shard in each collection
>
> 4. Snapshot that core using CREATESNAPSHOT in the Collections API
>
> 5. Once complete, re-enable application write access to Solr
>
> 6. Create a backup from these snapshots using the replication handler's
> backup function (replication?command=backup&commitName=mySnapshot)
>
> 7. Put the backups somewhere safe
>
> 8. Clean up snapshots
>
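> (A sketch of steps 4 and 6 with hypothetical collection/core names:)
>
> curl "http://localhost:8983/solr/admin/collections?action=CREATESNAPSHOT&collection=mycoll&commitName=mySnapshot"
> curl "http://localhost:8983/solr/mycoll_shard1_replica_n1/replication?command=backup&commitName=mySnapshot&location=/backups/mycoll"
>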
>
> This seems ... too good to be true? I've seen so many threads about how
> hard it is to create backups in SolrCloud on this mailing list over the
> years, but this seems pretty straightforward? Am I missing some
> glaringly obvious reason why this will fail catastrophically?
>
> Using Solr 7.7 in this case.
>
> Feedback much appreciated!
>
> Thanks,
>
>  - Bram
>


Solrcloud tlog are not deleted

2020-08-11 Thread Michel Bamouni
Hello,


We had set up synchronization between our Solr instances in 2 datacenters
using CDCR.
Until now everything worked fine, but after an upgrade from Solr 7.3 to
Solr 7.7 we are facing an issue.
Indeed, our tlog files are not deleted, even though we see the new values
on both Solr clusters.
It is as if the hard commit doesn't occur.
In our solrconfig.xml file, we had configured the autocommit as below:


<autoCommit>
  <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>


and the autoSoftCommit looks like this:

<autoSoftCommit>
  <maxTime>${solr.autoSoftCommit.maxTime:-1}</maxTime>
</autoSoftCommit>


If someone has already met this issue, I would appreciate your feedback.


Best regards,


Michel