timeouts when update sent to non-Leader

2021-02-05 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
We have a problem on a 3.5 GB collection running Solr7.4 (we will soon upgrade 
to Solr8.5.2).

Users were often encountering timeout errors of the type shown below

My colleague found a Stack Overflow post at 
https://stackoverflow.com/questions/62075353/solr-8-4-getting-async-exception-during-distributed-update-java-io-ioexception
which prompted him to ask the users to direct their updates to the host holding 
the Leader replica rather than to a host holding a non-Leader replica.

This seems to have provided a temporary solution; but should not SolrCloud be 
able to handle updates sent to any node of the SolrCloud? Generally our 
experience has been that SolrCloud is able to handle this as advertised; but 
has anyone else encountered (and perhaps corrected) this phenomenon?


In case it is helpful, the users' error message is below:

Error: (CSolrClientException::Http error)

status: -1
QTime: 656637
error-class: org.apache.solr.update.processor.DistributedUpdateProcessor$DistributedUpdatesAsyncException
root-error-class: org.apache.solr.update.processor.DistributedUpdateProcessor$DistributedUpdatesAsyncException
msg: Async exception during distributed update: Read timed out
trace:
org.apache.solr.update.processor.DistributedUpdateProcessor$DistributedUpdatesAsyncException: Async exception during distributed update: Read timed out
	at org.apache.solr.update.processor.DistributedUpdateProcessor.doFinish(DistributedUpdateProcessor.java:944)
	at org.apache.solr.update.processor.DistributedUpdateProcessor.finish(DistributedUpdateProcessor.java:1946)
	at org.apache.solr.update.processor.LogUpdateProcessorFactory$LogUpdateProcessor.finish(LogUpdateProcessorFactory.java:182)
	at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:78)
	at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:199)
	at org.apache.solr.core.SolrCore.execute(SolrCore.java:2539)
	at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:709)
	at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:515)
	at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:377)
	at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:323)
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1634)
	at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:146)
	at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
	at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:257)
	at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1595)
	at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:255)
	at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1253)
	at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203)
	at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
	at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564)
	at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:201)
	at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1155)
	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144)
	at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:219)
	at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)
	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
	at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
	at org.eclipse.jetty.server.Server.handle(Server.java:531)
	at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:352)
	at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:260)
	at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:281)
	at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:102)
	at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:118)
	at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333)
	at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310)
	at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)
	at 

RE: Authentication for all but selects

2021-02-05 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
What works for us is having something like this at the bottom of security.json:
  {
    "name":"open_select",
    "path":"/select/*",
    "role":null,
    "index":9},
  {
    "name":"catch-all-nocollection",
    "collection":null,
    "path":"/*",
    "role":"allgen",
    "index":10},
  {
    "name":"catch-all-collection",
    "path":"/*",
    "role":"allgen",
    "index":11}],
  "":{"v":9}}}

The clause named open_select specifically allows selects to run without any 
role ("role":null).

The last two clauses say that anything else (with or without a collection) 
requires the allgen role, which is a role that I grant to all users generally.

Other permissions can go higher up in security.json (disallowing normal users 
from running DELETEREPLICA, and the like); but these are the three clauses 
which I think should allow select without any login or password, while 
everything else does require a login and password.
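For anyone who wants the fuller picture, a minimal authorization section along these lines might look like the following. This is a sketch, not our actual file: the user name and the index values here are illustrative.

```json
{
  "authorization": {
    "class": "solr.RuleBasedAuthorizationPlugin",
    "user-role": { "someuser": ["allgen"] },
    "permissions": [
      { "name": "open_select",
        "path": "/select/*",
        "role": null,
        "index": 1 },
      { "name": "catch-all-nocollection",
        "collection": null,
        "path": "/*",
        "role": "allgen",
        "index": 2 },
      { "name": "catch-all-collection",
        "path": "/*",
        "role": "allgen",
        "index": 3 }
    ]
  }
}
```

The plugin checks permissions in order and the first match wins, so the open_select entry must come before the catch-alls.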

-Original Message-
From: Robert Douglas  
Sent: Friday, February 05, 2021 1:19 PM
To: solr-user@lucene.apache.org
Subject: Authentication for all but selects

Hello all,

We are working on some migrations and we want to be incorporating 
authentication more uniformly across all our installations of Solr, but we are 
getting stuck on allowing Select statements through without authentication 
while having authentication on with RBAP for everything else. For some of our 
apps the authentication for Selects isn’t an issue but for others, where we 
can’t really touch the code, it is.

Is there a way of doing this?

Cheers,
R

Robert Douglas
DevOps Engineer
Cornell University Library


Cores renamed

2021-01-28 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
We recently have had a few occasions when cores for one specific collection 
were renamed (or more likely dropped and recreated, and thus ended up with a 
different core name).

Is this a known phenomenon? Is there any explanation?

It may be relevant that we just recently started running this SolrCloud on 
version 8.5.2, although the collection was created under Solr7.4. Also, this 
collection seems to experience some heavy updates such that the non-Leader 
replica has trouble keeping up. One of these renames occurred at 4:33am, so I 
highly suspect that the rename (or drop and recreate) was done by some internal 
Solr thread rather than by any of my coworkers. One other potential clue is 
that I can see that /solr/admin/cores?action=REQUESTRECOVERY was usually run on 
the new core a moment after it was created.

Does anyone have any insights?


RE: disallowing delete through security.json

2021-01-12 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
Does anyone yet have any examples or suggestions for using the "method" section 
in lucene.apache.org/solr/guide/8_4/rule-based-authorization-plugin.html ?

Also, if anyone has any other suggestions of how to provide high availability 
while completely dropping and recreating and reloading a large collection (as 
required in order to complete the upgrade to a new release), let me know.

-Original Message-----
From: Oakley, Craig (NIH/NLM/NCBI) [C]  
Sent: Tuesday, November 24, 2020 1:56 PM
To: solr-user@lucene.apache.org
Subject: RE: disallowing delete through security.json

Thank you for the response

The use case I have in mind is trying to approximate incremental updates (as 
are available in Sybase or MSSQL, to which I am more accustomed).

We are wanting to upgrade a large collection from Solr7.4 to Solr8.5. It turns 
out that Solr8.5 cannot run against the current data, because the collection 
was created under Solr6.6. We want to migrate in such a way that, in a year or 
so, we will be able to migrate to Solr9 without worrying about Solr7.4 let 
alone Solr6.6. We want to create a new collection (of the same name) in a brand 
new Solr8.5 SolrCloud, and then to select everything from the current Solr7.4 
collection in json format and load it into the new Solr8.5 collection. All of 
the fields have stored="true", with the exception of fields populated by 
copyField. The select will be done by ranges of id values, so as to avoid 
OutOfMemory errors. That process will take several days; and in the meanwhile, 
users will be continuing to add data. When all the data will have been copied 
(including that which is described below), we can switch port numbers so that 
the new Solr8.5 SolrCloud takes the place of the old Solr7.4 SolrCloud.

The plan is to find a value of _version_ (call it V1) which was in the Solr7.4 
collection when we started the first select, but which is greater than almost 
all values of _version_ in the collection (we are fine with having an overlap 
of _version_ values, but we want to avoid losing anything by having a gap in 
_version_ values). After the initial selects are complete, we can run other 
selects by ranges of id with the additional criteria that the _version_ will be 
no lower than the V1 value. As we have seen in test runs, this will involve 
less data and will run faster. We will also keep note of a new value of 
_version_ (call it V2) which was in the Solr7.4 collection when we start the V1 
select, but which is greater than almost all values of _version_ in the V1 
select. Following this procedure through various iterations (V3, V4, however 
many it takes), we can load the V1 set of selects when we will have completed 
the loading of the initial set of selects. We can then load the V2 set of 
selects when we will have completed the loading of the V1 set of selects. The 
plan is that the selecting and loading of the last Vn set of selects will 
involve a maintenance window measured in minutes rather than in days.

The users claim that they never do deletes: which is good, because a delete 
would be something which would be missed by this plan. If (as you describe) the 
users were to update a record so that only the id field (and the _version_ 
field) are left, that update would get picked up by one of these incremental 
selects and would be applied to the new collection. A delete, however, would 
not be noticed: and the new Solr8.5 collection would still have the record 
which had been deleted from the old Solr7.4 collection. The users claim that 
they never do deletes: but it would seem safer to actually disallow deletes 
during the maintenance.

Let me know if you have any suggestions.

Thank you again for your reply.


-Original Message-
From: Jason Gerlowski  
Sent: Tuesday, November 24, 2020 12:35 PM
To: solr-user@lucene.apache.org
Subject: Re: disallowing delete through security.json

Hey Craig,

I think this will be tricky to do with the current Rule-Based
Authorization support.  As you pointed out in your initial post -
there are lots of ways to delete documents.  The Rule-Based Auth code
doesn't inspect request bodies (AFAIK), so it's going to have trouble
differentiating between traditional "/update" requests with
method=POST that are request-body driven.

But to zoom out a bit, does it really make sense to lock down deletes,
but not updates more broadly?  After all, "updates" can remove and add
fields.  Users might submit an update that strips everything but "id"
from your documents.  In many/most usecases that'd be equally
concerning.  Just wondering what your usecase is - if it's generally
applicable this is probably worth a JIRA ticket.

Best,

Jason

On Thu, Nov 19, 2020 at 10:34 AM Oakley, Craig (NIH/NLM/NCBI) [C]
 wrote:
>
> Having not heard back, I thought I would ask again whether anyone else has 
> been able to use security.json to disallow deletes, and/or if anyone has 
> 

RE: disallowing delete through security.json

2020-11-24 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
Thank you for the response

The use case I have in mind is trying to approximate incremental updates (as 
are available in Sybase or MSSQL, to which I am more accustomed).

We are wanting to upgrade a large collection from Solr7.4 to Solr8.5. It turns 
out that Solr8.5 cannot run against the current data, because the collection 
was created under Solr6.6. We want to migrate in such a way that, in a year or 
so, we will be able to migrate to Solr9 without worrying about Solr7.4 let 
alone Solr6.6. We want to create a new collection (of the same name) in a brand 
new Solr8.5 SolrCloud, and then to select everything from the current Solr7.4 
collection in json format and load it into the new Solr8.5 collection. All of 
the fields have stored="true", with the exception of fields populated by 
copyField. The select will be done by ranges of id values, so as to avoid 
OutOfMemory errors. That process will take several days; and in the meanwhile, 
users will be continuing to add data. When all the data will have been copied 
(including that which is described below), we can switch port numbers so that 
the new Solr8.5 SolrCloud takes the place of the old Solr7.4 SolrCloud.

The plan is to find a value of _version_ (call it V1) which was in the Solr7.4 
collection when we started the first select, but which is greater than almost 
all values of _version_ in the collection (we are fine with having an overlap 
of _version_ values, but we want to avoid losing anything by having a gap in 
_version_ values). After the initial selects are complete, we can run other 
selects by ranges of id with the additional criteria that the _version_ will be 
no lower than the V1 value. As we have seen in test runs, this will involve 
less data and will run faster. We will also keep note of a new value of 
_version_ (call it V2) which was in the Solr7.4 collection when we start the V1 
select, but which is greater than almost all values of _version_ in the V1 
select. Following this procedure through various iterations (V3, V4, however 
many it takes), we can load the V1 set of selects when we will have completed 
the loading of the initial set of selects. We can then load the V2 set of 
selects when we will have completed the loading of the V1 set of selects. The 
plan is that the selecting and loading of the last Vn set of selects will 
involve a maintenance window measured in minutes rather than in days.
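For concreteness, the windowed selects described above can be parameterized roughly as follows. This is a sketch only: the id ranges and version value shown are illustrative, and only the field names id and _version_ come from the plan above.

```python
def version_window_queries(id_ranges, min_version=None):
    """Build Solr /select parameter dicts for one export pass.

    The first pass runs with no _version_ filter; each later pass adds
    a filter at the remembered V-value (V1, V2, ...) so that only
    documents added or updated since the previous pass are re-exported.
    """
    queries = []
    for lo, hi in id_ranges:
        params = {
            "q": "id:[%s TO %s]" % (lo, hi),  # one id range per request, to avoid OOM
            "wt": "json",
        }
        if min_version is not None:
            # Overlapping _version_ ranges are harmless duplicates on reload;
            # a gap in _version_ coverage would silently lose documents.
            params["fq"] = "_version_:[%d TO *]" % min_version
        queries.append(params)
    return queries

# Initial full export, then an incremental pass filtered at an illustrative V1:
first_pass = version_window_queries([("0", "999999")])
v1_pass = version_window_queries([("0", "999999")], min_version=1684940000000000000)
```

Each dict would then be sent to /select on the Solr7.4 collection and the JSON response posted to /update on the Solr8.5 collection.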

The users claim that they never do deletes: which is good, because a delete 
would be something which would be missed by this plan. If (as you describe) the 
users were to update a record so that only the id field (and the _version_ 
field) are left, that update would get picked up by one of these incremental 
selects and would be applied to the new collection. A delete, however, would 
not be noticed: and the new Solr8.5 collection would still have the record 
which had been deleted from the old Solr7.4 collection. The users claim that 
they never do deletes: but it would seem safer to actually disallow deletes 
during the maintenance.

Let me know if you have any suggestions.

Thank you again for your reply.


-Original Message-
From: Jason Gerlowski  
Sent: Tuesday, November 24, 2020 12:35 PM
To: solr-user@lucene.apache.org
Subject: Re: disallowing delete through security.json

Hey Craig,

I think this will be tricky to do with the current Rule-Based
Authorization support.  As you pointed out in your initial post -
there are lots of ways to delete documents.  The Rule-Based Auth code
doesn't inspect request bodies (AFAIK), so it's going to have trouble
differentiating between traditional "/update" requests with
method=POST that are request-body driven.

But to zoom out a bit, does it really make sense to lock down deletes,
but not updates more broadly?  After all, "updates" can remove and add
fields.  Users might submit an update that strips everything but "id"
from your documents.  In many/most usecases that'd be equally
concerning.  Just wondering what your usecase is - if it's generally
applicable this is probably worth a JIRA ticket.

Best,

Jason

On Thu, Nov 19, 2020 at 10:34 AM Oakley, Craig (NIH/NLM/NCBI) [C]
 wrote:
>
> Having not heard back, I thought I would ask again whether anyone else has 
> been able to use security.json to disallow deletes, and/or if anyone has 
> examples of using the "method" section in 
> lucene.apache.org/solr/guide/8_4/rule-based-authorization-plugin.html
>
> -Original Message-
> From: Oakley, Craig (NIH/NLM/NCBI) [C] 
> Sent: Monday, October 26, 2020 6:23 PM
> To: solr-user@lucene.apache.org
> Subject: disallowing delete through security.json
>
> I am interested in disallowing delete through security.json
>
> After seeing the "method" section in 
> lucene.apache.org/solr/guide/8_4/rule-based-authorization-plugin.html my 
> first attempt was as follows:

RE: disallowing delete through security.json

2020-11-19 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
Having not heard back, I thought I would ask again whether anyone else has been 
able to use security.json to disallow deletes, and/or if anyone has examples of 
using the "method" section in 
lucene.apache.org/solr/guide/8_4/rule-based-authorization-plugin.html

-Original Message-----
From: Oakley, Craig (NIH/NLM/NCBI) [C]  
Sent: Monday, October 26, 2020 6:23 PM
To: solr-user@lucene.apache.org
Subject: disallowing delete through security.json

I am interested in disallowing delete through security.json

After seeing the "method" section in 
lucene.apache.org/solr/guide/8_4/rule-based-authorization-plugin.html my first 
attempt was as follows:

{"set-permission":{
  "name":"NO_delete",
  "path":["/update/*","/update"],
  "collection":"col_name",
  "role":"NoSuchRole",
  "method":"DELETE",
  "before":4}}

I found, however, that this did not disallow deletes: I could still run
curl -u ... "http://.../solr/col_name/update?commit=true" --data 
"<delete><query>id:11</query></delete>"

After further experimentation, I seemed to have success with
{"set-permission":
{"name":"NO_delete6",
"path":"/update/*",
"collection":"col_name",
"role":"NoSuchRole",
"method":["REGEX:(?i)DELETE"],
"before":4}}

My initial impression was that this did what I wanted; but now I find that this 
disallows *any* updates to this collection (which had previously been allowed). 
Other attempts to tweak this strategy, such as granting permissions for 
"/update/*" for methods other than DELETE to a role which is granted to the 
desired user, have not yet been successful.

Does anyone have an example of security.json disallowing a delete while still 
allowing an update?

Thanks


intermittent log rotation bug in Solr8.5.2?

2020-11-09 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
We are in the process of upgrading from Solr7.4.0 to Solr8.5.2, and we are 
experiencing an intermittent problem with log rotation.

Our log4j2.xml file (both for Solr7.4.0 and for Solr8.5.2) includes the 
following RollingFile appender (attributes elided):

<RollingFile ...>
  <PatternLayout>
    <Pattern>%d{yyyy-MM-dd HH:mm:ss.SSS} %-5p (%t) [%X{collection} %X{shard} %X{replica} %X{core}] %c{1.} %m%n</Pattern>
  </PatternLayout>
  <Policies>
    ...
  </Policies>
  <DefaultRolloverStrategy max="${sys:log.backup.index}"/>
</RollingFile>

When we start our Solr nodes, we pass in "-Dlog.backup.index=99"

Last month, we had a problem in that, on those nodes which have been upgraded 
to Solr8.5.2, only the two most recent solr.log files were being retained. That 
problem went away in the middle of last month, but has reappeared over the 
weekend. Log rotation has consistently been correct on those nodes which are 
still running Solr7.4.0.

Has anyone else encountered any similar phenomenon? Has Solr8 introduced some 
limitation on DefaultRolloverStrategy max which varies by the day of the month? 
(or by the phase of the moon? Or something else?)


disallowing delete through security.json

2020-10-26 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
I am interested in disallowing delete through security.json

After seeing the "method" section in 
lucene.apache.org/solr/guide/8_4/rule-based-authorization-plugin.html my first 
attempt was as follows:

{"set-permission":{
  "name":"NO_delete",
  "path":["/update/*","/update"],
  "collection":"col_name",
  "role":"NoSuchRole",
  "method":"DELETE",
  "before":4}}

I found, however, that this did not disallow deletes: I could still run
curl -u ... "http://.../solr/col_name/update?commit=true" --data 
"<delete><query>id:11</query></delete>"

After further experimentation, I seemed to have success with
{"set-permission":
{"name":"NO_delete6",
"path":"/update/*",
"collection":"col_name",
"role":"NoSuchRole",
"method":["REGEX:(?i)DELETE"],
"before":4}}

My initial impression was that this did what I wanted; but now I find that this 
disallows *any* updates to this collection (which had previously been allowed). 
Other attempts to tweak this strategy, such as granting permissions for 
"/update/*" for methods other than DELETE to a role which is granted to the 
desired user, have not yet been successful.

Does anyone have an example of security.json disallowing a delete while still 
allowing an update?

Thanks
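For reference, the kind of method-split pair I have been experimenting with looks roughly like this (untested as shown; the updater role name is illustrative). Note that an HTTP-method match can only catch requests sent with the DELETE verb, not deletes carried in the body of a POST to /update, which may be why the REGEX variant over-matched:

```json
[
  { "name": "allow-update-not-delete",
    "path": ["/update", "/update/*"],
    "method": ["GET", "POST", "PUT"],
    "role": "updater",
    "index": 1 },
  { "name": "NO_delete",
    "path": ["/update", "/update/*"],
    "method": ["DELETE"],
    "role": "NoSuchRole",
    "index": 2 }
]
```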


RE: Master/Slave

2020-10-06 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
> it better not ever be deprecated.  it has been the most reliable mechanism 
> for its purpose

I would like to know whether that is the consensus of Solr developers.

We had been scrambling to move from Master/Slave to CDCR based on the assertion 
that CDCR support would last far longer than Master/Slave support.

Can we now assume safely that this assertion is now completely moot? Can we now 
assume safely that Master/Slave is likely to be supported for the foreseeable 
future? Or are we forced to assume that Master/Slave support will evaporate 
shortly after the now-evaporated CDCR support?

-Original Message-
From: David Hastings  
Sent: Wednesday, September 30, 2020 3:10 PM
To: solr-user@lucene.apache.org
Subject: Re: Master/Slave

>whether we should expect Master/Slave replication also to be deprecated

it better not ever be deprecated.  it has been the most reliable mechanism
for its purpose; SolrCloud isn't going to replace standalone, and if it does,
that's when I guess I stop upgrading or move to Elastic

On Wed, Sep 30, 2020 at 2:58 PM Oakley, Craig (NIH/NLM/NCBI) [C]
 wrote:

> Based on the thread below (reading "legacy" as meaning "likely to be
> deprecated in later versions"), we have been working to extract ourselves
> from Master/Slave replication
>
> Most of our collections need to be in two data centers (a read/write copy
> in one local data center: the disaster-recovery-site SolrCloud could be
> read-only). We also need redundancy within each data center for when one
> host or another is unavailable. We implemented this by having different
> SolrClouds in the different data centers; with Master/Slave replication
> pulling data from one of the read/write replicas to each of the Slave
> replicas in the disaster-recovery-site read-only SolrCloud. Additionally,
> for some collections, there is a desire to have local read-only replicas
> remain unchanged for querying during the loading process: for these
> collections, there is a local read/write loading SolrCloud, a local
> read-only querying SolrCloud (normally configured for Master/Slave
> replication from one of the replicas of the loader SolrCloud to both
> replicas of the query SolrCloud, but with Master/Slave disabled when the
> load was in progress on the loader SolrCloud, and with Master/Slave resumed
> after the loaded data passes QA checks).
>
> Based on the thread below, we made an attempt to switch to CDCR. The main
> reason for wanting to change was that CDCR was said to be the supported
> mechanism, and the replacement for Master/Slave replication.
>
> After multiple unsuccessful attempts to get CDCR to work, we ended up with
> reproducible cases of CDCR losing data in transit. In June, I initiated a
> thread in this group asking for clarification of how/whether CDCR could be
> made reliable. This seemed to me to be met with deafening silence until the
> announcement in July of the release of Solr8.6 and the deprecation of CDCR.
>
> So we are left with the question whether we should expect Master/Slave
> replication also to be deprecated; and if so, with what is it expected to
> be replaced (since not with CDCR)? Or is it now sufficiently safe to assume
> that Master/Slave replication will continue to be supported after all
> (since the assertion that it would be replaced by CDCR has been
> discredited)? In either case, are there other suggested implementations of
> having a read-only SolrCloud receive data from a read/write SolrCloud?
>
>
> Thanks
>
> -Original Message-
> From: Shawn Heisey 
> Sent: Tuesday, May 21, 2019 11:15 AM
> To: solr-user@lucene.apache.org
> Subject: Re: SolrCloud (7.3) and Legacy replication slaves
>
> On 5/21/2019 8:48 AM, Michael Tracey wrote:
> > Is it possible to set up an existing SolrCloud cluster as the master for
> > legacy replication to a slave server or two?   It looks like another
> option
> > is to use Uni-direction CDCR, but not sure what is the best option in
> this
> > case.
>
> You're asking for problems if you try to combine legacy replication with
> SolrCloud.  The two features are not guaranteed to work together.
>
> CDCR is your best bet.  This replicates from one SolrCloud cluster to
> another.
>
> Thanks,
> Shawn
>


Master/Slave

2020-09-30 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
Based on the thread below (reading "legacy" as meaning "likely to be deprecated 
in later versions"), we have been working to extract ourselves from 
Master/Slave replication

Most of our collections need to be in two data centers (a read/write copy in 
one local data center: the disaster-recovery-site SolrCloud could be 
read-only). We also need redundancy within each data center for when one host 
or another is unavailable. We implemented this by having different SolrClouds 
in the different data centers; with Master/Slave replication pulling data from 
one of the read/write replicas to each of the Slave replicas in the 
disaster-recovery-site read-only SolrCloud. Additionally, for some collections, 
there is a desire to have local read-only replicas remain unchanged for 
querying during the loading process: for these collections, there is a local 
read/write loading SolrCloud, a local read-only querying SolrCloud (normally 
configured for Master/Slave replication from one of the replicas of the loader 
SolrCloud to both replicas of the query SolrCloud, but with Master/Slave 
disabled when the load was in progress on the loader SolrCloud, and with 
Master/Slave resumed after the loaded data passes QA checks).
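For context, each Master/Slave link in this layout is just the standard ReplicationHandler configuration on the pull side, along these lines (the masterUrl host, core name, and poll interval here are placeholders):

```xml
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <!-- replica of the read/write (loader) SolrCloud to pull from -->
    <str name="masterUrl">http://loader-host:8983/solr/mycoll_shard1_replica_n1</str>
    <!-- how often the slave polls the master for a newer index generation -->
    <str name="pollInterval">00:05:00</str>
  </lst>
</requestHandler>
```

Polling can be paused and resumed at run time with the replication handler's disablepoll and enablepoll commands, which is how the query-side SolrCloud is held stable while a load is in progress.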

Based on the thread below, we made an attempt to switch to CDCR. The main 
reason for wanting to change was that CDCR was said to be the supported 
mechanism, and the replacement for Master/Slave replication.

After multiple unsuccessful attempts to get CDCR to work, we ended up with 
reproducible cases of CDCR losing data in transit. In June, I initiated a 
thread in this group asking for clarification of how/whether CDCR could be made 
reliable. This seemed to me to be met with deafening silence until the 
announcement in July of the release of Solr8.6 and the deprecation of CDCR.

So we are left with the question whether we should expect Master/Slave 
replication also to be deprecated; and if so, with what is it expected to be 
replaced (since not with CDCR)? Or is it now sufficiently safe to assume that 
Master/Slave replication will continue to be supported after all (since the 
assertion that it would be replaced by CDCR has been discredited)? In either 
case, are there other suggested implementations of having a read-only SolrCloud 
receive data from a read/write SolrCloud?


Thanks

-Original Message-
From: Shawn Heisey  
Sent: Tuesday, May 21, 2019 11:15 AM
To: solr-user@lucene.apache.org
Subject: Re: SolrCloud (7.3) and Legacy replication slaves

On 5/21/2019 8:48 AM, Michael Tracey wrote:
> Is it possible to set up an existing SolrCloud cluster as the master for
> legacy replication to a slave server or two?   It looks like another option
> is to use Uni-direction CDCR, but not sure what is the best option in this
> case.

You're asking for problems if you try to combine legacy replication with 
SolrCloud.  The two features are not guaranteed to work together.

CDCR is your best bet.  This replicates from one SolrCloud cluster to 
another.

Thanks,
Shawn


RE: CDCR stress-test issues

2020-07-17 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
Yes, I saw that yesterday.

I guess that I was not the only one who noticed the unreliability after all.

-Original Message-
From: Ishan Chattopadhyaya  
Sent: Friday, July 17, 2020 1:17 AM
To: solr-user 
Subject: Re: CDCR stress-test issues

FYI, CDCR support, as it exists in Solr today, has been deprecated in 8.6.
It suffers from serious design flaws and it allows such things to happen
that you observe. While there may be workarounds, it is advisable to not
rely on CDCR in production.

Thanks,
Ishan

On Thu, 2 Jul, 2020, 1:12 am Oakley, Craig (NIH/NLM/NCBI) [C],
 wrote:

> For the record, it is not just Solr7.4 which has the problem. When I start
> afresh with Solr8.5.2, both symptoms persist.
>
> With Solr8.5.2, tlogs accumulate endlessly at the non-Leader nodes of the
> Source SolrCloud and are never released regardless of maxNumLogsToKeep
> setting
>
> And with Solr8.5.2, if four scripts run simultaneously for a few minutes,
> each script running a loop each iteration of which adds batches of 6
> records to the Source SolrCloud, a couple dozen records wind up on the
> Source without ever arriving at the Target SolrCloud (although the Target
> does have records which were added after the missing records).
>
> Does anyone yet have any suggestion how to get CDCR to work properly?
>
>
> -----Original Message-
> From: Oakley, Craig (NIH/NLM/NCBI) [C] 
> Sent: Wednesday, June 24, 2020 9:46 AM
> To: solr-user@lucene.apache.org
> Subject: CDCR stress-test issues
>
> In attempting to stress-test CDCR (running Solr 7.4), I am running into a
> couple of issues.
>
> One is that the tlog files keep accumulating for some nodes in the CDCR
> system, particularly for the non-Leader nodes in the Source SolrCloud. No
> quantity of hard commits seem to cause any of these tlog files to be
> released. This can become a problem upon reboot if there are hundreds of
> thousands of tlog files, and Solr fails to start (complaining that there
> are too many open files).
>
> The tlogs had been accumulating on all the nodes of the CDCR set of
> SolrClouds until I added these two lines to the solrconfig.xml file (for
> testing purposes, using numbers much lower than in the examples):
> <int name="numRecordsToKeep">5</int>
> <int name="maxNumLogsToKeep">2</int>
> Since then, it is mostly the non-Leader nodes of the Source SolrCloud
> which accumulates tlog files (the Target SolrCloud does seem to have a
> tendency to clean up the tlog files, as does the Leader of the Source
> SolrCloud). If I use ADDREPLICAPROP and REBALANCELEADERS to change which
> node is the Leader, and if I then start adding more data, the tlogs on the
> new Leader sometimes will go away, but then the old Leader begins
> accumulating tlog files. I am dubious whether frequent reassignment of
> Leadership would be a practical solution.
>
> I also have several times attempted to simulate a production environment
> by running several loops simultaneously, each of which inserts multiple
> records on each iteration of the loop. Several times, I end up with a dozen
> records on (both replicas of) the Source which never make it to (either
> replica of) the Target. The Target has thousands of records which were
> inserted before the missing records, and thousands of records which were
> inserted after the missing records (and all these records, the replicated
> and the missing, were inserted by curl commands which only differed in
> sequential numbers incorporated into the values being inserted).
>
> I also have a question regarding SOLR-13141: the 11/Feb/19 comment says
> that the fix for Solr 7.3 had a problem; and the header says "Affects
> Version/s: 7.5, 7.6": does that indicate that Solr 7.4 is not affected?
>
> Are  there any suggestions?
>
> Thanks
>


RE: CDCR stress-test issues

2020-07-01 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
For the record, it is not just Solr7.4 which has the problem. When I start 
afresh with Solr8.5.2, both symptoms persist.

With Solr8.5.2, tlogs accumulate endlessly at the non-Leader nodes of the 
Source SolrCloud and are never released, regardless of the maxNumLogsToKeep setting

And with Solr8.5.2, if four scripts run simultaneously for a few minutes, each 
looping and adding a batch of six records to the Source SolrCloud per iteration, 
a couple dozen records wind up on the Source without ever arriving at the Target 
SolrCloud (although the Target does have records which were added after the 
missing records).

Does anyone yet have any suggestions for how to get CDCR to work properly?



CDCR stress-test issues

2020-06-24 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
In attempting to stress-test CDCR (running Solr 7.4), I am running into a 
couple of issues.

One is that the tlog files keep accumulating for some nodes in the CDCR system, 
particularly for the non-Leader nodes in the Source SolrCloud. No quantity of 
hard commits seems to cause any of these tlog files to be released. This can 
become a problem upon reboot: if there are hundreds of thousands of tlog files, 
Solr fails to start (complaining that there are too many open files).

The tlogs had been accumulating on all the nodes of the CDCR set of SolrClouds 
until I added these two lines to the solrconfig.xml file (for testing purposes, 
using numbers much lower than in the examples):
<int name="numRecordsToKeep">5</int>
<int name="maxNumLogsToKeep">2</int>
Since then, it is mostly the non-Leader nodes of the Source SolrCloud which 
accumulate tlog files (the Target SolrCloud does seem to have a tendency to 
clean up the tlog files, as does the Leader of the Source SolrCloud). If I use 
ADDREPLICAPROP and REBALANCELEADERS to change which node is the Leader, and if 
I then start adding more data, the tlogs on the new Leader sometimes will go 
away, but then the old Leader begins accumulating tlog files. I am dubious 
whether frequent reassignment of Leadership would be a practical solution.
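
For readers wanting to reproduce the Leader reassignment described above, here is 
a sketch of the two Collections API calls involved; the host, collection, shard, 
and replica names are illustrative, not taken from the poster's setup:

```python
from urllib.parse import urlencode

BASE = "http://localhost:8983/solr/admin/collections"  # assumed host/port

def addreplicaprop_url(collection, shard, replica):
    # ADDREPLICAPROP flags one replica with the preferredLeader property
    params = {
        "action": "ADDREPLICAPROP",
        "collection": collection,
        "shard": shard,
        "replica": replica,
        "property": "preferredLeader",
        "property.value": "true",
    }
    return BASE + "?" + urlencode(params)

def rebalanceleaders_url(collection):
    # REBALANCELEADERS then shifts leadership onto preferredLeader replicas
    return BASE + "?" + urlencode({"action": "REBALANCELEADERS",
                                   "collection": collection})

print(addreplicaprop_url("source_col", "shard1", "core_node2"))
print(rebalanceleaders_url("source_col"))
```

ADDREPLICAPROP is issued first for the replica that should lead; 
REBALANCELEADERS then moves leadership accordingly.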

I also have several times attempted to simulate a production environment by 
running several loops simultaneously, each of which inserts multiple records on 
each iteration of the loop. Several times, I end up with a dozen records on 
(both replicas of) the Source which never make it to (either replica of) the 
Target. The Target has thousands of records which were inserted before the 
missing records, and thousands of records which were inserted after the missing 
records (and all these records, the replicated and the missing, were inserted 
by curl commands which only differed in sequential numbers incorporated into 
the values being inserted).

I also have a question regarding SOLR-13141: the 11/Feb/19 comment says that 
the fix for Solr 7.3 had a problem; and the header says "Affects Version/s: 
7.5, 7.6": does that indicate that Solr 7.4 is not affected?

Are there any suggestions?

Thanks


Re: [EXTERNAL] Getting rid of Master/Slave nomenclature in Solr

2020-06-18 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
Trying to think of a term that is both fresh (and yet sort of standard already) 
and appropriate: how about IndexFetcher instead of "Slave"? And then "Master" 
could be "FetchedIndex" or "FetchedSource"

I think it could be beneficial to broaden the range of candidates.

From: Walter Underwood 
Sent: Thursday, June 18, 2020 10:34 PM
To: solr-user@lucene.apache.org 
Subject: Re: [EXTERNAL] Getting rid of Master/Slave nomenclature in Solr

We don’t get to decide whether “master” is a problem. The rest of the world
has already decided that it is a problem.

Our task is to replace the terms “master” and “slave” in Solr.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Jun 18, 2020, at 6:50 PM, Rahul Goswami  wrote:
>
> I agree with Phill, Noble and Ilan above. The problematic term is "slave"
> (not master) which I am all for changing if it causes less regression than
> removing BOTH master and slave. Since some people have pointed out Github
> changing the "master" terminology, in my personal opinion, it was not a
> measured response to addressing the bigger problem we are all trying to
> tackle. There is no concept of a "slave" branch, and "master" by itself is
> a pretty generic term (Is someone having "mastery" over a skill a bad
> thing?). I fear all it would end up achieving with Github is a mess of broken
> build scripts at best.
> So +1 on "slave" being the problematic term IMO, not "master".
>
> On Thu, Jun 18, 2020 at 8:19 PM Phill Campbell
>  wrote:
>
>> Master - Worker
>> Master - Peon
>> Master - Helper
>> Master - Servant
>>
>> The term that is not wanted is “slave’. The term “master” is not a problem
>> IMO.
>>
>>> On Jun 18, 2020, at 3:59 PM, Jan Høydahl  wrote:
>>>
>>> I support Mike Drob and Trey Grainger. We should re-use the
>> leader/replica
>>> terminology from Cloud. Even if you hand-configure a master/slave cluster
>>> and orchestrate what doc goes to which node/shard, and hand-code your
>> shards
>>> parameter, you will still have a cluster where you’d send updates to the
>> leader of
>>> each shard and the replicas would replicate the index from the leader.
>>>
>>> Let’s instead find a new good name for the cluster type. Standalone kind
>> of works
>>> for me, but I see it can be confused with single-node. We have also
>> discussed
>>> replacing SolrCloud (which is a terrible name) with something more
>> descriptive.
>>>
>>> Today: SolrCloud vs Master/slave
>>> Alt A: SolrCloud vs Standalone
>>> Alt B: SolrCloud vs Legacy
>>> Alt C: Clustered vs Independent
>>> Alt D: Clustered vs Manual mode
>>>
>>> Jan
>>>
 18. jun. 2020 kl. 15:53 skrev Mike Drob :

 I personally think that using Solr cloud terminology for this would be
>> fine
 with leader/follower. The leader is the one that accepts updates,
>> followers
 cascade the updates somehow. The presence of ZK or election doesn’t
>> really
 change this detail.

 However, if folks feel that it’s confusing, then I can’t tell them that
 they’re not confused. Especially when they’re working with others who
>> have
 less Solr experience than we do and are less familiar with the
>> intricacies.

 Primary/Replica seems acceptable. Coordinator instead of Overseer seems
 acceptable.

 Would love to see this in 9.0!

 Mike

 On Thu, Jun 18, 2020 at 8:25 AM John Gallagher
  wrote:

> While on the topic of renaming roles, I'd like to propose finding a
>> better
> term than "overseer" which has historical slavery connotations as well.
> Director, perhaps?
>
>
> John Gallagher
>
> On Thu, Jun 18, 2020 at 8:48 AM Jason Gerlowski >>
> wrote:
>
>> +1 to rename master/slave, and +1 to choosing terminology distinct
>> from what's used for SolrCloud.  I could be happy with several of the
>> proposed options.  Since a good few have been proposed though, maybe
>> an eventual vote thread is the most organized way to aggregate the
>> opinions here.
>>
>> I'm less positive about the prospect of changing the name of our
>> primary git branch.  Most projects that contributors might come from,
>> most tutorials out there to learn git, most tools built on top of git
>> - the majority are going to assume "master" as the main branch.  I
>> appreciate the change that Github is trying to effect in changing the
>> default for new projects, but it'll be a long time before that
>> competes with the huge bulk of projects, documentation, etc. out there
>> using "master".  Our contributors are smart and I'm sure they'd figure
>> it out if we used "main" or something else instead, but having a
>> non-standard git setup would be one more "papercut" in understanding
>> how to contribute to a project that already makes that harder than it
>> should.
>>
>> Jason
>>
>>
>> On Thu, Jun 18, 

RE: No files to download for index generation

2020-03-30 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
I wanted to ask *yet again* whether anyone could please clarify what this error 
means.

The wording could be interpreted as a benign "I found that there was nothing 
which needed to be done after all"; but were that to be the meaning of this 
error, why would it be flagged as an ERROR rather than as INFO or WARN ?

Please advise



RE: No files to download for index generation

2020-03-11 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
I wanted to ask *again* whether anyone has any insight regarding this message

There seem to have been several people asking the question on this forum 
(Markus Jelsma on 8/23/19, Akreeti Agarwal on 12/27/19 and Vadim Ivanov on 
12/29/19)

The only response I have seen was five words from Erick Erickson on 12/27/19: 
"Not sure about that one"

Could someone please clarify what this error means?

The wording could be interpreted as a benign "I found that there was nothing 
which needed to be done after all"; but were that to be the meaning of this 
error, why would it be flagged as an ERROR rather than as INFO or WARN ?


-----Original Message-----
From: Oakley, Craig (NIH/NLM/NCBI) [C]  
Sent: Monday, June 10, 2019 9:57 AM
To: solr-user@lucene.apache.org
Subject: RE: No files to download for index generation

Does anyone yet have any insight on interpreting the severity of this message?

-----Original Message-----
From: Oakley, Craig (NIH/NLM/NCBI) [C]  
Sent: Tuesday, June 04, 2019 4:07 PM
To: solr-user@lucene.apache.org
Subject: No files to download for index generation

We have occasionally been seeing an error such as the following:
2019-06-03 23:32:45.583 INFO  (indexFetcher-45-thread-1) [   ] 
o.a.s.h.IndexFetcher Master's generation: 1424625
2019-06-03 23:32:45.583 INFO  (indexFetcher-45-thread-1) [   ] 
o.a.s.h.IndexFetcher Master's version: 1559619115480
2019-06-03 23:32:45.583 INFO  (indexFetcher-45-thread-1) [   ] 
o.a.s.h.IndexFetcher Slave's generation: 1424624
2019-06-03 23:32:45.583 INFO  (indexFetcher-45-thread-1) [   ] 
o.a.s.h.IndexFetcher Slave's version: 1559619050130
2019-06-03 23:32:45.583 INFO  (indexFetcher-45-thread-1) [   ] 
o.a.s.h.IndexFetcher Starting replication process
2019-06-03 23:32:45.587 ERROR (indexFetcher-45-thread-1) [   ] 
o.a.s.h.IndexFetcher No files to download for index generation: 1424625

Is that last line actually an error as in "there SHOULD be files to download, 
but there are none"?

Or is it simply informative as in "there are no files to download, so we are 
all done here"?


RE: Limiting access to /admin path

2020-02-28 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
I have found that for admin commands you may need to include "collection":null
  {
"name":"admin-info-system2",
"path":"/admin/*",
"collection":null,
"role":"*"}
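
A sketch of how such a permission would be installed through the Rule-Based 
Authorization Plugin's set-permission command (the host, credentials, and file 
name in the trailing comment are illustrative):

```python
import json

# Build the set-permission payload; "collection": None serializes to the
# JSON null that the /admin/* rule needs for admin-level (no-collection) paths.
payload = {
    "set-permission": {
        "name": "admin-info-system2",
        "path": "/admin/*",
        "collection": None,
        "role": "*",
    }
}
body = json.dumps(payload, indent=2)
print(body)
# The body would then be POSTed to /solr/admin/authorization with
# basic-auth credentials, e.g.:
#   curl -u user:pass -H 'Content-Type: application/json' \
#        -d @payload.json http://localhost:8983/solr/admin/authorization
```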


-----Original Message-----
From: Jesús Roca  
Sent: Friday, February 28, 2020 2:10 PM
To: solr-user@lucene.apache.org
Subject: Limiting access to /admin path

Hello,

I have a Solr 7.7.2 instance with basic authentication.

Does anyone know how to limit access to the /admin
path to authenticated users only?
For example to:

https://localhost:8983/solr/admin/info/system

When I access that section, this is the log generated:
2020-02-28 18:05:58.896 INFO  (qtp694316372-17) [   ] o.a.s.s.HttpSolrCall
[admin] webapp=null path=/admin/info/system params={} status=0 QTime=36

I have added the following custom permission, but it doesn't block the
unauthenticated request to that section:

"permissions":[
  {
"name":"admin-info-system",
"path":"/admin/info/system",
"role":"*"}
  ],

If I create the following custom permissions with different paths:

"permissions":[
  {
"name":"admin-info-system1",
"path":"/select/*",
"role":"*"},
  {
"name":"admin-info-system2",
"path":"/admin/*",
"role":"*"}
  ],

Then I have to authenticate when I query a collection, but I can still
access /admin/info/system or /admin/collections?action=CLUSTERSTATUS

In short, I don't know how to block unauthenticated access to the /admin path
without adding the blockUnknown=true attribute; but if I do that, all requests
will have to be authenticated, which I don't want.
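
For reference, blockUnknown is an attribute of the authentication section of 
security.json; a minimal sketch (hashes redacted), assuming the BasicAuthPlugin 
used elsewhere in this thread:

```json
{
  "authentication": {
    "class": "solr.BasicAuthPlugin",
    "blockUnknown": true,
    "credentials": {"solradmin": "[redacted]"}
  }
}
```

With blockUnknown set to true, every request — including the /admin paths — 
must authenticate, which is exactly the trade-off described above.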

Thanks in advance!


RE: Solr8 changes how security.json restricts access to GUI

2019-12-13 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
Thanks for the clarification

Created SOLR-14083


-----Original Message-----
From: Erick Erickson  
Sent: Friday, December 13, 2019 6:26 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr8 changes how security.json restricts access to GUI

Anyone who has an account can open a JIRA, have you created one?


RE: Solr8 changes how security.json restricts access to GUI

2019-12-13 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
It looks as though I do not have an option under 
issues.apache.org/jira/projects/SOLR/issues by which to create an issue. Could 
you create one (and let me know its number)?

Thanks

-----Original Message-----
From: Jan Høydahl  
Sent: Friday, December 13, 2019 3:52 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr8 changes how security.json restricts access to GUI

Ok, we should perhaps print a warning somewhere that IE is not supported. Can 
you file a JIRA issue? 

Jan Høydahl


RE: Solr8 changes how security.json restricts access to GUI

2019-12-13 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
Well that is progress: indeed Firefox and Chrome and Edge do indeed prompt for 
login and password (as desired). It is Internet Explorer which does not, nor 
does curl (that is to say, if you ask curl only to go to the top level: 
host:port/solr -- going any further it will complain, such as your 
/solr/admin/info/system example gets Error 401 Authentication failed, Response 
code: 401)



-----Original Message-----
From: Jan Høydahl  
Sent: Friday, December 13, 2019 2:15 PM
To: solr-user 
Subject: Re: Solr8 changes how security.json restricts access to GUI

I got your screenshot 
(https://www.dropbox.com/s/7tbn7gx3uag6jcg/crippledSolrGUI.jpg?dl=0 
<https://www.dropbox.com/s/7tbn7gx3uag6jcg/crippledSolrGUI.jpg?dl=0>)

This is quite uncommon. You should see a login screen if you have basicAuth 
enabled.
Have you tried a different browser?

What do you get if you run this command

curl -i http://your-solr-url/solr/admin/info/system

Or if you use your browser’s developer tools to inspect network traffic?

Jan

> 12. des. 2019 kl. 23:49 skrev Jan Høydahl :
> 
> Attachments are stripped from list, can you post a link to the screenshot of 
> the UI when you first visit?
> 
> Jan
> 

RE: Solr8 changes how security.json restricts access to GUI

2019-12-12 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
Below is the security.json (with password hashes redacted): in Solr7.4 it 
prompts for a password and (if you get it right) lets you into the whole GUI; 
But in Solr8.1.1 and in Solr 8.3, it does not prompt for a password before 
letting you into a crippled version of the GUI (as depicted in the attachment)

{
  "authentication":{
"class":"solr.BasicAuthPlugin",
"credentials":{
  "solradmin":"[redacted]",
  "pysolrmon":"[redacted]",
  "solrtrg":"[redacted]"},
"":{"v":2}},
  "authorization":{
"class":"solr.RuleBasedAuthorizationPlugin",
"user-role":{
  "solradmin":[
"admin",
"allgen",
"trgadmin",
"genadmin"],
  "solrtrg":[
"trgadmin",
"allgen"],
  "pysolrmon":["clustatus_role"]},
"permissions":[
  {
"name":"gen_admin",
"collection":"NULL",
"path":"/admin/cores",
"params":{"action":[
"REGEX:(?i)CREATE",
"REGEX:(?i)RENAME",
"REGEX:(?i)SWAP",
"REGEX:(?i)UNLOAD",
"REGEX:(?i)SPLIT"]},
"role":"genadmin"},
  {
"name":"col_admin",
"collection":null,
"path":"/admin/collections",
"params":{"action":[
"REGEX:(?i)CREATE",
"REGEX:(?i)MODIFYCOLLECTION",
"REGEX:(?i)SPLITSHARD",
"REGEX:(?i)CREATESHARD",
"REGEX:(?i)DELETESHARD",
"REGEX:(?i)CREATEALIAS",
"REGEX:(?i)DELETEALIAS",
"REGEX:(?i)DELETE",
"REGEX:(?i)DELETEREPLICA",
"REGEX:(?i)ADDREPLICA",
"REGEX:(?i)CLUSTERPROP",
"REGEX:(?i)MIGRATE",
"REGEX:(?i)ADDROLE",
"REGEX:(?i)REMOVEROLE",
"REGEX:(?i)ADDREPLICAPROP",
"REGEX:(?i)DELETEREPLICAPROP",
"REGEX:(?i)BALANCESHARDUNIQUE",
"REGEX:(?i)REBALANCELEADERS",
"REGEX:(?i)FORCELEADER",
"REGEX:(?i)MIGRATESTATEFORMAT"]},
"role":"genadmin"},
  {
"name":"security-edit",
"role":"admin"},
  {
"name":"clustatus",
"path":"/admin/collections",
"params":{"action":["REGEX:(?i)CLUSTERSTATUS"]},
"role":[
  "clustatus_role",
  "allgen"],
"collection":null},
  {
"name":"corestatus",
"path":"/admin/cores",
"params":{"action":["REGEX:(?i)STATUS"]},
"role":[
  "allgen",
  "clustatus_role"],
"collection":null},
  {
"name":"trgadmin",
"collection":"trg_col",
"path":"/admin/*",
"role":"trgadmin"},
  {
"name":"open_select",
"path":"/select/*",
"role":null},
  {
"name":"open_search",
"path":"/search/*",
"role":null},
  {
"name":"catch-all-nocollection",
"collection":null,
"path":"/*",
"role":"allgen"},
  {
"name":"catch-all-collection",
"path":"/*",
"role":"allgen"},
  {
"name":"all-admincol",
"collection":null,
"path":"/admin/collections",
"role":"allgen"},
  {
"name":"all-admincores",
"collection":null,
"path":"/admin/cores",
"role":"allgen"}],
"":{"v":5}}}

-----Original Message-----
From: Jan Høydahl  
Sent: Wednesday, December 11, 2019 7:35 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr8 changes how security.json restricts access to GUI

Please show your complete security.json so we know how auth is configured. 
Which 8.x version are you trying? There should be a login screen shown in admin 
UI now.

Jan Høydahl


Solr8 changes how security.json restricts access to GUI

2019-12-11 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
In Solr 7, we had clauses in our security.json saying

  {
"name":"all-admin",
"collection":null,
"path":"/*",
"role":"allgen",
"index":15},
  {
"name":"all-core-handlers",
"path":"/*",
"role":"allgen",
"index":16},

We granted the role allgen to all users; but this kept our security folk happy 
in that no one could even get to the top level of the Solr GUI without a 
password.

Now under Solr 8, the GUI does not prompt for a password. It just brings you 
into the GUI (albeit a stripped down version, saying such things as "No cores 
available"). By what means can we require a password to get this far? And by 
what means can we prompt for a password in order to get further?
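A sketch of a security.json that restores the Solr 7 behavior under 8.x, assuming the BasicAuthPlugin is in use: with "blockUnknown": true every request, including the Admin UI, requires credentials, and the 8.x UI then presents its login screen. The credentials line is the reference guide's default solr/SolrRocks pair, and the predefined "all" permission stands in for the path-based catch-all rules shown above:

```json
{
  "authentication": {
    "class": "solr.BasicAuthPlugin",
    "blockUnknown": true,
    "credentials": {
      "solr": "IV0EHq1OnNrj6gvRCwvFwTrZ1+z1oBbnQdiVC3otuq0= Ndd7LKvVBAaZIF0QAVi1ekCfAJXr1GGfLtRUXhgrF8c="
    }
  },
  "authorization": {
    "class": "solr.RuleBasedAuthorizationPlugin",
    "permissions": [
      { "name": "all", "role": "allgen" }
    ],
    "user-role": { "solr": ["allgen"] }
  }
}
```

With "blockUnknown": false (the default), unauthenticated requests to unprotected paths are allowed through, which matches the "stripped down" UI behavior described above.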


RE: async BACKUP under Solr8.3

2019-11-22 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
For the record, the solution was to edit solr.xml changing

${socketTimeout:0}

to

${socketTimeout:60}



RE: async BACKUP under Solr8.3

2019-11-19 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
In some collections I am having problems with Solr8.1.1 through 8.3; with other 
collections it is fine in Solr8.1.1 through 8.3

I'm investigating what might be wrong with the collections which have the 
problems.

Thanks



RE: async BACKUP under Solr8.3

2019-11-19 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
FYI, I DO succeed in doing an async backup in Solr8.1



RE: async BACKUP under Solr8.3

2019-11-19 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
This is on a test server: simple case: one node, one shard, one replica

In production we currently use Solr7.4 and the async BACKUP works fine. I could 
test whether I get the same symptoms on Solr8.1 and/or 8.2

Thanks

-Original Message-
From: Mikhail Khludnev  
Sent: Tuesday, November 19, 2019 12:40 AM
To: solr-user 
Subject: Re: async BACKUP under Solr8.3

Hello, Craig.
There was a significant  fix for async BACKUP in 8.1, if I remember it
correctly.
Which version you used for it before? How many nodes, shards, replicas
`bug` has?
Unfortunately this stacktrace is not really representative, it just says
that some node (ok, it's overseer) fails to wait another one.
Ideally we need a log from overseer node and subordinate node during backup
operation.
Thanks.



-- 
Sincerely yours
Mikhail Khludnev


async BACKUP under Solr8.3

2019-11-18 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
For Solr 8.3, when I attempt a command of the form

host:port/solr/admin/collections?action=BACKUP&name=snapshot1&collection=col1&location=/tmp&async=bug

And then when I run /solr/admin/collections?action=REQUESTSTATUS&requestid=bug 
I get "msg":"found [bug] in failed tasks"
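Written out with the Collections API parameter names, the two calls above can be assembled like this. A minimal sketch using only the standard library; backup_url and request_status_url are hypothetical helper names, and host/port are placeholders:

```python
from urllib.parse import urlencode

def backup_url(base, name, collection, location, async_id=None):
    """Build a Collections API BACKUP request URL (async if async_id given)."""
    params = {"action": "BACKUP", "name": name,
              "collection": collection, "location": location}
    if async_id:
        params["async"] = async_id
    return f"{base}/solr/admin/collections?{urlencode(params)}"

def request_status_url(base, async_id):
    """Build the matching REQUESTSTATUS poll URL for an async request id."""
    q = urlencode({"action": "REQUESTSTATUS", "requestid": async_id})
    return f"{base}/solr/admin/collections?{q}"
```

Note that urlencode percent-encodes the location path, which curl and browsers also accept.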

The solr.log file has a stack trace like the following
2019-11-18 17:31:31.369 ERROR 
(OverseerThreadFactory-9-thread-5-processing-n:host:port_solr) [c:col1   ] 
o.a.s.c.a.c.OverseerCollectionMessageHandler Error from shard: 
http://host:port/solr => org.apache.solr.client.solrj.SolrServerException: 
Timeout occured while waiting response from server at: 
http://host:port/solr/admin/cores
at 
org.apache.solr.client.solrj.impl.Http2SolrClient.request(Http2SolrClient.java:408)
org.apache.solr.client.solrj.SolrServerException: Timeout occured while waiting 
response from server at: http://host:port/solr/admin/cores
at 
org.apache.solr.client.solrj.impl.Http2SolrClient.request(Http2SolrClient.java:408)
 ~[?:?]
at 
org.apache.solr.client.solrj.impl.Http2SolrClient.request(Http2SolrClient.java:754)
 ~[?:?]
at 
org.apache.solr.client.solrj.SolrClient.request(SolrClient.java:1290) ~[?:?]
at 
org.apache.solr.handler.component.HttpShardHandler.request(HttpShardHandler.java:238)
 ~[?:?]
at 
org.apache.solr.handler.component.HttpShardHandler.lambda$submit$0(HttpShardHandler.java:199)
 ~[?:?]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
~[?:1.8.0_232]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
~[?:1.8.0_232]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
~[?:1.8.0_232]
at 
com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:181)
 ~[metrics-core-4.0.5.jar:4.0.5]
at 
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:210)
 ~[?:?]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
~[?:1.8.0_232]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
~[?:1.8.0_232]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_232]
Caused by: java.util.concurrent.TimeoutException
at 
org.eclipse.jetty.client.util.InputStreamResponseListener.get(InputStreamResponseListener.java:216)
 ~[?:?]
at 
org.apache.solr.client.solrj.impl.Http2SolrClient.request(Http2SolrClient.java:399)
 ~[?:?]
... 12 more

If I remove the async=bug, then it works

In fact, the backup looks successful, but REQUESTSTATUS does not recognize it 
as such

I notice that the 3:30am 11/4/19 Email to solr-user@lucene.apache.org mentions 
in Solr 8.3.0 Release Highlights "Fix for SPLITSHARD (async) with failures in 
underlying sub-operations can result in data loss"

Did a fix to SPLITSHARD break BACKUP?

Has anyone been successful running 
solr/admin/collections?action=BACKUP&async=requestname under Solr8.3?

Thanks


RE: Basic Authentication problem

2019-08-02 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
Was I correct in my description yesterday (which I am pasting in below)? That 
you are using a hash based on the "solr" account name and expecting that to 
work if you change the account name but not the hash?

Am I correct in assuming that everything other than security-edit functions 
currently works for you with any account and any password, including without 
any login-and-password at all?




-Original Message-
From: Zheng Lin Edwin Yeo  
Sent: Friday, August 02, 2019 2:59 AM
To: solr-user@lucene.apache.org
Subject: Re: Basic Authentication problem

From what I see, you are trying to change your own user's password. If I
remembered correctly this might not be allowed, which is why you are
getting the "Unauthorized request" error.

You can try to create another user with admin role as well, and to change
your existing user's password from the new user.

Regards,
Edwin

On Fri, 2 Aug 2019 at 13:32, Salmaan Rashid Syed 
wrote:

> My curl command works fine for querying, updating etc.
>
> I don't think it is the fault of curl command.
>
> I get the following error message when I tried to change the password of
> solr-admin,
>
>
> 
>
> 
>
> 
>
> Error 403 Unauthorized request, Response code: 403
>
> 
>
> HTTP ERROR 403
>
> Problem accessing /solr/admin/authentication. Reason:
>
> Unauthorized request, Response code: 403
>
> 
>
> 
>
>
> And if I give incorrect username and password, it states bad credentials
> entered. So, I think the curl command is fine. There is some issue with
> basic authentication.
>
>
> Okay, One way around is to figure out how to convert my password into a
> SHA256 (password + salt) and enter it in security.json file. But, I have no
> idea how to generate the SHA256 equivalent of my password.
>
>
> Any suggestions?
>
>
>
> On Fri, Aug 2, 2019 at 10:55 AM Zheng Lin Edwin Yeo 
> wrote:
>
> > Hi Salmaan,
> >
> > Does your curl command works for other curl commands like normal
> querying?
> > Or is it just not working when updating password and adding new users?
> >
> > Regards,
> > Edwin
> >
> >
> >
> > On Fri, 2 Aug 2019 at 13:03, Salmaan Rashid Syed <
> > salmaan.ras...@mroads.com>
> > wrote:
> >
> > > Hi Zheng,
> > >
> > > I tried and it works. But, when I use the curl command to update
> password
> > > or add new users it doesn't work.
> > >
> > > I don't know what is going wrong with curl command!
> > >
> > > Regards,
> > > Salmaan
> > >
> > >
> > > On Fri, Aug 2, 2019 at 8:26 AM Zheng Lin Edwin Yeo <
> edwinye...@gmail.com
> > >
> > > wrote:
> > >
> > > > Have you tried to access the Solr Admin UI with your created user
> name
> > > and
> > > > password to see if it works?
> > > >
> > > > Regards,
> > > > Edwin
> > > >
> > > > On Thu, 1 Aug 2019 at 19:51, Salmaan Rashid Syed <
> > > > salmaan.ras...@mroads.com>
> > > > wrote:
> > > >
> > > > > Hi Solr User,
> > > > >
> > > > > Please help me with my issue.
> > > > >
> > > > > I have enabled Solr basic authentication as shown in Solr
> > > documentations.
> > > > >
> > > > > I have changed username from solr to solr-admin as follow
> > > > >
> > > > > {
> > > > > "authentication":{
> > > > >"blockUnknown": true,
>

RE: Basic Authentication problem

2019-08-01 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
The hash value 
"IV0EHq1OnNrj6gvRCwvFwTrZ1+z1oBbnQdiVC3otuq0=Ndd7LKvVBAaZIF0QAVi1ekCfAJXr1GGfLtRUXhgrF8c="
 is based on both the plain text password AND the plain text login. Since 
"solr" is not the same string as "solr-admin", the password will not work. If 
the only authorization in security.json is restricting security-edit, then you 
can do anything else with any password, or with no password.

What you can do is setup the security.json file as specified in the Reference 
Guide (whence you got the hash of the login and password), then use the default 
solr login to run your set-user (to add the solr-admin user alongside the 
existing solr login), then use the default solr login to run 
{"set-user-role":{"solr-admin":["security-edit"]}}, and then (when you are sure 
things are correctly setup for solr-admin) drop the default solr login
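For anyone needing to generate the credentials value offline: Solr's Sha256AuthenticationProvider derives the stored value from the password and a random salt, as base64(sha256(sha256(salt + password))) followed by a space and base64(salt). A minimal Python sketch; solr_credentials is a hypothetical helper name:

```python
import base64
import hashlib
import os

def solr_credentials(password, salt=None):
    """Build the value Solr's BasicAuthPlugin stores in security.json:
    base64(sha256(sha256(salt + password))) + " " + base64(salt)."""
    if salt is None:
        salt = os.urandom(32)  # Solr generates a 32-byte random salt
    digest = hashlib.sha256(salt + password.encode("utf-8")).digest()
    digest = hashlib.sha256(digest).digest()  # the digest is hashed a second time
    return (base64.b64encode(digest).decode() + " "
            + base64.b64encode(salt).decode())
```

Fed the salt from the reference guide's default entry, this reproduces the well-known "solr"/"SolrRocks" hash quoted in this thread.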

-Original Message-
From: Salmaan Rashid Syed  
Sent: Thursday, August 01, 2019 7:51 AM
To: solr-user@lucene.apache.org
Subject: Re: Basic Authentication problem

Hi Solr User,

Please help me with my issue.

I have enabled Solr basic authentication as shown in Solr documentations.

I have changed username from solr to solr-admin as follow

{
"authentication":{
   "blockUnknown": true,
   "class":"solr.BasicAuthPlugin",

 "credentials":{"solr-admin":"IV0EHq1OnNrj6gvRCwvFwTrZ1+z1oBbnQdiVC3otuq0=
Ndd7LKvVBAaZIF0QAVi1ekCfAJXr1GGfLtRUXhgrF8c="}
},
"authorization":{
   "class":"solr.RuleBasedAuthorizationPlugin",
   "permissions":[{"name":"security-edit",
  "role":"admin"}],
   "user-role":{"solr-admin":"admin"}
}}

I am able to login to the page using the credentials solr-admin:SolrRocks.

But, when I try to change the default password using the curl command as
follows,

curl --user solr-admin:SolrRocks
http://localhost:8983/solr/admin/authentication -H
'Content-type:application/json' -d '{"set-user":{"solr-admin":"s2019"}}'


I get the following error message,








Error 403 Unauthorized request, Response code: 403
HTTP ERROR 403
Problem accessing /solr/admin/authentication. Reason:
Unauthorized request, Response code: 403






Please help.

Regards,
Salmaan


On Thu, Aug 1, 2019 at 1:51 PM Salmaan Rashid Syed <
salmaan.ras...@mroads.com> wrote:

> Small correction in the user-name. It is solr-admin everywhere.
>
> Hi Solr Users,
>
> I have enabled Solr basic authentication as shown in Solr documentations.
>
> I have changed username from solr to solr-admin as follow
>
> {
> "authentication":{
>"blockUnknown": true,
>"class":"solr.BasicAuthPlugin",
>
>  "credentials":{"solr-admin":"IV0EHq1OnNrj6gvRCwvFwTrZ1+z1oBbnQdiVC3otuq0=
> Ndd7LKvVBAaZIF0QAVi1ekCfAJXr1GGfLtRUXhgrF8c="}
> },
> "authorization":{
>"class":"solr.RuleBasedAuthorizationPlugin",
>"permissions":[{"name":"security-edit",
>   "role":"admin"}],
>"user-role":{"solr-admin":"admin"}
> }}
>
> I am able to login to the page using the credentials
> mroads-solr-admin:SolrRocks.
>
> But, when I try to change the default password using the curl command as
> follows,
>
> curl --user solr-admin:SolrRocks
> http://localhost:8983/solr/admin/authentication -H
> 'Content-type:application/json' -d '{"set-user":{"solr-admin":"s2019"}}'
>
>
>
> I get the following error message,
>
>
> 
>
> 
>
> 
>
> Error 403 Unauthorized request, Response code: 403
>
> 
>
> HTTP ERROR 403
>
> Problem accessing /solr/admin/authentication. Reason:
>
> Unauthorized request, Response code: 403
>
> 
>
> 
>
>
> Please help.
>
>
> *Thanks and Regards,*
> Salmaan Rashid Syed
> +91 8978353445 | www.panna.ai |
> 5550 Granite Pkwy, Suite #225, Plano TX-75024.
> Cyber Gateways, Hi-tech City, Hyderabad, Telangana, India.
>
>
>
> On Thu, Aug 1, 2019 at 1:48 PM Salmaan Rashid Syed <
> salmaan.ras...@mroads.com> wrote:
>
>> Hi Solr Users,
>>
>> I have enabled Solr basic authentication as shown in Solr documentations.
>>
>> I have changed username from solr to solr-admin as follow
>>
>> {
>> "authentication":{
>>"blockUnknown": true,
>>"class":"solr.BasicAuthPlugin",
>>
>>  "credentials":{"solr-admin":"IV0EHq1OnNrj6gvRCwvFwTrZ1+z1oBbnQdiVC3otuq0=
>> Ndd7LKvVBAaZIF0QAVi1ekCfAJXr1GGfLtRUXhgrF8c="}
>> },
>> "authorization":{
>>"class":"solr.RuleBasedAuthorizationPlugin",
>>"permissions":[{"name":"security-edit",
>>   "role":"admin"}],
>>"user-role":{"solr-admin":"admin"}
>> }}
>>
>> I am able to login to the page using the credentials
>> mroads-solr-admin:SolrRocks.
>>
>> But, when I try to change the default password using the curl command as
>> follows,
>>
>> curl --user mroads-solr-admin:SolrRocks
>> http://localhost:8983/solr/admin/authentication -H
>> 'Content-type:application/json' -d '{"set-user":{"mroads-solr":"Mroads@2019
>> #"}}'
>>
>>
>>
>> I get the following error message,
>>
>>
>> 
>>
>> 
>>
>> 
>>
>> Error 403 Unauthorized request, Response code: 403
>>
>> 
>>
>> HTTP ERROR 403
>>
>> Problem accessing /solr/admin/authentication. Reason:
>>
>> Unauthorized request, 

memory leak?

2019-07-17 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
I am having problems with a SolrCloud where both nodes become unresponsive. I 
am wondering whether there is some sort of memory leak. Attached is a portion 
of the solr_gc.log from around the time that the problem starts. Have you any 
suggestions how to diagnose and address this issue?

Thanks
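One way to triage a log like the one attached below is to pull out the stop-the-world pause lines and flag the long ones. A minimal sketch for the JDK 8 CMS format shown here (long_pauses is a hypothetical helper; note the attached log is line-wrapped by the archive, so rejoin wrapped lines before feeding them in):

```python
import re

# Matches JDK 8 GC log lines such as:
# 2019-07-17T04:24:42.279-0400: 65834.730: Total time for which application
# threads were stopped: 3.4630816 seconds, Stopping threads took: ...
PAUSE_RE = re.compile(
    r"^(\S+): [\d.]+: Total time for which application threads "
    r"were stopped: ([\d.]+) seconds"
)

def long_pauses(lines, threshold=1.0):
    """Return (timestamp, seconds) for pauses at or above threshold seconds."""
    pauses = []
    for line in lines:
        m = PAUSE_RE.match(line)
        if m and float(m.group(2)) >= threshold:
            pauses.append((m.group(1), float(m.group(2))))
    return pauses
```

In the excerpt below this flags the 3.46 s pause, where "Stopping threads took: 3.24" dominates: the time is spent reaching a safepoint, not collecting, which points at safepoint stalls (e.g. swapping or long-running loops) rather than a plain heap leak.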
2019-07-17T04:24:35.810-0400: 65828.261: Total time for which application 
threads were stopped: 0.8963997 seconds, Stopping threads took: 0.8958437 
seconds
2019-07-17T04:24:42.059-0400: 65834.510: [GC (CMS Initial Mark) [1 
CMS-initial-mark: 20963525K(35389440K)] 24965285K(45219840K), 0.2193216 secs] 
[Times: user=0.61 sys=0.27, real=0.22 secs] 
2019-07-17T04:24:42.279-0400: 65834.730: Total time for which application 
threads were stopped: 3.4630816 seconds, Stopping threads took: 3.2432089 
seconds
2019-07-17T04:24:42.279-0400: 65834.730: [CMS-concurrent-mark-start]
2019-07-17T04:24:43.281-0400: 65835.732: Total time for which application 
threads were stopped: 0.0005248 seconds, Stopping threads took: 0.0001216 
seconds
2019-07-17T04:24:43.282-0400: 65835.733: [CMS-concurrent-mark: 1.002/1.003 
secs] [Times: user=5.18 sys=0.19, real=1.01 secs] 
2019-07-17T04:24:43.282-0400: 65835.733: [CMS-concurrent-preclean-start]
2019-07-17T04:24:43.377-0400: 65835.829: [CMS-concurrent-preclean: 0.096/0.096 
secs] [Times: user=0.14 sys=0.05, real=0.09 secs] 
2019-07-17T04:24:43.378-0400: 65835.829: 
[CMS-concurrent-abortable-preclean-start]
2019-07-17T04:24:44.282-0400: 65836.733: Total time for which application 
threads were stopped: 0.0005190 seconds, Stopping threads took: 0.0001509 
seconds
2019-07-17T04:24:46.282-0400: 65838.733: Total time for which application 
threads were stopped: 0.0005607 seconds, Stopping threads took: 0.0001641 
seconds
 CMS: abort preclean due to time 2019-07-17T04:24:49.495-0400: 65841.946: 
[CMS-concurrent-abortable-preclean: 5.343/6.117 secs] [Times: user=11.07 
sys=0.79, real=6.12 secs] 
2019-07-17T04:24:49.496-0400: 65841.947: [GC (CMS Final Remark) [YG occupancy: 
4287360 K (9830400 K)]{Heap before GC invocations=4 (full 3):
 par new generation   total 9830400K, used 4287360K [0x7fd06000, 
0x7fd33000, 0x7fd33000)
  eden space 7864320K,  51% used [0x7fd06000, 0x7fd1564c5bf8, 
0x7fd24000)
  from space 1966080K,  12% used [0x7fd24000, 0x7fd24f61a718, 
0x7fd2b800)
  to   space 1966080K,   0% used [0x7fd2b800, 0x7fd2b800, 
0x7fd33000)
 concurrent mark-sweep generation total 35389440K, used 20963525K 
[0x7fd33000, 0x7fdba000, 0x7fdba000)
 Metaspace   used 47785K, capacity 49718K, committed 49972K, reserved 51200K
2019-07-17T04:24:49.496-0400: 65841.947: [GC (CMS Final Remark) 
2019-07-17T04:24:49.496-0400: 65841.947: [ParNew
Desired survivor size 1811939328 bytes, new threshold 8 (max 8)
- age   1:  305474592 bytes,  305474592 total
- age   2:  101513064 bytes,  406987656 total
- age   3:  118054640 bytes,  525042296 total
- age   4:        792 bytes,  525043088 total
- age   5:   12209536 bytes,  537252624 total
: 4287360K->543113K(9830400K), 0.7519541 secs] 25250886K->21506639K(45219840K), 
0.7521144 secs] [Times: user=2.62 sys=0.39, real=0.75 secs] 
Heap after GC invocations=5 (full 3):
 par new generation   total 9830400K, used 543113K [0x7fd06000, 
0x7fd33000, 0x7fd33000)
  eden space 7864320K,   0% used [0x7fd06000, 0x7fd06000, 
0x7fd24000)
  from space 1966080K,  27% used [0x7fd2b800, 0x7fd2d9262690, 
0x7fd33000)
  to   space 1966080K,   0% used [0x7fd24000, 0x7fd24000, 
0x7fd2b800)
 concurrent mark-sweep generation total 35389440K, used 20963525K 
[0x7fd33000, 0x7fdba000, 0x7fdba000)
 Metaspace   used 47785K, capacity 49718K, committed 49972K, reserved 51200K
}
2019-07-17T04:24:50.248-0400: 65842.699: [Rescan (parallel) , 0.1138935 
secs]2019-07-17T04:24:50.362-0400: 65842.813: [weak refs processing, 0.0007099 
secs]2019-07-17T04:24:50.363-0400: 65842.814: [class unloading, 0.0343465 
secs]2019-07-17T04:24:50.397-0400: 65842.848: [scrub symbol table, 0.0137971 
secs]2019-07-17T04:24:50.411-0400: 65842.862: [scrub string table, 0.0009367 
secs][1 CMS-remark: 20963525K(35389440K)] 21506639K(45219840K), 0.9220238 secs] 
[Times: user=3.09 sys=0.42, real=0.92 secs] 
2019-07-17T04:24:50.418-0400: 65842.869: Total time for which application 
threads were stopped: 0.9225790 seconds, Stopping threads took: 0.0001272 
seconds
2019-07-17T04:24:50.418-0400: 65842.869: [CMS-concurrent-sweep-start]
2019-07-17T04:24:50.418-0400: 65842.869: [CMS-concurrent-sweep: 0.000/0.000 
secs] [Times: user=0.01 sys=0.00, real=0.00 secs] 
2019-07-17T04:24:50.418-0400: 65842.869: [CMS-concurrent-reset-start]
2019-07-17T04:24:50.663-0400: 65843.114: [CMS-concurrent-reset: 0.245/0.245 
secs] [Times: user=0.33 sys=0.16, real=0.25 secs] 
2019-07-17T04:24:52.663-0400: 65845.114: [GC 
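One quick way to get a read on a log like the above is to pull out the safepoint pause durations and rank them; in the excerpt, "Stopping threads took: 3.2432089 seconds" is time spent just bringing threads to the safepoint, which on a 45 GB heap can be a bigger clue than the collection phases themselves. A sketch (the printf supplies sample lines standing in for the real file; in practice pipe `cat solr_gc.log` into the grep):

```shell
# Rank stop-the-world pauses recorded in a GC log, longest first.
# The printf provides sample lines; replace it with: cat solr_gc.log
printf '%s\n' \
  'Total time for which application threads were stopped: 3.4630816 seconds, Stopping threads took: 3.2432089 seconds' \
  'Total time for which application threads were stopped: 0.0005248 seconds, Stopping threads took: 0.0001216 seconds' |
  grep -o 'stopped: [0-9.]* seconds' | awk '{print $2}' | sort -rn | head
```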

RE: No files to download for index generation

2019-06-10 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
Does anyone yet have any insight on interpreting the severity of this message?

-Original Message-
From: Oakley, Craig (NIH/NLM/NCBI) [C]  
Sent: Tuesday, June 04, 2019 4:07 PM
To: solr-user@lucene.apache.org
Subject: No files to download for index generation

We have occasionally been seeing an error such as the following:
2019-06-03 23:32:45.583 INFO  (indexFetcher-45-thread-1) [   ] 
o.a.s.h.IndexFetcher Master's generation: 1424625
2019-06-03 23:32:45.583 INFO  (indexFetcher-45-thread-1) [   ] 
o.a.s.h.IndexFetcher Master's version: 1559619115480
2019-06-03 23:32:45.583 INFO  (indexFetcher-45-thread-1) [   ] 
o.a.s.h.IndexFetcher Slave's generation: 1424624
2019-06-03 23:32:45.583 INFO  (indexFetcher-45-thread-1) [   ] 
o.a.s.h.IndexFetcher Slave's version: 1559619050130
2019-06-03 23:32:45.583 INFO  (indexFetcher-45-thread-1) [   ] 
o.a.s.h.IndexFetcher Starting replication process
2019-06-03 23:32:45.587 ERROR (indexFetcher-45-thread-1) [   ] 
o.a.s.h.IndexFetcher No files to download for index generation: 1424625

Is that last line actually an error as in "there SHOULD be files to download, 
but there are none"?

Or is it simply informative as in "there are no files to download, so we are 
all done here"?
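The ReplicationHandler can also report its view of both sides' generation and version numbers on demand, which helps judge whether the follower is actually behind; a sketch, with host, port, and core name as placeholders:

```shell
# Ask a core's ReplicationHandler for its replication details,
# including master/slave generation and version numbers.
# "localhost:8983" and "corename" are placeholders.
curl 'http://localhost:8983/solr/corename/replication?command=details&wt=json'
```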


No files to download for index generation

2019-06-04 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
We have occasionally been seeing an error such as the following:
2019-06-03 23:32:45.583 INFO  (indexFetcher-45-thread-1) [   ] 
o.a.s.h.IndexFetcher Master's generation: 1424625
2019-06-03 23:32:45.583 INFO  (indexFetcher-45-thread-1) [   ] 
o.a.s.h.IndexFetcher Master's version: 1559619115480
2019-06-03 23:32:45.583 INFO  (indexFetcher-45-thread-1) [   ] 
o.a.s.h.IndexFetcher Slave's generation: 1424624
2019-06-03 23:32:45.583 INFO  (indexFetcher-45-thread-1) [   ] 
o.a.s.h.IndexFetcher Slave's version: 1559619050130
2019-06-03 23:32:45.583 INFO  (indexFetcher-45-thread-1) [   ] 
o.a.s.h.IndexFetcher Starting replication process
2019-06-03 23:32:45.587 ERROR (indexFetcher-45-thread-1) [   ] 
o.a.s.h.IndexFetcher No files to download for index generation: 1424625

Is that last line actually an error as in "there SHOULD be files to download, 
but there are none"?

Or is it simply informative as in "there are no files to download, so we are 
all done here"?


What determines which logging settings are available?

2019-05-10 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
We want to tweak the logging levels of our Solr 7.4 nodes to see what 
might be helpful to add to the solr.log for debugging purposes.

In investigating what is available, however, I run /solr/admin/info/logging and 
I find that there is little consistency in what logging settings are available 
to any given node (referring here to the "name" field of the "loggers" array 
returned by /solr/admin/info/logging) - even between nodes running in the same 
SolrCloud and/or on the same host.

For instance one node of a SolrCloud might have 
org.apache.solr.cloud.overseer.SliceMutator which can be set, but no other 
instance of the same SolrCloud has that setting.

Could someone clarify what determines which settings are available?

Thanks
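One possible explanation, offered as a guess: loggers appear in that list only after the owning class has been loaded and has logged something, and some classes run on only one node (for example, the Overseer classes such as SliceMutator run only on the current Overseer node), which would produce exactly this node-to-node inconsistency. Levels can still be set at runtime through the same endpoint even for a logger not yet listed; a sketch with placeholder host and port:

```shell
# Raise one logger to DEBUG at runtime via the logging endpoint.
# Host and port are placeholders; the "set" parameter takes name:LEVEL.
curl 'http://localhost:8983/solr/admin/info/logging?set=org.apache.solr.cloud.overseer.SliceMutator:DEBUG&wt=json'
```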


RE: is df needed for SolrCloud replication?

2019-03-21 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
Thanks.

That resolves the issue.


Thanks again.

-Original Message-
From: Shawn Heisey  
Sent: Tuesday, March 19, 2019 7:10 PM
To: solr-user@lucene.apache.org
Subject: Re: is df needed for SolrCloud replication?

On 3/19/2019 4:48 PM, Oakley, Craig (NIH/NLM/NCBI) [C] wrote:
> I recently noticed that my solr.log files have been getting the following 
> error message:
> o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: no field 
> name specified in query and no default specified via 'df' param
> 
> The timing of these messages coincides with pings to the leader node of the 
> SolrCloud from other nodes of the SolrCloud (the message appears only on 
> whatever node is currently the leader).
> 
> I believe that the user for whom I set up this SolrCloud intentionally 
> removed df from the defaults section of solrconfig.xml (in order to 
> streamline out parts of the code which he does not use).
> 
> I have not (yet) noticed any ill effects from this error. Is this error 
> benign? Or shall I ask the user to reinstate df in the defaults section of 
> solrconfig.xml? Or can SolrCloud replication be configured to work around any 
> ill effects that there may be?

If you don't define df (which means "default field"), then every query 
must indicate which field(s) it will query, or you will see that error 
message.

It sounds like the query that is in the ping handler needs to be changed 
so it includes a field name.  Typically ping handlers use *:* for their 
query, which is special syntax for all documents, and works even when no 
fields are defined.  That query is usually extremely fast.

Thanks,
Shawn
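A minimal sketch of a ping handler that carries its own *:* query, along the lines suggested above, so it no longer depends on a df default (the handler name matches the stock /admin/ping handler):

```xml
<!-- solrconfig.xml: ping handler with a self-contained *:* query -->
<requestHandler name="/admin/ping" class="solr.PingRequestHandler">
  <lst name="invariants">
    <str name="q">*:*</str>
  </lst>
</requestHandler>
```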


is df needed for SolrCloud replication?

2019-03-19 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
I recently noticed that my solr.log files have been getting the following error 
message:
o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: no field name 
specified in query and no default specified via 'df' param

The timing of these messages coincides with pings to the leader node of the 
SolrCloud from other nodes of the SolrCloud (the message appears only on 
whatever node is currently the leader).

I believe that the user for whom I set up this SolrCloud intentionally removed 
df from the defaults section of solrconfig.xml (in order to streamline out 
parts of the code which he does not use).

I have not (yet) noticed any ill effects from this error. Is this error benign? 
Or shall I ask the user to reinstate df in the defaults section of 
solrconfig.xml? Or can SolrCloud replication be configured to work around any 
ill effects that there may be?

Please advise


RE: change in White Space when upgrading 6.6 to 7.4

2019-02-08 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
> Can we take this thread back to the mailing list, please? It would be good to 
> allow other people to weigh in!

Sure

-Original Message-
From: Matt Pearce  
Sent: Friday, February 08, 2019 6:45 AM
To: Oakley, Craig (NIH/NLM/NCBI) [C] 
Subject: Re: change in White Space when upgrading 6.6 to 7.4


The first (sow=false) query parses to:
"+(+((text:pd text:2485621) | isolation_source:PDS24856.21) 
-erd_group:PDS24856.21)"
while the sow=true query parses to:
"+(+(text:\"pd 2485621\" | isolation_source:PDS24856.21) 
-erd_group:PDS24856.21)"

This suggests to me that the analyzer on the text field is using the 
WordDelimiterFilterFactory (or WordDelimiterGraphFilterFactory), and 
splitting the query text into separate tokens on number/word boundaries 
- so "ABC123" => "ABC" "123". It is also stripping the "S" from "PDS", 
and the decimal point from the numeric part, as you can see from the 
"text:2485621" part of both queries - this may not be the 
WordDelimiter filter, but I suspect it probably is.

It works when sow=true, because it's generating a phrase query. When 
sow=false, it doesn't generate a phrase query and you're getting matches 
on both "pd" and "2485621" - presumably "pd" appears in a lot of 
your documents.

A possible solution without using sow=true would be to modify the 
analyzer on your text field so it doesn't use 
WordDelimiterFilterFactory, and retains "PD24856.21" as a single 
token, or modify the behaviour of that filter so it doesn't split the 
tokens the same way. Of course, this may not be what you want, depending 
on the other data you have in the text field.
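To make the first suggestion concrete: a field type that tokenizes only on whitespace would keep "PDS24856.21" as a single (lowercased) token. This is a sketch, not the collection's actual schema, and dropping the word-delimiter step changes matching for every other query against the field:

```xml
<!-- schema sketch: whitespace-only tokenization, no word-delimiter splitting -->
<fieldType name="text_keep_ids" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```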

Can we take this thread back to the mailing list, please? It would be 
good to allow other people to weigh in!

Thanks,
Matt

On 07/02/2019 15:58, Oakley, Craig (NIH/NLM/NCBI) [C] wrote:
> Thanks. Here is what I have.
> 
> The first curl output is the problem results. The next two were changing the 
> query (adding quotation marks or adding "*:")
> 
> After the third curl output, I upload a new solrconfig.xml (in another 
> window) to include <str name="sow">true</str> in the /select requestHandler; 
> I then RELOAD the core and run the final curl command
> 
> The correct answer should have numFound 0 (and the only one which fails to 
> get the correct answer is the first: the original query with sow defaulting 
> to false in Solr7.4)
> 
> Let me know if you see any clarification in the debugQuery output
> 
> Thanks again
> 
> 
> 
> *[10:33 ~ 2209]$ curl -s 
> 'http://host:/solr/isolates/select?indent=on&q=PDS24856.21%20AND%20-erd_group:PDS24856.21&wt=json&rows=0&debugQuery=on'|tee
>  ~/solr/DBH14432debug190207a.out
> {
>"responseHeader":{
>  "zkConnected":true,
>  "status":0,
>  "QTime":1,
>  "params":{
>"q":"PDS24856.21 AND -erd_group:PDS24856.21",
>"indent":"on",
>"rows":"0",
>"wt":"json",
>"debugQuery":"on"}},
>"response":{"numFound":21322074,"start":0,"docs":[]
>},
>"debug":{
>  "rawquerystring":"PDS24856.21 AND -erd_group:PDS24856.21",
>  "querystring":"PDS24856.21 AND -erd_group:PDS24856.21",
>  "parsedquery":"+(+DisjunctionMaxQuery(((text:pd text:2485621) | 
> isolation_source:PDS24856.21)) -erd_group:PDS24856.21)",
>  "parsedquery_toString":"+(+((text:pd text:2485621) | 
> isolation_source:PDS24856.21) -erd_group:PDS24856.21)",
>  "explain":{},
>  "QParser":"ExtendedDismaxQParser",
>  "altquerystring":null,
>  "boost_queries":null,
>  "parsed_boost_queries":[],
>  "boostfuncs":null,
>  "timing":{
>"time":1.0,
>"prepare":{
>  "time":0.0,
>  "query":{
>"time":0.0},
>  "facet":{
>"time":0.0},
>  "facet_module":{
>"time":0.0},
>  "mlt":{
>"time":0.0},
>  "highlight":{
>"time":0.0},
>  "stats":{
>"time":0.0},
>  "expand":{
>"time":0.0},
>  "terms":{
>"time":0.0},

change in White Space when upgrading 6.6 to 7.4

2019-02-01 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
We had a problem when upgrading from Solr 6.6 to Solr 7.4 in that a query 
ceased to work.


The query was of the form 
http://localhost:8983/solr/collection/select?indent=on&q=ABC4856.21%20AND%20-field1:ABC4856.21&wt=json&rows=0

Basically finding a count of those records where there is some field which has 
"ABC4856.21", but where the field field1 does not have that string (in other 
words, where there is some field other than field1 which has "ABC4856.21")

For this particular collection, running the query against Solr 6.6 resulted in 
"response":{"numFound":0" (which was correct), but running it against Solr 7.4 
resulted in ""response":{"numFound":21322074"

After some investigation, it seemed to be a problem with the initial 
"ABC4856.21" being tokenized as "ABC4856" and "21"

We found various work-arounds such as putting quotation marks around the string 
or adding "*:" after the "q="; but the user wanted the exact same query to work 
in Solr 7.4 as it had in Solr 6.6

Eventually, we found a solution by adding <str name="sow">true</str> to the 
/select request handler in solrconfig.xml ("sow" standing for "Split On Whitespace").

This solution seems to be sufficient; but we would like to be sure we 
understand the solution.
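For context: the sow parameter defaulted to true through 6.x and was switched to false in Solr 7.0, which is why the old behavior has to be restored explicitly. The solrconfig.xml change would look roughly like this (any existing defaults stay alongside it):

```xml
<!-- solrconfig.xml sketch: restore pre-7.0 split-on-whitespace behavior -->
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="sow">true</str>
  </lst>
</requestHandler>
```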

Looking at lucene.apache.org/solr/guide/7_4/tokenizers.html#standard-tokenizer 
it would seem that the period should not split the string into two tokens.

Could someone clarify how we can know which Tokenizer is used when, and which 
definition of White Space is used when?

Thanks


RE: SPLITSHARD not working as expected

2019-01-30 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
"Sometimes for one of the sub-shards, the new leader and one of the new 
followers end up on the same instance"

Actually, it seems to be the case that every single time in the entire history 
of SPLITSHARD for one of the sub-shards, both the new leader and one of the new 
followers end up on the exact same instance.

I asked several months ago (see below under "ATTACHED MESSAGE") whether anyone 
anywhere had ever seen a case where this bug did not occur, and it seems that 
no one has been able to provide a counterexample: I think we have to conclude 
that this bug is universal

-Original Message-
From: Chris Ulicny  
Sent: Wednesday, January 30, 2019 1:46 PM
To: solr-user@lucene.apache.org
Subject: Re: SPLITSHARD not working as expected

I'm not sure what the expected behavior is. However, as of 7.4.0, it
doesn't seem like there is any attempt to prevent both the new leader and
follower replicas from being created on the same instance.

Sometimes for one of the sub-shards, the new leader and one of the new
followers end up on the same instance. We just manually end up moving them
since we don't split shards very often.

Best,
Chris

On Wed, Jan 30, 2019 at 12:46 PM Rahul Goswami 
wrote:

> Hello,
> I have a followup question on SPLITSHARD behavior. I understand that after
> a split, the leader replicas of the sub shards would reside on the same
> node as the leader of the parent. However, is there an expected behavior
> for the follower replicas of the sub shards as to where they will be
> created post split?
>
> Regards,
> Rahul
>
>
>
> On Wed, Jan 30, 2019 at 1:18 AM Rahul Goswami 
> wrote:
>
> > Thanks for the reply Jan. I have been referring to documentation for
> > SPLITSHARD on 7.2.1
> > <
> https://lucene.apache.org/solr/guide/7_2/collections-api.html#splitshard>
> which
> > seems to be missing some important information present in 7.6
> > <
> https://lucene.apache.org/solr/guide/7_6/collections-api.html#splitshard>.
> > Especially these two pieces of information.:
> > "When using splitMethod=rewrite (default) you must ensure that the node
> > running the leader of the parent shard has enough free disk space i.e.,
> > more than twice the index size, for the split to succeed "
> >
> > "The first replicas of resulting sub-shards will always be placed on the
> > shard leader node"
> >
> > The idea of having an entire shard (both the replicas of it) present on
> > the same node did come across as an unexpected behavior at the beginning.
> > Anyway, I guess I am going to have to take care of the rebalancing with
> > MOVEREPLICA following a SPLITSHARD.
> >
> > Thanks for the clarification.
> >
> >
> > On Mon, Jan 28, 2019 at 3:40 AM Jan Høydahl 
> wrote:
> >
> >> This is normal. Please read
> >>
> https://lucene.apache.org/solr/guide/7_6/collections-api.html#splitshard
> >> PS: Images won't make it to the list, but don't think you need a
> >> screenshot here, what you describe is the default behaviour.
> >>
> >> --
> >> Jan Høydahl, search solution architect
> >> Cominvent AS - www.cominvent.com
> >>
> >> > 28. jan. 2019 kl. 09:05 skrev Rahul Goswami :
> >> >
> >> > Hello,
> >> > I am using Solr 7.2.1. I created a two node example collection on the
> >> same machine. Two shards with two replicas each. I then called
> SPLITSHARD
> >> on shard2 and expected the split shards to have one replica on each
> node.
> >> However I see that for shard2_1, both replicas reside on the same node.
> Is
> >> this a valid behavior?  Unless I am missing something, this could be
> >> potentially fatal.
> >> >
> >> > Here's the query and the cluster state post split:
> >> >
> >> > http://localhost:8983/solr/admin/collections?action=SPLITSHARD&collection=gettingstarted&shard=shard2&waitForFinalState=true
> >>
> >> >
> >> >
> >> >
> >> > Thanks,
> >> > Rahul
> >>
> >>
>
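For reference, the manual rebalance mentioned above (moving one of the doubled-up sub-shard replicas off the leader's node) would use the Collections API's MOVEREPLICA command; all names in this sketch are placeholders:

```shell
# Move a replica to another node after SPLITSHARD doubles up on one instance.
# Collection, replica, and node names are placeholders.
curl 'http://localhost:8983/solr/admin/collections?action=MOVEREPLICA&collection=gettingstarted&replica=core_node10&targetNode=otherhost:8983_solr'
```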








 ATTACHED MESSAGE 
-Original Message-
From: Oakley, Craig (NIH/NLM/NCBI) [C]  
Sent: Wednesday, September 19, 2018 4:52 PM
To: solr-user@lucene.apache.org
Subject: RE: sharding and placement of replicas

I am still wondering whether anyone has ever seen any examples of this actually 
working (has anyone ever seen any example of SPLITSHARD on a two-node SolrCloud 
placing replicas of each shard on different hosts than other replicas of 
the same shards)?

RE: sharding and placement of replicas

2018-09-19 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
I am still wondering whether anyone has ever seen any examples of this actually 
working (has anyone ever seen any example of SPLITSHARD on a two-node SolrCloud 
placing replicas of each shard on different hosts than other replicas of 
the same shards)?


Anyone?

-Original Message-
From: Oakley, Craig (NIH/NLM/NCBI) [C]  
Sent: Friday, August 10, 2018 12:54 PM
To: solr-user@lucene.apache.org
Subject: RE: sharding and placement of replicas

Note that I usually create collections with commands which contain (for example)

solr/admin/collections?action=CREATE&name=collectest&collection.configName=collectest&numShards=1&replicationFactor=1&createNodeSet=

I give one node in the createNodeSet and then ADDREPLICA to the other node.

In case this were related, I now tried it a different way, using a command 
which contains

solr/admin/collections?action=CREATE&name=collectest5&collection.configName=collectest&numShards=1&replicationFactor=2&createNodeSet=

I gave both nodes in the createNodeSet in this case. It created one replica on 
each node (each node being on a different host at the same port). This is what 
I would consider the expected behavior (refraining from putting two replicas of 
the same one shard on the same node)

After this I ran a command including

solr/admin/collections?action=SPLITSHARD&collection=collectest5&shard=shard1&indent=on&async=test20180810h

The result was still the same: one of the four new replicas was on one node and 
the other three were all together on the node from which I issued this command 
(including putting two replicas of the same shard on the same node).





I am wondering whether there are any examples of this actually working (any 
examples of SPLITSHARD occasionally placing replicas of each shard on 
different hosts than other replicas of the same shards)


-Original Message-
From: Oakley, Craig (NIH/NLM/NCBI) [C] [mailto:craig.oak...@nih.gov] 
Sent: Thursday, August 09, 2018 5:08 PM
To: solr-user@lucene.apache.org
Subject: RE: sharding and placement of replicas

Okay, I've tried again with two nodes running Solr7.4 on different hosts.

Before SPLITSHARD, collectest2_shard1_replica_n1 was on the host nosqltest22, 
and collectest2_shard1_replica_n3 was on the host nosqltest11

After running SPLITSHARD (on the nosqltest22 node), only 
collectest2_shard1_0_replica0 was added to nosqltest11; nosqltest22 became the 
location for collectest2_shard1_0_replica_n5 and 
collectest2_shard1_1_replica_n6 and collectest2_shard1_1_replica0 (and so if 
nosqltest22 were to be down, shard1_1 would not be available).


-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com] 
Sent: Tuesday, July 31, 2018 5:16 PM
To: solr-user 
Subject: Re: sharding and placement of replicas

Right, two JVMs on the same physical host with different ports are
"different Solrs" by default. If you had two replicas per shard and
both were on either Solr instance (same port) that would be
unexpected.

Problem is that this would have been a bug clear back in the Solr 4x
days so the fact that you say you saw it on 6.6 would be unexpected.

Of course if you have three replicas and two instances, I'd absolutely
expect that two replicas would be on one of them for each shard.

Best,
Erick

On Tue, Jul 31, 2018 at 12:24 PM, Oakley, Craig (NIH/NLM/NCBI) [C]
 wrote:
> In my case, when trying on Solr7.4 (in response to Shawn Heisey's 6/19/18 
> comment "If this is a provable and reproducible bug, and it's still a problem 
> in the current stable branch"), I had only installed Solr7.4 on one host, and 
> so I was testing with two nodes on the same host (different port numbers). I 
> had previously had the same symptom when the two nodes were on different 
> hosts, but that was with Solr6.6 -- I can try it again with Solr7.4 with two 
> hosts and report back.
>
> -Original Message-
> From: Shawn Heisey [mailto:apa...@elyograg.org]
> Sent: Tuesday, July 31, 2018 2:26 PM
> To: solr-user@lucene.apache.org
> Subject: Re: sharding and placement of replicas
>
> On 7/27/2018 8:26 PM, Erick Erickson wrote:
>> Yes with some fiddling as far as "placement rules", start here:
>> https://lucene.apache.org/solr/guide/6_6/rule-based-replica-placement.html
>>
>> The idea (IIUC) is that you provide a snitch" that identifies what
>> "rack" the Solr instance is on and can define placement rules that
>> define "don't put more than one thingy on the same rack". "Thingy"
>> here is replica, shard, whatever as defined by other placement rules.
>
> I'd like to see an improvement in Solr's behavior when nothing has been
> configured in auto-scaling or rule-based replica placement.  Configuring
> those things is certainly an option, but I think we can do better even
> without that config.
>
> I believe that Solr already has some default intelligence that keeps
> multiple replicas from ending up on the same *node* when possible ... I
> would like this to also be aware of *hosts*.

RE: sharding and placement of replicas

2018-08-10 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
Note that I usually create collections with commands which contain (for example)

solr/admin/collections?action=CREATE&name=collectest&collection.configName=collectest&numShards=1&replicationFactor=1&createNodeSet=

I give one node in the createNodeSet and then ADDREPLICA to the other node.

In case this were related, I now tried it a different way, using a command 
which contains

solr/admin/collections?action=CREATE&name=collectest5&collection.configName=collectest&numShards=1&replicationFactor=2&createNodeSet=

I gave both nodes in the createNodeSet in this case. It created one replica on 
each node (each node being on a different host at the same port). This is what 
I would consider the expected behavior (refraining from putting two replicas of 
the same one shard on the same node)

After this I ran a command including

solr/admin/collections?action=SPLITSHARD&collection=collectest5&shard=shard1&indent=on&async=test20180810h

The result was still the same: one of the four new replicas was on one node and 
the other three were all together on the node from which I issued this command 
(including putting two replicas of the same shard on the same node).





I am wondering whether there are any examples of this actually working (any 
examples of SPLITSHARD occasionally placing replicas of each shard on 
different hosts than other replicas of the same shards)


-Original Message-
From: Oakley, Craig (NIH/NLM/NCBI) [C] [mailto:craig.oak...@nih.gov] 
Sent: Thursday, August 09, 2018 5:08 PM
To: solr-user@lucene.apache.org
Subject: RE: sharding and placement of replicas

Okay, I've tried again with two nodes running Solr7.4 on different hosts.

Before SPLITSHARD, collectest2_shard1_replica_n1 was on the host nosqltest22, 
and collectest2_shard1_replica_n3 was on the host nosqltest11

After running SPLITSHARD (on the nosqltest22 node), only 
collectest2_shard1_0_replica0 was added to nosqltest11; nosqltest22 became the 
location for collectest2_shard1_0_replica_n5 and 
collectest2_shard1_1_replica_n6 and collectest2_shard1_1_replica0 (and so if 
nosqltest22 were to be down, shard1_1 would not be available).


-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com] 
Sent: Tuesday, July 31, 2018 5:16 PM
To: solr-user 
Subject: Re: sharding and placement of replicas

Right, two JVMs on the same physical host with different ports are
"different Solrs" by default. If you had two replicas per shard and
both were on either Solr instance (same port) that would be
unexpected.

Problem is that this would have been a bug clear back in the Solr 4x
days so the fact that you say you saw it on 6.6 would be unexpected.

Of course if you have three replicas and two instances, I'd absolutely
expect that two replicas would be on one of them for each shard.

Best,
Erick

On Tue, Jul 31, 2018 at 12:24 PM, Oakley, Craig (NIH/NLM/NCBI) [C]
 wrote:
> In my case, when trying on Solr7.4 (in response to Shawn Heisey's 6/19/18 
> comment "If this is a provable and reproducible bug, and it's still a problem 
> in the current stable branch"), I had only installed Solr7.4 on one host, and 
> so I was testing with two nodes on the same host (different port numbers). I 
> had previously had the same symptom when the two nodes were on different 
> hosts, but that was with Solr6.6 -- I can try it again with Solr7.4 with two 
> hosts and report back.
>
> -Original Message-
> From: Shawn Heisey [mailto:apa...@elyograg.org]
> Sent: Tuesday, July 31, 2018 2:26 PM
> To: solr-user@lucene.apache.org
> Subject: Re: sharding and placement of replicas
>
> On 7/27/2018 8:26 PM, Erick Erickson wrote:
>> Yes with some fiddling as far as "placement rules", start here:
>> https://lucene.apache.org/solr/guide/6_6/rule-based-replica-placement.html
>>
>> The idea (IIUC) is that you provide a snitch" that identifies what
>> "rack" the Solr instance is on and can define placement rules that
>> define "don't put more than one thingy on the same rack". "Thingy"
>> here is replica, shard, whatever as defined by other placement rules.
>
> I'd like to see an improvement in Solr's behavior when nothing has been
> configured in auto-scaling or rule-based replica placement.  Configuring
> those things is certainly an option, but I think we can do better even
> without that config.
>
> I believe that Solr already has some default intelligence that keeps
> multiple replicas from ending up on the same *node* when possible ... I
> would like this to also be aware of *hosts*.
>
> Craig hasn't yet indicated whether there is more than one node per host,
> so I don't know whether the behavior he's seeing should be considered a bug.
>
> If somebody gives one machine multiple names/addresses and uses
> different hostnames in their SolrCloud config for one actual host, then
> it wouldn't be able to do any better than it does now, but if there are
> matches in the hostname part of different entries in live_nodes, then I
> think the improvement might be relatively easy.

RE: sharding and placement of replicas

2018-08-09 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
Okay, I've tried again with two nodes running Solr7.4 on different hosts.

Before SPLITSHARD, collectest2_shard1_replica_n1 was on the host nosqltest22, 
and collectest2_shard1_replica_n3 was on the host nosqltest11

After running SPLITSHARD (on the nosqltest22 node), only 
collectest2_shard1_0_replica0 was added to nosqltest11; nosqltest22 became the 
location for collectest2_shard1_0_replica_n5 and 
collectest2_shard1_1_replica_n6 and collectest2_shard1_1_replica0 (and so if 
nosqltest22 were to be down, shard1_1 would not be available).


-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com] 
Sent: Tuesday, July 31, 2018 5:16 PM
To: solr-user 
Subject: Re: sharding and placement of replicas

Right, two JVMs on the same physical host with different ports are
"different Solrs" by default. If you had two replicas per shard and
both were on either Solr instance (same port) that would be
unexpected.

Problem is that this would have been a bug clear back in the Solr 4x
days so the fact that you say you saw it on 6.6 would be unexpected.

Of course if you have three replicas and two instances, I'd absolutely
expect that two replicas would be on one of them for each shard.

Best,
Erick

On Tue, Jul 31, 2018 at 12:24 PM, Oakley, Craig (NIH/NLM/NCBI) [C]
 wrote:
> In my case, when trying on Solr7.4 (in response to Shawn Heisey's 6/19/18 
> comment "If this is a provable and reproducible bug, and it's still a problem 
> in the current stable branch"), I had only installed Solr7.4 on one host, and 
RE: sharding and placement of replicas

2018-07-31 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
In my case, when trying on Solr7.4 (in response to Shawn Heisey's 6/19/18 
comment "If this is a provable and reproducible bug, and it's still a problem 
in the current stable branch"), I had only installed Solr7.4 on one host, and 
so I was testing with two nodes on the same host (different port numbers). I 
had previously had the same symptom when the two nodes were on different hosts, 
but that was with Solr6.6 -- I can try it again with Solr7.4 with two hosts and 
report back.

-Original Message-
From: Shawn Heisey [mailto:apa...@elyograg.org] 
Sent: Tuesday, July 31, 2018 2:26 PM
To: solr-user@lucene.apache.org
Subject: Re: sharding and placement of replicas

On 7/27/2018 8:26 PM, Erick Erickson wrote:
> Yes with some fiddling as far as "placement rules", start here:
> https://lucene.apache.org/solr/guide/6_6/rule-based-replica-placement.html
>
> The idea (IIUC) is that you provide a "snitch" that identifies what
> "rack" the Solr instance is on and can define placement rules that
> define "don't put more than one thingy on the same rack". "Thingy"
> here is replica, shard, whatever as defined by other placement rules.

I'd like to see an improvement in Solr's behavior when nothing has been
configured in auto-scaling or rule-based replica placement.  Configuring
those things is certainly an option, but I think we can do better even
without that config.

I believe that Solr already has some default intelligence that keeps
multiple replicas from ending up on the same *node* when possible ... I
would like this to also be aware of *hosts*.

Craig hasn't yet indicated whether there is more than one node per host,
so I don't know whether the behavior he's seeing should be considered a bug.

If somebody gives one machine multiple names/addresses and uses
different hostnames in their SolrCloud config for one actual host, then
it wouldn't be able to do any better than it does now, but if there are
matches in the hostname part of different entries in live_nodes, then I
think the improvement might be relatively easy.  Not saying that I know
what to do, but somebody who is familiar with the Collections API code
can probably do it.

Thanks,
Shawn



RE: sharding and placement of replicas

2018-07-25 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
I just now tried it with Solr7.4 and am getting the same symptoms as I describe 
below.

The symptoms I describe are quite different from my impression of Shawn 
Heisey's impression of my symptoms, so I will describe my symptoms again.

Let us assume that we start with a SolrCloud of two nodes: one at 
hostname1: and the other at hostname2:

Let us assume that we have a one-shard collection with two replicas. One of the 
replicas is on the node at hostname1: (with the core col_shard1_replica_n1) 
and the other on the node at hostname2: (with the core 
col_shard1_replica_n3)

Then I run SPLITSHARD

I end up with four cores instead of two, as expected. The problem is that three 
of the four cores (col_shard1_0_replica_n5, col_shard1_0_replica0 and 
col_shard1_1_replica_n6) are *all on hostname1*. Only col_shard1_1_replica0 was 
placed on hostname2.

Prior to the SPLITSHARD, if hostname1 becomes temporarily unavailable, the 
SolrCloud can still be used: hostname2 has all the data.

After the SPLITSHARD, if hostname1 becomes temporarily unavailable, the 
SolrCloud does not have any access to the data in shard1_0

Granted, I could add a replica of shard1_0 onto hostname2, and I could then 
drop one of the extraneous shard1_0 replicas which are on hostname1: but I 
don't see the logic in requiring such additional steps every time.



My question is: How can I tell Solr "avoid putting two replicas of the same 
shard on the same node"?
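One answer, pointed to elsewhere in this thread, is rule-based replica placement. A minimal sketch of the Collections API call that would express the constraint, assuming a hypothetical collection named `col` and the Solr 6.x rule syntax (`replica:<2` reads "fewer than two replicas of any one shard per node"):

```python
from urllib.parse import urlencode

# Hypothetical collection/host names. The rule "shard:*,replica:<2,node:*"
# asks Solr's rule-based placement to keep fewer than two replicas of any
# one shard on any one node.
params = urlencode({
    "action": "MODIFYCOLLECTION",
    "collection": "col",
    "rule": "shard:*,replica:<2,node:*",
})
url = "http://hostname1:8983/solr/admin/collections?" + params
print(url)
```

The same `rule` parameter can also be supplied when the collection is first created (action=CREATE).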



-Original Message-
From: Shawn Heisey [mailto:apa...@elyograg.org] 
Sent: Tuesday, June 19, 2018 2:20 PM
To: solr-user@lucene.apache.org
Subject: Re: sharding and placement of replicas

On 6/15/2018 11:08 AM, Oakley, Craig (NIH/NLM/NCBI) [C] wrote:
> If I start with a collection X on two nodes with one shard and two replicas 
> (for redundancy, in case a node goes down): a node on host1 has 
> X_shard1_replica1 and a node on host2 has X_shard1_replica2: when I try 
> SPLITSHARD, I generally get X_shard1_0_replica1, X_shard1_1_replica1 and 
> X_shard1_0_replica0 all on the node on host1 with X_shard1_1_replica0 sitting 
> alone on the node on host2. If host1 were to go down at this point, shard1_0 
> would be unavailable.

https://lucene.apache.org/solr/guide/6_6/collections-api.html#CollectionsAPI-splitshard

That documentation says "The new shards will have as many replicas as
the original shard."  That tells me that what you're seeing is not
matching the *intent* of the SPLITSHARD feature.  The fact that you get
*one* of the new shards but not the other is suspicious.  I'm wondering
if maybe Solr tried to create it but had a problem doing so.  Can you
check for errors in the solr logfile on host2?

If there's nothing about your environment that would cause a failure to
create the replica, then it might be a bug.

> Is there a way either of specifying placement or of giving hints that 
> replicas ought to be separated?

It shouldn't be necessary to give Solr any parameters for that.  All
nodes where the shard exists should get copies of the new shards when
you split it.

> I am currently running Solr6.6.0, if that is relevant.

If this is a provable and reproducible bug, and it's still a problem in
the current stable branch (next release from that will be 7.4.0), then
it will definitely be fixed.  If it's only a problem in 6.x, then I
can't guarantee that it will be fixed.  That's because the 6.x line is
in maintenance mode, which means that there's a very high bar for
changes.  In most cases, only changes that meet one of these criteria
are made in maintenance mode:

 * Fixes a security bug.
 * Fixes a MAJOR bug with no workaround.
 * Fix is a very trivial code change and not likely to introduce new bugs.

Of those criteria, generally only the first two are likely to prompt an
actual new software release.  If enough changes of the third type
accumulate, that might prompt a new release.

My personal opinion:  If this is a general problem in 6.x, it should be
fixed there.  Because there is a workaround, it would not be cause for
an immediate new release.

Thanks,
Shawn



sharding and placement of replicas

2018-06-15 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
If I start with a collection X on two nodes with one shard and two replicas 
(for redundancy, in case a node goes down): a node on host1 has 
X_shard1_replica1 and a node on host2 has X_shard1_replica2: when I try 
SPLITSHARD, I generally get X_shard1_0_replica1, X_shard1_1_replica1 and 
X_shard1_0_replica0 all on the node on host1 with X_shard1_1_replica0 sitting 
alone on the node on host2. If host1 were to go down at this point, shard1_0 
would be unavailable.

I realize I do have the option to ADDREPLICA creating X_shard1_0_replica2 on 
the node on host2 and then to DELETEREPLICA for X_shard1_0_replica0: but I 
don't see the logic behind requiring this extra step. Of the half dozen times I 
have experimented with SPLITSHARD (starting with one shard and two replicas on 
separate nodes), it always puts three-out-of-four of the new cores on the same 
node.

Is there a way either of specifying placement or of giving hints that replicas 
ought to be separated?

I am currently running Solr6.6.0, if that is relevant.


sharding guidelines

2018-06-04 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
I have a sharding question.





We have a collection (one shard, two replicas, currently running Solr6.6) which 
sometimes becomes unresponsive on the non-leader node. It is 214 gigabytes, and 
we were wondering whether there is a rule of thumb how large to allow a core to 
grow before sharding. I have a reference in my notes from the 2015 Solr 
conference in Austin "baseline no more than 100 million docs/shard" and "ideal 
shard-to-memory ratio, if at all possible index should fit into RAM, but other 
than that it gets really specific really fast"; but that was several versions 
ago, and so I wanted to ask whether these suggestions have been recalculated.

Thanks


RE: solr.log rotation

2017-10-02 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
My guess would be to edit server/resources/log4j.properties to have

log4j.appender.file.MaxBackupIndex=0
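For context, the surrounding appender section of a stock Solr log4j.properties looks roughly like this (a sketch; the MaxFileSize value and file path are illustrative -- only the MaxBackupIndex line comes from the suggestion above):

```properties
# Rolling file appender section of server/resources/log4j.properties (sketch).
# MaxBackupIndex controls how many rotated solr.log.N files are kept;
# setting it to 0 keeps none.
log4j.appender.file=org.apache.log4j.RollingFileAppender
log4j.appender.file.MaxFileSize=4MB
log4j.appender.file.MaxBackupIndex=0
log4j.appender.file.File=${solr.log}/solr.log
```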

-Original Message-
From: Noriyuki TAKEI [mailto:nta...@sios.com] 
Sent: Monday, October 02, 2017 10:39 AM
To: solr-user@lucene.apache.org
Subject: solr.log rotation

HI,All

When I restart Solr Service, solr.log is rotated as below.

solr.log.1
solr.log.2
solr.log.3
...

I would like to stop this rotation.

Do you have Any idea?



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


RE: Anonymous Read?

2017-06-06 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
We usually end security.json with the permissions

    {
      "name": "open_select",
      "path": "/select/*",
      "role": null},
    {
      "name": "all-admin",
      "collection": null,
      "path": "/*",
      "role": "allgen"},
    {
      "name": "all-core-handlers",
      "path": "/*",
      "role": "allgen"}]
  } }


...and then assign the "allgen" role to all users

This allows a select without a login & password, but requires a login & 
password for anything else (including the front page of the GUI)
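For reference, a minimal sketch of how permission entries like these sit inside a complete security.json. The `solradmin` user name, its role mapping, and the credential placeholder are hypothetical; the stored credential is a salted SHA-256 hash in the format the Sha256AuthenticationProvider expects:

```json
{
  "authentication": {
    "class": "solr.BasicAuthPlugin",
    "credentials": {
      "solradmin": "<base64 sha256 hash> <base64 salt>"
    }
  },
  "authorization": {
    "class": "solr.RuleBasedAuthorizationPlugin",
    "user-role": { "solradmin": ["allgen"] },
    "permissions": [
      { "name": "open_select", "path": "/select/*", "role": null },
      { "name": "all-admin", "collection": null, "path": "/*", "role": "allgen" },
      { "name": "all-core-handlers", "path": "/*", "role": "allgen" }
    ]
  }
}
```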

-Original Message-
From: Solr User [mailto:solr...@gmail.com] 
Sent: Tuesday, June 06, 2017 2:27 PM
To: solr-user@lucene.apache.org
Subject: Anonymous Read?

Is it possible to setup Solr security to allow anonymous query (/select
etc.) but restricted access to other permissions as described in
https://lucidworks.com/2015/08/17/securing-solr-basic-auth-permission-rules/
?


RE: Underlying file changed by an external force

2017-05-11 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
FWIW, we now have a hypothetical suspect. We are getting these errors on three 
CentOS7 hosts, each of which recently had antivirus software installed.

-Original Message-
From: Oakley, Craig (NIH/NLM/NCBI) [C] [mailto:craig.oak...@nih.gov] 
Sent: Thursday, May 11, 2017 11:03 AM
To: solr-user@lucene.apache.org
Subject: RE: Underlying file changed by an external force

None of them have dataDir properties: they just use the "data" subdirectory in 
the same directory as the core.properties

-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com] 
Sent: Wednesday, May 10, 2017 6:59 PM
To: solr-user <solr-user@lucene.apache.org>
Subject: Re: Underlying file changed by an external force

bq: All the core.properties files are each in their own directory with
no overlap

Not quite what I was asking. By definition, all core.properties are in
their own directory. In fact Solr stops looking down the tree when it
finds the first directory with core.properties in it and immediately
moves on to the next sibling directory.

_Inside_ the core.properties files, are there any dataDir properties
pointing to the same place as in any other core.properties? Note,
dataDir properties usually aren't even present unless you did
something special so don't be surprised if there's nothing there.

Best,
Erick

On Wed, May 10, 2017 at 10:56 AM, Oakley, Craig (NIH/NLM/NCBI) [C]
<craig.oak...@nih.gov> wrote:
>> You need to look at all of your core.properties files and see if any of them 
>> point to the same data directory.
>
> All the core.properties files are each in their own directory with no overlap.
>
>> Second: if you issue a "kill -9" you can leave write locks lingering.
>
> We manage our Solr instances with supervisor, which can send a "kill -9" if 
> "kill -6" does not suffice; but the problem tends to manifest itself at some 
> time other than startup
>
> The Solr version is 5.4.1, in case that is relevant.
>
>
> -Original Message-
> From: Erick Erickson [mailto:erickerick...@gmail.com]
> Sent: Thursday, May 04, 2017 3:20 PM
> To: solr-user <solr-user@lucene.apache.org>
> Subject: Re: Underlying file changed by an external force
>
> You need to look at all of your core.properties files and see if any
> of them point to the same data directory.
>
> Second: if you issue a "kill -9" you can leave write locks lingering.
>
> Best,
> Erick
>
> On Thu, May 4, 2017 at 11:00 AM, Oakley, Craig (NIH/NLM/NCBI) [C]
> <craig.oak...@nih.gov> wrote:
>> We have been having problems with different collections on different 
>> SolrCloud clusters, all seeming to be related to the write.lock file with 
>> stack traces similar to the following. Are there any suggestions what might 
>> be the cause and what might be the solution? Thanks
>>
>>
>> org.apache.lucene.store.AlreadyClosedException: Underlying file changed by 
>> an external force at 2017-04-13T20:43:08.630152Z, 
>> (lock=NativeFSLock(path=/data/solr/biosample/dba_test_shard1_replica1/data/index/write.lock,impl=sun.nio.ch.FileLockImpl[0:9223372036854775807
>>  exclusive valid],ctime=2017-04-13T20:43:08.630152Z))
>>
>>at 
>> org.apache.lucene.store.NativeFSLockFactory$NativeFSLock.ensureValid(NativeFSLockFactory.java:179)
>>
>>at 
>> org.apache.lucene.store.LockValidatingDirectoryWrapper.deleteFile(LockValidatingDirectoryWrapper.java:37)
>>
>>at 
>> org.apache.lucene.index.IndexFileDeleter.deleteFile(IndexFileDeleter.java:732)
>>
>>at 
>> org.apache.lucene.index.IndexFileDeleter.deletePendingFiles(IndexFileDeleter.java:503)
>>
>>at 
>> org.apache.lucene.index.IndexFileDeleter.refresh(IndexFileDeleter.java:448)
>>
>>at 
>> org.apache.lucene.index.IndexWriter.rollbackInternalNoCommit(IndexWriter.java:2099)
>>
>>at 
>> org.apache.lucene.index.IndexWriter.rollbackInternal(IndexWriter.java:2041)
>>
>>at org.apache.lucene.index.IndexWriter.shutdown(IndexWriter.java:1083)
>>
>>at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:1125)
>>
>>at 
>> org.apache.solr.update.SolrIndexWriter.close(SolrIndexWriter.java:131)
>>
>>at 
>> org.apache.solr.update.DefaultSolrCoreState.changeWriter(DefaultSolrCoreState.java:183)
>>
>>at 
>> org.apache.solr.update.DefaultSolrCoreState.newIndexWriter(DefaultSolrCoreState.java:207)
>>
>>at org.apache.solr.core.SolrCore.reload(SolrCore.java:472)
>>
>>at org.apache.solr.core.CoreContainer.reload(CoreCont

RE: Underlying file changed by an external force

2017-05-11 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
None of them have dataDir properties: they just use the "data" subdirectory in 
the same directory as the core.properties

-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com] 
Sent: Wednesday, May 10, 2017 6:59 PM
To: solr-user <solr-user@lucene.apache.org>
Subject: Re: Underlying file changed by an external force

bq: All the core.properties files are each in their own directory with
no overlap

Not quite what I was asking. By definition, all core.properties are in
their own directory. In fact Solr stops looking down the tree when it
finds the first directory with core.properties in it and immediately
moves on to the next sibling directory.

_Inside_ the core.properties files, are there any dataDir properties
pointing to the same place as in any other core.properties? Note,
dataDir properties usually aren't even present unless you did
something special so don't be surprised if there's nothing there.

Best,
Erick

On Wed, May 10, 2017 at 10:56 AM, Oakley, Craig (NIH/NLM/NCBI) [C]
<craig.oak...@nih.gov> wrote:
>> You need to look at all of your core.properties files and see if any of them 
>> point to the same data directory.
>
> All the core.properties files are each in their own directory with no overlap.
>
>> Second: if you issue a "kill -9" you can leave write locks lingering.
>
> We manage our Solr instances with supervisor, which can send a "kill -9" if 
> "kill -6" does not suffice; but the problem tends to manifest itself at some 
> time other than startup
>
> The Solr version is 5.4.1, in case that is relevant.
>
>
> -Original Message-
> From: Erick Erickson [mailto:erickerick...@gmail.com]
> Sent: Thursday, May 04, 2017 3:20 PM
> To: solr-user <solr-user@lucene.apache.org>
> Subject: Re: Underlying file changed by an external force
>
> You need to look at all of your core.properties files and see if any
> of them point to the same data directory.
>
> Second: if you issue a "kill -9" you can leave write locks lingering.
>
> Best,
> Erick
>
> On Thu, May 4, 2017 at 11:00 AM, Oakley, Craig (NIH/NLM/NCBI) [C]
> <craig.oak...@nih.gov> wrote:
>> We have been having problems with different collections on different 
>> SolrCloud clusters, all seeming to be related to the write.lock file with 
>> stack traces similar to the following. Are there any suggestions what might 
>> be the cause and what might be the solution? Thanks
>>
>>
>> org.apache.lucene.store.AlreadyClosedException: Underlying file changed by 
>> an external force at 2017-04-13T20:43:08.630152Z, 
>> (lock=NativeFSLock(path=/data/solr/biosample/dba_test_shard1_replica1/data/index/write.lock,impl=sun.nio.ch.FileLockImpl[0:9223372036854775807
>>  exclusive valid],ctime=2017-04-13T20:43:08.630152Z))
>>
>>at 
>> org.apache.lucene.store.NativeFSLockFactory$NativeFSLock.ensureValid(NativeFSLockFactory.java:179)
>>
>>at 
>> org.apache.lucene.store.LockValidatingDirectoryWrapper.deleteFile(LockValidatingDirectoryWrapper.java:37)
>>
>>at 
>> org.apache.lucene.index.IndexFileDeleter.deleteFile(IndexFileDeleter.java:732)
>>
>>at 
>> org.apache.lucene.index.IndexFileDeleter.deletePendingFiles(IndexFileDeleter.java:503)
>>
>>at 
>> org.apache.lucene.index.IndexFileDeleter.refresh(IndexFileDeleter.java:448)
>>
>>at 
>> org.apache.lucene.index.IndexWriter.rollbackInternalNoCommit(IndexWriter.java:2099)
>>
>>at 
>> org.apache.lucene.index.IndexWriter.rollbackInternal(IndexWriter.java:2041)
>>
>>at org.apache.lucene.index.IndexWriter.shutdown(IndexWriter.java:1083)
>>
>>at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:1125)
>>
>>at 
>> org.apache.solr.update.SolrIndexWriter.close(SolrIndexWriter.java:131)
>>
>>at 
>> org.apache.solr.update.DefaultSolrCoreState.changeWriter(DefaultSolrCoreState.java:183)
>>
>>at 
>> org.apache.solr.update.DefaultSolrCoreState.newIndexWriter(DefaultSolrCoreState.java:207)
>>
>>at org.apache.solr.core.SolrCore.reload(SolrCore.java:472)
>>
>>at org.apache.solr.core.CoreContainer.reload(CoreContainer.java:849)
>>
>>at 
>> org.apache.solr.handler.admin.CoreAdminHandler.handleReloadAction(CoreAdminHandler.java:768)
>>
>>at 
>> org.apache.solr.handler.admin.CoreAdminHandler.handleRequestInternal(CoreAdminHandler.java:230)
>>
>>at 
>> org.apache.solr.handler.admin.CoreAdminHandler.handleRequestBod

RE: Underlying file changed by an external force

2017-05-10 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
> You need to look at all of your core.properties files and see if any of them 
> point to the same data directory.

All the core.properties files are each in their own directory with no overlap.

> Second: if you issue a "kill -9" you can leave write locks lingering.

We manage our Solr instances with supervisor, which can send a "kill -9" if 
"kill -6" does not suffice; but the problem tends to manifest itself at some 
time other than startup

The Solr version is 5.4.1, in case that is relevant.


-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com] 
Sent: Thursday, May 04, 2017 3:20 PM
To: solr-user <solr-user@lucene.apache.org>
Subject: Re: Underlying file changed by an external force

You need to look at all of your core.properties files and see if any
of them point to the same data directory.

Second: if you issue a "kill -9" you can leave write locks lingering.

Best,
Erick

On Thu, May 4, 2017 at 11:00 AM, Oakley, Craig (NIH/NLM/NCBI) [C]
<craig.oak...@nih.gov> wrote:
> We have been having problems with different collections on different 
> SolrCloud clusters, all seeming to be related to the write.lock file with 
> stack traces similar to the following. Are there any suggestions what might 
> be the cause and what might be the solution? Thanks
>
>
> org.apache.lucene.store.AlreadyClosedException: Underlying file changed by an 
> external force at 2017-04-13T20:43:08.630152Z, 
> (lock=NativeFSLock(path=/data/solr/biosample/dba_test_shard1_replica1/data/index/write.lock,impl=sun.nio.ch.FileLockImpl[0:9223372036854775807
>  exclusive valid],ctime=2017-04-13T20:43:08.630152Z))
>
>at 
> org.apache.lucene.store.NativeFSLockFactory$NativeFSLock.ensureValid(NativeFSLockFactory.java:179)
>
>at 
> org.apache.lucene.store.LockValidatingDirectoryWrapper.deleteFile(LockValidatingDirectoryWrapper.java:37)
>
>at 
> org.apache.lucene.index.IndexFileDeleter.deleteFile(IndexFileDeleter.java:732)
>
>at 
> org.apache.lucene.index.IndexFileDeleter.deletePendingFiles(IndexFileDeleter.java:503)
>
>at 
> org.apache.lucene.index.IndexFileDeleter.refresh(IndexFileDeleter.java:448)
>
>at 
> org.apache.lucene.index.IndexWriter.rollbackInternalNoCommit(IndexWriter.java:2099)
>
>at 
> org.apache.lucene.index.IndexWriter.rollbackInternal(IndexWriter.java:2041)
>
>at org.apache.lucene.index.IndexWriter.shutdown(IndexWriter.java:1083)
>
>at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:1125)
>
>at 
> org.apache.solr.update.SolrIndexWriter.close(SolrIndexWriter.java:131)
>
>at 
> org.apache.solr.update.DefaultSolrCoreState.changeWriter(DefaultSolrCoreState.java:183)
>
>at 
> org.apache.solr.update.DefaultSolrCoreState.newIndexWriter(DefaultSolrCoreState.java:207)
>
>at org.apache.solr.core.SolrCore.reload(SolrCore.java:472)
>
>at org.apache.solr.core.CoreContainer.reload(CoreContainer.java:849)
>
>at 
> org.apache.solr.handler.admin.CoreAdminHandler.handleReloadAction(CoreAdminHandler.java:768)
>
>at 
> org.apache.solr.handler.admin.CoreAdminHandler.handleRequestInternal(CoreAdminHandler.java:230)
>
>at 
> org.apache.solr.handler.admin.CoreAdminHandler.handleRequestBody(CoreAdminHandler.java:184)
>
>at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:156)
>
>at 
> org.apache.solr.servlet.HttpSolrCall.handleAdminRequest(HttpSolrCall.java:664)
>
>at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:438)
>
>at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:223)
>
>at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:181)
>
>at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
>
>at 
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
>
>at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>
>at 
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577)
>
>at 
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)
>
>at 
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
>
>at 
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
>
>at 
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
>
>at 
> org.eclipse.jetty.server.handler.Context

Underlying file changed by an external force

2017-05-04 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
We have been having problems with different collections on different SolrCloud 
clusters, all seeming to be related to the write.lock file with stack traces 
similar to the following. Are there any suggestions what might be the cause and 
what might be the solution? Thanks


org.apache.lucene.store.AlreadyClosedException: Underlying file changed by an 
external force at 2017-04-13T20:43:08.630152Z, 
(lock=NativeFSLock(path=/data/solr/biosample/dba_test_shard1_replica1/data/index/write.lock,impl=sun.nio.ch.FileLockImpl[0:9223372036854775807
 exclusive valid],ctime=2017-04-13T20:43:08.630152Z))

   at 
org.apache.lucene.store.NativeFSLockFactory$NativeFSLock.ensureValid(NativeFSLockFactory.java:179)

   at 
org.apache.lucene.store.LockValidatingDirectoryWrapper.deleteFile(LockValidatingDirectoryWrapper.java:37)

   at 
org.apache.lucene.index.IndexFileDeleter.deleteFile(IndexFileDeleter.java:732)

   at 
org.apache.lucene.index.IndexFileDeleter.deletePendingFiles(IndexFileDeleter.java:503)

   at 
org.apache.lucene.index.IndexFileDeleter.refresh(IndexFileDeleter.java:448)

   at 
org.apache.lucene.index.IndexWriter.rollbackInternalNoCommit(IndexWriter.java:2099)

   at 
org.apache.lucene.index.IndexWriter.rollbackInternal(IndexWriter.java:2041)

   at org.apache.lucene.index.IndexWriter.shutdown(IndexWriter.java:1083)

   at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:1125)

   at org.apache.solr.update.SolrIndexWriter.close(SolrIndexWriter.java:131)

   at 
org.apache.solr.update.DefaultSolrCoreState.changeWriter(DefaultSolrCoreState.java:183)

   at 
org.apache.solr.update.DefaultSolrCoreState.newIndexWriter(DefaultSolrCoreState.java:207)

   at org.apache.solr.core.SolrCore.reload(SolrCore.java:472)

   at org.apache.solr.core.CoreContainer.reload(CoreContainer.java:849)

   at 
org.apache.solr.handler.admin.CoreAdminHandler.handleReloadAction(CoreAdminHandler.java:768)

   at 
org.apache.solr.handler.admin.CoreAdminHandler.handleRequestInternal(CoreAdminHandler.java:230)

   at 
org.apache.solr.handler.admin.CoreAdminHandler.handleRequestBody(CoreAdminHandler.java:184)

   at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:156)

   at 
org.apache.solr.servlet.HttpSolrCall.handleAdminRequest(HttpSolrCall.java:664)

   at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:438)

   at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:223)

   at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:181)

   at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)

   at 
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)

   at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)

   at 
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577)

   at 
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)

   at 
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)

   at 
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)

   at 
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)

   at 
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)

   at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)

   at 
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:215)

   at 
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:110)

   at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)

   at org.eclipse.jetty.server.Server.handle(Server.java:499)

   at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:310)

   at 
org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:257)

   at 
org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)

   at 
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)

   at 
org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)

   at java.lang.Thread.run(Thread.java:745)



CDCR & firewall holes

2017-05-01 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
We are considering using Cross Data Center Replication between SolrClouds in 
different domains which have a firewall between them. Is it documented anywhere 
how many firewall holes will be needed? From each source SolrCloud node to each 
target SolrCloud node? From each target SolrCloud node to each source SolrCloud 
node? From each source SolrCloud node to each target Zookeeper node? Do the 
target SolrCloud nodes ever need to talk to the source Zookeeper nodes (or vice 
versa)? Is there a need for communication between the two Zookeeper clusters?

Thanks



RE: post.jar with security.json

2016-07-11 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
FYI

After some time, we revisited this issue, and found that post.jar *DOES* work 
with security.json after all.

My test of whether or not it would work happened to have a typo in the port 
number; and I misinterpreted the error message as an erroneous indication that 
post.jar would not work with security.json -- as it turns out, it *DOES* work

Sorry for the confusion

-Original Message-
From: Upayavira [mailto:u...@odoko.co.uk] 
Sent: Tuesday, December 29, 2015 2:11 PM
To: solr-user@lucene.apache.org
Subject: Re: post.jar with security.json

You will probably find that the SimplePostTool (aka post.jar) has not
been updated to take into account security.json functionality.

Thus, the way to do this would be to look at the source code (it will
just use SolrJ to connect to Solr) and make enhancements to get it to
work (or if you're not familiar with Java, get someone else to do it).
Unfortunately, that is the nature of open source - there's so many such
features that *could* be extended, they tend to get the feature when
someone actually needs it.

Upayavira

On Tue, Dec 29, 2015, at 06:14 PM, Oakley, Craig (NIH/NLM/NCBI) [C]
wrote:
> I do have authorization and authentication setup in security.json: the
> question is how to pass the login and password into post.jar and/or into
> solr-5.4.0/bin/post -- it does not seem to like the
> user:pswd@host:8983/solr/corename/update syntax from SOLR-5960: when I
> try that, it complains "SimplePostTool: FATAL: Connection error (is Solr
> running at http://user:pswd@hostname:8983/solr/five4a/update ?):
> java.net.ConnectException: Connection refused", and nothing shows up in
> solr.log (although I do set
> log4j.logger.org.eclipse.jetty.server.Server=DEBUG to check for 401
> errors, etc).
> 
> FYI, I get a 404 from the link you cited: perhaps I don't have access, or
> perhaps you meant
> https://lucidworks.com/blog/2015/08/17/securing-solr-basic-auth-permission-rules
> (although that doesn't mention post.jar)
> 
> -Original Message-
> From: esther.quan...@lucidworks.com
> [mailto:esther.quan...@lucidworks.com] 
> Sent: Tuesday, December 29, 2015 12:54 PM
> To: solr-user@lucene.apache.org
> Subject: Re: post.jar with security.json
> 
> Hi Craig,
> 
> To pass the username and password, you'll want to enable authorization
> and authentication in security.json as is mentioned in this blog post in
> step 1 of "Enabling Basic Authentication". 
> 
> https://lucidworks.com/blog/2015/08/17/securing-solr-basic-auth--rules/
> 
> Is this what you're looking for?
> 
> Thanks,
> 
> Esther Quansah
> 
> > Le 29 déc. 2015 à 12:24, Oakley, Craig (NIH/NLM/NCBI) [C] 
> > <craig.oak...@nih.gov> a écrit :
> > 
> > Or to put it another way, how does one get security.json to work with 
> > SOLR-5960?
> > 
> > Has anyone any suggestions?
> > 
> > -Original Message-
> > From: Oakley, Craig (NIH/NLM/NCBI) [C] 
> > Sent: Thursday, December 24, 2015 2:12 PM
> > To: 'solr-user@lucene.apache.org' <solr-user@lucene.apache.org>
> > Subject: post.jar with security.json
> > 
> > In the old jetty-based implementation of Basic Authentication, one could 
> > use post.jar by running something like
> > 
> > java -Durl="http://user:pswd@host:8983/solr/corename/update" 
> > -Dtype=application/xml -jar post.jar example.xml
> > 
> > By what mechanism does one pass in the user name and password to post.jar 
> > (or, I suppose more likely, to solr-5.4.0/bin/post) when using 
> > security.json?
> > 
> > Thanks


RE: backups of analyzingInfixSuggesterIndexDir

2016-05-12 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
Backup simply by copying the files? or is there some option by which to say 
"include analyzingInfixSuggesterIndexDir as well"?

-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com] 
Sent: Wednesday, May 11, 2016 11:53 PM
To: solr-user <solr-user@lucene.apache.org>
Subject: Re: backups of analyzingInfixSuggesterIndexDir

Well, it can always be rebuilt from the backed-up index. That suggester
reads the _stored_ fields from the docs to build up the suggester
index. With a lot of documents that could take a very long time though.

If you desperately need it, AFAIK you'll have to back it up whenever
you build it, I'm afraid.

Best,
Erick

On Wed, May 11, 2016 at 8:30 AM, Oakley, Craig (NIH/NLM/NCBI) [C]
<craig.oak...@nih.gov> wrote:
> I have a client whose Solr installation creates a 
> analyzingInfixSuggesterIndexDir directory besides index and tlog. I notice 
> that this analyzingInfixSuggesterIndexDir is not included in backups (created 
> by replication?command=backup). Is there a way to include this? Or does it 
> not need to be backed-up?
>
> I haven't needed this yet, but wanted to ask before I find that I might need 
> it.
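
Erick's back-it-up-whenever-you-build-it advice can be sketched as a small script. The core name, data directory, and /suggest handler name below are placeholders (assumptions, not taken from the thread), and the rebuild call is shown commented out since it needs a running Solr:

```shell
# Sketch only: snapshot the suggester index that replication?command=backup skips.
# CORE, DATA_DIR, and the /suggest handler name are assumptions -- adjust to your setup.
CORE=corename
DATA_DIR=${DATA_DIR:-/var/solr/$CORE/data}
BACKUP_DIR=/tmp/$CORE-suggester-$(date +%Y%m%d)

# 1. Rebuild the suggester so the directory is current (needs a live Solr):
# curl "http://localhost:8983/solr/$CORE/suggest?suggest.build=true"

# 2. Copy the directory that the regular backup leaves out:
mkdir -p "$BACKUP_DIR"
if [ -d "$DATA_DIR/analyzingInfixSuggesterIndexDir" ]; then
  cp -r "$DATA_DIR/analyzingInfixSuggesterIndexDir" "$BACKUP_DIR/"
  echo "copied suggester index to $BACKUP_DIR"
else
  echo "no suggester index at $DATA_DIR (build it first)"
fi
```

The copy should be taken while no suggester build is in progress, for the same consistency reasons as any index copy.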


backups of analyzingInfixSuggesterIndexDir

2016-05-11 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
I have a client whose Solr installation creates a 
analyzingInfixSuggesterIndexDir directory besides index and tlog. I notice that 
this analyzingInfixSuggesterIndexDir is not included in backups (created by 
replication?command=backup). Is there a way to include this? Or does it not 
need to be backed-up?

I haven't needed this yet, but wanted to ask before I find that I might need it.


RE: BYOPW in security.json

2016-04-06 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
Thanks.

I googled to look for examples of how to proceed, and notice that you opened 
SOLR-8951

Thanks again

-Original Message-
From: Jan Høydahl [mailto:jan@cominvent.com] 
Sent: Wednesday, April 06, 2016 4:18 AM
To: solr-user@lucene.apache.org
Subject: Re: BYOPW in security.json

Hi

Note that storing the user names and passwords in security.json is just one 
implementation, to easily get started. It uses the Sha256AuthenticationProvider 
class, which is pluggable. That means that if you require Basic Auth with some 
form of self-service management, you could/should add another 
AuthenticationProvider (implement interface 
BasicAuthPlugin.AuthenticationProvider which e.g. pulls valid users and 
passwords from a database or some other source that you control). Or perhaps 
your organization uses LDAP already, it would be convenient to create an 
LDAPAuthenticationProvider.

I would not recommend adding such complexity to the existing json backed user 
list, although it has the benefit of being 100% self contained.

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

> 18. mar. 2016 kl. 23.30 skrev Oakley, Craig (NIH/NLM/NCBI) [C] 
> <craig.oak...@nih.gov>:
> 
> When using security.json (in Solr 5.4.1 for instance), is there a recommended 
> method to allow users to change their own passwords? We certainly would not 
> want to grant blanket security-edit to all users; but requiring users to 
> divulge their intended passwords (in Email or by other means) to the 
> administrators of the Solr installation is also arguably less than optimal. 
> It is unclear whether one could setup (for each individual user: "user1" in 
> this example) something like:
> 
> "set-permission": {"name":"edit_pwd_user1",
> "path":"/admin/authentication",
> "params":{"command":[set-user],"login":[user1]},
> "role": "edit_pw_user1"}
> "set-user-role": {"user1": ["edit_pw_user1","other","roles","here"]}
> 
> One point that is unclear would be whether "command" and "login" are the 
> correct strings in the third line of the example above: would they instead be 
> "cmd" and "user"? "action" and "username"? something else?
> 
> Even if this worked when implemented for each individual login, it would be 
> nice to be able to say once and for all "every login can edit its own 
> password".
> 
> There could be ways to create a utility which would change the OS-ownership 
> of its own process in order to decrypt a file containing the 
> Solr-admin-password, and to use that to set the password of the Solr login 
> which matched the OS login which initiated the process; but before embarking 
> on developing such a utility, I thought I would ask whether there were other 
> suggestions.
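
For reference, the admin-side password change through the Authentication API looks like the sketch below. It still requires a login with security-edit — which is exactly the limitation the question is about — and the host, admin credentials, and new password are placeholders; the call itself is commented out since it needs a running Solr:

```shell
# Admin-side password change via the Authentication API (sketch; all values
# below are placeholders, and the caller must hold security-edit).
PAYLOAD='{"set-user": {"user1":"newPassword"}}'
# curl --user solr:SolrRocks http://localhost:8983/solr/admin/authentication \
#      -H 'Content-type:application/json' -d "$PAYLOAD"
echo "$PAYLOAD"
```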



RE: RETRY: SolrCloud does not recover after ZooKeeper ensemble loses (and then regains) a quorum

2016-03-19 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
I am wondering whether this might be the bug of SOLR-8326, which is fixed in 
Solr 5.4

That's my guess as a user who ran into the bug myself.

-Original Message-
From: Kelly, Frank [mailto:frank.ke...@here.com] 
Sent: Wednesday, March 16, 2016 3:09 PM
To: solr-user@lucene.apache.org
Subject: Re: RETRY: SolrCloud does not recover after ZooKeeper ensemble loses 
(and then regains) a quorum

Any thoughts on this?

Hoping for just a quick
1) Yes - once ZooKeeper loses a Quorum you need to restart Solr and your
SolrJ Client
2) No - that's not expected behavior - Solr and SolrJ should recover -
please file a JIRA issue

Cheers!

Frank Kelly
Principal Software Engineer
Predictive Analytics Team (SCBE/HAC/CDA)

HERE 
5 Wayside Rd, Burlington, MA 01803, USA
42° 29' 7" N 71° 11' 32" W

On 3/16/16, 8:54 AM, "Kelly, Frank"  wrote:

>
>
>Just wondering if my observation of SolrCloud behavior after ZooKeeper
>loses a quorum is normal or to-be-expected
>
>Version of Solr: 5.3.1
>Version of ZooKeeper: 3.4.7
>Using SolrCloud with external ZooKeeper
>Deployed on AWS
>
>Our Solr cluster has 3 nodes
>
>Our Zookeeper ensemble consists of three nodes with the same config using
>DNS names e.g.
>
>$ more ../conf/zoo.cfg
>tickTime=2000
>dataDir=/var/zookeeper
>dataLogDir=/var/log/zookeeper
>clientPort=2181
>initLimit=10
>syncLimit=5
>standaloneEnabled=false
>server.1=zookeeper1.qa.eu-west-1.mysearch.com:2888:3888
>server.2=zookeeper2.qa.eu-west-1.mysearch.com:2888:3888
>server.3=zookeeper3.qa.eu-west-1.mysearch.com:2888:3888
>
>If we terminate one of the zookeeper nodes we get a ZK election (and I
>think) a quorum is maintained.
>Operation continues OK and we detect the terminated instance and relaunch
>a new ZK node which comes up fine
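
One quick way to see each ensemble member's view of the election is the ZooKeeper 3.4 four-letter command `srvr`. The hostnames below come from the zoo.cfg above; the use of `nc` and the 2-second timeout are assumptions:

```shell
# Print each ZooKeeper node's Mode: line (leader/follower); "no response"
# means the node is down, unreachable, or nc is unavailable.
for n in 1 2 3; do
  host="zookeeper$n.qa.eu-west-1.mysearch.com"
  mode=$( { echo srvr | nc -w 2 "$host" 2181 | grep 'Mode:'; } 2>/dev/null \
          || echo "no response" )
  echo "$host -> $mode"
done
```

With a quorum present, exactly one node should report `Mode: leader`; after losing two nodes, the survivor typically drops to "no response"/error for client requests until the quorum is restored.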
>
>If we terminate two of the ZK nodes we lose a quorum and then we observe
>the following
>
>1.1) Admin UI shows an error that it is unable to contact ZooKeeper
>"Could not connect to ZooKeeper"
>
>1.2) SolrJ returns the following
>
>org.apache.solr.common.SolrException: Could not load collection from
>ZK:qa_eu-west-1_public_index
>at 
>org.apache.solr.common.cloud.ZkStateReader.getCollectionLive(ZkStateReader
>.java:850)
>at 
>org.apache.solr.common.cloud.ZkStateReader$7.get(ZkStateReader.java:515)
>at 
>org.apache.solr.client.solrj.impl.CloudSolrClient.getDocCollection(CloudSo
>lrClient.java:1205)
>at 
>org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleS
>tate(CloudSolrClient.java:837)
>at 
>org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.
>java:805)
>at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:135)
>at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:107)
>at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:72)
>at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:86)
>at 
>com.here.scbe.search.solr.SolrFacadeImpl.addToSearchIndex(SolrFacadeImpl.j
>ava:112)
>Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException:
>KeeperErrorCode = ConnectionLoss for
>/collections/qa_eu-west-1_public_index/state.json
>at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
>at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
>at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1155)
>at 
>org.apache.solr.common.cloud.SolrZkClient$7.execute(SolrZkClient.java:345)
>at 
>org.apache.solr.common.cloud.SolrZkClient$7.execute(SolrZkClient.java:342)
>at 
>org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.ja
>va:61)
>at 
>org.apache.solr.common.cloud.SolrZkClient.getData(SolrZkClient.java:342)
>at 
>org.apache.solr.common.cloud.ZkStateReader.getCollectionLive(ZkStateReader
>.java:841)
>... 24 more
>
>This makes sense based on our understanding.
>When our AutoScale groups launch two new ZooKeeper nodes, initialize
>them, fix the DNS etc. we regain a quorum but at this point
>
>2.1) Admin UI shows the shards as "GONE" (all greyed out)
>
>2.2) SolrJ returns the same error even though the ZooKeeper DNS names are
>now bound to new IP addresses
>
>So at this point I restart the Solr nodes. At this point then
>
>3.1) Admin UI shows the collections as OK (all shards are green) - yeah
>the nodes are back!
>
>3.2) SolrJ Client still shows the same error - namely
>
>org.apache.solr.common.SolrException: Could not load collection from
>ZK:qa_eu-west-1_here_account
>at 
>org.apache.solr.common.cloud.ZkStateReader.getCollectionLive(ZkStateReader
>.java:850)
>at 
>org.apache.solr.common.cloud.ZkStateReader$7.get(ZkStateReader.java:515)
>at 
>org.apache.solr.client.solrj.impl.CloudSolrClient.getDocCollection(CloudSo
>lrClient.java:1205)
>at 
>org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleS
>tate(CloudSolrClient.java:837)
>at 

BYOPW in security.json

2016-03-18 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
When using security.json (in Solr 5.4.1 for instance), is there a recommended 
method to allow users to change their own passwords? We certainly would not 
want to grant blanket security-edit to all users; but requiring users to 
divulge their intended passwords (in Email or by other means) to the 
administrators of the Solr installation is also arguably less than optimal. It 
is unclear whether one could setup (for each individual user: "user1" in this 
example) something like:

"set-permission": {"name":"edit_pwd_user1",
"path":"/admin/authentication",
"params":{"command":[set-user],"login":[user1]},
"role": "edit_pw_user1"}
"set-user-role": {"user1": ["edit_pw_user1","other","roles","here"]}

One point that is unclear would be whether "command" and "login" are the 
correct strings in the third line of the example above: would they instead be 
"cmd" and "user"? "action" and "username"? something else?

Even if this worked when implemented for each individual login, it would be 
nice to be able to say once and for all "every login can edit its own password".

There could be ways to create a utility which would change the OS-ownership of 
its own process in order to decrypt a file containing the Solr-admin-password, 
and to use that to set the password of the Solr login which matched the OS 
login which initiated the process; but before embarking on developing such a 
utility, I thought I would ask whether there were other suggestions.


upgrade SolrCloud

2016-01-28 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
I'm planning to upgrade (from 5.4.0 to 5.4.1) a SolrCloud with two replicas 
(one shard).

Am I correct in thinking I should be able simply to shutdown one node, change 
it to using 5.4.1, restart the upgraded node, shutdown the other node and 
upgrade it? Or are there caveats to consider?
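
The one-node-at-a-time procedure described in the question can be sketched as a dry run. This only prints the commands; the ports, install paths, solr-home directories, and ZK address are all placeholders, and the restarted replica should show green in the Admin UI before the second node is touched:

```shell
# Dry-run sketch of a rolling upgrade for a one-shard, two-replica SolrCloud.
# All paths/ports/addresses below are placeholders, not taken from the thread.
OLD=/opt/solr-5.4.0
NEW=/opt/solr-5.4.1
ZK=zk1:2181,zk2:2181,zk3:2181
for port in 8983 7574; do
  echo "$OLD/bin/solr stop -p $port"
  echo "$NEW/bin/solr start -c -p $port -s /var/solr/node$port -z $ZK"
  echo "# wait until the replica on port $port is green before continuing"
done
```

Since ZooKeeper stays up throughout, the surviving replica keeps serving while each node is cycled.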


RE: Solr Heap memory vs. OS memory

2016-01-13 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
Followup question:

If one has multiple instances on the same host (a host running basically 
nothing except multiple instances of Solr), then the values specified as -Xmx 
in the various instances should add up to 25% of the RAM of the host...

Is that correct?
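
If the 25% rule of thumb quoted further down in this thread is taken at face value, the split across N instances is simple arithmetic. The sketch below just encodes that arithmetic; it is a starting point, not an official sizing formula:

```shell
# Rule-of-thumb sketch: divide ~1/4 of the host's physical RAM across the
# Solr instances running on it, so the summed -Xmx values stay near 25%.
heap_per_instance_mb() {
  # $1 = total RAM in MB, $2 = number of Solr instances on the host
  echo $(( $1 / 4 / $2 ))
}
heap_per_instance_mb 65536 4   # 64 GB host, 4 instances -> 4096 (-Xmx4096m each)
```

Cache-heavy instances may need more than their even share, in which case the remaining instances should get correspondingly less so the total still leaves most of the RAM to the OS page cache.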

-Original Message-
From: Markus Jelsma [mailto:markus.jel...@openindex.io] 
Sent: Wednesday, December 09, 2015 10:28 AM
To: solr-user@lucene.apache.org
Subject: RE: Solr Heap memory vs. OS memory

Yes. This is still accurate, Lucene still relies on memory mapped files. And 
Solr usually doesn't require that much RAM, except if you have lots of massive 
cache entries.
Markus
 
-Original message-
> From:Kelly, Frank 
> Sent: Wednesday 9th December 2015 16:19
> To: solr-user@lucene.apache.org
> Subject: Solr Heap memory vs. OS memory
> 
> Hi Folks,
> 
>  I was wondering if this link I found recommended by Erick is still accurate 
> (for Solr 5.3.1)
> 
> "For configuring your Java VM, you should rethink your memory requirements: 
> Give only the really needed amount of heap space and leave as much as 
> possible to the O/S. As a rule of thumb: Don't use more than 1/4 of your 
> physical memory as heap space for Java running Lucene/Solr, keep the 
> remaining memory free for the operating system cache."
> 
> http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
> 
> So I am using several CentOS Vms (on AWS) with 8GB RAM I so should plan for < 
> 2GB for -Xms and -Xmx?
> Our scaling plan - being on AWS - is to scale out (adding more Vms - not 
> adding more memory).
> 
> Thanks!
> 
> -Frank


RE: post.jar with security.json

2015-12-29 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
I do have authorization and authentication setup in security.json: the question 
is how to pass the login and password into post.jar and/or into 
solr-5.4.0/bin/post -- it does not seem to like the 
user:pswd@host:8983/solr/corename/update syntax from SOLR-5960: when I try 
that, it complains "SimplePostTool: FATAL: Connection error (is Solr running at 
http://user:pswd@hostname:8983/solr/five4a/update ?): 
java.net.ConnectException: Connection refused", and nothing shows up in 
solr.log (although I do set log4j.logger.org.eclipse.jetty.server.Server=DEBUG 
to check for 401 errors, etc).

FYI, I get a 404 from the link you cited: perhaps I don't have access, or 
perhaps you meant 
https://lucidworks.com/blog/2015/08/17/securing-solr-basic-auth-permission-rules
 (although that doesn't mention post.jar)
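
As a workaround sketch while post.jar lacks credential support, the same update can be sent with curl, whose -u flag handles HTTP Basic Auth. The host, core name, credentials, and file path are placeholders, and the call itself is commented out since it needs a running Solr:

```shell
# Build a small test document, then (against a live Solr) post it with Basic Auth.
cat > /tmp/example.xml <<'EOF'
<add><doc><field name="id">1</field></doc></add>
EOF
# curl -u user:pswd -H 'Content-Type: application/xml' \
#      --data-binary @/tmp/example.xml \
#      'http://localhost:8983/solr/corename/update?commit=true'
echo "prepared $(wc -c < /tmp/example.xml) bytes at /tmp/example.xml"
```

Unlike the user:pswd@host URL form, curl sends the credentials as a proper Authorization header, which is what the BasicAuthPlugin expects.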

-Original Message-
From: esther.quan...@lucidworks.com [mailto:esther.quan...@lucidworks.com] 
Sent: Tuesday, December 29, 2015 12:54 PM
To: solr-user@lucene.apache.org
Subject: Re: post.jar with security.json

Hi Craig,

To pass the username and password, you'll want to enable authorization and 
authentication in security.json as is mentioned in this blog post in step 1 of 
"Enabling Basic Authentication". 

https://lucidworks.com/blog/2015/08/17/securing-solr-basic-auth--rules/

Is this what you're looking for?

Thanks,

Esther Quansah

> Le 29 déc. 2015 à 12:24, Oakley, Craig (NIH/NLM/NCBI) [C] 
> <craig.oak...@nih.gov> a écrit :
> 
> Or to put it another way, how does one get security.json to work with 
> SOLR-5960?
> 
> Has anyone any suggestions?
> 
> -----Original Message-
> From: Oakley, Craig (NIH/NLM/NCBI) [C] 
> Sent: Thursday, December 24, 2015 2:12 PM
> To: 'solr-user@lucene.apache.org' <solr-user@lucene.apache.org>
> Subject: post.jar with security.json
> 
> In the old jetty-based implementation of Basic Authentication, one could use 
> post.jar by running something like
> 
> java -Durl="http://user:pswd@host:8983/solr/corename/update" 
> -Dtype=application/xml -jar post.jar example.xml
> 
> By what mechanism does one pass in the user name and password to post.jar 
> (or, I suppose more likely, to solr-5.4.0/bin/post) when using security.json?
> 
> Thanks


RE: post.jar with security.json

2015-12-29 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
Or to put it another way, how does one get security.json to work with SOLR-5960?

Has anyone any suggestions?

-Original Message-
From: Oakley, Craig (NIH/NLM/NCBI) [C] 
Sent: Thursday, December 24, 2015 2:12 PM
To: 'solr-user@lucene.apache.org' <solr-user@lucene.apache.org>
Subject: post.jar with security.json

In the old jetty-based implementation of Basic Authentication, one could use 
post.jar by running something like

java -Durl="http://user:pswd@host:8983/solr/corename/update" 
-Dtype=application/xml -jar post.jar example.xml

By what mechanism does one pass in the user name and password to post.jar (or, 
I suppose more likely, to solr-5.4.0/bin/post) when using security.json?

Thanks


post.jar with security.json

2015-12-24 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
In the old jetty-based implementation of Basic Authentication, one could use 
post.jar by running something like

java -Durl="http://user:pswd@host:8983/solr/corename/update" 
-Dtype=application/xml -jar post.jar example.xml

By what mechanism does one pass in the user name and password to post.jar (or, 
I suppose more likely, to solr-5.4.0/bin/post) when using security.json?

Thanks


RE: Re:Re: Implementing security.json is breaking ADDREPLICA

2015-12-14 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
Anshum and Noble,

I've downloaded 5.4, and this seems to be working so far

Thanks again

-Original Message-
From: Anshum Gupta [mailto:ans...@anshumgupta.net] 
Sent: Tuesday, December 01, 2015 12:52 AM
To: solr-user@lucene.apache.org
Subject: Re: Re:Re: Implementing security.json is breaking ADDREPLICA

Hi Craig,

As part of my manual testing for the 5.3 RC, I tried out collection admin
request restriction and update restriction on a single node setup. I don't
have the manual test steps documented but it wasn't too intensive I'd
admit. I think the complications involved in stopping specific nodes and
bringing them back up stop us from testing the node restarts as part of the
automated tests but we should find a way and fix that.

I've just found another issue and opened SOLR-8355 for the same and it
involves the "update" permission.

As far as patching 5.3.1 go, it's involves more than just this one patch
and this patch alone wouldn't help you resolve this issue. You'd certainly
need the patch from SOLR-8167. Also, make sure you actually use the
'commit' and not the posted patch as the patch on SOLR-8167 is different
from the commit. I don't think you'd need anything other than those patches
and whatever comes from 8355 to have a patched 5.3.1.

Any help in testing this out would be awesome and thanks for reporting and
following up on the issues!


On Tue, Dec 1, 2015 at 6:09 AM, Oakley, Craig (NIH/NLM/NCBI) [C] <
craig.oak...@nih.gov> wrote:

> Thank you, Anshum and Nobel, for your progress on SOLR-8326
>
> I have a couple questions to tide me over until 5.4 (hoping to test
> security.json a bit further while I wait).
>
> Given that the seven steps (tar xvzf solr-5.3.1.tgz; tar xvzf
> zookeeper-3.4.6.tar.gz; zkServer.sh start zoo_sample.cfg; zkcli.sh -zkhost
> localhost:2181 -cmd putfile /security.json ~/security.json; solr start -e
> cloud -z localhost:2181; solr stop -p 7574 & solr start -c -p 7574 -s
> "example/cloud/node2/solr" -z localhost:2181) demonstrate the problem, are
> there a similar set of steps by which one can load _some_ minimal
> security.json and still be able to stop & successfully restart one node of
> the cluster? (I am wondering what steps were used in the original testing
> of 5.3.1)
>
> Also, has it been verified that the SOLR-8326 patch resolves the
> ADDREPLICA bug in addition to the
> shutdown-&-restart-one-node-while-keeping-another-node-running bug?
>
> Also, would it make sense for me to download solr-5.3.1-src.tgz and (in a
> test environment) make the changes described in the latest attachment to
> SOLR-8326? Or would it be more advisable just to wait for 5.4? I don't know
> what may be involved in compiling a new solr.war from the source code.
>
> Thanks again
>
> -Original Message-
> From: Erick Erickson [mailto:erickerick...@gmail.com]
> Sent: Tuesday, November 24, 2015 1:25 PM
> To: solr-user <solr-user@lucene.apache.org>
> Subject: Re: Re:Re: Implementing security.json is breaking ADDREPLICA
>
> bq: I don't suppose there is an ETA for 5.4?
>
> Actually, 5.4 is probably in the works within the next month. I'm not
> the one cutting the
> release, but there's some rumors that a label will be cut this week,
> then the "usual"
> process is a week or two (sometimes more if bugs are flushed out) before
> the
> official release.
>
> Call it the first of the year for safety's sake, but that's a guess.
>
> Best,
> Erick
>
> On Tue, Nov 24, 2015 at 10:22 AM, Oakley, Craig (NIH/NLM/NCBI) [C]
> <craig.oak...@nih.gov> wrote:
> > Thanks for the reply,
> >
> > I don't suppose there is an ETA for 5.4?
> >
> >
> > Thanks again
> >
> > -Original Message-
> ...
>



-- 
Anshum Gupta


RE: Authorization API versus zkcli.sh

2015-12-11 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
So, when one has finished constructing the desired security.json (by means of 
Authentication and Authorization commands) and then run "zkcli.sh -cmd getfile" 
to get this security.json in order for it to be used as a template: one should 
edit the template to remove this "":{"v":85} clause (and the comma which 
precedes it): correct?

I notice that the documented minimal security.json which simply creates the 
solr:SolrRocks login:pswd does not have such a clause: so I assume that the 
lack of such a clause is not an error.
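
Assuming GNU or BSD sed is available, that template cleanup can be sketched as below. This is a sketch, not a validated tool — it removes every "":{"v":N} marker together with the comma that precedes it:

```shell
# Strip the internal version markers from a security.json fetched with
# "zkcli.sh -cmd getfile" so it can be reused as a template elsewhere.
strip_versions() {
  sed 's/,[[:space:]]*"":{"v":[0-9]*}//g'
}
echo '{"authentication":{"class":"solr.BasicAuthPlugin","":{"v":47}}}' | strip_versions
# -> {"authentication":{"class":"solr.BasicAuthPlugin"}}
```

As Noble notes, the markers can also simply be left out of any file uploaded with putfile; the system regenerates them on its own.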


From: Anshum Gupta [ans...@anshumgupta.net]
Sent: Friday, December 11, 2015 9:48 AM
To: solr-user@lucene.apache.org
Subject: Re: Authorization API versus zkcli.sh

yes, that's the assumption. The reason why there's a version there is to
optimize on reloads i.e. Authentication and Authorization plugins are
reloaded only when the version number is changed. e.g.
* Start with Ver 1 for both authentication and authorization
* Make changes to Authentication, the version for this section is updated
to the znode version, while the version for the authorization section is
not changed. This forces the authentication plugin to be reloaded but not
the authorization plugin. Similarly for authorization.

It's a way to optimize the reloads without splitting the definition into 2
znodes, which is also an option.


On Fri, Dec 11, 2015 at 8:06 PM, Shalin Shekhar Mangar <
shalinman...@gmail.com> wrote:

> Shouldn't this be the znode version? Why put a version in
> security.json? Or is the idea that the user will upload security.json
> only once and then use the security APIs for all further changes?
>
> On Fri, Dec 11, 2015 at 11:51 AM, Noble Paul <noble.p...@gmail.com> wrote:
> > Please do not put any number. That number is used by the system to
> > optimize loading/reloading plugins. It is not relevant for the user.
> >
> > On Thu, Dec 10, 2015 at 11:52 PM, Oakley, Craig (NIH/NLM/NCBI) [C]
> > <craig.oak...@nih.gov> wrote:
> >> Looking at security.json in Zookeeper, I notice that both the
> authentication section and the authorization section ends with something
> like
> >>
> >> "":{"v":47}},
> >>
> >> Am I correct in thinking that this 47 (in this case) is a version
> number, and that ANY number could be used in the file uploaded to
> security.json using "zkcli.sh -putfile"?
> >>
> >> Or is this some sort of checksum whose value must match some unclear
> criteria?
> >>
> >>
> >> -Original Message-
> >> From: Anshum Gupta [mailto:ans...@anshumgupta.net]
> >> Sent: Sunday, December 06, 2015 8:42 AM
> >> To: solr-user@lucene.apache.org
> >> Subject: Re: Authorization API versus zkcli.sh
> >>
> >> There's nothing cluster specific in security.json if you're using those
> >> plugins. It is totally safe to just take the file from one cluster and
> >> upload it for another for things to work.
> >>
> >> On Sat, Dec 5, 2015 at 3:38 AM, Oakley, Craig (NIH/NLM/NCBI) [C] <
> >> craig.oak...@nih.gov> wrote:
> >>
> >>> Looking through
> >>>
> cwiki.apache.org/confluence/display/solr/Authentication+and+Authorization+Plugins
> >>> one notices that security.json is initially created by zkcli.sh, and
> then
> >>> modified by means of the Authentication API and the Authorization API.
> By
> >>> and large, this sounds like a good way to accomplish such tasks,
> assuming
> >>> that these APIs do some error checking to prevent corruption of
> >>> security.json
> >>>
> >>> I was wondering about cases where one is cloning an existing Solr
> >>> instance, such as when creating an instance in Amazon Cloud. If one
> has a
> >>> security.json that has been thoroughly tried and successfully tested on
> >>> another Solr instance, is it possible / safe / not-un-recommended to
> use
> >>> zkcli.sh to load the full security.json (as extracted via zkcli.sh
> from the
> >>> Zookeeper of the thoroughly tested existing instance)? Or would the
> >>> official verdict be that the only acceptable way to create
> security.json is
> >>> to load a minimal version with zkcli.sh and then to build the remaining
> >>> components with the Authentication API and the Authorization API (in a
> >>> script, if one wants to automate the process: although such a script
> would
> >>> have to include plain-text passwords)?
> >>>
> >>> I figured there is no harm in asking.
> >>>
> >>
> >>
> >>
> >> --
> >> Anshum Gupta
> >
> >
> >
> > --
> > -
> > Noble Paul
>
>
>
> --
> Regards,
> Shalin Shekhar Mangar.
>



--
Anshum Gupta

RE: Authorization API versus zkcli.sh

2015-12-10 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
Looking at security.json in Zookeeper, I notice that both the authentication 
section and the authorization section ends with something like

"":{"v":47}},

Am I correct in thinking that this 47 (in this case) is a version number, and 
that ANY number could be used in the file uploaded to security.json using 
"zkcli.sh -putfile"?

Or is this some sort of checksum whose value must match some unclear criteria?


-Original Message-
From: Anshum Gupta [mailto:ans...@anshumgupta.net] 
Sent: Sunday, December 06, 2015 8:42 AM
To: solr-user@lucene.apache.org
Subject: Re: Authorization API versus zkcli.sh

There's nothing cluster specific in security.json if you're using those
plugins. It is totally safe to just take the file from one cluster and
upload it for another for things to work.

On Sat, Dec 5, 2015 at 3:38 AM, Oakley, Craig (NIH/NLM/NCBI) [C] <
craig.oak...@nih.gov> wrote:

> Looking through
> cwiki.apache.org/confluence/display/solr/Authentication+and+Authorization+Plugins
> one notices that security.json is initially created by zkcli.sh, and then
> modified by means of the Authentication API and the Authorization API. By
> and large, this sounds like a good way to accomplish such tasks, assuming
> that these APIs do some error checking to prevent corruption of
> security.json
>
> I was wondering about cases where one is cloning an existing Solr
> instance, such as when creating an instance in Amazon Cloud. If one has a
> security.json that has been thoroughly tried and successfully tested on
> another Solr instance, is it possible / safe / not-un-recommended to use
> zkcli.sh to load the full security.json (as extracted via zkcli.sh from the
> Zookeeper of the thoroughly tested existing instance)? Or would the
> official verdict be that the only acceptable way to create security.json is
> to load a minimal version with zkcli.sh and then to build the remaining
> components with the Authentication API and the Authorization API (in a
> script, if one wants to automate the process: although such a script would
> have to include plain-text passwords)?
>
> I figured there is no harm in asking.
>



-- 
Anshum Gupta


Authorization API versus zkcli.sh

2015-12-04 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
Looking through 
cwiki.apache.org/confluence/display/solr/Authentication+and+Authorization+Plugins
 one notices that security.json is initially created by zkcli.sh, and then 
modified by means of the Authentication API and the Authorization API. By and 
large, this sounds like a good way to accomplish such tasks, assuming that 
these APIs do some error checking to prevent corruption of security.json

I was wondering about cases where one is cloning an existing Solr instance, 
such as when creating an instance in Amazon Cloud. If one has a security.json 
that has been thoroughly tried and successfully tested on another Solr 
instance, is it possible / safe / not-un-recommended to use zkcli.sh to load 
the full security.json (as extracted via zkcli.sh from the Zookeeper of the 
thoroughly tested existing instance)? Or would the official verdict be that the 
only acceptable way to create security.json is to load a minimal version with 
zkcli.sh and then to build the remaining components with the Authentication API 
and the Authorization API (in a script, if one wants to automate the process: 
although such a script would have to include plain-text passwords)?

I figured there is no harm in asking.


RE: Re:Re: Implementing security.json is breaking ADDREPLICA

2015-11-30 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
Thank you, Anshum and Noble, for your progress on SOLR-8326

I have a couple questions to tide me over until 5.4 (hoping to test 
security.json a bit further while I wait).

Given that the seven steps (tar xvzf solr-5.3.1.tgz; tar xvzf 
zookeeper-3.4.6.tar.gz; zkServer.sh start zoo_sample.cfg; zkcli.sh -zkhost 
localhost:2181 -cmd putfile /security.json ~/security.json; solr start -e cloud 
-z localhost:2181; solr stop -p 7574 & solr start -c -p 7574 -s 
"example/cloud/node2/solr" -z localhost:2181) demonstrate the problem, are 
there a similar set of steps by which one can load _some_ minimal security.json 
and still be able to stop & successfully restart one node of the cluster? (I am 
wondering what steps were used in the original testing of 5.3.1)

Also, has it been verified that the SOLR-8326 patch resolves the ADDREPLICA bug 
in addition to the 
shutdown-&-restart-one-node-while-keeping-another-node-running bug?

Also, would it make sense for me to download solr-5.3.1-src.tgz and (in a test 
environment) make the changes described in the latest attachment to SOLR-8326? 
Or would it be more advisable just to wait for 5.4? I don't know what may be 
involved in compiling a new solr.war from the source code.

Thanks again

-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com] 
Sent: Tuesday, November 24, 2015 1:25 PM
To: solr-user <solr-user@lucene.apache.org>
Subject: Re: Re:Re: Implementing security.json is breaking ADDREPLICA

bq: I don't suppose there is an ETA for 5.4?

Actually, 5.4 is probably in the works within the next month. I'm not
the one cutting the
release, but there's some rumors that a label will be cut this week,
then the "usual"
process is a week or two (sometimes more if bugs are flushed out) before the
official release.

Call it the first of the year for safety's sake, but that's a guess.

Best,
Erick

On Tue, Nov 24, 2015 at 10:22 AM, Oakley, Craig (NIH/NLM/NCBI) [C]
<craig.oak...@nih.gov> wrote:
> Thanks for the reply,
>
> I don't suppose there is an ETA for 5.4?
>
>
> Thanks again
>
> -Original Message-
...


RE: Re:Re: Implementing security.json is breaking ADDREPLICA

2015-11-24 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
Thank you for the reply

Trying those exact commands, I'm still getting the same issue
tar xvzf /net/sybdev11/export/home/sybase/Distr/Solr/solr-5.3.1.tgz
tar xvzf /net/sybdev11/export/home/sybase/Distr/Solr/zookeeper-3.4.6.tar.gz 
cd zookeeper-3.4.6/
bin/zkServer.sh start zoo_sample.cfg 
cd ..
solr-5.3.1/server/scripts/cloud-scripts/zkcli.sh -zkhost localhost:2181 -cmd 
putfile /security.json PREVsolr-5.3.1/server/scripts/cloud-scripts/security.json
solr-5.3.1/server/scripts/cloud-scripts/zkcli.sh -zkhost localhost:2181 -cmd 
list
solr-5.3.1/bin/solr start -e cloud -z localhost:2181
cd solr-5.3.1/
bin/solr stop -p 7574
bin/solr start -c -p 7574 -s "example/cloud/node2/solr" -z localhost:2181
tail -f example/cloud/node2/logs/solr.log

The -cmd list shows
/ (2)
DATA:

 /zookeeper (1)
 DATA:
 
 /security.json (0)
 DATA:
 
{"authentication":{"class":"solr.BasicAuthPlugin","credentials":{"solr":"IV0EHq1OnNrj6gvRCwvFwTrZ1+z1oBbnQdiVC3otuq0=
 
Ndd7LKvVBAaZIF0QAVi1ekCfAJXr1GGfLtRUXhgrF8c="}},"authorization":{"class":"solr.RuleBasedAuthorizationPlugin","user-role":
 {"solr":["admin"]}
 
 ,"permissions":[
 {"name":"security-edit","role":"admin"}
 
 ]}}
 

While the output of tail contains
ERROR - 2015-11-24 10:45:54.796; [c:gettingstarted s:shard1 r:core_node4 
x:gettingstarted_shard1_replica1] org.apache.solr.common.SolrException; Error 
while trying to recover.:java.util.concurrent.ExecutionException: 
org.apache.http.ParseException: Invalid content type: 
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:192)
at 
org.apache.solr.cloud.RecoveryStrategy.sendPrepRecoveryCmd(RecoveryStrategy.java:598)
at 
org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:361)
at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:227)
Caused by: org.apache.http.ParseException: Invalid content type: 
at org.apache.http.entity.ContentType.parse(ContentType.java:273)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:512)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient$1.call(HttpSolrClient.java:270)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient$1.call(HttpSolrClient.java:266)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$1.run(ExecutorUtil.java:210)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)


-Original Message-
From: Anshum Gupta [mailto:ans...@anshumgupta.net] 
Sent: Monday, November 23, 2015 7:24 PM
To: solr-user@lucene.apache.org
Subject: Re: Re:Re: Implementing security.json is breaking ADDREPLICA

Yes, I see the same issue. I'll update the JIRA and drill down. Thanks.

On Mon, Nov 23, 2015 at 4:18 PM, Anshum Gupta <ans...@anshumgupta.net>
wrote:

> To restart solr, you should instead use something like:
> bin/solr start -c -p 8983 -s "example/cloud/node1/solr" -z localhost:2181
> or
> bin/solr start -c -p 7574 -s "example/cloud/node2/solr" -z localhost:2181
>
> I've seen others report the same exception but never ran into this one
> myself. Let me try this out.
>
>
>
> On Mon, Nov 23, 2015 at 2:55 PM, Oakley, Craig (NIH/NLM/NCBI) [C] <
> craig.oak...@nih.gov> wrote:
>
>> FWIW
>>
>> I am getting fairly consistent results that if I follow the SOLR-8326
>> procedure just up through the step of "solr-5.3.1/bin/solr start -e cloud
>> -z localhost:2181": if I then stop just one node (either "./solr stop -p
>> 7574" or "./solr stop -p 8983") and then restart that same node (using the
>> command suggested by "solr-5.3.1/bin/solr start -e cloud -z
>> localhost:2181"), then the solr.log for the stopped-and-restarted node gets
>> such stack traces as
>> ERROR - 2015-11-23 21:49:28.663; [c:gettingstarted s:shard2 r:core_node3
>> x:gettingstarted_shard2_replica2] org.apache.solr.common.SolrException;
>> Error while trying to recover.:java.util.concurrent.ExecutionException:
>> org.apache.http.ParseException: Invalid content type:
>> at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>> at java.util.concurrent.FutureTask.get(FutureTask.java:192)
>> at
>> org.apache.solr.cloud.RecoveryStrategy.sendPrepRecoveryCmd(RecoveryStrategy.java:598)
>>

RE: Re:Re: Implementing security.json is breaking ADDREPLICA

2015-11-24 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
Thanks for the reply,

I don't suppose there is an ETA for 5.4?


Thanks again

-Original Message-
From: Anshum Gupta [mailto:ans...@anshumgupta.net] 
Sent: Tuesday, November 24, 2015 12:31 PM
To: solr-user@lucene.apache.org
Subject: Re: Re:Re: Implementing security.json is breaking ADDREPLICA

Yes, it certainly is a PKI issue.

On Tue, Nov 24, 2015 at 7:59 AM, Oakley, Craig (NIH/NLM/NCBI) [C] <
craig.oak...@nih.gov> wrote:

> Thank you for the reply
>
> Trying those exact commands, I'm still getting the same issue
> tar xvzf /net/sybdev11/export/home/sybase/Distr/Solr/solr-5.3.1.tgz
> tar xvzf /net/sybdev11/export/home/sybase/Distr/Solr/zookeeper-3.4.6.tar.gz
> cd zookeeper-3.4.6/
> bin/zkServer.sh start zoo_sample.cfg
> cd ..
> solr-5.3.1/server/scripts/cloud-scripts/zkcli.sh -zkhost localhost:2181
> -cmd putfile /security.json
> PREVsolr-5.3.1/server/scripts/cloud-scripts/security.json
> solr-5.3.1/server/scripts/cloud-scripts/zkcli.sh -zkhost localhost:2181
> -cmd list
> solr-5.3.1/bin/solr start -e cloud -z localhost:2181
> cd solr-5.3.1/
> bin/solr stop -p 7574
> bin/solr start -c -p 7574 -s "example/cloud/node2/solr" -z localhost:2181
> tail -f example/cloud/node2/logs/solr.log
>
> The -cmd list shows
> / (2)
> DATA:
>
>  /zookeeper (1)
>  DATA:
>
>  /security.json (0)
>  DATA:
>
>  
> {"authentication":{"class":"solr.BasicAuthPlugin","credentials":{"solr":"IV0EHq1OnNrj6gvRCwvFwTrZ1+z1oBbnQdiVC3otuq0=
> Ndd7LKvVBAaZIF0QAVi1ekCfAJXr1GGfLtRUXhgrF8c="}},"authorization":{"class":"solr.RuleBasedAuthorizationPlugin","user-role":
>  {"solr":["admin"]}
>
>  ,"permissions":[
>  {"name":"security-edit","role":"admin"}
>
>  ]}}
>
>
> While the output of tail contains
> ERROR - 2015-11-24 10:45:54.796; [c:gettingstarted s:shard1 r:core_node4
> x:gettingstarted_shard1_replica1] org.apache.solr.common.SolrException;
> Error while trying to recover.:java.util.concurrent.ExecutionException:
> org.apache.http.ParseException: Invalid content type:
> at java.util.concurrent.FutureTask.report(FutureTask.java:122)
> at java.util.concurrent.FutureTask.get(FutureTask.java:192)
> at
> org.apache.solr.cloud.RecoveryStrategy.sendPrepRecoveryCmd(RecoveryStrategy.java:598)
> at
> org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:361)
> at
> org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:227)
> Caused by: org.apache.http.ParseException: Invalid content type:
> at org.apache.http.entity.ContentType.parse(ContentType.java:273)
> at
> org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:512)
> at
> org.apache.solr.client.solrj.impl.HttpSolrClient$1.call(HttpSolrClient.java:270)
> at
> org.apache.solr.client.solrj.impl.HttpSolrClient$1.call(HttpSolrClient.java:266)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$1.run(ExecutorUtil.java:210)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
>
>
> -Original Message-
> From: Anshum Gupta [mailto:ans...@anshumgupta.net]
> Sent: Monday, November 23, 2015 7:24 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Re:Re: Implementing security.json is breaking ADDREPLICA
>
> Yes, I see the same issue. I'll update the JIRA and drill down. Thanks.
>
> On Mon, Nov 23, 2015 at 4:18 PM, Anshum Gupta <ans...@anshumgupta.net>
> wrote:
>
> > To restart solr, you should instead use something like:
> > bin/solr start -c -p 8983 -s "example/cloud/node1/solr" -z localhost:2181
> > or
> > bin/solr start -c -p 7574 -s "example/cloud/node2/solr" -z localhost:2181
> >
> > I've seen others report the same exception but never ran into this one
> > myself. Let me try this out.
> >
> >
> >
> > On Mon, Nov 23, 2015 at 2:55 PM, Oakley, Craig (NIH/NLM/NCBI) [C] <
> > craig.oak...@nih.gov> wrote:
> >
> >> FWIW
> >>
> >> I am getting fairly consistent results that if I follow the SOLR-8326
> >> procedure just up through the step of "solr-5.3.1/bin/solr start -e
> cloud
> >> -z localhost:2181": if I then stop just one node (either "./solr stop -p
> >>

RE: Re:Re: Implementing security.json is breaking ADDREPLICA

2015-11-23 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
 the other node -- in this case it successfully starts.

Is there some necessary environment tweaking? The symptoms seem similar whether 
I use the security.json from SOLR-8326 or the security.json from the Wiki (with 
the comma repositioned).



-Original Message-
From: Oakley, Craig (NIH/NLM/NCBI) [C] 
Sent: Friday, November 20, 2015 6:59 PM
To: 'solr-user@lucene.apache.org' <solr-user@lucene.apache.org>
Subject: RE: Re:Re: Implementing security.json is breaking ADDREPLICA

Thanks

It seems to work when there is no security.json, so perhaps there's some typo 
in the initial version.

I notice that the version you sent is different from the documentation at 
cwiki.apache.org/confluence/display/solr/Authentication+and+Authorization+Plugins
in that the Wiki version has "permissions" before "user-role". I also notice 
that (at least as of right this moment) the Wiki version has a comma at the end 
of '"user-role":{"solr":"admin"},' even though it is the last entry, and that 
the Wiki version seems to lack a comma between the "permissions" section and 
the "user-role" section. I just now also noticed that the version you sent has 
'"user-role":{"solr":["admin"]}' (with square brackets) whereas the Wiki does 
not have square brackets.

The placement of the comma definitely looks wrong in the Wiki at the moment 
(though perhaps someone might correct the Wiki before too long). Other than 
that, I don't know whether the order and/or the square brackets make a 
difference. I can try with different permutations.

Thanks again

P.S. for the record, the Wiki currently has
{
"authentication":{
   "class":"solr.BasicAuthPlugin",
   "credentials":{"solr":"IV0EHq1OnNrj6gvRCwvFwTrZ1+z1oBbnQdiVC3otuq0= 
Ndd7LKvVBAaZIF0QAVi1ekCfAJXr1GGfLtRUXhgrF8c="}
},
"authorization":{
   "class":"solr.RuleBasedAuthorizationPlugin",
   "permissions":[{"name":"security-edit",
  "role":"admin"}]
   "user-role":{"solr":"admin"},
}}
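
[Editorial aside: the Wiki snippet quoted above is not even syntactically valid 
JSON — the missing delimiter after the "permissions" list and the trailing comma 
both break it — so running a candidate security.json through any JSON parser 
before uploading it with zkcli.sh would catch this class of error. A minimal 
sketch in Python; the embedded string reproduces the Wiki's authorization 
section as quoted above:]

```python
import json

def check_security_json(text):
    """Return None if text parses as JSON, else the parser's error message."""
    try:
        json.loads(text)
        return None
    except json.JSONDecodeError as e:
        return e.msg

# The authorization section as quoted from the Wiki above (verbatim,
# including the missing comma after the "permissions" list):
WIKI_SNIPPET = '''{
"authorization":{
   "class":"solr.RuleBasedAuthorizationPlugin",
   "permissions":[{"name":"security-edit",
  "role":"admin"}]
   "user-role":{"solr":"admin"},
}}'''

print(check_security_json(WIKI_SNIPPET))  # Expecting ',' delimiter
```

A well-formed security.json returns None from the same check, so this can be a 
one-line gate in an upload script.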

-Original Message-
From: Anshum Gupta [mailto:ans...@anshumgupta.net] 
Sent: Friday, November 20, 2015 6:18 PM
To: solr-user@lucene.apache.org
Subject: Re: Re:Re: Implementing security.json is breaking ADDREPLICA

This seems unrelated and more like a user error somewhere. Can you just
follow the steps, without any security settings (i.e. not even uploading
security.json), and see if you still see this? Sorry, but I don't have access
to the code right now; I'll try to look at this later tonight.

On Fri, Nov 20, 2015 at 3:07 PM, Oakley, Craig (NIH/NLM/NCBI) [C] <
craig.oak...@nih.gov> wrote:

> Thank you for opening SOLR-8326
>
> As a side note, in the procedure you listed, even before adding the
> collection-admin-edit authorization, I'm already hitting trouble: stopping
> and restarting a node results in the following
>
> INFO  - 2015-11-20 22:48:41.275; [c:solr8326 s:shard2 r:core_node4
> x:solr8326_shard2_replica1] org.apache.solr.cloud.RecoveryStrategy;
> Publishing state of core solr8326_shard2_replica1 as recovering, leader is
> http://{IP-address-redacted}:8983/solr/solr8326_shard2_replica2/ and I am
> http://{IP-address-redacted}:7574/solr/solr8326_shard2_replica1/
> INFO  - 2015-11-20 22:48:41.275; [c:solr8326 s:shard2 r:core_node4
> x:solr8326_shard2_replica1] org.apache.solr.cloud.ZkController; publishing
> state=recovering
> INFO  - 2015-11-20 22:48:41.278; [c:solr8326 s:shard1 r:core_node3
> x:solr8326_shard1_replica1] org.apache.solr.cloud.RecoveryStrategy;
> Publishing state of core solr8326_shard1_replica1 as recovering, leader is
> http://{IP-address-redacted}:8983/solr/solr8326_shard1_replica2/ and I am
> http://{IP-address-redacted}:7574/solr/solr8326_shard1_replica1/
> INFO  - 2015-11-20 22:48:41.280; [c:solr8326 s:shard1 r:core_node3
> x:solr8326_shard1_replica1] org.apache.solr.cloud.ZkController; publishing
> state=recovering
> INFO  - 2015-11-20 22:48:41.282; [c:solr8326 s:shard2 r:core_node4
> x:solr8326_shard2_replica1] org.apache.solr.cloud.RecoveryStrategy; Sending
> prep recovery command to http://{IP-address-redacted}:8983/solr;
> WaitForState:
> action=PREPRECOVERY&core=solr8326_shard2_replica2&nodeName={IP-address-redacted}%3A7574_solr&coreNodeName=core_node4&state=recovering&checkLive=true&onlyIfLeader=true&onlyIfLeaderActive=true
> INFO  - 2015-11-20 22:48:41.289; [   ]
> org.apache.solr.common.cloud.ZkStateReader$8; A cluster state change:
> WatchedEvent state:SyncConnected type:NodeDataChanged
> path:/collections/solr8326/state.json for collection solr8326 has occurred
> - updating... (live nodes size: 2)
> INFO  - 2015-11-20 22:48:41.290; [c:solr8326 s:shard1 r:core_node3
> x:solr8326_shard1_replica1] org.apache.solr.cloud.RecoveryStrategy; Sending
> pre

RE: Re:Re: Implementing security.json is breaking ADDREPLICA

2015-11-20 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
Creating a core would lead to a collection creation,
but that's not really supported. It was just something that was done when
there was no Collections API.


On Thu, Nov 19, 2015 at 12:36 PM, Oakley, Craig (NIH/NLM/NCBI) [C] <
craig.oak...@nih.gov> wrote:

> I tried again with the following security.json, but the results were the
> same:
>
> {
>   "authentication":{
> "class":"solr.BasicAuthPlugin",
> "credentials":{
>   "solr":"IV0EHq1OnNrj6gvRCwvFwTrZ1+z1oBbnQdiVC3otuq0=
> Ndd7LKvVBAaZIF0QAVi1ekCfAJXr1GGfLtRUXhgrF8c=",
>   "solruser":"VgZX1TAMNHT2IJikoGdKtxQdXc+MbNwfqzf89YqcLEE=
> 37pPWQ9v4gciIKHuTmFmN0Rv66rnlMOFEWfEy9qjJfY="},
> "":{"v":9}},
>   "authorization":{
> "class":"solr.RuleBasedAuthorizationPlugin",
> "user-role":{
>   "solr":[
> "admin",
> "read",
> "xmpladmin",
> "xmplgen",
> "xmplsel"],
>   "solruser":[
> "read",
> "xmplgen",
> "xmplsel"]},
> "permissions":[
>   {
> "name":"security-edit",
> "role":"admin"},
>   {
> "name":"xmpl_admin",
> "collection":"xmpl",
> "path":"/admin/*",
> "role":"xmpladmin"},
>   {
> "name":"xmpl_sel",
> "collection":"xmpl",
> "path":"/select/*",
> "role":null},
>   {
>  "name":"all-admin",
>  "collection":null,
>  "path":"/*",
>  "role":"xmplgen"},
>   {
>  "name":"all-core-handlers",
>  "path":"/*",
>  "role":"xmplgen"}],
> "":{"v":42}}}
>
> -Original Message-
> From: Oakley, Craig (NIH/NLM/NCBI) [C]
> Sent: Thursday, November 19, 2015 1:46 PM
> To: 'solr-user@lucene.apache.org' <solr-user@lucene.apache.org>
> Subject: RE: Re:Re: Implementing security.json is breaking ADDREPLICA
>
> I note that the thread called "Security Problems" (most recent post by
> Noble Paul) seems like it may help with much of what I'm trying to do. I
> will see to what extent that may help.
>



-- 
Anshum Gupta


RE: Re:Re: Implementing security.json is breaking ADDREPLICA

2015-11-20 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
nt-type:application/json' -d '{"set-permission" : {"name":"read",
"role":"read"}}'

Add a user:
> curl --user solr:SolrRocks http://localhost:8983/solr/admin/authentication
-H 'Content-type:application/json' -d '{"set-user" :
{"solrread":"solrRocks"}}'

Assign a role to the user:
>curl --user solr:SolrRocks http://localhost:8983/solr/admin/authorization
-H 'Content-type:application/json' -d '{"set-user-role" :
{"solrread":["read"]}}'

After this, you should start having issues with ADDREPLICA.
Also, as you would at this point have a collection with a shard that has a
replication factor > 1 (remember the ADDREPLICA we did earlier), you would
have issues when you restart the cluster again using the steps I mentioned
above.


Can you confirm this? I guess I'll just use this text to create a new JIRA
now.
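
[Editorial aside: the three curl steps above map onto two endpoints and three 
small JSON payloads. A sketch that lays them out side by side; the payloads and 
endpoints are taken from the commands quoted above, while the SOLR base URL is 
an illustrative assumption:]

```python
import json

SOLR = "http://localhost:8983/solr"  # node address, as in the commands above

# (endpoint, payload) pairs matching the three curl commands quoted above;
# each would be POSTed with basic auth (solr:SolrRocks) and
# Content-type: application/json.
steps = [
    (SOLR + "/admin/authorization",
     {"set-permission": {"name": "read", "role": "read"}}),
    (SOLR + "/admin/authentication",
     {"set-user": {"solrread": "solrRocks"}}),
    (SOLR + "/admin/authorization",
     {"set-user-role": {"solrread": ["read"]}}),
]

for url, payload in steps:
    print(url.rsplit("/", 1)[-1], json.dumps(payload))
```

Note that permission and role edits go to /admin/authorization while user 
credentials go to /admin/authentication, which is easy to mix up when typing 
the commands by hand.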


On Fri, Nov 20, 2015 at 10:04 AM, Oakley, Craig (NIH/NLM/NCBI) [C] <
craig.oak...@nih.gov> wrote:

> Thank you again for the reply.
>
> Below is the Email I was about to send prior to your reply a moment ago:
> shall I try again without "read" in the security.json?
>
>
>
> The Collections API method was not discussed in the "Unleashed" class at
> the conference in DC in 2014 (probably because it was not yet available),
> so I was using the method I knew.
>
> I have now tried again using admin/collections?action=CREATE (using
> different port numbers to avoid confusion from the failed previous
> attempts: the previously created nodes had been shutdown and their
> core.properties files renamed so as not to be discovered), but the results
> are the same:
> INFO  - 2015-11-20 16:56:25.283; [c:xmpl3 s:shard1 r:core_node2
> x:xmpl3_shard1_replica2] org.apache.solr.cloud.RecoveryStrategy; Starting
> Replication Recovery.
> INFO  - 2015-11-20 16:56:25.284; [c:xmpl3 s:shard1 r:core_node2
> x:xmpl3_shard1_replica2] org.apache.solr.cloud.RecoveryStrategy; Begin
> buffering updates.
> INFO  - 2015-11-20 16:56:25.284; [c:xmpl3 s:shard1 r:core_node2
> x:xmpl3_shard1_replica2] org.apache.solr.update.UpdateLog; Starting to
> buffer updates. FSUpdateLog{state=ACTIVE, tlog=null}
> INFO  - 2015-11-20 16:56:25.284; [c:xmpl3 s:shard1 r:core_node2
> x:xmpl3_shard1_replica2] org.apache.solr.cloud.RecoveryStrategy; Attempting
> to replicate from http://
> {IP-address-redacted}:4685/solr/xmpl3_shard1_replica1/.
> ERROR - 2015-11-20 16:56:25.292; [c:xmpl3 s:shard1 r:core_node2
> x:xmpl3_shard1_replica2] org.apache.solr.common.SolrException; Error while
> trying to
> recover:org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:
> Error from server at 
> http://{IP-address-redacted}:4685/solr/xmpl3_shard1_replica1:
> Expected mime type application/octet-stream but got text/html. 
> 
> 
> Error 401 Unauthorized request, Response code: 401
> 
> HTTP ERROR 401
> Problem accessing /solr/xmpl3_shard1_replica1/update. Reason:
> Unauthorized request, Response code:
> 401Powered by Jetty://
>
> 
> 
>
> at
> org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:528)
> at
> org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:234)
> at
> org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:226)
> at
> org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:135)
> at
> org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:152)
> at
> org.apache.solr.cloud.RecoveryStrategy.commitOnLeader(RecoveryStrategy.java:207)
> at
> org.apache.solr.cloud.RecoveryStrategy.replicate(RecoveryStrategy.java:147)
> at
> org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:437)
> at
> org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:227)
>
> INFO  - 2015-11-20 16:56:25.292; [c:xmpl3 s:shard1 r:core_node2
> x:xmpl3_shard1_replica2] org.apache.solr.update.UpdateLog; Dropping
> buffered updates FSUpdateLog{state=BUFFERING, tlog=null}
> ERROR - 2015-11-20 16:56:25.293; [c:xmpl3 s:shard1 r:core_node2
> x:xmpl3_shard1_replica2] org.apache.solr.cloud.RecoveryStrategy; Recovery
> failed - trying again... (2)
> INFO  - 2015-11-20 16:56:25.293; [c:xmpl3 s:shard1 r:core_node2
> x:xmpl3_shard1_replica2] org.apache.solr.cloud.RecoveryStrategy; Wait 8.0
> seconds before trying to recover again (3)
>
>
> Below is a list of the steps I took.
>
> ./zkcli.sh --zkhost localhost:4545 -cmd makepath /solr/xmpl3
> ./zkcli.sh --zkhost localhost:4545/solr/xmpl3 -cmd putfile /security.json
> ~/solr/security151119a.json
> ./zkcli.sh --zkhost localhost:4545/solr/xmpl3 -cmd upconfig -confdir

RE: Re:Re: Implementing security.json is breaking ADDREPLICA

2015-11-20 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
Thanks

It seems to work when there is no security.json, so perhaps there's some typo 
in the initial version.

I notice that the version you sent is different from the documentation at 
cwiki.apache.org/confluence/display/solr/Authentication+and+Authorization+Plugins
in that the Wiki version has "permissions" before "user-role". I also notice 
that (at least as of right this moment) the Wiki version has a comma at the end 
of '"user-role":{"solr":"admin"},' even though it is the last entry, and that 
the Wiki version seems to lack a comma between the "permissions" section and 
the "user-role" section. I just now also noticed that the version you sent has 
'"user-role":{"solr":["admin"]}' (with square brackets) whereas the Wiki does 
not have square brackets.

The placement of the comma definitely looks wrong in the Wiki at the moment 
(though perhaps someone might correct the Wiki before too long). Other than 
that, I don't know whether the order and/or the square brackets make a 
difference. I can try with different permutations.

Thanks again

P.S. for the record, the Wiki currently has
{
"authentication":{
   "class":"solr.BasicAuthPlugin",
   "credentials":{"solr":"IV0EHq1OnNrj6gvRCwvFwTrZ1+z1oBbnQdiVC3otuq0= 
Ndd7LKvVBAaZIF0QAVi1ekCfAJXr1GGfLtRUXhgrF8c="}
},
"authorization":{
   "class":"solr.RuleBasedAuthorizationPlugin",
   "permissions":[{"name":"security-edit",
  "role":"admin"}]
   "user-role":{"solr":"admin"},
}}

-Original Message-
From: Anshum Gupta [mailto:ans...@anshumgupta.net] 
Sent: Friday, November 20, 2015 6:18 PM
To: solr-user@lucene.apache.org
Subject: Re: Re:Re: Implementing security.json is breaking ADDREPLICA

This seems unrelated and more like a user error somewhere. Can you just
follow the steps, without any security settings (i.e. not even uploading
security.json), and see if you still see this? Sorry, but I don't have access
to the code right now; I'll try to look at this later tonight.

On Fri, Nov 20, 2015 at 3:07 PM, Oakley, Craig (NIH/NLM/NCBI) [C] <
craig.oak...@nih.gov> wrote:

> Thank you for opening SOLR-8326
>
> As a side note, in the procedure you listed, even before adding the
> collection-admin-edit authorization, I'm already hitting trouble: stopping
> and restarting a node results in the following
>
> INFO  - 2015-11-20 22:48:41.275; [c:solr8326 s:shard2 r:core_node4
> x:solr8326_shard2_replica1] org.apache.solr.cloud.RecoveryStrategy;
> Publishing state of core solr8326_shard2_replica1 as recovering, leader is
> http://{IP-address-redacted}:8983/solr/solr8326_shard2_replica2/ and I am
> http://{IP-address-redacted}:7574/solr/solr8326_shard2_replica1/
> INFO  - 2015-11-20 22:48:41.275; [c:solr8326 s:shard2 r:core_node4
> x:solr8326_shard2_replica1] org.apache.solr.cloud.ZkController; publishing
> state=recovering
> INFO  - 2015-11-20 22:48:41.278; [c:solr8326 s:shard1 r:core_node3
> x:solr8326_shard1_replica1] org.apache.solr.cloud.RecoveryStrategy;
> Publishing state of core solr8326_shard1_replica1 as recovering, leader is
> http://{IP-address-redacted}:8983/solr/solr8326_shard1_replica2/ and I am
> http://{IP-address-redacted}:7574/solr/solr8326_shard1_replica1/
> INFO  - 2015-11-20 22:48:41.280; [c:solr8326 s:shard1 r:core_node3
> x:solr8326_shard1_replica1] org.apache.solr.cloud.ZkController; publishing
> state=recovering
> INFO  - 2015-11-20 22:48:41.282; [c:solr8326 s:shard2 r:core_node4
> x:solr8326_shard2_replica1] org.apache.solr.cloud.RecoveryStrategy; Sending
> prep recovery command to http://{IP-address-redacted}:8983/solr;
> WaitForState:
> action=PREPRECOVERY&core=solr8326_shard2_replica2&nodeName={IP-address-redacted}%3A7574_solr&coreNodeName=core_node4&state=recovering&checkLive=true&onlyIfLeader=true&onlyIfLeaderActive=true
> INFO  - 2015-11-20 22:48:41.289; [   ]
> org.apache.solr.common.cloud.ZkStateReader$8; A cluster state change:
> WatchedEvent state:SyncConnected type:NodeDataChanged
> path:/collections/solr8326/state.json for collection solr8326 has occurred
> - updating... (live nodes size: 2)
> INFO  - 2015-11-20 22:48:41.290; [c:solr8326 s:shard1 r:core_node3
> x:solr8326_shard1_replica1] org.apache.solr.cloud.RecoveryStrategy; Sending
> prep recovery command to http://{IP-address-redacted}:8983/solr;
> WaitForState:
> action=PREPRECOVERY&core=solr8326_shard1_replica2&nodeName={IP-address-redacted}%3A7574_solr&coreNodeName=core_node3&state=recovering&checkLive=true&onlyIfLeader=true&onlyIfLeaderActive=true
> INFO  - 2015-11-20 22:48:41.291; [   ]
> org.apache.solr.common.cloud.ZkStateReader; Updating data for solr8326 to
> ver 25
> ERROR - 2015-11-20 22:48:41.298; [c:solr8326 s:shard2 r:core_node4
> x:solr8326_shard2_replica1] org.apac

RE: Re:Re: Implementing security.json is breaking ADDREPLICA

2015-11-19 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
I note that the thread called "Security Problems" (most recent post by Noble 
Paul) seems like it may help with much of what I'm trying to do. I will see to 
what extent that may help.


RE: Re:Re: Implementing security.json is breaking ADDREPLICA

2015-11-19 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
I tried again with the following security.json, but the results were the same:

{
  "authentication":{
"class":"solr.BasicAuthPlugin",
"credentials":{
  "solr":"IV0EHq1OnNrj6gvRCwvFwTrZ1+z1oBbnQdiVC3otuq0= 
Ndd7LKvVBAaZIF0QAVi1ekCfAJXr1GGfLtRUXhgrF8c=",
  "solruser":"VgZX1TAMNHT2IJikoGdKtxQdXc+MbNwfqzf89YqcLEE= 
37pPWQ9v4gciIKHuTmFmN0Rv66rnlMOFEWfEy9qjJfY="},
"":{"v":9}},
  "authorization":{
"class":"solr.RuleBasedAuthorizationPlugin",
"user-role":{
  "solr":[
"admin",
"read",
"xmpladmin",
"xmplgen",
"xmplsel"],
  "solruser":[
"read",
"xmplgen",
"xmplsel"]},
"permissions":[
  {
"name":"security-edit",
"role":"admin"},
  {
"name":"xmpl_admin",
"collection":"xmpl",
"path":"/admin/*",
"role":"xmpladmin"},
  {
    "name":"xmpl_sel",
"collection":"xmpl",
"path":"/select/*",
"role":null},
  {
 "name":"all-admin",
 "collection":null,
 "path":"/*",
 "role":"xmplgen"},
  {
 "name":"all-core-handlers",
 "path":"/*",
 "role":"xmplgen"}],
"":{"v":42}}}

-Original Message-
From: Oakley, Craig (NIH/NLM/NCBI) [C] 
Sent: Thursday, November 19, 2015 1:46 PM
To: 'solr-user@lucene.apache.org' <solr-user@lucene.apache.org>
Subject: RE: Re:Re: Implementing security.json is breaking ADDREPLICA

I note that the thread called "Security Problems" (most recent post by Noble 
Paul) seems like it may help with much of what I'm trying to do. I will see to 
what extent that may help.


RE: Re:Re: Implementing security.json is breaking ADDREPLICA

2015-11-19 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
Thank you for the reply.

What we are attempting is to require a password for practically everything, so 
that even were a hacker to get within the firewall, they would have limited 
access to the various services (the Security team even complained that, for 
Solr 4.5 servers, attempts to access host:port (without "/solr") resulted in an 
error message that gave the full pathname to solr.war)

I am sending the solr.log files directly to Anshum, so as not to clutter up the 
Email list.

The steps I used to recreate the problem are as follows:
cd zookeeper-3.4.6/conf/
sed 's/2181/4545/' zoo_sample.cfg | tee zoo_sample4545.cfg 
cd ../bin
./zkServer.sh start zoo_sample4545.cfg
cd ../../solr-5.3.1/server/solr
mkdir xmpl
echo 'name=xmpl' | tee xmpl/core.properties
mkdir xmpl/data
mkdir xmpl/data/index
mkdir xmpl/data/tlog
cd ../scripts/cloud-scripts/
./zkcli.sh --zkhost localhost:4545 -cmd makepath /solr
./zkcli.sh --zkhost localhost:4545 -cmd makepath /solr/xmpl
./zkcli.sh --zkhost localhost:4545/solr/xmpl  -cmd upconfig -confdir 
../../solr/configsets/basic_configs/conf -confname xmpl
mkdir ../../example/solr
cp solr.xml ../../example/solr
./zkcli.sh --zkhost localhost:4545/solr/xmpl  -cmd putfile /security.json 
~/solr/security151117a.json 
cd ../../../bin
mkdir  ../example/solr/pid
./solr -c -p 4575 -d ~dbman/solr/straight531outofbox/solr-5.3.1/server/ -z 
localhost:4545/solr/xmpl -s 
~dbman/solr/straight531outofbox/solr-5.3.1/example/solr
./solr -c -p 4565 -d ~dbman/solr/straight531outofbox/solr-5.3.1/server/ -z 
localhost:4545/solr/xmpl -s 
~dbman/solr/straight531outofbox/solr-5.3.1/server/solr
curl -u solr:SolrRocks 'http://{IP-address-redacted}:4575/solr/admin/collections?action=ADDREPLICA&collection=xmpl&shard=shard1&node={IP-address-redacted}:4575_solr&wt=json&indent=true'

The contents of security151117a.json is included in the original post

If there is a better way using the Well Known Permissions as described at 
lucidworks.com/blog/2015/08/17/securing-solr-basic-auth-permission-rules, I'm 
open to trying that.

I would like to acknowledge that there definitely seem to be some IMPROVEMENTS 
in the security.json implementation, particularly in terms of Core Admin (with 
the jetty-implemented Authentication in webdefault.xml, anyone who could get into 
the GUI front page could rename cores, unless prevented by OS-level permissions 
on core.properties).


Thanks again


Implementing security.json is breaking ADDREPLICA

2015-11-18 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
Implementing security.json is breaking ADDREPLICA

I have been able to reproduce this issue with minimal changes from an 
out-of-the-box Zookeeper (3.4.6) and Solr (5.3.1): loading 
configsets/basic_configs/conf into Zookeeper, creating the security.json listed 
below, creating two nodes (one with a core named xmpl and one without any 
core)- I can provide details if helpful.

The security.json is as follows:

{
  "authentication":{
    "class":"solr.BasicAuthPlugin",
    "credentials":{
  "solr":"IV0EHq1OnNrj6gvRCwvFwTrZ1+z1oBbnQdiVC3otuq0= 
Ndd7LKvVBAaZIF0QAVi1ekCfAJXr1GGfLtRUXhgrF8c=",
  "solruser":"VgZX1TAMNHT2IJikoGdKtxQdXc+MbNwfqzf89YqcLEE= 
37pPWQ9v4gciIKHuTmFmN0Rv66rnlMOFEWfEy9qjJfY="},
    "":{"v":9}},
  "authorization":{
    "class":"solr.RuleBasedAuthorizationPlugin",
    "user-role":{
  "solr":[
    "admin",
    "read",
    "xmpladmin",
    "xmplgen",
    "xmplsel"],
  "solruser":[
    "read",
    "xmplgen",
    "xmplsel"]},
    "permissions":[
  {
    "name":"security-edit",
    "role":"admin"},
  {
    "name":"xmpl_admin",
    "collection":"xmpl",
    "path":"/admin/*",
    "role":"xmpladmin"},
  {
    "name":"xmpl_sel",
    "collection":"xmpl",
    "path":"/select/*",
    "role":null},
  {
    "name":"xmpl_gen",
    "collection":"xmpl",
    "path":"/*",
    "role":"xmplgen"}],
    "":{"v":42}}}
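
[Editorial aside: the security.json above is syntactically valid JSON, so the 
ADDREPLICA failure discussed in this thread is not a parsing problem. One easy 
RuleBasedAuthorizationPlugin mistake that is still worth linting for is a 
permission referencing a role that no user holds. A hypothetical helper, shown 
against a condensed copy of the authorization section above (credentials 
elided):]

```python
def unassigned_roles(sec):
    """Roles referenced by permissions but held by no user."""
    authz = sec["authorization"]
    held = set()
    for roles in authz.get("user-role", {}).values():
        # Solr accepts either a single role string or a list of roles
        held.update(roles if isinstance(roles, list) else [roles])
    wanted = set()
    for perm in authz.get("permissions", []):
        role = perm.get("role")
        if role is not None:  # "role":null means open access, skip it
            wanted.update(role if isinstance(role, list) else [role])
    return wanted - held

# Condensed version of the security.json quoted above:
sec = {
    "authorization": {
        "class": "solr.RuleBasedAuthorizationPlugin",
        "user-role": {
            "solr": ["admin", "read", "xmpladmin", "xmplgen", "xmplsel"],
            "solruser": ["read", "xmplgen", "xmplsel"]},
        "permissions": [
            {"name": "security-edit", "role": "admin"},
            {"name": "xmpl_admin", "collection": "xmpl",
             "path": "/admin/*", "role": "xmpladmin"},
            {"name": "xmpl_sel", "collection": "xmpl",
             "path": "/select/*", "role": None},
            {"name": "xmpl_gen", "collection": "xmpl",
             "path": "/*", "role": "xmplgen"}],
    }
}

print(unassigned_roles(sec))  # set() -> every referenced role is held
```

Here the check comes back empty, which is consistent with the errors in this 
thread being about inter-node authentication rather than the rules themselves.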





When I then execute admin/collections?action=ADDREPLICA, I get errors such as 
the following in the solr.log of the node which was created without a core.

INFO  - 2015-11-17 21:03:54.157; [c:xmpl s:shard1 r:core_node2 
x:xmpl_shard1_replica1] org.apache.solr.cloud.RecoveryStrategy; Starting 
Replication Recovery.
INFO  - 2015-11-17 21:03:54.158; [c:xmpl s:shard1 r:core_node2 
x:xmpl_shard1_replica1] org.apache.solr.cloud.RecoveryStrategy; Begin buffering 
updates.
INFO  - 2015-11-17 21:03:54.158; [c:xmpl s:shard1 r:core_node2 
x:xmpl_shard1_replica1] org.apache.solr.update.UpdateLog; Starting to buffer 
updates. FSUpdateLog{state=ACTIVE, tlog=null}
INFO  - 2015-11-17 21:03:54.159; [c:xmpl s:shard1 r:core_node2 
x:xmpl_shard1_replica1] org.apache.solr.cloud.RecoveryStrategy; Attempting to 
replicate from http://{IP-address-redacted}:4565/solr/xmpl/.
ERROR - 2015-11-17 21:03:54.166; [c:xmpl s:shard1 r:core_node2 
x:xmpl_shard1_replica1] org.apache.solr.common.SolrException; Error while 
trying to 
recover:org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: 
Error from server at http://{IP-address-redacted}:4565/solr/xmpl: Expected mime 
type application/octet-stream but got text/html. 


Error 401 Unauthorized request, Response code: 401

HTTP ERROR 401
Problem accessing /solr/xmpl/update. Reason:
    Unauthorized request, Response code: 
401Powered by Jetty://




    at 
org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:528)
    at 
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:234)
    at 
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:226)
    at 
org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:135)
    at 
org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:152)
    at 
org.apache.solr.cloud.RecoveryStrategy.commitOnLeader(RecoveryStrategy.java:207)
    at 
org.apache.solr.cloud.RecoveryStrategy.replicate(RecoveryStrategy.java:147)
    at 
org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:437)
    at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:227)

INFO  - 2015-11-17 21:03:54.166; [c:xmpl s:shard1 r:core_node2 
x:xmpl_shard1_replica1] org.apache.solr.update.UpdateLog; Dropping buffered 
updates FSUpdateLog{state=BUFFERING, tlog=null}
ERROR - 2015-11-17 21:03:54.166; [c:xmpl s:shard1 r:core_node2 
x:xmpl_shard1_replica1] org.apache.solr.cloud.RecoveryStrategy; Recovery failed 
- trying again... (2)
INFO  - 2015-11-17 21:03:54.166; [c:xmpl s:shard1 r:core_node2 
x:xmpl_shard1_replica1] org.apache.solr.cloud.RecoveryStrategy; Wait 8.0 
seconds before trying to recover again (3)



And (after modifying Logging Levels), the solr.log of the node which already 
had a core gets errors such as the following:

2015-11-17 21:03:50.743 DEBUG (qtp59559151-87) [   ] o.e.j.s.Server REQUEST GET 
/solr/tpl/cloud.html on 
HttpChannelOverHttp@37cf94f4{r=1,c=false,a=DISPATCHED,uri=/solr/tpl/cloud.html}
2015-11-17 21:03:50.744 DEBUG (qtp59559151-87) [   ] o.e.j.s.Server RESPONSE 
/solr/tpl/cloud.html  200 handled=true
2015-11-17 21:03:50.802 DEBUG (qtp59559151-91) [   ] o.e.j.s.Server REQUEST GET 
/solr/zookeeper on 
HttpChannelOverHttp@37cf94f4{r=2,c=false,a=DISPATCHED,uri=/solr/zookeeper}
2015-11-17 21:03:50.803 INFO  (qtp59559151-91) [   ] o.a.s.s.HttpSolrCall 
userPrincipal: [null] type: [UNKNOWN], collections: [], Path: [/zookeeper]
2015-11-17 21:03:50.831 DEBUG