Re: Congratulations to the new Apache Solr PMC Chair, Jan Høydahl!

2021-02-19 Thread Jason Gerlowski
Congrats!

On Fri, Feb 19, 2021 at 10:06 AM Divye  wrote:
>
> Congratulations Jan!
>
> Regards,
> Divye
>
> On Fri, 19 Feb, 2021, 00:26 Anshum Gupta,  wrote:
>
> > Hi everyone,
> >
> > I’d like to inform everyone that the newly formed Apache Solr PMC nominated
> > and elected Jan Høydahl for the position of the Solr PMC Chair and Vice
> > President. This decision was approved by the board in its February 2021
> > meeting.
> >
> > Congratulations Jan!
> >
> > --
> > Anshum Gupta
> >


Re: Solr Cloud freezes during scheduled backup

2021-02-02 Thread Jason Gerlowski
Hi Pawel,

This definitely sounds like garbage collection biting you.

Backups themselves aren't usually memory intensive, but if indexing is
going on at the same time you should expect elevated memory usage.
Essentially this is because for each core being backed up, Solr needs
to hold pieces of two different "versions" of the index in memory: the
commit-point being backed up, and the current state of the index with
the new documents.

If disabling indexing during backups is feasible that's where I'd
start in your shoes.  If it's not you might need to consider tweaks to
your heap and JVM GC settings to shorten the long individual GC pauses
you're reporting.

Good luck,

Jason

On Wed, Jan 20, 2021 at 7:00 AM Paweł Róg  wrote:
>
> Hello everyone,
> I have a nasty problem with the scheduled Solr collections backup. From
> time to time when a scheduled backup is triggered (backup operation takes
> around 10 minutes) Solr freezes for 20-30 seconds. The freeze happens on
> one Solr instance at a time, but this affects the latency of all queries (because of
> distributed queries on 6 shards). I can reproduce the problem only when
> updates in the Solr cluster are enabled. When I disable updates, the
> problem is gone.
>
> Lucene index is not big and fits into OS cache. I am wondering if taking a
> backup can be the culprit of the problem. I'm wondering if the process
> messes up operating system caches. Maybe all the files which are copied to
> NFS are eating up the OS cache and when the OS reaches high memory usage it
> starts cleaning up memory and making Solr freeze.
>
> During the time of freeze monitoring charts are showing higher IO wait
> times. In addition to that Solr nodes which seem to be affected are
> reaching 95-100% total memory usage (used + buffers + caches).
>
> I cannot see anything valuable in GC logs apart from a message which
> suggests that the application was stopped for 20-30 seconds (Application
> time).
>
> The cluster consists of 12 machines. Each Solr is running on Ubuntu 16.04.
> All the servers are running in AWS EC2. Each Solr node is running inside
> Docker. EC2 instances have local SSD disks (but the same problem appeared
> with EBS).
>
> Does anyone have a similar problem and can share some thoughts? I'll
> appreciate all help.
>
> --
> Pawel Rog


Re: Change uniqueKey using SolrJ

2021-02-01 Thread Jason Gerlowski
Hi,

SolrJ doesn't have any purpose-made request class to change the
uniqueKey, afaict.  However doing so is still possible (though less
convenient) using the "GenericSolrRequest" class, which can be used to
hit arbitrary Solr APIs.

If you'd like to see better support for this in SolrJ, open a JIRA
ticket with the details of what you're trying to do (or a PR directly)
and I'd be happy to take a look.

Best,

Jason

On Fri, Jan 22, 2021 at 9:29 AM Timo Grün  wrote:
>
> Hi All,
>
> I’m currently trying to change the uniqueKey of my Solr Cloud schema using 
> Solrj.
> While creating new Fields and FieldDefinitions is pretty straightforward, I 
> struggle to find any solution to change the Unique Key field with Solrj.
>
> Any advice here?
>
> Best Regards,
>
> Timo Gruen
>


Re: Ghost Documents or Shards out of Sync

2021-02-01 Thread Jason Gerlowski
Forgot to answer your second question:

> Can I trigger the "fixing" mechanism that Solr runs at restart by an API call 
> or some other method?

It depends on what the cause is.  But for at least some possible
causes there is an API call that can resolve this.  Though that API
itself (Solr's misnamed "optimize" feature) comes with a lot of
warnings and has been discouraged by the community in the past.  (I
won't get into those specifics though until you figure out the cause.)

Before you consider calling "optimize" or taking any other means to
fix this though, it might be worth revisiting whether this is really
an issue?  While this quirk of Solr's can bedevil automated tests or
other things that rely on repeatability, it's unusual in many
applications for end-users to submit identical queries multiple times.
Every case is different of course, but something to consider.

Best,

Jason

On Mon, Feb 1, 2021 at 3:49 PM Jason Gerlowski  wrote:
>
> Hi Ronen,
>
> The first thing I'd figure out in your situation is whether the
> results are actually different each time, or whether the ordering is
> what differs (which might push a particular result off the page you're
> looking at, giving the appearance that it didn't match).
>
> In the case of the former, this can happen briefly if queries come in
> when some but not all replicas have seen a commit.  But usually this
> is a transient concern - either waiting for the next autocommit or
> triggering an explicit commit resolves the discrepancy in this case.
> Since you only see identical results after a restart, this _doesn't_
> sound like what you're seeing.
>
> In the case of the latter (same results, differently ordered) this is
> expected sometimes.  Solr sorts on relevance by default with the
> internal Lucene document ID being a tiebreaker.  Both the relevance
> statistics and Lucene's document IDs can differ across SolrCloud
> replicas (due to non-deterministic conditions such as the segment
> merging and deleted-doc removal that Lucene does under the hood), and
> this can produce differently-ordered result sets for users that issue
> the same query repeatedly.
>
> Good luck narrowing things down!
>
> Jason
>
> On Mon, Jan 25, 2021 at 3:32 AM Ronen Nussbaum  wrote:
> >
> > Hi All,
> >
> > I'm using Solr Cloud (version 8.3.0) with shards and replicas (replication
> > factor of 2).
> > Recently, I've encountered several times that running the same query
> > repeatedly yields different results. Restarting the nodes fixes the problem
> > (until next time).
> > I assume that some shards are not synchronized and I have several questions:
> > 1. What can cause this - many atomic updates? issues with commits?
> > 2. Can I trigger the "fixing" mechanism that Solr runs at restart by an API
> > call or some other method?
> >
> > Thanks in advance,
> > Ronen.


Re: Ghost Documents or Shards out of Sync

2021-02-01 Thread Jason Gerlowski
Hi Ronen,

The first thing I'd figure out in your situation is whether the
results are actually different each time, or whether the ordering is
what differs (which might push a particular result off the page you're
looking at, giving the appearance that it didn't match).

In the case of the former, this can happen briefly if queries come in
when some but not all replicas have seen a commit.  But usually this
is a transient concern - either waiting for the next autocommit or
triggering an explicit commit resolves the discrepancy in this case.
Since you only see identical results after a restart, this _doesn't_
sound like what you're seeing.

In the case of the latter (same results, differently ordered) this is
expected sometimes.  Solr sorts on relevance by default with the
internal Lucene document ID being a tiebreaker.  Both the relevance
statistics and Lucene's document IDs can differ across SolrCloud
replicas (due to non-deterministic conditions such as the segment
merging and deleted-doc removal that Lucene does under the hood), and
this can produce differently-ordered result sets for users that issue
the same query repeatedly.
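
To illustrate the ordering case, here is a minimal sketch in plain Java
(no Solr involved).  The Hit class, the sample scores, and the internal
doc IDs are all made up for illustration.  Two "replicas" hold the same
documents with the same scores but different internal doc IDs, so the
default-style tiebreak orders them differently, while a tiebreak on the
unique key does not:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class TieBreakDemo {
    // Hypothetical stand-in for a search hit: unique key, relevance score,
    // and the replica-local internal Lucene doc ID.
    public static final class Hit {
        final String id;
        final double score;
        final int internalDocId;

        public Hit(String id, double score, int internalDocId) {
            this.id = id;
            this.score = score;
            this.internalDocId = internalDocId;
        }
    }

    // Same documents and scores on both replicas; only the internal IDs differ.
    public static List<Hit> replicaA() {
        return Arrays.asList(new Hit("a", 1.0, 3), new Hit("b", 1.0, 7), new Hit("c", 0.5, 1));
    }

    public static List<Hit> replicaB() {
        return Arrays.asList(new Hit("a", 1.0, 9), new Hit("b", 1.0, 2), new Hit("c", 0.5, 4));
    }

    // Score descending, internal doc ID as the tiebreaker.
    public static List<String> rankByScoreThenDocId(List<Hit> hits) {
        return rank(hits, Comparator.comparingDouble((Hit h) -> h.score).reversed()
                .thenComparingInt(h -> h.internalDocId));
    }

    // Score descending, unique key as the tiebreaker.
    public static List<String> rankByScoreThenId(List<Hit> hits) {
        return rank(hits, Comparator.comparingDouble((Hit h) -> h.score).reversed()
                .thenComparing(h -> h.id));
    }

    private static List<String> rank(List<Hit> hits, Comparator<Hit> order) {
        List<Hit> sorted = new ArrayList<>(hits);
        sorted.sort(order);
        List<String> ids = new ArrayList<>();
        for (Hit h : sorted) {
            ids.add(h.id);
        }
        return ids;
    }

    public static void main(String[] args) {
        System.out.println(rankByScoreThenDocId(replicaA())); // [a, b, c]
        System.out.println(rankByScoreThenDocId(replicaB())); // [b, a, c]
        System.out.println(rankByScoreThenId(replicaA()));    // [a, b, c]
        System.out.println(rankByScoreThenId(replicaB()));    // [a, b, c]
    }
}
```

In Solr terms the second ordering corresponds to an explicit sort such
as "sort=score desc,id asc" (assuming your uniqueKey field is named
"id").  Note this only pins down ties; it doesn't help if the scores
themselves differ between replicas.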

Good luck narrowing things down!

Jason

On Mon, Jan 25, 2021 at 3:32 AM Ronen Nussbaum  wrote:
>
> Hi All,
>
> I'm using Solr Cloud (version 8.3.0) with shards and replicas (replication
> factor of 2).
> Recently, I've encountered several times that running the same query
> repeatedly yields different results. Restarting the nodes fixes the problem
> (until next time).
> I assume that some shards are not synchronized and I have several questions:
> 1. What can cause this - many atomic updates? issues with commits?
> 2. Can I trigger the "fixing" mechanism that Solr runs at restart by an API
> call or some other method?
>
> Thanks in advance,
> Ronen.


Re: Getting Solr's statistic using SolrJ

2021-02-01 Thread Jason Gerlowski
Hi Steven,

AFAIK, SolrJ doesn't have built in request objects for the metrics
API.  But you can still use the "GenericSolrRequest" class to hit any
Solr API:

e.g.

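import org.apache.solr.client.solrj.SolrRequest;
import org.apache.solr.client.solrj.request.GenericSolrRequest;
import org.apache.solr.client.solrj.response.SimpleSolrResponse;
import org.apache.solr.common.params.ModifiableSolrParams;

// ModifiableSolrParams (not the read-only SolrParams) exposes set()
ModifiableSolrParams params = new ModifiableSolrParams();
params.set("action", "list");
GenericSolrRequest request = new GenericSolrRequest(
    SolrRequest.METHOD.GET, "/admin/metrics/history", params);
final SimpleSolrResponse response = request.process(solrClient);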

Hope that helps,

Jason

On Fri, Jan 22, 2021 at 11:21 AM Gael Jourdan-Weil
 wrote:
>
> Hello Steven,
>
> I believe what you are looking for cannot be accessed using SolrJ (I didn't 
> really check though).
>
> But you can easily access it either via the Collections APIs and/or the 
> Metrics API depending on what you need exactly.
> See https://lucene.apache.org/solr/guide/8_4/cluster-node-management.html and 
> https://lucene.apache.org/solr/guide/8_4/metrics-reporting.html
>
> Gaël
>
>
> From: Steven White 
> Sent: Friday, January 22, 2021 16:46
> To: solr-user@lucene.apache.org 
> Subject: Getting Solr's statistic using SolrJ
>
> Hi everyone,
>
> Is there a SolrJ API that I can use to collect statistics data about Solr
> (everything that I see on the dashboard if possible)?
>
> I need to collect data about Solr instances, the same data that I
> see on the dashboard such as swap-memory, jvm-memory, list of cores, info
> about each core, etc. etc. using SolrJ API.
>
> Thanks
>
> Steven


Re: nested facets of query and terms type in JSON format

2020-12-10 Thread Jason Gerlowski
Hey Arturas,

Can't help you with the secrets of Michael's inspiration (though I'm
also curious :-p).  And I'm not sure if there's any equivalent of
facet.threads for JSON Faceting.  You're on your own there
unfortunately.

But you (or other readers) might find this "Query Facet" example handy
- it uses the "type": "query" syntax that Michael mentioned. [1]

[1] https://lucene.apache.org/solr/guide/8_5/json-facet-api.html#query-facet

Best,
Jason

On Thu, Dec 3, 2020 at 5:49 PM Arturas Mazeika  wrote:
>
> Hi Michael,
>
> I wish I were able to do a percent of what you are doing. Where does your
> inspiration come from? It is not from the manuals, cause I've checked
> those. How do you come up with this piece of art? Did you check this from
> the source code? Which lines revealed these secrets? I am eternally
> grateful for your help!
>
> Michael, maybe you happen to know how I can plug the facet.threads
> parameter into that JSON body below, so the query uses more threads to
> compute the answer? I am dying out of curiosity.
>
> Cheers,
> Arturas
>
> On Thu, Dec 3, 2020 at 7:59 PM Michael Gibney 
> wrote:
>
> > I think the first "error" case in your set of examples above is closest to
> > being correct. For "query" facet type, I think you want to explicitly
> > specify `"type":"query"`, and specify the query itself in the `"q"` param,
> > i.e.:
> > {
> > "query"  : "*:*",
> > "limit"  : 0,
> >
> > "facet": {
> > "aip": {
> > "type":  "query",
> > "q":  "cfname2:aip",
> > "facet": {
> > "t_buckets": {
> > "type":  "range",
> > "field": "t",
> > "sort": { "t": "asc" },
> > "start": "2018-05-02T17:00:00.000Z",
> > "end":   "2020-11-16T21:00:00.000Z",
> > "gap":   "+1HOUR",
> > "limit": 1
> > }
> > }
> > }
> > }
> > }
> >
> > On Thu, Dec 3, 2020 at 12:59 PM Arturas Mazeika  wrote:
> >
> > > Hi Michael,
> > >
> > > Thanks for helping me to figure this out.
> > >
> > > If I fire:
> > >
> > > {
> > > "query"  : "*:*",
> > > "limit"  : 0,
> > >
> > > "facet": {
> > > "aip": { "query":  "cfname2:aip", }
> > >
> > > }
> > > }
> > >
> > > I get
> > >
> > > "response": { "numFound": 20560849, "start": 0, "numFoundExact": true,
> > > "docs": [] }, "facets": { "count": 20560849, "aip": { "count": 2307 } } }
> > >
> > > (works). If I fire
> > >
> > >
> > > {
> > > "query"  : "*:*",
> > > "limit"  : 0,
> > >
> > > "facet": {
> > > "t_buckets": {
> > > "type":  "range",
> > > "field": "t",
> > > "sort": { "t": "asc" },
> > > "start": "2018-05-02T17:00:00.000Z",
> > > "end":   "2020-11-16T21:00:00.000Z",
> > > "gap":   "+1HOUR"
> > > "limit": 1
> > > }
> > > }
> > > }
> > >
> > > I get
> > >
> > > "response": { "numFound": 20560849, "start": 0, "numFoundExact": true,
> > > "docs": [] }, "facets": { "count": 20560849, "t_buckets": { "buckets": [ {
> > > "val": "2018-05-02T17:00:00Z", "count": 150 },
> > >
> > > (works). If I fire:
> > >
> > > {
> > > "query"  : "*:*",
> > > "limit"  : 0,
> > >
> > > "facet": {
> > > "aip": { "query":  "cfname2:aip"

Re: security.json help

2020-11-25 Thread Jason Gerlowski
Hi Mark,

It looks like you're using the "path" wildcard as it's intended, but
some bug is causing the behavior you're seeing.  It should be working
as you expected, but evidently it's not.

One potential workaround might be to leave out the "path" property
entirely in your "custom-example" permission.  When I do that (on Solr
8.6.2), I get the following behavior in the following pastebin link,
which looks close to what you're after: https://paste.apache.org/ygndt

Hope that helps!

Jason

On Mon, Oct 19, 2020 at 3:49 PM Mark Dadisman
 wrote:
>
> Hey, I'm new to configuring Solr. I'm trying to configure Solr with Rule 
> Based Authorization. 
> https://lucene.apache.org/solr/guide/8_6/rule-based-authorization-plugin.html
>
> I have permissions working if I allow everything with "all", but I want to 
> limit access so that a site can only access its own collection, in addition 
> to a server ping path, so I'm trying to add the collection-specific 
> permission at the top:
>
> "permissions": [
>   {
> "name": "custom-example",
> "collection": "example",
> "path": "*",
> "role": [
>   "admin",
>   "example"
> ]
>   },
>   {
> "name": "custom-collection",
> "collection": "*",
> "path": [
>   "/admin/luke",
>   "/admin/mbeans",
>   "/admin/system"
> ],
> "role": "*"
>   },
>   {
> "name": "custom-ping",
> "collection": null,
> "path": [
>   "/admin/info/system"
> ],
> "role": "*"
>   },
>   {
> "name": "all",
> "role": "admin"
>   }
> ]
>
> The rule "custom-ping" works, and "all" works. But when the above permissions 
> are used, access is denied to the "example" user-role for collection 
> "example" at the path "/solr/example/select". If I specify paths explicitly, 
> the permissions work, but I can't get permissions to work with path wildcards 
> for a specific collection.
>
> I also had to declare "custom-collection" with the specific paths needed to 
> get collection info in order for those paths to work. I would've expected 
> that these paths would be included in the collection-specific paths and be 
> covered by the first rule, but they aren't. For example, the call to 
> "/solr/example/admin/luke" will fail if the path is removed from this rule.
>
> I don't really want to specify every single path I might need to use. Am I 
> using the path wildcard wrong somehow? Is there a better way to do 
> collection-specific authorizations for a collection "example"?
>
> Thanks.
> - M
>


Re: disallowing delete through security.json

2020-11-24 Thread Jason Gerlowski
Hey Craig,

I think this will be tricky to do with the current Rule-Based
Authorization support.  As you pointed out in your initial post -
there are lots of ways to delete documents.  The Rule-Based Auth code
doesn't inspect request bodies (AFAIK), so it's going to have trouble
telling deletes apart from other traditional "/update" requests that
use method=POST and are request-body driven.
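
To make that concrete, here is a rough sketch in plain Java.  It is
NOT Solr's actual authorization code; the Request and Permission
classes and the matches() method are hypothetical, and the request
bodies are just sample update XML.  It only shows why a check limited
to HTTP method and path can't separate a delete from an add:

```java
public class AuthSketch {
    // Hypothetical view of an incoming request.
    public static final class Request {
        final String method;
        final String path;
        final String body;

        public Request(String method, String path, String body) {
            this.method = method;
            this.path = path;
            this.body = body;
        }
    }

    // Hypothetical permission keyed on HTTP method and path prefix only.
    public static final class Permission {
        final String method;
        final String pathPrefix;

        public Permission(String method, String pathPrefix) {
            this.method = method;
            this.pathPrefix = pathPrefix;
        }
    }

    // Consults only the method and path; the body is never looked at.
    public static boolean matches(Permission p, Request r) {
        return p.method.equalsIgnoreCase(r.method) && r.path.startsWith(p.pathPrefix);
    }

    public static void main(String[] args) {
        Request add = new Request("POST", "/update",
                "<add><doc><field name=\"id\">11</field></doc></add>");
        Request delete = new Request("POST", "/update",
                "<delete><query>id:11</query></delete>");

        Permission onDeleteMethod = new Permission("DELETE", "/update");
        Permission onPostMethod = new Permission("POST", "/update");

        System.out.println(matches(onDeleteMethod, delete)); // false
        System.out.println(matches(onPostMethod, delete));   // true
        System.out.println(matches(onPostMethod, add));      // true
    }
}
```

A rule on method=DELETE never sees the POSTed delete, and a rule on
method=POST catches adds and deletes alike.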

But to zoom out a bit, does it really make sense to lock down deletes,
but not updates more broadly?  After all, "updates" can remove and add
fields.  Users might submit an update that strips everything but "id"
from your documents.  In many/most usecases that'd be equally
concerning.  Just wondering what your usecase is - if it's generally
applicable this is probably worth a JIRA ticket.

Best,

Jason

On Thu, Nov 19, 2020 at 10:34 AM Oakley, Craig (NIH/NLM/NCBI) [C]
 wrote:
>
> Having not heard back, I thought I would ask again whether anyone else has 
> been able to use security.json to disallow deletes, and/or if anyone has 
> examples of using the "method" section in 
> lucene.apache.org/solr/guide/8_4/rule-based-authorization-plugin.html
>
> -Original Message-
> From: Oakley, Craig (NIH/NLM/NCBI) [C] 
> Sent: Monday, October 26, 2020 6:23 PM
> To: solr-user@lucene.apache.org
> Subject: disallowing delete through security.json
>
> I am interested in disallowing delete through security.json
>
> After seeing the "method" section in 
> lucene.apache.org/solr/guide/8_4/rule-based-authorization-plugin.html my 
> first attempt was as follows:
>
> {"set-permission":{
> "name":"NO_delete",
> "path":["/update/*","/update"],
> "collection":col_name,
> "role":"NoSuchRole",
> "method":"DELETE",
> "before":4}}
>
> I found, however, that this did not disallow deletes: I could still run
> curl -u ... "http://.../solr/col_name/update?commit=true" --data 
> "<delete><query>id:11</query></delete>"
>
> After further experimentation, I seemed to have success with
> {"set-permission":
> {"name":"NO_delete6",
> "path":"/update/*",
> "collection":"col_name",
> "role":"NoSuchRole",
> "method":["REGEX:(?i)DELETE"],
> "before":4}}
>
> My initial impression was that this did what I wanted; but now I find that 
> this disallows *any* updates to this collection (which had previously been 
> allowed). Other attempts to tweak this strategy, such as granting permissions 
> for "/update/*" for methods other than DELETE to a role which is granted to 
> the desired user, have not yet been successful.
>
> Does anyone have an example of security.json disallowing a delete while still 
> allowing an update?
>
> Thanks


Re: SolrJ NestableJsonFacet ordering of query facet

2020-11-19 Thread Jason Gerlowski
Hi Shivam,

I think the short answer is "no".  At least, not without sub-classing
some of the JSON-Facet classes in SolrJ.

But it's hard for me to understand your particular concern without
seeing a concrete example.  If you provide an example (maybe in the
form of a JUnit test snippet showing the actual and expected values),
I may be able to provide more help.

Best,

Jason

On Fri, Oct 30, 2020 at 1:54 AM Shivam Jha  wrote:
>
> Hi folks,
>
> Does anyone have any advice on this issue?
>
> Thanks,
> Shivam
>
> On Tue, Oct 27, 2020 at 1:20 PM Shivam Jha  wrote:
>
> > Hi folks,
> >
> > Doing some faceted queries using 'facet.json' param and SolrJ, the results
> > of which I am processing using SolrJ NestableJsonFacet class.
> > Basically, queryResponse.getJsonFacetingResponse() returns a NestableJsonFacet
> > object.
> >
> > But I have noticed it does not maintain the facet-query order in which it
> > was given in *facet.json.*
> > *Direct queries to solr do maintain that order, but not after it comes to
> > Java layer in SolrJ.*
> >
> > Is there a way to make it maintain that order ?
> > Hopefully the question makes sense, if not please let me know I can
> > clarify further.
> >
> > Thanks,
> > Shivam
> >
>
>
> --
> shivamJha


Re: Using fromIndex for single collection

2020-11-19 Thread Jason Gerlowski
Hi Irina,

Yes, the "fromIndex" parameter can be used to perform a join from the
host collection to a separate, single-shard collection in SolrCloud.
If specified, this "fromIndex" collection must be present on whichever
host is processing the request.  (Often this involves over-replicating
your "fromIndex" so that it's co-located with the other involved
collection).
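
As a rough sketch of what such a query string looks like, here is a
small helper in plain Java.  The helper name and the field and
collection names ("manu_id_s", "id", "manufacturers") are made up for
illustration; only the "{!join from=... to=... fromIndex=...}" local
params syntax comes from Solr itself:

```java
public class JoinQuerySketch {
    // Builds a join query: "fromField" lives in the fromIndex collection,
    // "toField" lives in the collection the request is sent to.
    public static String joinQuery(String fromField, String toField,
                                   String fromIndex, String innerQuery) {
        return "{!join from=" + fromField
                + " to=" + toField
                + " fromIndex=" + fromIndex + "}"
                + innerQuery;
    }

    public static void main(String[] args) {
        // Hypothetical names; the resulting string would be sent as the "q" parameter.
        System.out.println(joinQuery("manu_id_s", "id", "manufacturers", "name:acme"));
    }
}
```

The inner query ("name:acme" here) runs against the "fromIndex"
collection, and the join maps its matches back to documents in the
collection being queried.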

Additionally, Solr has recently gained support for "Cross Collection
Joins".  This separate approach to joining avoids the restrictions
mentioned above.  This is documented here:
https://lucene.apache.org/solr/guide/8_6/other-parsers.html#cross-collection-join

Best,

Jason

On Wed, Oct 7, 2020 at 12:45 PM Irina Kamalova  wrote:
>
> I suppose my question is very simple.
> Am I right that if I want to use joins in the single collection in
> SolrCloud across several shards,
> I need to use semantic "fromIndex"?
> According to documentation I should use it only if I have different
> collections.
> I have one single collection across multiple shards and I didn't find a way
> to join documents correctly, but with "fromIndex" semantic.
>
> Am I correct?
>
> Best regards,
> Irina Kamalova


Re: Faceting: !terms vs mincount precedence

2020-11-17 Thread Jason Gerlowski
Thanks for the context David - I didn't realize this was built as an
internal mechanism and then documented later on.  A few other thoughts
below:

> {!terms}, it suggests a reference to the TermsQParser, but when you write 
> {!terms=a,b,c} it suggests local-params
I agree that the two are easy to confuse.  Apologies for abbreviating
it at points in my earlier email - I was doing it for brevity and
didn't intend the confusion.

> I think that "terms" local-param to faceting was a purely internal thing that 
> wasn't documented
That may be.  But I disagree that it shouldn't've been documented in
the first place.  Digging into this has cost me a good bit of time,
and even now maybe I've got more digging to do, maybe a bug to fix,
etc.  But without someone's (Christine's?) documentation I'd be even
worse off, without any idea that this "terms" local-params support
exists at all.  The documentation even mentions that "terms" doesn't
work well with some other faceting params.  The details could be a bit
fuller, but the warning *is* there.  So I don't find any fault with
documenting this sort of stuff - especially when it gives warnings
about potential limitations.

Anyway, still hoping someone else might chime in with a slick
workaround or something.  But it does look at this point like I'll
have to go another route or put in some effort myself.

Jason

On Tue, Nov 17, 2020 at 3:41 PM David Smiley  wrote:
>
> This is confusing because when you write {!terms}, it suggests a reference
> to the TermsQParser, but when you write {!terms=a,b,c} it suggests
> local-params, with key "terms" and value "a,b,c" -- entirely different
> things.  I think that "terms" local-param to faceting was a purely internal
> thing that wasn't documented; it existed as an internal implementation
> detail.  Then someone (I think Christine, if not then Mikhail) observed it
> wasn't documented, and added some basic docs.  Now you come along and try
> to use it with other things that unsurprisingly it just wasn't designed
> for.  That's my estimation of the matter... and *if* true, illustrates that
> maybe some internal params should stay internal and don't need to be
> publicly documented.  I confess I've used that faceting local-param in an
> app once before too; it's useful.  I know my response isn't a direct answer
> to your question RE mincount... perhaps it can be made to work?
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
>
>
> On Tue, Nov 17, 2020 at 8:21 AM Jason Gerlowski 
> wrote:
>
> > Hey all,
> >
> > I was using the {!terms} local parameter on some traditional field
> > facets to make sure particular values were returned.
> >
> > e.g.
> > facet=true&facet.field={!terms='fantasy,scifi,mystery'}genre_s&f.genre_s.facet.mincount=2
> >
> > On single-shard collections in 8.6.3 this worked as I expected -
> > "fantasy", "scifi", and "mystery" were the only 3 field values
> > returned, and "mystery" was returned despite its count value being
> > less than the specified "mincount".  But on a multi-shard collection
> > "mystery" isn't returned (presumably because a "mincount" check
> > filters out the values on the facet aggregator node).
> >
> > What are the expected semantics when "{!terms}" and "mincount" are
> > used together?  Should mincount filter out values in {!terms}, or
> > should those values be excluded from any mincount filtering?  The
> > behavior is clearly inconsistent between single and multi-shard, so it
> > deserves a JIRA either way.  Just trying to figure out what the
> > expected behavior is.
> >
> > Best,
> >
> > Jason
> >


Re: Multiple Facets on Same Field

2020-11-17 Thread Jason Gerlowski
Thanks Michael,

I agree - JSON Facets is a better candidate for the functionality I'm
looking for.  In my case specifically though, I think I'm pegged to
traditional facets because I also want to use the "terms" local params
support that doesn't have a native equivalent in JSON Faceting (yet:
SOLR-14921).

If no one has other ideas here, maybe my best bet is to switch to
using JSON Faceting and adding an explicit "{!terms}" query as a
filter.  I see you suggested that as a workaround here [1].

Jason

[1] 
http://mail-archives.apache.org/mod_mbox/lucene-dev/202010.mbox/%3CCAF%3DheHGKwGtvq%3DgAndmVrgvo1cxKmzP0neGi17_eoVhubpaBZA%40mail.gmail.com%3E

On Tue, Nov 17, 2020 at 10:02 AM Michael Gibney
 wrote:
>
> Answering a slightly different question perhaps, but you can
> definitely do this with the "JSON Facet" API, where there's much
> cleaner separation between different facets (and output is assigned to
> arbitrary keys).
> Michael
>
> On Tue, Nov 17, 2020 at 9:36 AM Jason Gerlowski  wrote:
> >
> > Hi all,
> >
> > Is it possible to have multiple facets on the same field with
> > different parameters (mincount, limit, prefix, etc.) on each?
> >
> > The ref-guide describes these per-facet parameters as being settable
> > on a "per-field basis" with syntax of
> > "f.<fieldName>.facet.<parameter>" [1].  But I wasn't sure whether to
> > take that at face value, or hope that the "<fieldName>" value there
> > could be something more flexible (like the value of facet.field which
> > can take local params).
> >
> > I've been trying variations of
> > "facet=true&facet.field=f1&facet.mincount=5&facet.field={!key=someOutputKey}f1",
> > but without luck.  "mincount" is always applied to both of the
> > facet.field's being computed.
> >
> > Best,
> >
> > Jason


Faceting: !terms vs mincount precedence

2020-11-17 Thread Jason Gerlowski
Hey all,

I was using the {!terms} local parameter on some traditional field
facets to make sure particular values were returned.

e.g. 
facet=true&facet.field={!terms='fantasy,scifi,mystery'}genre_s&f.genre_s.facet.mincount=2

On single-shard collections in 8.6.3 this worked as I expected -
"fantasy", "scifi", and "mystery" were the only 3 field values
returned, and "mystery" was returned despite its count value being
less than the specified "mincount".  But on a multi-shard collection
"mystery" isn't returned (presumably because a "mincount" check
filters out the values on the facet aggregator node).

What are the expected semantics when "{!terms}" and "mincount" are
used together?  Should mincount filter out values in {!terms}, or
should those values be excluded from any mincount filtering?  The
behavior is clearly inconsistent between single and multi-shard, so it
deserves a JIRA either way.  Just trying to figure out what the
expected behavior is.
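
To spell out the two candidate semantics, here is a small sketch in
plain Java.  It is not Solr code; the method name and the counts are
invented.  With termsExempt=true the listed terms always come back
(what I saw on single-shard); with termsExempt=false mincount still
filters them (what I saw on multi-shard):

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TermsMincountSketch {
    // Returns the facet buckets for the requested terms under one of the
    // two candidate semantics for combining "terms" with "mincount".
    public static Map<String, Integer> buckets(Map<String, Integer> counts,
                                               List<String> terms,
                                               int mincount,
                                               boolean termsExempt) {
        Map<String, Integer> out = new LinkedHashMap<>();
        for (String term : terms) {
            int count = counts.getOrDefault(term, 0);
            if (termsExempt || count >= mincount) {
                out.put(term, count);
            }
        }
        return out;
    }

    // Invented sample counts for a "genre_s"-style field.
    public static Map<String, Integer> sampleCounts() {
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("fantasy", 5);
        counts.put("scifi", 3);
        counts.put("mystery", 1);
        counts.put("romance", 8);
        return counts;
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList("fantasy", "scifi", "mystery");
        System.out.println(buckets(sampleCounts(), terms, 2, true));  // {fantasy=5, scifi=3, mystery=1}
        System.out.println(buckets(sampleCounts(), terms, 2, false)); // {fantasy=5, scifi=3}
    }
}
```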

Best,

Jason


[ANNOUNCE] Apache Solr 8.6.3 released

2020-10-08 Thread Jason Gerlowski
The Lucene PMC is pleased to announce the release of Apache Solr 8.6.3.

Solr is the popular, blazing fast, open source NoSQL search platform
from the Apache Lucene project. Its major features include powerful
full-text search, hit highlighting, faceted search, dynamic
clustering, database integration, rich document handling, and
geospatial search. Solr is highly scalable, providing fault tolerant
distributed search and indexing, and powers the search and navigation
features of many of the world's largest internet sites.

Solr 8.6.3 is available for immediate download at:
  

### Solr 8.6.3 Release Highlights:

 * SOLR-14898: Prevent duplicate header accumulation on internally
forwarded requests
 * SOLR-14768: Fix HTTP multipart POST requests to Solr (8.6.0 regression)
 * SOLR-14859: PrefixTree-based fields now reject invalid schema
properties instead of quietly failing certain queries
 * SOLR-14663: CREATE ConfigSet action now copies base node content

Please refer to the Upgrade Notes in the Solr Ref Guide for
information on upgrading from previous Solr versions:
  

Please read CHANGES.txt for a full list of bugfixes:
  

Solr 8.6.3 also includes bugfixes in the corresponding Apache Lucene release:
  

Note: The Apache Software Foundation uses an extensive mirroring network for
distributing releases. It is possible that the mirror you are using may not have
replicated the release yet. If that is the case, please try another mirror.
This also applies to Maven access.


Re: BasicAuth help

2020-09-03 Thread Jason Gerlowski
Hi Ali,

1. Solr doesn't have any support for LDAP authentication ootb (at
least, as far as I'm aware).  The BasicAuth plugin requires users to
be defined in the JSON configuration.

2. What failed when you ran the documented BasicAuth example?  What
error messages did you get etc.?  If there's something wrong with that
example, maybe we can fix the docs.

Jason

On Fri, Aug 28, 2020 at 3:28 PM Vanalli, Ali A - DOT
 wrote:
>
> Hello,
>
> Solr is running on a Windows machine and I'm wondering if it is possible to set up 
> BasicAuth with LDAP?
>
> Also, I tried the example of Basic Authentication that is published 
> here<https://lucene.apache.org/solr/guide/8_6/rule-based-authorization-plugin.html#rule-based-authorization-plugin>
>  but this did not work either.
>
> Thanks...Ali
>
>


Re: Incorrect Insecure Settings Check in CoreContainer

2020-08-13 Thread Jason Gerlowski
Hey Mark,

I've fixed it for 8.7 as a part of this ticket here:
https://issues.apache.org/jira/browse/SOLR-14748. Thanks for reporting
this.
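
For anyone following along, the intended condition is roughly the
following.  This is a standalone sketch of the logic as Mark described
it, not the actual patch from the ticket; the class and method names
are made up, and the only thing taken from the real code is the
"solr.jetty.https.port" system property in the snippet he quoted:

```java
public class SslWarningCheck {
    // Warn only when authentication is on AND no HTTPS port is configured.
    // The buggy version warned when the property was populated (SSL on).
    public static boolean shouldWarn(boolean authEnabled, String httpsPort) {
        boolean sslEnabled = httpsPort != null && !httpsPort.isEmpty();
        return authEnabled && !sslEnabled;
    }

    public static void main(String[] args) {
        // Property name taken from the quoted snippet; unset means SSL is off.
        String httpsPort = System.getProperty("solr.jetty.https.port");
        if (shouldWarn(true, httpsPort)) {
            System.out.println("Solr authentication is enabled, but SSL is off.");
        }
    }
}
```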

Jason

On Tue, Aug 11, 2020 at 3:19 PM Jason Gerlowski  wrote:
>
> Yikes, yeah it's hard to argue with that.
>
> I'm a little confused because I remember testing this, but maybe it
> snuck in at the last minute?  In any case, I'll reopen that jira to
> fix the check there.
>
> Sorry guys.
>
> Jason
>
>
> On Wed, Aug 5, 2020 at 9:22 AM Jan Høydahl  wrote:
> >
> > This seems to have been introduced in 
> > https://issues.apache.org/jira/browse/SOLR-13972 in 8.4
> > That test seems to be inverted for sure.
> >
> > Jason?
> >
> > Jan
> >
> > > On Aug 5, 2020, at 13:15, Mark Todd1 wrote:
> > >
> > >
> > > I've configured SolrCloud (8.5) with both SSL and Authentication which is 
> > > working correctly. However, I get the following warning in the logs
> > >
> > > Solr authentication is enabled, but SSL is off. Consider enabling SSL to 
> > > protect user credentials and data with encryption
> > >
> > > Looking at the source code for SolrCloud there appears to be a bug
> > > if (authenticationPlugin !=null && 
> > > StringUtils.isNotEmpty(System.getProperty("solr.jetty.https.port"))) {
> > >
> > > log.warn("Solr authentication is enabled, but SSL is off.  Consider 
> > > enabling SSL to protect user credentials and data with encryption.");
> > >
> > > }
> > >
> > > > Rather than checking for an empty system property (which would indicate 
> > > > SSL is off) it's checking for a populated one, which is what you get when 
> > > > SSL is on.
> > >
> > > Should I raise this as a Jira bug?
> > >
> > > > Mark Todd
> > > > Unless stated otherwise above:
> > > IBM United Kingdom Limited - Registered in England and Wales with number 
> > > 741598.
> > > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
> > >
> >


Re: Incorrect Insecure Settings Check in CoreContainer

2020-08-11 Thread Jason Gerlowski
Yikes, yeah it's hard to argue with that.

I'm a little confused because I remember testing this, but maybe it
snuck in at the last minute?  In any case, I'll reopen that jira to
fix the check there.
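
For readers following along, the corrected condition would presumably warn only when authentication is configured and the HTTPS port property is empty. The following is a minimal, stdlib-only sketch of that inverted check, not the actual CoreContainer patch: the class and method names are hypothetical, the plugin is represented by a plain Object, and a null/empty test stands in for StringUtils.

```java
// Hypothetical helper illustrating the corrected check; this is not Solr source code.
final class InsecureSettingsCheck {

    // Returns true when authentication is configured but no HTTPS port is set,
    // which is the situation the log warning is meant to flag.
    static boolean shouldWarn(Object authenticationPlugin, String httpsPort) {
        boolean sslEnabled = httpsPort != null && !httpsPort.isEmpty();
        return authenticationPlugin != null && !sslEnabled;
    }

    public static void main(String[] args) {
        Object plugin = new Object(); // stand-in for a configured authentication plugin
        String httpsPort = System.getProperty("solr.jetty.https.port");
        if (shouldWarn(plugin, httpsPort)) {
            System.out.println("Solr authentication is enabled, but SSL is off.");
        }
    }
}
```

With this form, a populated solr.jetty.https.port suppresses the warning, which matches the behavior described in the report.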

Sorry guys.

Jason


On Wed, Aug 5, 2020 at 9:22 AM Jan Høydahl  wrote:
>
> This seems to have been introduced in 
> https://issues.apache.org/jira/browse/SOLR-13972 in 8.4
> That test seems to be inverted for sure.
>
> Jason?
>
> Jan
>
> > 5. aug. 2020 kl. 13:15 skrev Mark Todd1 :
> >
> >
> > I've configured SolrCloud (8.5) with both SSL and Authentication which is 
> > working correctly. However, I get the following warning in the logs
> >
> > Solr authentication is enabled, but SSL is off. Consider enabling SSL to 
> > protect user credentials and data with encryption
> >
> > Looking at the source code for SolrCloud there appears to be a bug
> > if (authenticationPlugin !=null && 
> > StringUtils.isNotEmpty(System.getProperty("solr.jetty.https.port"))) {
> >
> > log.warn("Solr authentication is enabled, but SSL is off.  Consider 
> > enabling SSL to protect user credentials and data with encryption.");
> >
> > }
> >
> > Rather than checking for an empty system property (which would indicate SSL 
> > is off) it's checking for a populated one, which is what you get when SSL is 
> > on.
> >
> > Should I raise this as a Jira bug?
> >
> > Mark Todd
> >
>


Re: Slow query response from SOLR 5.4.1

2020-08-11 Thread Jason Gerlowski
Hey Abhijit,

The information you provided isn't really enough for anyone else on
the mailing list to debug the problem.  If you'd like help, please
provide some more information.

Good places to start would be: what is the query, what does Solr tell
you when you add a "debug=timing" parameter to your request, what does
your Solr setup look like (num nodes, shards, replicas, other
collections/cores, QPS).  It's hard to say upfront what piece of info
will be the one that helps you get an answer to your question -
performance problems have a lot of varied causes.  But providing
_some_ of these things or other related details might help you get the
answer you're looking for.
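
As a starting point, a request with the "debug=timing" parameter can be assembled as in the sketch below. The host, port, and collection name are placeholders (assumptions, not details from this thread), and the helper class is hypothetical; only the debug=timing parameter itself comes from the advice above.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper that builds a select URL with per-component timing enabled.
final class DebugTimingUrl {

    // baseUrl and collection are placeholders supplied by the caller.
    static String build(String baseUrl, String collection, String query) {
        String q = URLEncoder.encode(query, StandardCharsets.UTF_8);
        return baseUrl + "/" + collection + "/select?q=" + q + "&debug=timing";
    }

    public static void main(String[] args) {
        // Assumed local Solr and an assumed collection name.
        System.out.println(build("http://localhost:8983/solr", "mycollection", "*:*"));
    }
}
```

The timing section of the response breaks the elapsed time down by search component, which usually narrows down where the two minutes are going.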

Alternately, if you've already figured out the issue, please post the answer
on this thread to help anyone with a similar issue in the future.

Jason

On Tue, Aug 4, 2020 at 4:11 PM Abhijit Pawar  wrote:
>
> Hello,
>
> I am seeing a performance issue in querying in one of the SOLR servers -
> instance version 5.4.1.
> Total number of documents indexed is 20K plus.
> The data returned for this particular query is as few as 22 documents,
> however it takes almost 2 minutes to get the results back.
>
> Is there a way to improve the performance of the query? In general the query
> response time is slow.
>
> Most of my fields are both stored and indexed. I can take off
> some fields which only need to be indexed, however those are not many
> fields.
>
> Can I do something in solrconfig.xml in terms of cache or something else?
>
> Any suggestions?
>
> Thanks!!


Re: Survey on ManagedResources feature

2020-08-11 Thread Jason Gerlowski
Hey Noble,

Can you explain what you mean when you say it's not secured?  Just for
those of us who haven't been following the discussion so far?  On the
surface of things users taking advantage of our RuleBasedAuth plugin
can secure this API like they can any other HTTP API.  Or are you
talking about some other security aspect here?

Jason

On Tue, Aug 11, 2020 at 9:55 AM Noble Paul  wrote:
>
> Hi all,
> The end-point for Managed resources is not secured. So it needs to be
> fixed/eliminated.
>
> I would like to know what is the level of adoption for that feature
> and if it is a critical feature for users.
>
> Another possibility is to offer a replacement for the feature using a
> different API
>
> Your feedback will help us decide on what a potential solution should be
>
> --
> -
> Noble Paul


Re: How to route requests to a specific core of a node hosting multiple shards?

2020-08-04 Thread Jason J Baik
Thanks for looking into this @Erick Erickson.
What'd be the proper way to get David Smiley's attention on this issue? A
JIRA ticket?

As for the performance difference, we haven't had a chance to test it.
We're still in the dev phase for migrating to solr 8, so we'll run our
benchmarks afterward, and try to see if it's a serious problem.

On Mon, Jul 20, 2020 at 10:43 AM Erick Erickson 
wrote:

> Hmm, ok.
>
> I’d have to defer to David Smiley about whether that was an intended
> change.
>
> I’m curious whether you can actually measure the difference in
> performance. If
> you can then that changes the urgency. Of course it’ll be a little more
> expensive
> for the replica serving shard2 on that machine to forward it to the replica
> serving shard1, but since it’s not going across the network IDK if it’s a
> consequential difference.
>
> Best,
> Erick
>
> > On Jul 20, 2020, at 10:04 AM, Jason J Baik 
> wrote:
> >
> > Our use case here is that we want to highlight a single document (against
> > user-provided keywords), and we know the document's unique key already.
> > So this is really not a distributed query, but more of a get by id, but
> we
> > use SolrClient.query() for highlighting capabilities.
> > And since we know the unique key, for speed gains, we've been making use
> of
> > the "_route_" param to limit the request to the shard containing the
> > document.
> >
> > Our use case aside, SOLR-11444
> > <https://issues.apache.org/jira/browse/SOLR-11444> generally seems to
> be at
> > odds with the advertised use of the "_route_" param
> >
> https://lucene.apache.org/solr/guide/7_5/solrcloud-query-routing-and-read-tolerance.html#_route_-parameter
> > .
> > Solr is routing the request to the correct "node", but it no longer
> routes
> > to the correct "shard" on that node?
> >
> >
> > On Mon, Jul 20, 2020 at 9:33 AM Erick Erickson 
> > wrote:
> >
> >> First I want to check if this is an XY problem. Why do you want to do
> this?
> >>
> >> If you’re using CloudSolrClient, requests are automatically load
> balanced.
> >> And
> >> even if you send a top-level request (assuming you do NOT set
> >> distrib=false),
> >> then the request may be forwarded to another Solr node anyway. This is
> to
> >> handle the case where people are sending requests to a specific node,
> you
> >> don’t
> >> really want that node doing all the aggregating.
> >>
> >> Of course if you’re using an external load balancer, you can avoid all
> >> that.
> >>
> >> I’m not sure what the value is of sending a general request to a
> specific
> >> core in the same JVM. A “node” is really Solr running in a JVM, so there
> >> may be multiple of these on a particular machine, but the resolution
> >> takes that into account.
> >>
> >> If you have reason to ping a specific replica _only_ (I’ve often done
> this
> >> for
> >> troubleshooting), address the full replica and add “distrib=false”, i.e.
> >> http://…../solr/collection1_shard1_replica1?q=*:*&distrib=false
> >>
> >> Best,
> >> Erick
> >>
> >>> On Jul 20, 2020, at 9:02 AM, Jason J Baik 
> >> wrote:
> >>>
> >>> Hi,
> >>>
> >>> After upgrading from Solr 6.6.2 to 7.6.0, we're seeing an issue with
> >>> request routing in CloudSolrClient. It seems that we've lost the
> ability
> >> to
> >>> route a request to a specific core of a node.
> >>>
> >>> For example, if a host is serving shard 1 core 1, and shard 2 core
> >>> 1, @6.6.2, adding a "_route_="
> >>> param was sufficient for CloudSolrClient to figure out the request
> should
> >>> go to shard 1 core 1, but @7.6.0, the request is routed to one of them
> >>> randomly.
> >>>
> >>> It seems the core-level url resolution has been removed from
> >>> CloudSolrClient at commit e001f352895c83652c3cf31e3c724d29a46bb721
> around
> >>> L1053, as part of SOLR-11444
> >>> <https://issues.apache.org/jira/browse/SOLR-11444>. The url the
> request
> >> is
> >>> sent to is now constructed only to the node level, and no longer to the
> >>> core level.
> >>>
> >>> There's a related issue for this at SOLR-10695
> >>> <https://issues.apache.org/jira/browse/SOLR-10695>, and SOLR-9063
> >>> <https://issues.apache.org/jira/browse/SOLR-9063> but not quite the
> >> same.
> >>> Can somebody please advise what the new way to achieve this nowadays
> is?
> >>
> >>
>
>


Re: bin/solr auth enable

2020-07-31 Thread Jason Gerlowski
Hi David,

I tried this out locally but couldn't reproduce. The command you
provided above works just fine for me.

Can you tell us a bit about your environment?  Do you have the full
stack trace of the NPE handy?

Best,

Jason

On Fri, Jul 24, 2020 at 8:01 PM David Glick  wrote:
>
> When I issue “bin/solr auth enable -prompt true -blockUnknown true”, I get a 
> Null Pointer Exception.  I’m using the 8.5.1 release.  Am I doing something 
> wrong?
>
> Thanks.
>
> Sent from my iPhone


Re: SOLR and Zookeeper compatibility

2020-07-22 Thread Jason Gerlowski
Hi Mithun,

AFAIK, Solr 7.5.0 comes with ZooKeeper 3.4.11.  At least, those are
the jar versions I see when I unpack a Solr 7.5.0 distribution.  Where
are you seeing 1.3.11?  There is no 1.3.11 ZooKeeper release as far as
I'm aware.  There must be some confusion here.

Generally speaking, since 3.4.11 is the version the community
primarily was testing with at the time of Solr 7.5.0's release, that's
also probably the safest version to use.  That said, users do
frequently choose other ZooKeeper versions within the same release
line (3.4.x) for one reason or another and don't report many issues
doing so.  A few of the exceptions are tracked in our JIRA portal and
you can get more info by searching there.

Best,

Jason


On Mon, Jul 13, 2020 at 5:24 AM Mithun Seal  wrote:
>
> Hi Team,
>
> Could you please help me with below compatibility question.
>
> 1. We are trying to install zookeeper externally along with SOLR 7.5.0. As
> noted, SOLR 7.5.0 comes with Zookeeper 1.3.11. Can I install Zookeeper
> 1.3.10 with SOLR 7.5.0. Zookeeper 1.3.10 will be compatible with SOLR 7.5.0?
>
> 2. What is the suggested version of Zookeeper should be used with SOLR
> 7.5.0?
>
>
> Thanks,
> Mithun


Re: How to route requests to a specific core of a node hosting multiple shards?

2020-07-20 Thread Jason J Baik
Our use case here is that we want to highlight a single document (against
user-provided keywords), and we know the document's unique key already.
So this is really not a distributed query, but more of a get by id, but we
use SolrClient.query() for highlighting capabilities.
And since we know the unique key, for speed gains, we've been making use of
the "_route_" param to limit the request to the shard containing the
document.
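
To make the request shape concrete, here is a rough sketch of the parameters such a routed highlighting query might carry. It assumes the compositeId router, a unique key field named "id", and a highlight field named "content"; none of these are confirmed in this thread, and the helper class is hypothetical.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical builder for a single-document highlighting request limited by _route_.
final class RoutedHighlightRequest {

    static String queryString(String docId, String routeKey, String keywords, String hlField) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", keywords);                   // user-provided keywords to highlight
        params.put("fq", "id:\"" + docId + "\"");    // restrict to the known document (assumed field "id")
        params.put("hl", "true");                    // turn highlighting on
        params.put("hl.fl", hlField);                // assumed highlight field
        params.put("_route_", routeKey);             // route key, e.g. "user1!" with compositeId
        return params.entrySet().stream()
                .map(e -> encode(e.getKey()) + "=" + encode(e.getValue()))
                .collect(Collectors.joining("&"));
    }

    private static String encode(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(queryString("user1!123", "user1!", "solr", "content"));
    }
}
```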

Our use case aside, SOLR-11444
<https://issues.apache.org/jira/browse/SOLR-11444> generally seems to be at
odds with the advertised use of the "_route_" param
https://lucene.apache.org/solr/guide/7_5/solrcloud-query-routing-and-read-tolerance.html#_route_-parameter
.
Solr is routing the request to the correct "node", but it no longer routes
to the correct "shard" on that node?


On Mon, Jul 20, 2020 at 9:33 AM Erick Erickson 
wrote:

> First I want to check if this is an XY problem. Why do you want to do this?
>
> If you’re using CloudSolrClient, requests are automatically load balanced.
> And
> even if you send a top-level request (assuming you do NOT set
> distrib=false),
> then the request may be forwarded to another Solr node anyway. This is to
> handle the case where people are sending requests to a specific node, you
> don’t
> really want that node doing all the aggregating.
>
> Of course if you’re using an external load balancer, you can avoid all
> that.
>
> I’m not sure what the value is of sending a general request to a specific
> core in the same JVM. A “node” is really Solr running in a JVM, so there
> may be multiple of these on a particular machine, but the resolution
> takes that into account.
>
> If you have reason to ping a specific replica _only_ (I’ve often done this
> for
> troubleshooting), address the full replica and add “distrib=false”, i.e.
> http://…../solr/collection1_shard1_replica1?q=*:*&distrib=false
>
> Best,
> Erick
>
> > On Jul 20, 2020, at 9:02 AM, Jason J Baik 
> wrote:
> >
> > Hi,
> >
> > After upgrading from Solr 6.6.2 to 7.6.0, we're seeing an issue with
> > request routing in CloudSolrClient. It seems that we've lost the ability
> to
> > route a request to a specific core of a node.
> >
> > For example, if a host is serving shard 1 core 1, and shard 2 core
> > 1, @6.6.2, adding a "_route_="
> > param was sufficient for CloudSolrClient to figure out the request should
> > go to shard 1 core 1, but @7.6.0, the request is routed to one of them
> > randomly.
> >
> > It seems the core-level url resolution has been removed from
> > CloudSolrClient at commit e001f352895c83652c3cf31e3c724d29a46bb721 around
> > L1053, as part of SOLR-11444
> > <https://issues.apache.org/jira/browse/SOLR-11444>. The url the request
> is
> > sent to is now constructed only to the node level, and no longer to the
> > core level.
> >
> > There's a related issue for this at SOLR-10695
> > <https://issues.apache.org/jira/browse/SOLR-10695>, and SOLR-9063
> > <https://issues.apache.org/jira/browse/SOLR-9063> but not quite the
> same.
> > Can somebody please advise what the new way to achieve this nowadays is?
>
>


How to route requests to a specific core of a node hosting multiple shards?

2020-07-20 Thread Jason J Baik
Hi,

After upgrading from Solr 6.6.2 to 7.6.0, we're seeing an issue with
request routing in CloudSolrClient. It seems that we've lost the ability to
route a request to a specific core of a node.

For example, if a host is serving shard 1 core 1, and shard 2 core
1, @6.6.2, adding a "_route_="
param was sufficient for CloudSolrClient to figure out the request should
go to shard 1 core 1, but @7.6.0, the request is routed to one of them
randomly.

It seems the core-level url resolution has been removed from
CloudSolrClient at commit e001f352895c83652c3cf31e3c724d29a46bb721 around
L1053, as part of SOLR-11444
<https://issues.apache.org/jira/browse/SOLR-11444>. The url the request is
sent to is now constructed only to the node level, and no longer to the
core level.

There's a related issue for this at SOLR-10695
<https://issues.apache.org/jira/browse/SOLR-10695>, and SOLR-9063
<https://issues.apache.org/jira/browse/SOLR-9063> but not quite the same.
Can somebody please advise what the new way to achieve this nowadays is?


Re: Restored collection cluster status rendering some values as Long (as opposed to String for other collections)

2020-06-25 Thread Jason Gerlowski
Hi Aliaksandr,

This sounds like a bug to me - I can't think of any reason why this
would be intentional behavior.  Maybe I'm missing something and this
is "expected", but if so someone will come along and correct me.
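
In the meantime, client code that reads these values may want to tolerate both representations. The sketch below is one possible workaround on the client side, not anything Solr provides; the class and method names are hypothetical.

```java
// Hypothetical client-side helper: accepts a cluster-status value that may arrive
// either as a number (e.g. Long) or as a String such as "1".
final class ClusterStatusValues {

    static int asInt(Object value) {
        if (value instanceof Number) {
            return ((Number) value).intValue();
        }
        // Falls through for Strings; a null or non-numeric value throws NumberFormatException.
        return Integer.parseInt(String.valueOf(value).trim());
    }

    public static void main(String[] args) {
        System.out.println(asInt(1L));
        System.out.println(asInt("1"));
    }
}
```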

Can you file a JIRA ticket with this information in it?

Jason

On Wed, Jun 24, 2020 at 10:03 AM Aliaksandr Asiptsou
 wrote:
>
> Sorry I forgot to mention: we use Solr 8.3.1
>
> Best regards,
> Aliaksandr Asiptsou
> From: Aliaksandr Asiptsou
> Sent: Wednesday, June 24, 2020 12:44 AM
> To: solr-user@lucene.apache.org
> Subject: Restored collection cluster status rendering some values as Long (as 
> opposed to String for other collections)
>
> Hello Solr experts,
>
> Our team noticed the below behavior:
>
> 1. A collection is restored from a backup, and a replication factor is 
> specified within the restore command:
>
> /solr/admin/collections?action=RESTORE=backup_name=/backups/solr=collection_name=config_name=1=1
>
> 2. Collection restored successfully, but looking into cluster status we see 
> several values are rendered as Long for this particular collection:
>
> /solr/admin/collections?action=clusterstatus&wt=xml
>
> 0
> 1
> 1
> false
> 1
> 0
> 138
>
> Whereas for all the other collections pullReplicas, replicationFactor, 
> nrtReplicas and tlogReplicas are Strings.
>
> Please advise whether it is known and expected or it needs to be fixed (if 
> so, is there a Jira ticket already for this or should we create one)?
>
> Best regards,
> Aliaksandr Asiptsou


Re: Index files on Windows fileshare

2020-06-25 Thread Jason Gerlowski
Hi Fiz,

Since you're just looking for a POC solution, I think Solr's
"bin/post" tool would probably help you achieve your first
requirement.

But I don't think "bin/post" gives you much control over the fields
that get indexed - if you need the file path to be stored, you might
be better off writing a small crawler in Java and using SolrJ to do
the indexing.
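
The crawling half of that idea needs nothing beyond the JDK. The sketch below only collects absolute paths of regular files under a root directory; sending each path (and the file content) to Solr via SolrJ is left out, and the class name and default root are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical crawler: lists the absolute path of every regular file under a root.
final class FileShareCrawler {

    static List<String> absolutePaths(Path root) {
        try (Stream<Path> walk = Files.walk(root)) {
            return walk.filter(Files::isRegularFile)
                    .map(p -> p.toAbsolutePath().toString())
                    .sorted()
                    .collect(Collectors.toList());
        } catch (IOException e) {
            // Surface I/O problems (e.g. an unreachable share) as unchecked exceptions.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The share location is passed as the first argument; "." is only a fallback.
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        for (String path : absolutePaths(root)) {
            // In a real crawler, this is where each path would be added to a Solr document.
            System.out.println(path);
        }
    }
}
```

Each returned string could then be stored in a dedicated field so the UI can link back to the file.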

Good luck!

Jason

On Fri, Jun 19, 2020 at 9:34 AM Fiz N  wrote:
>
> Hello Solr experts,
>
> I am using standalone version of SOLR 8.5 on Windows machine.
>
> 1)  I want to index all types of files under different directories in the
> file share.
>
> 2) I need to index the absolute path of each file and store it in a Solr field. I
> need that info so that the end user can click and open the file (pop-up).
>
> Could you please tell me how to go about this?
> This is for POC purposes; once we finalize the solution we will go
> ahead with a more stable approach.
>
> Thanks
> Fiz Nadian.


Re: [EXTERNAL] Re: Getting rid of Master/Slave nomenclature in Solr

2020-06-18 Thread Jason Gerlowski
+1 to rename master/slave, and +1 to choosing terminology distinct
from what's used for SolrCloud.  I could be happy with several of the
proposed options.  Since a good few have been proposed though, maybe
an eventual vote thread is the most organized way to aggregate the
opinions here.

I'm less positive about the prospect of changing the name of our
primary git branch.  Most projects that contributors might come from,
most tutorials out there to learn git, most tools built on top of git
- the majority are going to assume "master" as the main branch.  I
appreciate the change that Github is trying to effect in changing the
default for new projects, but it'll be a long time before that
competes with the huge bulk of projects, documentation, etc. out there
using "master".  Our contributors are smart and I'm sure they'd figure
it out if we used "main" or something else instead, but having a
non-standard git setup would be one more "papercut" in understanding
how to contribute to a project that already makes that harder than it
should be.

Jason


On Thu, Jun 18, 2020 at 7:33 AM Demian Katz  wrote:
>
> Regarding people having a problem with the word "master" -- GitHub is 
> changing the default branch name away from "master," even in isolation from a 
> "slave" pairing... so the terminology seems to be falling out of favor in all 
> contexts. See:
>
> https://www.cnet.com/news/microsofts-github-is-removing-coding-terms-like-master-and-slave/
>
> I'm not here to start a debate about the semantics of that, just to provide 
> evidence that in some communities, the term "master" is causing concern all 
> by itself. If we're going to make the change anyway, it might be best to get 
> it over with and pick the most appropriate terminology we can agree upon, 
> rather than trying to minimize the amount of change. It's going to be 
> backward breaking anyway, so we might as well do it all now rather than risk 
> having to go through two separate breaking changes at different points in 
> time.
>
> - Demian
>
> -Original Message-
> From: Noble Paul 
> Sent: Thursday, June 18, 2020 1:51 AM
> To: solr-user@lucene.apache.org
> Subject: [EXTERNAL] Re: Getting rid of Master/Slave nomenclature in Solr
>
> Looking at the code I see 692 occurrences of the word "slave".
> Mostly variable names and ref guide docs.
>
> The word "slave" is present in the responses as well. Any change in the 
> request param/response payload is backward incompatible.
>
> I have no objection to changing the names in ref guide and other internal 
> variables. Going ahead with backward incompatible changes is painful. If 
> somebody has the appetite to take it up, it's OK
>
> If we must change, master/follower can be a good enough option.
>
> master (noun): A man in charge of an organization or group.
> master(adj) : having or showing very great skill or proficiency.
> master(verb): acquire complete knowledge or skill in (a subject, technique, 
> or art).
> master (verb): gain control of; overcome.
>
> I hope nobody has a problem with the term "master"
>
> On Thu, Jun 18, 2020 at 3:19 PM Ilan Ginzburg  wrote:
> >
> > Would master/follower work?
> >
> > Half the rename work while still getting rid of the slavery connotation...
> >
> >
> > On Thu 18 Jun 2020 at 07:13, Walter Underwood  wrote:
> >
> > > > On Jun 17, 2020, at 4:00 PM, Shawn Heisey  wrote:
> > > >
> > > > It has been interesting watching this discussion play out on
> > > > multiple
> > > open source mailing lists.  On other projects, I have seen a VERY
> > > high level of resistance to these changes, which I find disturbing
> > > and surprising.
> > >
> > > Yes, it is nice to see everyone just pitch in and do it on this list.
> > >
> > > wunder
> > > Walter Underwood
> > > wun...@wunderwood.org
> > > > http://observer.wunderwood.org/  (my blog)
> > >
> > >
>
>
>
> --
> -
> Noble Paul


Re: Can't fetch table from cassandra through jdbc connection

2020-06-16 Thread Jason Gerlowski
The way I read the stack trace you provided, it looks like DIH is
running the query "select test_field from test_keyspace.test_table
limit 10", but the Cassandra jdbc driver is reporting that Cassandra
doesn't support some aspect of that query.  If I'm reading that right,
this seems like a question for the Cassandra folks who wrote that jdbc
driver instead of the Solr folks here.  Though maybe there's someone
here who happens to know.

The only thing I'd suggest to get more DIH logging would be to raise
the log levels for DIH classes, but from what you said above it sounds
like you already did that for the root logger and it didn't give you
anything that helped solve the issue.  So I'm stumped.

Good luck,

Jason

On Tue, Jun 16, 2020 at 6:05 AM Ирина Камалова  wrote:
>
> Could you please tell me if I can expand log trace here?
> (if I'm trying to do it through solr admin and make root log ALL - it
> doesn't help me)
>
>
> Best regards,
> Irina Kamalova
>
>
> On Mon, 15 Jun 2020 at 10:12, Ирина Камалова 
> wrote:
>
> > I’m using Solr 7.7.3 and latest Cassandra jdbc driver 1.3.5
> >
> > I get  *SQLFeatureNotSupportedException *
> >
> >
> > I see this error and have no idea what’s wrong (not enough verbose - table
> > name or field wrong/ couldn’t mapping type or driver doesn’t support?)
> >
> >
> > Full Import failed:java.lang.RuntimeException: java.lang.RuntimeException: 
> > org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to 
> > execute query: select test_field from test_keyspace.test_table limit 10; 
> > Processing Document # 1
> > at 
> > org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:271)
> > at 
> > org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:424)
> > at 
> > org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:483)
> > at 
> > org.apache.solr.handler.dataimport.DataImporter.lambda$runAsync$0(DataImporter.java:466)
> > at java.lang.Thread.run(Thread.java:748)
> > Caused by: java.lang.RuntimeException: 
> > org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to 
> > execute query: select test_field from test_keyspace.test_table limit 10; 
> > Processing Document # 1
> > at 
> > org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:417)
> > at 
> > org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:330)
> > at 
> > org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:233)
> > ... 4 more
> > Caused by: org.apache.solr.handler.dataimport.DataImportHandlerException: 
> > Unable to execute query: select test_field from test_keyspace.test_table 
> > limit 10; Processing Document # 1
> > at 
> > org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:69)
> > at 
> > org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.(JdbcDataSource.java:327)
> > at 
> > org.apache.solr.handler.dataimport.JdbcDataSource.createResultSetIterator(JdbcDataSource.java:288)
> > at 
> > org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:283)
> > at 
> > org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:52)
> > at 
> > org.apache.solr.handler.dataimport.SqlEntityProcessor.initQuery(SqlEntityProcessor.java:59)
> > at 
> > org.apache.solr.handler.dataimport.SqlEntityProcessor.nextRow(SqlEntityProcessor.java:73)
> > at 
> > org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:267)
> > at 
> > org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:476)
> > at 
> > org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:415)
> > ... 6 more
> > Caused by: java.sql.SQLFeatureNotSupportedException
> > at 
> > com.dbschema.CassandraConnection.createStatement(CassandraConnection.java:75)
> > at 
> > org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.createStatement(JdbcDataSource.java:342)
> > at 
> > org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.(JdbcDataSource.java:318)
> > ... 14 more
> >
> >
> >
> >
> > Best regards,
> > Irina Kamalova
> >


Re: HTTP 401 when searching on alias in secured Solr

2020-06-16 Thread Jason Gerlowski
Just wanted to close the loop here: Isabelle filed SOLR-14569 for this
and eventually reported there that the problem seems specific to her
custom configuration which specifies a seemingly innocuous
 in solrconfig.xml.

See that jira for more detailed explanation (and hopefully a
resolution coming soon).

On Wed, Jun 10, 2020 at 4:01 PM Jan Høydahl  wrote:
>
> Please share your security.json file
>
> Jan Høydahl
>
> > 10. jun. 2020 kl. 21:53 skrev Isabelle Giguere 
> > :
> >
> > Hi;
> >
> > I'm using Solr 8.5.0.  I have uploaded security.json to Zookeeper.  I can 
> > log in the Solr Admin UI.  I can create collections and aliases, and I can 
> > index documents in Solr.
> >
> > Collections : test1, test2
> > Alias: test (combines test1, test2)
> >
> > Indexed document "solr-word.pdf" in collection test1
> >
> > Searching on a collection works:
> > http://localhost:8983/solr/test1/select?q=*:*&wt=xml
> > 
> >
> > But searching on an alias results in HTTP 401
> > http://localhost:8983/solr/test/select?q=*:*&wt=xml
> >
> > Error from server at null: Expected mime type application/octet-stream but
> > got text/html.
> > Error 401 Authentication failed, Response code: 401
> > HTTP ERROR 401 Authentication failed, Response code: 401
> > URI: /solr/test1_shard1_replica_n1/select
> > STATUS: 401
> > MESSAGE: Authentication failed, Response code: 401
> > SERVLET: default
> >
> > Even if https://issues.apache.org/jira/browse/SOLR-13510 is fixed in Solr 
> > 8.5.0, I did try to start Solr with -Dsolr.http1=true, and I set 
> > "forwardCredentials":true in security.json.
> >
> > Nothing works.  I just cannot use aliases when Solr is secured.
> >
> > Can anyone confirm if this may be a configuration issue, or if this could 
> > possibly be a bug ?
> >
> > Thank you;
> >
> > Isabelle Giguère
> > Computational Linguist & Java Developer
> > Linguiste informaticienne & développeur java
> >
> >


Re: CDCR behaviour

2020-06-05 Thread Jason Gerlowski
Hi Daniel,

Just a heads up that attachments and images are stripped pretty
aggressively by the mailing list - none of your images made it through.
You might have more success linking to the images in Dropbox or some other
online storage medium.

Best,

Jason

On Thu, Jun 4, 2020 at 10:55 AM Gell-Holleron, Daniel <
daniel.gell-holle...@gb.unisys.com> wrote:

> Hi,
>
>
>
> Looking for some advice; I sent a few questions on CDCR over the last couple
> of days.
>
>
>
> I just want to see if this is expected behavior from Solr or not?
>
>
>
> When a document is added to Site A, it is then supposed to replicate
> across, however in the statistics page I see the following:
>
>
>
> Site A
>
>
>
>
> Site B
>
>
>
>
>
> When I perform a search on Site B through the Solr admin page, I do get
> results (which I find strange). The only way to get the num docs values to
> match is to restart Solr; I then get the below:
>
>
>
>
>
> I just want to know whether this behavior is expected or is a bug? My
> expectation is that the data will always be current between the two sites.
>
>
>
> Thanks,
>
> Daniel
>
>
>


Re: SolrCloud upgrade concern

2020-05-27 Thread Jason Gerlowski
Hi Arnold,

From what I saw in the community, CDCR saw an initial burst of
development around when it was contributed, but hasn't seen much
attention or improvement since.  So while it's been around for a few
years, I'm not sure it's improved much in terms of stability or
compatibility with other Solr features.

Some of the bigger ticket issues still open around CDCR:
- SOLR-11959 no support for basic-auth
- SOLR-12842 infinite retry of failed update-requests (leads to
sync/recovery problems)
- SOLR-12057 no real support for NRT/TLOG/PULL replicas
- SOLR-10679 no support for collection aliases

These are in addition to other more architectural issues: CDCR can be
a bottleneck on clusters with high ingestion rates, CDCR uses
full-index-replication more than traditional indexing setups, which
can cause issues with modern index sizes, etc.

So, unfortunately, no real good news in terms of CDCR maturing much in
recent releases.  Joel Bernstein filed a JIRA recently suggesting its
removal entirely actually.  Though I don't think it's gone anywhere.

That said, I gather from what you said that you're already using CDCR
successfully with Master-Slave.  If none of these pitfalls are biting
you in your current Master-Slave setup, you might not be bothered by
them any more in SolrCloud.  Most of the problems with CDCR are
applicable in master-slave as well as SolrCloud.  I wouldn't recommend
CDCR if you were starting from scratch, and I still recommend you
consider other options.  But since you're already using it with some
success, it might be an orthogonal concern to your potential migration
to SolrCloud.

Best of luck deciding!

Jason

On Fri, May 22, 2020 at 7:06 PM gnandre  wrote:
>
> Thanks for this reply, Jason.
>
> I am mostly worried about CDCR feature. I am relying heavily on it.
> Although, I am planning to use Solr 8.3. It has been a long time since CDCR
> was first introduced. I wonder what the state of CDCR is in 8.3. Is it
> stable now?
>
> On Wed, Jan 22, 2020, 8:01 AM Jason Gerlowski  wrote:
>
> > Hi Arnold,
> >
> > The stability and complexity issues Mark highlighted in his post
> > aren't just imagined - there are real, sometimes serious, bugs in
> > SolrCloud features.  But at the same time there are many many stable
> > deployments out there where SolrCloud is a real success story for
> > users.  Small example, I work at a company (Lucidworks) where our main
> > product (Fusion) is built heavily on top of SolrCloud and we see it
> > deployed successfully every day.
> >
> > In no way am I trying to minimize Mark's concerns (or David's).  There
> > are stability bugs.  But the extent to which those need affect you
> > depends a lot on what your deployment looks like.  How many nodes?
> > How many collections?  How tightly are you trying to squeeze your
> > hardware?  Is your network flaky?  Are you looking to use any of
> > SolrCloud's newer, less stable features like CDCR, etc.?
> >
> > Is SolrCloud better for you than Master/Slave?  It depends on what
> > you're hoping to gain by a move to SolrCloud, and on your answers to
> > some of the questions above.  I would be leery of following any
> > recommendations that are made without regard for your reason for
> > switching or your deployment details.  Those things are always the
> > biggest driver in terms of success.
> >
> > Good luck making your decision!
> >
> > Best,
> >
> > Jason
> >


Re: Query takes more time in Solr 8.5.1 compare to 6.1.0 version

2020-05-21 Thread Jason Gerlowski
Hi Jay,

I can't speak to why you're seeing a performance change between 6.x
and 8.x.  What I can suggest though is an alternative way of
formulating the query: you might get different performance if you run
your query using Solr's "terms" query parser:
https://lucene.apache.org/solr/guide/8_5/other-parsers.html#terms-query-parser
 It's not guaranteed to help, but there's a chance it'll work for you.
And knowing whether or not it helps might point others here towards
the cause of your slowdown.
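
A minimal sketch of what the "terms" variant could look like, assuming
(hypothetically) that the ids live in a field called "id" and that you
group on a field called "category"; neither name comes from your query,
so substitute your own.  It only builds the request parameters:

```python
from urllib.parse import urlencode


def terms_filter(field, values):
    """Build a {!terms} filter string for the given field and values."""
    # The terms parser expects a comma-separated list by default, so values
    # that themselves contain commas are not handled by this sketch.
    return "{!terms f=%s}%s" % (field, ",".join(str(v) for v in values))


# Hypothetical ids and field names, for illustration only.
ids = [101, 102, 103]
params = {
    "q": "*:*",
    "fq": terms_filter("id", ids),
    "group": "true",
    "group.field": "category",
}

# URL-encoded parameter string, usable as a query string or a POST body.
body = urlencode(params)
print(body)
```

With around 3000 ids you would probably want to send this as a POST body
rather than in the URL.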

Even if "terms" performs better for you, it's probably worth
understanding what's going on here of course.

Are all other queries running comparably?

Jason

On Thu, May 21, 2020 at 10:25 AM jay harkhani  wrote:
>
> Hello,
>
> Please refer to the details below.
>
> >Did you create Solrconfig.xml for the collection from scratch after 
> >upgrading and reindexing?
> Yes, we created the collection from scratch and also re-indexed.
>
> >Was it based on the latest template?
> Yes, it was based on the latest template.
>
> >What happens if you reexecute the query?
> Not much visible difference. Only a minor change in milliseconds.
>
> >Are there other processes/containers running on the same VM?
> No
>
> >How much heap and how much total memory you have?
> My heap and total memory are the same as with Solr 6.1.0: 5 GB heap and
> 25 GB total memory. As far as I can tell there is no issue related to memory.
>
> >Maybe also you need to increase the corresponding caches in the config.
> We are not using caches in either version.
>
> Both versions have the same configuration.
>
> Regards,
> Jay Harkhani.
>
> 
> From: Jörn Franke 
> Sent: Thursday, May 21, 2020 7:05 PM
> To: solr-user@lucene.apache.org 
> Subject: Re: Query takes more time in Solr 8.5.1 compare to 6.1.0 version
>
> Did you create Solrconfig.xml for the collection from scratch after upgrading 
> and reindexing? Was it based on the latest template?
> If not then please try this. Maybe also you need to increase the 
> corresponding caches in the config.
>
> What happens if you reexecute the query?
>
> Are there other processes/containers running on the same VM?
>
> How much heap and how much total memory you have? You should only have a 
> minor fraction of the memory as heap and most of it „free“ (this means it is 
> used for file caches).
>
>
>
> > Am 21.05.2020 um 15:24 schrieb vishal patel :
> >
> > Is anyone looking at this issue?
> > I have the same issue.
> >
> > Regards,
> > Vishal Patel
> >
> >
> >
> > 
> > From: jay harkhani 
> > Sent: Wednesday, May 20, 2020 7:39 PM
> > To: solr-user@lucene.apache.org 
> > Subject: Query takes more time in Solr 8.5.1 compare to 6.1.0 version
> >
> > Hello,
> >
> > I recently upgraded Solr from 6.1.0 to 8.5.1 and came across one 
> > issue. A query that has many ids (around 3000) and applies grouping 
> > takes more time to execute. In Solr 6.1.0 it takes 677ms and in Solr 8.5.1 
> > it takes 26090ms. When taking these readings we had the same Solr schema
> > and the same no. of records in both Solr versions.
> >
> > Please refer to the details below for the query, logs and thread dump
> > (generated from Solr Admin while executing the query).
> >
> > Query : 
> > https://drive.google.com/file/d/1bavCqwHfJxoKHFzdOEt-mSG8N0fCHE-w/view
> >
> > Logs and Thread dump stack trace
> > Solr 8.5.1 : 
> > https://drive.google.com/file/d/149IgaMdLomTjkngKHrwd80OSEa1eJbBF/view
> > Solr 6.1.0 : 
> > https://drive.google.com/file/d/13v1u__fM8nHfyvA0Mnj30IhdffW6xhwQ/view
> >
> > Analysing further, we found that if we remove the grouping field or 
> > reduce the no. of ids in the query it executes fast. Did anything change in 
> > 8.5.1 compared to 6.1.0? In 6.1.0 the query runs faster even with a large 
> > no. of ids along with grouping.
> >
> > Can someone please help to isolate this issue?
> >
> > Regards,
> > Jay Harkhani.


Re: Rule-Based Auth - update not working

2020-05-17 Thread Jason Gerlowski
One slight correction: I missed that you actually do have a
path/collection-specific permission in your list there.  So Solr will
check the permissions in descending list-order for most requests - the
exception being /luke requests when the /luke permission filters to
the top and is checked first.

We should really change this resolution order to be something more commonsense.

Jason

On Sun, May 17, 2020 at 2:52 PM Jason Gerlowski  wrote:
>
> Hi Isabelle,
>
> Two things to keep in mind with Solr's Rule-Based Authorization.
>
> 1. Each request is controlled by the first permission that matches
> the request.
> 2. With the permissions you have present, Solr will check them in
> descending list order.  (This isn't always true - collection-specific
> and path-specific permissions are given precedence, so you don't need
> to consider that.)
>
> As you can imagine given the rules above - permission order is very
> important.  In your case the "all" rule will match pretty much all
> requests, which explains why an "indexing" user can't actually index.
> Generally speaking, it's best to put the most specific rules first,
> with the broader ones coming later.
>
> For more information, see the "Permission Ordering and Resolution"
> section in the page you linked to in your request.
>
> Good luck, hope that helps.
>
> Jason
>
> On Tue, May 12, 2020 at 12:34 PM Isabelle Giguere
>  wrote:
> >
> > Hi;
> >
> > I'm using Solr 8.5.0.
> >
> > I'm having trouble setting up some permissions using the rule-based 
> > authorization plugin: 
> > https://lucene.apache.org/solr/guide/8_5/rule-based-authorization-plugin.html
> >
> > I have 3 users: "admin", "search", and "indexer".
> >
> > I have set permissions and user roles:
> > "permissions": [  {  "name": "all", "role": "admin", "index": 1  },
> >   { "name": "admin-luke", "collection": "*", "role": "luke", "index": 
> > 2, "path": "/admin/luke"  },
> >   { "name": "read", "role": "searching", "index": 3  },
> >   {  "name": "update", "role": "indexing", "index": 4 }],
> > "user-role": {  "admin": "admin",
> >   "search": ["searching","luke"],
> >   "indexer": "indexing"   }  }
> > Attached: full output of GET /admin/authorization
> >
> > So why can't user "indexer" add anything in a collection ?  I always get 
> > HTTP 403 Forbidden.
> > Using Postman, I click the checkbox to show the password, so I'm sure I 
> > typed the right one.
> >
> > Note that user "search" can't use the /select handler either, as should be 
> > the case with permission to "read".   This user can, however, use the Luke 
> > handler, as the custom permission allows.
> >
> > User "admin" can use any API.  So at least the predefined permission "all" 
> > does work.
> >
> > Note that the collections were created before enabling authentication and 
> > authorization.  Could that be the cause of the permission issues ?
> >
> > Thanks;
> >
> > Isabelle Giguère
> > Computational Linguist & Java Developer
> > Linguiste informaticienne & développeur java
> >
> >


Re: Rule-Based Auth - update not working

2020-05-17 Thread Jason Gerlowski
Hi Isabelle,

Two things to keep in mind with Solr's Rule-Based Authorization.

1. Each request is controlled by the first permission that matches
the request.
2. With the permissions you have present, Solr will check them in
descending list order.  (This isn't always true - collection-specific
and path-specific permissions are given precedence, so you don't need
to consider that.)

As you can imagine given the rules above - permission order is very
important.  In your case the "all" rule will match pretty much all
requests, which explains why an "indexing" user can't actually index.
Generally speaking, it's best to put the most specific rules first,
with the broader ones coming later.
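
To make the ordering point concrete, here is a toy model of "the first
matching permission decides".  It is only a sketch and not Solr's actual
code: the matching logic, the request shape and the "kind"/"path" keys
are assumptions made up for illustration, and it ignores the precedence
given to collection-specific and path-specific permissions.

```python
# Toy model of "first matching permission decides". This is NOT Solr's code,
# and it ignores the precedence Solr gives to path/collection-specific rules.

def matches(permission, request):
    """Return True if this (simplified) permission applies to the request."""
    if permission["name"] == "all":
        return True                      # "all" covers every request
    if "path" in permission:
        return request["path"] == permission["path"]
    return permission["name"] == request["kind"]   # "read" or "update"


def is_allowed(permissions, user_roles, user, request):
    """Check the user against the first permission that matches the request."""
    roles = user_roles.get(user, [])
    if isinstance(roles, str):
        roles = [roles]
    for permission in permissions:       # list order is the check order
        if matches(permission, request):
            return permission["role"] in roles
    return None                          # no rule matched; out of scope here


user_roles = {"admin": "admin", "search": ["searching", "luke"], "indexer": "indexing"}

original = [
    {"name": "all", "role": "admin"},
    {"name": "admin-luke", "role": "luke", "path": "/admin/luke"},
    {"name": "read", "role": "searching"},
    {"name": "update", "role": "indexing"},
]
# Same rules, most specific first and the catch-all "all" rule last.
reordered = original[1:] + original[:1]

update_request = {"kind": "update", "path": "/update"}
print(is_allowed(original, user_roles, "indexer", update_request))   # False
print(is_allowed(reordered, user_roles, "indexer", update_request))  # True
```

One side effect worth noticing in this toy model: once "update" sits above
"all", the "admin" user is checked against the "update" rule too, so that
user would also need the "indexing" role to keep indexing.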

For more information, see the "Permission Ordering and Resolution"
section in the page you linked to in your request.

Good luck, hope that helps.

Jason

On Tue, May 12, 2020 at 12:34 PM Isabelle Giguere
 wrote:
>
> Hi;
>
> I'm using Solr 8.5.0.
>
> I'm having trouble setting up some permissions using the rule-based 
> authorization plugin: 
> https://lucene.apache.org/solr/guide/8_5/rule-based-authorization-plugin.html
>
> I have 3 users: "admin", "search", and "indexer".
>
> I have set permissions and user roles:
> "permissions": [  {  "name": "all", "role": "admin", "index": 1  },
>   { "name": "admin-luke", "collection": "*", "role": "luke", "index": 2, 
> "path": "/admin/luke"  },
>   { "name": "read", "role": "searching", "index": 3  },
>   {  "name": "update", "role": "indexing", "index": 4 }],
> "user-role": {  "admin": "admin",
>   "search": ["searching","luke"],
>   "indexer": "indexing"   }  }
> Attached: full output of GET /admin/authorization
>
> So why can't user "indexer" add anything in a collection ?  I always get HTTP 
> 403 Forbidden.
> Using Postman, I click the checkbox to show the password, so I'm sure I typed 
> the right one.
>
> Note that user "search" can't use the /select handler either, as should be 
> the case with permission to "read".   This user can, however, use the Luke 
> handler, as the custom permission allows.
>
> User "admin" can use any API.  So at least the predefined permission "all" 
> does work.
>
> Note that the collections were created before enabling authentication and 
> authorization.  Could that be the cause of the permission issues ?
>
> Thanks;
>
> Isabelle Giguère
> Computational Linguist & Java Developer
> Linguiste informaticienne & développeur java
>
>


Re: SolrJ connection leak with SolrCloud and Jetty Gzip compression enabled

2020-04-22 Thread Jason Gerlowski
Hi Samuel,

Thanks for the very detailed description of the problem here.  Very
thorough!  I don't think you're missing anything obvious, please file the
jira tickets if you haven't already.

Best,

Jason

On Mon, Apr 13, 2020 at 6:12 PM Samuel Garcia Martinez <
samuel...@inditex.com> wrote:

> Reading the last two paragraphs again, I realized that those two
> especially are very poorly worded (grammar). I tried to rephrase them
> and correct some of the errors below.
>
> Here I can see three different problems:
>
> * HttpSolrCall should not use HttpServletResponse#setCharacterEncoding to
> set the Content-Encoding header. This is obviously a mistake.
> * HttpSolrClient, specifically the HttpClientUtil, should be modified so
> that the connection is not leaked forever when the Content-Encoding header
> lies about the actual content. It should still throw the exception though.
> * HttpSolrClient should allow clients to customize HttpClient's
> connectionRequestTimeout, preventing the application to be blocked forever
> waiting for a connection to be available. This way, the application could
> respond to requests that won’t use Solr instead of rejecting any incoming
> requests because all threads are blocked forever for a connection that
> won’t be available ever.
>
> I think the two first points are bugs that should be fixed.  The third one
> is a feature improvement to me.
>
> Unless I missed something, I'll file the two bugs and provide a patch for
> them. The same goes for the feature improvement.
>
> 
> From: Samuel Garcia Martinez 
> Sent: Monday, April 13, 2020 10:08:36 PM
> To: solr-user@lucene.apache.org 
> Subject: SolrJ connection leak with SolrCloud and Jetty Gzip compression
> enabled
>
> Hi!
>
> Today, I've seen a weird issue in production workloads when the gzip
> compression was enabled. After some minutes, the client app ran out of
> connections and stopped responding.
>
> The cluster setup is pretty simple:
> Solr version: 7.7.2
> Solr cloud enabled
> Cluster topology: 6 nodes, 1 single collection, 10 shards and 3 replicas.
> 1 HTTP LB using Round Robin over all nodes
> All cluster nodes have gzip enabled for all paths, all HTTP verbs and all
> MIME types.
> Solr client: HttpSolrClient targeting the HTTP LB
>
> Problem description: when the Solr node that receives the request has to
> forward the request to a Solr Node that actually can perform the query, the
> response headers are added incorrectly to the client response, causing the
> SolrJ client to fail and to never release the connection back to the pool.
>
> To simplify the case, let's try to start from the following repro scenario:
>
>   *   Start one node with cloud mode and port 8983
>   *   Create one single collection (1 shard, 1 replica)
>   *   Start another node with port 8984 and the previously started zk (-z
> localhost:9983)
>   *   Start a java application and query the cluster using the node on
> port 8984 (the one that doesn't host the collection)
>
> So, the steps occur like:
>
>   *   The application queries node:8984 with compression enabled
> ("Accept-Encoding: gzip") and wt=javabin
>   *   Node:8984 can't perform the query and creates a http request behind
> the scenes to node:8983
>   *   Node:8983 returns a gzipped response with "Content-Encoding: gzip"
> and "Content-Type: application/octet-stream"
>   *   Node:8984 adds the "Content-Encoding: gzip" header as character
> stream to the response (it should be forwarded as "Content-Encoding"
> header, not character encoding)
>   *   HttpSolrClient receives a "Content-Type:
> application/octet-stream;charset=gzip", causing an exception.
>   *   HttpSolrClient tries to quietly close the connection, but since the
> stream is broken, the Utils.consumeFully fails to actually consume the
> entity (it throws another exception in GzipDecompressingEntity#getContent()
> with "not in GZIP format")
>
> The exception thrown by HttpSolrClient is:
> java.nio.charset.UnsupportedCharsetException: gzip
>at java.nio.charset.Charset.forName(Charset.java:531)
>at org.apache.http.entity.ContentType.create(ContentType.java:271)
>  

Re: Request Tracking in Solr

2020-04-01 Thread Jason Gerlowski
Hi Prakhar,

Newer versions of Solr offer an "Audit Logging" plugin for use cases
similar to yours.
https://lucene.apache.org/solr/guide/8_1/audit-logging.html

I don't think that's available as far back as 5.2.1 though.  Just
thought I'd mention it in case upgrading is an option.

Best,

Jason

On Wed, Apr 1, 2020 at 2:29 AM Prakhar Kumar
 wrote:
>
> Hello Folks,
>
> I'm looking for a way to track requests in Solr from a
> particular user/client. Suppose, I've created a user, say *Client1*, using
> the basic authentication/authorization plugin. Now I want to get a count of
> the number of requests/queries made by *Client1* on the Solr server.
> Looking forward to some valuable suggestions.
>
> P.S. We are using Solr 5.2.1
>
> --
> Kind Regards,
> Prakhar Kumar
> Sr. Enterprise Software Engineer
>
> *HotWax Systems*
> *Enterprise open source experts*
> cell: +91-89628-81820
> office: 0731-409-3684
> http://www.hotwaxsystems.com


Re: Checking in on Solr Progress

2020-03-02 Thread Jason Gerlowski
Very low-tech and manual, but worth mentioning...

If there's a particularly large core that's doing a full recovery, and
you have access to the disk itself you can navigate to the relevant
directory for that core and run something like "watch -n 10 ls -lah"
or "watch -n 10 du -sh ." to see how the data transfer is going.
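
If "watch" isn't available on the box, roughly the same check can be
sketched in Python.  This is just an illustration: the function names are
my own, and the path in the usage note is a made-up example rather than
anything Solr guarantees.

```python
import os
import time


def dir_size_bytes(path):
    """Sum the sizes of all files under path (roughly what 'du -s' reports)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.path.getsize(os.path.join(root, name))
            except OSError:
                pass  # file disappeared between listing and stat; skip it
    return total


def watch_dir(path, interval_s=10, rounds=3):
    """Print the directory size every interval_s seconds, with the change."""
    previous = None
    for _ in range(rounds):
        size = dir_size_bytes(path)
        delta = "" if previous is None else " (%+d bytes)" % (size - previous)
        print("%s: %d bytes%s" % (path, size, delta))
        previous = size
        time.sleep(interval_s)
```

Usage would be something like watch_dir("/var/solr/data/mycore/data"),
with the path swapped for wherever the recovering core actually lives.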

On Fri, Feb 7, 2020 at 11:16 AM Walter Underwood  wrote:
>
> I wrote some Python that checks CLUSTERSTATUS and reports replica status to 
> Telegraf. Great for charts and alerts, but it only shows status, not progress.
>
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
>
> > On Feb 7, 2020, at 7:58 AM, Erick Erickson  wrote:
> >
> > I was wondering about using metrics myself. I confess I didn’t look to see 
> > what was already there either ;)
> >
> > Actually, using metrics might be easiest all told, but I also confess I 
> > have no clue what it takes to build a new metric in. Nor how to use the 
> > same (?) collection process for the 5 situations I outlined, and those just 
> > off the top of my head.
> >
> > It’s particularly frustrating when diagnosing these not knowing whether the 
> > “recovering” state is going to resolve itself sometime or not. I’ve seen 
> > Solr replicas stuck in that state forever….
> >
> > Andrzej could certainly shed some light on that question.
> >
> > All ideas welcome of course!
> >
> >> On Feb 7, 2020, at 10:40 AM, Jan Høydahl  wrote:
> >>
> >> Could we expose some high level recovery info as part of metrics api? Then 
> >> people could track number of cores recovering, recovery time, recovery 
> >> phase, number of recoveries failed etc, and also build alerts on top of 
> >> that.
> >>
> >> Jan Høydahl
> >>
> >>> 6. feb. 2020 kl. 19:42 skrev Erick Erickson :
> >>>
> >>> There’s actually a crying need for this, but there’s nothing that’s 
> >>> there yet, basically you have to look at the log files and try to figure 
> >>> it out.
> >>>
> >>> Actually I think this would be a great thing to work on, but it’d be 
> >>> pretty much all new. If you’d like, you can create a Solr Improvement 
> >>> Proposal here: 
> >>> https://cwiki.apache.org/confluence/display/SOLR/SIP+Template to flesh 
> >>> out what this would look like.
> >>>
> >>> A couple of thoughts off the top of my head:
> >>>
> >>> I really think what would be most useful would be a collections API 
> >>> command, something like “RECOVERYSTATUS”, or maybe extend CLUSTERSTATUS. 
> >>> Currently a replica can be stuck in recovery and never get out. There are 
> >>> several scenarios that’d have to be considered:
> >>>
> >>> 1> normal startup. The replica briefly goes from down->recovering->active 
> >>> which should be quite brief.
> >>> 1a> Waiting for a leader to be elected before continuing
> >>>
> >>> 2> “peer sync” where another replica is replaying documents from the tlog.
> >>>
> >>> 3> situations where the replica is replaying documents from its own tlog. 
> >>> This can be very, very, very long too.
> >>>
> >>> 4> full sync where it’s copying the entire index from a leader.
> >>>
> >>> 5> knickers in a knot, it’s given up even trying to recover.
> >>>
> >>> In either case, you’d want to report “all ok” if nothing was in recovery, 
> >>> “just the ones having trouble” and “everything because I want to look”.
> >>>
> >>> But like I said, there’s nothing really built into the system to 
> >>> accomplish this now that I know of.
> >>>
> >>> Best,
> >>> Erick
> >>>
>  On Feb 6, 2020, at 12:15 PM, dj-manning  
>  wrote:
> 
>  Erick Erickson wrote
> > When you say “look”, where are you looking from? Http requests? SolrJ? 
> > The
> > admin UI?
> 
>  I'm open to looking from anywhere - http request, or the admin UI, or
>  following a log if possible.
> 
>  My objective for this ask would be to human interactively follow/watch
>  solr's recovery progress - if that's even possible.
> 
>  Stretch goal would be to autonomously report on recovery progress.
> 
>  The question stems from seeing recovery in log or the admin UI, then
>  wondering what progress is.
> 
>  Appreciation.
> 
> 
> 
> 
>  --
>  Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html
> >>>
> >
>


Re: Replica type affinity

2020-02-03 Thread Jason Gerlowski
This is a bit of a guess - I haven't used this functionality before.
But to a novice the "tag" Rule Condition for "Rule Based Replica
Placement" sounds similar to the requirements you mentioned above.

https://lucene.apache.org/solr/guide/8_3/rule-based-replica-placement.html#rule-conditions

Good luck,

Jason

On Thu, Jan 30, 2020 at 1:00 PM Karl Stoney
 wrote:
>
> Hey,
> Thanks for the reply but I'm trying to have something fully automated and 
> dynamic.  For context I run solr on kubernetes, and at the moment it works 
> beautifully with autoscaling (i can scale up the kubernetes deployment and 
> solr adds replicas and removes them).
>
> I'm trying to add a new type of node though, backed by very fast but 
> ephemeral disks and the idea was to have only PULL replicas running on those 
> nodes automatically and NRT on the persistent disk instances.
>
> Might be a pipe dream but I'm striving for no manual configuration.
> 
> From: Edward Ribeiro 
> Sent: 30 January 2020 16:56
> To: solr-user@lucene.apache.org 
> Subject: Re: Replica type affinity
>
> Hi Karl,
>
> During collection creation you can specify the `createNodeSet` parameter as
> specified by the Solr Reference Guide snippet below:
>
> "createNodeSet
> Allows defining the nodes to spread the new collection across. The format
> is a comma-separated list of node_names, such as
> localhost:8983_solr,localhost:8984_solr,localhost:8985_solr.
> If not provided, the CREATE operation will create shard-replicas spread
> across all live Solr nodes.
> Alternatively, use the special value of EMPTY to initially create no
> shard-replica within the new collection and then later use the ADDREPLICA
> operation to add shard-replicas when and where required."
>
>
> You can also use the node parameter of the Collections API's ADDREPLICA
> command to specify the node that a replica should be created on.
> See:
> https://lucene.apache.org/solr/guide/6_6/collections-api.html#CollectionsAPI-Input.9
> Other commands that can be useful are REPLACENODE and MOVEREPLICA.
>
> Edward
>
>
> On Thu, Jan 30, 2020 at 1:00 PM Karl Stoney
>  wrote:
>
> > Hey everyone,
> > Does anyone know of a way to have solr replicas assigned to specific nodes
> > by some sort of identifying value (in solrcloud).
> >
> > In summary I’m trying to have some read-only replicas only ever be
> > assigned to nodes named “solr-ephemeral-x” and my nrt and masters assigned
> > to “solr-index”.
> >
> > Kind of like rack affinity in elasticsearch!


Re: Solr fact response strange behaviour

2020-01-29 Thread Jason Gerlowski
Thanks Adi,

There's no SolrJ code in your stacktrace, so this was something other
than SOLR-13780 apparently.  Best of luck!

Jason

On Wed, Jan 29, 2020 at 1:28 PM Kaminski, Adi  wrote:
>
> Sure, thanks for the guidance and the assistance anyway.
>
> Here is the stack trace:
> [29/01/20 08:09:41:041 IST] [http-nio-8080-exec-2] ERROR api.BaseAPI: There was an Exception calling Solr
> java.lang.ClassCastException: java.lang.Integer cannot be cast to java.lang.Long
>     at com.productcore.analytics.api.AutoCompleteAPI.lambda$mapSolrResponse$0(AutoCompleteAPI.java:170) ~[classes/:?]
>     at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382) ~[?:1.8.0_201]
>     at java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:580) ~[?:1.8.0_201]
>     at com.productcore.analytics.api.AutoCompleteAPI.mapSolrResponse(AutoCompleteAPI.java:167) ~[classes/:?]
>     at com.productcore.analytics.api.BaseAPI.execute(BaseAPI.java:48) [classes/:?]
>     at com.productcore.analytics.controllers.DalController.getAutocomplete(DalController.java:205) [classes/:?]
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_201]
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_201]
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_201]
>     at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_201]
>     at org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:189) [spring-web-5.1.6.RELEASE.jar:5.1.6.RELEASE]
>     at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:138) [spring-web-5.1.6.RELEASE.jar:5.1.6.RELEASE]
>     at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:102) [spring-webmvc-5.1.6.RELEASE.jar:5.1.6.RELEASE]
>     at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:892) [spring-webmvc-5.1.6.RELEASE.jar:5.1.6.RELEASE]
>     at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:797) [spring-webmvc-5.1.6.RELEASE.jar:5.1.6.RELEASE]
>     at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:87) [spring-webmvc-5.1.6.RELEASE.jar:5.1.6.RELEASE]
>     at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:1038) [spring-webmvc-5.1.6.RELEASE.jar:5.1.6.RELEASE]
>     at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:942) [spring-webmvc-5.1.6.RELEASE.jar:5.1.6.RELEASE]
>     at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:1005) [spring-webmvc-5.1.6.RELEASE.jar:5.1.6.RELEASE]
>     at org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:908) [spring-webmvc-5.1.6.RELEASE.jar:5.1.6.RELEASE]
>     at javax.servlet.http.HttpServlet.service(HttpServlet.java:660) [tomcat-embed-core-9.0.17.jar:9.0.17]
>     at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:882) [spring-webmvc-5.1.6.RELEASE.jar:5.1.6.RELEASE]
>     at javax.servlet.http.HttpServlet.service(HttpServlet.java:741) [tomcat-embed-core-9.0.17.jar:9.0.17]
>     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:231) [tomcat-embed-core-9.0.17.jar:9.0.17]
>     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166) [tomcat-embed-core-9.0.17.jar:9.0.17]
>     at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:53) [tomcat-embed-websocket-9.0.17.jar:9.0.17]
>     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193) [tomcat-embed-core-9.0.17.jar:9.0.17]
>     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166) [tomcat-embed-core-9.0.17.jar:9.0.17]
>     at org.springframework.boot.actuate.web.trace.servlet.HttpTraceFilter.doFilterInternal(HttpTraceFilter.java:90) [spring-boot-actuator-2.1.4.RELEASE.jar:2.1.4.RELEASE]
>     at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107) [spring-web-5.1.6.RELEASE.jar:5.1.6.RELEASE]
>     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193) [tomcat-embed-core-9.0.17.jar:9.0.17]

Re: Solr fact response strange behaviour

2020-01-29 Thread Jason Gerlowski
Hey Adi,

There was a separate JIRA for this on the SolrJ objects it sounds like
you're using: SOLR-13780.  That JIRA was fixed, apparently in 8.3, so
I'm surprised you're still seeing the issue.  If you include the full
stacktrace and a snippet of code to reproduce, I'm curious to take a
look.

That won't help you in the short term though.  For that, yes, you'll
have to use ((Number)count).longValue() in the interim.

Best,

Jason

On Tue, Jan 28, 2020 at 2:20 AM Kaminski, Adi  wrote:
>
> Thanks Mikhail  !
>
> In issue comments that you have shared it seems that Yonik S doesn't agree 
> it's a defect...so probably will remain opened for a while.
>
>
>
> So meanwhile, is it recommended to perform casting 
> ((Number)count).longValue()  to our relevant logic ?
>
>
>
> Thanks,
> Adi
>
>
>
> -Original Message-
> From: Mikhail Khludnev 
> Sent: Tuesday, January 28, 2020 9:14 AM
> To: solr-user 
> Subject: Re: Solr fact response strange behaviour
>
>
>
> https://issues.apache.org/jira/browse/SOLR-11775
>
>
>
> On Tue, Jan 28, 2020 at 10:00 AM Kaminski, Adi 
> mailto:adi.kamin...@verint.com>>
>
> wrote:
>
>
>
> > Is it existing issue and tracked for future fix consideration ?
>
> >
>
> > What's the suggestion as W/A until fix - to case every related
>
> > response with ((Number)count).longValue() ?
>
> >
>
> > -Original Message-
>
> > From: Mikhail Khludnev mailto:m...@apache.org>>
>
> > Sent: Tuesday, January 28, 2020 8:53 AM
>
> > To: solr-user 
> > mailto:solr-user@lucene.apache.org>>
>
> > Subject: Re: Solr fact response strange behaviour
>
> >
>
> > I suppose there's an issue, which no one ever took a look.
>
> >
>
> > https://lucene.472066.n3.nabble.com/JSON-facets-count-a-long-or-an-int
>
> > eger-in-cloud-and-non-cloud-modes-td4265291.html
>
> >
>
> >
>
> > On Mon, Jan 27, 2020 at 11:47 PM Kaminski, Adi
>
> > mailto:adi.kamin...@verint.com>>
>
> > wrote:
>
> >
>
> > > SolrJ client is used of SolrCloud of Solr 8.3 version for JSON
>
> > > Facets requests...any idea why not consistent ?
>
> > >
>
> > > Sent from Workspace ONE Boxer
>
> > >
>
> > > On Jan 27, 2020 22:13, Mikhail Khludnev 
> > > mailto:m...@apache.org>> wrote:
>
> > > Hello,
>
> > > It might be different between SolrCloud and standalone mode. No data
>
> > > enough to make a conclusion.
>
> > >
>
> > > On Mon, Jan 27, 2020 at 5:40 PM Rudenko, Artur
>
> > > mailto:artur.rude...@verint.com>>
>
> > > wrote:
>
> > >
>
> > > > I'm trying to parse facet response, but sometimes the count
>
> > > > returns as Long type and sometimes as Integer type(on different
>
> > > > environments), The error is:
>
> > > > "java.lang.ClassCastException: java.lang.Integer cannot be cast to
>
> > > > java.lang.Long"
>
> > > >
>
> > > > Can you please explain why this happenes? Why it not consistent?
>
> > > >
>
> > > > I know the workaround to use Number class and longValue method but
>
> > > > I want to to the root cause before using this workaround
>
> > > >
>
> > > > Artur Rudenko
>
> > > >
>
> > > >
>
> > > >
>
> > > >
>
> > >
>
> > >
>
> > > --
>
> > > Sincerely yours
>
> > > Mikhail Khludnev
>
> > >
>
> > >
>

Re: Solr cloud production set up

2020-01-28 Thread Jason Gerlowski
Hi Rajdeep,

Unfortunately it's near impossible for anyone here to tell you what
parameters to tweak.  People might take guesses based on their
individual past experience, but ultimately those are just guesses.

There are just too many variables affecting Solr performance for
anyone to have a good guess without access to the cluster itself and
the time and will to dig into it.

Are there GC params that need tweaking?  Very possible, but you'll
have to look into your gc logs to see how much time is being spent in
gc.  Are there query params you could be changing?  Very possible, but
you'll have to identify the types of queries you're submitting and see
whether the ref-guide offers any information on how to tweak
performance for those particular qparsers, facets, etc.  Is the number
of facets the reason for slow queries?  Very possible, but you'll have
to turn faceting off or run debug=timing and see what that tells
you about the QTimes.
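
As a rough sketch of that last step (not a definitive recipe): the
snippet below builds a request with debug=timing added and ranks the
search components by the time they report.  The host, collection name
and sample response are made up for illustration, and the shape of the
"timing" section is an assumption about what the debug output usually
looks like, so compare it against a real response from your version.

```python
from urllib.parse import urlencode


def timing_url(base_url, collection, params):
    """Build a /select URL with debug=timing added to the given params."""
    query = dict(params, debug="timing")
    return f"{base_url}/{collection}/select?{urlencode(query)}"


def slowest_components(debug_timing, phase="process"):
    """Rank the components of one phase ("prepare" or "process") by time."""
    section = debug_timing.get(phase, {})
    # The phase also carries its own total under "time" (a number), so keep
    # only the per-component entries, which are nested dicts.
    per_component = {
        name: info["time"]
        for name, info in section.items()
        if isinstance(info, dict)
    }
    return sorted(per_component.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical host and collection, for illustration only.
url = timing_url("http://localhost:8983/solr", "mycollection",
                 {"q": "*:*", "facet": "true"})

# Illustrative numbers only; not taken from any real cluster.
sample_timing = {
    "time": 5020.0,
    "prepare": {"time": 20.0, "query": {"time": 20.0}, "facet": {"time": 0.0}},
    "process": {"time": 5000.0, "query": {"time": 300.0}, "facet": {"time": 4700.0}},
}

ranked = slowest_components(sample_timing)
print(url)
print(ranked)
```

If the facet component dominates the way it does in this made-up sample,
that points at the facets rather than at heap or GC settings.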

Tuning Solr performance is a tough, time consuming process.  I wish
there was an easier answer for you, but there's not.

Best,

Jason

On Mon, Jan 20, 2020 at 12:06 PM Rajdeep Sahoo
 wrote:
>
> Please suggest anyone
>
> On Sun, 19 Jan, 2020, 9:43 AM Rajdeep Sahoo, 
> wrote:
>
> > Apart from reducing no of facets in the query, is there any other query
> > params or gc params or heap space or anything else that we need to tweak
> > for improving search response time.
> >
> > On Sun, 19 Jan, 2020, 3:15 AM Erick Erickson, 
> > wrote:
> >
> > >> Add &debug=timing to the query and it’ll show you the time each component
> >> takes.
> >>
> >> > On Jan 18, 2020, at 1:50 PM, Rajdeep Sahoo 
> >> wrote:
> >> >
> >> > Thanks for the suggestion,
> >> >
> >> > Is there any way to get the info which operation or which query params
> >> are
> >> > increasing the response time.
> >> >
> >> >
> >> > On Sat, 18 Jan, 2020, 11:59 PM Dave, 
> >> wrote:
> >> >
> >> >> If you’re not getting values, don’t ask for the facet. Facets are
> >> >> expensive as hell, maybe you should think more about your query’s than
> >> your
> >> >> infrastructure, solr cloud won’t help you at all especially if your
> >> asking
> >> >> for things you don’t need
> >> >>
> >> >>> On Jan 18, 2020, at 1:25 PM, Rajdeep Sahoo <
> >> rajdeepsahoo2...@gmail.com>
> >> >> wrote:
> >> >>>
> >> >>> We have assigned 16 gb out of 24gb for heap .
> >> >>> No other process is running on that node.
> >> >>>
> >> >>> 200 facets fields are there in the query but we will not be getting
> >> the
> >> >>> values for each facets for every search.
> >> >>> There can be max of 50-60 facets for which we will be getting values.
> >> >>>
> >> >>> We are using caching,is it not going to help.
> >> >>>
> >> >>>
> >> >>>
> >> >>>> On Sat, 18 Jan, 2020, 11:36 PM Shawn Heisey, 
> >> >> wrote:
> >> >>>>
> >> >>>>> On 1/18/2020 10:09 AM, Rajdeep Sahoo wrote:
> >> >>>>> We are having 2.3 million documents and size is 2.5 gb.
> >> >>>>>  10 core cpu and 24 gb ram . 16 slave nodes.
> >> >>>>>
> >> >>>>>  Still some of the queries are taking 50 sec at solr end.
> >> >>>>> As we are using solr 4.6 .
> >> >>>>>  Other thing is we are having 200 (avg) facet fields  in a query.
> >> >>>>> And 30 searchable fields.
> >> >>>>> Is there any way to identify why it is taking 50 sec for a query.
> >> >>>>>Multiple concurrent requests are there.
> >> >>>>
> >> >>>> Searching 30 fields and computing 200 facets is never going to be
> >> super
> >> >>>> fast.  Switching to cloud will not help, and might make it slower.
> >> >>>>
> >> >>>> Your index is pretty small to a lot of us.  There are people running
> >> >>>> indexes with billions of documents that take terabytes of disk space.
> >> >>>>
> >> >>>> As Walter mentioned, computing 200 facets is going to require a fair
> >> >>>> amount of heap memory.  One *possible* problem here is that the Solr
> >> >>>> heap size is too small, so a lot of GC is required.  How much of the
> >> >>>> 24GB have you assigned to the heap?  Is there any software other than
> >> >>>> Solr running on these nodes?
> >> >>>>
> >> >>>> Thanks,
> >> >>>> Shawn
> >> >>>>
> >> >>
> >>
> >>


Re: SolrCloud upgrade concern

2020-01-22 Thread Jason Gerlowski
Hi Arnold,

The stability and complexity issues Mark highlighted in his post
aren't just imagined - there are real, sometimes serious, bugs in
SolrCloud features.  But at the same time there are many many stable
deployments out there where SolrCloud is a real success story for
users.  Small example, I work at a company (Lucidworks) where our main
product (Fusion) is built heavily on top of SolrCloud and we see it
deployed successfully every day.

In no way am I trying to minimize Mark's concerns (or David's).  There
are stability bugs.  But the extent to which those need affect you
depends a lot on what your deployment looks like.  How many nodes?
How many collections?  How tightly are you trying to squeeze your
hardware?  Is your network flaky?  Are you looking to use any of
SolrCloud's newer, less stable features like CDCR, etc.?

Is SolrCloud better for you than Master/Slave?  It depends on what
you're hoping to gain by a move to SolrCloud, and on your answers to
some of the questions above.  I would be leery of following any
recommendations that are made without regard for your reason for
switching or your deployment details.  Those things are always the
biggest driver in terms of success.

Good luck making your decision!

Best,

Jason


Re: SOLR 7.5 Performance WARN

2020-01-15 Thread Jason Gerlowski
Hi Akreeti,

The "onDeckSearcher" count is the number of searchers that are
currently being opened/warmed for a given core.  New searchers are
opened by (some types of) commits.  So essentially, what this message
means is that you're asking Solr to do commits so close together that
commit N is happening before commit N-1 has even finished.

The fix usually is to commit less frequently.  Are you triggering
explicit commits via the API (or through SolrJ)?  How frequently do
the settings in your solrconfig.xml have you committing?
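
To make the overlap concrete, here is a toy model in plain Python (it
is not Solr code, and the function name and numbers are invented for
illustration).  Given the times at which commits open a new searcher
and an assumed warm-up duration, it counts how many searchers would be
warming at the same moment.

```python
def max_on_deck_searchers(commit_times, warm_seconds):
    """Peak number of searchers warming at once.

    commit_times: sorted timestamps (seconds) of commits that open a searcher.
    warm_seconds: assumed time each new searcher needs to finish warming.
    """
    peak = 0
    for i, start in enumerate(commit_times):
        # Searchers opened by earlier commits that are still warming
        # when this commit opens its own searcher.
        still_warming = sum(
            1 for earlier in commit_times[:i] if earlier + warm_seconds > start
        )
        peak = max(peak, still_warming + 1)
    return peak


# Commits every 5 seconds with an 8 second warm-up: searchers overlap.
frequent = max_on_deck_searchers([0, 5, 10], warm_seconds=8)
# Commits every 30 seconds with the same warm-up: never more than one.
spaced_out = max_on_deck_searchers([0, 30, 60], warm_seconds=8)
print(frequent, spaced_out)
```

In this model the warning goes away either by spacing commits further
apart or by making warming cheaper; the first is usually the easier fix.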

Hope that helps,

Jason

On Wed, Jan 15, 2020 at 3:08 AM Akreeti Agarwal  wrote:
>
> Hi All,
>
> I am using SOLR 7.5 version with master slave architecture.
> I am getting :
>
> "PERFORMANCE WARNING: Overlapping onDeckSearchers=2"
>
> continuously on my master logs for all cores. Please help me to resolve this.
>
>
> Thanks & Regards,
> Akreeti Agarwal
>


Re: understanding solr metrics

2020-01-02 Thread Jason Gerlowski
Hi Akhil,

I'm not an expert on these metrics, but the way I've been reading them:

"meanRate" is a measure of how many requests come in per some unit of
time.  It has nothing to do with how long individual requests take.

"mean_ms" is the average time taken by requests (in milliseconds).
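
A quick sketch of the arithmetic, using the numbers from your message.
It assumes the rates are reported in events per second, which I believe
is the default for these metrics but haven't verified for your version.

```python
# Values copied from the DefaultHandler.dispatches metrics in the question.
count = 628
mean_rate = 0.003289067572916428  # assumed unit: requests per second
mean_ms = 43.5                    # average time per request, in milliseconds

# 1 / meanRate is the average gap between request arrivals.
# It says nothing about how long any single request takes.
seconds_between_requests = 1 / mean_rate

# count / meanRate approximates how long the metric has been collecting.
collection_window_hours = count / mean_rate / 3600

# The per-request duration comes from mean_ms instead.
mean_request_seconds = mean_ms / 1000

print(f"one request roughly every {seconds_between_requests:.0f} s")
print(f"collected over roughly {collection_window_hours:.0f} h")
print(f"each request takes about {mean_request_seconds:.4f} s on average")
```

So the ~300 seconds you computed is the time between requests, while
each request itself averages well under a second.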

Hope that helps,

Jason

On Thu, Jan 2, 2020 at 9:13 AM akhil dutt  wrote:
>
> Hi,
> I'm trying to understand solr metrics and was looking at request/response
> dispatch rate. I want to understand what meanRate signify. As per below
> values, am I to suppose that  each request takes 300 seconds (1/ meanRate )?
>
>
> org.eclipse.jetty.server.handler.DefaultHandler.dispatches:
> {
>
>- count: 628,
>- meanRate: 0.003289067572916428,
>- 1minRate: 0.05987072200468513,
>- 5minRate: 0.0011878359052365337,
>- 15minRate: 0.001259541736414636,
>- min_ms: 0,
>- max_ms: 755,
>- mean_ms: 43.5,
>- median_ms: 6,
>- stddev_ms: 40.5,
>- p75_ms: 84,
>- p95_ms: 84,
>- p99_ms: 84,
>- p999_ms: 84
>
> },
>
> Thanks,
> Akhil


Re: Facing jwt authentication problem using solr 8.1.1

2019-12-20 Thread Jason Gerlowski
Oh, ok.

From the user's error message it looked to me like bin/solr was making
an admin/info/system call from bash, but it must be something else.

On Fri, Dec 20, 2019 at 6:28 AM Jan Høydahl  wrote:
>
> No, I doubt that bin/solr support would do more than just wire in a simple 
> initial JWT config, with some default Rule-based config.
>
> Jan
>
> > 17. des. 2019 kl. 16:42 skrev Jason Gerlowski :
> >
> > Hey Jan,
> >
> > Is this a case of something that'd be fixed by
> > https://issues.apache.org/jira/browse/SOLR-13071 ?
> >
> > Just wondering
> >
> > Best,
> > Jason
> >
> > On Thu, Dec 12, 2019 at 5:43 PM Jan Høydahl  wrote:
> >>
> >> Try something like this 
> >> https://gist.github.com/b330e1bea7842bcdc1e5fa3940b4a4f7 
> >>
> >> The trick is to «whitelist» certain paths that will not require auth, but 
> >> then further down add rules to block all other paths either as admin role 
> >> or with special role «*» which means «any authenticated user».
> >>
> >> Jan
> >>
> >>> 12. des. 2019 kl. 07:47 skrev Lakhan Gupta 
> >>> :
> >>>
> >>> Hi,
> >>>
> >>> Using solr 8.1.1 version and facing problem while enabling jwt 
> >>> authentication in solr. Jwt authentication is working fine after 
> >>> configuring security.json file. Below is the configuration I am using for 
> >>> enabling jwt authentication.
> >>>
> >>> Security.json
> >>>
> >>> {
> >>> "authentication":{
> >>>  "blockUnknown": false,
> >>>   "class":"solr.JWTAuthPlugin",
> >>>  "jwk":{
> >>> "kty":"oct",
> >>> "use":"sig",
> >>> "kid":"k1",
> >>> 
> >>> "k":"7A02618BE6943C22FD81CAB9F6FCF063B6E1732C3614BC3ACA6032B6B3215CAF0D28A34FD423423CA3AC34BEA27D3F79",
> >>> "alg":"HS256"},
> >>>   "aud":"solr"},
> >>>  "authorization":{
> >>> "class":"solr.RuleBasedAuthorizationPlugin",
> >>> "permissions":[
> >>> {
> >>>   "name":"all",
> >>>"path":"/*",
> >>>   "role":"admin"
> >>>}
> >>> ],
> >>> "user-role":{
> >>>"solr":"admin"
> >>> }
> >>>  }
> >>> }
> >>>
> >>> Using secret key
> >>> 7A02618BE6943C22FD81CAB9F6FCF063B6E1732C3614BC3ACA6032B6B3215CAF0D28A34FD423423CA3AC34BEA27D3F79
> >>>
> >>> Jwt token is generated:
> >>> eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiJhZCIsImF1ZCI6InNvbHIiLCJleHAiOjk5MTYyMzkwMjJ9.M4PksJTJ9gFjOlvvFmG1eDSyXDtKIRSGIYicIW9hwT4
> >>>
> >>> Below header and payload I'm using for generate jwt token :
> >>>
> >>> The header is
> >>> {
> >>> "alg": "HS256",
> >>> "typ": "JWT"
> >>> }
> >>>
> >>> and the payload is
> >>>
> >>> {
> >>> "sub": "admin",
> >>> "aud": "Solr",
> >>> "exp": 9916239022
> >>> }
> >>>
> >>> With above configuration my jwt authentication is working fine. But there 
> >>> is a problem when request is sent without authentication in header the 
> >>> api still retrieving data. I want to prevent it when request come without 
> >>> authentication header.
> >>>
> >>> For that, I've enabled blockUnknown parameter in security.json file. That 
> >>> works fine and authentication request is required. But, after enabling 
> >>> blockunknown  parameter I am facing below exception while starting solr 
> >>> using solr start command.
> >>>
> >>>
> >>> ERROR: Solr requires authentication for 
> >>> http://localhost:8983/solr/admin/info/system. Please supply valid 
> >>> credentials. HTTP code=401
> >>>
> >>> I've googled a lot and find out
> >>>
> >>> solr/admin/info/system endpoint required authentication.
> >>>
> >>> How to authenticate solr/admin/info/system endpoint while startup solr?
> >>>
> >>> Need urgent help. I'd appreciate if someone can help me.
> >>>
> >>> Thanks
> >>> Lakhan Gupta
> >>>
> >>>
> >>>
> >>
>


Re: Move SOLR from cloudera HDFS to SOLR on Docker

2019-12-19 Thread Jason Gerlowski
Hi Wael,

Getting configs and data out of Cloudera's HDP is about the same as
moving data between any 2 Solr clusters.

Moving configs is going to be the easy part.

If you're currently using Solr in SolrCloud mode, then your configs
all live in ZooKeeper.  Recent versions of Solr have a utility for
downloading and uploading collection configs from ZooKeeper: run
"bin/solr zk" for more details.  Without checking, I'm not sure
whether this tool is available as far back as 4.10.3.  But the way
that the tool works, I believe the current version would work against
an older SolrCloud install, so you can download a more recent version
and use the tool to extract and reupload your configs where you need
them.

If you're _not_ using SolrCloud, your collection configs will be on
disk, and moving them between installs is as simple as moving them on
disk.

Much more complicated is getting your index data into your new
install.  If you stay on the same Solr version, you should be able to
re-use your existing index files.  That said, recent releases have
seen Solr make strides in becoming cloud/docker aware, or at least
tolerant.  8.3.1 or 7.7.2 will likely be easier to manage on docker
than 4.10.3.  Additionally, 4.10.3 no longer receives any security
backports from the community, and hasn't for some time.  It's worth
considering whether that offers enough benefits to be worth the pain
of reindexing.

Best,

Jason

On Wed, Dec 18, 2019 at 9:26 AM Wael Kader  wrote:
>
> Hello,
>
> I want to move data from my SOLR setup on Cloudera Hadoop to a docker SOLR
> container.
> I don't need to run all the hadoop services in my setup as I am only
> currently using SOLR from the cloudera HDP.
>
> My concern now is to know what's the best way to move the data and schema
> to Docker container.
> I don't mind moving data to an older version of SOLR Container to match the
> 4.10.3 SOLR Version I have on Cloudera.
>
> Much help is appreciated.
>
> --
> Regards,
> Wael


Re: Facing jwt authentication problem using solr 8.1.1

2019-12-17 Thread Jason Gerlowski
Hey Jan,

Is this a case of something that'd be fixed by
https://issues.apache.org/jira/browse/SOLR-13071 ?

Just wondering

Best,
Jason

On Thu, Dec 12, 2019 at 5:43 PM Jan Høydahl  wrote:
>
> Try something like this 
> https://gist.github.com/b330e1bea7842bcdc1e5fa3940b4a4f7 
>
> The trick is to «whitelist» certain paths that will not require auth, but 
> then further down add rules to block all other paths either as admin role or 
> with special role «*» which means «any authenticated user».
>
> Jan
>
> > 12. des. 2019 kl. 07:47 skrev Lakhan Gupta 
> > :
> >
> > Hi,
> >
> > Using solr 8.1.1 version and facing problem while enabling jwt 
> > authentication in solr. Jwt authentication is working fine after 
> > configuring security.json file. Below is the configuration I am using for 
> > enabling jwt authentication.
> >
> > Security.json
> >
> > {
> >  "authentication":{
> >   "blockUnknown": false,
> >"class":"solr.JWTAuthPlugin",
> >   "jwk":{
> >  "kty":"oct",
> >  "use":"sig",
> >  "kid":"k1",
> >  
> > "k":"7A02618BE6943C22FD81CAB9F6FCF063B6E1732C3614BC3ACA6032B6B3215CAF0D28A34FD423423CA3AC34BEA27D3F79",
> >  "alg":"HS256"},
> >"aud":"solr"},
> >   "authorization":{
> >  "class":"solr.RuleBasedAuthorizationPlugin",
> >  "permissions":[
> >  {
> >"name":"all",
> > "path":"/*",
> >"role":"admin"
> > }
> >  ],
> >  "user-role":{
> > "solr":"admin"
> >  }
> >   }
> > }
> >
> > Using secret key
> > 7A02618BE6943C22FD81CAB9F6FCF063B6E1732C3614BC3ACA6032B6B3215CAF0D28A34FD423423CA3AC34BEA27D3F79
> >
> > Jwt token is generated:
> > eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiJhZCIsImF1ZCI6InNvbHIiLCJleHAiOjk5MTYyMzkwMjJ9.M4PksJTJ9gFjOlvvFmG1eDSyXDtKIRSGIYicIW9hwT4
> >
> > Below header and payload I'm using for generate jwt token :
> >
> > The header is
> > {
> >  "alg": "HS256",
> >  "typ": "JWT"
> > }
> >
> > and the payload is
> >
> > {
> >  "sub": "admin",
> >  "aud": "Solr",
> >  "exp": 9916239022
> > }
> >
> > With above configuration my jwt authentication is working fine. But there 
> > is a problem when request is sent without authentication in header the api 
> > still retrieving data. I want to prevent it when request come without 
> > authentication header.
> >
> > For that, I've enabled blockUnknown parameter in security.json file. That 
> > works fine and authentication request is required. But, after enabling 
> > blockunknown  parameter I am facing below exception while starting solr 
> > using solr start command.
> >
> >
> > ERROR: Solr requires authentication for 
> > http://localhost:8983/solr/admin/info/system. Please supply valid 
> > credentials. HTTP code=401
> >
> > I've googled a lot and find out
> >
> > solr/admin/info/system endpoint required authentication.
> >
> > How to authenticate solr/admin/info/system endpoint while startup solr?
> >
> > Need urgent help. I'd appreciate if someone can help me.
> >
> > Thanks
> > Lakhan Gupta
> >
> >
> >
>


Re: Possible data corruption in JavaBinCodec in Solr 8.3 during distributed update?

2019-11-21 Thread Jason Gerlowski
Very curious what the config change that's related to reproducing this
looks like.  Maybe it's something that is worth adding
test-randomization around?  Just thinking aloud.


Re: exact matches on a join

2019-11-21 Thread Jason Gerlowski
Are these fields "string" or "text" fields?

Text fields receive analysis that splits them into a series of terms.
That's why the query "Freeman" matches the document "A-1 Freeman".
"A-1 Freeman" gets split up into multiple terms, and the "Freeman"
query matches one of those terms.  Text fields are what you use when
you want matches to have some wiggle room based on your analyzers.

String fields are much more geared towards exact matches.  No analysis
is done, so a query for "Freeman" would only match docs who have that
value identically.
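
Here's a toy illustration of the difference.  This is plain Python, not
Solr's real analysis chain: the tokenizer below (lowercase, split on
anything that isn't a letter or digit) is just a stand-in I made up for
whatever analyzers your text field type actually uses.

```python
import re


def analyze_text(value):
    """Toy analyzer: lowercase, then split on non-alphanumeric characters."""
    return [term for term in re.split(r"[^0-9a-z]+", value.lower()) if term]


def text_field_matches(query, stored):
    """Text-style match: every query term must be among the indexed terms."""
    indexed_terms = set(analyze_text(stored))
    return all(term in indexed_terms for term in analyze_text(query))


def string_field_matches(query, stored):
    """String-style match: no analysis, the whole value must be identical."""
    return query == stored


print(analyze_text("A-1 Freeman"))
print(text_field_matches("Freeman", "A-1 Freeman"))
print(string_field_matches("Freeman", "A-1 Freeman"))
print(string_field_matches("Freeman", "Freeman"))
```

If report_as behaves like the first function, that would explain the
extra matches on 'A-1 Freeman'.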

Jason

On Tue, Nov 19, 2019 at 2:44 PM rhys J  wrote:
>
> I am trying to do a join, which I have working properly on 2 cores.
>
> One core has report_as, and the other core has debt_id.
>
> If I enter 'report_as: "Freeman", I expect to get 272 results. But I get
> 557.
>
> When I do a database search on the matched fields, it shows me that
> report_as: "Freeman" is matching also on 'A-1 Freeman'.
>
> I have tried boosting the score as report_as: "Freeman"^2, but I get the
> same results from the API, and from the browser itself.
>
> Here is my query:
>
> {
>   "responseHeader":{
> "status":0,
> "QTime":5,
> "params":{
>   "q":"( * )",
>   "indent":"on",
>   "fl":"debt_id, score",
>   "cursorMark":"*",
>   "sort":"score desc, id desc",
>   "fq":"{!join from=debtor_id to=debt_id fromIndex=dbtr}(
> report_as:\"Freeman\"^2)",
>   "rows":"1000"}},
>   "response":{"numFound":557,"start":0,"maxScore":1.0,"docs":[
>   {
> "debt_id":"485435",
> "score":1.0},
>   {
> "debt_id":"485435",
> "score":1.0},
>   {
> "debt_id":"482795",
> "score":1.0},
>   {
> "debt_id":"482795",
> "score":1.0},
>   {
> "debt_id":"482794",
> "score":1.0},
>   {
> "debt_id":"482794",
> "score":1.0},
>   {
> "debt_id":"482794",
> "score":1.0},
>
> SKIP
>
>
>
> {
> "debt_id":"396925",
> "score":1.0},
>   {
> "debt_id":"396925",
> "score":1.0},
>   {
> "debt_id":"396925",
> "score":1.0},
>   {
> "debt_id":"396925",
> "score":1.0},
>   {
> "debt_id":"396925",
> "score":1.0},
>   {
> "debt_id":"396925",
> "score":1.0},
>   {
> "debt_id":"396925",
> "score":1.0},
>   {
> "debt_id":"396925",
> "score":1.0},
>   {
> "debt_id":"396925",
> "score":1.0},
>   {
> "debt_id":"396925",
> "score":1.0},
>   {
> "debt_id":"396925",
>
>
> These ones are the correct matches that I can verify with the
> database, but their scores are the same as the ones matching on
> 'A1-Freeman'
>
> Is my scoring set up wrong?
>
> Thanks,
>
> Rhys


Re: Possible bug in cluster status - > solr 8.3

2019-11-21 Thread Jason Gerlowski
It seems like an issue to me.  Can you open a JIRA with these details?

On Fri, Nov 15, 2019 at 10:51 AM Jacek Kikiewicz  wrote:
>
> I found interesting situation, I've created a collection with only one 
> replica.
> Then I scaled solr-cloud cluster, and run  'addreplica' call to add 2 more.
> So I have a collection with 3 tlog replicas, cluster status page shows
> them but shows also this:
>   "core_node2":{
> "core":"EDITED_NAME_shard1_replica_t1",
> "base_url":"http://EDITED_NODE:8983/solr",
> "node_name":"EDITED_NODE:8983_solr",
> "state":"active",
> "type":"TLOG",
> "force_set_state":"false",
> "leader":"true"},
>   "core_node5":{
> "core":"EDITED_NAME_shard1_replica_t3",
> "base_url":"http://EDITED_NODE:8983/solr",
> "node_name":"EDITED_NODE:8983_solr",
> "state":"active",
> "type":"TLOG",
> "force_set_state":"false"},
>   "core_node6":{
> "core":"EDITED_NAME_shard1_replica_t4",
> "base_url":"http://EDITED_NODE:8983/solr;,
> "node_name":"EDITED_NODE:8983_solr",
> "state":"active",
> "type":"TLOG",
> "force_set_state":"false",
> "router":{"name":"compositeId"},
> "maxShardsPerNode":"1",
> "autoAddReplicas":"false",
> "nrtReplicas":"1",
> "tlogReplicas":"1",
> "znodeVersion":11,
>
>
> As you can see I have 3 replicas but then I have also: "tlogReplicas":"1"
>
> If I create collection with tlogReplicas=3 then cluster status shows
> "tlogReplicas":"3"
> Is that a bug, or does it somehow 'work as it should'?
>
> Regards,
> Jacek


Re: CloudSolrClient - basic auth - multi shard collection

2019-11-20 Thread Jason Gerlowski
Hi Nicholas,

I'm not really familiar with spring-data-solr, so I can't speak to
that detail, but it sounds like you might be running into either
https://issues.apache.org/jira/browse/SOLR-13510 or
https://issues.apache.org/jira/browse/SOLR-13472.  There are partial
workarounds on those issues that might help you.  If those aren't
sufficient, you can fix the issue by upgrading to 8.2 - both of those
bugs are fixed in that version.

Hope that helps,

Jason


On Mon, Nov 18, 2019 at 8:26 AM Nicolas Paris  wrote:
>
> Hello,
>
> I am having trouble with basic auth on a solrcloud instance. When the
> collection is only one shard, there is no problem. When the collection
> is multiple shard, there is no problem until I ask multiple query
> concurrently: I get 401 error and asking for credentials for concurrent
> queries.
>
> I have created a Preemptive Auth Interceptor which should add the
> credential information for every http call.
>
> Thanks for any pointer,
>
> solr:8.1
> spring-data-solr:4.1.0
> --
> nicolas


Re: Anyway to encrypt admin user plain text password in Solr

2019-11-14 Thread Jason Gerlowski
Hi Vinodh,

I don't know of any way to encrypt the credentials in
"basicAuth.conf", and looking at the code that loads that file I don't
see any logic to handle that sort of thing.  So I'm near positive
there's no way to avoid plaintext here.

But, that said, I don't think this should really be that concerning.
To read this file, an adversary would need (1) access to the machine
Solr runs on and (2) access to the user/group running Solr.  If an
adversary has those two things, the credentials are beside the point.
They could kill your Solr process directly.  Or read the index data
files directly.  Or edit them. etc.  There may be edge cases around
using network drives or HDFS where encrypting this file is useful, I
haven't thought that side of things through entirely.  But for most
use-cases I'm not sure encrypting basicAuth.conf provides anything
beyond security theater.

Best,

Jason



On Thu, Nov 14, 2019 at 9:49 AM Mark H. Wood  wrote:
>
> On Thu, Nov 14, 2019 at 11:35:47AM +, Kommu, Vinodh K. wrote:
> > We store the plain text password in basicAuth.conf file. This is a normal 
> > file & we are securing it only with 600 file permissions so that others 
> > cannot read it. We also run various solr APIs in our custom script for 
> > various purposes using curl commands which needs admin user credentials to 
> > perform operations. If admin credentials details from basicAuth.conf file 
> > or from curl commands are exposed/compromised, eventually any person within 
> > the organization who knows credentials can login to admin UI and perform 
> > any read/write operations. This is a concern and auditing issue as well.
>
> If the password is encrypted, then the decryption key must be supplied
> before the password can be used.  This leads to one of two unfortunate
> situations:
>
> o  The user must enter the decryption key every time.  This defeats
>the purpose of storing credentials at the client.
>
>- or -
>
> o  The decryption key is stored at the client, making it a new secret
>that must be protected (by encrypting it? you see where this is
>going)
>
> There is no way around this.  If the client system stores a full set
> of credentials, then anyone with sufficient access to the client
> system can get everything he needs to authenticate an identity, no
> matter what you do.  If the client system does not store a full set of
> credentials, then the user must supply at least some of them whenever
> they are needed.  The best one can usually do is to reduce the
> frequency at which some credential must be entered manually.
>
> Solr supplies several authentication mechanisms besides BasicAuth.
> Would one of those serve?
>
> --
> Mark H. Wood
> Lead Technology Analyst
>
> University Library
> Indiana University - Purdue University Indianapolis
> 755 W. Michigan Street
> Indianapolis, IN 46202
> 317-274-0749
> www.ulib.iupui.edu


Re: Anyway to encrypt admin user plain text password in Solr

2019-11-13 Thread Jason Gerlowski
Hi,

To clarify, Solr credentials are stored and shown in a few different
places.  In some situations the password might live in your
"solr.in.sh" file.  It also might live in a separate basicAuth.conf
file.  If you're using SolrCloud, the password might appear in Solr's
Admin UI (depending on your version of Solr).  The password is also
stored in ZooKeeper.

Some of these locations already store the credentials in an encrypted
form.  Other locations are only problematic if attackers have access
to the disk that Solr is running on, at which point you have much
bigger problems.

If you can be more specific about the exposure you're concerned about,
we can discuss whether there's an actual security concern there and
how to work around it.

Best,

Jason

On Wed, Nov 13, 2019 at 11:22 AM Kommu, Vinodh K.  wrote:
>
> Does anyone have an any idea on this? If so, please help.
>
> Thanks
> From: Kommu, Vinodh K.
> Sent: Monday, November 11, 2019 4:11 PM
> To: solr-user@lucene.apache.org
> Subject: Anyway to encrypt admin user plain text password in Solr
>
> Hi,
>
> After creating admin user in Solr when security is enabled, we have to store 
> the admin user's credentials in plain text format. Is there any option or a 
> way to encrypt the plain text password?
>
> Thanks,
> Vinodh


Re: Query on changing FieldType

2019-10-22 Thread Jason Gerlowski
Hi Shubbham,

Emir gave you accurate advice - you cannot (safely) change field types
without reindexing.  You may avoid errors for a time, and searches may
even return the results you expect.  But the type-change is still a
ticking time bomb...Solr might try to merge segments down the road or
do some other operation and blow up in unexpected ways.  For more
information on why this is, see the documentation here:
https://lucene.apache.org/solr/guide/8_2/reindexing.html.

Unfortunately there's no way around it.  This, by the way, is why the
community strongly recommends against using schema-guessing mode for
anything other than experimentation.

Best of luck,

Jason

On Tue, Oct 22, 2019 at 7:42 AM Shubham Goswami
 wrote:
>
> Hi Emir
>
> As you have mentioned above we cannot change field type after indexing once
> and we have to do full re-indexing again, I tried to change field type from
> plong to pint which has implemented class solr.LongPointField and
> solr.IntPointField respectively and it was showing error as expected.
> But when i changed field types from pint/plong to any type which
> has implemented class solr.TextField, in this case its working fine and i
> am able to index the documents after changing its fieldtype with same and
> different id.
>
> So i want to know if is there any compatibility with implemented classes ?
>
> Thanks
> Shubham
>
> On Tue, Oct 22, 2019 at 2:46 PM Emir Arnautović <
> emir.arnauto...@sematext.com> wrote:
>
> > Hi Shubham,
> > No you cannot. What you can do is to use copy field or update request
> > processor to store is as some other field and use that in your query and
> > ignore the old one that will eventually disappear as the result of segment
> > merges.
> >
> > HTH,
> > Emir
> > --
> > Monitoring - Log Management - Alerting - Anomaly Detection
> > Solr & Elasticsearch Consulting Support Training - http://sematext.com/
> >
> >
> >
> > > On 22 Oct 2019, at 10:53, Shubham Goswami 
> > wrote:
> > >
> > > Hi Emir
> > >
> > > Thanks for the reply, i got your point.
> > > But is there any other way to do like one field could have two or more
> > > different types defined ?
> > > or  if i talk about my previous query, can we index some data for the
> > same
> > > field with different unique id after replacing the type ?
> > >
> > > Thanks again
> > > Shubham
> > >
> > > On Tue, Oct 22, 2019 at 1:23 PM Emir Arnautović <
> > > emir.arnauto...@sematext.com> wrote:
> > >
> > >> Hi Shubham,
> > >> Changing type is not allowed without full reindexing. If you do
> > >> something like that, Solr will end up with segments with different
> > >> types for the same field. Remember that segments are immutable: a
> > >> reindexed document goes into a new segment, but the old segment is
> > >> still there, and at query time Solr will see a mismatch between what
> > >> is stated in the schema and what is in the segment. In order to change
> > >> the type you have to do a full reindexing: create a new collection and
> > >> reindex all documents.
> > >>
> > >> HTH,
> > >> Emir
> > >> --
> > >> Monitoring - Log Management - Alerting - Anomaly Detection
> > >> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
> > >>
> > >>
> > >>
> > >>> On 22 Oct 2019, at 09:25, Shubham Goswami 
> > >> wrote:
> > >>>
> > >>> Hello Community
> > >>>
> > >>> I have indexed some documents for which Solr auto-guessed
> > >>> type="plongs" for a field. I am now trying to change it to type="pint"
> > >>> and re-index the same data with the same id, as well as index the data
> > >>> with a different id (id is the unique key), but it shows an error.
> > >>>
> > >>> Can somebody please let me know whether this is possible? If not, why
> > >>> not, given that I am also using a different id? If it is possible, how
> > >>> can we achieve it?
> > >>> Any help will be appreciated. Thanks in advance.
> > >>>
> > >>> --
> > >>> *Thanks & Regards*
> > >>> Shubham Goswami
> > >>> Enterprise Software Engineer
> > >>> *HotWax Systems*
> > >>> *Enterprise open source experts*
> > >>> cell: +91-7803886288
> > >>> office: 0731-409-3684
> > >>> http://www.hotwaxsystems.com
> > >>
> > >>
> > >
> > > --
> > > *Thanks & Regards*
> > > Shubham Goswami
> > > Enterprise Software Engineer
> > > *HotWax Systems*
> > > *Enterprise open source experts*
> > > cell: +91-7803886288
> > > office: 0731-409-3684
> > > http://www.hotwaxsystems.com
> >
> >
>
> --
> *Thanks & Regards*
> Shubham Goswami
> Enterprise Software Engineer
> *HotWax Systems*
> *Enterprise open source experts*
> cell: +91-7803886288
> office: 0731-409-3684
> http://www.hotwaxsystems.com


Re: Solr enabled kerberos and create collection failed

2019-10-22 Thread Jason Gerlowski
I _think_ this is the fourth time you've submitted this exact question
as a different email thread.  Most of your other threads have
responses on them, but maybe you're not seeing that for some reason.

Maybe you won't be able to see this response either, but in case you
can: I think you'll have better luck getting help and getting your
problem solved if you keep the discussion in a single thread, rather
than continually reposting.  Please look back at your other threads.

On Mon, Oct 21, 2019 at 9:40 PM Lvyankui  wrote:
>
> SolrCloud mode, Solr and Zookeeper enabled kerberos, create collection failed 
> with following command
> curl --negotiate -u : 'http:// 
> noder27:8983/solr/admin/collections?action=CREATE=test01=1=1=_default=json'
> The error is:
> {
>   "responseHeader":{
> "status":0,
> "QTime":31818},
>   "failure":{
> 
> "noder27.hde.h3c.com:8983_solr":"org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error
>  from server at http://noder27.hde.h3c.com:8983/solr: Expected mime type 
> application/octet-stream but got text/html. \n\n http-equiv=\"Content-Type\" 
> content=\"text/html;charset=utf-8\"/>\nError 401 Authentication 
> required\n\nHTTP ERROR 401\nProblem 
> accessing /solr/admin/cores. Reason:\nAuthentication 
> required\n\n\n"
> }}
>
> But if I restart Solr several times, there is some chance it returns to normal.
> -
> 本邮件及其附件含有新华三集团的保密信息,仅限于发送给上面地址中列出
> 的个人或群组。禁止任何其他人以任何形式使用(包括但不限于全部或部分地泄露、复制、
> 或散发)本邮件中的信息。如果您错收了本邮件,请您立即电话或邮件通知发件人并删除本
> 邮件!
> This e-mail and its attachments contain confidential information from New 
> H3C, which is
> intended only for the person or entity whose address is listed above. Any use 
> of the
> information contained herein in any way (including, but not limited to, total 
> or partial
> disclosure, reproduction, or dissemination) by persons other than the intended
> recipient(s) is prohibited. If you receive this e-mail in error, please 
> notify the sender
> by phone or email immediately and delete it!


Re: Solr enabled kerberos and create collection failed

2019-10-22 Thread Jason Gerlowski
I _think_ this is the third time you've submitted this exact question
as a different email thread.  Both of your other threads have
responses on them, but maybe you're not seeing that for some reason.

Maybe you won't be able to see this response either, but in case you
can: I think you'll have better luck getting help and getting your
problem solved if you keep the discussion in a single thread, rather
than continually reposting.  Please look back at your other threads.

On Mon, Oct 21, 2019 at 7:51 AM Lvyankui  wrote:
>
>
> SolrCloud mode, Solr and Zookeeper enabled kerberos, create collection failed 
> with following command
> curl --negotiate -u : 'http:// 
> noder27:8983/solr/admin/collections?action=CREATE=test01=1=1=_default=json'
> The error is:
> {
>   "responseHeader":{
> "status":0,
> "QTime":31818},
>   "failure":{
> 
> "noder27.hde.h3c.com:8983_solr":"org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error
>  from server at http://noder27.hde.h3c.com:8983/solr: Expected mime type 
> application/octet-stream but got text/html. \n\n http-equiv=\"Content-Type\" 
> content=\"text/html;charset=utf-8\"/>\nError 401 Authentication 
> required\n\nHTTP ERROR 401\nProblem 
> accessing /solr/admin/cores. Reason:\nAuthentication 
> required\n\n\n"
> }}
>
> But if I restart Solr several times, there is some chance it returns to normal.


Re: Solr enabled kerberos and create collection failed

2019-10-22 Thread Jason Gerlowski
Hi,

You posted this same question in a different thread and Jorn Franke
replied to say that you likely need to run "kinit" before invoking
"bin/solr".  That seems like a likely explanation to me.  But since
you've given us very little information on how you've set up Kerberos
and what you've tried, it's impossible to say for sure.

How did you set up Kerberos?  What is the java command that Solr is
running with?  Have you tried running "kinit" before bin/solr?  Does
the problem _only_ occur when running "bin/solr", or does a similar
error message occur when making requests through curl or other
clients?
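
When retrying through curl or another client, it can help to spell out every
query parameter by name.  The sketch below is only an illustration: the
command quoted in this thread lost its parameter names in transit, so the
names used here (name, numShards, replicationFactor, collection.configName,
wt) are an assumption about what was intended, and the helper function
itself is hypothetical.

```python
from urllib.parse import urlencode


def create_collection_url(host, name, num_shards=1, replication_factor=1,
                          config_name="_default"):
    """Build a Collections API CREATE URL with every parameter named.

    The parameter names are standard Collections API names, but which ones
    the original poster actually used is an assumption.
    """
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "replicationFactor": replication_factor,
        "collection.configName": config_name,
        "wt": "json",
    }
    # urlencode keeps the insertion order of the dict and escapes values.
    return f"http://{host}/solr/admin/collections?{urlencode(params)}"


url = create_collection_url("noder27:8983", "test01")
print(url)
```

The printed URL can then be passed to "curl --negotiate -u :" as before.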

You might want to read this page for other ideas of what information
would help us help you:
https://cwiki.apache.org/confluence/display/solr/UsingMailingLists

Best,

Jason

On Sun, Oct 20, 2019 at 9:23 PM Lvyankui  wrote:
>
> SolrCloud mode, Solr and Zookeeper enabled kerberos, create collection failed 
> with following command
> curl --negotiate -u : 'http:// 
> noder27:8983/solr/admin/collections?action=CREATE=test01=1=1=_default=json'
> The error is:
> {
>   "responseHeader":{
> "status":0,
> "QTime":31818},
>   "failure":{
> 
> "noder27.hde.h3c.com:8983_solr":"org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error
>  from server at http://noder27.hde.h3c.com:8983/solr: Expected mime type 
> application/octet-stream but got text/html. \n\n http-equiv=\"Content-Type\" 
> content=\"text/html;charset=utf-8\"/>\nError 401 Authentication 
> required\n\nHTTP ERROR 401\nProblem 
> accessing /solr/admin/cores. Reason:\nAuthentication 
> required\n\n\n"
> }}
>
> But if I restart Solr several times, there is some chance it returns to normal.


Re: Solaris Install Package

2019-10-17 Thread Jason Gerlowski
Hi Andrew,

I believe that yes, Solr should work on Solaris.  I've never done so
personally, but very occasionally I hear of someone doing so.
Additionally, Uwe runs a Jenkins server that runs tests on Solaris
(among other OSs), and the results for Solaris look to be pretty
standard for our test suite.  I'm not sure what Solaris version these
tests run on, that might be worth double checking here:
https://jenkins.thetaphi.de/view/Lucene-Solr/job/Lucene-Solr-8.x-Solaris/
. If you find any particular issues on Solaris (especially in the
scripts accompanying Solr, e.g. bin/solr, bin/post), it'd be
appropriate to open up JIRA tickets for those.

That said, while it seems to work and receives at least some
testing, it's definitely not common in terms of what the community
uses, and tests, and codes for on a daily basis.  As with any
open-source project, there's always a certain amount of risk in
diverging from the commonly used/tested environments and usage
patterns.  So, YMMV.

Best,

Jason

On Mon, Oct 7, 2019 at 5:34 PM Andrew Corbett  wrote:
>
> I have been trying to research the possibility of adding Solr to servers 
> running the Solaris 10 and 11 operating systems. Solaris isn't mentioned in 
> the documentation. Would adding Solr to these servers be possible? Would I 
> need to make a feature request?


Re: Problems with restricting access to users using Basic auth

2019-09-03 Thread Jason Gerlowski
Yeah, it beats me.  If you've made sure that the security.json in
ZooKeeper is exactly the same as the one I posted but you're still
getting different results, then I'm stumped.  Maybe someone else here
has an idea.

Out of curiosity, are you setting your security.json via the
authentication/authorization APIs, or by uploading the file directly
to ZooKeeper?

RuleBasedAuthorizationPlugin logging has improved in more recent
versions of Solr, so that when the log-level is raised to DEBUG
there's a lot more information given for each request about which
permissions apply and what the result of looking at each is.  But that
won't help you on 7.6 unfortunately.

Good luck, and let us know if you are able to fix things, or
eventually find out what the difference in behavior is between our two
setups.

Jason

On Tue, Sep 3, 2019 at 8:01 AM Salmaan Rashid Syed
 wrote:
>
> Hi Jason,
>
> Apologies for the late reply. My laptop was broken and I got it today from
> service centre.
>
> I am still having an issue: solr-user is able to view the Collections list,
> as follows.
>
> Testing permissions for user [solr]
> Request [/admin/collections?action=LIST] returned status [200]
> Request [/collection1/select?q=*:*] returned status [200]
> Request [/collection2/select?q=*:*] returned status [200]
> Request [/collection3/select?q=*:*] returned status [200]
>
> Testing permissions for user [solr-user]
> Request [/admin/collections?action=LIST] returned status [200]
> Request [/collection1/select?q=*:*] returned status [200]
> Request [/collection2/select?q=*:*] returned status [200]
> Request [/collection3/select?q=*:*] returned status [403]
>
> I am still wondering where I am going wrong.
>
> Thanks,
> Salmaan
>
>
>
>
> On Thu, Aug 29, 2019 at 1:34 PM Salmaan Rashid Syed <
> salmaan.ras...@mroads.com> wrote:
>
> > Thanks a lot Jason,
> >
> > I will try this out and let you know.
> >
> > Thanks again.
> >
> > On Wed 28 Aug, 2019, 7:45 PM Jason Gerlowski, 
> > wrote:
> >
> >> Hi Salmaan,
> >>
> >> Are you still seeing this behavior, or were you able to figure things out?
> >>
> >> I just got a chance to try out the security.json in Solr 7.6 myself,
> >> and I can't reproduce the behavior you're seeing.
> >>
> >> It might be helpful to level set here.  Make sure that our
> >> security.json settings and our test requests are exactly the same.
> >>
> >> This is the security.json I used in my test deployment:
> >>
> >> {
> >>   "authentication":{
> >>"blockUnknown": true,
> >>"class":"solr.BasicAuthPlugin",
> >>"credentials":{
> >>  "solr":"gP31s0FQevh3k0i0y6g9AP/TZLWctxfZjtC9sOh8vZU=
> >> J7an406gVyx4v4CkR8YLgmhClk9Yv/fIBSfZoi1f0kY=",
> >>  "solr-user":"gP31s0FQevh3k0i0y6g9AP/TZLWctxfZjtC9sOh8vZU=
> >> J7an406gVyx4v4CkR8YLgmhClk9Yv/fIBSfZoi1f0kY="
> >>}
> >>   },
> >>   "authorization":{
> >>"class":"solr.RuleBasedAuthorizationPlugin",
> >>"permissions":[
> >>   {"name": "dev-read", "collection": ["collection1",
> >> "collection2"], "role": ["dev", "admin"] },
> >>   {"name": "security-edit", "role": "admin"},
> >>   {"name": "security-read", "role": "admin"},
> >>   {"name": "schema-edit", "role": "admin"},
> >>   {"name": "schema-read", "role": "admin"},
> >>   {"name": "config-edit", "role": "admin"},
> >>   {"name": "config-read", "role": "admin"},
> >>   {"name": "core-admin-edit", "role": "admin"},
> >>   {"name": "core-admin-read", "role": "admin"},
> >>   {"name": "collection-api-edit", "role": "admin"},
> >>   {"name": "collection-api-read", "role": "admin"},
> >>   {"name": "read", "role": "admin"},
> >>   {"name": "update", "role": "admin"},
> >>   {"name": "all", "role&

Re: Problems with restricting access to users using Basic auth

2019-08-28 Thread Jason Gerlowski
Hi Salmaan,

Are you still seeing this behavior, or were you able to figure things out?

I just got a chance to try out the security.json in Solr 7.6 myself,
and I can't reproduce the behavior you're seeing.

It might be helpful to level set here.  Make sure that our
security.json settings and our test requests are exactly the same.

This is the security.json I used in my test deployment:

{
  "authentication":{
   "blockUnknown": true,
   "class":"solr.BasicAuthPlugin",
   "credentials":{
     "solr":"gP31s0FQevh3k0i0y6g9AP/TZLWctxfZjtC9sOh8vZU= J7an406gVyx4v4CkR8YLgmhClk9Yv/fIBSfZoi1f0kY=",
     "solr-user":"gP31s0FQevh3k0i0y6g9AP/TZLWctxfZjtC9sOh8vZU= J7an406gVyx4v4CkR8YLgmhClk9Yv/fIBSfZoi1f0kY="
   }
  },
  "authorization":{
   "class":"solr.RuleBasedAuthorizationPlugin",
   "permissions":[
  {"name": "dev-read", "collection": ["collection1",
"collection2"], "role": ["dev", "admin"] },
  {"name": "security-edit", "role": "admin"},
  {"name": "security-read", "role": "admin"},
  {"name": "schema-edit", "role": "admin"},
  {"name": "schema-read", "role": "admin"},
  {"name": "config-edit", "role": "admin"},
  {"name": "config-read", "role": "admin"},
  {"name": "core-admin-edit", "role": "admin"},
  {"name": "core-admin-read", "role": "admin"},
  {"name": "collection-api-edit", "role": "admin"},
  {"name": "collection-api-read", "role": "admin"},
  {"name": "read", "role": "admin"},
  {"name": "update", "role": "admin"},
  {"name": "all", "role": "admin"}
   ],
   "user-role":{
 "solr":"admin",
 "solr-user": "dev"
   }
  }
}

And this is the output of a script I use to test permissions quickly:

$ ./test-security.sh

Testing permissions for user [solr]
Request [/admin/collections?action=LIST] returned status [200]
Request [/collection1/select?q=*:*] returned status [200]
Request [/collection2/select?q=*:*] returned status [200]
Request [/collection3/select?q=*:*] returned status [200]

Testing permissions for user [solr-user]
Request [/admin/collections?action=LIST] returned status [403]
Request [/collection1/select?q=*:*] returned status [200]
Request [/collection2/select?q=*:*] returned status [200]
Request [/collection3/select?q=*:*] returned status [403]

You can find this script here, to see the exact curl commands being
used and run it yourself: https://paste.apache.org/tjtdg
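
In case the paste is unavailable, a check of this kind can be sketched
roughly as below.  This is not the script from the link, just a minimal
Python approximation that produces the same style of output.  The base URL
and the passwords passed to report() are placeholders to replace with your
own values, and the function names are made up for this example.

```python
import base64
import urllib.error
import urllib.request

# The same four requests that appear in the output above.
PATHS = [
    "/admin/collections?action=LIST",
    "/collection1/select?q=*:*",
    "/collection2/select?q=*:*",
    "/collection3/select?q=*:*",
]


def basic_auth_header(user, password):
    """Return the value of an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def request_status(base_url, path, user, password):
    """Issue a GET as the given user and return the HTTP status code."""
    req = urllib.request.Request(
        base_url + path,
        headers={"Authorization": basic_auth_header(user, password)},
    )
    try:
        with urllib.request.urlopen(req, timeout=10) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # 401/403 responses arrive as exceptions; the code is what we want.
        return err.code


def report(base_url, users):
    """Print one status line per path for each user in the users dict."""
    for user, password in users.items():
        print(f"Testing permissions for user [{user}]")
        for path in PATHS:
            status = request_status(base_url, path, user, password)
            print(f"Request [{path}] returned status [{status}]")
        print()
```

Calling report("http://localhost:8983/solr", {...}) with a dict that maps
each user name to its real password prints one block per user, which makes
it easy to compare two deployments side by side.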

That output looks correct to me.  solr-user is prevented from
accessing other APIs and other collections, but can access collection1
and collection2.

Does your security.json match mine, or do the permissions differ in
some way?  Can you still reproduce the behavior using my script?

Good luck,

Jason

On Thu, Aug 22, 2019 at 2:13 AM Salmaan Rashid Syed
 wrote:
>
> Hi,
>
> Any suggestions as to what can be done?
>
> Regards,
> Salmaan
>
>
> On Wed, Aug 21, 2019 at 4:33 PM Jason Gerlowski 
> wrote:
>
> > Ah, ok.  SOLR-13355 still affects 7.6, so that explains why you're
> > seeing this behavior.
> >
> > You could upgrade to get the new behavior, but you don't need to-
> > there's a workaround.  You just need to add a few extra rules to your
> > security.json.  The problem in SOLR-13355 is that the "all" permission
> > isn't being considered for APIs that are covered by other predefined
> > permissions.  So the workaround is to add a permission rule for each
> > of the predefined permissions, locking them down to the "admin" role.
> > It really bloats security.json, but should do the job.  So your
> > security.json should have a permissions section that looks like the
> > JSON below:
> >
> > {"name": "dev-read", "collection": ["collection1", "collection2"],
> > "role": "dev"},
> > {"name": "security-edit", "role": "admin"},
> > {"name": "security-read", "role": "admin"},
> > {"name": "schema-edit", "role": "admin"},
>

Re: Problems with restricting access to users using Basic auth

2019-08-21 Thread Jason Gerlowski
Ah, ok.  SOLR-13355 still affects 7.6, so that explains why you're
seeing this behavior.

You could upgrade to get the new behavior, but you don't need to-
there's a workaround.  You just need to add a few extra rules to your
security.json.  The problem in SOLR-13355 is that the "all" permission
isn't being considered for APIs that are covered by other predefined
permissions.  So the workaround is to add a permission rule for each
of the predefined permissions, locking them down to the "admin" role.
It really bloats security.json, but should do the job.  So your
security.json should have a permissions section that looks like the
JSON below:

{"name": "dev-read", "collection": ["collection1", "collection2"],
"role": "dev"},
{"name": "security-edit", "role": "admin"},
{"name": "security-read", "role": "admin"},
{"name": "schema-edit", "role": "admin"},
{"name": "schema-read", "role": "admin"},
{"name": "config-edit", "role": "admin"},
{"name": "config-read", "role": "admin"},
{"name": "core-admin-edit", "role": "admin"},
{"name": "core-admin-read", "role": "admin"},
{"name": "collection-api-edit", "role": "admin"},
{"name": "collection-api-read", "role": "admin"},
{"name": "read", "role": "admin"},
{"name": "update", "role": "admin"},
{"name": "all", "role": "admin"}
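
Since that list is repetitive, it can also be generated rather than typed
by hand.  The following is only a sketch: the permission names are the ones
listed in this message (not necessarily every predefined permission in your
Solr version), and the helper name is invented for the example.

```python
import json

# Predefined permission names as listed in this message; treat the list as
# an assumption and check it against the docs for your Solr version.
PREDEFINED = [
    "security-edit", "security-read",
    "schema-edit", "schema-read",
    "config-edit", "config-read",
    "core-admin-edit", "core-admin-read",
    "collection-api-edit", "collection-api-read",
    "read", "update", "all",
]


def locked_down_permissions(custom_rules, role="admin"):
    """Return custom rules followed by one rule per predefined permission."""
    rules = list(custom_rules)
    rules.extend({"name": name, "role": role} for name in PREDEFINED)
    return rules


dev_read = {
    "name": "dev-read",
    "collection": ["collection1", "collection2"],
    "role": "dev",
}
permissions = locked_down_permissions([dev_read])
print(json.dumps(permissions, indent=2))
```

The custom rule stays at the top and "all" stays at the bottom, matching
the ordering of the JSON above.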

Hope that helps.  Let me know if that still has any problems for you.

Jason

On Wed, Aug 21, 2019 at 6:48 AM Salmaan Rashid Syed
 wrote:
>
> Hi Jason,
>
> Is there a way to fix this in version 7.6?
>
> Or is it mandatory to upgrade to other versions?
>
> If I have to upgrade to a higher version, then what is the best way to do
> this without affecting the current configuration and indexed data?
>
> Thanks,
> Salmaan
>
>
>
> On Wed, Aug 21, 2019 at 4:13 PM Salmaan Rashid Syed <
> salmaan.ras...@mroads.com> wrote:
>
> > Hi Jason,
> >
> > I am using version 7.6 of Solr.
> >
> > Thanks,
> > Salmaan
> >
> >
> >
> > On Wed, Aug 21, 2019 at 4:12 PM Jason Gerlowski 
> > wrote:
> >
> >> The "all" permissions _should_ block solr-user from accessing all of
> >> those resources, and I believe it does in newer versions of Solr.
> >> There was a bug with it that was fixed a few versions back though- it
> >> sounds like you might be running into that. (see
> >> https://issues.apache.org/jira/browse/SOLR-13355) What version of Solr
> >> are you using?
> >>
> >> Jason
> >>
> >>
> >>
> >> On Wed, Aug 21, 2019 at 5:21 AM Salmaan Rashid Syed
> >>  wrote:
> >> >
> >> > Hi Jason,
> >> >
> >> > Thanks for your prompt reply.
> >> >
> >> > Your code does address a few of my concerns, like restricting
> >> > *solr-user* from accessing the dashboard and from executing request
> >> > methods other than *"update"* and *"read"*.
> >> >
> >> > But I am still able to access other collections such as *"Collection3",
> >> > "Collection4"* and so on, apart from the two intended collections
> >> > entered in the code. I can send *"update"* and *"read"* requests to
> >> > these other collections, which solr-user should not be able to do.
> >> >
> >> > Moreover solr-user can look at the
> >> > *http://localhost:8983/solr/admin/authentication
> >> > <http://localhost:8983/solr/admin/authentication>* link which lists the
> >> > users and their *SHA256* coded passwords. How can I hide this and
> >> restrict
> >> > access to other collections?
> >> >
> >> > Thanks and regards
> >> > Salmaan
> >> >
> >> >
> >> > On Wed, Aug 21, 2019 at 5:07 AM Jason Gerlowski 
> >> > wrote:
> >> >
> >> > > Hi Salmaan,
> >> > >
> >> > > Solr's RuleBasedAuthorizationPlugin allows requests through if none of
> >> > > the specified permissions apply. 

Re: Problems with restricting access to users using Basic auth

2019-08-21 Thread Jason Gerlowski
The "all" permissions _should_ block solr-user from accessing all of
those resources, and I believe it does in newer versions of Solr.
There was a bug with it that was fixed a few versions back though- it
sounds like you might be running into that. (see
https://issues.apache.org/jira/browse/SOLR-13355) What version of Solr
are you using?

Jason



On Wed, Aug 21, 2019 at 5:21 AM Salmaan Rashid Syed
 wrote:
>
> Hi Jason,
>
> Thanks for your prompt reply.
>
> Your code does address a few of my concerns, like restricting *solr-user* from
> accessing the dashboard and from executing request methods other than
> *"update"* and *"read"*.
>
> But I am still able to access other collections such as *"Collection3",
> "Collection4"* and so on, apart from the two intended collections entered in
> the code. I can send *"update"* and *"read"* requests to these other
> collections, which solr-user should not be able to do.
>
> Moreover solr-user can look at the
> *http://localhost:8983/solr/admin/authentication
> <http://localhost:8983/solr/admin/authentication>* link which lists the
> users and their *SHA256* coded passwords. How can I hide this and restrict
> access to other collections?
>
> Thanks and regards
> Salmaan
>
>
> On Wed, Aug 21, 2019 at 5:07 AM Jason Gerlowski 
> wrote:
>
> > Hi Salmaan,
> >
> > Solr's RuleBasedAuthorizationPlugin allows requests through if none of
> > the specified permissions apply.  I think that's what you're running
> > into in your example above.  If you want to lock down a particular API
> > (or set of APIs) then you need to explicitly add a permission that
> > restricts those APIs to a particular role.
> >
> > One way to get the behavior that it sounds like you're looking for
> > would be to add a catch-all permission at the bottom of your
> > permissions list that restricts all other APIs to "admin".  This would
> > look a bit like:
> >
> >  "permissions":[
> > {
> > "name":"security-edit",
> > "role":"admin"
> > },
> > {
> > "collection": ["Collection1", "Collection2"],
> > "name": ["update", "read"],
> > "role": "dev"
> > },
> > {
> > "name": "all",
> > "role": "admin"
> > }
> > ]
> >
> > Hope that helps get you started.
> >
> > Best,
> >
> > Jason
> >
> > On Tue, Aug 20, 2019 at 3:19 AM Salmaan Rashid Syed
> >  wrote:
> > >
> > > Hi Solr Users,
> > >
> > > I want to create a user that has restricted access to Solr. I did the
> > > following:
> > >
> > >
> > > {
> > >   "authentication":{
> > >     "blockUnknown": true,
> > >     "class":"solr.BasicAuthPlugin",
> > >     "credentials":{
> > >       "solr-admin":"2IUJD9dxRhxSXaJGdMP5z8ggSn4I285Ty9GCWeRNMUg= /sSNJJufPtj4baRizoJshJawFsWvopvZSqZpQ/Nwd78=",
> > >       "solr-user":"p+XwOh15p/rvFltv2LXP1CwtbvwBgGlC9qcDKxV73B4= DcNsjfA6Wf16V1XKT+YraosSFQ5Cr3eRUX6BQnx9XKA="
> > >     }
> > >   },
> > >   "authorization":{
> > >     "class":"solr.RuleBasedAuthorizationPlugin",
> > >     "user-role":{"solr-admin":"admin", "solr-user":"dev"},
> > >     "permissions":[
> > >       {
> > >         "name":"security-edit",
> > >         "role":"admin"
> > >       },
> > >       {
> > >         "collection": ["Collection1", "Collection2"],
> > >         "name": ["update", "read"],
> > >         "role": "dev"
> > >       }
> > >     ]
> > >   }
> > > }
> > >
> > >
> > > But when I log in to the Solr admin dashboard using the solr-user
> > > credentials, I can read, select, write, update, and delete collections,
> > > and do everything that solr-admin can do.
> > >
> > > I want solr-user to be able to access only *Collection1* and
> > > *Collection2* and be able to only *update* and *read*. solr-user should
> > > not be able to access other collections or do anything outside the role
> > > described above.
> > > Where am I exactly going wrong?
> > >
> > > Thanks and Regards,
> > > Salmaan
> >


Re: Problems with restricting access to users using Basic auth

2019-08-20 Thread Jason Gerlowski
Hi Salmaan,

Solr's RuleBasedAuthorizationPlugin allows requests through if none of
the specified permissions apply.  I think that's what you're running
into in your example above.  If you want to lock down a particular API
(or set of APIs) then you need to explicitly add a permission that
restricts those APIs to a particular role.

One way to get the behavior that it sounds like you're looking for
would be to add a catch-all permission at the bottom of your
permissions list that restricts all other APIs to "admin".  This would
look a bit like:

 "permissions":[
{
"name":"security-edit",
"role":"admin"
},
{
"collection": ["Collection1", "Collection2"],
"name": ["update", "read"],
"role": "dev"
},
{
"name": "all",
"role": "admin"
}
]
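
If you edit security.json by hand often, a small sanity check can confirm
that the catch-all rule is still the last entry before you upload it.  The
sketch below is just an illustration of that check; the function name is
made up, and it assumes rule order matters in the way described above.

```python
def catch_all_is_last(permissions):
    """Return True if the final rule is the predefined "all" permission."""
    if not permissions:
        return False
    # "name" may be a string or a list; only the plain string "all" counts.
    return permissions[-1].get("name") == "all"


# The permissions list from the example above.
permissions = [
    {"name": "security-edit", "role": "admin"},
    {
        "collection": ["Collection1", "Collection2"],
        "name": ["update", "read"],
        "role": "dev",
    },
    {"name": "all", "role": "admin"},
]

print(catch_all_is_last(permissions))
```

A list that ends with any other rule, or that has no rules at all, makes
the function return False.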

Hope that helps get you started.

Best,

Jason

On Tue, Aug 20, 2019 at 3:19 AM Salmaan Rashid Syed
 wrote:
>
> Hi Solr Users,
>
> I want to create a user that has restricted access to Solr. I did the
> following:
>
>
> {
>   "authentication":{
>     "blockUnknown": true,
>     "class":"solr.BasicAuthPlugin",
>     "credentials":{
>       "solr-admin":"2IUJD9dxRhxSXaJGdMP5z8ggSn4I285Ty9GCWeRNMUg= /sSNJJufPtj4baRizoJshJawFsWvopvZSqZpQ/Nwd78=",
>       "solr-user":"p+XwOh15p/rvFltv2LXP1CwtbvwBgGlC9qcDKxV73B4= DcNsjfA6Wf16V1XKT+YraosSFQ5Cr3eRUX6BQnx9XKA="
>     }
>   },
>   "authorization":{
>     "class":"solr.RuleBasedAuthorizationPlugin",
>     "user-role":{"solr-admin":"admin", "solr-user":"dev"},
>     "permissions":[
>       {
>         "name":"security-edit",
>         "role":"admin"
>       },
>       {
>         "collection": ["Collection1", "Collection2"],
>         "name": ["update", "read"],
>         "role": "dev"
>       }
>     ]
>   }
> }
>
>
> But when I log in to the Solr admin dashboard using the solr-user credentials,
> I can read, select, write, update, and delete collections, and do everything
> that solr-admin can do.
>
> I want solr-user to be able to access only *Collection1* and *Collection2*
> and be able to only *update* and *read*. solr-user should not be able to
> access other collections or do anything outside the role described above.
>
> Where am I exactly going wrong?
>
> Thanks and Regards,
> Salmaan


Re: Contact for Wiki / Support page maintainer

2019-07-29 Thread Jason Gerlowski
I was under the impression that non-committers could also edit the
wiki pages if they requested the appropriate karma on the mailing list.

Though maybe that changed with the move to cwiki, or maybe that's
never been the case.

On Thu, Jul 25, 2019 at 4:10 PM Jan Høydahl  wrote:
>
> All committers can edit. What would you like to change/add?
>
> Jan Høydahl
>
> > 25. jul. 2019 kl. 09:11 skrev Jaroslaw Rozanski :
> >
> > Hi folks!
> >
> > Who is the maintainer of Solr Support page in the Apache Solr Wiki 
> > (https://cwiki.apache.org/confluence/display/solr/Support)?
> >
> > Thanks,
> > Jaroslaw
> >
> > --
> > Jaroslaw Rozanski | m...@jarekrozanski.eu


Re: HowtoConfigureIntelliJ link is broken

2019-07-22 Thread Jason Gerlowski
Hi Richard,

Thanks for the heads up.  I think this was a known issue.  We recently
moved our wiki from a Moin wiki to a Confluence one, and this changed
the urls.  There was an issue with the redirects at first, but they
appear to be working now.  Glad you were able to find what you needed
regardless.  The url you posted works for me now.  You can also use
the new url: 
https://cwiki.apache.org/confluence/display/lucene/HowtoConfigureIntelliJ

Best,

Jason

On Thu, Jul 18, 2019 at 12:12 PM Richard Goodman
 wrote:
>
> Hi there,
>
> I went to set up the repo with intellij, but it was having some problems
> figuring out the source folders etc., So I went to navigate to the
> following link <https://wiki.apache.org/lucene-java/HowtoConfigureIntelliJ>
> as I remember from the past there were a few commands that helped, however,
> it appears to be broken? I used a website archiver to retrieve the original
> contents, but wasn't sure if it had been raised.
>
> Thanks,
>
> --
>
> Richard Goodman|Data Infrastructure engineer
>
> richa...@brandwatch.com
>
>


Re: Problems using a suggester component in the /select handler in cloud mode

2019-07-22 Thread Jason Gerlowski
Hi Alexandros,

The first step would be to package up your changes in a patch file, and
upload that to the JIRA you linked to in your initial email. (SOLR-12060).
More detailed instructions can be found here:
https://cwiki.apache.org/confluence/display/solr/HowToContribute#HowToContribute-Generatingapatch.
If you prefer, you can also create a PR on github with your changes, and
put a link to the PR on the JIRA ticket.  (There are many guides out there
on creating Github PRs, so I won't get into that.)

Thanks for putting in the effort to share your work.  Good luck!

Best,

Jason

On Tue, Jul 16, 2019 at 8:50 AM Alexandros Paramythis <
alexandros.paramyt...@contexity.ch> wrote:

> Hi everyone,
>
> We have a fix for the problem described in the message below. Could anyone
> provide pointers to documentation on how we would go about contributing
> this back?
>
> Thanks in advance for your input,
>
> Alex
>
>
> On 26/06/2019 10:48, Alexandros Paramythis wrote:
>
> Hi everyone,
>
> Environment:
>
> Solr 7.5.0, cloud mode (but problem should be identical in multiple
> versions, at least in 7.x)
>
> Summary:
>
> We have a Solr configuration that returns suggestions in the course of a
> normal search call (i.e., we have a 'suggest' component added to the
> 'last-components' for '/select' request handler). This does not work in
> cloud mode, where we get an NPE in QueryComponent. This problem seems to
> have been reported in various forms in the past -- see for example [1] and
> [2] (links at the end of this email) -- but we couldn't find any resolution
> (or in-depth discussion for that matter).
>
> In more detail:
>
> We have a suggest component configured as follows:
>
>   
>
> 
> default
>  name="classname">org.apache.solr.spelling.suggest.Suggester
>  name="lookupImpl">org.apache.solr.spelling.suggest.fst.AnalyzingLookupFactory
> dict_default
> text_suggest
> text_suggest
> true
> true
> true
> 
>
> 
> suggest_phrase
>  name="lookupImpl">org.apache.solr.spelling.suggest.fst.AnalyzingLookupFactory
> dict_suggest_phrase
>  name="suggestAnalyzerFieldType">text_suggest_phrase
> suggest_phrase
> true
> true
> true
> 
>
> 
> suggest_infix_shingle
> AnalyzingInfixLookupFactory
> suggestInfixShingleDir
>  name="suggestAnalyzerFieldType">text_suggest_phrase
> suggest_phrase
> true
> true
> true
> 
>
> 
> suggest_prefix
> Suggester
> AnalyzingLookupFactory
>  name="suggestAnalyzerFieldType">text_suggest_prefix
> suggest_prefix
> true
> true
> true
> 
>
>   
>
>
> This component works without issue both in standalone and cloud mode,
> when used as the sole component in a handler, such as in the following
> excerpt:
>
>  startup="lazy">
> 
> default
> suggest_phrase
>  name="suggest.dictionary">suggest_infix_shingle
> suggest_prefix
> true
> 10
> false
> 
> 
> suggest
> 
> 
>
>
> It also works when used along with other components in standalone mode,
> such as in the following excerpt, where we use the suggest component to get
> suggestions during a "normal" search call:
>
> 
> 
> explicit
> 10
> text_search
>
> edismax
>
> title^5.0 subtitle^3.0 abstract^2.0
> text_search
> title^5.0 subtitle^3.0 abstract^2.0
> text_search
> 4
> on
> default
> true
> 10
>  name="spellcheck.alternativeTermCount">5
>  name="spellcheck.maxR

Re: Getting list of unique values in a field

2019-07-15 Thread Jason Gerlowski
The Solr ref-guide has examples which show how to do this too.  Take a
look at some of the faceting examples here:
https://lucene.apache.org/solr/guide/8_1/json-facet-api.html#bucketing-facet-example
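
As a rough illustration, a terms facet over the field returns one bucket
per distinct value.  Here's a minimal Python sketch (not SolrJ) of the
JSON Facet request body and of reading the buckets back out.  The facet
label "extensions" and the helper names are made up for illustration, and
the sample response is hand-written to show the shape, not real output.

```python
import json


def build_unique_values_request(field, label="extensions"):
    """Build a JSON Facet request body listing every distinct value of `field`."""
    return {
        "query": "*:*",
        "limit": 0,  # no documents needed, only the facet buckets
        "facet": {
            label: {"type": "terms", "field": field, "limit": -1}  # -1: no bucket limit
        },
    }


def unique_values(response, label="extensions"):
    """Pull the distinct values out of a JSON Facet response."""
    buckets = response.get("facets", {}).get(label, {}).get("buckets", [])
    return [bucket["val"] for bucket in buckets]


# Hand-written sample with the response shape (not real data).
sample_response = {
    "facets": {
        "count": 6,
        "extensions": {
            "buckets": [
                {"val": "pdf", "count": 3},
                {"val": "docx", "count": 2},
                {"val": "txt", "count": 1},
            ]
        },
    }
}

print(json.dumps(build_unique_values_request("CC_FILE_EXT"), indent=2))
print(unique_values(sample_response))
```

As far as I know the request body can be POSTed to the collection's
/query endpoint; the same "facet" block should carry over to whatever
client you use.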

Best,

Jason

On Fri, Jul 12, 2019 at 9:50 AM David Hastings
 wrote:
>
> i found this:
>
> https://stackoverflow.com/questions/14485031/faceting-using-solrj-and-solr4
>
> and this
>
> https://www.programcreek.com/java-api-examples/?api=org.apache.solr.client.solrj.response.FacetField
>
>
> just from a google search
>
> On Fri, Jul 12, 2019 at 9:46 AM Steven White  wrote:
>
> > Thanks David.  But is there a SolrJ sample code on how to do this?  I need
> > to see one, or at least the API, so I know how to make the call.
> >
> > Steven
> >
> > On Fri, Jul 12, 2019 at 9:42 AM David Hastings <
> > hastings.recurs...@gmail.com>
> > wrote:
> >
> > > just use a facet on the field should work yes?
> > >
> > > On Fri, Jul 12, 2019 at 9:39 AM Steven White 
> > wrote:
> > >
> > > > Hi everyone,
> > > >
> > > > > One of my indexed fields is as follows:
> > > >
> > > >  > > > multiValued="false" indexed="true" required="true" stored="false"/>
> > > >
> > > > It holds the file extension of the files I'm indexing.  That is, let us
> > > say
> > > > I indexed 10 million files and the result of such indexing, the field
> > > > CC_FILE_EXT will now have the file extension.  In my case the unique
> > file
> > > > extension list is about 300.
> > > >
> > > > Using SolrJ, is there a quick and fast way for me to get back all the
> > > > > unique values this field has across all of my documents?  I don't and
> > > cannot
> > > > scan all the 10 million indexed documents in Solr to build that list.
> > > That
> > > > would be very inefficient.
> > > >
> > > > Thanks,
> > > >
> > > > Steven
> > > >
> > >
> >


Re: Release of Solr 8.1.2 bug fix

2019-07-03 Thread Jason Gerlowski
Hi Edwin,

Solr releases can be a messy process.  They're subject to a lot of
unforeseen issues that can drag the process out: test failures
springing up at the last minute, other committers asking to squeeze in
last minute fixes, infrastructure problems cropping up unexpectedly,
etc.  So release-managers rarely offer timelines for when they'll be
able to finish a release.

Cao Manh Dat has volunteered to do the release, and is actively
working on it.  All of the bug fixes that committers asked Dat to wait
for have been merged.  Beyond that, there's no real timeline.
Hopefully it'll be soon, but not necessarily.

Best,

Jason

On Wed, Jul 3, 2019 at 2:34 AM Zheng Lin Edwin Yeo  wrote:
>
> Hi,
>
> I understand that currently there is plan for a Solr 8.1.2 bug fix release
> to resolve some of the bugs, like the SOLR-13510 basic authentication issue.
>
> Would like to check, what is the timeline like for the release?
>
> Regards,
> Edwin


Re: Issue with Solr 7.7.2 - ClassCastException: org.apache.solr.common.util.ByteArrayUtf8CharSequence

2019-06-19 Thread Jason Gerlowski
Hi David,

Thanks for the heads up.  We'd hoped to put an end to these issues as
a part of SOLR-13331, but missed some field types as you pointed out.
We're aware of the issue and working on a fix for upcoming Solr
versions.  Anyone interested can watch our progress here:
https://issues.apache.org/jira/browse/SOLR-13539 . (SOLR-13538 also
has some information)

Best,

Jason

On Tue, Jun 11, 2019 at 4:00 PM David Winter  wrote:
>
> Hi,
>
> I would like to let you know about server side exceptions for specific field
> types after upgrading to 7.7.2, like ClassCastException:
> org.apache.solr.common.util.ByteArrayUtf8CharSequence.
> For references: https://issues.apache.org/jira/browse/SOLR-13285 and
> https://issues.apache.org/jira/browse/SOLR-13331
>
> You may want to check out these issues before upgrading.
>
> java.lang.ClassCastException:
> org.apache.solr.common.util.ByteArrayUtf8CharSequence cannot be cast to
> java.lang.String
> at
> org.apache.solr.schema.TrieDateField.toNativeType(TrieDateField.java:100)
> at
> org.apache.solr.update.processor.AtomicUpdateDocumentMerger.doSet(AtomicUpdateDocumentMerger.java:319)
>
> Mit freundlichen Grüßen / Kind regards
>
> David Winter
>
>
>


Re: Intermittent error 401 with JSON Facet query to retrieve count all collections

2019-06-04 Thread Jason Gerlowski
Hi Edwin,

Thanks for the additional datapoint.  It seemed to work for me, but we
don't really understand the problem yet, so maybe it's not a solid
workaround like I'd hoped.  I'm curious to hear whether it works for
Colvin.

To double check though: forwardCredentials is only supported in Solr >
8.0.  You're using an 8.x version, right?

Jason

On Tue, Jun 4, 2019 at 2:45 AM Zheng Lin Edwin Yeo  wrote:
>
> Hi Jason,
>
> Thanks for your reply.
>
> I have tried to add the "forwardCredentials": true in the security.json,
> but I still get the same error.
>
> Regards,
> Edwin
>
> On Mon, 3 Jun 2019 at 22:19, Colvin Cowie 
> wrote:
>
> > Hi, thanks I'll give that a go when I get a chance.
> >
> > I was trying to reply to an older thread (
> >
> > http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201904.mbox/%3CCAF2DzVXeVZqnixnkbzw0La1ui5N5-RG9PwfMBHG9vmkfBSMzJA%40mail.gmail.com%3E
> > ),
> > which I don't have in my mailbox, so obviously didn't reply to the right
> > address to get my response threaded so mine has appeared on its own. Oops.
> >
> > A JIRA issue was raised on that thread
> > https://issues.apache.org/jira/browse/SOLR-13421 but it's not had any
> > attention.
> >
> >
> > On Mon, 3 Jun 2019 at 14:46, Jason Gerlowski 
> > wrote:
> >
> > > Hi Colvin,
> > >
> > > We're still taking a look at fixing the bug, but as a workaround in
> > > the meantime, you can look into adding a "forwardCredentials":true
> > > property under the "authentication" section of security.json.  That
> > > seems to fix the issue in my reproduction at least.
> > >
> > > e.g.
> > >
> > > {
> > >     "authentication": {
> > >     "blockUnknown": true,
> > > "class": "solr.BasicAuthPlugin",
> > > "credentials": {
> > > "solradmin": ""
> > > },
> > > "forwardCredentials": true
> > > },
> > > ...
> > > }
> > >
> > > Jason
> > >
> > > On Mon, Jun 3, 2019 at 9:31 AM Jason Gerlowski 
> > > wrote:
> > > >
> > > > One last note: as far as I can tell, nothing about this issue is
> > > > specific to JSON Faceting or the JSON request API.  It can be
> > > > triggered just as easily with "/select?q=*:*".
> > > >
> > > > The bug created for this is: SOLR-13510
> > > >
> > > > On Mon, Jun 3, 2019 at 9:17 AM Jason Gerlowski 
> > > wrote:
> > > > >
> > > > > I'm also able to reproduce this bug on master.  A few more notes
> > about
> > > > > the bad behavior:
> > > > >
> > > > > - the behavior occurs regardless of the specific permissions
> > > > > configured in security.json.  (i.e. whether the top permission is
> > > > > "all", or "security-edit", or there are no permissions at all.)
> > > > > - I tried looking for a pattern in which requests saw the 401s, but
> > > > > didn't have any luck.  The 401 occurs when talking to the whole
> > > > > collection or targeting individual cores directly.  It occurs when
> > > > > curl hits a host containing a replica for the collection in question,
> > > > > > and when it doesn't, etc.  This distinguishes it from SOLR-13472,
> > which
> > > > > seems more specific to collection structure/layout.
> > > > >
> > > > > I'll create a bug for this in JIRA.
> > > > >
> > > > > On Sun, Jun 2, 2019 at 9:53 AM Colvin Cowie <
> > > colvin.cowie@gmail.com> wrote:
> > > > > >
> > > > > > Hello. I encountered this issue too and wrote this up before I
> > found
> > > this
> > > > > > thread, but I thought I might as well post it still, if it helps...
> > > > > >
> > > > > > Currently I'm trying to move our product on to Solr 8.1.1. We are
> > > currently
> > > > > > using 6.6.6, so things have definitely moved on.
> > > > > >
> > > > > > We use the BasicAuthPlugin + RuleBasedAuthorizationPlugin to lock
> > > down Solr
> > > > > > (and we also secure our zookeeper). Here's an example for solradmin
> > > as the
> > > > > > user and passw

Re: alias read access impossible for anyone other than admin?

2019-06-04 Thread Jason Gerlowski
Hi Sotiris,

First off, forget what I said earlier about the "all" permission.
What I said is mostly correct, but I had forgotten about some of the
other behavior here that complicates things some.

I replicated the behavior you're seeing and spent a bit of time
tracing things through on the Solr side.  I'll walk through it below
in more detail, but ultimately what I think you're running into is
that aliases are resolved _before_ authorization is done.  The only
way to write permissions affecting an alias is to write permissions
that affect the underlying collections.  The way I'm reading the authz
code, a permission like {"collection": "sCollAlias", "path":
"/select/*", "role": "readSCollAlias"} (taken from your first email),
will never have any effect because Solr treats that incoming request
as being for collection "sColl" (I'm just guessing at this name) by
the point authz gets triggered.

> Could someone please ELI5 going through the rules one by one?

RBAP's process of authorizing requests is complicated, but it might
help to think of it happening in 2 distinct steps:

1. Find the first rule (if any) that matches the incoming request.  A
rule matches if the "collection", "path", "method", and "params"
properties all match values in the incoming request.
2. Look at the "role" for the selected permission.  Allow the request
if the user making the request has this role.  Otherwise deny the
request.

The second half of this process is pretty straightforward.  The first
half (determining which permission rule governs the request) is the
complicated bit that causes most of the confusion.  So in debugging
RBAP, the real question is: how are permissions ordered, and how does
Solr determine which one matches an incoming request?

1. First, Solr figures out which collections are involved in a
request.  Solr looks at the path param (e.g. /solr/foo/select), the
"collection" query-param (e.g.
/collections?action=CLUSTERSTATUS&collection=foo), and resolves any
aliases (e.g. fooAlias -> fooCollection).  It gets the collections
referenced in all these places and puts them together in a list.
2. Solr begins looking for rules that match the incoming request.  A
permission is considered a match if the "collection", "path",
"method", and "params" properties (when defined) all match values on
the incoming request.  security.json shows the permissions in a flat
list, but this isn't the order they're tested in.  Instead, the
permissions are tested in the following order:
2a. Permissions with both "collection" and "path" present and matching
the incoming request
2b. Permissions with "collection" matching, but no path specified
2c. Permissions with no "collection" specified (or the wildcard values
specified) and a "path" matching the incoming request
2d. Permissions agnostic of both "collection" and "path"
3. Within each of the sub-steps above, permissions are tested in the
order they appear in security.json.
4. When testing each individual permission, Solr then looks at the
remaining properties ("method", "params").  If those check out, we've
got a winner.  This is the only permission that will matter for this
request.
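
The ordering above can be sketched in a few lines of Python.  This is a
simplified model of the steps as I've described them, not Solr's actual
code: the function names are made up, "method" and "params" are left out
(they're absent from the rules in this thread, so they'd match anyway),
the "/select/*" wildcard handling is an assumption, and a rule's "role"
is treated as a single string.  The data at the bottom is the
permissions block from the first email in this thread, with sCollAlias
assumed to point at "Coll".

```python
def path_matches(rule_path, request_path):
    """Assumed matching: exact, or a trailing '/*' covering the prefix and below."""
    if rule_path.endswith("/*"):
        prefix = rule_path[:-2]
        return request_path == prefix or request_path.startswith(prefix + "/")
    return rule_path == request_path


def tier(rule, collections, request_path):
    """Return 0-3 for steps 2a-2d, or None if the rule doesn't match."""
    coll = rule.get("collection")
    path = rule.get("path")
    names_collection = coll not in (None, "*")
    if names_collection and coll not in collections:
        return None
    if path is not None and not path_matches(path, request_path):
        return None
    if names_collection:
        return 0 if path is not None else 1
    return 2 if path is not None else 3


def governing_permission(permissions, collections, request_path):
    """Lowest tier wins; within a tier, security.json order wins (step 3)."""
    best = None
    for position, rule in enumerate(permissions):
        t = tier(rule, collections, request_path)
        if t is None:
            continue
        key = (t, position)
        if best is None or key < best[0]:
            best = (key, rule)
    return None if best is None else best[1]


def authorize(security, aliases, user, collection, request_path):
    """Resolve the alias first, then pick the governing rule and check the role."""
    collections = [aliases.get(collection, collection)]
    rule = governing_permission(security["permissions"], collections, request_path)
    if rule is None:
        return None, None  # the no-match case isn't modelled in this sketch
    roles = security["user-role"].get(user, [])
    if isinstance(roles, str):
        roles = [roles]
    return rule["name"], rule["role"] in roles


security = {
    "permissions": [
        {"name": "all", "role": "admin"},
        {"name": "readColl", "collection": "Coll",
         "path": "/select/*", "role": "readColl"},
        {"name": "readSCollAlias", "collection": "sCollAlias",
         "path": "/select/*", "role": "readSCollAlias"},
    ],
    "user-role": {"admin": ["admin", "readSCollAlias"],
                  "user": ["readSCollAlias"]},
}
aliases = {"sCollAlias": "Coll"}  # assumed alias target

print(authorize(security, aliases, "user", "sCollAlias", "/select"))
# -> ('readColl', False)
```

Because the alias is resolved before any rule is tested, the
"readSCollAlias" rule never gets a chance to match in this model.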

So that's the logic in how the rules are processed.  Let's walk
through your originally posted security.json and see how this works
out in practice for a request from "user".  I'm assuming that
sCollAlias is an alias that references the single collection "Coll".
Imagine a request from "user" for
http://:8983/solr/sCollAlias/select?q=*:*

1. Solr looks at the request and realizes that sCollAlias really
points to Coll.  It puts "Coll" in its list of referenced-collections.
2. Solr looks for permissions with a "collection" value of "Coll" and
a path value of "/select".  There is one: {name:readColl,
collection:Coll, path:/select, role:readColl}
3. Looking at that permission further, Solr makes sure the "method"
and "params" properties match the request.  Since the properties
aren't present, they're treated as wildcards and implicitly match.
4. So we've found a matching permission, now Solr checks whether
"user" has the correct role.  The permission says that this request
can only be made by those with the role "readColl".  But "user" only
has the role "readSCollAlias".  So the request is denied.

Hope that example helps.  Let me know if you have any more questions.

Best,

Jason

On Mon, Jun 3, 2019 at 2:06 PM Sotiris Fragkiskos  wrote:
>
> it's 7.2.1. Thanks!
>
> On Mon, Jun 3, 2019 at 6:26 PM Jason Gerlowski 
> wrote:
>
> > Hi Sotiris,
> >
> > What version of Solr are you running? 

Re: alias read access impossible for anyone other than admin?

2019-06-03 Thread Jason Gerlowski
Hi Sotiris,

What version of Solr are you running?  The behavior has changed over
time, both intentionally and due to bugs that have come and gone.  I
(or someone else) can explain things and offer you better help once we
know your Solr version.

Jason

On Mon, Jun 3, 2019 at 12:13 PM Sotiris Fragkiskos  wrote:
>
> Hi again,
>
> I moved the "all" permission to the bottom as suggested, but it still
> doesn't work. Actually, i tried all possible combinations that I could
> think of, but I just can't get it to work.
> Could there be something else that I'm doing wrong? I'm a complete newbie,
> so pretty much anything is a possibility at this point :(
> Could it be because I use getfile/putfile commands to update the
> security.json file? (it seems to be working, i.e. what i put with putfile
> is later retrieved successfully with getfile)
> Could there be some system update/refresh mechanism that I'm not aware of
> and is currently not taking place?
> Could someone please ELI5 going through the rules one by one? I can't
> exactly understand the "narrative" that's going on,
>
> My security.json file's "authorization"  at this point looks like the
> snippet below, and almost nothing is working (except admin, and userC who,
> for some weird reason, can access  readCollC55b , which is tied to a role
> that the userC is NOT tied to..
> I'm completely lost any pointers, anyone?
> Mind you, i'm testing whether it works either directly in the browser by
> prepending a "username:password@" to the URL or from the cmdline with a
> curl command like so:
> *curl http://@IP/solr/collName/select?q=field:value*
>
> Many thanks!
> Sotiri
>
> "authorization":{
> "class":"solr.RuleBasedAuthorizationPlugin",
> "permissions":[
>   {
> "name":"readCollA",
> "collection":"CollA",
> "path":"/select/*",
> "role":"readCollA",
> "index":1},
>   {
> "name":"readCollB",
> "collection":"CollB",
> "path":"/select/*",
> "role":"readCollB",
> "index":2},
>   {
> "name":"readCollC55b",
> "collection":"CollC55b",
> "path":"/select/*",
> "role":"readCollC55b",
> "index":3},
>   {
> "name":"readCollCProduction",
> "collection":"CollCProd",
> "path":"/select/*",
> "role":"readCollCProduction",
> "index":4},
>   {
> "name":"all",
> "role":"admin",
> "index":5}],
> "user-role":{
>   "admin":[
> "admin",
> "readCollB",
> "readCollA",
> "readCollC55b",
> "readCollCProduction"],
>   "userA":["readCollC55b"],
>   "userB":["readCollC55b"],
>   "userC":["readCollCProduction"],
>   "userD":[
> "readCollCProduction",
> "readCollC55b",
>     "readCollB",
> "readCollA"]},
>
>
>
> On Fri, May 31, 2019 at 9:07 PM Sotiris Fragkiskos 
> wrote:
>
> > Terribly sorry about the duplicate post. It was just when i had first
> > subscribed, i mustn't have verified my subscription because i never
> > received any posts. I could also not find my post in the mailing list
> > archive, so I thought it never arrived. It was only today that I tried
> > subscribing again (+verifying) that I started receiving emails.
> > Thanks for your explanation, I had read this in the manual but it didn't
> > make much sense to me. I interpreted my order as: "first rule, the request
> > is not from an admin so fail, check the next rule, it's from role readColl
> > trying to access Coll, go ahead"
> > I will try it as soon as I can. Thanks very much.
> > I'm currently using 7.2.
> >
> > On Fri, May 31, 2019 at 8:27 PM Jason Gerlowski 
> > wrote:
> >
> >> Hi Sotiris,
> >>
> >> Is this your second time asking this question here, or is there a
> >> subtle difference I'm missing?  You asked a very similar question a
> >> week or so ago, and 

Re: Intermittent error 401 with JSON Facet query to retrieve count all collections

2019-06-03 Thread Jason Gerlowski
Hi Colvin,

We're still taking a look at fixing the bug, but as a workaround in
the meantime, you can look into adding a "forwardCredentials":true
property under the "authentication" section of security.json.  That
seems to fix the issue in my reproduction at least.

e.g.

{
"authentication": {
"blockUnknown": true,
"class": "solr.BasicAuthPlugin",
"credentials": {
"solradmin": ""
},
    "forwardCredentials": true
},
...
}

Jason

On Mon, Jun 3, 2019 at 9:31 AM Jason Gerlowski  wrote:
>
> One last note: as far as I can tell, nothing about this issue is
> specific to JSON Faceting or the JSON request API.  It can be
> triggered just as easily with "/select?q=*:*".
>
> The bug created for this is: SOLR-13510
>
> On Mon, Jun 3, 2019 at 9:17 AM Jason Gerlowski  wrote:
> >
> > I'm also able to reproduce this bug on master.  A few more notes about
> > the bad behavior:
> >
> > - the behavior occurs regardless of the specific permissions
> > configured in security.json.  (i.e. whether the top permission is
> > "all", or "security-edit", or there are no permissions at all.)
> > - I tried looking for a pattern in which requests saw the 401s, but
> > didn't have any luck.  The 401 occurs when talking to the whole
> > collection or targeting individual cores directly.  It occurs when
> > curl hits a host containing a replica for the collection in question,
> > and when it doesn't, etc.  This distinguishes it from SOLR-13472, which
> > seems more specific to collection structure/layout.
> >
> > I'll create a bug for this in JIRA.
> >
> > On Sun, Jun 2, 2019 at 9:53 AM Colvin Cowie  
> > wrote:
> > >
> > > Hello. I encountered this issue too and wrote this up before I found this
> > > thread, but I thought I might as well post it still, if it helps...
> > >
> > > Currently I'm trying to move our product on to Solr 8.1.1. We are 
> > > currently
> > > using 6.6.6, so things have definitely moved on.
> > >
> > > We use the BasicAuthPlugin + RuleBasedAuthorizationPlugin to lock down 
> > > Solr
> > > (and we also secure our zookeeper). Here's an example for solradmin as the
> > > user and password
> > >
> > > {
> > > "authentication": {
> > > "blockUnknown": true,
> > > "class": "solr.BasicAuthPlugin",
> > > "credentials": {
> > > "solradmin": "PIWZwkGnEKxKnqUs3X08xmbmYBaYyAeP3FiKp7fmeHc=
> > > Lnbp6bEbE7Ap8lXvQDKkUX2Xw53QDgP6Ae8QRT0P5/A="
> > > }
> > > },
> > > "authorization": {
> > > "class": "solr.RuleBasedAuthorizationPlugin",
> > > "permissions": [
> > > {
> > > "name": "all",
> > > "role": "admin"
> > > }
> > > ],
> > > "user-role": {
> > > "solradmin": "admin"
> > > }
> > > }
> > > }
> > >
> > >
> > > On Solr 8.1.1, using our previously working security.json, running queries
> > > (through the admin UI currently) I non-deterministically get 401 responses
> > > on queries when a collection has more than 1 shard. Increasing the number
> > > of shards in the collection makes the errors more likely.
> > >
> > > {
> > >   "responseHeader":{
> > > "zkConnected":true,
> > > "status":401,
> > > "QTime":30,
> > > "params":{
> > >   "q":"*:*",
> > >   "_":"1559474550365"}},
> > >   "error":{
> > > "metadata":[
> > >
> > > "error-class","org.apache.solr.client.solrj.impl.BaseHttpSolrClient$RemoteSolrException",
> > >
> > > "root-error-class","org.apache.solr.client.solrj.impl.BaseHttpSolrClient$RemoteSolrException"],
> > > "msg":"Error from server at null: Expected mime type
> > > application/octet-stream but got text/html. \n\n > > http-equiv=\"Content-Type\"
> > > content=\"text/html;charset=utf-8\"/>\nError 401 requi

Re: Intermittent error 401 with JSON Facet query to retrieve count all collections

2019-06-03 Thread Jason Gerlowski
One last note: as far as I can tell, nothing about this issue is
specific to JSON Faceting or the JSON request API.  It can be
triggered just as easily with "/select?q=*:*".

The bug created for this is: SOLR-13510
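
For anyone who wants to measure how often it happens on their own
cluster, here's a rough Python sketch that repeats a request and tallies
the status codes.  The helper names are made up, and the URL and
credentials are whatever applies to your setup.  The demo at the bottom
feeds in canned status codes so it runs without a Solr instance; they
are stand-ins, not measurements.

```python
import base64
import urllib.error
import urllib.request
from collections import Counter


def http_status(url, user, password):
    """Issue one GET with a Basic auth header and return the HTTP status code."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url, headers={"Authorization": "Basic " + token})
    try:
        with urllib.request.urlopen(request) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # 401s and other error statuses land here


def tally_statuses(fetch, attempts):
    """Call `fetch` repeatedly and count how often each status code comes back."""
    return Counter(fetch() for _ in range(attempts))


# Offline demo with canned codes (stand-ins only).
canned = iter([200, 401, 200, 200])
print(tally_statuses(lambda: next(canned), 4))
```

Against a live cluster you'd pass something like
lambda: http_status(url, user, password) as the fetch argument, with url
pointing at the collection's "/select?q=*:*".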

On Mon, Jun 3, 2019 at 9:17 AM Jason Gerlowski  wrote:
>
> I'm also able to reproduce this bug on master.  A few more notes about
> the bad behavior:
>
> - the behavior occurs regardless of the specific permissions
> configured in security.json.  (i.e. whether the top permission is
> "all", or "security-edit", or there are no permissions at all.)
> - I tried looking for a pattern in which requests saw the 401s, but
> didn't have any luck.  The 401 occurs when talking to the whole
> collection or targeting individual cores directly.  It occurs when
> curl hits a host containing a replica for the collection in question,
> and when it doesnt. etc.  This distinguishes it from SOLR-13472, which
> seems more specific to collection structure/layout.
>
> I'll create a bug for this in JIRA.
>
> On Sun, Jun 2, 2019 at 9:53 AM Colvin Cowie  
> wrote:
> >
> > Hello. I encountered this issue too and wrote this up before I found this
> > thread, but I thought I might as well post it still, if it helps...
> >
> > Currently I'm trying to move our product on to Solr 8.1.1. We are currently
> > using 6.6.6, so things have definitely moved on.
> >
> > We use the BasicAuthPlugin + RuleBasedAuthorizationPlugin to lock down Solr
> > (and we also secure our zookeeper). Here's an example for solradmin as the
> > user and password
> >
> > {
> > "authentication": {
> > "blockUnknown": true,
> > "class": "solr.BasicAuthPlugin",
> > "credentials": {
> > "solradmin": "PIWZwkGnEKxKnqUs3X08xmbmYBaYyAeP3FiKp7fmeHc=
> > Lnbp6bEbE7Ap8lXvQDKkUX2Xw53QDgP6Ae8QRT0P5/A="
> > }
> > },
> > "authorization": {
> > "class": "solr.RuleBasedAuthorizationPlugin",
> > "permissions": [
> > {
> > "name": "all",
> > "role": "admin"
> > }
> > ],
> > "user-role": {
> > "solradmin": "admin"
> > }
> > }
> > }
> >
> >
> > On Solr 8.1.1, using our previously working security.json, running queries
> > (through the admin UI currently) I non-deterministically get 401 responses
> > on queries when a collection has more than 1 shard. Increasing the number
> > of shards in the collection makes the errors more likely.
> >
> > {
> >   "responseHeader":{
> > "zkConnected":true,
> > "status":401,
> > "QTime":30,
> > "params":{
> >   "q":"*:*",
> >   "_":"1559474550365"}},
> >   "error":{
> > "metadata":[
> >
> > "error-class","org.apache.solr.client.solrj.impl.BaseHttpSolrClient$RemoteSolrException",
> >
> > "root-error-class","org.apache.solr.client.solrj.impl.BaseHttpSolrClient$RemoteSolrException"],
> > "msg":"Error from server at null: Expected mime type
> > application/octet-stream but got text/html. \n\n > http-equiv=\"Content-Type\"
> > content=\"text/html;charset=utf-8\"/>\nError 401 require
> > authentication\n\nHTTP ERROR 401\nProblem
> > accessing /solr/gettingstarted_shard4_replica_n6/select. Reason:\n
> >  require authentication\n\n\n",
> > "code":401}}
> >
> > The security stats indicate this is happening because the requests do not
> > have credentials with them, e.g.
> > http://localhost:8983/solr/#/gettingstarted_shard4_replica_n6/plugins?type=security=org.apache.solr.security.BasicAuthPlugin
> >
> >  org.apache.solr.security.BasicAuthPlugin
> > class:
> > org.apache.solr.security.BasicAuthPlugin
> > description:
> > Authentication Plugin org.apache.solr.security.BasicAuthPlugin
> > stats
> > SECURITY./authentication.authenticated:
> > 182
> > SECURITY./authentication.errors.count:
> > 0
> > SECURITY./authentication.failMissingCredentials:
> > 58
> > SECURITY./authentication.failWrongCredentials:
> >   

Re: Intermittent error 401 with JSON Facet query to retrieve count all collections

2019-06-03 Thread Jason Gerlowski
I'm also able to reproduce this bug on master.  A few more notes about
the bad behavior:

- the behavior occurs regardless of the specific permissions
configured in security.json.  (i.e. whether the top permission is
"all", or "security-edit", or there are no permissions at all.)
- I tried looking for a pattern in which requests saw the 401s, but
didn't have any luck.  The 401 occurs when talking to the whole
collection or targeting individual cores directly.  It occurs when
curl hits a host containing a replica for the collection in question,
and when it doesn't, etc.  This distinguishes it from SOLR-13472, which
seems more specific to collection structure/layout.

I'll create a bug for this in JIRA.

On Sun, Jun 2, 2019 at 9:53 AM Colvin Cowie  wrote:
>
> Hello. I encountered this issue too and wrote this up before I found this
> thread, but I thought I might as well post it still, if it helps...
>
> Currently I'm trying to move our product on to Solr 8.1.1. We are currently
> using 6.6.6, so things have definitely moved on.
>
> We use the BasicAuthPlugin + RuleBasedAuthorizationPlugin to lock down Solr
> (and we also secure our zookeeper). Here's an example for solradmin as the
> user and password
>
> {
> "authentication": {
> "blockUnknown": true,
> "class": "solr.BasicAuthPlugin",
> "credentials": {
> "solradmin": "PIWZwkGnEKxKnqUs3X08xmbmYBaYyAeP3FiKp7fmeHc=
> Lnbp6bEbE7Ap8lXvQDKkUX2Xw53QDgP6Ae8QRT0P5/A="
> }
> },
> "authorization": {
> "class": "solr.RuleBasedAuthorizationPlugin",
> "permissions": [
> {
> "name": "all",
> "role": "admin"
> }
> ],
> "user-role": {
> "solradmin": "admin"
> }
> }
> }
>
>
> On Solr 8.1.1, using our previously working security.json, running queries
> (through the admin UI currently) I non-deterministically get 401 responses
> on queries when a collection has more than 1 shard. Increasing the number
> of shards in the collection makes the errors more likely.
>
> {
>   "responseHeader":{
> "zkConnected":true,
> "status":401,
> "QTime":30,
> "params":{
>   "q":"*:*",
>   "_":"1559474550365"}},
>   "error":{
> "metadata":[
>
> "error-class","org.apache.solr.client.solrj.impl.BaseHttpSolrClient$RemoteSolrException",
>
> "root-error-class","org.apache.solr.client.solrj.impl.BaseHttpSolrClient$RemoteSolrException"],
> "msg":"Error from server at null: Expected mime type
> application/octet-stream but got text/html. \n\n http-equiv=\"Content-Type\"
> content=\"text/html;charset=utf-8\"/>\nError 401 require
> authentication\n\nHTTP ERROR 401\nProblem
> accessing /solr/gettingstarted_shard4_replica_n6/select. Reason:\n
>  require authentication\n\n\n",
> "code":401}}
>
> The security stats indicate this is happening because the requests do not
> have credentials with them, e.g.
> http://localhost:8983/solr/#/gettingstarted_shard4_replica_n6/plugins?type=security=org.apache.solr.security.BasicAuthPlugin
>
>  org.apache.solr.security.BasicAuthPlugin
> class:
> org.apache.solr.security.BasicAuthPlugin
> description:
> Authentication Plugin org.apache.solr.security.BasicAuthPlugin
> stats
> SECURITY./authentication.authenticated:
> 182
> SECURITY./authentication.errors.count:
> 0
> SECURITY./authentication.failMissingCredentials:
> 58
> SECURITY./authentication.failWrongCredentials:
> 0
> SECURITY./authentication.passThrough:
> 0
> SECURITY./authentication.requestTimes.meanRate:
> 0.4183414110946125
> SECURITY./authentication.requests:
> 240
> SECURITY./authentication.totalTime:
> 117791100
>
> I assume that this is connected to the changes around
> https://issues.apache.org/jira/browse/SOLR-7896 and
> https://issues.apache.org/jira/browse/SOLR-13344 I've tested with Solr
> 7.6.0 and it appears to be unaffected
>
> Repro steps:
># Extract solr 8.1.1.
># bin\solr start -e cloud
> 1 node / [default port] / [default collection name] / 4 shards / 1
> replica / [_default configuration]
># server\scripts\cloud-scripts\zkcli -zkhost localhost:9983 -cmd putfile
> /security.json 
>
># Execute repeated GETS to
> http://localhost:8983/solr/gettingstarted/select?q=*%3A* - a lot of them,
> but not all, will fail with 401s
>
>
> Also as a side note, because the authentication is now done through the
> form login rather than the browser basic auth, if you go directly to a non
> UI url (e.g. http://localhost:8983/solr/main_index/select?q=*%3A*) you have
> to authenticate to it using the browser's basic auth prompt. Which is
> slightly annoying since the query page in the Admin UI generates links to
> it for the queries you run, and you've already authenticated to get there.
> But it's not a 

Re: Adding Multiple JSON Documents

2019-06-03 Thread Jason Gerlowski
Hi John,

I believe the documentation there is correct.  That is: those are two
different "update" APIs.  /update takes a JSON array of potentially
multiple docs, /update/json/docs takes either a JSON array of multiple
docs, or a single document not wrapped in the JSON array syntax.
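
To make the difference concrete, here's a small Python sketch that
builds (but doesn't send) a request for each endpoint.  The base URL and
collection name are copied from the curl example in the ref-guide; the
helper name is made up.

```python
import json
import urllib.request


def build_update_request(base_url, collection, docs, use_json_docs=False):
    """Build a POST for /update (JSON array) or /update/json/docs (array or single doc)."""
    handler = "update/json/docs" if use_json_docs else "update"
    url = f"{base_url}/{collection}/{handler}"
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


base = "http://localhost:8983/solr"

# /update: a JSON array, even when there are several documents.
many = build_update_request(base, "my_collection", [
    {"id": "1", "title": "Doc 1"},
    {"id": "2", "title": "Doc 2"},
])

# /update/json/docs: a single document with no surrounding array is accepted too.
single = build_update_request(
    base, "my_collection", {"id": "3", "title": "Doc 3"}, use_json_docs=True
)

print(many.get_method(), many.full_url)
print(single.get_method(), single.full_url)
```

Passing either object to urllib.request.urlopen() would actually send it.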

Best,

Jason

On Sun, Jun 2, 2019 at 10:50 PM John Davis  wrote:
>
> Hi there,
>
> I was looking at the solr documentation for indexing multiple documents via
> json and noticed inconsistency in the docs.
>
> Should the POST url be /update/*json/docs *instead of just /update. It does
> look like former does work, unless both will work just fine?
>
> https://lucene.apache.org/solr/guide/7_3/uploading-data-with-index-handlers.html#adding-multiple-json-documents
> Adding Multiple JSON Documents
> <https://lucene.apache.org/solr/guide/7_3/uploading-data-with-index-handlers.html#adding-multiple-json-documents>
>
> Adding multiple documents at one time via JSON can be done via a JSON Array
> of JSON Objects, where each object represents a document:
>
> curl -X POST -H 'Content-Type: application/json'
> 'http://localhost:8983/solr/my_collection/*update*' --data-binary '[
> {"id": "1","title": "Doc 1"  },  {"id": "2","title":
> "Doc 2"  }]'


Re: alias read access impossible for anyone other than admin?

2019-05-31 Thread Jason Gerlowski
Hi Sotiris,

Is this your second time asking this question here, or is there a
subtle difference I'm missing?  You asked a very similar question a
week or so ago, and I replied with a few suggestions for changing your
security.json and with a few questions.  In case you missed it for
whatever reason, I'll include my original response below:

-

Hi Sotiris,

First, what version of Solr are you running?  We've made some fixes
recently (esp. SOLR-13355) to RBAP, and they might affect the behavior
you're seeing or any fixes we can recommend.

Second, the order of permissions in security.json has a huge effect on
how requests are authorized.  Solr always uses the first permission rule
that matches a given API; later rules are ignored if a match is found in
earlier ones.
The first rule in your permissions block ({"name": "all", "role":
"admin"}) will match all APIs and will only allow requests through if
the requesting user has the "admin" role.  So "user" being unable to
query an alias makes sense.  Usually "all" and other catchall
permissions are best used at the very bottom of your permissions list.
That way the catchall is the last rule to be checked, giving other
rules a chance to match first.
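
As a rough illustration of the first-match behavior described above,
here is a small Python model.  It is only a sketch, not Solr's
implementation: the function names, the reduced rule format, and the
assumption that a rule matches purely by collection name are all made
up for the example.

```python
# Simplified model of first-match rule evaluation (NOT Solr's real code).

def rule_matches(rule, collection):
    # Assumption: "all" matches every request; any other rule matches
    # only when its "collection" equals the requested collection/alias.
    if rule["name"] == "all":
        return True
    return rule.get("collection") == collection


def first_match(permissions, collection):
    """Return the first rule that matches the request, or None."""
    for rule in permissions:
        if rule_matches(rule, collection):
            return rule
    return None


def is_allowed(permissions, user_roles, collection):
    rule = first_match(permissions, collection)
    if rule is None:
        # No governing rule found; this model lets the request through.
        return True
    return rule["role"] in user_roles


# Hypothetical rule lists: same rules, different order.
catch_all_first = [
    {"name": "all", "role": "admin"},
    {"name": "readSCollAlias", "collection": "sCollAlias",
     "role": "readSCollAlias"},
]
catch_all_last = list(reversed(catch_all_first))

user_roles = {"readSCollAlias"}
print(is_allowed(catch_all_first, user_roles, "sCollAlias"))  # False
print(is_allowed(catch_all_last, user_roles, "sCollAlias"))   # True
```

With the catch-all first, "user" is rejected because the "all" rule
wins; with the catch-all last, the alias rule gets to match first.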

Hope that helps.

On Fri, May 31, 2019 at 9:34 AM Sotiris Fragkiskos  wrote:
>
> Hi everyone!
> I've been trying unsuccessfully to read an alias to a collection with a
> curl command.
> The command only works when I put in the admin credentials, although the
> user I want access for also has the required role for accessing.
> Is this perhaps built-in, or should anyone be able to access an alias from
> the API?
>
> The command I'm using is:
> curl http://:@/solr
> //select?q=:
> This fails for the user but succeeds for the admin
>
> My minimum working example of security.json follows.
> Many thanks!
>
> {
>   "authentication":{
> "blockUnknown":true,
> "class":"solr.BasicAuthPlugin",
> "credentials":{
>   "admin":"blahblahblah",
>   "user":"blahblah"},
> "":{"v":13}},
>   "authorization":{
> "class":"solr.RuleBasedAuthorizationPlugin",
> "permissions":[
>   {
> "name":"all",
> "role":"admin",
> "index":1},
>   {
> "name":"readColl",
> "collection":"Coll",
> "path":"/select/*",
> "role":"readColl",
> "index":2},
>   {
> "name":"readSCollAlias",
> "collection":"sCollAlias",
> "path":"/select/*",
> "role":"readSCollAlias",
> "index":3}],
> "user-role":{
>   "admin":[
> "admin",
> "readSCollAlias"],
>   "user":["readSCollAlias"]},
> "":{"v":21}}}


Re: Solr 7.7.1 indexing failing with analysis error: startOffset must be non-negative, and endOffset must be >= startOffset, and offsets must not go backwards

2019-05-28 Thread Jason J Baik
This might be of interest to you:
https://issues.apache.org/jira/browse/LUCENE-8776

On Mon, May 27, 2019 at 10:32 PM Zheng Lin Edwin Yeo 
wrote:

> How are you indexing the message, or what is the command that you used to
> index the message?
>
> Also, the attachment might not make it to the server, so you likely need to
> upload the file to a file sharing / storage site and share the link here.
>
> Regards,
> Edwin
>
> On Mon, 27 May 2019 at 15:24, SAM  wrote:
>
> > indexing a message on solr7.7.1 is failing with the following error. any
> > help is appreciated. attaching schema files.
> >
> > 2019-05-24 19:32:42.010 ERROR (qtp1115201599-17) [c:bn_sample s:shard1
> r:core_node2 x:bn_sample_shard1_replica_n1] o.a.s.h.RequestHandlerBase
> org.apache.solr.common.SolrException: Exception writing document id 1 to
> the index; possible analysis error: startOffset must be non-negative, and
> endOffset must be >= startOffset, and offsets must not go backwards
> startOffset=1,endOffset=3,lastStartOffset=6721 for field 'message_text'
> > at
> org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:243)
> > at
> org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:67)
> > at
> org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:55)
> > at
> org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:1001)
> > at
> org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:1222)
> > at
> org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:693)
> > at
> org.apache.solr.update.processor.LogUpdateProcessorFactory$LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:103)
> > at
> org.apache.solr.handler.loader.JavabinLoader$1.update(JavabinLoader.java:110)
> > at
> org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$StreamingCodec.readOuterMostDocIterator(JavaBinUpdateRequestCodec.java:327)
> > at
> org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$StreamingCodec.readIterator(JavaBinUpdateRequestCodec.java:280)
> > at
> org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinCodec.java:333)
> > at
> org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:278)
> > at
> org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$StreamingCodec.readNamedList(JavaBinUpdateRequestCodec.java:235)
> > at
> org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinCodec.java:298)
> > at
> org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:278)
> > at
> org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:191)
> > at
> org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec.unmarshal(JavaBinUpdateRequestCodec.java:126)
> > at
> org.apache.solr.handler.loader.JavabinLoader.parseAndLoadDocs(JavabinLoader.java:123)
> > at
> org.apache.solr.handler.loader.JavabinLoader.load(JavabinLoader.java:70)
> > at
> org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:97)
> > at
> org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68)
> > at
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:199)
> > at org.apache.solr.core.SolrCore.execute(SolrCore.java:2551)
> > at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:710)
> > at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:516)
> > at
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:395)
> > at
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:341)
> > at
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1602)
> > at
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:540)
> > at
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:146)
> > at
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
> > at
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
> > at
> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:257)
> > at
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1588)
> > at
> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:255)
> > at
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1345)
> > at
> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203)
> > at
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:480)
> > at
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1557)
> > at
> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:201)
> > at
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1247)
> > at

Re: alias read access impossible for anyone other than admin?

2019-05-28 Thread Jason Gerlowski
Hey Aroop,

The fix in SOLR-13355 is available starting in 8.1.  It will also be
available in 7.7.2 once that is released.  (Jan Høydahl started the
release process for 7.7.2, but held off for a number of other ongoing
releases.  He's recently resumed work on the release though, and I
expect we'll see 7.7.2 in a week or two.)

RuleBasedAuthorizationPlugin does have some coverage in the ref-guide,
as you've likely seen:
https://lucene.apache.org/solr/guide/7_7/rule-based-authorization-plugin.html.
I don't think SOLR-13355 involved any changes to that documentation:
it fixed a bug that deviated from what was described in the ref-guide,
so there were no changes required when that bug was fixed.  That said,
if you see something I've missed, or think that page could be improved
more generally, it's definitely worth raising a JIRA for.  RBAP
permission matching/processing can be subtle for those using it for
the first time, so any improvement to the docs will go a long way.

Jason

On Sat, May 25, 2019 at 3:12 AM Aroop Ganguly  wrote:
>
> hi jason
>
> which version of solr has the definitive fix for the rbap again ?
> also is there a jira to fix or create a documentation for the same that works 
> :) ?
>
> aroop
>
>
> > On May 24, 2019, at 9:55 AM, Jason Gerlowski  wrote:
> >
> > Hi Sotiris,
> >
> > First, what version of Solr are you running?  We've made some fixes
> > recently (esp. SOLR-13355) to RBAP, and they might affect the behavior
> > you're seeing or any fixes we can recommend.
> >
> > Second, the order of permissions in security.json has a huge effect on
> > how requests are authorized.  Solr always uses the first permission rule
> > that matches a given
> > API...later rules are ignored if a match is found in earlier ones.
> > The first rule in your permissions block ({"name": "all", "role":
> > "admin"}) will match all APIs and will only allow requests through if
> > the requesting user has the "admin" role.  So "user" being unable to
> > query an alias makes sense.  Usually "all" and other catchall
> > permissions are best used at the very bottom of your permissions list.
> > That way the catchall is the last rule to be checked, giving other
> > rules a chance to match first.
> >
> > Hope that helps.
> >
> > Jason
> >
> > On Wed, May 22, 2019 at 6:21 AM Sotiris Fragkiskos  
> > wrote:
> >>
> >> Hi everyone!
> >> I've been trying unsuccessfully to read an alias to a collection with a
> >> curl command.
> >> The command only works when I put in the admin credentials, although the
> >> user I want access for also has the required role for accessing.
> >> Is this perhaps built-in, or should anyone be able to access an alias from
> >> the API?
> >>
> >> The command I'm using is:
> >> curl http://
> >> :@/solr//select?q=:
> >> This fails for the user but succeeds for the admin
> >>
> >> My minimum working example of security.json follows.
> >> Many thanks!
> >>
> >> {
> >>  "authentication":{
> >>"blockUnknown":true,
> >>"class":"solr.BasicAuthPlugin",
> >>"credentials":{
> >>  "admin":"blahblahblah",
> >>  "user":"blahblah"},
> >>"":{"v":13}},
> >>  "authorization":{
> >>"class":"solr.RuleBasedAuthorizationPlugin",
> >>"permissions":[
> >>  {
> >>"name":"all",
> >>"role":"admin",
> >>"index":1},
> >>  {
> >>"name":"readColl",
> >>"collection":"Coll",
> >>"path":"/select/*",
> >>"role":"readColl",
> >>"index":2},
> >>  {
> >>"name":"readSCollAlias",
> >>"collection":"sCollAlias",
> >>"path":"/select/*",
> >>"role":"readSCollAlias",
> >>"index":3}],
> >>"user-role":{
> >>  "admin":[
> >>"admin",
> >>"readSCollAlias"],
> >>  "user":["readSCollAlias"]},
> >>"":{"v":21}}}
>


Re: alias read access impossible for anyone other than admin?

2019-05-24 Thread Jason Gerlowski
Hi Sotiris,

First, what version of Solr are you running?  We've made some fixes
recently (esp. SOLR-13355) to RBAP, and they might affect the behavior
you're seeing or any fixes we can recommend.

Second, the order of permissions in security.json has a huge effect on
how requests are authorized.  Solr always uses the first permission rule
that matches a given
API...later rules are ignored if a match is found in earlier ones.
The first rule in your permissions block ({"name": "all", "role":
"admin"}) will match all APIs and will only allow requests through if
the requesting user has the "admin" role.  So "user" being unable to
query an alias makes sense.  Usually "all" and other catchall
permissions are best used at the very bottom of your permissions list.
That way the catchall is the last rule to be checked, giving other
rules a chance to match first.

Hope that helps.

Jason

On Wed, May 22, 2019 at 6:21 AM Sotiris Fragkiskos  wrote:
>
> Hi everyone!
> I've been trying unsuccessfully to read an alias to a collection with a
> curl command.
> The command only works when I put in the admin credentials, although the
> user I want access for also has the required role for accessing.
> Is this perhaps built-in, or should anyone be able to access an alias from
> the API?
>
> The command I'm using is:
> curl http://
> :@/solr//select?q=:
> This fails for the user but succeeds for the admin
>
> My minimum working example of security.json follows.
> Many thanks!
>
> {
>   "authentication":{
> "blockUnknown":true,
> "class":"solr.BasicAuthPlugin",
> "credentials":{
>   "admin":"blahblahblah",
>   "user":"blahblah"},
> "":{"v":13}},
>   "authorization":{
> "class":"solr.RuleBasedAuthorizationPlugin",
> "permissions":[
>   {
> "name":"all",
> "role":"admin",
> "index":1},
>   {
> "name":"readColl",
> "collection":"Coll",
> "path":"/select/*",
> "role":"readColl",
> "index":2},
>   {
> "name":"readSCollAlias",
> "collection":"sCollAlias",
> "path":"/select/*",
> "role":"readSCollAlias",
> "index":3}],
> "user-role":{
>   "admin":[
> "admin",
> "readSCollAlias"],
>   "user":["readSCollAlias"]},
> "":{"v":21}}}


Re: Solr RuleBasedAuthorizationPlugin question

2019-05-07 Thread Jason Gerlowski
The Admin UI lockdown is a known issue in RBAP that's since been fixed
(https://issues.apache.org/jira/browse/SOLR-13344), but only in very
recent versions of Solr.  I haven't tried this, but you should be
able to work around it by putting a rule like: {path: /, role: *}
right before your catch-all rule.  (I think "/" is the path that RBAP
sees for Admin UI requests.  Though you may also want to try
"/solr/").

As for why core-creation is still allowed with that config, I'll try
to take a quick look after work today, but may not have time to get to
it.  It's a bit of a hack, and it'd be nice to understand the behavior
now before making additional changes, but if you need to you can add
an explicit rule to cover core creation:

{
  "name": "core-admin-edit",
  "role": "admin"
},
{
  "name": "read",
  "role": "readonly"
},
{
  "path": "*",
  "role": "admin"
},
{
  "name": "*",
  "role": "admin"
}

Good luck,

Jason

On Tue, May 7, 2019 at 11:31 AM Jérémy  wrote:
>
> Hi Jason,
>
> Thanks a lot for the detailed explanation. It's still very unclear in my
> head how things work, but now I know about the weird fallback mechanism of
> RBAP. Despite your example I still didn't manage to get the behavior I
> wanted.
> Here's the closest I've been so far. Any logged in user can still create
> cores but now the readonly user cannot delete or update documents. However
> the admin UI webinterface is completely locked now.
>
> {
>  "authentication": {
>"blockUnknown": true,
>"class": "solr.BasicAuthPlugin",
>"credentials": {
>  "adminuser": "adminpwd",
>  "readuser": "readpwd"
>}
>  },
>  "authorization": {
>"class": "solr.RuleBasedAuthorizationPlugin",
>"permissions": [
>  {
>"name": "read",
>"role": "readonly"
>  },
>   {
> "path": "*",
> "role": "admin"
>   },
>   {
> "name": "*",
> "role": "admin"
>}
>],
>"user-role": {
>  "readuser": "readonly",
>  "adminuser": ["admin", "readonly"]
>}
>  }
> }
>
> I feel like I'm almost there and that the json is just missing a bit.
>
> Thanks for your help, I really appreciate it,
> Jeremy
>
>
>
>
> On Mon, May 6, 2019 at 11:00 PM Jason Gerlowski 
> wrote:
>
> > Hey Jeremy,
> >
> > One important thing to remember about the RuleBasedAuthorizationPlugin
> > is that if it doesn't find any rules matching a particular API call,
> > it will allow the request.  I think that's what you're running into
> > here.  Let's trace through how RBAP will process your rules:
> >
> > 1. Solr receives an API call.  For this example, let's say it's a new
> > doc sent to /solr/someCollection/update
> > 2. Solr fetches security.json and parses the auth rules.  It'll look
> > at each of these in turn.
> > 3. First Rule: Solr checks "/solr/someCollection/update" against the
> > "read" rule.  /update isn't a read API, so this rule doesn't apply to
> > our request.
> > 4. Second Rule: Solr checks "/solr/someCollection/update" against the
> > "security-edit" rule.  /update isn't a security-related API, so this
> > rule doesn't apply to our request either.
> > 5. Solr is out of rules to try.  Since no rules locked down /update to
> > a particular user/role, Solr allows the request.
> >
> > This is pretty unintuitive and rarely is what people expect.  The way
> > that RBAP works, you almost always will want to have the last rule in
> > your security.json be a "catch-all" rule of some sort.  You can do
> > this by appending a rule entry with the wildcard path "*".  In the
> > latest Solr releases, you can also use the predefined "all" permission
> > (but beware of SOLR-13355 in earlier versions).  e.g.
> >
> >  {
> > "name": "read",
> > "role": "readonly"
> >   },
> >   {
> > "name": "security-edit",
> > "role": "admin"
> >   },

Re: Solr RuleBasedAuthorizationPlugin question

2019-05-06 Thread Jason Gerlowski
Hey Jeremy,

One important thing to remember about the RuleBasedAuthorizationPlugin
is that if it doesn't find any rules matching a particular API call,
it will allow the request.  I think that's what you're running into
here.  Let's trace through how RBAP will process your rules:

1. Solr receives an API call.  For this example, let's say it's a new
doc sent to /solr/someCollection/update
2. Solr fetches security.json and parses the auth rules.  It'll look
at each of these in turn.
3. First Rule: Solr checks "/solr/someCollection/update" against the
"read" rule.  /update isn't a read API, so this rule doesn't apply to
our request.
4. Second Rule: Solr checks "/solr/someCollection/update" against the
"security-edit" rule.  /update isn't a security-related API, so this
rule doesn't apply to our request either.
5. Solr is out of rules to try.  Since no rules locked down /update to
a particular user/role, Solr allows the request.
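
The trace above can be modeled with a few lines of Python.  This is a
toy sketch under stated assumptions: the PREDEFINED path sets and the
authorize() helper are hypothetical stand-ins, not Solr's real
permission definitions or code.

```python
# Toy model of the five steps above (NOT Solr's implementation).

# Assumption: each predefined permission covers a fixed set of handler
# paths.  These sets are illustrative only.
PREDEFINED = {
    "read": {"/select", "/get"},
    "security-edit": {"/admin/authorization", "/admin/authentication"},
}


def rule_applies(rule, handler_path):
    """True if this rule governs the given handler path."""
    return handler_path in PREDEFINED.get(rule.get("name"), set())


def authorize(rules, user_roles, handler_path):
    # Steps 3-4: check each rule in order, stopping at the first match.
    for rule in rules:
        if rule_applies(rule, handler_path):
            return rule["role"] in user_roles
    # Step 5: out of rules to try, so the request is allowed.
    return True


rules = [
    {"name": "read", "role": "readonly"},
    {"name": "security-edit", "role": "admin"},
]

# Neither rule covers /update, so even the readonly user gets through.
print(authorize(rules, {"readonly"}, "/update"))               # True
# security-edit is covered, so the readonly user is rejected there.
print(authorize(rules, {"readonly"}, "/admin/authorization"))  # False
```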

This is pretty unintuitive and rarely is what people expect.  The way
that RBAP works, you almost always will want to have the last rule in
your security.json be a "catch-all" rule of some sort.  You can do
this by appending a rule entry with the wildcard path "*".  In the
latest Solr releases, you can also use the predefined "all" permission
(but beware of SOLR-13355 in earlier versions).  e.g.

{
  "name": "read",
  "role": "readonly"
},
{
  "name": "security-edit",
  "role": "admin"
},
{
  "path": "*",
  "role": "admin"
}


Hope that helps.

Jason

On Fri, May 3, 2019 at 5:23 PM Jérémy  wrote:
>
> Hi,
>
> I hope that this question wasn't answered already, but I couldn't find what
> I was looking for in the archives.
>
> I'm having a hard time using Solr with the BasicAuth and
> RuleBasedAuthorization plugins.
> The auth part works well but I have issues with the RuleBasedAuthorization
> part. I'd like to have an admin role and a readonly one. I have two users,
> each having one role. However both of them can create cores, delete
> documents etc...
>
> Here's my security.json:
> {
>   "authentication": {
> "blockUnknown": true,
> "class": "solr.BasicAuthPlugin",
> "credentials": {
>   "adminuser": "adminpwd",
>   "readuser": "readpwd"
> }
>   },
>   "authorization": {
> "class": "solr.RuleBasedAuthorizationPlugin",
> "permissions": [
>   {
> "name": "read",
> "role": "readonly"
>   },
>   {
> "name": "security-edit",
> "role": "admin"
>   }
> ],
> "user-role": {
>   "readuser": "readonly",
>   "adminuser": "admin"
> }
>   }
> }
>
> I tried that with Solr 7.7.0 and 8.0.0, in cloud and standalone mode. I
> can't figure out why the readuser can delete documents.
>
> Any help is appreciated!
>
> Thanks,
> Jeremy


Re: JSON Facet query to retrieve count all collections in Solr 8.0.0

2019-04-17 Thread Jason Gerlowski
Agreed, I'd be surprised if this behavior was specific to JSON
Faceting.  Though I'm surprised it's happening at all, so...

Anyway, that's easy for you to test.  Try a few "/select?q=*:*"
queries and see whether they also exhibit this behavior.  One other
question: does the behavior persist after restarting your Solr nodes?

Good luck,

Jason

On Wed, Apr 17, 2019 at 4:05 AM Zheng Lin Edwin Yeo
 wrote:
>
> Hi,
>
> For your info, I have enabled basic authentication and SSL in all the 3
> versions, and I'm not sure if the issue is more on the authentication side
> instead of the JSON Facet query?
>
> Regards,
> Edwin
>
> On Wed, 17 Apr 2019 at 06:54, Zheng Lin Edwin Yeo 
> wrote:
>
> > Hi Jason,
> >
> > Yes, that is correct.
> >
> > Below is the format of my security.json. I have changed the masked
> > password for security purposes.
> >
> > {
> > "authentication":{
> >"blockUnknown": true,
> >"class":"solr.BasicAuthPlugin",
> >"credentials":{"user1":"hyHXXuJSqcZdNgdSTGUvrQZRpqrYFUQ2ffmlWQ4GUTk=
> > E0w3/2FD+rlxulbPm2G7i9HZqT+2gMBzcyJCcGcMWwA="}
> > },
> > "authorization":{
> >"class":"solr.RuleBasedAuthorizationPlugin",
> >"user-role":{"user1":"admin"},
> >"permissions":[{"name":"security-edit",
> >   "role":"admin"}]
> > }}
> >
> > Regards,
> > Edwin
> >
> > On Tue, 16 Apr 2019 at 23:12, Jason Gerlowski 
> > wrote:
> >
> >> Hi Edwin,
> >>
> >> To clarify what you're running into:
> >>
> >> - on 7.6, this query works all the time
> >> - on 7.7 this query works all the time
> >> - on 8.0, this query works the first time you run it, but subsequent
> >> runs return a 401 error?
> >>
> >> Is that correct?  It might be helpful for others if you could share
> >> your security.json.
> >>
> >> Best,
> >>
> >> Jason
> >>
> >> On Mon, Apr 15, 2019 at 10:40 PM Zheng Lin Edwin Yeo
> >>  wrote:
> >> >
> >> > Hi,
> >> >
> >> > I am using the below JSON Facet to retrieve the count of all the
> >> different
> >> > collections in one query.
> >> >
> >> >
> >> https://localhost:8983/solr/collection1/select?q=testing=https://localhost:8983/solr/collection1,https://localhost:8983/solr/collection2,https://localhost:8983/solr/collection3,https://localhost:8983/solr/collection4,https://localhost:8983/solr/collection5,https://localhost:8983/solr/collection6=0={categories
> >> > : {type : terms,field : content_type,limit : 100}}
> >> >
> >> >
> >> > Previously, in Solr 7.6 and Solr 7.7, this query can work correctly and
> >> we
> >> > are able to produce the correct output.
> >> >
> >> > {
> >> >   "responseHeader":{
> >> > "zkConnected":true,
> >> > "status":0,
> >> > "QTime":24},
> >> >   "response":{"numFound":41200,"start":0,"maxScore":12.993215,"docs":[]
> >> >   },
> >> >   "facets":{
> >> > "count":41200,
> >> > "categories":{
> >> >   "buckets":[{
> >> >   "val":"collection1",
> >> >   "count":26213},
> >> > {
> >> >   "val":"collection2",
> >> >   "count":12075},
> >> > {
> >> >   "val":"collection3",
> >> >   "count":1947},
> >> > {
> >> >   "val":"collection4",
> >> >   "count":850},
> >> > {
> >> >   "val":"collection5",
> >> >   "count":111},
> >> > {
> >> >   "val":"collection6",
> >> >   "count":4}]}}}
> >> >
> >> >
> >> > However, in the new Solr 8.0.0, this query can only work once.
> >> > Subsequently, we will get the following error of 'require
> >> authentication':
> >> >
> >> > {
> >> >   "responseHeader":{
> >> > "zkConnected":true,
> >> > "status":401,
> >> > "QTime":11},
> >> >   "error":{
> >> > "metadata":[
> >> >
> >> >
> >> "error-class","org.apache.solr.client.solrj.impl.Http2SolrClient$RemoteSolrException",
> >> >
> >> >
> >> "root-error-class","org.apache.solr.client.solrj.impl.Http2SolrClient$RemoteSolrException"],
> >> > "msg":"Error from server at null: Expected mime type
> >> > application/octet-stream but got text/html. \n\n >> > http-equiv=\"Content-Type\"
> >> > content=\"text/html;charset=utf-8\"/>\nError 401 require
> >> > authentication\n\nHTTP ERROR
> >> 401\nProblem
> >> > accessing /solr/collection6/select. Reason:\nrequire
> >> > authentication\n\n\n",
> >> > "code":401}}
> >> >
> >> > This issue does not occur in Solr 7.6 and Solr 7.7, even though I have
> >> set
> >> > up the same authentication for all the versions.
> >> >
> >> > What could be the issue that causes this?
> >> >
> >> > Regards,
> >> > Edwin
> >>
> >


Re: JSON Facet query to retrieve count all collections in Solr 8.0.0

2019-04-16 Thread Jason Gerlowski
Hi Edwin,

To clarify what you're running into:

- on 7.6, this query works all the time
- on 7.7 this query works all the time
- on 8.0, this query works the first time you run it, but subsequent
runs return a 401 error?

Is that correct?  It might be helpful for others if you could share
your security.json.

Best,

Jason

On Mon, Apr 15, 2019 at 10:40 PM Zheng Lin Edwin Yeo
 wrote:
>
> Hi,
>
> I am using the below JSON Facet to retrieve the count of all the different
> collections in one query.
>
> https://localhost:8983/solr/collection1/select?q=testing=https://localhost:8983/solr/collection1,https://localhost:8983/solr/collection2,https://localhost:8983/solr/collection3,https://localhost:8983/solr/collection4,https://localhost:8983/solr/collection5,https://localhost:8983/solr/collection6=0={categories
> : {type : terms,field : content_type,limit : 100}}
>
>
> Previously, in Solr 7.6 and Solr 7.7, this query can work correctly and we
> are able to produce the correct output.
>
> {
>   "responseHeader":{
> "zkConnected":true,
> "status":0,
> "QTime":24},
>   "response":{"numFound":41200,"start":0,"maxScore":12.993215,"docs":[]
>   },
>   "facets":{
> "count":41200,
> "categories":{
>   "buckets":[{
>   "val":"collection1",
>   "count":26213},
> {
>   "val":"collection2",
>   "count":12075},
> {
>   "val":"collection3",
>   "count":1947},
> {
>   "val":"collection4",
>   "count":850},
> {
>   "val":"collection5",
>   "count":111},
> {
>   "val":"collection6",
>   "count":4}]}}}
>
>
> However, in the new Solr 8.0.0, this query can only work once.
> Subsequently, we will get the following error of 'require authentication':
>
> {
>   "responseHeader":{
> "zkConnected":true,
> "status":401,
> "QTime":11},
>   "error":{
> "metadata":[
>
> "error-class","org.apache.solr.client.solrj.impl.Http2SolrClient$RemoteSolrException",
>
> "root-error-class","org.apache.solr.client.solrj.impl.Http2SolrClient$RemoteSolrException"],
> "msg":"Error from server at null: Expected mime type
> application/octet-stream but got text/html. \n\n http-equiv=\"Content-Type\"
> content=\"text/html;charset=utf-8\"/>\nError 401 require
> authentication\n\nHTTP ERROR 401\nProblem
> accessing /solr/collection6/select. Reason:\nrequire
> authentication\n\n\n",
> "code":401}}
>
> This issue does not occur in Solr 7.6 and Solr 7.7, even though I have set
> up the same authentication for all the versions.
>
> What could be the issue that causes this?
>
> Regards,
> Edwin


Re: bin/post command not working when run from crontab

2019-04-14 Thread Jason Gerlowski
Hi Carsten,

I think this is probably worth a jira.  I'm not familiar enough with
bin/post to say definitively whether the behavior you mention is a
bug, or whether it's "expected" in some odd sense.  But there's enough
uncertainty that I think it's worth recording there.

Best,

Jason

On Fri, Apr 12, 2019 at 5:52 AM Carsten Agger  wrote:
>
> Hi all
>
> I posted the question below some time back, concerning the unusual
> behaviour of bin/post if there is no stdin.
>
> There have been no comments to that, and maybe bin/post is quaint in that
> regard - I ended up changing my application to POST directly on the Web
> endpoint instead.
>
> But I do have one question, though: Should this be considered a bug, and
> should I report it as such? Unfortunately I don't have the time to
> prepare a proper fix myself.
>
> Best
> Carsten
>
> On 3/27/19 7:55 AM, Carsten Agger wrote:
> > I'm working with a script where I want to send a command to delete all
> > elements in an index; notably,
> >
> >
> > /opt/solr/bin/post -c  -d  
> > "*:*"
> >
> >
> > When run interactively, this works fine.
> >
> > However, when run automatically as a cron job, it gives this interesting
> > output:
> >
> >
> > Unrecognized argument:   "*:*"
> >
> > If this was intended to be a data file, it does not exist relative to /root
> >
> > The culprit seems to be these lines, 143-148:
> >
> >   if [[ ! -t 0 ]]; then
> >     MODE="stdin"
> >   else
> >     # when no stdin exists and -d specified, the rest of the arguments
> >     # are assumed to be strings to post as-is
> >     MODE="args"
> >
> > This code seems to be doing the opposite of what the comment says - it
> > sets MODE="stdin" if stdin is NOT a terminal, but if it IS (i.e., there
> > IS an stdin) it assumes the rest of the args can be posted as-is.
> >
> > On the other hand, if the condition is reversed, my command will fail
> > interactively but not when run as a cron job. Both options are, of
> > course, unsatisfactory.
> >
> > It /will/ actually work in both cases, if instead the command to delete
> > the contents of the index is written as:
> >
> > echo "*:*" | /opt/solr/bin/post -c departments -d
> >
> >
> > I've seen this bug in SOLR 7.5.0 and 7.7.1. Should I report it as a bug
> > or is there an easy explanation?
> >
> >
> > Best
> >
> > Carsten Agger
> >
> >
> --
> Carsten Agger
>
> Chief Technologist
> Magenta ApS
> Skt. Johannes Allé 2
> 8000 Århus C
>
> Tlf  +45 5060 1476
> http://www.magenta-aps.dk
> carst...@magenta-aps.dk
>


Re: Documentation for Apache Solr 8.0.0?

2019-04-01 Thread Jason Gerlowski
The Solr Reference Guide (of which the online documentation is a part)
gets built and released separately from the Solr distribution itself.
The Solr community tries to keep the code and documentation releases
as close together as we can, but the releases require work and are
done on a volunteer basis.  No one has volunteered for the 8.0.0
reference-guide release yet, but I suspect a volunteer will come
forward soon.

In the meantime though, there is documentation for Solr 8.0.0
available.  Solr's documentation is included alongside the code.  You
can check out Solr and build the documentation yourself by moving to
"solr/solr-ref-guide" and running the command "ant clean default" from
that directory.  This will build the same HTML pages you're used to
seeing at lucene.apache.org/solr/guide, and you can open the local
copies in your browser and browse them as you normally would.

Alternatively, the Solr mirror on Github does its best to preview the
documentation.  It doesn't display perfectly, but it might be helpful
for tiding you over until the official documentation is available, if
you're unwilling or unable to build the documentation site locally:
https://github.com/apache/lucene-solr/blob/branch_8_0/solr/solr-ref-guide/src/index.adoc

Hope that helps,

Jason

On Mon, Apr 1, 2019 at 7:34 AM Yoann Moulin  wrote:
>
> Hello,
>
> I’m looking for the documentation for the latest release of SolR (8.0) but it 
> looks like it’s not online yet.
>
> https://lucene.apache.org/solr/news.html
>
> http://lucene.apache.org/solr/guide/
>
> Do you know when it will be available?
>
> Best regards.
>
> --
> Yoann Moulin
> EPFL IC-IT


Re: security.json "all" predefined permission

2019-03-29 Thread Jason Gerlowski
Thanks for the pointer Jan.

I spent much of yesterday experimenting with the ordering to make sure
that wasn't a factor and I was able to eventually rule it out with
some debug logging that showed that the requests were being allowed
because it couldn't find any governing permission rules. Apparently
RBAP fails "open"
(https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/security/RuleBasedAuthorizationPlugin.java#L208)

Anyway, I'm pretty convinced this is a bug.  Most handlers implement
the PermissionNameProvider interface, which has a method that spits
out the required permission for that request handler.  (e.g.
CoreAdminHandler.getPermissionName() returns either CORE_READ_PERM or
CORE_EDIT_PERM based on the request's query params).  When the
request-handler is-a PermissionNameProvider, we do string matching to
see whether we have permissions, but we don't check for the "all"
special case.  So RBAP checks for "all" if the handler wasn't a
PermissionNameProvider (causing SOLR-13344's Admin UI behavior), but
it doesn't check for "all" when the handler is a PermissionNameProvider
(causing the buggy behavior I described above).

We should definitely be checking for "all" when there is a
PermissionNameProvider, so I'll create a JIRA for this.
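
To make the difference concrete, here is a hypothetical Python model of
the two code paths described above.  It is a sketch of my description,
not the actual RuleBasedAuthorizationPlugin code: find_rule(), the
check_all flag, and the reduced rule list are invented for illustration.

```python
# Hypothetical model of the "all" lookup gap (NOT Solr's real code).

def find_rule(rules, handler_permission, check_all):
    """Return the first governing rule, or None.

    handler_permission is the permission name a handler reports (for
    example "security-edit"), or None when the handler does not report
    one.  check_all=False models the buggy behavior; True models the fix.
    """
    for rule in rules:
        if handler_permission is None:
            # Handler reports no permission name: "all" is consulted.
            if rule["name"] == "all":
                return rule
        elif rule["name"] == handler_permission:
            return rule
        elif check_all and rule["name"] == "all":
            return rule
    return None


def is_allowed(rules, user_roles, handler_permission, check_all):
    rule = find_rule(rules, handler_permission, check_all)
    if rule is None:
        return True  # no governing rule found: fails "open"
    return rule["role"] == "*" or rule["role"] in user_roles


# Reduced version of the rule list from the original question.
rules = [
    {"name": "read", "role": "*"},
    {"name": "all", "role": "admin_role"},
]
readonly_roles = {"readonly_role"}

# Buggy path: "all" is never consulted, so security-edit falls through.
print(is_allowed(rules, readonly_roles, "security-edit", check_all=False))  # True
# Fixed path: "all" governs the request and the readonly user is denied.
print(is_allowed(rules, readonly_roles, "security-edit", check_all=True))   # False
```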

Best,

Jason

On Thu, Mar 28, 2019 at 6:11 PM Jan Høydahl  wrote:
>
> There were some other issues with the "all" permission as well lately, see
> https://issues.apache.org/jira/browse/SOLR-13344
> Order matters in permissions: the first matching permission is used, but I
> don't know how that would change anything here.
> One thing to try could be to start with an empty RuleBasedAuth and then use
> the REST API to add all the permissions and roles.
> That way you are sure that they are syntactically correct, and hopefully
> you get some errors if you do something wrong?
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com
>
> > 28. mar. 2019 kl. 20:24 skrev Jason Gerlowski :
> >
> > Hi all,
> >
> > Diving into the RuleBasedAuthorizationPlugin for the first time in
> > a while, and found that the predefined permission "all" isn't behaving
> > the way I'd expect it to.  I'm trying to figure out whether it doesn't
> > work the way I think, whether I'm just making a dumb mistake, or
> > whether it's currently broken on master (and some 7x versions).
> >
> > My intent is to create two users, one with readonly access, and an
> > admin user with access to all APIs.  I'm trying to achieve this with
> > the security.json below:
> >
> > {
> >   "authentication": {
> >     "blockUnknown": true,
> >     "class": "solr.BasicAuthPlugin",
> >     "credentials": {
> >       "readonly": "",
> >       "admin": ""}},
> >   "authorization": {
> >     "class": "solr.RuleBasedAuthorizationPlugin",
> >     "permissions": [
> >       {"name":"read","role": "*"},
> >       {"name":"schema-read", "role":"*"},
> >       {"name":"config-read", "role":"*"},
> >       {"name":"collection-admin-read", "role":"*"},
> >       {"name":"metrics-read", "role":"*"},
> >       {"name":"core-admin-read","role":"*"},
> >       {"name": "all", "role": "admin_role"}
> >     ],
> >     "user-role": {
> >       "readonly": "readonly_role",
> >       "admin": "admin_role"
> >     }}}
> >
> > When I go to test this though, I'm surprised to find that the
> > "readonly" user is still able to access APIs that I would expect to be
> > locked down.  The "readonly" user can even update security permissions
> > with the curl command below!
> >
> > curl -X POST -H 'Content-Type: application/json' \
> >   -u "readonly:readonlyPassword" \
> >   http://localhost:8983/solr/admin/authorization \
> >   -d @some_auth_json.json
> >
> > My expectation was that the predefined "all" permission would act as a
> > catch all, and restrict all requests to "admin_role" that require
> > permissions I didn't explicitly give to my "readonly" user.  But it
> > doesn't seem to work that way.  Am I misunderstanding what the "all"
> > permission does, or is this a bug?
> >
> > Thanks for any help or clarification.
> >
> > Jason
>


security.json "all" predefined permission

2019-03-28 Thread Jason Gerlowski
Hi all,

Diving into the RuleBasedAuthorizationPlugin for the first time in
a while, and found that the predefined permission "all" isn't behaving
the way I'd expect it to.  I'm trying to figure out whether it doesn't
work the way I think, whether I'm just making a dumb mistake, or
whether it's currently broken on master (and some 7x versions).

My intent is to create two users, one with readonly access, and an
admin user with access to all APIs.  I'm trying to achieve this with
the security.json below:

{
  "authentication": {
    "blockUnknown": true,
    "class": "solr.BasicAuthPlugin",
    "credentials": {
      "readonly": "",
      "admin": ""}},
  "authorization": {
    "class": "solr.RuleBasedAuthorizationPlugin",
    "permissions": [
      {"name":"read","role": "*"},
      {"name":"schema-read", "role":"*"},
      {"name":"config-read", "role":"*"},
      {"name":"collection-admin-read", "role":"*"},
      {"name":"metrics-read", "role":"*"},
      {"name":"core-admin-read","role":"*"},
      {"name": "all", "role": "admin_role"}
    ],
    "user-role": {
      "readonly": "readonly_role",
      "admin": "admin_role"
    }}}

When I go to test this though, I'm surprised to find that the
"readonly" user is still able to access APIs that I would expect to be
locked down.  The "readonly" user can even update security permissions
with the curl command below!

curl -X POST -H 'Content-Type: application/json' \
  -u "readonly:readonlyPassword" \
  http://localhost:8983/solr/admin/authorization \
  -d @some_auth_json.json

My expectation was that the predefined "all" permission would act as a
catch all, and restrict all requests to "admin_role" that require
permissions I didn't explicitly give to my "readonly" user.  But it
doesn't seem to work that way.  Am I misunderstanding what the "all"
permission does, or is this a bug?

Thanks for any help or clarification.

Jason


Re: Solr 8.0.0 coreNodeName

2019-03-28 Thread Jason J Baik
This seems related to https://issues.apache.org/jira/browse/SOLR-11503?


On Thu, Mar 28, 2019 at 2:14 AM vishal patel 
wrote:

>
> Hi
>
> I am upgrading Solr from 6.1.0 to 8.0.0. Previously I did not add the
> coreNodeName to core.properties and it worked fine for me. But when I
> start Solr 8.0.0 with the same core.properties it gives this error:
>
> 2019-03-25 09:01:18.704 ERROR (coreLoadExecutor-13-thread-1-processing-n:
> 192.168.100.145:7991_solr) [c:product s:shard1  x:product]
> o.a.s.c.ZkController
> org.apache.solr.common.SolrException: Could not find collection : product
> at
> org.apache.solr.common.cloud.ClusterState.getCollection(ClusterState.java:118)
> ~[solr-solrj-8.0.0.jar:8.0.0 2ae4746365c1ee72a0047ced7610b2096e438979 -
> jimczi - 2019-03-08 12:06:10]
> at
> org.apache.solr.core.CoreContainer.repairCoreProperty(CoreContainer.java:1854)
> ~[solr-core-8.0.0.jar:8.0.0 2ae4746365c1ee72a0047ced7610b2096e438979 -
> jimczi - 2019-03-08 12:06:06]
> at
> org.apache.solr.cloud.ZkController.checkStateInZk(ZkController.java:1790)
> ~[solr-core-8.0.0.jar:8.0.0 2ae4746365c1ee72a0047ced7610b2096e438979 -
> jimczi - 2019-03-08 12:06:06]
> at
> org.apache.solr.cloud.ZkController.preRegister(ZkController.java:1729)
> [solr-core-8.0.0.jar:8.0.0 2ae4746365c1ee72a0047ced7610b2096e438979 -
> jimczi - 2019-03-08 12:06:06]
> at
> org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1182)
> [solr-core-8.0.0.jar:8.0.0 2ae4746365c1ee72a0047ced7610b2096e438979 -
> jimczi - 2019-03-08 12:06:06]
> at
> org.apache.solr.core.CoreContainer.lambda$load$13(CoreContainer.java:695)
> [solr-core-8.0.0.jar:8.0.0 2ae4746365c1ee72a0047ced7610b2096e438979 -
> jimczi - 2019-03-08 12:06:06]
> at
> org.apache.solr.core.CoreContainer$$Lambda$244/470132045.call(Unknown
> Source) [solr-core-8.0.0.jar:8.0.0 2ae4746365c1ee72a0047ced7610b2096e438979
> - jimczi - 2019-03-08 12:06:06]
> at
> com.codahale.metrics.InstrumentedExecutorService$InstrumentedCallable.call(InstrumentedExecutorService.java:197)
> [metrics-core-3.2.6.jar:3.2.6]
> at
> java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_45]
> at
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:209)
> [solr-solrj-8.0.0.jar:8.0.0 2ae4746365c1ee72a0047ced7610b2096e438979 -
> jimczi - 2019-03-08 12:06:10]
> at
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$$Lambda$56/2019826979.run(Unknown
> Source) [solr-solrj-8.0.0.jar:8.0.0
> 2ae4746365c1ee72a0047ced7610b2096e438979 - jimczi - 2019-03-08 12:06:10]
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> [?:1.8.0_45]
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> [?:1.8.0_45]
> at java.lang.Thread.run(Thread.java:745) [?:1.8.0_45]
> 2019-03-25 09:01:18.720 ERROR
> (coreContainerWorkExecutor-2-thread-1-processing-n:192.168.100.145:7991_solr)
> [   ] o.a.s.c.CoreContainer Error waiting for SolrCore to be loaded on
> startup
> org.apache.solr.common.SolrException: Unable to create core [product]
> at
> org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1210)
> ~[solr-core-8.0.0.jar:8.0.0 2ae4746365c1ee72a0047ced7610b2096e438979 -
> jimczi - 2019-03-08 12:06:06]
> at
> org.apache.solr.core.CoreContainer.lambda$load$13(CoreContainer.java:695)
> ~[solr-core-8.0.0.jar:8.0.0 2ae4746365c1ee72a0047ced7610b2096e438979 -
> jimczi - 2019-03-08 12:06:06]
> at
> org.apache.solr.core.CoreContainer$$Lambda$244/470132045.call(Unknown
> Source) ~[?:?]
> at
> com.codahale.metrics.InstrumentedExecutorService$InstrumentedCallable.call(InstrumentedExecutorService.java:197)
> ~[metrics-core-3.2.6.jar:3.2.6]
> at
> java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_45]
> at
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:209)
> [solr-solrj-8.0.0.jar:8.0.0 2ae4746365c1ee72a0047ced7610b2096e438979 -
> jimczi - 2019-03-08 12:06:10]
> at
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$$Lambda$56/2019826979.run(Unknown
> Source) [solr-solrj-8.0.0.jar:8.0.0
> 2ae4746365c1ee72a0047ced7610b2096e438979 - jimczi - 2019-03-08 12:06:10]
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> [?:1.8.0_45]
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> [?:1.8.0_45]
> at java.lang.Thread.run(Thread.java:745) [?:1.8.0_45]
> Caused by: org.apache.solr.common.SolrException:
> at
> 

Re: Upgrading solarj from 6.5.1 to 8.0.0

2019-03-25 Thread Jason Gerlowski
Hi Lahiru,

I had a chance to refresh myself on how this works over the weekend.
There are two ways in SolrJ to talk to a Solr protected by basic-auth:

1. The SolrRequest.setBasicAuthCredentials() method I mentioned
before.  This can be painful though, and isn't even possible in all
use cases.
2. Configuring your client process with several System Properties.
First, set the property "solr.httpclient.builder.factory" to
"org.apache.solr.client.solrj.impl.PreemptiveBasicAuthClientBuilderFactory"
to tell SolrJ that you want all SolrClients set up to use basic-auth.
Once that is set up, you can specify your credentials in one of two
ways.  If you're OK with the auth credentials appearing in the command
line for your process, you can set the "basicauth" system property to
a value of the form "username:password".  A slightly more secure
approach is to have SolrJ read the credentials from a file.  You can
choose this approach by setting the "solr.httpclient.config" system
property and giving it the full path to an accessible properties file.
You then need to create the properties file, specifying your username
and password using the "httpBasicAuthUser" and "httpBasicAuthPassword"
properties.
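
A minimal sketch of the file-based variant of option (2), using only
the JDK.  The property names and the factory class are the ones named
above; the class name, the temp-file location, and the credentials in
main() are placeholders for illustration.  I'm assuming the properties
need to be in place before the first SolrClient is built, so the sketch
sets them up front.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

/** Hypothetical helper: prepares the system properties described in option (2). */
final class SolrJBasicAuthSetup {

    /** Writes a credentials file and points the SolrJ system properties at it. */
    static Path configure(String user, String password) {
        Properties creds = new Properties();
        creds.setProperty("httpBasicAuthUser", user);
        creds.setProperty("httpBasicAuthPassword", password);

        Path file;
        try {
            // A temp file keeps the sketch self-contained; a real deployment would
            // more likely use a fixed path with restricted permissions.
            file = Files.createTempFile("solrj-auth", ".properties");
            try (Writer out = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
                creds.store(out, "SolrJ basic-auth credentials");
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }

        // Ask SolrJ to build its HTTP clients with preemptive basic-auth ...
        System.setProperty("solr.httpclient.builder.factory",
            "org.apache.solr.client.solrj.impl.PreemptiveBasicAuthClientBuilderFactory");
        // ... and tell it where to read the credentials from.
        System.setProperty("solr.httpclient.config", file.toAbsolutePath().toString());
        return file;
    }

    public static void main(String[] args) {
        // Placeholder credentials, for illustration only.
        Path file = configure("solr", "SolrRocks");
        System.out.println("Credentials file: " + file);
        // Build SolrClient instances only after this point.
    }
}
```

The same two properties can instead be passed as -D flags on the
client's command line, which avoids touching application code at all.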

Currently (2) is not documented in our Solr Ref Guide, though it
really should be since it's the most practical way to set up auth.

Hope that helps,

Jason

On Thu, Mar 21, 2019 at 1:25 PM Erick Erickson  wrote:
>
> One tangent just so you’re aware. You _must_ re-index from scratch. Lucene 8x 
> will refuse to open an index that was _ever_ touched by Solr 6.
>
> Best,
> Erick
>
> > On Mar 21, 2019, at 8:26 AM, Lahiru Jayasekera  
> > wrote:
> >
> > Hi Jason,
> > Thanks for the response. I saw the method of setting credentials on
> > individual requests.
> > But I need to set the credentials at the SolrClient level. If you remember the
> > way to do it, please let me know.
> >
> > Thanks
> >
> > On Thu, Mar 21, 2019 at 8:26 PM Jason Gerlowski 
> > wrote:
> >
> >> You should be able to set credentials on individual requests with the
> >> SolrRequest.setBasicAuthCredentials() method.  That's the method
> >> suggested by the latest Solr ref guide at least:
> >>
> >> https://lucene.apache.org/solr/guide/7_7/basic-authentication-plugin.html#using-basic-auth-with-solrj
> >>
> >> There might be a way to set the credentials on the client itself, but
> >> I can't think of it at the moment.
> >>
> >> Hope that helps,
> >>
> >> Jason
> >>
> >> On Thu, Mar 21, 2019 at 2:34 AM Lahiru Jayasekera
> >>  wrote:
> >>>
> >>> Hi all,
> >>> I need help implementing the following code in SolrJ 8.0.0.
> >>>
> >>> private SolrClient server, adminServer;
> >>>
> >>> this.adminServer = new HttpSolrClient(SolrClientUrl);
> >>> this.server = new HttpSolrClient( SolrClientUrl + "/" +
> >> mapping.getCoreName() );
> >>> if (serverUserAuth) {
> >>>  HttpClientUtil.setBasicAuth(
> >>>  (DefaultHttpClient) ((HttpSolrClient) adminServer).getHttpClient(),
> >>>  serverUsername, serverPassword);
> >>>  HttpClientUtil.setBasicAuth(
> >>>  (DefaultHttpClient) ((HttpSolrClient) server).getHttpClient(),
> >>>  serverUsername, serverPassword);
> >>> }
> >>>
> >>>
> >>> I could get the SolrClients as follows:
> >>>
> >>> this.adminServer = new HttpSolrClient.Builder(SolrClientUrl).build();
> >>> this.server = new HttpSolrClient.Builder( SolrClientUrl + "/" +
> >>> mapping.getCoreName() ).build();
> >>>
> >>> But I can't find a way to implement basic authentication. I think that it
> >>> can be done via SolrHttpClientBuilder.
> >>> Can you please help me to solve this?
> >>>
> >>> Thanks and regards
> >>> Lahiru
> >>> --
> >>> Lahiru Jayasekara
> >>> Batch 15
> >>> Faculty of Information Technology
> >>> University of Moratuwa
> >>> 0716492170
> >>
> >
> >
> > --
> > Lahiru Jayasekara
> > Batch 15
> > Faculty of Information Technology
> > University of Moratuwa
> > 0716492170
>


Re: Upgrading solarj from 6.5.1 to 8.0.0

2019-03-21 Thread Jason Gerlowski
You should be able to set credentials on individual requests with the
SolrRequest.setBasicAuthCredentials() method.  That's the method
suggested by the latest Solr ref guide at least:
https://lucene.apache.org/solr/guide/7_7/basic-authentication-plugin.html#using-basic-auth-with-solrj

There might be a way to set the credentials on the client itself, but
I can't think of it at the moment.

Hope that helps,

Jason

On Thu, Mar 21, 2019 at 2:34 AM Lahiru Jayasekera
 wrote:
>
> Hi all,
> I need help implementing the following code in SolrJ 8.0.0.
>
> private SolrClient server, adminServer;
>
> this.adminServer = new HttpSolrClient(SolrClientUrl);
> this.server = new HttpSolrClient( SolrClientUrl + "/" + mapping.getCoreName() 
> );
> if (serverUserAuth) {
>   HttpClientUtil.setBasicAuth(
>   (DefaultHttpClient) ((HttpSolrClient) adminServer).getHttpClient(),
>   serverUsername, serverPassword);
>   HttpClientUtil.setBasicAuth(
>   (DefaultHttpClient) ((HttpSolrClient) server).getHttpClient(),
>   serverUsername, serverPassword);
> }
>
>
> I could get the SolrClients as follows:
>
> this.adminServer = new HttpSolrClient.Builder(SolrClientUrl).build();
> this.server = new HttpSolrClient.Builder( SolrClientUrl + "/" +
> mapping.getCoreName() ).build();
>
> But I can't find a way to implement basic authentication. I think that it
> can be done via SolrHttpClientBuilder.
> Can you please help me to solve this?
>
> Thanks and regards
> Lahiru
> --
> Lahiru Jayasekara
> Batch 15
> Faculty of Information Technology
> University of Moratuwa
> 0716492170


Re: Solr collection indexed to pdf in hdfs throws error during solr restart

2019-03-14 Thread Jason Gerlowski
> When I restart Solr

How exactly are you restarting Solr?  Are you running a "bin/solr
restart"?  Or is Solr already shut down and you're just starting it
back up with a "bin/solr start "?  Depending on how Solr
was shut down, you might be running into a known issue with
Solr's HDFS support.  Solr creates a lock file for each index to
restrict who can write to that index, in the interest of avoiding race
conditions and protecting against file corruption.  Often when Solr
crashes or is shut down abruptly (via a "kill -9") it doesn't have
time to clean up these lock files, and it fails to start up the next
time because it is still locked out from touching that index.  This
might be what you're running into.  In that case you could carefully
make sure that no Solr nodes are using the index in question, delete
the lock file manually out of HDFS, and try starting Solr again.

The advice above is what we usually tell people with write.lock issues
on HDFS...though some elements of the stack trace you provided make me
wonder whether you're seeing the same exact problem.  Your stack trace
has a NullPointerException, and a "Filesystem Closed" error (typically
seen when a Java object gets closed too early and may indicate a bug).
I'm not used to seeing either of these associated with the "standard"
write.lock issues.  What version of Solr are you seeing this on?

Best regards,

Jason

On Thu, Mar 14, 2019 at 5:28 AM VAIBHAV SHUKLA
shuklavaibha...@yahoo.in  wrote:
>
> When I restart Solr it throws the following error. Solr collection indexed to 
> pdf in hdfs throws error during solr restart.
>
>
>
> Error
>
> java.util.concurrent.ExecutionException: 
> org.apache.solr.common.SolrException: Unable to create core [PDFIndex]
> at java.util.concurrent.FutureTask.report(FutureTask.java:122)
> at java.util.concurrent.FutureTask.get(FutureTask.java:192)
> at 
> org.apache.solr.core.CoreContainer.lambda$load$6(CoreContainer.java:594)
> at 
> com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:176)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:229)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.solr.common.SolrException: Unable to create core 
> [PDFIndex]
> at 
> org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:966)
> at 
> org.apache.solr.core.CoreContainer.lambda$load$5(CoreContainer.java:565)
> at 
> com.codahale.metrics.InstrumentedExecutorService$InstrumentedCallable.call(InstrumentedExecutorService.java:197)
> ... 5 more
> Caused by: org.apache.solr.common.SolrException: Index dir 
> 'hdfs://192.168.1.16:8020/PDFIndex/data/index/' of core 'PDFIndex' is already 
> locked. The most likely cause is another Solr server (or another solr core in 
> this server) also configured to use this directory; other possible causes may 
> be specific to lockType: hdfs
> at org.apache.solr.core.SolrCore.<init>(SolrCore.java:977)
> at org.apache.solr.core.SolrCore.<init>(SolrCore.java:830)
> at 
> org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:950)
> ... 7 more
> Caused by: org.apache.lucene.store.LockObtainFailedException: Index dir 
> 'hdfs://192.168.1.16:8020/PDFIndex/data/index/' of core 'PDFIndex' is already 
> locked. The most likely cause is another Solr server (or another solr core in 
> this server) also configured to use this directory; other possible causes may 
> be specific to lockType: hdfs
> at org.apache.solr.core.SolrCore.initIndex(SolrCore.java:712)
> at org.apache.solr.core.SolrCore.<init>(SolrCore.java:923)
> ... 9 more
> 2018-12-22 07:55:13.431 ERROR 
> (OldIndexDirectoryCleanupThreadForCore-PDFIndex) [   x:PDFIndex] 
> o.a.s.c.HdfsDirectoryFactory Error checking for old index directories to 
> clean-up.
> java.io.IOException: Filesystem closed
> at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:808)
> at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:2083)
> at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:2069)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:791)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.access$700(DistributedFileSystem.java:106)
> at 
>

Re: ClassCastException in SolrJ 7.6+

2019-03-11 Thread Jason Gerlowski
Hi Gerald,

That looks like it might be a bug in SolrJ's JSON faceting support.
Do you have a small code snippet that reproduces the problem?  That'll
help us confirm it's a bug, and get us started on fixing it.
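
For anyone reading along: the exception text itself is the generic Java
failure of casting one boxed numeric type to another.  The sketch below
is independent of SolrJ and is not the NestableJsonFacet code; the
class and method names are made up, and the Long-valued "count" is an
assumption about what the response might contain.

```java
/** Hypothetical illustration of the Long-to-Integer cast failure. */
final class NumberCastSketch {

    /** Throws ClassCastException when the value is a Long rather than an Integer. */
    static int unsafeCount(Object value) {
        return (Integer) value;
    }

    /** Works for any boxed numeric type by going through Number. */
    static int safeCount(Object value) {
        return ((Number) value).intValue();
    }

    public static void main(String[] args) {
        Object count = 8L; // assumed: a count that was deserialized as a Long
        System.out.println(safeCount(count));
        try {
            unsafeCount(count);
        } catch (ClassCastException e) {
            System.out.println("ClassCastException: " + e.getMessage());
        }
    }
}
```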

Best,

Jason

On Mon, Mar 11, 2019 at 10:29 AM Gerald Bonfiglio  wrote:
>
> I'm seeing the following Exception using JSON Facet API in SolrJ 7.6, 7.7, 
> 7.7.1:
>
> Caused by: java.lang.ClassCastException: java.lang.Long cannot be cast to 
> java.lang.Integer
>   at 
> org.apache.solr.client.solrj.response.json.NestableJsonFacet.<init>(NestableJsonFacet.java:52)
>   at 
> org.apache.solr.client.solrj.response.QueryResponse.extractJsonFacetingInfo(QueryResponse.java:200)
>   at 
> org.apache.solr.client.solrj.response.QueryResponse.getJsonFacetingResponse(QueryResponse.java:571)
>


Apache Solr Reference Guide 7.7 Released

2019-03-11 Thread Jason Gerlowski
The Lucene PMC is pleased to announce that the Solr Reference Guide
for 7.7 is now available.

This 1,431-page PDF is the definitive guide to using Apache Solr, the
search server built on Lucene.

The PDF Guide can be downloaded from:
https://www.apache.org/dyn/closer.cgi/lucene/solr/ref-guide/apache-solr-ref-guide-7.7.pdf.

It is also available online at https://lucene.apache.org/solr/guide/7_7.


Re: Solrj, Json Facets, (Date) stats facets

2019-03-11 Thread Jason Gerlowski
Hi Andrea,

It looks like you've stumbled on a bug in NestableJsonFacet.  I
clearly wasn't thinking about Date stats when I first wrote it; it
looks like it doesn't detect/parse them correctly in the current
iteration.  I'll try to fix this in a subsequent release.  But in the
meantime, unfortunately your only option is to use the NamedList
structures directly to retrieve the stat value.
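
A rough sketch of both the problem and the workaround, using a plain
Map as a stand-in for the "facets" section of the response so that it
runs without SolrJ on the classpath.  The class and method names are
hypothetical and this is not the NestableJsonFacet code.  In SolrJ
itself the raw structure should be reachable through
QueryResponse.getResponse(), with the "facets" entry holding the stat
values.

```java
import java.time.Instant;
import java.util.Date;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical model of Number-only stat parsing and a raw-value workaround. */
final class DateStatSketch {

    /** Mimics a parser that keeps a stat only when its value is a Number. */
    static Map<String, Number> numberStatsOnly(Map<String, Object> facets) {
        Map<String, Number> stats = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : facets.entrySet()) {
            if (!"count".equals(e.getKey()) && e.getValue() instanceof Number) {
                stats.put(e.getKey(), (Number) e.getValue());
            }
        }
        return stats;
    }

    /** Workaround: read the raw value directly and accept a Date. */
    static Date dateStat(Map<String, Object> facets, String name) {
        Object value = facets.get(name);
        return (value instanceof Date) ? (Date) value : null;
    }

    /** The facet response from this thread: a count plus a date-valued max(). */
    static Map<String, Object> exampleFacets() {
        Map<String, Object> facets = new LinkedHashMap<>();
        facets.put("count", 8L);
        facets.put("x", Date.from(Instant.parse("1973-09-20T17:33:18.700Z")));
        return facets;
    }

    public static void main(String[] args) {
        Map<String, Object> facets = exampleFacets();
        // The date-valued stat "x" is silently dropped by the Number-only filter.
        System.out.println("Number-only stats: " + numberStatsOnly(facets));
        // Reading the raw structure still finds it.
        System.out.println("Raw date stat: " + dateStat(facets, "x").toInstant());
    }
}
```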

Thanks for bringing it to our attention.

Best,

Jason

On Fri, Mar 8, 2019 at 4:42 AM Andrea Gazzarini  wrote:
>
> Good morning guys, I have a question about SolrJ and JSON facets.
>
> I'm using Solr 7.7.1 and I'm sending a request like this:
>
> json.facet={x:'max(iterationTimestamp)'}
>
> where "iterationTimestamp" is a solr.DatePointField. The JSON response
> correctly includes what I'm expecting:
>
>  "facets": {
>  "count": 8,
>  "x": "1973-09-20T17:33:18.700Z"
>  }
>
> but SolrJ doesn't. Specifically, the jsonFacetingResponse contains only
> the domainCount attribute (8).
> Looking at the code, I see that in NestableJsonFacet a stat is taken into
> account only if the corresponding value is an instance of Number (and x
> in the example above is a java.util.Date).
>
> Is that expected? Is there a way (other than dealing with nested
> NamedLists) for retrieving that value?
>
> Cheers,
> Andrea


Re: Hide BasicAuth JVM param on SOLR admin UI

2019-03-07 Thread Jason Gerlowski
Solr has a configuration option that allows redacting particular
properties that appear in the Admin UI.  I _think_ this is the
functionality you're looking for.  For more information, Kevin Risden
has a great little writeup of it here:
https://risdenk.github.io/2018/11/27/apache-solr-hide-redact-sensitive-properties.html

Hope that helps,

Jason

On Wed, Mar 6, 2019 at 9:27 PM Aroop Ganguly  wrote:
>
> Try changing the passwords using the auth API:
> https://lucene.apache.org/solr/guide/6_6/basic-authentication-plugin.html#BasicAuthenticationPlugin-AddaUserorEditaPassword
>
> From that point onwards your credentials will be encrypted in the Admin UI.
> I do not think your -Dbasicauth password will change, but your actual password
> would be different and stored as a base64-encoded hash.
>
>
> > On Mar 6, 2019, at 12:22 AM, el mas capo  wrote:
> >
> > Hi everyone,
> > I am trying to configure SolrCloud (7.7.0) with basic authentication. All
> > seems to work nicely, but when I open the Web UI I can see the basic
> > auth password configured in solr.in.sh in clear text:
> > -Dbasicauth=solr:SolrRocks
> > Can this behaviour be avoided?
> > Thank you for your attention.
> >
>


Re: Solr Reference Guide for version 7.7

2019-03-01 Thread Jason Gerlowski
Hi Edwin,

I volunteered to release the 7.7 ref-guide last week but decided to
wait until 7.7.1 came out to work on it.  (You probably know that
7.7.0 contained some serious bugs.  These would've required
non-trivial documentation effort in the ref-guide, and 7.7.1 already
had a release-manager and was coming soon, so it was simpler to wait.)

I'm back working on the 7.7 ref-guide today and hopefully we'll have
one out next week.  In the meantime, if you'd like to have the latest
documentation you can always check out the source code and build the
ref-guide locally ("ant clean default" from the solr/solr-ref-guide
directory, see the README in that same directory for more help)

Best,

Jason

On Thu, Feb 28, 2019 at 11:05 PM Zheng Lin Edwin Yeo
 wrote:
>
> Hi,
>
> I understand that Solr 7.7.1 has just been released, but Solr 7.7.0 was
> released almost a month ago.
>
> However, from http://lucene.apache.org/solr/guide/, I still cannot
> access the guide for version 7.7; the latest version is still 7.6.
>
> Are there any plans to release the guide for 7.7, or has the site been
> moved to a new URL?
>
> Regards,
> Edwin


Re: Python Client for Solr Cloud - Leader aware

2019-03-01 Thread Jason Gerlowski
Hi Ganesh,

I'm not an expert on pysolr, but from a quick scan of their update
code, it does look like pysolr attempts to send update requests to _a_
leader node for a particular collection.  But that's all it does.  It
doesn't check which shard the document(s) will belong to and try to
pick the _correct_ leader. If your collections only have 1 shard, this
is still pretty great.  But if your collections have multiple shards
(and multiple leaders), then this will perform worse than SolrJ.

(This is based on what I gleaned from the code here:
https://github.com/django-haystack/pysolr/blob/master/pysolr.py#L1268
. Happy to be corrected by someone with more context.)

Best,

Jason

On Tue, Feb 26, 2019 at 1:50 PM Ganesh Sethuraman
 wrote:
>
> We are using Solr Cloud 7.2.1. Is there a leader-aware Python client (like
> SolrJ for Java) that can send updates to the leader and is highly
> available?
> I see the PySolr project (https://pypi.org/project/pysolr/), but I am not
> able to find any documentation on whether it supports leader-aware updates.
>
> Regards
> Ganesh


Re: Giving SolrJ credentials for Zookeeper

2019-03-01 Thread Jason Gerlowski
Hi Ryan,

I haven't tried this myself, but wanted to offer a sanity check based
on how I understand those instructions.

Are you setting the "zkCredentialsProvider", "zkDigestUsername", and
"zkDigestPassword" system-properties on your client app/process as
well as on your Solr/ZK servers?  Or are you just setting it in the
config for your Solr/ZK servers?  I expect those system properties
need to be set for the client process as well, though the ref-guide
page doesn't explicitly say so.
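
If it helps, this is roughly what I mean for the client side.  Like the
rest of this reply it is untested against a secured ZooKeeper, so treat
it as a sketch: the three property names are the ones above, the
provider class is the one I understand the ZooKeeper Access Control
page to use, and the helper class and credentials are placeholders.

```java
/** Hypothetical helper: sets the ZK credential properties in the client JVM. */
final class ZkClientCredentials {

    static void configure(String user, String password) {
        // Credentials provider class, assumed from the ZooKeeper Access Control page.
        System.setProperty("zkCredentialsProvider",
            "org.apache.solr.common.cloud.VMParamsSingleSetCredentialsDigestZkCredentialsProvider");
        System.setProperty("zkDigestUsername", user);
        System.setProperty("zkDigestPassword", password);
    }

    public static void main(String[] args) {
        // Placeholder credentials; use the ones configured for your ZK ACLs.
        configure("admin-user", "CHANGEME-ADMIN-PASSWORD");
        System.out.println(System.getProperty("zkCredentialsProvider"));
        // Build the CloudSolrClient only after these properties are set.
    }
}
```

Passing the same three values as -D flags when launching the indexer
should be equivalent, and keeps the credentials out of the code.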

Best,

Jason

On Tue, Feb 26, 2019 at 12:56 PM Snead, Ryan [USA]  wrote:
>
> I am following along with the example found in Zookeeper Access Control of 
> the Apache Solr 7.5 Reference Guide. I have gotten to the point where I can 
> use the zkcli.sh control script to access my secured Zookeeper environment. I 
> can also connect using Zookeeper's zkCli.sh and then authenticate using the 
> auth command. The point where I run into trouble is having completed the 
> steps in the article, how do I find what parameters to set with SolrJ to 
> allow my indexer code to communicate with Zookeeper.
>
> The error my Java code is returning when I try to process a QueryRequest is: 
> Error reading cluster properties from zookeeper 
> org.apache.zookeeper.KeeperException$NoAuthException: KeeperError Code = 
> NoAuth for /clusterprops.json
>
> My code is:
> solrClient = new CloudSolrClient.Builder("localhost:2181", 
> Optional.of("/")).build();
> String solrQuery = String.format("PRODUCT_TYPE:USER and PRODUCT_SK:%s", 
> productSk);
> SolrQuery q = new SolrQuery();
> q.set("q", solrQuery);
> QueryRequest request = new QueryRequest(q);
> numfound = request.process(solrClient).getResults().getNumFound();
> Error occurs at the last line. I suspect that I need to set a property in 
> solrClient, but it is not clear to me what that would be.
>
> References:
> https://lucene.apache.org/solr/guide/7_5/zookeeper-access-control.html
>
>

