number of documents exceed 2147483519

2020-03-16 Thread Hongxu Ma
Hi, I'm using SolrCloud (ver 6.6) and got an error: org.apache.solr.common.SolrException: Exception writing document id (null) to the index; possible analysis error: number of documents in the index cannot exceed 2147483519. After googling it, I learned that the document count has exceeded the limit for a single Solr shard. The
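For reference, the number in the error message is Java's Integer.MAX_VALUE minus 128, which is the per-index document cap in Lucene. The sketch below reproduces that arithmetic and estimates a minimum shard count; the helper name `min_shards` is made up for illustration and assumes documents are spread evenly across shards.

```python
# Lucene's per-index document cap: Java's Integer.MAX_VALUE minus 128.
JAVA_INT_MAX = 2**31 - 1
MAX_DOCS_PER_SHARD = JAVA_INT_MAX - 128  # the value quoted in the error message


def min_shards(total_docs: int, per_shard_limit: int = MAX_DOCS_PER_SHARD) -> int:
    """Smallest shard count that keeps each shard at or under the limit.

    Hypothetical helper: assumes an even spread of documents across shards.
    """
    if total_docs < 0:
        raise ValueError("total_docs must be non-negative")
    # Ceiling division without floating point; always at least one shard.
    return max(1, -(-total_docs // per_shard_limit))


if __name__ == "__main__":
    print(MAX_DOCS_PER_SHARD)
    print(min_shards(3_000_000_000))
```

In practice you would leave generous headroom rather than size shards right at the cap.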

Re: Copying data

2020-03-16 Thread Erick Erickson
It’s not at all clear what the problem is. If you have a single-shard collection, just:
1> create the stand-alone core
2> shut down the Solr instance
3> replace the stand-alone core's data dir with one from any of your prod machines
4> start Solr
An alternative is to use the replication API
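As a rough sketch of the replication-API alternative: the replication handler accepts a `fetchindex` command with a `masterUrl` parameter pointing at the source core's replication endpoint. The host names and core name below are placeholders, not values from this thread, and the function only builds the URL; it does not send the request.

```python
from urllib.parse import urlencode


def fetchindex_url(target_core_url: str, source_core_url: str) -> str:
    """Build a one-off 'fetchindex' request URL for the target core.

    Both arguments are base core URLs such as http://host:8983/solr/mycore
    (placeholder values; adjust to your own hosts and core names).
    """
    params = urlencode({
        "command": "fetchindex",
        # Replication endpoint of the core to copy the index from.
        "masterUrl": source_core_url.rstrip("/") + "/replication",
    })
    return f"{target_core_url.rstrip('/')}/replication?{params}"


if __name__ == "__main__":
    print(fetchindex_url("http://localhost:8983/solr/mycore",
                         "http://prod1:8983/solr/mycore"))
```

Issuing a GET against the resulting URL (with curl or a browser) asks the stand-alone core to pull the index from the production core, assuming the replication handler is reachable on both sides.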

How to sum model grouped?

2020-03-16 Thread hakan
I use Solr version 7.1. I have a grouped model with 11M records in total, as in the example below. My question is: how do I sum the fromfollowers field across this grouped model? { groupValue: "1927245294", doclist: { numFound: 1, start: 0, docs: [
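One way to picture the sum is a client-side pass over the grouped response. This is a minimal sketch, assuming the usual grouped layout (`grouped` -> group field -> `groups` -> `doclist.docs`); the group field name `userid` and the sample values are invented for illustration, since the original post does not show them.

```python
def sum_grouped_field(response: dict, group_field: str, value_field: str) -> float:
    """Sum value_field over every document returned in a grouped response.

    Only documents actually present in the response are counted, so the
    result depends on rows and group.limit. Docs missing the field add 0.
    """
    total = 0
    for group in response["grouped"][group_field]["groups"]:
        for doc in group["doclist"]["docs"]:
            total += doc.get(value_field, 0)
    return total


# Hypothetical sample shaped like the snippet in the question.
sample = {
    "grouped": {
        "userid": {
            "matches": 2,
            "groups": [
                {"groupValue": "1927245294",
                 "doclist": {"numFound": 1, "start": 0,
                             "docs": [{"fromfollowers": 10}]}},
                {"groupValue": "42",
                 "doclist": {"numFound": 1, "start": 0,
                             "docs": [{"fromfollowers": 5}]}},
            ],
        }
    }
}

if __name__ == "__main__":
    print(sum_grouped_field(sample, "userid", "fromfollowers"))
```

For 11M records, paging every group to the client is unlikely to be practical; a server-side aggregation such as the stats component (`stats=true&stats.field=fromfollowers`) is probably the better fit, though whether it matches the intended per-group semantics depends on the rest of the question.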

How do *you* restrict access to Solr?

2020-03-16 Thread Ryan W
How do you, personally, do it? Do you use IPTables? Basic Authentication Plugin? Something else? I'm asking in part so I'll have something to search for. I don't know where I should begin, so I figured I would ask how others do it. I haven't been able to find anything that works, so if you can

Re: How do *you* restrict access to Solr?

2020-03-16 Thread Ryan W
Thanks Jorn, though this all seems unrealistic. Because the technical skill required to secure Solr far exceeds the technical skill required to install it, I suspect there are probably a lot of insecure installs out there. In many cases this will not apply: "if you work with people that know a

RE: How do *you* restrict access to Solr?

2020-03-16 Thread Dunigan, Craig A.
Setting up Apache is off-topic, but it’s just a matter of ProxyPass to the Solr app URL. I already gave you the relevant IP restriction configuration directive, “Allow from “. The rest is in httpd documentation. From: Ryan W Sent: Monday, March 16, 2020 10:41 AM To:

Re: How do *you* restrict access to Solr?

2020-03-16 Thread Jörn Franke
Solr should not be accessible to end users directly, only through a dedicated application in between. Then, in an enterprise setting, it is mostly Kerberos auth and HTTPS (do not forget about ZooKeeper when using SolrCloud; there you can also have Kerberos auth and, in recent versions, SSL).

Re: How do *you* restrict access to Solr?

2020-03-16 Thread David Hastings
Master/slave is the idea that you have an indexing server that you do all indexing to, and a search server that replicates the index to deliver the results, etc. If you keep the indexer separate, you can tune it differently as well as protect the data. It also means you can remove the delete/update

Re: How do *you* restrict access to Solr?

2020-03-16 Thread Aroop Ganguly
Hi Ryan, you should consider a simple rule-based authorization scheme. Your staff user can be given read-only privileges to everything you want, except the admin UI. Depending on which version of Solr you are on, this can be trivial. - Aroop > On Mar 16, 2020, at 8:46 AM, Ryan W wrote: > >

Using Synonym Graph Filter with StandardTokenizer does not tokenize the query string if it has multi-word synonym

2020-03-16 Thread atin janki
Hello everyone, I am using solr 8.3. After I included Synonym Graph Filter in my managed-schema file, I have noticed that if the query string contains a multi-word synonym, it considers that multi-word synonym as a single term and does not break it, further suppressing the default search

Re: How do *you* restrict access to Solr?

2020-03-16 Thread Ryan W
On Mon, Mar 16, 2020 at 11:09 AM Walter Underwood wrote: > What access do you want to prevent? How do you prefer to authenticate? > How do you manage users or roles? Master/slave or Solr Cloud? > I want to prevent access to the admin UI. I don't want to manage users or roles, preferably. I

Re: How do *you* restrict access to Solr?

2020-03-16 Thread Ryan W
On Mon, Mar 16, 2020 at 11:40 AM Walter Underwood wrote: > Also, even if you prevent access to the admin UI, a request to /update can > delete > all the content. It is really easy. This Gist shows how. > > https://gist.github.com/nz/673027/313f70681daa985ea13ba33a385753aef951a0f3 This seems

Re: Using Synonym Graph Filter with StandardTokenizer does not tokenize the query string if it has multi-word synonym

2020-03-16 Thread atin janki
Using sow=true does split the query on whitespace, but it no longer looks for synonyms of "soap powder"; instead it expands separate synonyms for "soap" and "powder". Best Regards, Atin Janki On Mon, Mar 16, 2020 at 4:59 PM Audrey Lorberfeld - audrey.lorberf...@ibm.com wrote: > Have

Re: How do *you* restrict access to Solr?

2020-03-16 Thread Nicolas Franck
IPtables seems like the way to go, at least for me. Even if this basic-auth-plugin works, then you'll have to deal with denial-of-service attacks (although these can also happen indirectly, by hitting the website that uses Solr). > On 16 Mar 2020, at 15:44, Ryan W wrote: > > How do you,

Re: How do *you* restrict access to Solr?

2020-03-16 Thread Walter Underwood
What access do you want to prevent? How do you prefer to authenticate? How do you manage users or roles? Master/slave or Solr Cloud? wunder Walter Underwood wun...@wunderwood.org http://observer.wunderwood.org/ (my blog) > On Mar 16, 2020, at 7:44 AM, Ryan W wrote: > > How do you, personally,

Re: Using Synonym Graph Filter with StandardTokenizer does not tokenize the query string if it has multi-word synonym

2020-03-16 Thread Audrey Lorberfeld - audrey.lorberf...@ibm.com
Have you set sow=true in your search handler? I know that we have it set to false (sow = split on whitespace) because we WANT multi-token synonyms retained as multiple tokens. On 3/16/20, 10:49 AM, "atin janki" wrote: Hello everyone, I am using solr 8.3. After I
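To make the `sow` switch concrete, here is a small sketch that builds the query string for a select request with the parameter set either way. Using `defType=edismax` is an assumption for illustration; the thread does not say which parser the handler uses.

```python
from urllib.parse import urlencode


def build_select_params(user_query: str, split_on_whitespace: bool) -> str:
    """Return URL-encoded parameters for a /select request.

    sow=false hands the whole query text to the field analyzer, so a
    multi-word synonym such as "soap powder" can match as a unit.
    sow=true splits on whitespace first, so each word is analyzed alone.
    """
    return urlencode({
        "q": user_query,
        "defType": "edismax",  # assumed parser, not stated in the thread
        "sow": "true" if split_on_whitespace else "false",
    })


if __name__ == "__main__":
    print(build_select_params("soap powder", split_on_whitespace=False))
    print(build_select_params("soap powder", split_on_whitespace=True))
```

The same parameter can also be set as a default in the search handler configuration instead of being sent on every request.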

Re: How do *you* restrict access to Solr?

2020-03-16 Thread Susheel Kumar
Basic auth should help you to start https://lucene.apache.org/solr/guide/8_1/basic-authentication-plugin.html On Mon, Mar 16, 2020 at 10:44 AM Ryan W wrote: > How do you, personally, do it? Do you use IPTables? Basic Authentication > Plugin? Something else? > > I'm asking in part so I'l have

Re: How do *you* restrict access to Solr?

2020-03-16 Thread David Hastings
Honestly? I know this isn't what you're going to want to hear, but security through obscurity. No one else knows what port the server's on, and it's not accessible from anything outside of the internal network. If your Solr install can be accessed from an external IP, you have much larger issues.

Re: How do *you* restrict access to Solr?

2020-03-16 Thread Ryan W
On Mon, Mar 16, 2020 at 10:50 AM David Hastings < hastings.recurs...@gmail.com> wrote: > Honestly? I know this isn't what you're going to want to hear, but security > through obscurity. No one else knows what port the server's on, and it's not > accessible from anything outside of the internal

RE: How do *you* restrict access to Solr?

2020-03-16 Thread Dunigan, Craig A.
Here are my suggestions. If you’re okay with IP restrictions only, then iptables. If you don’t have *nix or root access, an Apache proxy server with Allow from . If you want really, really secure, an stunnel front-end that requires client certs that you install in your browsers. For us, we

Re: How do *you* restrict access to Solr?

2020-03-16 Thread Walter Underwood
If your data changes slowly and you don’t need to shard, master/slave is great. It is loosely coupled, so not as complicated as Solr Cloud. Each slave is an exact clone. For master/slave, you can put an HTTP server (nginx, etc.) on each server and proxy traffic to Solr. Then configure Solr to

Re: How do *you* restrict access to Solr?

2020-03-16 Thread Ryan W
On Mon, Mar 16, 2020 at 11:32 AM Dunigan, Craig A. < craig.duni...@landsend.com> wrote: > Here are my suggestions. If you’re okay with IP restrictions only, then > iptables. Thanks! Just knowing this is an option helps. I took a stab at it but it didn't work initially, but at least now I

Re: How do *you* restrict access to Solr?

2020-03-16 Thread Ryan W
On Mon, Mar 16, 2020 at 10:51 AM Susheel Kumar wrote: > Basic auth should help you to start > > https://lucene.apache.org/solr/guide/8_1/basic-authentication-plugin.html Thanks. I think I will give up on the plugin system. I haven't been able to get the plugin system to work, and it creates

Re: Re: Re: Using Synonym Graph Filter with StandardTokenizer does not tokenize the query string if it has multi-word synonym

2020-03-16 Thread Audrey Lorberfeld - audrey.lorberf...@ibm.com
I don't think you can synonym-ize both the multi-token phrase and each individual token in the multi-token phrase at the same time. But anyone else feel free to chime in! Best, Audrey Lorberfeld On 3/16/20, 12:40 PM, "atin janki" wrote: I aim to achieve an expansion like -

Re: Re: Using Synonym Graph Filter with StandardTokenizer does not tokenize the query string if it has multi-word synonym

2020-03-16 Thread Audrey Lorberfeld - audrey.lorberf...@ibm.com
To confirm, you want a synonym like "soap powder" to map onto synonyms like "hand soap," "hygiene products," etc? As in, more of a cognitive synonym mapping where you feed synonyms that only apply to the multi-token phrase as a whole? On 3/16/20, 12:17 PM, "atin janki" wrote: Using

Re: Re: Using Synonym Graph Filter with StandardTokenizer does not tokenize the query string if it has multi-word synonym

2020-03-16 Thread atin janki
I aim to achieve an expansion like - Synonym(soap powder) + Synonym(soap) + Synonym (powder), which is not happening because of the way the synonym expansion is being done at the moment. At the moment, using Synonym Graph Filter with StandardTokenizer and sow = false expands as - Synonym(soap

Re: Unsubscribe request

2020-03-16 Thread Gora Mohanty
On Tue, 17 Mar 2020 at 05:18, Arpit Agarwal wrote: > Hi, > Please unsubscribe my email address (arpit.agarwa...@gmail.com) from your > mailing list . > Please follow the usual practice for unsubscribing from a public mailing list: see https://lucene.apache.org/solr/community.html . You need to

RE: How do *you* restrict access to Solr?

2020-03-16 Thread Phil Scadden
First off, use basic authentication to at least partially lock it down. Only the application server has access to the password. Second, our IT people thought Solr security insufficient to even remotely consider exposing it to the external web. It lives behind a firewall, so we do a kind of proxy. External

Unsubscribe request

2020-03-16 Thread Arpit Agarwal
Hi, Please unsubscribe my email address (arpit.agarwa...@gmail.com) from your mailing list . Thanks Arpit A

Zookeeper migration

2020-03-16 Thread Dwane Hall
Hey Solr community, I’m wondering if anyone has ever managed a ZooKeeper migration while running SolrCloud or if they have any advice on the process (not a ZooKeeper upgrade but a new physical instance migration)? I could not seem to find any endpoints in the Collections or CoreAdmin APIs