Re: Unsubscribe request

2020-03-16 Thread Gora Mohanty
On Tue, 17 Mar 2020 at 05:18, Arpit Agarwal 
wrote:

> Hi,
> Please unsubscribe my email address (arpit.agarwa...@gmail.com) from your
> mailing list .
>

Please follow the usual practice for unsubscribing from a public mailing
list: see https://lucene.apache.org/solr/community.html . You need to send
an email to solr-user-unsubscribe 
to unsubscribe, and not to this list at large.

Regards,
Gora


Unsubscribe request

2020-03-16 Thread Arpit Agarwal
Hi,
Please unsubscribe my email address (arpit.agarwa...@gmail.com) from your
mailing list .

Thanks
Arpit A


Using Synonym Graph Filter with StandardTokenizer does not tokenize the query string if it has multi-word synonym

2020-03-16 Thread atin janki
Hello everyone,

I am using solr 8.3.

After I included Synonym Graph Filter in my managed-schema file, I have
noticed that if the query string contains a multi-word synonym, Solr
treats that multi-word synonym as a single term and does not split it into
separate tokens, which suppresses the default search behaviour.

I am using StandardTokenizer.

Below is a snippet from managed-schema file -

  

  
  
  


  
  
  
  

  


Here "soap powder" is the search query which is also a multi-word synonym
in the synonym file as-

s(104254535,1,'soap powder',n,1,1).
s(104254535,2,'built-soap powder',n,1,0).
s(104254535,3,'washing powder',n,1,0).
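
For anyone not familiar with this format: the lines above are WordNet prolog entries, where entries sharing the same first number (the synset ID) are synonyms of one another. Below is a minimal Python sketch, not part of my actual setup, that groups such lines by synset ID. The function name group_synsets is made up for illustration.

```python
import re
from collections import defaultdict

# Matches lines such as: s(104254535,1,'soap powder',n,1,1).
LINE_RE = re.compile(r"^s\((\d+),(\d+),'(.*)',([a-z]),(\d+),(\d+)\)\.$")


def group_synsets(lines):
    """Group WordNet prolog 's' entries by synset ID (hypothetical helper)."""
    synsets = defaultdict(list)
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip anything that is not an 's' entry
        synset_id, word = match.group(1), match.group(3)
        # WordNet prolog escapes a single quote by doubling it.
        synsets[synset_id].append(word.replace("''", "'"))
    return dict(synsets)


entries = [
    "s(104254535,1,'soap powder',n,1,1).",
    "s(104254535,2,'built-soap powder',n,1,0).",
    "s(104254535,3,'washing powder',n,1,0).",
]
print(group_synsets(entries))
```

All three entries end up in one group, which is why "soap powder" is handled as a multi-word synonym at query time.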

I am sharing some screenshots to illustrate the problem -

without Synonym Graph Filter => 2 docs returned (screenshot at the URL
below) -

https://ibb.co/zQXx7mV

with Synonym Graph Filter => 2 docs expected, only 1 returned (screenshot
at the URL below) -

https://ibb.co/tp04Rzw



Has anyone experienced this before? If yes, is there a workaround?

Or is this expected behaviour?

Regards,
Atin Janki



Zookeeper migration

2020-03-16 Thread Dwane Hall
Hey Solr community,


I’m wondering if anyone has ever managed a zookeeper migration while running 
SolrCloud, or has any advice on the process (not a zookeeper upgrade 
but a migration to a new physical instance)? I could not find any endpoints 
in the Collections or CoreAdmin APIs that cater for this scenario.


Initially I was hoping I could do all of the required zookeeper preparation 
(znode creation, clusterprops …) and start my existing Solr instance pointing 
at the new zookeeper, but of course the new instance is unaware of the cluster 
state (state.json).  In fact, when trialling this in a development environment it 
was quite a destructive operation, as data from my SOLR_HOME (/var/solr/data) was 
physically deleted after I connected Solr to the new zookeeper instance! I’m 
uncertain if this is the expected behaviour or not, but it’s certainly something 
for people to be aware of!


After some investigation and testing my thoughts are I’ll need to complete the 
following:


- Stop the existing Solr instance so no updates are occurring

- Backup the data

- Create the znode on the new zookeeper instance

- Update/upload the appropriate zookeeper managed files to the new 
zookeeper instance (security.json, clusterprops.json, solr.xml etc.)

- Start Solr using ZK_HOST equal to the new zookeeper instance and 
znode (possibly use new Solr nodes here and not the existing ones)

- Replicate the collection creation process on the new zookeeper 
instance

- Physically copy the data from the old Solr nodes to the new Solr 
nodes and carefully map each replica and shard to the new location (which will 
have new replica names)

- Start the new Solr instance

- Clean up the old instance


So in summary: has anybody completed a similar migration, can anyone offer advice, 
or is anyone aware of an easy way to transfer state between zookeeper instances 
to avoid the migration process I’ve outlined above?
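

One way to carry state across might be to copy the znode tree under the Solr chroot from the old ensemble to the new one while Solr is stopped. Below is a minimal Python sketch of such a recursive copy. Whether this alone is sufficient for a SolrCloud migration is an untested assumption. The copy_tree helper and the in-memory FakeZk stand-in are invented for illustration; the helper only relies on get, get_children and create methods shaped like those of kazoo's KazooClient, so a real client could be passed in instead.

```python
class FakeZk:
    """In-memory stand-in exposing the few client methods the copy needs."""

    def __init__(self, nodes=None):
        # Maps absolute znode path -> bytes payload.
        self.nodes = dict(nodes or {})

    def get(self, path):
        # Mirrors the (data, stat) tuple shape; stat is not modelled here.
        return self.nodes[path], None

    def get_children(self, path):
        prefix = path.rstrip("/") + "/"
        return sorted(
            p[len(prefix):]
            for p in self.nodes
            if p.startswith(prefix) and "/" not in p[len(prefix):]
        )

    def create(self, path, value=b"", makepath=False):
        if path in self.nodes:
            raise ValueError("node already exists: " + path)
        self.nodes[path] = value


def copy_tree(source, target, path):
    """Recursively copy the znode at `path` and all of its descendants."""
    data, _stat = source.get(path)
    target.create(path, data or b"", makepath=True)
    for child in source.get_children(path):
        copy_tree(source, target, path.rstrip("/") + "/" + child)


# Hypothetical znode layout, for demonstration only.
old_zk = FakeZk({
    "/solr": b"",
    "/solr/clusterprops.json": b"{}",
    "/solr/collections": b"",
    "/solr/collections/demo": b"",
    "/solr/collections/demo/state.json": b'{"demo": {}}',
})
new_zk = FakeZk()
copy_tree(old_zk, new_zk, "/solr")
print(sorted(new_zk.nodes))
```

Two caveats with this idea: ephemeral nodes such as those under live_nodes are recreated by Solr on startup and probably should not be copied, and state.json records node names and base URLs, so a copied state would only line up if the Solr nodes keep the same host names and ports.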


Many thanks,


Dwane


Environment (SolrCloud)

Existing Zk: 3.4.2 (Bare metal)

New Zk: 3.4.2 (Docker)

Existing Solr: 7.7.2 (Docker)


RE: How do *you* restrict access to Solr?

2020-03-16 Thread Phil Scadden
First off, use basic authentication to at least partially lock it down. Only 
the application server has access to the password. Second, our IT people 
thought Solr security insufficient to even remotely consider exposing it to the 
external web. It lives behind a firewall, so we use a kind of proxy. External 
queries are passed to an internal application server, which examines, modifies 
and adds security to the queries and then passes them to Solr. Results are sent 
back up the chain to the external application server. I believe variations of 
this are what is expected. Our deconstruct/reconstruct handling of queries is 
unusual, but it does allow us to use rights-based access to functionality, i.e. 
the general public can search against the title, author and abstract, while 
privileged and internal users can query against the full text of the technical reports.
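
A rough Python sketch of the kind of rights-based rewriting described above, where the internal application server decides which fields a request may search. The role names, field names and the build_params helper are all hypothetical; our real deconstruct/reconstruct logic is not shown here. The only Solr-specific assumptions are that the edismax parser's qf parameter lists the fields to search and that its uf parameter limits which fields a user may name explicitly in the query.

```python
from urllib.parse import urlencode

# Hypothetical field lists; a real schema will differ.
PUBLIC_FIELDS = ["title", "author", "abstract"]
PRIVILEGED_FIELDS = PUBLIC_FIELDS + ["fulltext"]


def build_params(user_text, role):
    """Build Solr query parameters restricted to the fields the role may search."""
    if role in {"privileged", "internal"}:
        fields = PRIVILEGED_FIELDS
    else:
        fields = PUBLIC_FIELDS
    return {
        "defType": "edismax",
        "q": user_text,
        "qf": " ".join(fields),  # fields searched by default
        "uf": " ".join(fields),  # fields the user may reference explicitly
    }


print(urlencode(build_params("seismic hazard", "public")))
```

The point of the design is that the external caller never chooses the field list; it is derived from the role on the internal server before the request reaches Solr.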

-Original Message-
From: Ryan W 
Sent: Tuesday, 17 March 2020 03:44
To: solr-user@lucene.apache.org
Subject: How do *you* restrict access to Solr?

How do you, personally, do it?  Do you use IPTables?  Basic Authentication 
Plugin? Something else?

I'm asking in part so I'll have something to search for.  I don't know where I 
should begin, so I figured I would ask how others do it.

I haven't been able to find anything that works, so if you can tell me what 
works for you, I can at least narrow it down a bit and do some Google searches. 
 Do I need to learn Solr's plugin system?  Am I starting in the right place if 
I follow this document:
https://lucene.apache.org/solr/guide/7_0/rule-based-authorization-plugin.html#rule-based-authorization-plugin

Initially, the above document seems far too comprehensive for my needs.  I just 
want to block access to the Solr admin UI, and the list of predefined 
permissions in that document doesn't seem to be relevant.  Also, it seems 
unlikely this plugin system is necessary just to control access to the admin 
UI... or maybe it is necessary?

In any case, what is your approach?

I'm using version 7.7.2 of Solr.

Thanks!


Re: Re: Re: Using Synonym Graph Filter with StandardTokenizer does not tokenize the query string if it has multi-word synonym

2020-03-16 Thread Audrey Lorberfeld - audrey.lorberf...@ibm.com
I don't think you can synonym-ize both the multi-token phrase and each 
individual token in the multi-token phrase at the same time. But anyone else 
feel free to chime in! 

Best,
Audrey Lorberfeld

On 3/16/20, 12:40 PM, "atin janki"  wrote:

I aim to achieve an expansion like -

Synonym(soap powder) + Synonym(soap) + Synonym (powder)


which is not happening because of the Synonym expansion is being done at
the moment.

At the moment, using  Synonym Graph Filter with StandardTokenizer  and sow
= false , expands as -

 Synonym(soap powder)

because "soap powder" is a multi-word synonym present in the synonym file.

Using sow = true in the above setting will give -

Synonym(soap) + Synonym (powder)



Best Regards,
Atin Janki


On Mon, Mar 16, 2020 at 5:27 PM Audrey Lorberfeld -
audrey.lorberf...@ibm.com  wrote:

> To confirm, you want a synonym like "soap powder" to map onto synonyms
> like "hand soap," "hygiene products," etc? As in, more of a cognitive
> synonym mapping where you feed synonyms that only apply to the multi-token
> phrase as a whole?
>
> On 3/16/20, 12:17 PM, "atin janki"  wrote:
>
> Using sow=true, does split the word on whitespaces but it will not
> look for
> synonyms of "soap powder" anymore, rather it expands separate synonyms
> for
> "soap" and "powder".
>
>
>
> Best Regards,
> Atin Janki
>
>
> On Mon, Mar 16, 2020 at 4:59 PM Audrey Lorberfeld -
> audrey.lorberf...@ibm.com  wrote:
>
> > Have you set sow=true in your search handler? I know that we have it
> set
> > to false (sow = split on whitespace) because we WANT multi-token
> synonyms
> > retained as multiple tokens.
> >
> > On 3/16/20, 10:49 AM, "atin janki"  wrote:
> >
> > Hello everyone,
> >
> > I am using solr 8.3.
> >
> > After I included Synonym Graph Filter in my managed-schema file,
> I
> > have noticed that if the query string contains a multi-word
> synonym,
> > it considers that multi-word synonym as a single term and does
> not
> > break it, further suppressing the default search behaviour.
> >
> > I am using StandardTokenizer.
> >
> > Below is a snippet from managed-schema file -
> >
> > >
> > > *   > positionIncrementGap="100" multiValued="true">*
> > > **
> > > *  *
> > > *   words="stopwords.txt"
> > ignoreCase="true"/>*
> > > *  *
> > > **
> > > **
> > > *  *
> > > *   words="stopwords.txt"
> > ignoreCase="true"/>*
> > > *   expand="true"
> > ignoreCase="true" synonyms="synonyms.txt"/>*
> > > *  *
> > > ***  *
> >
> >
> > Here "*soap powder*" is the search *query* which is also a
> multi-word
> > synonym in the synonym file as-
> >
> > > s(104254535,1,'soap powder',n,1,1).
> > > s(104254535,2,'built-soap powder',n,1,0).
> > > s(104254535,3,'washing powder',n,1,0).
> >
> >
> > I am sharing some screenshots for understanding the problem-
> >
> > *without* Synonym Graph Filter => 2 docs returned  (screenshot 
at
> > below mentioned URL) -
> >
> >
> >
> 
https://urldefense.proofpoint.com/v2/url?u=https-3A__ibb.co_zQXx7mV=DwIBaQ=jf_iaSHvJObTbx-siA1ZOg=_8ViuZIeSRdQjONA8yHWPZIBlhj291HU3JpNIx5a55M=20lvJFDIjFQqyiTdHseNNeSlDRT2YSznQPoQnxGJQfM=QUaaR69psn7pqa3DtaC7MrTMFstQrQHgeuY0qeQTc0k=
> >
> > *with* Synonym Graph Filter => 2 docs expected, only 1 returned
> > (screenshot at below mentioned URL) -
> >
> >
> >
> 
https://urldefense.proofpoint.com/v2/url?u=https-3A__ibb.co_tp04Rzw=DwIBaQ=jf_iaSHvJObTbx-siA1ZOg=_8ViuZIeSRdQjONA8yHWPZIBlhj291HU3JpNIx5a55M=20lvJFDIjFQqyiTdHseNNeSlDRT2YSznQPoQnxGJQfM=pLPVuD71W1IhokvFuu4F672lX8Nk07b0X9pCVETRjks=
> >
> >
> > Has anyone experienced this before? If yes, is there any
> workaround ?
> > Or is it an expected behaviour?
> >
> > Regards,
> > Atin Janki
> >
> >
> >
>
>
>




Re: Re: Using Synonym Graph Filter with StandardTokenizer does not tokenize the query string if it has multi-word synonym

2020-03-16 Thread atin janki
I aim to achieve an expansion like -

Synonym(soap powder) + Synonym(soap) + Synonym (powder)


which is not happening because of the way synonym expansion is being done at
the moment.

At the moment, using Synonym Graph Filter with StandardTokenizer and
sow = false, the query expands as -

 Synonym(soap powder)

because "soap powder" is a multi-word synonym present in the synonym file.

Using sow = true in the above setting will give -

Synonym(soap) + Synonym (powder)
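
The three behaviours can be illustrated with a toy Python sketch. This is not how Solr implements synonym expansion; the synonym table (in particular the entries for "soap" and "powder") and the function names are invented purely to show which lookups each mode performs.

```python
# Toy synonym table; the entries for "soap" and "powder" are invented.
SYNONYMS = {
    "soap powder": {"built-soap powder", "washing powder"},
    "soap": {"cleanser"},
    "powder": {"dust"},
}


def expand_phrase(query):
    """Roughly what sow=false gives here: the whole phrase is looked up."""
    return {query} | SYNONYMS.get(query, set())


def expand_tokens(query):
    """Roughly what sow=true gives: each whitespace-separated token is looked up."""
    terms = set()
    for token in query.split():
        terms |= {token} | SYNONYMS.get(token, set())
    return terms


def expand_both(query):
    """The desired result: phrase-level plus token-level expansion."""
    return expand_phrase(query) | expand_tokens(query)


print(sorted(expand_both("soap powder")))
```

Whether a single analyzer chain can produce the combined form is exactly what I am trying to find out.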



Best Regards,
Atin Janki


On Mon, Mar 16, 2020 at 5:27 PM Audrey Lorberfeld -
audrey.lorberf...@ibm.com  wrote:

> To confirm, you want a synonym like "soap powder" to map onto synonyms
> like "hand soap," "hygiene products," etc? As in, more of a cognitive
> synonym mapping where you feed synonyms that only apply to the multi-token
> phrase as a whole?
>
> On 3/16/20, 12:17 PM, "atin janki"  wrote:
>
> Using sow=true, does split the word on whitespaces but it will not
> look for
> synonyms of "soap powder" anymore, rather it expands separate synonyms
> for
> "soap" and "powder".
>
>
>
> Best Regards,
> Atin Janki
>
>
> On Mon, Mar 16, 2020 at 4:59 PM Audrey Lorberfeld -
> audrey.lorberf...@ibm.com  wrote:
>
> > Have you set sow=true in your search handler? I know that we have it
> set
> > to false (sow = split on whitespace) because we WANT multi-token
> synonyms
> > retained as multiple tokens.
> >
> > On 3/16/20, 10:49 AM, "atin janki"  wrote:
> >
> > Hello everyone,
> >
> > I am using solr 8.3.
> >
> > After I included Synonym Graph Filter in my managed-schema file,
> I
> > have noticed that if the query string contains a multi-word
> synonym,
> > it considers that multi-word synonym as a single term and does
> not
> > break it, further suppressing the default search behaviour.
> >
> > I am using StandardTokenizer.
> >
> > Below is a snippet from managed-schema file -
> >
> > >
> > > *   > positionIncrementGap="100" multiValued="true">*
> > > **
> > > *  *
> > > *   words="stopwords.txt"
> > ignoreCase="true"/>*
> > > *  *
> > > **
> > > **
> > > *  *
> > > *   words="stopwords.txt"
> > ignoreCase="true"/>*
> > > *   expand="true"
> > ignoreCase="true" synonyms="synonyms.txt"/>*
> > > *  *
> > > ***  *
> >
> >
> > Here "*soap powder*" is the search *query* which is also a
> multi-word
> > synonym in the synonym file as-
> >
> > > s(104254535,1,'soap powder',n,1,1).
> > > s(104254535,2,'built-soap powder',n,1,0).
> > > s(104254535,3,'washing powder',n,1,0).
> >
> >
> > I am sharing some screenshots for understanding the problem-
> >
> > *without* Synonym Graph Filter => 2 docs returned  (screenshot at
> > below mentioned URL) -
> >
> >
> >
> https://urldefense.proofpoint.com/v2/url?u=https-3A__ibb.co_zQXx7mV=DwIBaQ=jf_iaSHvJObTbx-siA1ZOg=_8ViuZIeSRdQjONA8yHWPZIBlhj291HU3JpNIx5a55M=20lvJFDIjFQqyiTdHseNNeSlDRT2YSznQPoQnxGJQfM=QUaaR69psn7pqa3DtaC7MrTMFstQrQHgeuY0qeQTc0k=
> >
> > *with* Synonym Graph Filter => 2 docs expected, only 1 returned
> > (screenshot at below mentioned URL) -
> >
> >
> >
> https://urldefense.proofpoint.com/v2/url?u=https-3A__ibb.co_tp04Rzw=DwIBaQ=jf_iaSHvJObTbx-siA1ZOg=_8ViuZIeSRdQjONA8yHWPZIBlhj291HU3JpNIx5a55M=20lvJFDIjFQqyiTdHseNNeSlDRT2YSznQPoQnxGJQfM=pLPVuD71W1IhokvFuu4F672lX8Nk07b0X9pCVETRjks=
> >
> >
> > Has anyone experienced this before? If yes, is there any
> workaround ?
> > Or is it an expected behaviour?
> >
> > Regards,
> > Atin Janki
> >
> >
> >
>
>
>


Re: Re: Using Synonym Graph Filter with StandardTokenizer does not tokenize the query string if it has multi-word synonym

2020-03-16 Thread Audrey Lorberfeld - audrey.lorberf...@ibm.com
To confirm, you want a synonym like "soap powder" to map onto synonyms like 
"hand soap," "hygiene products," etc? As in, more of a cognitive synonym 
mapping where you feed synonyms that only apply to the multi-token phrase as a 
whole?

On 3/16/20, 12:17 PM, "atin janki"  wrote:

Using sow=true, does split the word on whitespaces but it will not look for
synonyms of "soap powder" anymore, rather it expands separate synonyms for
"soap" and "powder".



Best Regards,
Atin Janki


On Mon, Mar 16, 2020 at 4:59 PM Audrey Lorberfeld -
audrey.lorberf...@ibm.com  wrote:

> Have you set sow=true in your search handler? I know that we have it set
> to false (sow = split on whitespace) because we WANT multi-token synonyms
> retained as multiple tokens.
>
> On 3/16/20, 10:49 AM, "atin janki"  wrote:
>
> Hello everyone,
>
> I am using solr 8.3.
>
> After I included Synonym Graph Filter in my managed-schema file, I
> have noticed that if the query string contains a multi-word synonym,
> it considers that multi-word synonym as a single term and does not
> break it, further suppressing the default search behaviour.
>
> I am using StandardTokenizer.
>
> Below is a snippet from managed-schema file -
>
> >
> > *   positionIncrementGap="100" multiValued="true">*
> > **
> > *  *
> > *   ignoreCase="true"/>*
> > *  *
> > **
> > **
> > *  *
> > *   ignoreCase="true"/>*
> > *   ignoreCase="true" synonyms="synonyms.txt"/>*
> > *  *
> > ***  *
>
>
> Here "*soap powder*" is the search *query* which is also a multi-word
> synonym in the synonym file as-
>
> > s(104254535,1,'soap powder',n,1,1).
> > s(104254535,2,'built-soap powder',n,1,0).
> > s(104254535,3,'washing powder',n,1,0).
>
>
> I am sharing some screenshots for understanding the problem-
>
> *without* Synonym Graph Filter => 2 docs returned  (screenshot at
> below mentioned URL) -
>
>
> 
https://urldefense.proofpoint.com/v2/url?u=https-3A__ibb.co_zQXx7mV=DwIBaQ=jf_iaSHvJObTbx-siA1ZOg=_8ViuZIeSRdQjONA8yHWPZIBlhj291HU3JpNIx5a55M=20lvJFDIjFQqyiTdHseNNeSlDRT2YSznQPoQnxGJQfM=QUaaR69psn7pqa3DtaC7MrTMFstQrQHgeuY0qeQTc0k=
>
> *with* Synonym Graph Filter => 2 docs expected, only 1 returned
> (screenshot at below mentioned URL) -
>
>
> 
https://urldefense.proofpoint.com/v2/url?u=https-3A__ibb.co_tp04Rzw=DwIBaQ=jf_iaSHvJObTbx-siA1ZOg=_8ViuZIeSRdQjONA8yHWPZIBlhj291HU3JpNIx5a55M=20lvJFDIjFQqyiTdHseNNeSlDRT2YSznQPoQnxGJQfM=pLPVuD71W1IhokvFuu4F672lX8Nk07b0X9pCVETRjks=
>
>
> Has anyone experienced this before? If yes, is there any workaround ?
> Or is it an expected behaviour?
>
> Regards,
> Atin Janki
>
>
>




Re: Using Synonym Graph Filter with StandardTokenizer does not tokenize the query string if it has multi-word synonym

2020-03-16 Thread atin janki
Using sow=true does split the query on whitespace, but then it no longer
looks for synonyms of "soap powder"; instead it expands synonyms for "soap"
and "powder" separately.



Best Regards,
Atin Janki


On Mon, Mar 16, 2020 at 4:59 PM Audrey Lorberfeld -
audrey.lorberf...@ibm.com  wrote:

> Have you set sow=true in your search handler? I know that we have it set
> to false (sow = split on whitespace) because we WANT multi-token synonyms
> retained as multiple tokens.
>
> On 3/16/20, 10:49 AM, "atin janki"  wrote:
>
> Hello everyone,
>
> I am using solr 8.3.
>
> After I included Synonym Graph Filter in my managed-schema file, I
> have noticed that if the query string contains a multi-word synonym,
> it considers that multi-word synonym as a single term and does not
> break it, further suppressing the default search behaviour.
>
> I am using StandardTokenizer.
>
> Below is a snippet from managed-schema file -
>
> >
> > *   positionIncrementGap="100" multiValued="true">*
> > **
> > *  *
> > *   ignoreCase="true"/>*
> > *  *
> > **
> > **
> > *  *
> > *   ignoreCase="true"/>*
> > *   ignoreCase="true" synonyms="synonyms.txt"/>*
> > *  *
> > ***  *
>
>
> Here "*soap powder*" is the search *query* which is also a multi-word
> synonym in the synonym file as-
>
> > s(104254535,1,'soap powder',n,1,1).
> > s(104254535,2,'built-soap powder',n,1,0).
> > s(104254535,3,'washing powder',n,1,0).
>
>
> I am sharing some screenshots for understanding the problem-
>
> *without* Synonym Graph Filter => 2 docs returned  (screenshot at
> below mentioned URL) -
>
>
> https://urldefense.proofpoint.com/v2/url?u=https-3A__ibb.co_zQXx7mV=DwIBaQ=jf_iaSHvJObTbx-siA1ZOg=_8ViuZIeSRdQjONA8yHWPZIBlhj291HU3JpNIx5a55M=20lvJFDIjFQqyiTdHseNNeSlDRT2YSznQPoQnxGJQfM=QUaaR69psn7pqa3DtaC7MrTMFstQrQHgeuY0qeQTc0k=
>
> *with* Synonym Graph Filter => 2 docs expected, only 1 returned
> (screenshot at below mentioned URL) -
>
>
> https://urldefense.proofpoint.com/v2/url?u=https-3A__ibb.co_tp04Rzw=DwIBaQ=jf_iaSHvJObTbx-siA1ZOg=_8ViuZIeSRdQjONA8yHWPZIBlhj291HU3JpNIx5a55M=20lvJFDIjFQqyiTdHseNNeSlDRT2YSznQPoQnxGJQfM=pLPVuD71W1IhokvFuu4F672lX8Nk07b0X9pCVETRjks=
>
>
> Has anyone experienced this before? If yes, is there any workaround ?
> Or is it an expected behaviour?
>
> Regards,
> Atin Janki
>
>
>


RE: How do *you* restrict access to Solr?

2020-03-16 Thread Dunigan, Craig A.
Setting up Apache is off-topic, but it’s just a matter of ProxyPass to the Solr 
app URL.  I already gave you the relevant IP restriction configuration 
directive, “Allow from “.  The rest is in httpd documentation.

From: Ryan W 
Sent: Monday, March 16, 2020 10:41 AM
To: solr-user@lucene.apache.org
Subject: Re: How do *you* restrict access to Solr?

On Mon, Mar 16, 2020 at 11:32 AM Dunigan, Craig A. <
craig.duni...@landsend.com> wrote:

> Here are my suggestions. If you’re okay with IP restrictions only, then
> iptables.


Thanks! Just knowing this is an option helps. I took a stab at it but it
didn't work initially, but at least now I know there's a reason to keep
trying it.


> If you don’t have *nix or root access, an Apache proxy server with Allow
> from .


I do have root access and can edit the Apache config. Can I restrict
access in the Apache config? If so, that would be a great solution. My
situation is fairly typical. I have a LAMP environment with Red Hat
linux. I'm not quite sure how to make my Apache directives specific to the
Solr install. Again, just knowing this is an option would be helpful. The
Solr docs don't mention this possibility, I don't think.



> If you want really, really secure, an stunnel front-end that requires
> client certs that you install in your browsers. For us, we have a load
> balancer with VIPs that restrict access to the internal IP range of the
> building that houses IT, but not everyone has the luxury of hardware
> solutions.
>
> From: Ryan W <rya...@gmail.com>
> Sent: Monday, March 16, 2020 10:20 AM
> To: solr-user@lucene.apache.org
> Subject: Re: How do *you* restrict access to Solr?
>
> On Mon, Mar 16, 2020 at 10:50 AM David Hastings <
> hastings.recurs...@gmail.com> wrote:
>
> > Honestly? I know this isnt what youre going to want to hear, but security
> > through obscurity. no one else knows what port the servers on, and its
> not
> > accessible from anything outside of the internal network.
>
>
> That doesn't sound like security through obscurity, as long as you are
> confident that access to the internal network is limited... to whatever
> degree you require. I'd certainly be happy if I could restrict access
> based on IP.
>
>
>
> > if your solr
> > install can be accessed from an external IP you have much larger issues.
>
>
> > On Mon, Mar 16, 2020 at 10:44 AM Ryan W <rya...@gmail.com> wrote:
> >
> > > How do you, personally, do it? Do you use IPTables? Basic
> > Authentication
> > > Plugin? Something else?
> > >
> > > I'm asking in part so I'l have something to search for. I don't know
> > where
> > > I should begin, so I figured I would ask how others do it.
> > >
> > > I haven't been able to find anything that works, so if you can tell me
> > what
> > > works for you, I can at least narrow it down a bit and do some Google
> > > searches. Do I need to learn Solr's plugin system? Am I starting in the
> > > right place if I follow this document:
> > >
> > >
> >
> https://lucene.apache.org/solr/guide/7_0/rule-based-authorization-plugin.html#rule-based-authorization-plugin
> > >
> > > Initially, the above document seems far too comprehensive for my needs.
> > I
> > > just want to block access to the Solr admin UI, and the list of
> > predefined
> > > permissions in that document don't seem to be relevant. Also, it seems
> > > unlikely this plugin system is necessary just to control access to the
> > > admin UI... or maybe it necessary?
> > >
> > > In any case, what is your approach?
> > >
> > > I'm using version 7.7.2 of Solr.
> > >
> > > Thanks!
> > >
> >
>


Re: Using Synonym Graph Filter with StandardTokenizer does not tokenize the query string if it has multi-word synonym

2020-03-16 Thread Audrey Lorberfeld - audrey.lorberf...@ibm.com
Have you set sow=true in your search handler? I know that we have it set to 
false (sow = split on whitespace) because we WANT multi-token synonyms retained 
as multiple tokens. 

On 3/16/20, 10:49 AM, "atin janki"  wrote:

Hello everyone,

I am using solr 8.3.

After I included Synonym Graph Filter in my managed-schema file, I
have noticed that if the query string contains a multi-word synonym,
it considers that multi-word synonym as a single term and does not
break it, further suppressing the default search behaviour.

I am using StandardTokenizer.

Below is a snippet from managed-schema file -

>
> *  *
> **
> *  *
> *  *
> *  *
> **
> **
> *  *
> *  *
> *  *
> *  *
> ***  *


Here "*soap powder*" is the search *query* which is also a multi-word
synonym in the synonym file as-

> s(104254535,1,'soap powder',n,1,1).
> s(104254535,2,'built-soap powder',n,1,0).
> s(104254535,3,'washing powder',n,1,0).


I am sharing some screenshots for understanding the problem-

*without* Synonym Graph Filter => 2 docs returned  (screenshot at
below mentioned URL) -


https://urldefense.proofpoint.com/v2/url?u=https-3A__ibb.co_zQXx7mV=DwIBaQ=jf_iaSHvJObTbx-siA1ZOg=_8ViuZIeSRdQjONA8yHWPZIBlhj291HU3JpNIx5a55M=20lvJFDIjFQqyiTdHseNNeSlDRT2YSznQPoQnxGJQfM=QUaaR69psn7pqa3DtaC7MrTMFstQrQHgeuY0qeQTc0k=
 

*with* Synonym Graph Filter => 2 docs expected, only 1 returned
(screenshot at below mentioned URL) -


https://urldefense.proofpoint.com/v2/url?u=https-3A__ibb.co_tp04Rzw=DwIBaQ=jf_iaSHvJObTbx-siA1ZOg=_8ViuZIeSRdQjONA8yHWPZIBlhj291HU3JpNIx5a55M=20lvJFDIjFQqyiTdHseNNeSlDRT2YSznQPoQnxGJQfM=pLPVuD71W1IhokvFuu4F672lX8Nk07b0X9pCVETRjks=
 


Has anyone experienced this before? If yes, is there any workaround ?
Or is it an expected behaviour?

Regards,
Atin Janki




Re: How do *you* restrict access to Solr?

2020-03-16 Thread Aroop Ganguly
Hi Ryan

You should consider a simple rule-based authorization scheme.
Your staff user can be given read-only privileges to everything you want, 
except the admin UI.

Depending on which version of Solr you are on, this can be trivial.

- Aroop

> On Mar 16, 2020, at 8:46 AM, Ryan W  wrote:
> 
> On Mon, Mar 16, 2020 at 10:51 AM Susheel Kumar 
> wrote:
> 
>> Basic auth should help you to start
>> 
>> https://lucene.apache.org/solr/guide/8_1/basic-authentication-plugin.html
> 
> 
> 
> Thanks.  I think I will give up on the plugin system.  I haven't been able
> to get the plugin system to work, and it creates too many opportunities for
> human error.  Even if I can get it working this week, what about 6 months
> from now or a year from now when something goes wrong and I have to debug
> it.  It seems like far too much overhead to provide the desired security
> benefit, except perhaps in situations where an organization has Solr
> specialists who can maintain the system.



Re: How do *you* restrict access to Solr?

2020-03-16 Thread Ryan W
On Mon, Mar 16, 2020 at 10:51 AM Susheel Kumar 
wrote:

> Basic auth should help you to start
>
> https://lucene.apache.org/solr/guide/8_1/basic-authentication-plugin.html



Thanks.  I think I will give up on the plugin system.  I haven't been able
to get it to work, and it creates too many opportunities for human error.
Even if I can get it working this week, what about 6 months from now or a
year from now, when something goes wrong and I have to debug it?  It seems
like far too much overhead for the desired security benefit, except perhaps
in situations where an organization has Solr specialists who can maintain
the system.


Re: How do *you* restrict access to Solr?

2020-03-16 Thread Ryan W
On Mon, Mar 16, 2020 at 11:40 AM Walter Underwood 
wrote:

> Also, even if you prevent access to the admin UI, a request to /update can
> delete
> all the content. It is really easy. This Gist shows how.
>
> https://gist.github.com/nz/673027/313f70681daa985ea13ba33a385753aef951a0f3



This seems important.  In other words, my work isn't necessarily done if
I've secured the graphical UI.  I can't just visit the admin UI page to see
if my efforts are successful.
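
To make the point concrete for myself, here is a Python sketch that only builds (and does not send) the kind of request being referred to: a delete-by-query posted to a core's /update handler. The host, port and core name are placeholders, and the payload is the standard Solr XML delete-all message as far as I know, rather than something copied from the linked Gist.

```python
import urllib.request


def build_delete_all_request(base_url, core):
    """Build, but do not send, a delete-everything request for a Solr core."""
    body = b"<delete><query>*:*</query></delete>"
    return urllib.request.Request(
        url=f"{base_url}/solr/{core}/update?commit=true",
        data=body,
        headers={"Content-Type": "text/xml"},
        method="POST",
    )


# Placeholder host and core name, for illustration only.
req = build_delete_all_request("http://localhost:8983", "mycore")
print(req.get_method(), req.full_url)
```

If a request like this would succeed from a machine that should not have access, then blocking the admin UI alone has not secured the instance.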



>
>
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
>
> > On Mar 16, 2020, at 8:20 AM, David Hastings <
> hastings.recurs...@gmail.com> wrote:
> >
> > master slave is the idea that you have an indexing server you do all
> > indexing to and a search server that replicates the index, to deliver the
> > results etc.  if you keep the indexer separate you can tune it
> differently
> > as well as protect the data.  also means you can remove the delete/update
> > request handlers from the slave/searcher
> >
> > yes security by obscurity isnt ideal, but the over head of adding
> > authentication to requests i find unnecessary,
> >
> > On Mon, Mar 16, 2020 at 11:16 AM Ryan W  wrote:
> >
> >> On Mon, Mar 16, 2020 at 11:09 AM Walter Underwood <
> wun...@wunderwood.org>
> >> wrote:
> >>
> >>> What access do you want to prevent? How do you prefer to authenticate?
> >>> How do you manage users or roles? Master/slave or Solr Cloud?
> >>>
> >>
> >> I want to prevent access to the admin UI.
> >>
> >> I don't want to manage users or roles, preferably.  I have only one
> user:
> >> staff.  I want to prevent the public from accessing the admin UI.  I'd
> be
> >> happy if I could set an IP address whitelist... especially if I don't
> have
> >> to learn a new framework (which I will never use for any other purpose)
> to
> >> do it.
> >>
> >> I don't know what master/slave is.  These are new concepts that weren't
> >> required to secure Solr prior to 7x, and this is my first project using
> a
> >> version after 6x.
> >>
> >> Thanks!
> >>
> >>
> >>
> >>>
> >>> wunder
> >>> Walter Underwood
> >>> wun...@wunderwood.org
> >>> http://observer.wunderwood.org/  (my blog)
> >>>
> >>>> On Mar 16, 2020, at 7:44 AM, Ryan W  wrote:
> >>>>
> >>>> How do you, personally, do it?  Do you use IPTables?  Basic
> >>> Authentication
> >>>> Plugin? Something else?
> >>>>
> >>>> I'm asking in part so I'l have something to search for.  I don't know
> >>> where
> >>>> I should begin, so I figured I would ask how others do it.
> >>>>
> >>>> I haven't been able to find anything that works, so if you can tell me
> >>> what
> >>>> works for you, I can at least narrow it down a bit and do some Google
> >>>> searches.  Do I need to learn Solr's plugin system?  Am I starting in
> >> the
> >>>> right place if I follow this document:
> >>>>
> >>>
> >>
> https://lucene.apache.org/solr/guide/7_0/rule-based-authorization-plugin.html#rule-based-authorization-plugin
> >>>>
> >>>> Initially, the above document seems far too comprehensive for my
> needs.
> >>> I
> >>>> just want to block access to the Solr admin UI, and the list of
> >>> predefined
> >>>> permissions in that document don't seem to be relevant.  Also, it
> seems
> >>>> unlikely this plugin system is necessary just to control access to the
> >>>> admin UI... or maybe it necessary?
> >>>>
> >>>> In any case, what is your approach?
> >>>>
> >>>> I'm using version 7.7.2 of Solr.
> >>>>
> >>>> Thanks!
> >>>
> >>>
> >>
>
>


Re: How do *you* restrict access to Solr?

2020-03-16 Thread Ryan W
On Mon, Mar 16, 2020 at 11:32 AM Dunigan, Craig A. <
craig.duni...@landsend.com> wrote:

> Here are my suggestions.  If you’re okay with IP restrictions only, then
> iptables.


Thanks!  Just knowing this is an option helps.  I took a stab at it but it
didn't work initially, but at least now I know there's a reason to keep
trying it.


> If you don’t have *nix or root access, an Apache proxy server with Allow
> from .


I do have root access and can edit the Apache config.  Can I restrict
access in the Apache config?  If so, that would be a great solution.  My
situation is fairly typical.  I have a LAMP environment with Red Hat
linux.  I'm not quite sure how to make my Apache directives specific to the
Solr install.  Again, just knowing this is an option would be helpful.  The
Solr docs don't mention this possibility, I don't think.



> If you want really, really secure, an stunnel front-end that requires
> client certs that you install in your browsers.  For us, we have a load
> balancer with VIPs that restrict access to the internal IP range of the
> building that houses IT, but not everyone has the luxury of hardware
> solutions.
>
> From: Ryan W 
> Sent: Monday, March 16, 2020 10:20 AM
> To: solr-user@lucene.apache.org
> Subject: Re: How do *you* restrict access to Solr?
>
> On Mon, Mar 16, 2020 at 10:50 AM David Hastings <
> hastings.recurs...@gmail.com> wrote:
>
> > Honestly? I know this isnt what youre going to want to hear, but security
> > through obscurity. no one else knows what port the servers on, and its
> not
> > accessible from anything outside of the internal network.
>
>
> That doesn't sound like security through obscurity, as long as you are
> confident that access to the internal network is limited... to whatever
> degree you require. I'd certainly be happy if I could restrict access
> based on IP.
>
>
>
> > if your solr
> > install can be accessed from an external IP you have much larger issues.
>
>
> > On Mon, Mar 16, 2020 at 10:44 AM Ryan W <rya...@gmail.com> wrote:
> >
> > > How do you, personally, do it? Do you use IPTables? Basic
> > Authentication
> > > Plugin? Something else?
> > >
> > > I'm asking in part so I'l have something to search for. I don't know
> > where
> > > I should begin, so I figured I would ask how others do it.
> > >
> > > I haven't been able to find anything that works, so if you can tell me
> > what
> > > works for you, I can at least narrow it down a bit and do some Google
> > > searches. Do I need to learn Solr's plugin system? Am I starting in the
> > > right place if I follow this document:
> > >
> > >
> >
> https://lucene.apache.org/solr/guide/7_0/rule-based-authorization-plugin.html#rule-based-authorization-plugin
> > >
> > > Initially, the above document seems far too comprehensive for my needs.
> > I
> > > just want to block access to the Solr admin UI, and the list of
> > predefined
> > > permissions in that document don't seem to be relevant. Also, it seems
> > > unlikely this plugin system is necessary just to control access to the
> > > admin UI... or maybe it necessary?
> > >
> > > In any case, what is your approach?
> > >
> > > I'm using version 7.7.2 of Solr.
> > >
> > > Thanks!
> > >
> >
>


Re: How do *you* restrict access to Solr?

2020-03-16 Thread Walter Underwood
If your data changes slowly and you don’t need to shard, master/slave is great.
It is loosely coupled, so not as complicated as Solr Cloud. Each slave is an
exact clone.

For master/slave, you can put an HTTP server (nginx, etc.) on each server and
proxy traffic to Solr. Then configure Solr to only listen to localhost. The
HTTP server should have plenty of tools for configuring access. The slave
servers will contact the master on the port that the HTTP server uses.

Also, even if you prevent access to the admin UI, a request to /update can
delete all the content. It is really easy. This Gist shows how.

https://gist.github.com/nz/673027/313f70681daa985ea13ba33a385753aef951a0f3
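
To make the /update point concrete, here is a minimal Python sketch of what
such a request looks like. It only builds the request object and never sends
it. The host, port and core name ("localhost:8983", "mycore") are placeholders
I made up, and this is not taken from the Gist above; it just uses Solr's
standard XML delete-by-query message.

```python
import urllib.request


def build_delete_all_request(base_url: str, core: str) -> urllib.request.Request:
    """Build (but do not send) a POST that would delete every document in a core."""
    # Standard Solr XML update message: delete everything matching *:*
    body = b"<delete><query>*:*</query></delete>"
    return urllib.request.Request(
        url=f"{base_url}/solr/{core}/update?commit=true",
        data=body,
        headers={"Content-Type": "text/xml"},
        method="POST",
    )


# Hypothetical host and core name, for illustration only.
req = build_delete_all_request("http://localhost:8983", "mycore")
print(req.get_method(), req.full_url)
print(req.data.decode())
```

Anyone who can reach the Solr port can send this, which is why blocking only
the admin UI is not enough: the access rule has to cover the whole Solr URL
space, or at least every update handler.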

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Mar 16, 2020, at 8:20 AM, David Hastings  
> wrote:
> 
> master slave is the idea that you have an indexing server you do all
> indexing to and a search server that replicates the index, to deliver the
> results etc.  if you keep the indexer separate you can tune it differently
> as well as protect the data.  also means you can remove the delete/update
> request handlers from the slave/searcher
> 
> yes security by obscurity isnt ideal, but the over head of adding
> authentication to requests i find unnecessary,
> 
> On Mon, Mar 16, 2020 at 11:16 AM Ryan W  wrote:
> 
>> On Mon, Mar 16, 2020 at 11:09 AM Walter Underwood 
>> wrote:
>> 
>>> What access do you want to prevent? How do you prefer to authenticate?
>>> How do you manage users or roles? Master/slave or Solr Cloud?
>>> 
>> 
>> I want to prevent access to the admin UI.
>> 
>> I don't want to manage users or roles, preferably.  I have only one user:
>> staff.  I want to prevent the public from accessing the admin UI.  I'd be
>> happy if I could set an IP address whitelist... especially if I don't have
>> to learn a new framework (which I will never use for any other purpose) to
>> do it.
>> 
>> I don't know what master/slave is.  These are new concepts that weren't
>> required to secure Solr prior to 7x, and this is my first project using a
>> version after 6x.
>> 
>> Thanks!
>> 
>> 
>> 
>>> 
>>> wunder
>>> Walter Underwood
>>> wun...@wunderwood.org
>>> http://observer.wunderwood.org/  (my blog)
>>> 
>>>> On Mar 16, 2020, at 7:44 AM, Ryan W  wrote:
>>>> 
>>>> How do you, personally, do it?  Do you use IPTables?  Basic
>>> Authentication
>>>> Plugin? Something else?
>>>> 
>>>> I'm asking in part so I'l have something to search for.  I don't know
>>> where
>>>> I should begin, so I figured I would ask how others do it.
>>>> 
>>>> I haven't been able to find anything that works, so if you can tell me
>>> what
>>>> works for you, I can at least narrow it down a bit and do some Google
>>>> searches.  Do I need to learn Solr's plugin system?  Am I starting in
>> the
>>>> right place if I follow this document:
>>>> 
>>> 
>> https://lucene.apache.org/solr/guide/7_0/rule-based-authorization-plugin.html#rule-based-authorization-plugin
>>>> 
>>>> Initially, the above document seems far too comprehensive for my needs.
>>> I
>>>> just want to block access to the Solr admin UI, and the list of
>>> predefined
>>>> permissions in that document don't seem to be relevant.  Also, it seems
>>>> unlikely this plugin system is necessary just to control access to the
>>>> admin UI... or maybe it necessary?
>>>> 
>>>> In any case, what is your approach?
>>>> 
>>>> I'm using version 7.7.2 of Solr.
>>>> 
>>>> Thanks!
>>> 
>>> 
>> 



RE: How do *you* restrict access to Solr?

2020-03-16 Thread Dunigan, Craig A.
Here are my suggestions.  If you’re okay with IP restrictions only, then 
iptables.  If you don’t have *nix or root access, an Apache proxy server with 
Allow from .  If you want really, really secure, an stunnel front-end 
that requires client certs that you install in your browsers.  For us, we have 
a load balancer with VIPs that restrict access to the internal IP range of the 
building that houses IT, but not everyone has the luxury of hardware solutions.
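
For anyone unsure what an IP restriction boils down to, here is a small Python
sketch of the allowlist check that iptables rules or an Apache "Allow from"
directive express. It is only an illustration of the logic, not how either
tool implements it, and the network ranges are made-up examples.

```python
import ipaddress

# Hypothetical internal ranges; substitute the networks your staff actually use.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("192.168.1.0/24"),
]


def is_allowed(client_ip: str) -> bool:
    """Return True if client_ip falls inside any allowed network."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in ALLOWED_NETWORKS)


for ip in ("10.1.2.3", "192.168.1.77", "203.0.113.5"):
    print(ip, "allowed" if is_allowed(ip) else "blocked")
```

The firewall or proxy does the same comparison for every connection to the
Solr port, before Solr ever sees the request.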

From: Ryan W 
Sent: Monday, March 16, 2020 10:20 AM
To: solr-user@lucene.apache.org
Subject: Re: How do *you* restrict access to Solr?

On Mon, Mar 16, 2020 at 10:50 AM David Hastings <
hastings.recurs...@gmail.com> wrote:

> Honestly? I know this isnt what youre going to want to hear, but security
> through obscurity. no one else knows what port the servers on, and its not
> accessible from anything outside of the internal network.


That doesn't sound like security through obscurity, as long as you are
confident that access to the internal network is limited... to whatever
degree you require. I'd certainly be happy if I could restrict access
based on IP.



> if your solr
> install can be accessed from an external IP you have much larger issues.


> On Mon, Mar 16, 2020 at 10:44 AM Ryan W <rya...@gmail.com> wrote:
>
> > How do you, personally, do it? Do you use IPTables? Basic
> Authentication
> > Plugin? Something else?
> >
> > I'm asking in part so I'l have something to search for. I don't know
> where
> > I should begin, so I figured I would ask how others do it.
> >
> > I haven't been able to find anything that works, so if you can tell me
> what
> > works for you, I can at least narrow it down a bit and do some Google
> > searches. Do I need to learn Solr's plugin system? Am I starting in the
> > right place if I follow this document:
> >
> >
> https://lucene.apache.org/solr/guide/7_0/rule-based-authorization-plugin.html#rule-based-authorization-plugin
> >
> > Initially, the above document seems far too comprehensive for my needs.
> I
> > just want to block access to the Solr admin UI, and the list of
> predefined
> > permissions in that document don't seem to be relevant. Also, it seems
> > unlikely this plugin system is necessary just to control access to the
> > admin UI... or maybe it necessary?
> >
> > In any case, what is your approach?
> >
> > I'm using version 7.7.2 of Solr.
> >
> > Thanks!
> >
>


Re: How do *you* restrict access to Solr?

2020-03-16 Thread David Hastings
Master/slave is the idea that you have an indexing server you do all
indexing to and a search server that replicates the index, to deliver the
results etc.  If you keep the indexer separate you can tune it differently
as well as protect the data.  It also means you can remove the delete/update
request handlers from the slave/searcher.

Yes, security by obscurity isn't ideal, but I find the overhead of adding
authentication to requests unnecessary.

On Mon, Mar 16, 2020 at 11:16 AM Ryan W  wrote:

> On Mon, Mar 16, 2020 at 11:09 AM Walter Underwood 
> wrote:
>
> > What access do you want to prevent? How do you prefer to authenticate?
> > How do you manage users or roles? Master/slave or Solr Cloud?
> >
>
> I want to prevent access to the admin UI.
>
> I don't want to manage users or roles, preferably.  I have only one user:
> staff.  I want to prevent the public from accessing the admin UI.  I'd be
> happy if I could set an IP address whitelist... especially if I don't have
> to learn a new framework (which I will never use for any other purpose) to
> do it.
>
> I don't know what master/slave is.  These are new concepts that weren't
> required to secure Solr prior to 7x, and this is my first project using a
> version after 6x.
>
> Thanks!
>
>
>
> >
> > wunder
> > Walter Underwood
> > wun...@wunderwood.org
> > http://observer.wunderwood.org/  (my blog)
> >
> > > On Mar 16, 2020, at 7:44 AM, Ryan W  wrote:
> > >
> > > How do you, personally, do it?  Do you use IPTables?  Basic
> > Authentication
> > > Plugin? Something else?
> > >
> > > I'm asking in part so I'l have something to search for.  I don't know
> > where
> > > I should begin, so I figured I would ask how others do it.
> > >
> > > I haven't been able to find anything that works, so if you can tell me
> > what
> > > works for you, I can at least narrow it down a bit and do some Google
> > > searches.  Do I need to learn Solr's plugin system?  Am I starting in
> the
> > > right place if I follow this document:
> > >
> >
> https://lucene.apache.org/solr/guide/7_0/rule-based-authorization-plugin.html#rule-based-authorization-plugin
> > >
> > > Initially, the above document seems far too comprehensive for my needs.
> > I
> > > just want to block access to the Solr admin UI, and the list of
> > predefined
> > > permissions in that document don't seem to be relevant.  Also, it seems
> > > unlikely this plugin system is necessary just to control access to the
> > > admin UI... or maybe it necessary?
> > >
> > > In any case, what is your approach?
> > >
> > > I'm using version 7.7.2 of Solr.
> > >
> > > Thanks!
> >
> >
>


Re: How do *you* restrict access to Solr?

2020-03-16 Thread Ryan W
On Mon, Mar 16, 2020 at 10:50 AM David Hastings <
hastings.recurs...@gmail.com> wrote:

> Honestly?  I know this isnt what youre going to want to hear, but security
> through obscurity.  no one else knows what port the servers on, and its not
> accessible from anything outside of the internal network.


That doesn't sound like security through obscurity, as long as you are
confident that access to the internal network is limited... to whatever
degree you require.  I'd certainly be happy if I could restrict access
based on IP.



> if your solr
> install can be accessed from an external IP you have much larger issues.


> On Mon, Mar 16, 2020 at 10:44 AM Ryan W  wrote:
>
> > How do you, personally, do it?  Do you use IPTables?  Basic
> Authentication
> > Plugin? Something else?
> >
> > I'm asking in part so I'l have something to search for.  I don't know
> where
> > I should begin, so I figured I would ask how others do it.
> >
> > I haven't been able to find anything that works, so if you can tell me
> what
> > works for you, I can at least narrow it down a bit and do some Google
> > searches.  Do I need to learn Solr's plugin system?  Am I starting in the
> > right place if I follow this document:
> >
> >
> https://lucene.apache.org/solr/guide/7_0/rule-based-authorization-plugin.html#rule-based-authorization-plugin
> >
> > Initially, the above document seems far too comprehensive for my needs.
> I
> > just want to block access to the Solr admin UI, and the list of
> predefined
> > permissions in that document don't seem to be relevant.  Also, it seems
> > unlikely this plugin system is necessary just to control access to the
> > admin UI... or maybe it necessary?
> >
> > In any case, what is your approach?
> >
> > I'm using version 7.7.2 of Solr.
> >
> > Thanks!
> >
>


Re: How do *you* restrict access to Solr?

2020-03-16 Thread Ryan W
On Mon, Mar 16, 2020 at 11:09 AM Walter Underwood 
wrote:

> What access do you want to prevent? How do you prefer to authenticate?
> How do you manage users or roles? Master/slave or Solr Cloud?
>

I want to prevent access to the admin UI.

I don't want to manage users or roles, preferably.  I have only one user:
staff.  I want to prevent the public from accessing the admin UI.  I'd be
happy if I could set an IP address whitelist... especially if I don't have
to learn a new framework (which I will never use for any other purpose) to
do it.

I don't know what master/slave is.  These are new concepts that weren't
required to secure Solr prior to 7x, and this is my first project using a
version after 6x.

Thanks!



>
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
>
> > On Mar 16, 2020, at 7:44 AM, Ryan W  wrote:
> >
> > How do you, personally, do it?  Do you use IPTables?  Basic
> Authentication
> > Plugin? Something else?
> >
> > I'm asking in part so I'l have something to search for.  I don't know
> where
> > I should begin, so I figured I would ask how others do it.
> >
> > I haven't been able to find anything that works, so if you can tell me
> what
> > works for you, I can at least narrow it down a bit and do some Google
> > searches.  Do I need to learn Solr's plugin system?  Am I starting in the
> > right place if I follow this document:
> >
> https://lucene.apache.org/solr/guide/7_0/rule-based-authorization-plugin.html#rule-based-authorization-plugin
> >
> > Initially, the above document seems far too comprehensive for my needs.
> I
> > just want to block access to the Solr admin UI, and the list of
> predefined
> > permissions in that document don't seem to be relevant.  Also, it seems
> > unlikely this plugin system is necessary just to control access to the
> > admin UI... or maybe it necessary?
> >
> > In any case, what is your approach?
> >
> > I'm using version 7.7.2 of Solr.
> >
> > Thanks!
>
>


Re: How do *you* restrict access to Solr?

2020-03-16 Thread Walter Underwood
What access do you want to prevent? How do you prefer to authenticate?
How do you manage users or roles? Master/slave or Solr Cloud?

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Mar 16, 2020, at 7:44 AM, Ryan W  wrote:
> 
> How do you, personally, do it?  Do you use IPTables?  Basic Authentication
> Plugin? Something else?
> 
> I'm asking in part so I'l have something to search for.  I don't know where
> I should begin, so I figured I would ask how others do it.
> 
> I haven't been able to find anything that works, so if you can tell me what
> works for you, I can at least narrow it down a bit and do some Google
> searches.  Do I need to learn Solr's plugin system?  Am I starting in the
> right place if I follow this document:
> https://lucene.apache.org/solr/guide/7_0/rule-based-authorization-plugin.html#rule-based-authorization-plugin
> 
> Initially, the above document seems far too comprehensive for my needs.  I
> just want to block access to the Solr admin UI, and the list of predefined
> permissions in that document don't seem to be relevant.  Also, it seems
> unlikely this plugin system is necessary just to control access to the
> admin UI... or maybe it necessary?
> 
> In any case, what is your approach?
> 
> I'm using version 7.7.2 of Solr.
> 
> Thanks!



Re: How do *you* restrict access to Solr?

2020-03-16 Thread Ryan W
Thanks Jorn, though this all seems unrealistic.  Because the technical
skill required to secure Solr far exceeds the technical skill required to
install it, I suspect there are probably a lot of insecure installs out
there.  In many cases this will not apply: "if you work with people that
know a bit about those topics in your enterprise."  Solr is used in many
situations where the developer does not have access to a large enterprise
with highly specialized assistance.

On Mon, Mar 16, 2020 at 11:00 AM Jörn Franke  wrote:

> Solr should not be accessible to end users directly - only through a
> dedicated application in between.
>
> Then in an enterprise setting it is mostly Kerberos auth. and https (do
> not forget about zookeeper when using Solr cloud here you can also have
> Kerberos auth and in recent version also SSL). It is not that difficult to
> configure if you work with people that know a bit about those topics in
> your enterprise.
>
> In a Cloud based scenario jwt token can make sense.
>
> Do not do security by obscurity. You owe it to the users that potentially
> also have private data on Solr.
>
> > Am 16.03.2020 um 15:44 schrieb Ryan W :
> >
> > How do you, personally, do it?  Do you use IPTables?  Basic
> Authentication
> > Plugin? Something else?
> >
> > I'm asking in part so I'l have something to search for.  I don't know
> where
> > I should begin, so I figured I would ask how others do it.
> >
> > I haven't been able to find anything that works, so if you can tell me
> what
> > works for you, I can at least narrow it down a bit and do some Google
> > searches.  Do I need to learn Solr's plugin system?  Am I starting in the
> > right place if I follow this document:
> >
> https://lucene.apache.org/solr/guide/7_0/rule-based-authorization-plugin.html#rule-based-authorization-plugin
> >
> > Initially, the above document seems far too comprehensive for my needs.
> I
> > just want to block access to the Solr admin UI, and the list of
> predefined
> > permissions in that document don't seem to be relevant.  Also, it seems
> > unlikely this plugin system is necessary just to control access to the
> > admin UI... or maybe it necessary?
> >
> > In any case, what is your approach?
> >
> > I'm using version 7.7.2 of Solr.
> >
> > Thanks!
>


Re: How do *you* restrict access to Solr?

2020-03-16 Thread Jörn Franke
Solr should not be accessible to end users directly, only through a dedicated
application in between.

In an enterprise setting it is mostly Kerberos auth and HTTPS (do not forget
about ZooKeeper when using SolrCloud: there you can also have Kerberos auth
and, in recent versions, SSL). It is not that difficult to configure if you
work with people in your enterprise who know a bit about those topics.

In a cloud-based scenario, JWT tokens can make sense.

Do not do security by obscurity. You owe it to the users, who potentially also
have private data on Solr.

> Am 16.03.2020 um 15:44 schrieb Ryan W :
> 
> How do you, personally, do it?  Do you use IPTables?  Basic Authentication
> Plugin? Something else?
> 
> I'm asking in part so I'l have something to search for.  I don't know where
> I should begin, so I figured I would ask how others do it.
> 
> I haven't been able to find anything that works, so if you can tell me what
> works for you, I can at least narrow it down a bit and do some Google
> searches.  Do I need to learn Solr's plugin system?  Am I starting in the
> right place if I follow this document:
> https://lucene.apache.org/solr/guide/7_0/rule-based-authorization-plugin.html#rule-based-authorization-plugin
> 
> Initially, the above document seems far too comprehensive for my needs.  I
> just want to block access to the Solr admin UI, and the list of predefined
> permissions in that document don't seem to be relevant.  Also, it seems
> unlikely this plugin system is necessary just to control access to the
> admin UI... or maybe it necessary?
> 
> In any case, what is your approach?
> 
> I'm using version 7.7.2 of Solr.
> 
> Thanks!


Re: How do *you* restrict access to Solr?

2020-03-16 Thread Nicolas Franck
IPtables seems like the way to go, at least for me.
Even if this basic-auth-plugin works, then you'll have to
deal with denial-of-service attacks (although these can
also happen indirectly, by hitting the website that uses Solr).

> On 16 Mar 2020, at 15:44, Ryan W  wrote:
> 
> How do you, personally, do it?  Do you use IPTables?  Basic Authentication
> Plugin? Something else?
> 
> I'm asking in part so I'l have something to search for.  I don't know where
> I should begin, so I figured I would ask how others do it.
> 
> I haven't been able to find anything that works, so if you can tell me what
> works for you, I can at least narrow it down a bit and do some Google
> searches.  Do I need to learn Solr's plugin system?  Am I starting in the
> right place if I follow this document:
> https://lucene.apache.org/solr/guide/7_0/rule-based-authorization-plugin.html#rule-based-authorization-plugin
> 
> Initially, the above document seems far too comprehensive for my needs.  I
> just want to block access to the Solr admin UI, and the list of predefined
> permissions in that document don't seem to be relevant.  Also, it seems
> unlikely this plugin system is necessary just to control access to the
> admin UI... or maybe it necessary?
> 
> In any case, what is your approach?
> 
> I'm using version 7.7.2 of Solr.
> 
> Thanks!



Re: How do *you* restrict access to Solr?

2020-03-16 Thread Susheel Kumar
Basic auth should help you to start

https://lucene.apache.org/solr/guide/8_1/basic-authentication-plugin.html

On Mon, Mar 16, 2020 at 10:44 AM Ryan W  wrote:

> How do you, personally, do it?  Do you use IPTables?  Basic Authentication
> Plugin? Something else?
>
> I'm asking in part so I'l have something to search for.  I don't know where
> I should begin, so I figured I would ask how others do it.
>
> I haven't been able to find anything that works, so if you can tell me what
> works for you, I can at least narrow it down a bit and do some Google
> searches.  Do I need to learn Solr's plugin system?  Am I starting in the
> right place if I follow this document:
>
> https://lucene.apache.org/solr/guide/7_0/rule-based-authorization-plugin.html#rule-based-authorization-plugin
>
> Initially, the above document seems far too comprehensive for my needs.  I
> just want to block access to the Solr admin UI, and the list of predefined
> permissions in that document don't seem to be relevant.  Also, it seems
> unlikely this plugin system is necessary just to control access to the
> admin UI... or maybe it necessary?
>
> In any case, what is your approach?
>
> I'm using version 7.7.2 of Solr.
>
> Thanks!
>


Re: How do *you* restrict access to Solr?

2020-03-16 Thread David Hastings
Honestly?  I know this isn't what you're going to want to hear, but security
through obscurity.  No one else knows what port the server's on, and it's not
accessible from anything outside of the internal network.  If your Solr
install can be accessed from an external IP you have much larger issues.

On Mon, Mar 16, 2020 at 10:44 AM Ryan W  wrote:

> How do you, personally, do it?  Do you use IPTables?  Basic Authentication
> Plugin? Something else?
>
> I'm asking in part so I'l have something to search for.  I don't know where
> I should begin, so I figured I would ask how others do it.
>
> I haven't been able to find anything that works, so if you can tell me what
> works for you, I can at least narrow it down a bit and do some Google
> searches.  Do I need to learn Solr's plugin system?  Am I starting in the
> right place if I follow this document:
>
> https://lucene.apache.org/solr/guide/7_0/rule-based-authorization-plugin.html#rule-based-authorization-plugin
>
> Initially, the above document seems far too comprehensive for my needs.  I
> just want to block access to the Solr admin UI, and the list of predefined
> permissions in that document don't seem to be relevant.  Also, it seems
> unlikely this plugin system is necessary just to control access to the
> admin UI... or maybe it necessary?
>
> In any case, what is your approach?
>
> I'm using version 7.7.2 of Solr.
>
> Thanks!
>


Using Synonym Graph Filter with StandardTokenizer does not tokenize the query string if it has multi-word synonym

2020-03-16 Thread atin janki
Hello everyone,

I am using solr 8.3.

After I included Synonym Graph Filter in my managed-schema file, I
have noticed that if the query string contains a multi-word synonym,
it considers that multi-word synonym as a single term and does not
break it, further suppressing the default search behaviour.

I am using StandardTokenizer.

Below is a snippet from managed-schema file -

>
> *   positionIncrementGap="100" multiValued="true">*
> **
> *  *
> *   ignoreCase="true"/>*
> *  *
> **
> **
> *  *
> *   ignoreCase="true"/>*
> *   ignoreCase="true" synonyms="synonyms.txt"/>*
> *  *
> ***  *


Here "*soap powder*" is the search *query* which is also a multi-word
synonym in the synonym file as-

> s(104254535,1,'soap powder',n,1,1).
> s(104254535,2,'built-soap powder',n,1,0).
> s(104254535,3,'washing powder',n,1,0).


I am sharing some screenshots for understanding the problem-

*without* Synonym Graph Filter => 2 docs returned  (screenshot at
below mentioned URL) -

https://ibb.co/zQXx7mV

*with* Synonym Graph Filter => 2 docs expected, only 1 returned
(screenshot at below mentioned URL) -

https://ibb.co/tp04Rzw


Has anyone experienced this before? If yes, is there any workaround?
Or is it expected behaviour?

Regards,
Atin Janki


How do *you* restrict access to Solr?

2020-03-16 Thread Ryan W
How do you, personally, do it?  Do you use IPTables?  Basic Authentication
Plugin? Something else?

I'm asking in part so I'll have something to search for.  I don't know where
I should begin, so I figured I would ask how others do it.

I haven't been able to find anything that works, so if you can tell me what
works for you, I can at least narrow it down a bit and do some Google
searches.  Do I need to learn Solr's plugin system?  Am I starting in the
right place if I follow this document:
https://lucene.apache.org/solr/guide/7_0/rule-based-authorization-plugin.html#rule-based-authorization-plugin

Initially, the above document seems far too comprehensive for my needs.  I
just want to block access to the Solr admin UI, and the list of predefined
permissions in that document don't seem to be relevant.  Also, it seems
unlikely this plugin system is necessary just to control access to the
admin UI... or maybe it is necessary?

In any case, what is your approach?

I'm using version 7.7.2 of Solr.

Thanks!


How to sum model grouped?

2020-03-16 Thread hakan
I use Solr version 7.1. I have a grouped model with 11M records in total, as
in the example below.
My question is: how do I sum the fromfollowers field across this grouped model?
{
  groupValue: "1927245294",
  doclist: {
    numFound: 1,
    start: 0,
    docs: [
      {
        fromuserid: "1927245294",
        fromfollowers: 185
      }
    ]
  }
},
{
  groupValue: "98405321",
  doclist: {
    numFound: 1,
    start: 0,
    docs: [
      {
        fromuserid: "98405321",
        fromfollowers: 292
      }
    ]
  }
},
{
  groupValue: "182496421",
  doclist: {
    numFound: 1,
    start: 0,
    docs: [
      {
        fromuserid: "182496421",
        fromfollowers: 111
      }
    ]
  }
}
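
One possible client-side approach, sketched in Python: parse the grouped
response and add up fromfollowers over the documents returned in each group.
The `groups` list below is just the three groups from the example written as
Python data; how you fetch and parse the response is left out.

```python
# The three groups from the example above, as parsed JSON.
groups = [
    {"groupValue": "1927245294",
     "doclist": {"numFound": 1, "start": 0,
                 "docs": [{"fromuserid": "1927245294", "fromfollowers": 185}]}},
    {"groupValue": "98405321",
     "doclist": {"numFound": 1, "start": 0,
                 "docs": [{"fromuserid": "98405321", "fromfollowers": 292}]}},
    {"groupValue": "182496421",
     "doclist": {"numFound": 1, "start": 0,
                 "docs": [{"fromuserid": "182496421", "fromfollowers": 111}]}},
]


def sum_followers(groups: list) -> int:
    """Sum fromfollowers over every document returned in every group."""
    return sum(
        doc["fromfollowers"]
        for group in groups
        for doc in group["doclist"]["docs"]
    )


print(sum_followers(groups))  # 185 + 292 + 111 = 588
```

This only counts the documents Solr actually returned for each group, so it
depends on the rows and group.limit settings of the query. With 11M records a
server-side aggregation (the stats component, or a JSON Facet
sum(fromfollowers)) is probably more practical, but note that those sum over
all matching documents rather than one document per group.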



--
Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Copying data

2020-03-16 Thread Erick Erickson
It’s not at all clear what the problem is. If you have a single-shard
collection, just
1> create the stand-alone core
2> shut down the Solr instance
3> replace the stand-alone core's data dir with one from any of your prod
   machines.
4> start Solr

An alternative is to use the replication API to replace the index on your
stand-alone core with one from one of the prod machines, see:
https://lucene.apache.org/solr/guide/7_7/index-replication.html. You have to
specify the masterUrl and shouldn’t need to do anything with the configuration.
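
A rough Python sketch of what that replication call could look like. It only
builds the URL for the replication handler's fetchindex command. The host
names and core names are hypothetical, and I am assuming the masterUrl value
points at the prod core's /replication handler, so check it against the
reference guide page above before relying on it.

```python
from urllib.parse import urlencode


def fetchindex_url(dev_base: str, dev_core: str, prod_core_url: str) -> str:
    """Build the URL asking the dev core to pull the index from a prod core."""
    params = {
        "command": "fetchindex",
        # Assumption: masterUrl points at the prod core's replication handler.
        "masterUrl": f"{prod_core_url}/replication",
    }
    return f"{dev_base}/solr/{dev_core}/replication?{urlencode(params)}"


# Hypothetical hosts and core names, for illustration only.
url = fetchindex_url(
    "http://dev:8983",
    "devcore",
    "http://prod1:8983/solr/mycoll_shard1_replica_n1",
)
print(url)
```

You would then request that URL once (with curl or a browser) against the dev
machine, which has to be able to reach the prod node over HTTP.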

But assuming you have 3 shards:

First, it’s easy enough to create a three-shard collection on your dev machine,
either using embedded ZK or a separate ZK instance on the dev machine, so
that’s one option. The advantage there is it’s the same environment. To do
that, just create the 3-shard collection.

Alternatively, you can use the core admin API MERGEINDEXES command. What you’ll
do is

1> create your core on your dev machine
2> copy one of the data dirs from one of the prod machines to the data dir of
   your new core.
3> copy the other two data dirs somewhere on the prod machine
4> use MERGEINDEXES, see:
   https://lucene.apache.org/solr/guide/7_4/coreadmin-api.html
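
As a sketch of step 4, this Python snippet builds the CoreAdmin MERGEINDEXES
URL with one indexDir parameter per copied index. The host, core name and
directory paths are placeholders I invented; each indexDir should point at the
index directory inside a copied data dir, readable by the Solr process that
hosts the target core.

```python
from urllib.parse import urlencode


def mergeindexes_url(base_url: str, target_core: str, index_dirs: list) -> str:
    """Build a CoreAdmin MERGEINDEXES URL merging index_dirs into target_core."""
    params = [("action", "MERGEINDEXES"), ("core", target_core)]
    # indexDir may be repeated, once per index directory to merge in.
    params += [("indexDir", path) for path in index_dirs]
    return f"{base_url}/solr/admin/cores?{urlencode(params)}"


# Hypothetical host, core name and paths, for illustration only.
merge_url = mergeindexes_url(
    "http://localhost:8983",
    "devcore",
    ["/data/shard2/index", "/data/shard3/index"],
)
print(merge_url)
```

After the merge the target core needs a commit before the merged documents
become visible.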

Best,
Erick

> On Mar 16, 2020, at 12:32 AM, Jayadevan Maymala  
> wrote:
> 
> Hi all,
> 
> I have a 3 node Solr cluster in production (with zoo keeper). In dev, I
> have one node Solr instance, no zoo keeper. Which is the best way to copy
> over the production solr data to dev?
> Operating system is CentOS 7.7, Solr Version 7.3
> Collection size is in the 40-50 GB range.
> 
> Regards,
> Jayadevan



number of documents exceed 2147483519

2020-03-16 Thread Hongxu Ma
Hi
I'm using SolrCloud (ver 6.6) and got an error:
org.apache.solr.common.SolrException: Exception writing document id (null) to 
the index; possible analysis error: number of documents in the index cannot 
exceed 2147483519

After googling it, I understand that the number of documents exceeds the limit
for one Solr shard.
The collection has 64 shards, so I think the total limit is about 2B*64=128B.

My question is:
I don't want to recreate the index (and split it into more shards) and also
don't want to delete docs.
Can I use the "SPLITSHARD" API to fix this issue?
https://lucene.apache.org/solr/guide/6_6/collections-api.html#CollectionsAPI-splitshard

After splitting each shard (128 shards in total), I think the total limit
increases to 256B, right?
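
The arithmetic can be checked with a few lines of Python. The per-shard limit
is the number from the error message (2^31 - 129); the totals are theoretical
ceilings that assume documents are spread perfectly evenly across shards,
which hash routing only approximates.

```python
# Per-shard (per Lucene index) document limit, taken from the error message.
MAX_DOCS_PER_SHARD = 2_147_483_519


def collection_limit(num_shards: int) -> int:
    """Theoretical document ceiling for a collection with num_shards shards."""
    return MAX_DOCS_PER_SHARD * num_shards


for shards in (64, 128):
    print(f"{shards} shards: {collection_limit(shards):,} docs")
# 64 shards: 137,438,945,216 docs
# 128 shards: 274,877,890,432 docs
```

So the ceilings are roughly 137B and 275B rather than exactly 128B and 256B.
In practice a shard will hit its own limit before the collection reaches that
total, and as far as I know deleted documents that have not been merged away
still count toward the per-shard number.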

Thanks.

