Setting up SolrCloud Behind Azure Application Gateway

2020-11-12 Thread Victor Kretzer
I'm attempting to set up SolrCloud for use with Sitecore 9.0.2. I want to set 
up my Azure Application Gateway with a TLS cert. I want a private IP for 
Sitecore and a public IP for accessing the Solr Admin Dashboard. My goal is to 
terminate TLS at the Application Gateway and then route to the backend over 
plain HTTP.

I currently have the following configuration:
* 2 SolrCloud 6.6.6 nodes on 2 Azure Ubuntu 18.04 LTS VMs
* 3 Zookeeper nodes on 3 Azure Ubuntu VMs
* A VPN with the IPs of all the above
* An Application Gateway with:
  o public listener on port 443
  o public listener on port 80 (to eliminate the cert as a cause of my issues)
  o backend pool for the two SolrCloud VMs
  o an HTTP setting for backend port 8983

I can access the dashboard for the nodes using:
* http://:8983/solr/#/

But not when using either of the following:
* https:///solr/# with a public listener on port 443
* http:///solr/# with a public listener on port 80

The private IPs of both SolrCloud VMs are reporting healthy on port 8983 with a 
302-status code according to the default Backend Health monitor on Application 
Gateway.
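
For context on why a 302 is reported as healthy: as far as the Azure documentation describes it, the default Application Gateway probe accepts any response status from 200 through 399. The sketch below only restates that rule in Python; the function name and its range parameters are illustrative and are not part of any Azure SDK.

```python
def probe_is_healthy(status_code: int, low: int = 200, high: int = 399) -> bool:
    """Return True if a probe response status falls in the accepted range.

    The 200-399 default mirrors the documented default probe behaviour;
    a custom probe may be configured with a narrower range.
    """
    return low <= status_code <= high


if __name__ == "__main__":
    # 302 is the status reported for both SolrCloud VMs in this message.
    print(probe_is_healthy(302))
```

A healthy probe therefore shows only that port 8983 answers on each VM; it does not by itself confirm that requests for /solr/ pages are routed correctly through the listener.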

I greatly appreciate any help provided.

Thanks,

Victor



RE: Using Multiple collections with streaming expressions

2020-11-12 Thread ufuk yılmaz
Many thanks for the info Joel

--ufuk

Sent from Mail for Windows 10

From: Joel Bernstein
Sent: 12 November 2020 17:00
To: solr-user@lucene.apache.org
Subject: Re: Using Multiple collections with streaming expressions

T



Re: Using Multiple collections with streaming expressions

2020-11-12 Thread Joel Bernstein
The multiple collection syntax has been implemented for only a few stream
sources: search, timeseries, facet and stats. Eventually it will be
implemented for all stream sources.
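
As an illustration, here is a small Python sketch that builds a `search` expression over several collections. It assumes the multi-collection form is a quoted, comma-separated list in the first argument; the collection names, query and field names are hypothetical, so check the expression against your Solr version before relying on it.

```python
def multi_collection_search(collections, q, fl, sort):
    """Build a search() streaming expression over several collections.

    Assumption: multiple collections are passed as one quoted,
    comma-separated first argument, e.g. search("c1,c2", ...).
    """
    joined = ",".join(collections)
    return f'search("{joined}", q="{q}", fl="{fl}", sort="{sort}")'


if __name__ == "__main__":
    # Hypothetical collection and field names.
    print(multi_collection_search(["logs_a", "logs_b"], "*:*", "id", "id asc"))
```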


Joel Bernstein
http://joelsolr.blogspot.com/


On Tue, Nov 10, 2020 at 12:32 PM ufuk yılmaz 
wrote:

> Thanks again Erick, that’s a good idea!
>
> Alternatively, I use an alias covering multiple collections in these
> situations, but there may be too many combinations of collections, so it’s
> not always suitable.
>
> Merged significantTerms streams will have meaningless scores in tuples, I
> think; it would be comparing apples and oranges. But in this case I’m only
> interested in getting foreground counts, so that’s a problem for another day.
>
> What seemed strange to me was source code for streams appeared to be
> handling this case.
>
>
> Sent from Mail for Windows 10
>
> From: Erick Erickson
> Sent: 10 November 2020 16:48
> To: solr-user@lucene.apache.org
> Subject: Re: Using Multiple collections with streaming expressions
>
> Y
>
>


Re: Unloading and loading a Collection in SolrCloud with external Zookeeper ensemble

2020-11-12 Thread Erick Erickson
As stated in the docs, using the Core Admin API when using SolrCloud is not 
recommended, for reasons just like this. While SolrCloud _does_ use the Core 
Admin API, its usage has to be very precise.

You apparently didn’t heed this warning in the documentation for the Core 
Admin API's UNLOAD command:

“Unloading all cores in a SolrCloud collection causes the removal of that 
collection’s metadata from ZooKeeper.”

This latter is what the “non legacy mode…” message is about. In earlier 
versions of Solr, the ZK information was recreated when Solr found a 
core.properties file, but that had its own problems so was removed.

Your best bet now is to wipe your directories, create a new collection and 
re-index.

If you absolutely can’t reindex:
0> save away one index directory from every shard; it doesn’t matter which 
   replica it comes from.
1> create the collection, with the exact same number of shards and a 
   replicationFactor of 1
2> shut down all the Solr instances
3> copy the index directory from <0> to “the right place”. For instance, if 
   you have a collection blah, you’ll have some directory like 
   blah_shard1_replica_n1/data/index. It’s critical that you replace the 
   contents of data/index with the contents of the directory saved in <0> 
   from the _same_ shard, shard1 in this example.
4> start your Solr instances back up
5> use ADDREPLICA to build out the collection to have as many replicas as you 
   need.
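
The Collections API calls in steps 1 and 5 could be sketched roughly as below. This is only an illustration in Python that builds the request URLs without sending them: the base URL, the helper names and the shard1..shardN naming are assumptions, and steps 2 through 4 (shut down, copy the index, restart) still have to happen by hand between the CREATE and ADDREPLICA calls.

```python
from urllib.parse import urlencode


def collections_api_url(base_url, action, **params):
    """Build a Collections API URL such as .../admin/collections?action=CREATE&..."""
    query = urlencode({"action": action, **params})
    return f"{base_url}/admin/collections?{query}"


def rebuild_plan(base_url, collection, num_shards, extra_replicas_per_shard):
    """Return the URLs for step 1 (CREATE) followed by step 5 (ADDREPLICA).

    Assumes the default shard names shard1..shardN. Steps 2-4 are manual
    and must be done after the first URL and before the remaining ones.
    """
    urls = [
        collections_api_url(
            base_url, "CREATE",
            name=collection, numShards=num_shards, replicationFactor=1,
        )
    ]
    for shard in range(1, num_shards + 1):
        for _ in range(extra_replicas_per_shard):
            urls.append(
                collections_api_url(
                    base_url, "ADDREPLICA",
                    collection=collection, shard=f"shard{shard}",
                )
            )
    return urls


if __name__ == "__main__":
    # Hypothetical host; "blah" is the collection name used in the example above.
    for url in rebuild_plan("http://localhost:8983/solr", "blah", 2, 1):
        print(url)
```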

Good luck!
Erick


> On Nov 12, 2020, at 6:32 AM, Gajanan  wrote:
> 
> I have unloaded all cores of a collection in SolrCloud (8.x.x ) using
> coreAdmin APIs as UNLOAD collection is not available in collections API. Now
> I want to reload the unloaded collection using APIs only. 
> When trying with coreAdmin APIs I am getting "Non legacy mode CoreNodeName
> not found." 
> When trying with collections APIs it is reloaded but shows no cores
> available.
> 
> 
> 
> 
> --
> Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html



Re: Need help to resolve Apache Solr vulnerability

2020-11-12 Thread Dave
Solr isn’t meant to be public facing. Not sure how anyone would send these 
commands since it can’t be reached from the outside world 

> On Nov 12, 2020, at 7:12 AM, Sheikh, Wasim A. 
>  wrote:
> 
> Hi Team,
> 
> Currently we are facing the below vulnerability for Apache Solr tool. So can 
> you please check the below details and help us to fix this issue.
> 
> /etc/init.d/solr-master version
> 
> Server version: Apache Tomcat/7.0.62
> Server built: May 7 2015 17:14:55 UTC
> Server number: 7.0.62.0
> OS Name: Linux
> OS Version: 2.6.32-431.29.2.el6.x86_64
> Architecture: amd64
> JVM Version: 1.8.0_20-b26
> JVM Vendor: Oracle Corporation
> 
> 
> solr-spec-version:4.10.4,
> Solr is an enterprise search platform.
> Solr is prone to remote code execution vulnerability.
> 
> Affected Versions:
> Apache Solr version prior to 6.6.2 and prior to 7.1.0
> 
> QID Detection Logic (Unauthenticated):
> This QID sends specifically crafted request which include special entities in 
> the xml document and looks for the vulnerable response.
> Alternatively, in another check, this QID matches vulnerable versions in the 
> response webpage
> Successful exploitation allows attacker to execute arbitrary code.
> The vendor has issued updated packages to fix this vulnerability. For more 
> information about the vulnerability and obtaining patches, refer to the 
> following Fedora security advisories: Apache Solr 6.6.2 
> (https://lucene.apache.org/solr/news.html). More information regarding the 
> update can be found at Apache Solr 7.1.0 
> (https://lucene.apache.org/solr/news.html).
> 
> Patch:
> Following are links for downloading patches to fix the vulnerabilities:
> Apache Solr 6.6.2 (https://lucene.apache.org/solr/news.html)
> Apache Solr 7.1.0 (https://lucene.apache.org/solr/news.html)
> 
> 
> Thanks...
> Wasim Shaikh
> 
> 
> 
> This message is for the designated recipient only and may contain privileged, 
> proprietary, or otherwise confidential information. If you have received it 
> in error, please notify the sender immediately and delete the original. Any 
> other use of the e-mail by you is prohibited. Where allowed by local law, 
> electronic communications with Accenture and its affiliates, including e-mail 
> and instant messaging (including content), may be scanned by our systems for 
> the purposes of information security and assessment of internal compliance 
> with Accenture policy. Your privacy is important to us. Accenture uses your 
> personal data only in compliance with data protection laws. For further 
> information on how Accenture processes your personal data, please see our 
> privacy statement at https://www.accenture.com/us-en/privacy-policy.
> __
> 
> www.accenture.com


How to unload and reload a solr collection in SolrCloud

2020-11-12 Thread Gajanan Watkar
I want to unload and reload all cores of a collection in SolrCloud mode
(Solr 8.x.x).

-- 
-Gajanan


Child documents are not retrieved after DIH

2020-11-12 Thread Jordi Cabré
I will try to explain my problem in as much detail as possible, isolating it
as much as possible from its context.

In short, I'm trying to create a `DIH` configuration that ingests some
documents as nested documents. That is, I need to ingest a `one-to-many`
relation and index it as nested documents.

My `parents` data is:


+----+---------------+-------------+
| id | name_s        | node_type_s |
+====+===============+=============+
|  1 | parent-name-1 | parent      |
|  2 | parent-name-2 | parent      |
|  3 | parent-name-3 | parent      |
+----+---------------+-------------+

And `children` data is:


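+-----+-------------+--------------+-------------+
| id  | parent_id_s | name_s       | node_type_s |
+=====+=============+==============+=============+
| 1-1 | 1           | child-name-1 | child       |
| 2-1 | 1           | child-name-2 | child       |
| 3-2 | 2           | child-name-3 | child       |
| 4-3 | 3           | child-name-4 | child       |
+-----+-------------+--------------+-------------+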
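
To make the intended result concrete, the following Python sketch groups the rows above into the nested layout the import is meant to produce. It is not DIH itself, only an illustration; `_childDocuments_` is the key Solr's JSON update format uses for nested children, and the `nest` helper is a hypothetical name.

```python
parents = [
    {"id": "1", "name_s": "parent-name-1", "node_type_s": "parent"},
    {"id": "2", "name_s": "parent-name-2", "node_type_s": "parent"},
    {"id": "3", "name_s": "parent-name-3", "node_type_s": "parent"},
]
children = [
    {"id": "1-1", "parent_id_s": "1", "name_s": "child-name-1", "node_type_s": "child"},
    {"id": "2-1", "parent_id_s": "1", "name_s": "child-name-2", "node_type_s": "child"},
    {"id": "3-2", "parent_id_s": "2", "name_s": "child-name-3", "node_type_s": "child"},
    {"id": "4-3", "parent_id_s": "3", "name_s": "child-name-4", "node_type_s": "child"},
]


def nest(parent_rows, child_rows):
    """Attach each child row to its parent under the _childDocuments_ key."""
    by_parent = {}
    for child in child_rows:
        by_parent.setdefault(child["parent_id_s"], []).append(child)
    return [
        {**parent, "_childDocuments_": by_parent.get(parent["id"], [])}
        for parent in parent_rows
    ]


if __name__ == "__main__":
    for doc in nest(parents, children):
        print(doc["id"], [child["id"] for child in doc["_childDocuments_"]])
```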


Here is my `DIH` configuration:

[XML configuration not preserved in the archive]

As you can see, the nested entity is declared with `child="true"`.

After running the data import, the status response is:

{
  "responseHeader": {
    "status": 0,
    "QTime": 0
  },
  "initArgs": [
    "defaults",
    [
      "config",
      "parent-children-config.xml"
    ]
  ],
  "command": "status",
  "status": "idle",
  "importResponse": "",
  "statusMessages": {
    "Total Requests made to DataSource": "2",
    "Total Rows Fetched": "7",
    "Total Documents Processed": "3",
    "Total Documents Skipped": "0",
    "Full Dump Started": "2020-11-12 08:02:25",
    "": "Indexing completed. Added/Updated: 3 documents. Deleted 0 documents.",
    "Committed": "2020-11-12 08:02:25",
    "Time taken": "0:0:0.304"
  }
}

So the import seems to have worked well.

After that, I tested how to get only parents with
`q={!parent which=node_type_s:parent}`:

{
  "responseHeader": {
    "status": 0,
    "QTime": 1,
    "params": {
      "q": "{!parent which=node_type_s:parent}",
      "_": "1605166879678"
    }
  },
  "response": {
    "numFound": 3,
    "start": 0,
    "numFoundExact": true,
    "docs": [
      {
        "name_s": "parent-name-1",
        "node_type_s": "parent",
        "id": "1",
        "_version_": 1683140793502531584
      },
      {
        "name_s": "parent-name-2",
        "node_type_s": "parent",
        "id": "2",
        "_version_": 1683140793504628736
      },
      {
        "name_s": "parent-name-3",
        "node_type_s": "parent",
        "id": "3",
        "_version_": 1683140793505677312
      }
    ]
  }
}

As you can see, only `parents` are returned.

When I ask for only `children`:

{
  "responseHeader": {
    "status": 0,
    "QTime": 3,
    "params": {
      "q": "{!child of=\"node_type_s:parent\"}",
      "_": "1605166879678"
    }
  },
  "response": {
    "numFound": 4,
    "start": 0,
    "numFoundExact": true,
    "docs": [
      {
        "name_s": "child-name-1",
        "node_type_s": "child",
        "parent_id_s": "1",
        "id": "1-1",
        "_version_": 1683140793502531584
      },
      {
        "name_s": "child-name-2",
        "node_type_s": "child",
        "parent_id_s": "1",
        "id": "2-1",
        "_version_": 1683140793502531584
      },
      {
        "name_s": "child-name-3",
        "node_type_s": "child",
        "parent_id_s": "2",
        "id": "3-2",
        "_version_": 1683140793504628736
      },
      {
        "name_s": "child-name-4",
        "node_type_s": "child",
        "parent_id_s": "3",
        "id": "4-3",
        "_version_": 1683140793505677312
      }
    ]
  }
}

All right, only child documents are returned.
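
For reference, the two block join queries used so far can be assembled as in the Python sketch below. It only builds query strings and a select URL; the host, the collection name `demo` and the helper names are hypothetical, and nothing is sent to Solr.

```python
from urllib.parse import urlencode


def parents_matching(parent_filter, some_children=""):
    """Block join parent query: returns parents of the matching children."""
    return f"{{!parent which={parent_filter}}}{some_children}"


def children_of(parent_filter, some_parents=""):
    """Block join child query: returns children of the matching parents."""
    return f'{{!child of="{parent_filter}"}}{some_parents}'


def select_url(base_url, collection, q):
    """URL-encode q into a /select request URL."""
    return f"{base_url}/{collection}/select?{urlencode({'q': q})}"


if __name__ == "__main__":
    all_parents = parents_matching("node_type_s:parent")
    all_children = children_of("node_type_s:parent")
    print(all_parents)
    print(all_children)
    # Hypothetical host and collection name.
    print(select_url("http://localhost:8983/solr", "demo", all_children))
```

Passing a second argument restricts the other side of the join; for example, `children_of("node_type_s:parent", "id:1")` limits the result to the children of the parent with id 1.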

Then I also tried to get only the `children of parent 1`:

{
  "responseHeader": {
    "status": 0,
    "QTime": 0,
    "params": {
      "q": "{!child of=\"node_type_s:parent\"}id:1",
      "_": "1605166879678"
    }
  },
  "response": {
    "numFound": 2,
    "start": 0,
    "numFoundExact": true,
    "docs": [
      {
        "name_s": "child-name-1",
        "node_type_s": "child",
        "parent_id_s": "1",
        "id": "1-1",
        "_version_": 1683140793502531584

Need help to resolve Apache Solr vulnerability

2020-11-12 Thread Sheikh, Wasim A.
Hi Team,

Currently we are facing the below vulnerability for Apache Solr tool. So can 
you please check the below details and help us to fix this issue.

/etc/init.d/solr-master version

Server version: Apache Tomcat/7.0.62
Server built: May 7 2015 17:14:55 UTC
Server number: 7.0.62.0
OS Name: Linux
OS Version: 2.6.32-431.29.2.el6.x86_64
Architecture: amd64
JVM Version: 1.8.0_20-b26
JVM Vendor: Oracle Corporation


solr-spec-version:4.10.4,
Solr is an enterprise search platform.
Solr is prone to remote code execution vulnerability.

Affected Versions:
Apache Solr version prior to 6.6.2 and prior to 7.1.0
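
One way to read the "Affected Versions" line is: anything before 6.6.2, plus any 7.x release before 7.1.0. The Python sketch below encodes that reading, which is an assumption about the scanner's wording rather than an official statement, and checks the reported 4.10.4 against it. The helper names are hypothetical.

```python
def parse_version(text):
    """Turn a dotted version string such as '4.10.4' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def is_affected(version):
    """Assumed reading of the report: < 6.6.2, or 7.0.0 <= version < 7.1.0."""
    v = parse_version(version)
    return v < (6, 6, 2) or (7, 0, 0) <= v < (7, 1, 0)


if __name__ == "__main__":
    # 4.10.4 is the solr-spec-version reported above.
    print(is_affected("4.10.4"))
```

Comparing integer tuples rather than raw strings matters here, because a plain string comparison would order "4.10.4" before "4.9.0".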

QID Detection Logic (Unauthenticated):
This QID sends specifically crafted request which include special entities in 
the xml document and looks for the vulnerable response.
Alternatively, in another check, this QID matches vulnerable versions in the 
response webpage
Successful exploitation allows attacker to execute arbitrary code.
The vendor has issued updated packages to fix this vulnerability. For more 
information about the vulnerability and obtaining patches, refer to the 
following Fedora security advisories: Apache Solr 6.6.2 
(https://lucene.apache.org/solr/news.html). More information regarding the 
update can be found at Apache Solr 7.1.0 
(https://lucene.apache.org/solr/news.html).

Patch:
Following are links for downloading patches to fix the vulnerabilities:
Apache Solr 6.6.2 (https://lucene.apache.org/solr/news.html)
Apache Solr 7.1.0 (https://lucene.apache.org/solr/news.html)


Thanks...
Wasim Shaikh





Unloading and loading a Collection in SolrCloud with external Zookeeper ensemble

2020-11-12 Thread Gajanan
I have unloaded all cores of a collection in SolrCloud (8.x.x ) using
coreAdmin APIs as UNLOAD collection is not available in collections API. Now
I want to reload the unloaded collection using APIs only. 
When trying with coreAdmin APIs I am getting "Non legacy mode CoreNodeName
not found." 
When trying with collections APIs it is reloaded but shows no cores
available.



