Re: CVE-2019-17558 on SOLR 6.1

2021-02-12 Thread Shawn Heisey

On 2/12/2021 11:17 AM, Rick Tham wrote:

I am trying to figure out if the following is an additional valid
mitigation step for CVE-2019-17558 on SOLR 6.1. None of our solrconfig.xml
contains the lib references to the velocity jar files as follows:


l

It doesn't appear that you can add these jar references using the config
API. Without these references, you are not able to flip the
params.resource.loader.enabled to true using the config API. If you are not
able to flip the flag and none of your cores have these lib references then
is the risk present?


In order to be vulnerable to that problem, all of the following things 
must be true.  If any of them is NOT true, then this vulnerability does 
not apply:


* The velocity jars must be loaded.  A common way to do this is the 
configuration you mentioned, but there are other ways.  Those other ways 
require human intervention to move the actual files.
* Your config must *use* the jars, by containing a velocity config.
* The params resource loader must be enabled in the velocity config. 
Note that the "velocity.params.resource.loader.enabled" flag only 
applies if the velocity config in solrconfig.xml *references* that flag.
* Your Solr server must be reachable to unauthorized parties who would 
exploit the vulnerability.
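
The first three conditions can be checked mechanically against a solrconfig.xml. What follows is a minimal sketch, not an official tool. It assumes the usual config layout: <lib> elements that mention velocity in an attribute, a <queryResponseWriter> whose class names the velocity writer, and a child element named params.resource.loader.enabled. The function name and the sample config are hypothetical, and the check cannot see jars that were copied onto the classpath by hand.

```python
import xml.etree.ElementTree as ET

FLAG_NAME = "params.resource.loader.enabled"


def velocity_exposure(solrconfig_xml):
    """Report which config-level conditions hold for one solrconfig.xml."""
    root = ET.fromstring(solrconfig_xml)

    # Condition 1: some <lib> element mentions velocity in any attribute.
    lib_referenced = any(
        "velocity" in value.lower()
        for lib in root.iter("lib")
        for value in lib.attrib.values()
    )

    # Condition 2: a response writer is backed by a velocity writer class.
    writers = [
        writer
        for writer in root.iter("queryResponseWriter")
        if "velocity" in writer.get("class", "").lower()
    ]

    # Condition 3: the writer config names the params resource loader flag.
    # The value may be a literal or a ${property:default} reference.
    flag_values = [
        (child.text or "").strip()
        for writer in writers
        for child in writer.iter()
        if child.get("name") == FLAG_NAME
    ]

    return {
        "lib_referenced": lib_referenced,
        "writer_configured": bool(writers),
        "flag_values": flag_values,
    }


if __name__ == "__main__":
    # Hypothetical sample config, for illustration only.
    sample = """
    <config>
      <lib dir="contrib/velocity/lib" regex=".*jar"/>
      <queryResponseWriter name="velocity" class="solr.VelocityResponseWriter">
        <str name="params.resource.loader.enabled">${velocity.params.resource.loader.enabled:false}</str>
      </queryResponseWriter>
    </config>
    """.strip()
    print(velocity_exposure(sample))
```

A config where lib_referenced and writer_configured are both False fails the first two conditions as far as the file itself is concerned. The fourth condition, network reachability, has to be checked separately.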


I have no idea whether any of this config can be changed remotely.  I 
have never used the config API.  But if your Solr server is not 
reachable to bad guys, it won't matter.


Simply controlling who can reach the Solr server is the easiest way to 
avoid being vulnerable to anything.  Although there are security 
mechanisms available, Solr is not designed to be publicly reachable.  It 
should be heavily firewalled.


The velocity response writer usually requires end users to have direct 
access to the Solr server for it to be worth something.  We STRONGLY 
discourage leaving Solr exposed.


Thanks,
Shawn


Re: Extremely Small Segments

2021-02-12 Thread Shawn Heisey

On 2/12/2021 4:30 AM, yasoobhaider wrote:

Note: Nothing out of the ordinary in logs. Only /update request logs.


Can you share your logs?  The best option would be to include everything 
in the logs directory.  Hopefully you have not altered the default 
logging config, which sets the detail to INFO.


Can you also include everything that's in the ZK configuration path?

If you need to remove sensitive information, please do so in a 
consistent way, and replace it with something else rather than just 
deleting it.
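
One way to do the consistent replacement described above is to map each distinct sensitive value to a stable placeholder, so the same host still looks like the same host throughout the logs. This is a minimal sketch that only handles IPv4 addresses; the function name, the pattern, and the "host-N" placeholder format are assumptions for illustration, and hostnames or credentials would need their own patterns.

```python
import re

# Simple IPv4 matcher; it does not validate octet ranges.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def redact_consistently(text, pattern=IPV4, label="host"):
    """Replace each distinct match with a stable placeholder (host-1, host-2, ...)."""
    seen = {}

    def substitute(match):
        original = match.group(0)
        if original not in seen:
            # Number placeholders in order of first appearance.
            seen[original] = f"{label}-{len(seen) + 1}"
        return seen[original]

    return pattern.sub(substitute, text)


if __name__ == "__main__":
    line = "10.0.0.5 talked to 10.0.0.7, then 10.0.0.5 again"
    print(redact_consistently(line))
    # prints: host-1 talked to host-2, then host-1 again
```

Because the mapping lives inside one call, all the log text should be passed through a single call (or the mapping kept between calls) so that placeholders stay consistent across files.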


Note that this mailing list has a tendency to eat attachments.  So 
you're going to need to use a file-sharing site and give us one or more 
URLs.  Dropbox is a good choice, but not the only one.


Thanks,
Shawn


Re: Why Solr questions on stackoverflow get very few views and answers, if at all?

2021-02-12 Thread Walter Underwood
Many questions have responses as comments, but no actual answers. One frequent 
contributor doesn’t understand how StackOverflow works, so he’s posting answers 
as comments. He’s also doing conversations instead of crafting a useful, 
complete answer.

I just answered a few. Mostly with “don’t use stop words” and “Solr is not a 
database”.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Feb 12, 2021, at 3:03 AM, Charlie Hull wrote:
> 
> I've answered a few in my time, but my experience is that if you do so you 
> then get emailed a whole load more questions some of which aren't even 
> relevant to Solr! Also, quite a few of them are 'here is 3 pages of code 
> please debug it for me no I won't tell the actual error I got'.
> 
> This is the best place to come,  also there's the IRC channel, the new Slack 
> gateway to this at https://s.apache.org/solr-slack and in our own Relevance 
> Slack at http://opensourceconnections.com/slack there's a #solr channel (as 
> well as many others on search & relevance topics).
> 
> Solr is 'hot' (but not as hot as Elasticsearch), and search is still a niche 
> business overall.
> 
> HTH
> 
> Cheers
> 
> Charlie
> 
> On 12/02/2021 10:37, ufuk yılmaz wrote:
>> Is it because the main place for q&a is this mailing list, or somewhere else 
>> that I don’t know?
>> 
>> Or Solr isn’t ‘hot’ as some other topics?
>> 
>> Sent from Mail for Windows 10
>> 
>> 
> 
> -- 
> Charlie Hull - Managing Consultant at OpenSource Connections Limited 
> 
> Founding member of The Search Network  and 
> co-author of Searching the Enterprise 
> 
> tel/fax: +44 (0)8700 118334
> mobile: +44 (0)7767 825828



Re: Elevation in dataDir in Solr Cloud

2021-02-12 Thread Chris Hostetter


: I need to have the elevate.xml file updated frequently and I was wondering
: if it is possible to put this file in the dataDir folder when using Solr
: Cloud. I know that this is possible in the standalone mode, and I haven't
: seen in the documentation [1] that it can not be done in Cloud.
: 
: I am using Solr 7.7.2 and ZooKeeper. After creating the techproducts
: collection for the tests, I remove the elevate.xml file from the
: configuration and I put it in the dataDir folder of the cores. When I
: update the collection with that configuration, I get the following error:
: "Can't find resource 'elevate.xml' in classpath or
: '/configs/techproductsConfExp'". Is this expected or I am doing something
: wrong?

Hmmm... can you share the full stack trace of that error?

(I suspect at some point someone made a sloppy assumption in the QEC code 
that no one would ever try to keep elevate.xml in the data dir in cloud 
mode.)

I don't know if it will work, but one thing you might want to experiment 
with is putting your elevate.xml back in the configset in zk, and updating it 
on the fly in zk -- then see if it gets reloaded by each core the next 
time the index changes (NOTE that there will almost certainly need to be 
an index change for it to re-load, since I don't see any indication that 
it's watching for changes in zk)

FWIW: the way most people seem to be using QEC these days is to have an 
empty elevate.xml file, and then have their application use some other 
key/val store, or more complex matching logic, to decide which documents 
to elevate, and then use the "elevateIds" param to pass that info to solr.
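
As a rough illustration of that last approach, the application decides which document ids to elevate and passes them on the request. This is a minimal sketch: the base URL, the /elevate handler path, and the document ids are assumptions for illustration, and the handler must have the query elevation component configured for the parameters to have any effect.

```python
from urllib.parse import urlencode


def elevated_query_url(base_url, query, doc_ids):
    """Build a select URL that asks Solr to elevate the given document ids."""
    params = {
        "q": query,
        "enableElevation": "true",
        # elevateIds takes a comma-separated list of unique keys.
        "elevateIds": ",".join(doc_ids),
    }
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical collection, handler and ids.
    url = elevated_query_url(
        "http://localhost:8983/solr/techproducts/elevate",
        "ipod",
        ["MA147LL/A", "IW-02"],
    )
    print(url)
```

The ids would come from whatever store the application uses (a key/value store, a rules engine, and so on), which is what makes frequent updates possible without touching elevate.xml.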


-Hoss
http://www.lucidworks.com/


CVE-2019-17558 on SOLR 6.1

2021-02-12 Thread Rick Tham
We are using Solr 6.1 and at the moment we can not upgrade due to
application dependencies.

We have mitigation steps in place to only trust specific machines within
our DMZ.

I am trying to figure out if the following is an additional valid
mitigation step for CVE-2019-17558 on SOLR 6.1. None of our solrconfig.xml
contains the lib references to the velocity jar files as follows:


l

It doesn't appear that you can add these jar references using the config
API. Without these references, you are not able to flip the
params.resource.loader.enabled to true using the config API. If you are not
able to flip the flag and none of your cores have these lib references then
is the risk present?

Thanks in advance!


Re: [ANNOUNCE] Apache Solr 8.8.0 released

2021-02-12 Thread Ishan Chattopadhyaya
Hi all,
This release contains a critical bug, that should be fixed in 8.8.1
shortly. Please avoid upgrading to this release for the moment.
https://twitter.com/ichattopadhyaya/status/1360163382171586562
Apologies for the inconvenience.
Thanks,
Ishan

On Mon, Feb 1, 2021 at 6:01 PM Noble Paul  wrote:

>
> Solr is the popular, blazing fast, open source NoSQL search platform from
> the Apache Lucene project. Its major features include powerful full-text
> search, hit highlighting, faceted search and analytics, rich document
> parsing, geospatial search, extensive REST APIs as well as parallel SQL.
> Solr is enterprise grade, secure and highly scalable, providing fault
> tolerant distributed search and indexing, and powers the search and
> navigation features of many of the world's largest internet sites.
>
> The release is available for immediate download at:
>
>  https://lucene.apache.org/solr/downloads.html
>
> Please read CHANGES.txt for a detailed list of changes:
>
>  https://lucene.apache.org/solr/8_8_0/changes/Changes.html
>
> Solr 8.8.0 Release Highlights:
>
> Reducing overseer bottlenecks using per-replica states. More stability and
> less load on large clusters that use this feature.
>
> Better restart and collection creation performance.
>
> Interleaving support in Learning To Rank
>
> A summary of important changes is published in the Solr Reference Guide at:
>
>  https://lucene.apache.org/solr/guide/8_8/solr-upgrade-notes.html.
>
> For the most exhaustive list, see the full release notes at
> https://lucene.apache.org/solr/8_8_0/changes/Changes.html
>
> or
>
> by viewing the CHANGES.txt file accompanying the distribution. Solr's
> release notes usually don't include Lucene layer changes. Lucene's release
> notes are at
>
> https://lucene.apache.org/core/8_8_0/changes/Changes.html
>
> - -
> Noble Paul
>


Re: SOLR upgrade

2021-02-12 Thread David Hastings
I generally upgrade only every other release. Since I started with
1.4, I went 3 -> 5 -> 7.x, and never EVER a .0 or an even .x release.

On Fri, Feb 12, 2021 at 12:01 PM Ishan Chattopadhyaya <
ichattopadhy...@gmail.com> wrote:

> Just avoid 8.8.0 for the moment, until 8.8.1 is released. 8.7.x should be
> fine.
>

Re: SOLR upgrade

2021-02-12 Thread Ishan Chattopadhyaya
Just avoid 8.8.0 for the moment, until 8.8.1 is released. 8.7.x should be
fine.



Re: SOLR upgrade

2021-02-12 Thread Alessandro Benedetti
Hi,
following up on Charlie's detailed response, I would recommend carefully
assessing the code you are using to interact with Apache Solr (on top of the
Solr changes themselves).
Assuming you are using some sort of client, it's extremely important to
fully understand both the syntax and the semantics of each call.
I have seen a lot of "compiles OK" search-API migrations that were fine
syntactically but a disaster from the semantic perspective (missing
important parameters, etc.).

If you have plugins to maintain, this will be even more complicated
than just making them compile.

Regards
--
Alessandro Benedetti
Apache Lucene/Solr Committer
Director, R&D Software Engineer, Search Consultant
www.sease.io


On Tue, 9 Feb 2021 at 11:01, Charlie Hull wrote:

> Hi Lulu,
>
> I'm afraid you're going to have to recognise that Solr 5.2.1 is very
> out-of-date and the changes between this version and the current 8.x
> releases are significant. A direct jump is I think the only sensible
> option.
>
> Although you could take the current configuration and attempt to upgrade
> it to work with 8.x, I recommend that you should take the chance to look
> at your whole infrastructure (from data ingestion through to query
> construction) and consider what needs upgrading/redesigning for both
> performance and future-proofing. You shouldn't just attempt a
> lift-and-shift of the current setup - some things just won't work and
> some may lock you into future issues. If you're running at large scale
> (I've talked to some people at the BL before and I know you have some
> huge indexes there!) then a redesign may be necessary for scalability
> reasons (cost and feasibility). You should also consider your skills
> base and how the team can stay up to date with Solr changes and modern
> search practice.
>
> Hope this helps - this is a common situation which I've seen many times
> before, you're certainly not the oldest version of Solr running I've
> seen recently either!
>
> best
>
> Charlie
>
> On 09/02/2021 01:14, Paul, Lulu wrote:
> > Hi SOLR team,
> >
> > Please may I ask for advice regarding upgrading the SOLR version (our
> > project currently running on solr-5.2.1) to the latest version?
> > What are the steps, breaking changes and potential issues? Could this
> > be done as an incremental version upgrade or a direct jump to the newest
> > version?
> >
> > Much appreciate the advice, Thank you!
> >
> > Best Wishes
> > Lulu
> >
> >
> >
>
> --
> Charlie Hull - Managing Consultant at OpenSource Connections Limited
> 
> Founding member of The Search Network 
> and co-author of Searching the Enterprise
> 
> tel/fax: +44 (0)8700 118334
> mobile: +44 (0)7767 825828
>


Re: Extremely Small Segments

2021-02-12 Thread Alessandro Benedetti
Hi Yasoob,
Can you check in the log when hard commits really happen?
I have sometimes ended up with the auto soft/hard commit config in the wrong
place in solrconfig.xml, and got unexpected behaviour for that reason.
Your assumptions are correct: the ramBuffer flushes as soon as one of the
thresholds is met for memory/doc count.
For the auto-commit it's the same, but for time/docs.

Are you sure there's no additional commit happening?
Do you see those numbers on all shards/replicas?
Which kind of replica are you using?
Sharding a 10 GB index may not be necessary; do you have any evidence that
you had to shard your index?
Any performance benchmark?

Cheers
--
Alessandro Benedetti
Apache Lucene/Solr Committer
Director, R&D Software Engineer, Search Consultant
www.sease.io


On Fri, 12 Feb 2021 at 13:44, yasoobhaider  wrote:

> Hi
>
> I am migrating from master slave to Solr Cloud but I'm running into
> problems
> with indexing.
>
> Cluster details:
>
> 8 machines of 64GB memory, each hosting 1 replica.
> 4 shards, 2 replica of each. Heap size is 16GB.
>
> Collection details:
>
> Total number of docs: ~250k (but only 50k are indexed right now)
> Size of collection (master slave number for reference): ~10GB
>
> Our collection is fairly heavy with some dynamic fields with high
> cardinality (of order of ~1000s), which is why the large heap size for even
> a small collection.
>
> Relevant solrconfig settings:
>
> commit settings:
>
> 
>   1
>   360
>   false
> 
>
> 
>   ${solr.autoSoftCommit.maxTime:180}
> 
>
> index config:
>
> 500
> 1
>
>  class="org.apache.solr.index.TieredMergePolicyFactory">
>   10
>   10
> 
>
>
> class="org.apache.lucene.index.ConcurrentMergeScheduler">
>  6
>  4
>
>
>
> Problem:
>
> I setup the cloud and started indexing at the throughput of our earlier
> master-slave setup, but soon the machines ran into full blown Garbage
> Collection. This throughput was not a lot though. We index the whole
> collection overnight, so roughly ~250k documents in 6 hours. That's roughly
> 12rps.
>
> So now I'm doing indexing at an extremely slow rate trying to find the
> problem.
>
> Currently I'm indexing at 1 document/2seconds, so every minute ~30
> documents.
>
> Observations:
>
> 1. I'm noticing extremely small segments in the segments UI. Example:
>
> Segment _1h4:
> #docs: 5
> #dels: 0
> size: 1,586,878 bytes
> age: 2021-02-12T11:05:33.050Z
> source: flush
>
> Why is lucene creating such small segments? I understood that segments are
> created when ramBufferSizeMB or maxBufferedDocs limit is hit. Or on a hard
> commit. Neither of those should lead to such small segments.
>
> 2. The index/ directory has a large number of files. For one shard with 30k
> documents & 1.5GB size, there are ~450-550 files in this directory. I
> understand that each segment is composed of a bunch of files. Even
> accounting for that, the number of segments seems very large.
>
> Note: Nothing out of the ordinary in logs. Only /update request logs.
>
> Please help with making sense of the 2 observations above.
>
>
>
> --
> Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html
>


Re: Asymmetric Key Size not sufficient

2021-02-12 Thread Ishan Chattopadhyaya
Recent versions of Solr use 2048.
https://github.com/apache/lucene-solr/blob/branch_8_6/solr/core/src/java/org/apache/solr/util/CryptoKeys.java#L332

Thanks for your report.

On Fri, Feb 12, 2021 at 3:44 PM Mahir Kabir  wrote:

> Hello,
>
> I am a Ph.D. student at Virginia Tech, USA. While working on a security
> project-related work, we came across the following vulnerability in the
> source code -
>
> In file
>
> https://github.com/apache/lucene-solr/blob/branch_6_6/solr/core/src/java/org/apache/solr/util/CryptoKeys.java
> <
> https://github.com/apache/ranger/blob/71e1dd40366c8eb8e9c498b0b5158d85d603af02/kms/src/main/java/org/apache/hadoop/crypto/key/RangerKeyStore.java
> >
> (at
> Line 300) Key Size was set as 1024.
>
> *Security Impact*:
>
> < 2048 key size for RSA algorithm makes the system vulnerable to
> brute-force attack
>
> *Useful resource*:
> https://rules.sonarsource.com/java/type/Vulnerability/RSPEC-4426
>
> *Solution we suggest*:
>
> For RSA algorithm, the key size should be >= 2048
>
> *Please share with us your opinions/comments if there is any*:
>
> Is the bug report helpful?
>
> Please let us know what you think about the issue. Any feedback will be
> appreciated.
>
> Thank you,
> Md Mahir Asef Kabir
> Ph.D. Student
> Department of CS
> Virginia Tech
>


Elevation in dataDir in Solr Cloud

2021-02-12 Thread Mónica Marrero
Hi,

I need to have the elevate.xml file updated frequently and I was wondering
if it is possible to put this file in the dataDir folder when using Solr
Cloud. I know that this is possible in the standalone mode, and I haven't
seen in the documentation [1] that it can not be done in Cloud.

I am using Solr 7.7.2 and ZooKeeper. After creating the techproducts
collection for the tests, I remove the elevate.xml file from the
configuration and I put it in the dataDir folder of the cores. When I
update the collection with that configuration, I get the following error:
"Can't find resource 'elevate.xml' in classpath or
'/configs/techproductsConfExp'". Is this expected or I am doing something
wrong?

Thanks a lot for your help.

[1]
https://lucene.apache.org/solr/guide/7_7/the-query-elevation-component.html



Re: Why Solr questions on stackoverflow get very few views and answers, if at all?

2021-02-12 Thread Alexandre Rafalovitch
I answered quite a bunch a while ago, as part of a book-writing process.

I think a lot of them were missing core information like the version of Solr.
So they were not very timeless.

The list allows a conversation and multiple perspectives, which is better
than a one shot answer.

Regards,
   Alex

On Fri., Feb. 12, 2021, 5:37 a.m. ufuk yılmaz, 
wrote:

> Is it because the main place for q&a is this mailing list, or somewhere
> else that I don’t know?
>
> Or Solr isn’t ‘hot’ as some other topics?
>
> Sent from Mail for Windows 10
>
>


Extremely Small Segments

2021-02-12 Thread yasoobhaider
Hi

I am migrating from master slave to Solr Cloud but I'm running into problems
with indexing.

Cluster details:

8 machines of 64GB memory, each hosting 1 replica.
4 shards, 2 replicas of each. Heap size is 16GB.

Collection details:

Total number of docs: ~250k (but only 50k are indexed right now)
Size of collection (master slave number for reference): ~10GB

Our collection is fairly heavy with some dynamic fields with high
cardinality (of order of ~1000s), which is why the large heap size for even
a small collection.

Relevant solrconfig settings:

commit settings:


  1
  360
  false



  ${solr.autoSoftCommit.maxTime:180}


index config:

500
1


  10
  10



   
 6
 4
   


Problem:

I set up the cloud and started indexing at the throughput of our earlier
master-slave setup, but the machines soon ran into full-blown garbage
collection. This throughput was not high, though. We index the whole
collection overnight, so roughly ~250k documents in 6 hours. That's roughly
12 rps.

So now I'm indexing at an extremely slow rate while trying to find the
problem.

Currently I'm indexing at 1 document every 2 seconds, so ~30 documents per
minute.

Observations:

1. I'm noticing extremely small segments in the segments UI. Example:

Segment _1h4:
#docs: 5
#dels: 0
size: 1,586,878 bytes
age: 2021-02-12T11:05:33.050Z
source: flush

Why is lucene creating such small segments? I understood that segments are
created when ramBufferSizeMB or maxBufferedDocs limit is hit. Or on a hard
commit. Neither of those should lead to such small segments.
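
The figures above can be turned into a rough estimate of how often segments are being flushed. This is only back-of-the-envelope arithmetic on the numbers quoted in this message (250k documents in 6 hours, 1 document every 2 seconds, a 5-document segment of 1,586,878 bytes). The helper names are hypothetical, and the per-shard figure assumes documents are spread evenly over the 4 shards.

```python
def docs_per_second(total_docs, hours):
    """Average indexing rate for a full overnight run."""
    return total_docs / (hours * 3600)


def seconds_to_fill_segment(docs_in_segment, docs_per_sec, shards=1):
    """Time needed for one shard to receive docs_in_segment documents,
    assuming documents are spread evenly across the shards."""
    per_shard_rate = docs_per_sec / shards
    return docs_in_segment / per_shard_rate


if __name__ == "__main__":
    full_rate = docs_per_second(250_000, 6)
    print(f"full-speed rate: {full_rate:.1f} docs/s")

    slow_rate = 1 / 2  # 1 document every 2 seconds
    print(seconds_to_fill_segment(5, slow_rate))            # 10.0
    print(seconds_to_fill_segment(5, slow_rate, shards=4))  # 40.0

    print(f"bytes per doc in segment _1h4: {1_586_878 / 5:.0f}")
```

Under these assumptions a 5-document segment corresponds to somewhere between 10 and 40 seconds of indexing, so something appears to be flushing on a timescale of tens of seconds rather than when a buffer limit is reached.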

2. The index/ directory has a large number of files. For one shard with 30k
documents & 1.5GB size, there are ~450-550 files in this directory. I
understand that each segment is composed of a bunch of files. Even
accounting for that, the number of segments seems very large.

Note: Nothing out of the ordinary in logs. Only /update request logs.

Please help with making sense of the 2 observations above.



--
Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html



Re: Why Solr questions on stackoverflow get very few views and answers, if at all?

2021-02-12 Thread Charlie Hull
I've answered a few in my time, but my experience is that if you do so 
you then get emailed a whole load more questions some of which aren't 
even relevant to Solr! Also, quite a few of them are 'here is 3 pages of 
code please debug it for me no I won't tell the actual error I got'.


This is the best place to come,  also there's the IRC channel, the new 
Slack gateway to this at https://s.apache.org/solr-slack and in our own 
Relevance Slack at http://opensourceconnections.com/slack there's a 
#solr channel (as well as many others on search & relevance topics).


Solr is 'hot' (but not as hot as Elasticsearch), and search is still a 
niche business overall.


HTH

Cheers

Charlie

On 12/02/2021 10:37, ufuk yılmaz wrote:

Is it because the main place for q&a is this mailing list, or somewhere else 
that I don’t know?

Or Solr isn’t ‘hot’ as some other topics?

Sent from Mail for Windows 10




--
Charlie Hull - Managing Consultant at OpenSource Connections Limited 

Founding member of The Search Network  
and co-author of Searching the Enterprise 


tel/fax: +44 (0)8700 118334
mobile: +44 (0)7767 825828


Why Solr questions on stackoverflow get very few views and answers, if at all?

2021-02-12 Thread ufuk yılmaz
Is it because the main place for Q&A is this mailing list, or somewhere else 
that I don’t know?

Or is Solr just not as ‘hot’ as some other topics?

Sent from Mail for Windows 10



Asymmetric Key Size not sufficient

2021-02-12 Thread Mahir Kabir
Hello,

I am a Ph.D. student at Virginia Tech, USA. While working on a
security-related project, we came across the following vulnerability in the
source code -

In file
https://github.com/apache/lucene-solr/blob/branch_6_6/solr/core/src/java/org/apache/solr/util/CryptoKeys.java

(at line 300), the key size was set to 1024.

*Security Impact*:

An RSA key size below 2048 bits makes the system vulnerable to
brute-force attacks.

*Useful resource*:
https://rules.sonarsource.com/java/type/Vulnerability/RSPEC-4426

*Solution we suggest*:

For RSA algorithm, the key size should be >= 2048
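
A check of this kind can be approximated with a simple text scan. This is a minimal sketch, not the analysis used for this report: it looks for literal sizes passed to an initialize(...) call, as on Java's KeyPairGenerator, and flags anything below 2048. The function name is hypothetical, the sample snippet is illustrative rather than the actual CryptoKeys.java source, and the pattern will also match unrelated initialize calls that take a numeric literal.

```python
import re

MIN_RSA_BITS = 2048
# Matches calls such as gen.initialize(1024) or gen.initialize( 1024, random).
INIT_CALL = re.compile(r"\.initialize\(\s*(\d+)\b")


def weak_key_sizes(java_source):
    """Return (line_number, bits) for literal key sizes below MIN_RSA_BITS."""
    findings = []
    for number, line in enumerate(java_source.splitlines(), start=1):
        for match in INIT_CALL.finditer(line):
            bits = int(match.group(1))
            if bits < MIN_RSA_BITS:
                findings.append((number, bits))
    return findings


if __name__ == "__main__":
    # Hypothetical Java snippet for illustration only.
    sample = (
        'KeyPairGenerator gen = KeyPairGenerator.getInstance("RSA");\n'
        "gen.initialize(1024);\n"
    )
    print(weak_key_sizes(sample))  # [(2, 1024)]
```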

*Please share with us your opinions/comments if there is any*:

Is the bug report helpful?

Please let us know what you think about the issue. Any feedback will be
appreciated.

Thank you,
Md Mahir Asef Kabir
Ph.D. Student
Department of CS
Virginia Tech