Re: Solr security

Matthias Epheser Mon, 17 Nov 2008 06:42:02 -0800

Erik Hatcher wrote:

On Nov 16, 2008, at 6:18 PM, Ryan McKinley wrote:
my assumption with solrjs is that you are hitting "read-only" solrservers that you don't mind if people query directly.
Exactly the assumption I'm going with too.
It would not be appropriate for something where you don't want people (who really care) to know you are running solr and could execute arbitrary queries.
Since it is an example, I don't mind leaving the /admin interface open on:
http://example.solrstuff.org/solrjs/admin/
but /update has a password:
http://example.solrstuff.org/solrjs/update
I have said in the past I like the idea of a "read-only" flag in solrconfig that would throw an error if you try to do something with the UpdateHandler. However there are other ways to do that also.

As the thoughts and ideas of this thread are spread over several emails, let me just drop my uncoordinated thoughts here:

For solrjs, what exactly is the information solr has to provide "directly"?

- We need data for several widgets. In 99% of the cases this data will be some facet information and/or result docs. The result docs will be in suitable ranges; no webpage will display 100000+ result items at the same time.

- So "potentially dangerous" request params like rows>1000, or handlers other than StandardRequest, may be blocked.


- update handlers and admin interface shouldn't be exposed.
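
The three points above could be expressed as one small request check. This is only a sketch under assumptions: the class name `SolrJsRequestPolicy`, the limit of 1000 rows, and the choice to expose nothing but `/select` are illustrative and not part of solr. It also rejects the `qt` parameter, which can select a different request handler.

```java
import java.util.Map;

/** Hypothetical policy for requests coming from a solrjs client (not a solr API). */
final class SolrJsRequestPolicy {
    // Assumed limit, taken from the "rows>1000" example above.
    static final int MAX_ROWS = 1000;

    /**
     * @param path   request path relative to the solr webapp, e.g. "/select"
     * @param params request parameters, shaped like a servlet parameter map
     * @return true if the request looks like a harmless read-only query
     */
    static boolean isAllowed(String path, Map<String, String[]> params) {
        // Only the standard search path is exposed; /update, /admin and
        // every other path are rejected.
        if (!"/select".equals(path)) {
            return false;
        }
        // "qt" can route the request to another handler, so refuse it here.
        if (params.containsKey("qt")) {
            return false;
        }
        String[] rows = params.get("rows");
        if (rows != null) {
            for (String value : rows) {
                if (value == null) {
                    return false;
                }
                try {
                    int n = Integer.parseInt(value.trim());
                    if (n < 0 || n > MAX_ROWS) {
                        return false;
                    }
                } catch (NumberFormatException e) {
                    return false; // not a number at all
                }
            }
        }
        return true;
    }
}
```

A filter or proxy would call `isAllowed` before forwarding the request and answer with an error status otherwise.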

Like others mentioned before, I'm not sure this is a task that *has* to be solved inside Solr. As a standalone servlet, it is very likely that it is NOT accessible directly in a production environment.

Hiding or password protecting update/admin is an easy task using a proxy like Apache HTTP Server. It could also be solved by a configurable ServletFilter delivered with solr and initialized inside solr's web.xml. To separate the concerns, I think it should not be coded "deeper" inside the solr code. The idea of a "read-only" server can be implemented like that. Optional update urls that are only accessible from inside a firewall may also be present.

This servlet filter may also check the request params for things that are not needed for solrjs and potentially dangerous. It even may check how frequently urls are accessed (thinking about DoS).
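
The frequency check could look roughly like the sliding-window counter below. It is a sketch, not an existing solr component: the class name `RequestRateLimiter`, keying by client address, and the limits passed to the constructor are all assumptions. The current time is passed in as an argument so the logic can be exercised without waiting.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical per-client request counter; not part of solr. */
final class RequestRateLimiter {
    private final int maxRequests;
    private final long windowMillis;
    // Timestamps of recently accepted requests, per client address.
    private final Map<String, Deque<Long>> recent = new HashMap<>();

    RequestRateLimiter(int maxRequests, long windowMillis) {
        this.maxRequests = maxRequests;
        this.windowMillis = windowMillis;
    }

    /** Returns true if the client may proceed at the given time. */
    synchronized boolean tryAcquire(String clientAddress, long nowMillis) {
        Deque<Long> times =
                recent.computeIfAbsent(clientAddress, k -> new ArrayDeque<>());
        // Drop timestamps that have fallen out of the sliding window.
        while (!times.isEmpty() && nowMillis - times.peekFirst() >= windowMillis) {
            times.pollFirst();
        }
        if (times.size() >= maxRequests) {
            return false; // too many requests inside the window
        }
        times.addLast(nowMillis);
        return true;
    }
}
```

A real filter would also have to evict clients that have gone idle, since this map only ever grows.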

I think even if it looks like direct access, using solrjs doesn't have to be different from "common" solr webapps. Usually these apps take user input, a web application translates this input into a solr query and translates the result into a suitable client format. Other solr stuff is blocked indirectly because only this app has access to solr. Now the last 2 steps are done inside the client. But if we block stuff that isn't used by the client, we are in control of what may happen.

If that isn't secure enough, the more complicated solution would be to create a stateful servlet that holds the query state of a client, so that solrjs only performs /select/solrjs/?new_query=city:vienna or something. Then the query generation and all solr related stuff happens again on the server.
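
To make the stateful variant a bit more concrete, here is a minimal sketch of the state such a servlet might keep per client. Everything in it is an assumption for illustration: the class name `ClientQueryState`, the restriction to simple field:value terms, joining terms with AND, and the fixed rows value of 10. The point is only that the client sends a small fragment and the server builds the actual solr parameters.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical per-session query state kept on the server; not a solr class. */
final class ClientQueryState {
    // Terms accepted so far, e.g. "city:vienna".
    private final List<String> terms = new ArrayList<>();

    /** Adds the value of a new_query parameter; returns false if it is refused. */
    boolean addQuery(String newQuery) {
        // Accept only a plain field:value pair, nothing that could smuggle
        // in extra parameters or query syntax.
        if (newQuery == null
                || !newQuery.matches("[A-Za-z_][A-Za-z0-9_]*:[A-Za-z0-9_]+")) {
            return false;
        }
        terms.add(newQuery);
        return true;
    }

    /** Builds the parameters the server itself would send to solr. */
    Map<String, String> toSolrParams() {
        Map<String, String> params = new LinkedHashMap<>();
        // "*:*" matches all documents when nothing has been selected yet.
        params.put("q", terms.isEmpty() ? "*:*" : String.join(" AND ", terms));
        params.put("rows", "10"); // chosen by the server, never by the client
        return params;
    }
}
```

With this shape the client never sees or controls rows, the handler, or any other solr parameter.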

I think it should be easy to deliver this SecuritySolrFilter with the standard solr distribution, making it configurable so the user can decide which urls are blocked/password protected and which request parameters should be checked for illegal values. On the other hand, existing firewalls and proxies of the destination system may be used. Therefore some "best-practices" may be helpful in the solr wiki.


I would be happy to help implement a standard security filter for solr.

WDYT?

regards,
matthias
