Erik Hatcher wrote:

On Nov 16, 2008, at 6:18 PM, Ryan McKinley wrote:

my assumption with solrjs is that you are hitting "read-only" solr servers that you don't mind if people query directly.

Exactly the assumption I'm going with too.

It would not be appropriate for something where you don't want people (who really care) to know you are running solr and could execute arbitrary queries.

Since it is an example, I don't mind leaving the /admin interface open on:
http://example.solrstuff.org/solrjs/admin/
but /update has a password:
http://example.solrstuff.org/solrjs/update

I have said in the past I like the idea of a "read-only" flag in solr config that would throw an error if you try to do something with the UpdateHandler. However there are other ways to do that also.


As the thoughts and ideas of this thread are spread across several emails, let me just drop my uncoordinated thoughts here:

For solrjs, what exactly does solr have to provide "directly"?

- We need data for several widgets. In 99% of the cases this data will be facet information and/or result docs. The result docs will come in reasonable ranges; no webpage will display 100000+ result items at the same time.

- So "potentially dangerous" request params like rows>1000, or handlers other than the standard request handler, may be blocked.

- update handlers and admin interface shouldn't be exposed.
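
A rough sketch in Java of the kind of check the list above describes. The class name, the /select allow-list and the limit of 1000 rows are assumptions for illustration only and do not exist in solr; qt is the solr parameter that selects a request handler, so the sketch simply rejects it.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical policy check; none of these names are part of solr. */
final class SolrRequestPolicy {
    // Assumed values, taken from the examples in the text above.
    static final int MAX_ROWS = 1000;
    static final Set<String> ALLOWED_PATHS = Set.of("/select");

    /** Returns true if a request to this path with these parameters may pass. */
    static boolean isAllowed(String path, Map<String, List<String>> params) {
        // Anything that is not an allowed path (/update, /admin, ...) is blocked.
        if (!ALLOWED_PATHS.contains(path)) {
            return false;
        }
        // qt could route the request to a different handler, so reject it.
        if (params.containsKey("qt")) {
            return false;
        }
        // Every rows value must be a number between 0 and MAX_ROWS.
        for (String value : params.getOrDefault("rows", List.of())) {
            try {
                int rows = Integer.parseInt(value.trim());
                if (rows < 0 || rows > MAX_ROWS) {
                    return false;
                }
            } catch (NumberFormatException e) {
                return false;
            }
        }
        return true;
    }
}
```

A servlet filter or a proxy rule could call something like isAllowed() with the request path and parameter map and answer with an error status when it returns false.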


Like others mentioned before, I'm not sure this is a task that *has* to be solved inside Solr. As a standalone servlet, solr is very likely NOT directly accessible in a production environment.

Hiding or password protecting update/admin is an easy task using a proxy like the Apache HTTP Server. It could also be solved by a configurable ServletFilter delivered with solr and initialized inside solr's web.xml. To separate the concerns, I think it should not be coded "deeper" inside the solr code. The idea of a "read-only" server can be implemented like that. Optional update urls that are only reachable from inside a firewall may also be present.

This servlet filter may also check the request params for things that are not needed for solrjs and potentially dangerous. It even may check how frequently urls are accessed (thinking about DoS).
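
The frequency check could be as simple as a fixed-window counter per client. The following is only a sketch: the class and method names are made up, the limits are arbitrary, and the time is passed in explicitly to keep the example easy to test.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical fixed-window request counter; not part of solr. */
final class RequestRateLimiter {
    private final int maxRequests;
    private final long windowMillis;
    // Client key (for example the remote address) -> {window start, request count}.
    private final Map<String, long[]> windows = new HashMap<>();

    RequestRateLimiter(int maxRequests, long windowMillis) {
        this.maxRequests = maxRequests;
        this.windowMillis = windowMillis;
    }

    /** Counts one request and returns false once the client is over the limit. */
    synchronized boolean tryAcquire(String client, long nowMillis) {
        long[] window = windows.get(client);
        // Start a fresh window for unknown clients or when the old one has expired.
        if (window == null || nowMillis - window[0] >= windowMillis) {
            window = new long[] {nowMillis, 0};
            windows.put(client, window);
        }
        window[1]++;
        return window[1] <= maxRequests;
    }
}
```

A filter would call tryAcquire(remoteAddress, System.currentTimeMillis()) per request. Note that this sketch never evicts old entries, so a real implementation would need some cleanup.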

I think even if it looks like direct access, using solrjs doesn't have to be different from "common" solr webapps. Usually these apps take user input, a web application translates this input into a solr query and translates the result into a suitable client format. Other solr stuff is blocked indirectly because only this app has access to solr. Now the last 2 steps are done inside the client. But if we block stuff that isn't used by the client, we are in control of what may happen.

If that isn't secure enough, the more complicated solution would be to create a stateful servlet that holds the query state of a client, so that solrjs only performs /select/solrjs/?new_query=city:vienna or something. Then the query generation and all solr related stuff happens on the server again.
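
Very roughly, the server side of such a stateful servlet might keep something like the following per client. All names here are hypothetical, the fixed rows value is an assumption, and the sketch does not validate the query items it receives.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical per-client query state held on the server; not part of solr or solrjs. */
final class ClientQueryState {
    // Assumed page size, chosen by the server and never by the client.
    private static final int ROWS = 10;
    private final List<String> queries = new ArrayList<>();

    /** Adds one client-supplied query item such as "city:vienna". */
    void addQuery(String item) {
        queries.add(item);
    }

    /** Builds the query string the server would send to solr's /select handler. */
    String toSolrQueryString() {
        // With no items yet, fall back to the match-all query.
        String q = queries.isEmpty() ? "*:*" : String.join(" AND ", queries);
        return "q=" + URLEncoder.encode(q, StandardCharsets.UTF_8) + "&rows=" + ROWS;
    }
}
```

The client only ever sends items like new_query=city:vienna; everything else in the request to solr is decided on the server.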

I think it should be easy to ship this SecuritySolrFilter with the standard solr distribution, making it configurable so the user can decide which urls are blocked/password protected and which request parameters should be checked for illegal values. On the other hand, existing firewalls and proxies of the destination system may be used. Therefore some "best-practices" may be helpful in the solr wiki.

I would be happy to help implement a standard security filter for solr.

WDYT?

regards,
matthias
