(Note: this is different than what I have suggested before. Treat it as brainstorming on how to take what I have suggested and mesh it with your concerns.)
What if:

The RequestParser is not part of the core API. It would be a helper for the Servlets and Filters that call the core API, and it could be configured in web.xml rather than solrconfig.xml. A RequestDispatcher (Servlet or Filter) would be configured with a single RequestParser.

The RequestParser would be in charge of taking the HttpRequest and determining:
 1) The RequestHandler
 2) The SolrRequest (Params & Streams)

It would not be the most 'pluggable' of plugins, but I am still having trouble imagining anything beyond a single default RequestParser. I assume anything that needs a *really* complex way of extracting ContentStreams will do it in the Handler, not the request parser. For reference, see my argument for a separate DocumentParser interface in:
http://www.nabble.com/Re%3A-Update-Plugins-%28was-Re%3A-Handling-disparate-data-sources-in-Solr%29-p8386161.html

In my view, the default one could be mapped to "/*" and a custom one could be mapped to "/mycustomparser/*".

This would drop the ':' from my proposed URL and change the scheme to look like:
 /parser/path/the/parser/knows/how/to/extract/?params

This would give people a relatively easy way to implement 'restful' URLs if they need to (but they would have to edit web.xml).
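A minimal sketch of how that prefix mapping could behave, assuming the servlet container does the equivalent of a longest-prefix match. ParserMapping, the parser names, and resolve() are all made up for illustration; none of them are existing Solr classes, and in practice the container's url-pattern matching in web.xml would do this work.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: pick a parser by URL prefix, much as web.xml url-patterns would. */
class ParserMapping {
    // prefix -> parser name; "" plays the role of the default "/*" mapping
    private final Map<String, String> parsersByPrefix = new LinkedHashMap<>();

    public void map(String prefix, String parserName) {
        parsersByPrefix.put(prefix, parserName);
    }

    /** Returns {parserName, remainingPath}; the longest matching prefix wins. */
    public String[] resolve(String path) {
        String best = null;
        for (String prefix : parsersByPrefix.keySet()) {
            boolean matches = prefix.isEmpty()
                    || path.equals(prefix)
                    || path.startsWith(prefix + "/");
            if (matches && (best == null || prefix.length() > best.length())) {
                best = prefix;
            }
        }
        if (best == null) {
            throw new IllegalArgumentException("no parser mapped for " + path);
        }
        // Whatever follows the prefix is left for the chosen parser to interpret.
        return new String[] { parsersByPrefix.get(best), path.substring(best.length()) };
    }
}
```

With "" mapped to a default parser and "/mycustomparser" mapped to a custom one, a request for /mycustomparser/a/b would go to the custom parser with /a/b left over, and anything else would fall through to the default.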
: Would that be configured in solrconfig.xml as <handler name="xml"?
: name="update/xml"? If it is "update/xml" would it only really work if
: the 'update' servlet were configured properly?

it would only make sense to map that as "xml" ... the SolrCore (and the solrconfig.xml) shouldn't have any knowledge of the Servlet/ServletFilter base paths because it should be possible to use the SolrCore independent of any ServletContainer (if for no other reason in unit tests)
Correct, SolrCore should not care what the request path is. That is why I want to deprecate the execute( ) function that assumes the handler is defined by 'qt'.

Unit tests should be handled by:
 execute( handler, req, res )

If I had my druthers, it would be:
 res = handler.execute( req )
but that is too big of a leap for now :)
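To make the difference concrete, here is a rough sketch that contrasts the 'qt'-based lookup with an execute() that takes the handler explicitly. CoreSketch and its Handler interface are stand-ins invented for this example, with a plain Map and StringBuilder in place of the real request and response types; they are not the actual SolrCore API.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the core; not the real SolrCore. */
class CoreSketch {
    /** Stand-in for a request handler. */
    public interface Handler {
        void handleRequest(Map<String, String> req, StringBuilder rsp);
    }

    private final Map<String, Handler> handlers = new HashMap<>();

    public void register(String name, Handler handler) {
        handlers.put(name, handler);
    }

    public Handler getHandler(String name) {
        return handlers.get(name);
    }

    /** Old style: the core itself picks the handler from the 'qt' parameter. */
    public void execute(Map<String, String> req, StringBuilder rsp) {
        Handler handler = handlers.get(req.get("qt"));
        if (handler == null) {
            throw new IllegalArgumentException("unknown handler: " + req.get("qt"));
        }
        execute(handler, req, rsp);
    }

    /** Proposed style: the caller (servlet, filter, or unit test) has already chosen the handler. */
    public void execute(Handler handler, Map<String, String> req, StringBuilder rsp) {
        handler.handleRequest(req, rsp);
    }
}
```

A unit test can then look up or construct a handler directly and pass it in, without the core knowing anything about request paths or 'qt'.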
... A third use case of doing queries with POST might be that you want to use standard CGI form encoding/multi-part file upload semantics of HTTP to send an XML file (or files) to the above mentioned XmlQPRequestHandler ... so then we have "MultiPartMimeRequestParser" ...
I agree with all your use cases. It just seems like a LOT of complex overhead to extract the general aspects of translating a URL+Params+Streams => Handler+Request(Params+Streams).

Again, since the number of 'RequestParsers' is small, it seems overly complex to have a separate plugin to extract the URL, another to extract the Handler, and another to extract the streams, particularly since the decisions on how you parse the URL can totally affect the other aspects.
...i really, really, REALLY don't like the idea that the RequestParser Impls -- classes users should be free to write on their own and plug in to Solr using the solrconfig.xml -- are responsible for the URL parsing and parameter extraction. Maybe calling them "RequestParser" in my suggested design is misleading, maybe a name like "StreamExtractor" would be better ... but they shouldn't be in charge of doing anything with the URL.
What if it were configured in web.xml, would you feel more comfortable letting it determine how the URL is parsed and streams are extracted?
Imagine if 3 years ago, when Yonik and I were first hammering out the API for SolrRequestHandlers, we had picked this...

  public interface SolrRequestHandlers extends SolrInfoMBean {
    public void init(NamedList args);
    public void handleRequest(HttpServletRequest req, SolrQueryResponse rsp);
  }
Thank goodness you didn't! I'm confident you won't let me (or anyone) talk you into something like that! You guys made a lot of good choices and solr is an amazing platform for it.

That said, the task at issue is: how do we convert an arbitrary HttpServletRequest into a SolrRequest?

I am proposing we have a single interface to do this:
  SolrRequest r = RequestParser.parse( HttpServletRequest )

You are proposing this be broken down further. Something like:
  Handler h = (the filter) getHandler( req.getPath() )
  SolrParams = (the filter) do stuff to extract the params (using parser.preProcess())
  ContentStreams = parser.parse( request )

While it is not great to have plugins manipulate the HttpRequest, someone needs to do it. In my opinion, the RequestParser's job is to isolate *everything* *else* from the HttpServletRequest.

Again, since the number of RequestParsers is small, it seems ok (to me).
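A sketch of the single-interface shape, under some loud assumptions: HttpLikeRequest stands in for HttpServletRequest so the example compiles without the servlet API, ParsedRequest stands in for a SolrRequest plus the chosen handler name, and DefaultRequestParser is an invented example of one possible default. It does no URL decoding and treats the whole body as one stream, which a real parser would need to handle properly.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of the single-interface idea; none of these are real Solr types. */
class RequestParserSketch {
    /** Stand-in for HttpServletRequest. */
    public static final class HttpLikeRequest {
        final String path;
        final String query;
        final String body;

        public HttpLikeRequest(String path, String query, String body) {
            this.path = path;
            this.query = query;
            this.body = body;
        }
    }

    /** Stand-in for the result: handler name plus params and content streams. */
    public static final class ParsedRequest {
        public final String handlerName;
        public final Map<String, String> params;
        public final List<String> streams;

        ParsedRequest(String handlerName, Map<String, String> params, List<String> streams) {
            this.handlerName = handlerName;
            this.params = params;
            this.streams = streams;
        }
    }

    /** The one place that touches the HTTP request. */
    public interface RequestParser {
        ParsedRequest parse(HttpLikeRequest req);
    }

    /** One possible default: handler from the path, params from the query, body as a stream. */
    public static final class DefaultRequestParser implements RequestParser {
        @Override
        public ParsedRequest parse(HttpLikeRequest req) {
            String handler = req.path.startsWith("/") ? req.path.substring(1) : req.path;

            Map<String, String> params = new LinkedHashMap<>();
            if (req.query != null && !req.query.isEmpty()) {
                for (String pair : req.query.split("&")) {
                    int eq = pair.indexOf('=');
                    // No URL decoding here; a real parser would decode keys and values.
                    String key = eq < 0 ? pair : pair.substring(0, eq);
                    String value = eq < 0 ? "" : pair.substring(eq + 1);
                    params.put(key, value);
                }
            }

            List<String> streams = new ArrayList<>();
            if (req.body != null && !req.body.isEmpty()) {
                streams.add(req.body);
            }
            return new ParsedRequest(handler, params, streams);
        }
    }
}
```

Everything downstream of parse() sees only the handler name, params, and streams, which is the isolation from HttpServletRequest argued for above.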
keeping HttpServletRequest out of the API for RequestParsers helps us future-proof against breaking plugins down the road.
I agree. This is why I suggest the RequestParser not be a core part of the API, just a helper class for Servlets and Filters.

ryan