I'm moving a content management system's Lucene library over to Solr to reap the benefits, but along the way I'm meeting some problems which I imagine affect everyone doing the same kind of thing.
I realise that Solr began life as something which was primarily designed to be used over HTTP or with non-Java clients. That explains the use of String name-value maps and other String representations in the outer API. However, constructs like NamedList are used right down into the code - e.g. in clever classes like SimpleFacets - which means that you might as well not be using an object oriented language at all. Instead of being able to use an IDE to tell at a glance what's inside facetInfo or highlightingInfo, for example, you have to resort to reading the Wiki or to searching code for all instances of "rsp.add(...)". Having written software like this in the past, in which object structures are in developers' heads rather than in the code, I bet it's made things more difficult along the way.
I get the impression that Solr is ready for a bit of refactoring to give it a more Java-friendly API. This API should be the primary means of access into Solr functionality; it should explicitly model searches (i.e. filters plus queries plus sorts plus facet and highlighting cues), search results, hits (SolrHit which has a SolrDocument plus scoring info, by way of analogy with Lucene Hit) and hit documents (i.e. SolrDocument, so that's already fine). This API should be used _by_ the String-oriented request handlers, not the other way round; request handlers (and all uses of NamedList) should be reserved for implementations of that API which deal with non-Java-native clients. At the moment, the non-Java use cases are calling the shots in the Java implementation, and that seems a pity.
Some of these considerations are clearly driving the implementation of org.apache.solr.client.solrj, which is an important development - I bet that's where most people start with Solr now.
But I think two things need to happen here: (i) the solrj work should be moved into org.apache.solr, because with the right API at the server end you don't _need_ any code for a Java client - it would just call into the API, and would be a 'client' only in the sense that any caller of a method is that method's client; and (ii) the API which is currently in org.apache.solr.client.solrj should be using the kinds of classes I listed above, with UpdateResponse etc. containing fields and getters which model what's actually returned (and do so without recourse to NamedList).
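To make this concrete, here is a minimal sketch of the sort of API I mean. None of it is existing Solr code: Search, SearchResult, SearchService and fromParams are hypothetical names, SolrHit is the class proposed above, and a plain Map stands in for SolrDocument so that the example compiles without Solr on the classpath. The parameter names q, fq, sort and facet.field are the usual Solr request parameters.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public final class TypedSearchSketch {

    /** Hypothetical typed request: query, filters, sort and facet cues. */
    static final class Search {
        final String query;
        final List<String> filters;
        final String sort;               // raw sort spec; null means "by score"
        final List<String> facetFields;

        Search(String query, List<String> filters, String sort, List<String> facetFields) {
            this.query = query;
            this.filters = Collections.unmodifiableList(filters);
            this.sort = sort;
            this.facetFields = Collections.unmodifiableList(facetFields);
        }
    }

    /** Hypothetical hit: a document plus its score, by analogy with Lucene's Hit. */
    static final class SolrHit {
        final Map<String, Object> document;   // stand-in for SolrDocument
        final float score;

        SolrHit(Map<String, Object> document, float score) {
            this.document = document;
            this.score = score;
        }
    }

    /** Hypothetical typed result: an IDE can show exactly what is in here. */
    static final class SearchResult {
        final List<SolrHit> hits;
        final Map<String, Map<String, Integer>> facetCounts;   // field -> value -> count

        SearchResult(List<SolrHit> hits, Map<String, Map<String, Integer>> facetCounts) {
            this.hits = hits;
            this.facetCounts = facetCounts;
        }
    }

    /** The Java-first entry point (hypothetical). */
    interface SearchService {
        SearchResult search(Search search);
    }

    /** The role left to a String-oriented request handler: parse, then delegate. */
    static Search fromParams(Map<String, String[]> params) {
        String[] q = params.get("q");
        if (q == null || q.length == 0) {
            throw new IllegalArgumentException("missing q parameter");
        }
        String[] sort = params.get("sort");
        return new Search(q[0], values(params, "fq"),
                sort == null || sort.length == 0 ? null : sort[0],
                values(params, "facet.field"));
    }

    private static List<String> values(Map<String, String[]> params, String name) {
        String[] v = params.get(name);
        return v == null ? Collections.<String>emptyList() : Arrays.asList(v);
    }

    public static void main(String[] args) {
        Map<String, String[]> params = new LinkedHashMap<>();
        params.put("q", new String[] {"title:lucene"});
        params.put("fq", new String[] {"type:article"});
        params.put("facet.field", new String[] {"author"});
        Search search = fromParams(params);

        // Stub service returning canned data; a real one would run the search.
        SearchService service = s -> {
            Map<String, Object> doc = new LinkedHashMap<>();
            doc.put("id", "doc-1");
            Map<String, Integer> authors = new LinkedHashMap<>();
            authors.put("jon", 1);
            return new SearchResult(
                    Collections.singletonList(new SolrHit(doc, 1.0f)),
                    Collections.singletonMap(s.facetFields.get(0), authors));
        };

        SearchResult result = service.search(search);
        for (SolrHit hit : result.hits) {
            System.out.println(hit.document.get("id") + " scored " + hit.score);
        }
        System.out.println("author facets: " + result.facetCounts.get("author"));
    }
}
```

The point of the shape is the direction of the dependency: the String handling lives in fromParams at the edge, and everything behind SearchService deals in typed objects, so a Java caller never needs to know which keys a NamedList happens to contain.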
I realise that some of this is already happening, but I think with 1.3 still in its early stages now might be a good time to go the whole way. With a more heavily modelled and self-documenting API in place, people would find it a lot easier to develop Solr integrations, and I expect it would speed up the process of developing new core Solr functionality.
Any thoughts?
Jon