Ok... Well I did try that. I think that can be done as well. IMO schemas should be avoided with realtime; otherwise there is a nightmare with schema versions. The current config files would not be used. How do you propose not integrating those things? It would seem to create a strange, non-overlapping system within SOLR. Then I begin to wonder what SOLR is being used for here. Is it the RequestHandlers? But those don't support optimistic concurrency. They do support optimize and commit, which would need to be turned off. Is this ok for the user?

I actually think the XML-based RequestHandlers (or even binary ones) are not as powerful as basic object serialization. For example, at a previous company I wrote a lot of code to do XML-based span queries. That was pretty useless given I should have just serialized the span queries and sent them into SOLR. But then what was SOLR doing in that case? I would have needed to write a request handler to handle serialized queries, but over HTTP? HTTP doesn't scale in grid computing. So these are some of the things I have thought about that are unclear right now.

Also, Payloads: does one need to write a custom RequestHandler or SearchComponent to handle custom Payloads? Using serialization I could just write the code, and it would be dynamically loaded by the server, executed, and return a result as if the server were local. All in 1/10 the time it would take to write some custom RequestHandler. If the deployment had 100 servers, would each RequestHandler I am testing out require a reboot of each server, each time? That is extremely inefficient.

Search server systems always grow larger, and my concern is that SOLR is adding features at a level that does not scale in grid computing: every little new feature delays releases, needs testing, and is probably something 50% of the users don't need and will never use. It would be better IMO to have a clean separation between the core search server and everything else. This is the architecture I decided to go with in Ocean. If I want new functionality, I write a class that executes remotely on all the servers and returns any object I want (see the sketch below). The class directly accesses the individual IndexReader of each index. I don't have to reboot anything, deploy a new WAR, do a bunch of testing, etc. The XML interface should live at the server that performs the distributed search, rather than at each server node, because that is where the search results meet the real application.

I guess I have found the current model for SOLR to be somewhat flawed. It's not anyone's fault, because SOLR is also a major step forward for Lucene. However, a lot of the delay in new releases is because everyone is adding anything and everything they want into it, which should not really be the case if we want to move forward with new core features such as realtime. I think the facets are another example: the code is currently tied to receiving an HTTP call via SolrParams, which are strings. That makes the code non-reusable in other projects. It could be rewritten and used in another project, but then bug fixes need to be manually ported back, which makes things difficult. I am unfamiliar with how other open source projects handle these things and am curious what the Linux project does. I guess it just seems at this point there is not enough clean separation between the various parts of SOLR, making its development somewhat less efficient for production systems than it could be, to the detriment of the users.
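To make the remote-execution idea concrete, here is a minimal sketch of the kind of class I mean. The RemoteTask interface and TermDocCountTask names are hypothetical, invented for illustration; they are not actual Ocean or Solr APIs, and a real version would also need classloading and versioning details I am glossing over here.

    import java.io.IOException;
    import java.io.Serializable;

    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.index.Term;

    // Hypothetical interface for code that is serialized to every server
    // node, executed against that node's local IndexReader, and whose
    // result object is serialized back to the caller.
    public interface RemoteTask<T> extends Serializable {
        T execute(IndexReader reader) throws IOException;
    }

    // Example task: report how many documents on this node contain a
    // given term. The client ships an instance to all the servers and
    // sums the returned counts; no RequestHandler, no WAR deployment,
    // no server reboot.
    public class TermDocCountTask implements RemoteTask<Integer> {
        private static final long serialVersionUID = 1L;
        private final Term term;

        public TermDocCountTask(Term term) {
            this.term = term;
        }

        public Integer execute(IndexReader reader) throws IOException {
            return reader.docFreq(term); // direct access to the local index
        }
    }

The same pattern would cover the custom Payload case: the payload-handling logic travels with the serialized class instead of living in a server-side RequestHandler or SearchComponent.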
On Fri, Sep 5, 2008 at 9:40 AM, Noble Paul നോബിള് नोब्ळ् <[EMAIL PROTECTED]> wrote:
> Postponing Ocean Integration towards 2.0 is not a good idea. First of
> all we do not know when 2.0 is going to happen. delaying such a good
> feature till 2.0 is wasting time.
>
> My assumption was that Actually realtime search may have nothing to do
> with the core itself. It may be fine with a Pluggable
> SolrIndexSearcherFactory/SolrIndexWriterFactory. Ocean can have a
> unified reader-writer which may choose to implement both in one class.
>
> A total rewrite has its own problems. Achieving consensus on how
> things should change is time consuming. So it will keep getting
> delayed. If with a few changes we can start the integration, that is
> the best way forward. Eventually, we can slowly evolve to a
> better design. But, the design need not be as important as the feature
> itself.
>
> On Fri, Sep 5, 2008 at 6:46 PM, Yonik Seeley <[EMAIL PROTECTED]> wrote:
>> On Fri, Sep 5, 2008 at 9:03 AM, Jason Rutherglen
>> <[EMAIL PROTECTED]> wrote:
>>> Ok, SOLR 2 can be a from the ground up rewrite?
>>
>> Sort-of... I think that's up for discussion at this point, but enough
>> should change that keeping Java APIs back compatible is not a priority
>> (just my opinion of course). Supporting the current main search and
>> update interfaces and migrating most of the handlers shouldn't be that
>> difficult. We should be able to provide relatively painless back
>> compatibility for the 95% of Solr users that don't do any custom
>> Java.... and the others hopefully won't mind migrating their stuff to
>> get the cool new features :-)
>>
>> As far as SolrCore goes... I agree it's probably best to not do
>> pluggability at that level.
>> The way that Lucene has evolved, and may evolve (and how we want Solr
>> to evolve), it seems like we want more of a combo
>> IndexReader/IndexWriter interface. It also needs (optional)
>> optimistic concurrency... that was also assumed in the discussions
>> about bailey.
>>
>> -Yonik
>
> --
> --Noble Paul