Re: [jira] Commented: (SOLR-567) SolrCore Pluggable

Jason Rutherglen Fri, 05 Sep 2008 07:06:44 -0700

Ok... Well I did try that.  I think that can be done as well. IMO
schemas should be avoided with realtime.  Otherwise there is a
nightmare with schema versions.  The current config files would not be
used.  How do you propose the non-integration of those things?  It
would seem to create a strange non-overlapping system within SOLR.
Then I begin to wonder what is SOLR being used for here?  Is it the
RequestHandlers?  But then those don't support optimistic concurrency.
 They do support optimize and commit which would need to be turned
off.  Is this ok to the user?  I actually think the XML based
RequestHandlers (or even binary) is not as powerful as basic object
serialization.  For example at a previous company I wrote all this
code to do XML based span queries.  That was pretty useless given I
should have just serialized the span queries and sent them into SOLR.
But then what was SOLR doing in that case?  I would have needed to
write a request handler to handle serialized queries but over HTTP?
HTTP doesn't scale is grid computing.  So these are some of the things
I have thought about that are unclear right now.  Also Payloads, does
one need to write a custom RequestHandler or SearchComponent to handle
custom Payloads?  Using serialization I could just write the code and
it would be dynamically loaded by the server, executed, and returns a
result like the server is local.  All in 1/10 the time it would take
to do some custom RequestHandler.  If the deployment had 100 servers,
each RequestHandler I am testing out would require a reboot of each
server each time?  That is extremely inefficient.  Search server
systems always grow larger and my concern is, SOLR is adding features
on a level that is not scalable in grid computing, meaning every
little new feature, delays releases, needs testing, and is probably
something 50% of the users don't need and will never use.  It would be
better IMO to have a clean separation between the core search server,
and everything else.  This is the architecture I decided to go with in
Ocean.  Where if I want new functionality I write a class that
executes remotely on all the servers and returns any object I want.
The class directly accesses the individual IndexReader of each index.
I don't have to reboot anything, deploy a new WAR, do a bunch of
testing etc.  The XML interface should be at the server that is
performing the distributed search, rather than at each server node
because this is where the search results meet the real application.  I
guess I have found the current model for SOLR to be somewhat flawed.
It's not anyone's fault because SOLR also is a major step forward for
Lucene.  However, a lot of the delay in new releases is because
everyone is adding anything and everything they want into it which
should not really be the case in order to move forward with new core
features such as realtime.  I think the facets is another example
where it's currently tied into receiving an HTTP call via the
SolrParams which are strings.  It makes the code non-reusable in other
projects.  It could be rewritten and used in another project but then
big fixes need to be manually placed back in, which makes things
difficult.  I am unfamiliar with open source projects and am curious
how the Linux project handles these things.  I guess it just seems at
this point there is not enough clean separation between the various
parts of SOLR making the development of it somewhat less efficient for
production systems than it could be to the detriment of the users.


On Fri, Sep 5, 2008 at 9:40 AM, Noble Paul നോബിള്‍ नोब्ळ्
<[EMAIL PROTECTED]> wrote:
> Postponing Ocean Integration towards 2.0 is not a good idea. First of
> all we do not know when 2.0 is going to happen. delaying  such a good
> feature till 2.0 is wasting time.
>
> My assumption was that Actually realtime search may have nothing to do
> with the core itself . It may be fine with a Pluggable
> SolrIndexSearcherFactory/SolrIndexWriterFactory . Ocean can have a
> unified reader-writer which may choose to implement both in one class.
>
> A total rewrite has its own problems. Achieving consensus on how
> things should change is time consuming. So it will keep getting
> delayed.  If with a few changes we can start the integration, that is
> the best way forward . Eventually , we can slowly ,  evolve to a
> better design. But, the design need not be as important as the feature
> itself.
>
>
>
> On Fri, Sep 5, 2008 at 6:46 PM, Yonik Seeley <[EMAIL PROTECTED]> wrote:
>> On Fri, Sep 5, 2008 at 9:03 AM, Jason Rutherglen
>> <[EMAIL PROTECTED]> wrote:
>>> Ok, SOLR 2 can be a from the ground up rewrite?
>>
>> Sort-of... I think that's up for discussion at this point, but enough
>> should change that keeping Java APIs back compatible is not a priority
>> (just my opinion of course).  Supporting the current main search and
>> update interfaces and migrating most of the handlers shouldn't be that
>> difficult.  We should be able to provide relatively painless back
>> compatibility for the 95% of Solr users that don't do any custom
>> Java.... and the others hopefully won't mind migrating their stuff to
>> get the cool new features :-)
>>
>> As far as SolrCore goes... I agree it's probably best to not do
>> pluggability at that level.
>> The way that Lucene has evolved, and may evolve (and how we want Solr
>> to evolve), it seems like we want more of a combo
>> IndexReader/IndexWriter interface.  It also needs (optional)
>> optimistic concurrency... that was also assumed in the discussions
>> about bailey.
>>
>> -Yonik
>>
>
>
>
> --
> --Noble Paul
>

Re: [jira] Commented: (SOLR-567) SolrCore Pluggable

Reply via email to