> That would allow a single request to see a stable view of the
> schema, while preventing having to make every aspect of the schema
> thread-safe.

Yes, that is the best approach.
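
The immutable-schema-plus-atomic-swap idea could look roughly like this.  The class and method names below are hypothetical, not actual Solr API; the point is that each request grabs one snapshot and the swap itself is the only concurrent operation:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

// Sketch only: class and method names here are hypothetical, not Solr API.
// The Schema is immutable; a request reads it once and keeps that snapshot.
final class Schema {
    private final Map<String, String> fieldTypes; // field name -> type name

    Schema(Map<String, String> fieldTypes) {
        // Defensive copy makes the object safely shareable across threads.
        this.fieldTypes =
            Collections.unmodifiableMap(new HashMap<String, String>(fieldTypes));
    }

    String typeOf(String field) {
        return fieldTypes.get(field);
    }
}

final class IndexCore {
    // Swapping the reference is atomic, so nothing inside Schema itself
    // needs to be thread-safe.
    private final AtomicReference<Schema> schemaRef;

    IndexCore(Schema initial) {
        this.schemaRef = new AtomicReference<Schema>(initial);
    }

    // Each request should call this once and use the snapshot throughout,
    // giving it a stable view even if the schema is swapped mid-request.
    Schema currentSchema() {
        return schemaRef.get();
    }

    // Reloading just installs a new immutable schema object.
    void reloadSchema(Schema next) {
        schemaRef.set(next);
    }
}
```

An in-flight request that already took its snapshot keeps seeing the old schema; only new requests pick up the swapped-in one.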

> Nothing will stop one from using java serialization for config
> persistence,

Persistence should not use Java serialization.  Serialization is for
transport over the wire, for automated upgrades of the configuration.
This could be done in XML as well; it would be good to support both
models.

> Is there a role here for OSGi to play?

Yes.  Eclipse uses OSGi successfully, and for grid computing in Java,
taking full advantage of what Java can do with dynamic classloading,
OSGi is the way to go.  Every search project I have worked on needs
this stuff to be far easier than it is now.  The current distributed
computing model in Solr may work, but it will not work reliably and
will break a lot.  When it does break, there is no way to know what
happened, which creates excessive downtime for users.  I have had
excessive downtime in production even with the current simple
master-slave architecture because there is no failover.  Failover
should already be in the current system, because it is easy to
implement on top of the rsync-based batch replication.
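
With rsync-based batch replication every slave holds a full copy of the index, so even a simple client-side failover would help.  The sketch below is hypothetical (the `SearchNode` interface and `FailoverClient` class are made up for illustration): try each replica in order until one answers.

```java
import java.util.Arrays;
import java.util.List;

// Sketch only: interface and class names are hypothetical. Because the
// rsync-based replication gives every replica a full index copy, a
// client can simply try each node in order until one responds.
interface SearchNode {
    String query(String q) throws Exception;
}

final class FailoverClient {
    private final List<SearchNode> nodes; // e.g. master first, then slaves

    FailoverClient(List<SearchNode> nodes) {
        this.nodes = nodes;
    }

    String query(String q) throws Exception {
        Exception last = null;
        for (SearchNode node : nodes) {
            try {
                return node.query(q); // first healthy node wins
            } catch (Exception e) {
                last = e; // node is down; fall through to the next replica
            }
        }
        throw new Exception("all replicas failed", last);
    }
}
```

A real deployment would add health checks and retry backoff, but even this ordered-fallback loop would have avoided the downtime described above.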

On Wed, Sep 17, 2008 at 2:21 PM, Yonik Seeley <[EMAIL PROTECTED]> wrote:
> On Wed, Sep 17, 2008 at 1:27 PM, Jason Rutherglen
> <[EMAIL PROTECTED]> wrote:
>> If the configuration code is going to be rewritten then I would like
>> to see the ability to dynamically update the configuration and schema
>> without needing to reboot the server.
>
> Exactly.  Actually, multi-core allows you to instantiate a completely
> new core and swap it for the old one, but it's a bit of a heavyweight
> approach.
>
> The key is finding the right granularity of change.
> My current thought is that a schema object would not be mutable, but
> that one could easily swap in a new schema object for an index at any
> time.  That would allow a single request to see a stable view of the
> schema, while preventing having to make every aspect of the schema
> thread-safe.
>
>> Also I would like the
>> configuration classes to just contain data and not have so many
>> methods that operate on the filesystem.
>
> That's the plan... completely separate the serialized and in memory
> representations.
>
>> This way the configuration
>> object can be serialized, and loaded by the server dynamically.  It
>> would be great for the schema to work the same way.
>
> Nothing will stop one from using java serialization for config
> persistence, however I am a fan of human readable for config files...
> so much easier to debug and support.  Right now, people can
> cut-n-paste relevant parts of their config in email for support, or to
> a wiki to explain things, etc.
>
> Of course, if you are talking about being able to have custom filters
> or analyzers (new classes that don't even exist on the server yet),
> then it does start to get interesting.  This intersects with
> deployment in general... and I'm not sure what the right answer is.
> What if Lucene or Solr needs an upgrade?  It would be nice if that
> could also automatically be handled in a large cluster... what are
> the options for handling that?  Is there a role here for OSGi to play?
>  It sounds like at least some of that is outside of the Solr domain.
>
> An alternative to serializing everything would be to ship a new schema
> along with a new jar file containing the custom components.
>
> -Yonik
>
