Re: [DISCUSS] Lucene-Solr split (Solr promoted to TLP)

Doug Turnbull Tue, 12 May 2020 06:13:03 -0700

I'll give a perspective that comes more from the user's / "market" point
of view as at OSC we onboard lots of new organizations into Solr.


- Most new users incorrectly think of Solr as an independent Apache
project, and many will have little knowledge or awareness of Lucene itself
until given the full history of Lucene, Solr, Elasticsearch... or they have
to dive into the code/write a plugin

- Most orgs / managers think in terms of "Solr" (as in "Solr" vs
"Elasticsearch" vs "Vespa", etc.). So the starting point for new devs / folks
is from the Solr angle

- Lucene, when discussed, is understood more colloquially as a Solr
dependency

- If someone brings down the code to do some kind of work or investigation,
there's typically surprise that Lucene and Solr are bundled together.

- There's further surprise as the projects are indeed so different: Lucene
and Solr tests, for example, look little alike. They seem to have different
coding styles / practices. One has more server-like and distributed system
concerns; the other is clearly a low-level library for doing search work...

I personally have a hard time explaining to new users the rationale for
keeping these together, and it only increases the barrier to entry (to both
projects) to have this added complexity of two very different code bases
munged together.

Just my 2 cents...
-Doug

On Tue, May 12, 2020 at 7:30 AM Alan Woodward <[email protected]> wrote:

> One advantage I find with the way Elasticsearch and Lucene interact is
> that ES doesn’t depend on the master branch.  We upgrade our master branch
> frequently to keep up to date with the latest release branch, and that lets
> us find regressions or API problems pretty quickly, but it also insulates
> us from having to make big changes immediately.  I find this really useful
> for things like deprecations.  Let’s say we deprecate a particular API in
> the release branch, and remove it entirely in master.  Currently, that
> means Solr needs to immediately switch over to the new API in its master
> branch.  But the whole point of doing deprecations first is that it gives
> users time to find issues with the replacements - if we find that the
> replacement API doesn’t quite fit in ES, we have time to work out either
> how to change our code, or to improve the new API, but because the
> deprecated version is still there we’re not blocked from upgrading and
> getting other improvements.  Solr, meanwhile, may end up with a hacky
> workaround because that’s what got tests passing for the Lucene developer;
> or worse, we end up just copying the deprecated API wholesale into Solr and
> abandoning it there - witness TrieField or UninvertingReader.
>
> > On 11 May 2020, at 19:05, Atri Sharma <[email protected]> wrote:
> >
> > My two cents:
> >
> > As a Lucene-heavy developer, I have several times found maintaining Solr
> > dependencies while making large changes a bit cumbersome. I believe
> > Lucene and Solr should exist in a symbiotic relationship but not
> > tightly coupled with each other.
> >
> >
> > On Mon, May 11, 2020 at 7:22 PM Erik Hatcher <[email protected]>
> wrote:
> >>
> >> Without reading much or replying to any specific points made on this
> thread, here's my raw thoughts on this age-old topic.... (finally coming
> out of my cocoon after taking things in for a bit)
> >>
> >> Solr is a search -server- with distributed capabilities, that leverages
> the magic of Lucene underneath.  Solr depends on Lucene, is a consumer of
> it.  Lucene is a tight search -library- with little to no external
> dependencies.  Their purposes and end-users are different.
> >>
> >> I was never really for the grand unification of Lucene and Solr back in
> the day because:
> >>
> >> - Solr's developer experience would be greatly streamlined, faster,
> cleaner, leaner, and focused
> >> - Having Lucene change when Solr doesn't (yet) adapt to those changes
> leads to confusion and inconsistency, loose wires hanging out of the wall
> unconnected or duct taped together
> >> - It simply makes sense to keep Lucene versioned and tightly controlled
> for upgrades, various testing configurations varying Lucene versions,
> within Solr
> >> - Solr could have a very concerted upgrade effort for Lucene capability
> jumps, with a focused upgrade effort at the changed/improved/added touch
> points just like other dependencies within Solr (like Tika and Jetty)
> >>
> >> Those points all kinda say the same thing.... Solr depends on
> "lucene.jar" and I'm in the camp that thinks Solr and Lucene development,
> communities, and end-users/consumers would all greatly benefit from a fancy
> new TLP and focused community for solr.apache.org and a tight(er)
> relationship with the Lucene community as an involved and vested consumer.
> >>
> >> Erik
> >>
> >
> >
> > --
> > Regards,
> >
> > Atri
> > Apache Concerted
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: [email protected]
> > For additional commands, e-mail: [email protected]
> >
>

-- 
*Doug Turnbull **| CTO* | OpenSource Connections
<http://opensourceconnections.com>, LLC | 240.476.9983
Author: Relevant Search <http://manning.com/turnbull>; Contributor: *AI
Powered Search <http://aipoweredsearch.com>*
This e-mail and all contents, including attachments, is considered to be
Company Confidential unless explicitly stated otherwise, regardless
of whether attachments are marked as such.
