Re: Deprecate Schemaless Mode?

Marcus Eagan Mon, 03 Aug 2020 12:07:07 -0700

Furthermore, just to be clear, I opened a discussion about deprecating and
not replacing schemaless mode for two reasons:


(1) the pain it has inflicted on Solr users and on the reputation of Solr;
deprecation logs speak volumes.
(2) to get a better understanding of what engineers and others in the
community use Schemaless for to inform the design of its replacement.

At no point would I argue that a feature like Schemaless is unnecessary. It
was the first way I used Solr (the second time around; the first time I
tried it, I built my company using Elasticsearch because of other issues). I
am of the opinion that "Schemaless Mode" has done more harm to Solr than
good in my limited experience with the feature. Heck, *I've only been
consulting for a week and it has already come up*. I acknowledge a very
small sample size.

I am curious as to your thoughts on these points. There are not lots of
people getting started with Solr today relative to the other solutions on
the market regardless of what you might assume. I am here to see if I can
change that through a shift in how we approach user experience and the
knowledge requisite to operate a production cluster. I hope no one takes
offense to me challenging how some community members think about what is a
good feature vs what is a bad one.

Marcus




On Mon, Aug 3, 2020 at 11:44 AM Marcus Eagan <marcusea...@gmail.com> wrote:

> I know a person using it in production today. It's causing problems. They
> could abandon Solr altogether. It seems like a schema creation wizard is
> the right getting started motion if we know that schemaless doesn't do what
> people think it does. It's misleading. It's also a false representation of
> how easy it is to get started when compared to other solutions on the
> market. If schemaless is about supporting new use/adoption, it should actually
> help that more than hurt it.
>
> That's why I raised it. Re-branding this feature is like pig-lipsticking
> in my mind, but you all have more experience than me and are committers. I
> will defer to you for now. I am in favor of re-naming the feature as the
> minimum change that should happen.
>
> Schemaless mode makes sense in a world where schemas are largely opaque
> like IoT-telemetry or server logs. When you are searching data primarily
> for human consumption, I think it is just a headache in a bottle. In the
> cases of CSV and TSV, customers know the schema. I like to approach
> designing software such that no one ever needs to talk to me. No
> firefighting consulting is necessary, and you can skim the docs and proceed
> safely. I understand others may not feel that way, but it is the future of
> software.
>
> I encourage everyone here to try the newer search systems that have been
> released and are growing rapidly to inform your opinions on this topic. I
> am doing that because it is the concrete poured to build the common ground
> of the future.
>
> On Mon, Aug 3, 2020 at 11:40 AM Anshum Gupta <ans...@anshumgupta.net>
> wrote:
>
>> +1 Jason.
>>
>> Here's some context on how this came into being.
>>
>> Users find it difficult to understand and create a basic schema when just
>> trying out Solr. This mode was supposed to help them bootstrap, and once
>> they had a better understanding of how things worked, they'd tune it before
>> using the schema in production.
>> This did improve the OTB experience for new users, but a lot of people
>> abused this convenience and used this in production causing issues.
>>
>> As Jason mentioned, we'd better serve our users if we left this feature
>> for the getting started experience and add warnings (in UI and responses?)
>> so users would know what they are doing when they take this to production.
>>
>> This feature isn't trappy unless people use it in ways it was not
>> intended to be used in. We just need to warn and educate people better.
>>
>> On Mon, Aug 3, 2020 at 10:41 AM Jason Gerlowski <gerlowsk...@gmail.com>
>> wrote:
>>
>>> > Is anyone on this list using schemaless mode in production or have you
>>> tried to?
>>>
>>> Schemaless mode is one of a group of Solr features present for
>>> convenience but not intended for production usage.  It's in the same
>>> boat as "bin/post", and SolrCell, and others.  These features do cause
>>> headaches when users ignore the documented restrictions and use them
>>> for more than prototyping.  But at the same time they're super
>>> valuable for these sorts of demo-ing or getting-started use cases.  An
>>> easy getting-started experience is important, and schemaless et al
>>> serve a mostly positive role in that.
>>>
>>> I think we'd better serve our users if we left schemaless
>>> in/undeprecated, and instead focused on making it harder to
>>> (unknowingly) use them in ways contrary to community recommendations.
>>> Add louder warnings in the documentation (where not already present).
>>> Add warnings to the Solr logs the first time these features are used.
>>> Disable them by default (where that makes sense).  Taken to the
>>> extreme, we could even add a section into Solr's response that lists
>>> non-production features used in serving a given request.
>>>
>>> There are lots of ways to address the "feature X is trappy" problem
>>> without removing X altogether.
>>>
>>> On Mon, Aug 3, 2020 at 11:33 AM Marcus Eagan <marcusea...@gmail.com>
>>> wrote:
>>> >
>>> > Community,
>>> >
>>> > There are many of us that have had to deal with the pain of managing
>>> the schemaless mode of operation in Solr. I'm curious to get others'
>>> thoughts about how well it is working for them and if they would like to
>>> continue to use it.
>>> >
>>> > I for one don't think Schemaless works as intended and favor
>>> deprecating it and replacing it with something more usable, but I am sure others
>>> have thoughts here.
>>> >
>>> > Is anyone on this list using schemaless mode in production or have you
>>> tried to?
>>> >
>>> > A preliminary discussion has occurred in this Jira ticket:
>>> https://issues.apache.org/jira/browse/SOLR-14701
>>> >
>>> > Thank you all,
>>> >
>>> > Marcus Eagan
>>> >
>>>
>>
>> --
>> Anshum Gupta
>>
>
>
> --
> Marcus Eagan
>
>

-- 
Marcus Eagan
