[
https://issues.apache.org/jira/browse/SOLR-14701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17170109#comment-17170109
]
Erick Erickson commented on SOLR-14701:
---------------------------------------
bq. Sure. But when we guess right, you can
I can't disagree more that there's any justification in keeping a feature
because "it works some of the time".
bq. Doesn't matter, as this is NOT a production feature
Again, I couldn't disagree more with this statement. Schemaless is supposed to
make getting your feet wet easier and it flat doesn't work in lots of
situations. There's a workable alternative that is robust. Why keep this around?
bq. Perhaps they love it, or perhaps they hate it. Probably a good chunk of both
My claim is that we can provide much better functionality with the dynamic
field idea or at least something similar. What you lose, of course, is some
specialization, e.g. no attempt to guess numerics or date types. I'm perfectly
willing to give that up for something that fulfills its intended purpose: make
it unnecessary to even know about a schema to do some indexing when you first
start out.
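To make the glob idea concrete, here is a minimal sketch of the Schema API payload that would register a catch-all dynamic field. The `add-dynamic-field` command is part of the Schema API; the helper name `glob_dynamic_field_command`, the `text_general` type, and `multiValued=True` are my assumptions for illustration, not something settled in this thread.

```python
import json


def glob_dynamic_field_command(field_type="text_general"):
    """Build an 'add-dynamic-field' payload for a catch-all ("*") field.

    Assumption: every unknown field is indexed as multi-valued text and
    stored, so nothing is guessed about numerics or dates.
    """
    return {
        "add-dynamic-field": {
            "name": "*",            # glob that matches any field name
            "type": field_type,     # assumed type, no numeric/date guessing
            "indexed": True,
            "stored": True,         # stored so the values can be examined later
            "multiValued": True,    # assumed, so repeated values never fail
        }
    }


if __name__ == "__main__":
    # The JSON printed here would be POSTed to the collection's /schema endpoint.
    print(json.dumps(glob_dynamic_field_command(), indent=2))
```

The trade-off is the one described above: everything lands in one generic type, so the user gets something searchable with no type guessing at all.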
bq. Since this is mostly contained to one URP ...
That rarely gets any attention. Another bit of orphan code.
bq. For some use cases with well formatted typed data it can work really
well...Elastic, this is exactly what you need to do there as well.
Agreed, and this is something that would be lost. I'm willing to lose it
though. Those use cases would still "work" keeping in mind that the intent is
to press a button without knowing anything about the schema and get _something_
that you can search.
I think it would be far more useful to have a button to press that examined a
current index and spat out a schema. The process would be:
1> index the data with the glob dynamic field
2> press the button and have a process go through all the stored fields (the
glob is stored=true by default) and generate a schema. Have an option to load
the generated schema into Zookeeper automatically either with the same name as
the collection uses or a new name. Or save it somewhere. Or generate the
correct schema API commands as you suggest. Or...
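A rough sketch of what the "press the button" step could look like, assuming the stored values have already been read back as plain dictionaries. All function names here are hypothetical, the regex-based guessing is only one possible heuristic, and the `plong`/`pdouble`/`pdate`/`text_general` type names assume the field types from the default configset are available.

```python
import re

_INT = re.compile(r"^[+-]?\d+$")
_FLOAT = re.compile(r"^[+-]?(\d+\.\d*|\.\d+|\d+)([eE][+-]?\d+)?$")
_DATE = re.compile(r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?Z$")


def guess_type(value):
    """Guess a field type name for a single stored value."""
    s = str(value).strip()
    if _INT.match(s):
        return "plong"
    if _FLOAT.match(s):
        return "pdouble"
    if _DATE.match(s):
        return "pdate"
    return "text_general"


def merge_types(a, b):
    """Widen two guesses: long+double -> double, any other clash -> text."""
    if a == b:
        return a
    if {a, b} == {"plong", "pdouble"}:
        return "pdouble"
    return "text_general"


def infer_schema(docs):
    """Scan every stored value of every doc and emit 'add-field' commands."""
    types = {}
    multi = set()
    for doc in docs:
        for name, value in doc.items():
            values = value if isinstance(value, list) else [value]
            if len(values) > 1:
                multi.add(name)
            for v in values:
                guess = guess_type(v)
                types[name] = merge_types(types[name], guess) if name in types else guess
    return [
        {"add-field": {"name": name, "type": ftype, "indexed": True,
                       "stored": True, "multiValued": name in multi}}
        for name, ftype in sorted(types.items())
    ]
```

Because the type is decided only after all documents have been seen, one odd value widens the field instead of breaking indexing partway through. The resulting commands could be sent to the Schema API, written to a file, or used to build a new configset.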
bq. This feature is only an aid very early on in exploring your data, to avoid
having to hand edit 142 <field>...
And with the dynamic glob mapping they wouldn't have to edit anything to start
either. Admittedly you have to get there sometime and when you do you have to
do some more typing.
bq. It's not hidden, is it? We recommend AGAINST this feature in production
And you're OK with that? We recommend against using it in production in the
first place because it doesn't work there reliably. So we ship something that
we know isn't good enough for production, put up with all the noise from the
test cases that nobody is fixing, and consume developer resources whenever
anything we do breaks a schemaless test, all for something people shouldn't
use anyway.
bq. Or we could just make a page in Admin UI schema tab...
Which nobody has done. Or even signed up to do in the years since this feature
was introduced. This is a variant of the "learning mode" idea. A copy/paste
into some admin UI window doesn't process nearly enough documents to be robust.
Nobody is going to paste 10,000 docs in some window.
What's your objection to the glob field vs. full-blown schemaless? The only
thing I see that we lose is some specializations, and the "examine and generate
a schema" idea addresses that without shipping something that we then recommend
nobody actually use for production.
> Deprecate Schemaless Mode (Discussion)
> --------------------------------------
>
> Key: SOLR-14701
> URL: https://issues.apache.org/jira/browse/SOLR-14701
> Project: Solr
> Issue Type: Improvement
> Security Level: Public(Default Security Level. Issues are Public)
> Components: Schema and Analysis
> Reporter: Marcus Eagan
> Priority: Major
>
> I know this won't be the most popular ticket out there, but I am growing more
> and more sympathetic to the idea that we should rip many of the freedoms out
> that cause users more harm than not. One of the freedoms I saw time and time
> again to cause issues was schemaless mode. It doesn't work as named or
> documented, so I think it should be deprecated.
> If you use it in production reliably and in a way that cannot be accomplished
> another way, I am happy to hear from more knowledgeable folks as to why
> deprecation is a bad idea.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)