Erick Erickson created SOLR-10229:
-------------------------------------

             Summary: See what it would take to shift many of our one-off 
schemas used for testing to managed schema and construct them as part of the 
tests
                 Key: SOLR-10229
                 URL: https://issues.apache.org/jira/browse/SOLR-10229
             Project: Solr
          Issue Type: Improvement
      Security Level: Public (Default Security Level. Issues are Public)
            Reporter: Erick Erickson
            Priority: Minor


The test schema files are intimidating. There are about a zillion of them, and 
making a change in any of them risks breaking some _other_ test. That leaves 
people three choices:

1> Add what they need to some existing schema, which makes the schemas bigger 
and bigger and bigger.

2> Create a new schema file, adding to the proliferation thereof.

3> Look through all the existing tests to see if they have something that works.

The recent work on LUCENE-7705 is a case in point. We're adding a maxLen 
parameter to some tokenizers. Putting those parameters into any of the existing 
schemas, especially to test tokens of fewer than 255 characters, is virtually 
guaranteed to break other tests, so the only safe thing to do is make another 
schema file, adding to the multiplication of files.

As part of SOLR-5260 I tried creating the schema on the fly rather than 
creating a new static schema file, and it's not hard. WDYT about turning this 
into a better thought-out utility?
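
To make the idea concrete, here is a minimal sketch of what building a schema on the fly could look like. It only assembles the JSON commands that the managed-schema Schema API accepts ("add-field-type" and "add-field"); actually sending them to a test core is left out. The class name SchemaCommands and its methods are hypothetical, made up for illustration, and are not taken from SOLR-5260 or from any existing Solr test utility.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical test helper; not an existing Solr class. */
final class SchemaCommands {

    /** Renders a flat attribute map as a JSON object; assumes values need no escaping. */
    static String toJson(Map<String, Object> attrs) {
        StringJoiner json = new StringJoiner(",", "{", "}");
        for (Map.Entry<String, Object> e : attrs.entrySet()) {
            Object v = e.getValue();
            // Booleans are emitted bare, everything else as a quoted string.
            String rendered = (v instanceof Boolean) ? v.toString() : "\"" + v + "\"";
            json.add("\"" + e.getKey() + "\":" + rendered);
        }
        return json.toString();
    }

    /** Builds an "add-field-type" command for a type that needs no analyzer. */
    static String addFieldType(String name, String solrClass) {
        Map<String, Object> attrs = new LinkedHashMap<>();
        attrs.put("name", name);
        attrs.put("class", solrClass);
        return "{\"add-field-type\":" + toJson(attrs) + "}";
    }

    /** Builds an "add-field" command for a field of an already defined type. */
    static String addField(String name, String type, boolean indexed, boolean stored) {
        Map<String, Object> attrs = new LinkedHashMap<>();
        attrs.put("name", name);
        attrs.put("type", type);
        attrs.put("indexed", indexed);
        attrs.put("stored", stored);
        return "{\"add-field\":" + toJson(attrs) + "}";
    }

    public static void main(String[] args) {
        // A test would send these to its own core before indexing anything.
        System.out.println(addFieldType("string", "solr.StrField"));
        System.out.println(addField("id", "string", true, true));
    }
}
```

With something along these lines, each test declares only the types and fields it needs, so a change made for one test cannot break another.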

At present this is pretty fuzzy; I wanted to get some reactions before putting 
much effort into it. I expect that the utility methods would eventually get a 
bunch of canned types. It's reasonably straightforward for primitive types, if 
lengthy. But when you get into solr.TextField-based types it gets less 
straightforward.
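
As a rough illustration of why TextField-based types are more involved than primitives, the sketch below builds the nested structure such a type needs (an analyzer holding a tokenizer and a list of filters) and renders it as an "add-field-type" Schema API command. CannedTypes and all of its methods are hypothetical names used only for this example. Tokenizer or filter parameters would be extra entries on the corresponding map.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

/** Hypothetical holder for canned field types; not an existing Solr class. */
final class CannedTypes {

    /** Renders nested maps, lists and scalars as JSON; assumes strings need no escaping. */
    static String toJson(Object value) {
        if (value instanceof Map) {
            return ((Map<?, ?>) value).entrySet().stream()
                    .map(e -> "\"" + e.getKey() + "\":" + toJson(e.getValue()))
                    .collect(Collectors.joining(",", "{", "}"));
        }
        if (value instanceof List) {
            return ((List<?>) value).stream()
                    .map(CannedTypes::toJson)
                    .collect(Collectors.joining(",", "[", "]"));
        }
        if (value instanceof Boolean || value instanceof Number) {
            return value.toString();
        }
        return "\"" + value + "\"";
    }

    /** An analysis component (tokenizer or filter) identified by its factory class. */
    static Map<String, Object> component(String factoryClass) {
        Map<String, Object> m = new LinkedHashMap<>();
        m.put("class", factoryClass);
        return m;
    }

    /** A solr.TextField type with a single analyzer: one tokenizer plus filters. */
    static Map<String, Object> textType(String name, Map<String, Object> tokenizer,
                                        List<Map<String, Object>> filters) {
        Map<String, Object> analyzer = new LinkedHashMap<>();
        analyzer.put("tokenizer", tokenizer);
        analyzer.put("filters", filters);
        Map<String, Object> type = new LinkedHashMap<>();
        type.put("name", name);
        type.put("class", "solr.TextField");
        type.put("analyzer", analyzer);
        return type;
    }

    /** Wraps a type definition in an "add-field-type" command. */
    static String addFieldTypeCommand(Map<String, Object> type) {
        Map<String, Object> cmd = new LinkedHashMap<>();
        cmd.put("add-field-type", type);
        return toJson(cmd);
    }

    public static void main(String[] args) {
        Map<String, Object> type = textType("text_ws_lower",
                component("solr.WhitespaceTokenizerFactory"),
                Arrays.asList(component("solr.LowerCaseFilterFactory")));
        System.out.println(addFieldTypeCommand(type));
    }
}
```

Even this small case needs three nested structures where a primitive type needs two attributes, which is where the canned-type list could start to balloon.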

We might just end up moving the "intimidation" from the plethora of schema 
files to a zillion fieldTypes in the utility to choose from...

Also, forcing every test to define the fields up-front is arguably less 
convenient than just having _some_ canned schemas we can use. And erroneous 
schemas to test failure modes are probably not very good fits for any such 
framework.

[~steve_rowe] and [[email protected]] in particular might have 
something to say.



