[jira] [Updated] (SOLR-10229) See what it would take to shift many of our one-off schemas used for testing to managed schema and construct them as part of the tests

Amrit Sarkar (JIRA) Fri, 31 Mar 2017 12:11:13 -0700

     [ 
https://issues.apache.org/jira/browse/SOLR-10229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Amrit Sarkar updated SOLR-10229:
--------------------------------
    Attachment: SOLR-10229.patch

Uploaded a first working draft, SOLR-10229.patch, for review, suggestions and 
to clear up open questions.

bq. 1. A "mother" schema, with most common field and fieldType definitions, 
will be loaded/parsed in the test framework once, non-dependent on any 
individual tests.

The mother schema is loaded with +ManagedSchemaFactory::create+ when the test 
framework is created, independent of any test suite. I had to hardcode the 
+solr.home+ system property so that it picks up the default +solrconfig.xml+ 
and +mother-schema+. There must be a better way to do this; suggestions would 
be appreciated.

bq. 2. The individual tests then can pull relevant/required field and fieldType 
definitions from mother schema already parsed content and post them to its own 
miniature schema for tests via Schema API. The utility method can be named as 
"copyFieldAndDefinition" as suggested above.

+addField(String... fields)+ and +addFieldTypes(String... fieldTypes)+ do 
this: they fetch the managed-schema from the current core and add the 
requested components to it.
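
A minimal sketch of the copy step in plain Java, not the code from the patch. 
+MotherSchema+, +TestSchema+ and +copyFieldAndDefinition+ are hypothetical 
stand-ins, and plain maps take the place of Solr's parsed schema objects. It 
only illustrates that copying a field should also pull in the fieldType the 
field depends on.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the "mother" schema:
// field name -> fieldType name, and fieldType name -> implementing class.
final class MotherSchema {
    final Map<String, String> fields = new LinkedHashMap<>();
    final Map<String, String> fieldTypes = new LinkedHashMap<>();

    // Illustrative content only; a real mother schema would be parsed once from a file.
    static MotherSchema sample() {
        MotherSchema m = new MotherSchema();
        m.fieldTypes.put("string", "solr.StrField");
        m.fieldTypes.put("text_general", "solr.TextField");
        m.fields.put("id", "string");
        m.fields.put("title", "text_general");
        return m;
    }
}

// Hypothetical per-test schema that starts empty and is filled on demand.
final class TestSchema {
    final Map<String, String> fields = new LinkedHashMap<>();
    final Map<String, String> fieldTypes = new LinkedHashMap<>();

    // Copy each named field together with the fieldType it references.
    void copyFieldAndDefinition(MotherSchema mother, String... names) {
        for (String name : names) {
            String type = mother.fields.get(name);
            if (type == null) {
                throw new IllegalArgumentException("Unknown field in mother schema: " + name);
            }
            String typeClass = mother.fieldTypes.get(type);
            if (typeClass == null) {
                throw new IllegalStateException("Field " + name + " uses undefined fieldType " + type);
            }
            fieldTypes.putIfAbsent(type, typeClass);
            fields.put(name, type);
        }
    }
}
```

With this shape, a test that only needs +title+ ends up with one field and one 
fieldType, rather than inheriting everything the mother schema defines.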

bq. 3. For custom field and fieldType, which are not available in the mother 
schema, utility methods in framework to pass them onto Schema API.

Following Hoss's suggestions, I wrote declarative builder methods that add a 
new field or fieldType. I have not yet verified how analyzers are specified 
for a given fieldType; I need to test that part and will update.
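
One possible shape for such a builder, as a sketch only. +FieldTypeBuilder+ is 
a hypothetical name, not the class in the patch, and the nested map it 
produces is assumed to mirror the JSON body of the Schema API's 
add-field-type command (+name+, +class+, and an +analyzer+ with a 
+tokenizer+ and +filters+). How the patch will actually carry analyzers is 
still open, as noted above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical declarative builder; not part of Solr or of the attached patch.
final class FieldTypeBuilder {
    private final Map<String, Object> attrs = new LinkedHashMap<>();
    private Map<String, Object> tokenizer;
    private final List<Map<String, Object>> filters = new ArrayList<>();

    FieldTypeBuilder(String name, String className) {
        attrs.put("name", name);
        attrs.put("class", className);
    }

    // Set the analyzer's tokenizer factory by class name.
    FieldTypeBuilder tokenizer(String className) {
        tokenizer = new LinkedHashMap<>();
        tokenizer.put("class", className);
        return this;
    }

    // Append a token filter factory; filters keep the order in which they are added.
    FieldTypeBuilder filter(String className) {
        Map<String, Object> f = new LinkedHashMap<>();
        f.put("class", className);
        filters.add(f);
        return this;
    }

    // Produce a nested map; the "analyzer" key is present only if one was configured.
    Map<String, Object> build() {
        Map<String, Object> out = new LinkedHashMap<>(attrs);
        if (tokenizer != null || !filters.isEmpty()) {
            Map<String, Object> analyzer = new LinkedHashMap<>();
            if (tokenizer != null) {
                analyzer.put("tokenizer", tokenizer);
            }
            if (!filters.isEmpty()) {
                analyzer.put("filters", new ArrayList<>(filters));
            }
            out.put("analyzer", analyzer);
        }
        return out;
    }
}
```

A test could then describe a custom type in one expression, for example 
+new FieldTypeBuilder("text_ws_lower", "solr.TextField").tokenizer("solr.WhitespaceTokenizerFactory").filter("solr.LowerCaseFilterFactory").build()+, 
and hand the result to whatever utility posts it to the schema.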

Because we fetch the +managedIndexSchema+ from the current core, this 
framework will not support the classic schema. I am assuming this approach is 
acceptable, since going forward we are encouraging users to use 
managed-schema in place of classic.

I have made some progress on other features not included in the patch and 
will post an update very soon.

> See what it would take to shift many of our one-off schemas used for testing 
> to managed schema and construct them as part of the tests
> --------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-10229
>                 URL: https://issues.apache.org/jira/browse/SOLR-10229
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Erick Erickson
>            Assignee: Erick Erickson
>            Priority: Minor
>         Attachments: SOLR-10229.patch
>
>
> The test schema files are intimidating. There are about a zillion of them, 
> and making a change in any of them risks breaking some _other_ test. That 
> leaves people three choices:
> 1> add what they need to some existing schema. Which makes schemas bigger and 
> bigger and bigger.
> 2> create a new schema file, adding to the proliferation thereof.
> 3> Look through all the existing tests to see if they have something that 
> works.
> The recent work on LUCENE-7705 is a case in point. We're adding a maxLen 
> parameter to some tokenizers. Putting those parameters into any of the 
> existing schemas, especially to test < 255 char tokens is virtually 
> guaranteed to break other tests, so the only safe thing to do is make another 
> schema file. Adding to the multiplication of files.
> As part of SOLR-5260 I tried creating the schema on the fly rather than 
> creating a new static schema file and it's not hard. WDYT about making this 
> into some better thought-out utility? 
> At present, this is pretty fuzzy, I wanted to get some reactions before 
> putting much effort into it. I expect that the utility methods would 
> eventually get a bunch of canned types. It's reasonably straightforward for 
> primitive types, if lengthy. But when you get into solr.TextField-based types 
> it gets less straight-forward.
> We could manage to just move the "intimidation" from the plethora of schema 
> files to a zillion fieldTypes in the utility to choose from...
> Also, forcing every test to define the fields up-front is arguably less 
> convenient than just having _some_ canned schemas we can use. And erroneous 
> schemas to test failure modes are probably not very good fits for any such 
> framework.
> [~steve_rowe] and [[email protected]] in particular might have 
> something to say.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

