[ https://issues.apache.org/jira/browse/SOLR-646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Henri Biestro updated SOLR-646:
-------------------------------

    Description: 
This patch refers to 'generalized configuration properties' as specified by 
[HossMan|https://issues.apache.org/jira/browse/SOLR-350?focusedCommentId=12562834#action_12562834]
This means configuration & schema files can use expression based on properties 
defined in *solr.xml*.

h3. Use cases:
Describe core data directories from solr.xml as properties.
Share the same schema and/or config file between multiple cores.
Share reusable fragments of schema & configuration between multiple cores.

h3. Usage:
h4. solr.xml
This *solr.xml* illustrates how properties can be used for different 
purposes.
{code:xml}
<solr persistent="true">
  <property name="version" value="1.3"/>
  <property name="lang" value="english, french"/>
  <property name="en-cores" value="en,core0"/>
  <property name="fr-cores" value="fr,core1"/>
  <!-- This experimental feature flag enables schema & solrconfig to include 
other files --> 
  <property name="solr.experimental.enableConfigInclude" value="true"/>
  <cores adminPath="/admin/cores">
    <core name="${en-cores}" instanceDir="./">
      <property name="version" value="3.5"/>
      <property name="l10n" value="EN"/>
      <property name="ctlField" value="core0"/>
      <property name="comment" value="This is a sample"/>
    </core>
    <core name="${fr-cores}" instanceDir="./">
      <property name="version" value="2.4"/>
      <property name="l10n" value="FR"/>
      <property name="ctlField" value="core1"/>
      <property name="comment" value="Ceci est un exemple"/>
    </core>
  </cores>
</solr>
{code}
{{version}}: if you update your solr.xml or your cores for any reason, it 
can be useful to keep track of a version. In this example, it is used to 
define the {{dataDir}} for each core.
{{en-cores}}, {{fr-cores}}: when the list of aliases is long or repetitive, it 
can be convenient to hold it in a property and use that property as the 
Solr core name.
{{instanceDir}}: note that both cores will use the same instance directory, 
sharing their configuration and schema. The {{dataDir}} will be set for each of 
them from the *solrconfig.xml*.

h4. solrconfig.xml
This is where our *solr.xml* properties are used to define the data directory 
as a composition of, in our example, the language code {{l10n}} and the core 
version stored in {{version}}.
{code:xml}
<config>
  <dataDir>${solr.solr.home}/data/${l10n}-${version}</dataDir>
....
</config>
{code}
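
As a rough illustration of the expansion described above, the following Java sketch substitutes {{${...}}} references from a plain map. It is not the patch's code: the class name {{DataDirExpansionSketch}}, the {{expand}} helper, the {{/opt/solr}} home directory and the choice to leave unknown references untouched are all assumptions made for this example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative sketch only; not the implementation from the patch. */
final class DataDirExpansionSketch {
    // Matches ${name} and captures the property name.
    private static final Pattern REF = Pattern.compile("\\$\\{([^}]+)\\}");

    /** Replaces each ${name} by its value; unknown names are left as written (an assumption). */
    static String expand(String text, Map<String, String> props) {
        Matcher m = REF.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = props.get(m.group(1));
            String replacement = (value != null) ? value : m.group(0);
            // quoteReplacement keeps '$' and '\' in values from being read as group references
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> en = new HashMap<>();
        en.put("solr.solr.home", "/opt/solr"); // assumed value, not from the issue
        en.put("l10n", "EN");                  // core-level property of the English core
        en.put("version", "3.5");              // core-level value
        System.out.println(expand("${solr.solr.home}/data/${l10n}-${version}", en));
    }
}
```

With the property values of the English core from the solr.xml above, the expression expands to {{/opt/solr/data/EN-3.5}} under that assumed home directory.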

h4. schema.xml
The {{include}} element imports a file into the schema (or a solrconfig); 
this can help de-clutter long schemas or reuse parts.
The {{ctlField}} property illustrates that a field & its type can be set 
through properties as well; in our example, we want the 'english' core to 
refer to an 'english-configured' field and the 'french' core to a 
'french-configured' one. The type for the field is defined as {{text-EN}} or 
{{text-FR}} after expansion.

{code:xml}
<schema name="example core ${l10n}" version="1.1">
  <types>
...
   <include resource="text-l10n.xml"/>
  </types>

 <fields>   
...
  <field name="${ctlField}"   type="text-${l10n}"   indexed="true"  
stored="true"  multiValued="true" /> 
 </fields>
{code}

This schema imports the *text-l10n.xml* file, which is a *fragment*; the 
fragment tag must be present & indicates that the file is meant to be included. 
Our example only defines different stopwords for each language, but you could 
of course extend this to stemmers, synonyms, etc.
{code:xml}
<fragment>
        <fieldType name="text-FR" class="solr.TextField" 
positionIncrementGap="100">
...
            <filter class="solr.StopFilterFactory" ignoreCase="true" 
words="stopwords-fr.txt"/>
...
        </fieldType>
        <fieldType name="text-EN" class="solr.TextField" 
positionIncrementGap="100">
...
            <filter class="solr.StopFilterFactory" ignoreCase="true" 
words="stopwords-en.txt"/>
...
        </fieldType>
</fragment>
{code}
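
The include/fragment mechanism can be pictured with a small DOM-based sketch in Java. This is only an approximation written for this description, not the code in the patch: the class {{IncludeFragmentSketch}}, the {{expandIncludes}} method and the in-memory map standing in for the resource loader are hypothetical. It replaces each {{include}} element with the children of the referenced file's {{fragment}} root.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

/** Illustrative sketch only; resources are looked up in a map instead of a resource loader. */
final class IncludeFragmentSketch {

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("cannot parse XML", e);
        }
    }

    /** Replaces every <include resource="..."/> by the children of the resource's <fragment> root. */
    static Document expandIncludes(String xml, Map<String, String> resources) {
        Document doc = parse(xml);
        // getElementsByTagName returns a live list, so copy it before modifying the tree
        NodeList found = doc.getElementsByTagName("include");
        List<Element> includes = new ArrayList<>();
        for (int i = 0; i < found.getLength(); i++) {
            includes.add((Element) found.item(i));
        }
        for (Element include : includes) {
            String name = include.getAttribute("resource");
            String fragmentXml = resources.get(name);
            if (fragmentXml == null) {
                throw new IllegalArgumentException("unknown resource: " + name);
            }
            Element fragment = parse(fragmentXml).getDocumentElement();
            if (!"fragment".equals(fragment.getTagName())) {
                throw new IllegalArgumentException(name + " is not a fragment");
            }
            Node parent = include.getParentNode();
            for (Node child = fragment.getFirstChild(); child != null; child = child.getNextSibling()) {
                // importNode copies the node into the including document
                parent.insertBefore(doc.importNode(child, true), include);
            }
            parent.removeChild(include);
        }
        return doc;
    }
}
```

Rejecting files whose root is not {{fragment}} mirrors the requirement stated above that the fragment tag must be present.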

Alternatively, one can use XML entities with the 'solr:' protocol to the same 
end, as in:
{code:xml}
<!DOCTYPE schema [
<!ENTITY textL10n SYSTEM "solr:${l10ntypes}">
]>
<schema name="example core ${l10n}" version="1.1">
  <types>
   <fieldtype name="string"  class="solr.StrField" sortMissingLast="true" 
omitNorms="true"/>
   <!--include resource="text-l10n.xml"/-->
   &textL10n;
  </types>
  ...
</schema>
{code}


h4. Technical specifications
solr.xml can define properties at the multicore level & at each core level.
Properties defined in the multicore scope can override system properties.
Properties defined in a core scope can override multicore & system properties.
Property definitions can use expressions to define their name & value; these 
expressions are evaluated in the context of their outer scope.
CoreContainer serialization keeps properties as defined; persistence is 
idempotent (i.e. property expressions are written, not their evaluation).
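
The override order above (core over multicore over system) can be modelled with the defaults chain of {{java.util.Properties}}. The sketch below is an assumption-laden illustration rather than the patch's data structure: the class {{PropertyScopeSketch}}, its two helper methods and the system-level value {{0.9}} are hypothetical, while the multicore and core values are taken from the sample solr.xml.

```java
import java.util.Properties;

/** Illustrative sketch only: nested scopes modelled as chained Properties defaults. */
final class PropertyScopeSketch {

    /** Multicore scope: falls back to the system scope for anything it does not define. */
    static Properties multicoreScope(Properties system) {
        Properties multicore = new Properties(system);
        multicore.setProperty("version", "1.3");
        multicore.setProperty("lang", "english, french");
        return multicore;
    }

    /** Core scope: falls back to the multicore scope, then to the system scope. */
    static Properties coreScope(Properties multicore, String l10n, String version) {
        Properties core = new Properties(multicore);
        core.setProperty("l10n", l10n);
        core.setProperty("version", version);
        return core;
    }

    public static void main(String[] args) {
        Properties system = new Properties();
        system.setProperty("version", "0.9"); // hypothetical system-level value
        Properties multicore = multicoreScope(system);
        Properties english = coreScope(multicore, "EN", "3.5");

        System.out.println(multicore.getProperty("version")); // multicore overrides system
        System.out.println(english.getProperty("version"));   // core overrides multicore
        System.out.println(english.getProperty("lang"));      // inherited from multicore
    }
}
```

Lookups go through {{getProperty}}, which consults the enclosing scope only when the inner scope has no value of its own.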

The core descriptor properties are automatically defined in each core context, 
namely:
solr.core.instanceDir
solr.core.name
solr.core.configName
solr.core.schemaName

h3. Coding notes:

- DOMUtil.java:
cosmetic changes
toMapExcept systematically skips 'xml:base' attributes (which may come from 
entity resolving)

- CoreDescriptor.java:
The core descriptor does not store properties as values but as expressions (and 
all its members can be property expressions as well), which allows the file to 
be written as defined (not as evaluated).
The public getCoreProperties is removed for that reason. (too bad we were in 
such a rush...)

- CoreContainer.java:
changes related to extracting the core names before they are evaluated in load()
changes related to evaluating core descriptor member before adding them to the 
core's loader properties
fix in persistFile which was not interpreting relative paths correctly
fix in persist because properties were not written at the right place
changes in persist to write expressions (and core name when it is one) 

- Config.java:
substituteProperties has been moved out of constructor so calls must be explicit.
added the entity resolver
added substituteIncludes which processes <include name.../>

- SolrConfig.java & IndexSchema.java
added explicit calls to substituteIncludes to perform property/include expansion

- SolrResourceLoader.java
cosmetic, changed getCoreProperties to getProperties (since they may come from 
the CoreContainer)

- SolrProperties.java:
schema uses a localization (l10n) property to define an attribute
persists the file to check it keeps the expression properties

- QueryElevationComponent.java
Needed to explicitly call substituteProperties.


  was:
This patch refers to 'generalized configuration properties' as specified by 
[HossMan|https://issues.apache.org/jira/browse/SOLR-350?focusedCommentId=12562834#action_12562834]
This means configuration & schema files can use expression based on properties 
defined in *solr.xml*.

h3. Use cases:
Describe core data directories from solr.xml as properties.
Share the same schema and/or config file between multiple cores.
Share reusable fragments of schema & configuration between multiple cores.

h3. Usage:
h4. solr.xml
This *solr.xml* illustrates how properties can be used for different 
purposes.
{code:xml}
<solr persistent="true">
  <property name="version" value="1.3"/>
  <property name="lang" value="english, french"/>
  <property name="en-cores" value="en,core0"/>
  <property name="fr-cores" value="fr,core1"/>
  <!-- This experimental feature flag enables schema & solrconfig to include 
other files --> 
  <property name="solr.experimental.enableConfigInclude" value="true"/>
  <cores adminPath="/admin/cores">
    <core name="${en-cores}" instanceDir="./">
      <property name="version" value="3.5"/>
      <property name="l10n" value="EN"/>
      <property name="ctlField" value="core0"/>
      <property name="comment" value="This is a sample"/>
    </core>
    <core name="${fr-cores}" instanceDir="./">
      <property name="version" value="2.4"/>
      <property name="l10n" value="FR"/>
      <property name="ctlField" value="core1"/>
      <property name="comment" value="Ceci est un exemple"/>
    </core>
  </cores>
</solr>
{code}
{{version}}: if you update your solr.xml or your cores for any reason, it 
can be useful to keep track of a version. In this example, it is used to 
define the {{dataDir}} for each core.
{{en-cores}}, {{fr-cores}}: when the list of aliases is long or repetitive, it 
can be convenient to hold it in a property and use that property as the 
Solr core name.
{{instanceDir}}: note that both cores will use the same instance directory, 
sharing their configuration and schema. The {{dataDir}} will be set for each of 
them from the *solrconfig.xml*.

h4. solrconfig.xml
This is where our *solr.xml* properties are used to define the data directory 
as a composition of, in our example, the language code {{l10n}} and the core 
version stored in {{version}}.
{code:xml}
<config>
  <dataDir>${solr.solr.home}/data/${l10n}-${version}</dataDir>
....
</config>
{code}

h4. schema.xml
The {{include}} element imports a file into the schema (or a solrconfig); 
this can help de-clutter long schemas or reuse parts.
{color:red}This is an experimental feature that may not be kept in the 
future.{color}
The {{ctlField}} property illustrates that a field & its type can be set 
through properties as well; in our example, we want the 'english' core to 
refer to an 'english-configured' field and the 'french' core to a 
'french-configured' one. The type for the field is defined as {{text-EN}} or 
{{text-FR}} after expansion.

{code:xml}
<schema name="example core ${l10n}" version="1.1">
  <types>
...
   <include resource="text-l10n.xml"/>
  </types>

 <fields>   
...
  <field name="${ctlField}"   type="text-${l10n}"   indexed="true"  
stored="true"  multiValued="true" /> 
 </fields>
{code}

This schema imports the *text-l10n.xml* file, which is a *fragment*; the 
fragment tag must be present & indicates that the file is meant to be included. 
Our example only defines different stopwords for each language, but you could 
of course extend this to stemmers, synonyms, etc.
{code:xml}
<fragment>
        <fieldType name="text-FR" class="solr.TextField" 
positionIncrementGap="100">
...
            <filter class="solr.StopFilterFactory" ignoreCase="true" 
words="stopwords-fr.txt"/>
...
        </fieldType>
        <fieldType name="text-EN" class="solr.TextField" 
positionIncrementGap="100">
...
            <filter class="solr.StopFilterFactory" ignoreCase="true" 
words="stopwords-en.txt"/>
...
        </fieldType>
</fragment>
{code}


h4. Technical specifications
solr.xml can define properties at the multicore level & at each core level.
Properties defined in the multicore scope can override system properties.
Properties defined in a core scope can override multicore & system properties.
Property definitions can use expressions to define their name & value; these 
expressions are evaluated in the context of their outer scope.
CoreContainer serialization keeps properties as defined; persistence is 
idempotent (i.e. property expressions are written, not their evaluation).

The core descriptor properties are automatically defined in each core context, 
namely:
solr.core.instanceDir
solr.core.name
solr.core.configName
solr.core.schemaName

h3. Coding notes:

- DOMUtil.java:
refactored substituteSystemProperties to use an Evaluator;
an Evaluator is a DOM visitor that expands property expressions "in place" 
using a property map as an evaluation context
added an asString(node) method for logging purpose

- CoreDescriptor.java:
added an expression member to keep property expressions as defined in solr.xml 
for persistence, which allows the file to be written as defined (not as expanded)

- CoreContainer.java:
added an expression member to keep property expressions as defined in solr.xml 
for persistence, which allows the file to be written as defined (not as expanded);
solr.xml persistence is idempotent
added a local DOMUtil.Evaluator that tracks property expressions to evaluate & 
store them
*issues outlined through solr-646:*
fix in load:
CoreDescriptor p = new CoreDescriptor(this, names, ....);
was: CoreDescriptor p = new CoreDescriptor(this, name, ...);
fix in load:
register(aliases.get(a), core, false);
was: register(aliases.get(i), core, false);

- CoreAdminHandler.java
added an optional fileName to persist so it is possible to write the solr.xml 
to a different file (for comparison purpose)

- CoreAdminRequest.java
added PersistRequest to allow passing optional fileName

- Config.java:
substituteProperties has been moved out of constructor & doc member made 
protected to allow override
added an IncludesEvaluator that deals with include/fragment

- SolrConfig.java & IndexSchema.java
added explicit calls to substituteProperties to perform property/include 
expansion

- SolrResourceLoader.java
added properties member to store CoreContainer & per-SolrCore properties
added constructor properties parameter & getter for properties

- SolrProperties.java:
test inspired by MulticoreExampleTestBase.java
loads 2 cores sharing a schema & config;
config defines dataDir using a property
schema uses a localization (l10n) property to define an attribute
persists the file to check it keeps the expression properties




> Configuration properties in multicore.xml
> -----------------------------------------
>
>                 Key: SOLR-646
>                 URL: https://issues.apache.org/jira/browse/SOLR-646
>             Project: Solr
>          Issue Type: New Feature
>    Affects Versions: 1.3
>            Reporter: Henri Biestro
>            Assignee: Shalin Shekhar Mangar
>             Fix For: 1.4
>
>         Attachments: solr-646.patch, solr-646.patch, solr-646.patch, 
> SOLR-646.patch, solr-646.patch, solr-646.patch, solr-646.patch, 
> solr-646.patch, solr-646.patch
>

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.