Many thanks for the assistance, Hoss!  After a couple of bumps, it worked
great.

====

I followed the recommendations (and read the explanation - thanks!)

I could swear it threw the error once more, so just to be sure I rebooted
everything (Zookeeper included), reloaded the configs into Zookeeper,
and restarted my Solr servers.  At that point the errors disappeared and
everything worked.

This will make upgrading super easy for us.  Given the relatively small
size of our data set, we have the luxury of just creating new Solr 6.1
instances in AWS, making a new node in Zookeeper, creating a collection,
adding the custom_schema file as you described and loading the data into
Solr from our Kafka store.  Gotta love it when your complete indexing into
Solr is in the neighborhood of two hours rather than two days or two weeks!

On Thu, Aug 4, 2016 at 8:42 PM, Alexandre Rafalovitch <arafa...@gmail.com>
wrote:

> Just as a note, TYPO3 uses a lot of include files though I do not remember
> which specific mechanism they rely on.
>
> Regards,
>     Alex
>
> On 5 Aug 2016 10:51 AM, "John Bickerstaff" <j...@johnbickerstaff.com>
> wrote:
>
> > Many thanks for your time!  Yes, it does make sense.
> >
> > I'll give your recommendation a shot tomorrow and update the thread.
> >
> > On Aug 4, 2016 6:22 PM, "Chris Hostetter" <hossman_luc...@fucit.org>
> > wrote:
> >
> >
> > TL;DR: use entity includes *WITHOUT TOP LEVEL WRAPPER ELEMENTS* like in
> > this example...
> >
> > https://github.com/apache/lucene-solr/blob/master/solr/core/src/test-files/solr/collection1/conf/schema-snippet-types.incl
> > https://github.com/apache/lucene-solr/blob/master/solr/core/src/test-files/solr/collection1/conf/schema-xinclude.xml
> >
> >
> > : The file I pasted last time is the file I was trying to include into the
> > : main schema.xml.  It was when that file was getting processed that I got
> > : the error  ['content' is not a glob and doesn't match any explicit field
> > : or dynamicField. ]
> >
> > Ok -- so just to be crystal clear, you have two files that look roughly
> > like this...
> >
> > --- BEGIN schema.xml ---
> > <?xml version="1.0" encoding="UTF-8" ?>
> > <schema name="statdx" version="1.5">
> >   <!-- a whole lot of <field>, <fieldType>, and <copyField> declarations
> >     -->
> >   <xi:include href="statdx_custom_schema.xml" xmlns:xi="http://www.w3.org/2001/XInclude"/>
> > </schema>
> > --- END schema.xml ---
> >
> > --- BEGIN statdx_custom_schema.xml ---
> > <?xml version="1.0" encoding="UTF-8" ?>
> > <schema name="example" version="1.6">
> >   <!-- a whole lot of ADDITIONAL <field>, <fieldType>, and <copyField>
> >        declarations
> >     -->
> > </schema>
> > --- END statdx_custom_schema.xml ---
> >
> > ...am I correct?
> >
> >
> > I'm going to skip a lot of the nitty gritty and just summarize by saying
> > that ultimately there are 2 problems here that combine to lead to the
> > error you are getting:
> >
> > 1) what you are trying to do as far as the xinclude is not really what
> > xinclude is designed for and doesn't work the way you (or any other sane
> > person) would think it does.
> >
> > 2) for historical reasons, Solr is being sloppy in what <copyField>
> > entries it recognizes.  If anything the "bug" is that Solr is
> > willing to try to load any parts of your include file at all -- if it were
> > behaving consistently it should be ignoring all of it.
> >
> >
> > Ok ... that seems terse, I'll clarify with a little of the nitty gritty...
> >
> >
> > The root of the issue is really something you alluded to earlier that
> > didn't make sense to me at the time because I didn't realize you were
> > showing us the *includED* file when you said it...
> >
> > >>> I assumed (perhaps wrongly) that I could duplicate the <schema ...>
> > >>>  </schema> arrangement from the schema.xml file.
> >
> > ...that assumption is the crux of the problem, because when the XML parser
> > evaluates your xinclude, what it produces is functionally equivalent to if
> > you had a schema.xml file that looked like this....
> >
> > --- BEGIN EFFECTIVE schema.xml ---
> > <?xml version="1.0" encoding="UTF-8" ?>
> > <schema name="statdx" version="1.5">
> >   <!-- a whole lot of <field>, <fieldType>, and <copyField> declarations
> >     -->
> >   <schema name="example" version="1.6">
> >     <!-- a whole lot of ADDITIONAL <field>, <fieldType>, and <copyField>
> >          declarations
> >       -->
> >   </schema>
> > </schema>
> > --- END EFFECTIVE schema.xml ---
> >
> > ...that extra <schema> element nested inside of the original <schema>
> > element is what's confusing the hell out of Solr.  The <field> and
> > <fieldType> parsing is fairly strict, and only expects to find them as top
> > level elements (or, for historical purposes, as children of <fields> and
> > <types> -- note the plurals) while the <copyField> parsing is sloppy and
> > finds the nested one, which refers to a field that was never loaded,
> > hence the error.
> >
> > (Even if the <field> and <fieldType> parsing was equally sloppy, only the
> > outermost <schema> tag would be recognized, so your default field props
> > would be based on the version="1.5" declaration, not the version="1.6"
> > declaration of the included file they'd be in ... which would be confusing
> > as hell, so it's a good thing Solr isn't sloppy about that parsing too)
> >
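
A minimal sketch of the nesting described above, using Python's
standard-library xml.etree.ElementInclude rather than Solr's own parser.
The field names (id, content) and the in-memory loader are illustrative
assumptions, not taken from the real schema files; the point is only that
XInclude splices in the included file's root element, wrapper and all.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Hypothetical stand-ins for schema.xml and statdx_custom_schema.xml.
MAIN = """<schema name="statdx" version="1.5"
        xmlns:xi="http://www.w3.org/2001/XInclude">
  <field name="id" type="string"/>
  <xi:include href="statdx_custom_schema.xml"/>
</schema>"""

INCLUDED = """<schema name="example" version="1.6">
  <field name="content" type="text"/>
</schema>"""


def loader(href, parse, encoding=None):
    # Serve the included document from memory instead of the filesystem.
    if href != "statdx_custom_schema.xml":
        raise OSError("unexpected include: " + href)
    return ET.fromstring(INCLUDED)


root = ET.fromstring(MAIN)
ElementInclude.include(root, loader=loader)  # expands xi:include in place

# Fields that are direct children of the outer <schema>.
top_level_fields = [f.get("name") for f in root.findall("field")]
# The included root element survives as a nested <schema>.
nested_schemas = root.findall("schema")
# Fields that ended up one level too deep.
nested_fields = [f.get("name") for f in root.findall("schema/field")]

print(top_level_fields, len(nested_schemas), nested_fields)
```

Only the field declared in the main file is a direct child of the outer
<schema>; the included one sits inside the nested wrapper, where a strict
top-level lookup will not see it.
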
> >
> > In contrast to xincludes, XML Entity includes are (almost as a side effect
> > of the triviality of their design) vastly superior 90% of the time, and
> > capable of doing what you want.  The key diff being that Entity includes
> > do not require that the file being included is a well-formed XML document
> > on its own -- it can be an arbitrary snippet of xml content (w/o a top
> > level element) that will be inlined verbatim.  So you can/should do
> > something like this...
> >
> > --- BEGIN schema.xml ---
> > <?xml version="1.0" encoding="UTF-8" ?>
> > <!DOCTYPE schema [
> >     <!ENTITY statdx_custom_include SYSTEM "statdx_custom_schema.incl">
> >     ]>
> > <schema name="statdx" version="1.5">
> >   <!-- a whole lot of <field>, <fieldType>, and <copyField> declarations
> >     -->
> >   &statdx_custom_include;
> > </schema>
> > --- END schema.xml ---
> >
> > --- BEGIN statdx_custom_schema.incl ---
> > <!-- a whole lot of ADDITIONAL <field>, <fieldType>, and <copyField>
> >      declarations
> >   -->
> > --- END statdx_custom_schema.incl ---
> >
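
For comparison, a rough sketch of the entity include, again in Python
rather than Solr.  It drives the standard-library xml.parsers.expat
bindings directly, because xml.etree does not resolve external entities on
its own.  The FILES dict stands in for the config directory, and the field
names are made up for illustration.

```python
from xml.parsers import expat

# Hypothetical stand-in for schema.xml with an external entity declared.
MAIN = b"""<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE schema [
    <!ENTITY statdx_custom_include SYSTEM "statdx_custom_schema.incl">
]>
<schema name="statdx" version="1.5">
  <field name="id" type="string"/>
  &statdx_custom_include;
</schema>"""

# The snippet has no wrapper element, so it is not a document by itself.
FILES = {
    "statdx_custom_schema.incl": b"""<field name="content" type="text"/>
<copyField source="content" dest="text"/>
""",
}


def parse_depths(main, files):
    """Return (depth, tag) for each element, entity content included."""
    seen = []
    depth = 0

    def start(name, attrs):
        nonlocal depth
        seen.append((depth, name))
        depth += 1

    def end(name):
        nonlocal depth
        depth -= 1

    def external(context, base, system_id, public_id):
        # Parse the referenced snippet with a child parser so its elements
        # are reported at the position of the entity reference.
        child = parser.ExternalEntityParserCreate(context)
        child.StartElementHandler = start
        child.EndElementHandler = end
        child.Parse(files[system_id], True)
        return 1  # non-zero tells expat the entity was handled

    parser = expat.ParserCreate()
    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.ExternalEntityRefHandler = external
    parser.Parse(main, True)
    return seen


seen = parse_depths(MAIN, FILES)
print(seen)
```

Because the snippet is inlined verbatim, its <field> and <copyField>
elements are reported as direct children of the single <schema> element,
with no nested wrapper in between.
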
> >
> > ...make sense?
> >
> >
> > -Hoss
> > http://www.lucidworks.com/
> >
>
