Re: Problems using fieldType text_general in copyField

John Bickerstaff Thu, 04 Aug 2016 15:51:12 -0700

I get the same error with the Entity Includes - with or without the
<schema> tag...


I'm probably just going to make a section in schema.xml rather than worry
about this.

Includes are "nice to have" but not critical.
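
For reference, a minimal sketch of the entity-include arrangement discussed in this thread. The file name custom-fields.xml and the entity name customFields are hypothetical, and it is an assumption that the relative path resolves from wherever Solr loads schema.xml. The main file declares the entity in a DOCTYPE and references it inside <schema>:

```xml
<?xml version="1.0" encoding="UTF-8" ?>
<!-- custom-fields.xml is a hypothetical file name, assumed to sit next to schema.xml -->
<!DOCTYPE schema [
  <!ENTITY customFields SYSTEM "custom-fields.xml">
]>
<schema name="example" version="1.6">
  <!-- stock techproducts fields and fieldTypes stay here -->

  <!-- the XML parser substitutes the contents of custom-fields.xml at this point -->
  &customFields;
</schema>
```

The included file is then a bare fragment with no DOCTYPE and no <schema> wrapper of its own, so its declarations end up as direct children of the outer <schema>:

```xml
<field name="content" type="text_general" indexed="false" stored="true" multiValued="true"/>
<copyField source="content" dest="text"/>
```

Because the substitution happens inside the XML parser, the result should look the same to Solr as if the lines had been typed into schema.xml directly, provided the entity actually resolves.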

On Thu, Aug 4, 2016 at 4:25 PM, John Bickerstaff <j...@johnbickerstaff.com>
wrote:

> Found the Entity Includes - thanks.
>
> On Thu, Aug 4, 2016 at 4:22 PM, John Bickerstaff <j...@johnbickerstaff.com
> > wrote:
>
>> Thanks!
>>
>> The schema is a copy of the techproducts sample.
>>
>> Entire include here - and I take your point about the possibility of
>> malformation - thanks.
>>
>> I assumed (perhaps wrongly) that I could duplicate the <schema ...>
>>  </schema> arrangement from the schema.xml file.
>>
>> I'm unfamiliar with xml entity includes, but I'll go look them up...
>>
>> <?xml version="1.0" encoding="UTF-8" ?>
>> <schema name="example" version="1.6">
>>
>>    <!-- ngram field to support suggestions / lookahead search on title
>> (and category, contentType)-->
>>    <copyField source="foobar" dest="text"/>
>>    <field name="suggestion_ngram_for_title" type="text_suggest_ngram"
>> indexed="true" stored="false"/>
>>    <field name="displayurl" type="text_general" indexed="true"
>> stored="true" multiValued="false"/>
>>    <field name="productVersionId" type="string" indexed="true"
>> stored="true" multiValued="false"/>
>>    <field name="caption" type="text_general" indexed="true" stored="true"
>> multiValued="false"/>
>>    <field name="documentId" type="string" indexed="true" stored="true"
>> multiValued="false"/>
>>    <!--<field name="category" type="string" indexed="true" stored="true"
>> multiValued="true"/>-->
>>    <field name="contentType" type="text_special_synonym" indexed="true"
>> stored="true" multiValued="false"/>
>>    <!-- Do NOT assume that much thought went into using int on the
>> following field. This is testing only!-->
>>    <field name="preference_" type="int" indexed="true" stored="true"
>> multiValued="false"/>
>>
>>    <field name="meta_doc_type" type="text_general" indexed="true"
>> stored="true" multiValued="false"/>
>>    <!--<field name="content" type="text_general" indexed="true"
>> stored="true" multiValued="false"/>-->
>>
>>    <!-- STATdx Weighting fields here. These are not part of the document,
>> but are used to calculate relevancy scores -->
>>    <field name="category_weight"  type="double" indexed="true"
>>  stored="true"/>    <!-- used for rule one - weighting docs on general
>> usefulness -->
>>
>>    <!-- Main body of document extracted by SolrCell.
>>         NOTE: This field is not indexed by default, since it is also
>> copied to "text"
>>         using copyField below. This is to save space. Use this field for
>> returning and
>>         highlighting document content. Use the "text" field to search the
>> content. -->
>>    <field name="content" type="text_en" indexed="false" stored="true"
>> multiValued="true"/> *//HERE IS WHERE "CONTENT" IS DEFINED*
>>
>> <!-- test for parsing statdx-provided html in content field. text_html
>> has been modified to clean html -->
>>    <field name="html_content" type="text_html" indexed="true"
>> stored="true" multiValued="true"/>
>>
>>    <!-- Text fields from SolrCell to search by default in our catch-all
>> field -->
>>    <copyField source="title" dest="text"/>
>>    <copyField source="author" dest="text"/>
>>    <copyField source="description" dest="text"/>
>>    <copyField source="keywords" dest="text"/>
>>    <copyField source="content" dest="text"/>  /*/THROWING ERROR ABOUT
>> "CONTENT" NOT EXISTING HERE*
>>    <copyField source="content_type" dest="text"/>
>>    <copyField source="resourcename" dest="text"/>
>>    <copyField source="url" dest="text"/>
>>
>>    <!-- Create a string version of author for faceting -->
>>    <copyField source="author" dest="author_s"/>
>>
>>   <!-- Above, multiple source fields are copied to the [text] field.
>>           Another way to map multiple source fields to the same
>>           destination field is to use the dynamic field syntax.
>>           copyField also supports a maxChars to copy setting.  -->
>>
>>         <copyField source="*_en" dest="text"/>
>>
>>
>>     <!-- a copy of text_general. Used to handle the rule that says that
>> docs with "table"
>>          and "tsm" in the contentType field should show at the top of
>> results IF any of the
>>          following terms are in the search term submitted by the user:
>>          [TNM, AJCC, Stage, Staging, FIGO]   Note the special synonym
>> file in the xml below.
>>          Note to self: Expand this documentation if we end up adding more
>> "special" synonyms -->
>>     <fieldType name="text_special_synonym" class="solr.TextField"
>> positionIncrementGap="100">
>>       <analyzer type="index">
>>         <tokenizer class="solr.StandardTokenizerFactory"/>
>>         <filter class="solr.StopFilterFactory" ignoreCase="true"
>> words="stopwords.txt" />
>>                 <!-- in this example, we will only use synonyms at query
>> time
>>         <filter class="solr.SynonymFilterFactory"
>> synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
>>         -->
>>         <filter class="solr.LowerCaseFilterFactory"/>
>>       </analyzer>
>>       <analyzer type="query">
>>         <tokenizer class="solr.StandardTokenizerFactory"/>
>>         <filter class="solr.StopFilterFactory" ignoreCase="true"
>> words="stopwords.txt" />
>>         <!-- Special synonym file here!!!!  -->
>>         <filter class="solr.SynonymFilterFactory"
>> synonyms="contentType_synonyms.txt" ignoreCase="true" expand="true"/>
>>         <filter class="solr.LowerCaseFilterFactory"/>
>>       </analyzer>
>>     </fieldType>
>>
>> </schema>
>>
>>
>>
>> On Thu, Aug 4, 2016 at 3:55 PM, Chris Hostetter <hossman_luc...@fucit.org
>> > wrote:
>>
>>>
>>> you mentioned that the problem only happens when you use xinclude, but
>>> you haven't shown us the details of your xinclude -- what exactly does
>>> your schema.xml look like (with the xinclude call) and what exactly does
>>> the file being included look like (entire contents)
>>>
>>> (I suspect the problem you are seeing is related to the way xinclude
>>> doesn't really support "snippets" of malformed xml, and instead requires
>>> some root tag -- i can't imagine what root tag you are using in the
>>> included file that would play nicely with mixing/matching field
>>> declarations. ... using xml entity includes may be a simpler/safer
>>> option)
>>>
>>>
>>>
>>> : Date: Thu, 4 Aug 2016 15:47:00 -0600
>>> : From: John Bickerstaff <j...@johnbickerstaff.com>
>>> : Reply-To: solr-user@lucene.apache.org
>>> : To: solr-user@lucene.apache.org
>>> : Subject: Re: Problems using fieldType text_general in copyField
>>> :
>>> : I would call this a bug...
>>> :
>>> : I'm going to go out on a limb and say that if you define a field in
>>> : the included XML file, you will get this error.
>>> :
>>> : As long as the field is defined first in schema.xml, you can
>>> : "copyField" it or whatever in the include file, but apparently fields
>>> : MUST be created in the schema.xml file.
>>> :
>>> : That makes use of the include for custom things somewhat moot - at
>>> least in
>>> : my situation.
>>> :
>>> : I'd love to be wrong by the way, but that's what my tests suggest right
>>> : now...
>>> :
>>> : On Thu, Aug 4, 2016 at 1:37 PM, John Bickerstaff <
>>> j...@johnbickerstaff.com>
>>> : wrote:
>>> :
>>> : > Summary:
>>> : >
>>> : > Using xinclude to include an xml file into schema.xml
>>> : >
>>> : > The following line
>>> : >
>>> : > <copyField source="content" dest="text"/>
>>> : >
>>> : > generates an error:  about a field being "not a glob and not
>>> matching an
>>> : > explicit field" even though I declare the field in the line just
>>> above.
>>> : >
>>> : > This seems to happen only for fieldType text_general?
>>> : >
>>> : > ============
>>> : >
>>> : > Explanation:
>>> : >
>>> : > I need a little help - keep getting an error when trying to use the
>>> : > ability to include an additional XML file.  I may be overlooking
>>> something,
>>> : > but if so, I need help to see it.
>>> : >
>>> : > I have the following two lines which throw zero errors when part of
>>> : > schema.xml:
>>> : >
>>> : > <field name="content" type="text_general" indexed="false"
>>> stored="true"
>>> : > multiValued="true"/>
>>> : >  <copyField source="content" dest="text"/>
>>> : >
>>> : > However, when I put this into an include file and use xinclude, then
>>> I get
>>> : > this error when starting Solr.
>>> : >
>>> : >
>>> : >
>>> : >    - *statdx_shard1_replica3:* org.apache.solr.common.
>>> : >    SolrException:org.apache.solr.common.SolrException: Could not
>>> load
>>> : >    conf for core statdx_shard1_replica3: Can't load schema
>>> schema.xml:
>>> : >    copyField source :'content' is not a glob and doesn't match any
>>> explicit
>>> : >    field or dynamicField.
>>> : >
>>> : >
>>> : > Given that I am defining the field in the line right above the
>>> copyField
>>> : > statement, I'm confused about why this works fine in schema.xml but
>>> NOT in
>>> : > an included file.
>>> : >
>>> : > I experimented and found that any field of type "text_general" will
>>> throw
>>> : > this same error if it is part of the included xml file.  Other
>>> fieldTypes
>>> : > that I tried (string, int, double) did not have this issue.
>>> : >
>>> : > I'm using Solr 5.4, although I'm pulling custom config into an
>>> included
>>> : > file for purposes of moving to 6.1
>>> : >
>>> : > I have the following list of copyField commands in the included xml
>>> file,
>>> : > and get no errors on any but the "content" one.  It just so happens
>>> that
>>> : > "content" is the only field of type "text_general" in there.
>>> : >
>>> : >
>>> : > Any hints greatly appreciated.
>>> : >
>>> : >   <copyField source="title" dest="text"/>
>>> : >    <copyField source="author" dest="text"/>
>>> : >    <copyField source="description" dest="text"/>
>>> : >    <copyField source="keywords" dest="text"/>
>>> : >    <copyField source="content" dest="text"/>
>>> : >    <copyField source="content_type" dest="text"/>
>>> : >    <copyField source="resourcename" dest="text"/>
>>> : >    <copyField source="url" dest="text"/>
>>> : >
>>> : >
>>> :
>>>
>>> -Hoss
>>> http://www.lucidworks.com/
>>>
>>
>>
>
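
Hoss's point about XInclude needing a root tag can be sketched as follows. This is an illustration under stated assumptions, not a confirmed diagnosis: the file name custom-fields.xml is hypothetical, and the thread does not show the actual xinclude call. XInclude only accepts a well-formed document, so the included file needs a single root element, and that root element is copied into the including document along with its children:

```xml
<?xml version="1.0" encoding="UTF-8" ?>
<schema name="example" version="1.6"
        xmlns:xi="http://www.w3.org/2001/XInclude">
  <!-- stock techproducts fields and fieldTypes stay here -->

  <!-- custom-fields.xml is a hypothetical name for the include file quoted above -->
  <xi:include href="custom-fields.xml"/>

  <!-- If custom-fields.xml is wrapped in its own <schema> root, the merged
       document contains a second <schema> element at this point, and the
       included <field> and <copyField> declarations sit one level deeper
       than the ones written directly in schema.xml. -->
</schema>
```

Whether that extra level of nesting is what keeps Solr from seeing the "content" field is not established in this thread; it is one plausible reading of the "doesn't match any explicit field" error.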
