Hi Solr users,

I have a lot of dates from a library catalog that are not in a
solr.DateField-compatible format. I wrote a new <fieldType>
definition inside the solrconfig.xml, which creates
e.g. 1991-01-01T00:00:01Z from the input string '[c1991.]'.
It works fine when I try it with typical values
in http://localhost:8983/solr/admin/analysis.jsp,
but it always throws an exception when I try to index
the records.

<fieldType name="trickyDate" class="solr.DateField"
 sortMissingLast="true" omitNorms="true">
 <analyzer>
   <tokenizer class="solr.KeywordTokenizerFactory"/>
   <filter class="solr.LowerCaseFilterFactory" />
   <filter class="solr.TrimFilterFactory" />
   <filter class="solr.PatternReplaceFilterFactory"
     pattern="sh..?wa \d\d? " replacement="" replace="first"/>
   <filter class="solr.PatternReplaceFilterFactory"
     pattern="june (\d\d), " replacement="" replace="first"/>
   <filter class="solr.PatternReplaceFilterFactory"
     pattern="september (\d\d), " replacement="" replace="first"/>
   <filter class="solr.PatternReplaceFilterFactory"
     pattern="(\D)" replacement="" replace="all"/>
   <filter class="solr.PatternReplaceFilterFactory"
     pattern="^(\d{4})\d*$" replacement="$1-01-01T00:00:01Z"
     replace="all"/>
 </analyzer>
</fieldType>

It is quite possible that I misunderstand something. What I would
like to do is somehow 'normalize' the input data, and I thought
that this would be more efficient on the Solr side than in the client.
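
For comparison, the client-side alternative would look roughly like the
sketch below. It is only an illustration in Python that mirrors the filter
chain above step by step; the function name normalize_catalog_date is
hypothetical, the trailing 'Z' follows the example output quoted earlier in
this message, and returning None for unparseable input is my own assumption
(the Solr filters would pass such a token through unchanged).

```python
import re

# Patterns copied from the PatternReplaceFilterFactory entries above;
# each one is removed at most once (replace="first").
LEADING_NOISE = (
    r"sh..?wa \d\d? ",
    r"june (\d\d), ",
    r"september (\d\d), ",
)


def normalize_catalog_date(raw):
    """Turn a catalog date such as '[c1991.]' into a DateField-style string.

    Hypothetical helper: returns None when no four-digit year can be found.
    """
    # LowerCaseFilterFactory + TrimFilterFactory
    value = raw.lower().strip()

    # Drop the known textual prefixes, first occurrence only.
    for pattern in LEADING_NOISE:
        value = re.sub(pattern, "", value, count=1)

    # Remove every non-digit character (replace="all").
    value = re.sub(r"\D", "", value)

    # Keep the first four digits as the year and discard the rest.
    match = re.fullmatch(r"(\d{4})\d*", value)
    if match is None:
        return None
    return match.group(1) + "-01-01T00:00:01Z"


if __name__ == "__main__":
    for sample in ("[c1991.]", "June 12, 1985", "n.d."):
        print(repr(sample), "->", normalize_catalog_date(sample))
```

With this approach the value sent to Solr would already be a well-formed
date string, so the field could stay a plain solr.DateField.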

Do you have any advice on how I should proceed?

Péter
