Hi Michael,

Have you considered the DataImportHandler? You could use the LineEntityProcessor to create one field per line and then use copyField to collect everything into the AllData field.
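
A rough, untested sketch of what such a DataImportHandler config might look like. It rests on assumptions that are not stated in this thread: that each record lives in its own text file under a hypothetical directory /path/to/records, that the file path can serve as a hypothetical id field, and that Title and Date fields exist in the schema. The entity names "record" and "line" and the data source name "files" are made up for illustration.

```xml
<dataConfig>
  <!-- Reader-based data source for the inner entity; the name "files" is hypothetical -->
  <dataSource name="files" type="FileDataSource" encoding="UTF-8"/>
  <document>
    <!-- Outer entity: one Solr document per file (assumed layout, one record per .txt file) -->
    <entity name="record" processor="FileListEntityProcessor"
            baseDir="/path/to/records" fileName=".*\.txt"
            rootEntity="true" dataSource="null">
      <!-- Assumption: the absolute file path is used as the unique key -->
      <field column="fileAbsolutePath" name="id"/>

      <!-- Inner entity: LineEntityProcessor emits one row per line, in the column "rawLine" -->
      <entity name="line" processor="LineEntityProcessor"
              url="${record.fileAbsolutePath}" dataSource="files"
              transformer="RegexTransformer">
        <!-- Each regex should only produce a value for the line it matches -->
        <field column="Title" sourceColName="rawLine" regex="^Title: (.*)$"/>
        <field column="OnlyAuthor" sourceColName="rawLine" regex="^Author: (.*)$"/>
        <field column="Date" sourceColName="rawLine" regex="^Date: (.*)$"/>
      </entity>
    </entity>
  </document>
</dataConfig>
```

In schema.xml the per-line fields would then be copied into AllData, which has to be multiValued to accept several sources:

```xml
<copyField source="Title" dest="AllData"/>
<copyField source="OnlyAuthor" dest="AllData"/>
<copyField source="Date" dest="AllData"/>
```

With this layout OnlyAuthor no longer needs the PatternReplaceCharFilterFactory at all, since it receives only the author value. Note that AllData would then hold the bare values without the "Title:" / "Author:" / "Date:" prefixes.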
http://wiki.apache.org/solr/DataImportHandler#LineEntityProcessor

Chantal

On Tue, 2011-08-23 at 12:28 +0200, Michael Kliewe wrote:
> Hello all,
>
> I have a custom schema which has a few fields, and I would like to create a
> new field in the schema that only has one special line of another field
> indexed. Let's use this example:
>
> field AllData (TextField) has for example this data:
> Title: exampleTitle of the book
> Author: Example Author
> Date: 01.01.1980
>
> Each line is separated by a line break.
> I now need a new field named OnlyAuthor which only has the Author information
> in it, so I can search and facet for specific Author information. I added
> this to my schema:
>
> <fieldType name="authorField" class="solr.TextField">
>   <analyzer type="index">
>     <charFilter class="solr.PatternReplaceCharFilterFactory"
>       pattern="^.*\nAuthor: (.*?)\n.*$" replacement="$1" replace="all" />
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.TrimFilterFactory"/>
>   </analyzer>
>   <analyzer type="query">
>     <charFilter class="solr.PatternReplaceCharFilterFactory"
>       pattern="^.*\nAuthor: (.*?)\n.*$" replacement="$1" replace="all" />
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.TrimFilterFactory"/>
>   </analyzer>
> </fieldType>
>
> <field name="OnlyAuthor" type="authorField" indexed="true" stored="true" />
>
> <copyField source="AllData" dest="OnlyAuthor"/>
>
> But this is not working: the new OnlyAuthor field contains all data, because
> the regex didn't match. But I need "Example Author" in that field (I think)
> to be able to search and facet only author information.
>
> I don't know where the problem is. Perhaps one of you can give me a hint,
> or a totally different method to achieve my goal of extracting a single line
> from this multi-line text.
>
> Kind regards and thanks for any help
> Michael