The "SimplePreAnalyzedParser" page has been changed by AndrzejBialecki:
http://wiki.apache.org/solr/SimplePreAnalyzedParser

New page:

= SimplePreAnalyzedParser format =
This page describes the simple serialization format for the PreAnalyzedField type.

== General syntax ==
The format of the serialization is as follows:

{{{
content ::= version (stored)? tokens
version ::= digit+ " "
; stored field value - any "=" inside must be escaped!
stored ::= "=" text "="
tokens ::= (token ((" ") + token)*)*
token ::= text ("," attrib)*
attrib ::= name '=' value
name ::= text
value ::= text
}}}

Special characters in "text" values can be escaped using the escape character `\`. The following escape sequences are recognized:

{{{
"\ " - literal space character
"\," - literal , character
"\=" - literal = character
"\\" - literal \ character
"\n" - newline
"\r" - carriage return
"\t" - horizontal tab
}}}

Please note that Unicode sequences (e.g. \u0001) are not supported.

== Supported attribute names ==
The following token attributes are supported, and identified with short symbolic names:

 * `i` - position increment (integer)
 * `s` - token offset, start position (integer)
 * `e` - token offset, end position (integer)
 * `y` - token type (string)
 * `f` - token flags (hexadecimal integer)
 * `p` - payload (bytes in hexadecimal format)

Token positions are tracked and added to the token stream implicitly. The start and end offsets count only the term text and whitespace, and exclude the space taken by token attributes.
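To illustrate the grammar and escape rules above, here is a minimal Python sketch that builds a serialized value from a list of tokens. It is not the Solr implementation: the function names `escape` and `serialize` are hypothetical, attribute values are passed through `str()` as given (so hexadecimal flags and payloads must already be formatted by the caller), and the sketch assumes that only "=" needs escaping inside the stored part, as the grammar comment suggests.

```python
# Characters that must be backslash-escaped inside "text" values,
# taken from the escape-sequence table of the format description.
_ESCAPES = {
    "\\": "\\\\",
    " ": "\\ ",
    ",": "\\,",
    "=": "\\=",
    "\n": "\\n",
    "\r": "\\r",
    "\t": "\\t",
}


def escape(text):
    """Escape a term, attribute name or attribute value."""
    return "".join(_ESCAPES.get(ch, ch) for ch in text)


def serialize(tokens, stored=None, version=1):
    """Build a serialized value (hypothetical helper).

    tokens: iterable of (term, attrs) pairs, where attrs is a dict
            mapping short attribute names (i, s, e, y, f, p) to values.
    stored: optional stored field value; None means no stored part.
    """
    parts = [str(version), " "]
    if stored is not None:
        # Assumption: only "=" has to be escaped in the stored part.
        parts.append("=" + stored.replace("=", "\\=") + "=")
    token_strings = []
    for term, attrs in tokens:
        pieces = [escape(term)]
        for name, value in attrs.items():
            pieces.append(escape(name) + "=" + escape(str(value)))
        token_strings.append(",".join(pieces))
    parts.append(" ".join(token_strings))
    return "".join(parts)
```

For example, `serialize([("one", {"s": 123, "e": 128, "i": 22}), ("two", {})])` produces `1 one,s=123,e=128,i=22 two`, and `serialize([], stored="")` produces `1 ==`.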
== Example token streams ==

{{{
1 one two three
 - version 1
 - stored: 'null'
 - tok: '(term=one,startOffset=0,endOffset=3)'
 - tok: '(term=two,startOffset=4,endOffset=7)'
 - tok: '(term=three,startOffset=8,endOffset=13)'

1  one  two   three
 - version 1
 - stored: 'null'
 - tok: '(term=one,startOffset=1,endOffset=4)'
 - tok: '(term=two,startOffset=6,endOffset=9)'
 - tok: '(term=three,startOffset=12,endOffset=17)'

1 one,s=123,e=128,i=22  two three,s=20,e=22
 - version 1
 - stored: 'null'
 - tok: '(term=one,positionIncrement=22,startOffset=123,endOffset=128)'
 - tok: '(term=two,positionIncrement=1,startOffset=5,endOffset=8)'
 - tok: '(term=three,positionIncrement=1,startOffset=20,endOffset=22)'

1 \ one\ \,,i=22,a=\, two\= \n,\ =\ \
 - version 1
 - stored: 'null'
 - tok: '(term= one ,,positionIncrement=22,startOffset=0,endOffset=6)'
 - tok: '(term=two= ,positionIncrement=1,startOffset=7,endOffset=15)'
 - tok: '(term=\,positionIncrement=1,startOffset=17,endOffset=18)'

1 ,i=22 ,i=33,s=2,e=20 ,
 - version 1
 - stored: 'null'
 - tok: '(term=,positionIncrement=22,startOffset=0,endOffset=0)'
 - tok: '(term=,positionIncrement=33,startOffset=2,endOffset=20)'
 - tok: '(term=,positionIncrement=1,startOffset=2,endOffset=2)'

1 =This is the stored part with \= \n \t escapes.=one two three
 - version 1
 - stored: 'This is the stored part with = \n \t escapes.'
 - tok: '(term=one,startOffset=0,endOffset=3)'
 - tok: '(term=two,startOffset=4,endOffset=7)'
 - tok: '(term=three,startOffset=8,endOffset=13)'

1 ==
 - version 1
 - stored: ''
 - (no tokens)

1 =this is a test.=
 - version 1
 - stored: 'this is a test.'
 - (no tokens)
}}}
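The way implicit offsets in these examples follow from the term text and whitespace can be sketched with a small reader. This is a simplified Python illustration, not the Solr parser: the names `unescape`, `split_unescaped` and `parse` are hypothetical, every token gets a default position increment of 1, unknown attribute names are ignored, and in the stored part only `\=` is unescaped (an assumption based on the stored example above, where `\n` and `\t` appear unchanged).

```python
_UNESCAPE = {"n": "\n", "r": "\r", "t": "\t"}


def unescape(text):
    """Resolve backslash escapes; a trailing lone backslash is kept."""
    out, i = [], 0
    while i < len(text):
        if text[i] == "\\" and i + 1 < len(text):
            out.append(_UNESCAPE.get(text[i + 1], text[i + 1]))
            i += 2
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


def split_unescaped(text, sep):
    """Split on sep, skipping separators that are backslash-escaped."""
    parts, cur, i = [], [], 0
    while i < len(text):
        if text[i] == "\\" and i + 1 < len(text):
            cur.append(text[i:i + 2])  # keep the escape pair intact
            i += 2
        elif text[i] == sep:
            parts.append("".join(cur))
            cur = []
            i += 1
        else:
            cur.append(text[i])
            i += 1
    parts.append("".join(cur))
    return parts


def parse(value):
    """Return (version, stored, tokens); tokens are plain dicts."""
    version_str, sep, rest = value.partition(" ")
    if not sep or not version_str.isdigit():
        raise ValueError("missing version prefix")
    stored = None
    if rest.startswith("="):
        end = 1
        while end < len(rest) and rest[end] != "=":
            end += 2 if rest[end] == "\\" else 1
        if end >= len(rest):
            raise ValueError("unterminated stored part")
        # Assumption: only "\=" is unescaped in the stored part.
        stored = rest[1:end].replace("\\=", "=")
        rest = rest[end + 1:]
    tokens, offset = [], 0
    for k, raw in enumerate(split_unescaped(rest, " ")):
        if k > 0:
            offset += 1  # each separating space advances the offset
        if raw == "":
            continue  # gap between consecutive spaces, not a token
        term_raw, *attr_raws = split_unescaped(raw, ",")
        term = unescape(term_raw)
        # Implicit offsets cover the term text only, not the attributes.
        tok = {"term": term, "i": 1, "s": offset, "e": offset + len(term)}
        offset += len(term)
        for attr in attr_raws:
            name_raw, *value_raws = split_unescaped(attr, "=")
            if not value_raws:
                continue  # empty or malformed attribute
            name, val = unescape(name_raw), unescape("=".join(value_raws))
            if name in ("i", "s", "e"):
                tok[name] = int(val)
            elif name == "y":
                tok["y"] = val
            elif name == "f":
                tok["f"] = int(val, 16)
            elif name == "p":
                tok["p"] = bytes.fromhex(val)
        tokens.append(tok)
    return int(version_str), stored, tokens
```

Calling `parse("1 one two three")` yields version 1, no stored value, and three tokens whose `s`/`e` values are 0/3, 4/7 and 8/13, matching the first example. Explicit `s` and `e` attributes override the implicit values for that token only; the running offset continues from the term text and whitespace.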