[ https://issues.apache.org/jira/browse/SOLR-1020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12673544#action_12673544 ]
Karl Wettin commented on SOLR-1020:
-----------------------------------
I forgot to mention that I'm also looking at a binary solution for SolrJ.
> PreAnalyzed field analyzer
> --------------------------
>
> Key: SOLR-1020
> URL: https://issues.apache.org/jira/browse/SOLR-1020
> Project: Solr
> Issue Type: New Feature
> Components: Analysis
> Affects Versions: 1.3
> Reporter: Karl Wettin
> Priority: Minor
> Attachments: SOLR-1020.txt
>
>
> An Analyzer that produces a TokenStream from XML input containing a
> marshalled TokenStream. The patch also contains a static TokenStream XML
> marshaller.
> I kind of pulled this out of my pocket without testing it in a real
> environment in order to get some comments on the solution before I add it to
> my project. So consider it a beta patch.
> It uses the JSR 173 XML streaming API (StAX), which is included in Java 1.6
> and available for Java 1.5 as a download from https://sjsxp.dev.java.net/
> XSD:
> {code:xml}
> <?xml version="1.0" encoding="UTF-8"?>
> <xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified"
>            xmlns:xs="http://www.w3.org/2001/XMLSchema">
>   <xs:element name="tokens" type="tokensType"/>
>   <xs:complexType name="tokensType">
>     <xs:sequence>
>       <xs:element type="tokenType" name="token"/>
>     </xs:sequence>
>   </xs:complexType>
>   <xs:complexType name="tokenType">
>     <xs:sequence>
>       <xs:element type="xs:int" name="positionIncrement" maxOccurs="1"/>
>       <xs:element type="xs:string" name="term" minOccurs="1" maxOccurs="1"/>
>       <xs:element type="xs:string" name="type" maxOccurs="1"/>
>       <xs:element type="xs:int" name="startOffset" maxOccurs="1"/>
>       <xs:element type="xs:int" name="endOffset" maxOccurs="1"/>
>       <xs:element type="xs:int" name="flags" maxOccurs="1"/>
>       <xs:element type="payloadType" name="payload" maxOccurs="1"/>
>     </xs:sequence>
>   </xs:complexType>
>   <xs:complexType name="payloadType">
>     <xs:choice maxOccurs="1" minOccurs="1">
>       <xs:element type="bytesType" name="bytes"/>
>       <xs:element type="xs:string" name="hex"/>
>       <xs:element type="xs:string" name="base64"/>
>     </xs:choice>
>   </xs:complexType>
>   <xs:complexType name="bytesType">
>     <xs:sequence>
>       <xs:element type="xs:byte" name="byte" maxOccurs="unbounded" minOccurs="1"/>
>     </xs:sequence>
>   </xs:complexType>
> </xs:schema>
> {code}
> Although the XSD defines several ways to encode a Payload (<bytes>, <hex>
> and <base64>), only <hex> is currently supported.
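For illustration, decoding the <hex> payload form could look roughly like the following. This is a minimal sketch, assuming an even number of hex digits with no separators (as in `fffefd`); the class name `HexPayload` is hypothetical and is not taken from the attached patch.

```java
/** Hypothetical helper, not part of SOLR-1020.txt: decodes a <hex> payload. */
public final class HexPayload {

    private HexPayload() {
    }

    /** Decodes a hex string such as "fffefd" into its raw bytes. */
    public static byte[] decode(String hex) {
        String s = hex.trim();
        if (s.length() % 2 != 0) {
            throw new IllegalArgumentException("Odd number of hex digits: " + s);
        }
        byte[] out = new byte[s.length() / 2];
        for (int i = 0; i < out.length; i++) {
            // Character.digit returns -1 for anything that is not a hex digit.
            int hi = Character.digit(s.charAt(2 * i), 16);
            int lo = Character.digit(s.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("Not a hex digit in: " + s);
            }
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }
}
```

The resulting byte array is what would presumably end up wrapped in the token's Payload.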
> Example XML:
> {code:xml}
> <tokens>
>   <token>
>     <positionIncrement>1</positionIncrement>
>     <term>term</term>
>     <type>type</type>
>     <startOffset>0</startOffset>
>     <endOffset>3</endOffset>
>     <flags>65535</flags>
>     <payload><hex>fffefd</hex></payload>
>   </token>
> </tokens>
> {code}
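To make the format concrete, here is a rough sketch of reading the XML above with the JSR 173 (StAX) cursor API. It is not the code from the attached patch: `TokenXmlReader` and `ParsedToken` are hypothetical names, the holder class stands in for a Lucene Token so the example has no Lucene dependency, and the payload is kept as the undecoded <hex> string.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

/** Hypothetical sketch: reads <tokens> XML into plain holder objects. */
public final class TokenXmlReader {

    /** Stand-in for a Lucene Token; field names follow the XSD elements. */
    public static final class ParsedToken {
        public int positionIncrement = 1;
        public String term;
        public String type;
        public int startOffset;
        public int endOffset;
        public int flags;
        public String payloadHex; // content of <hex>, left undecoded
    }

    private TokenXmlReader() {
    }

    public static List<ParsedToken> read(String xml) {
        List<ParsedToken> tokens = new ArrayList<ParsedToken>();
        try {
            XMLStreamReader r = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            ParsedToken current = null;
            StringBuilder text = new StringBuilder();
            while (r.hasNext()) {
                int event = r.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    text.setLength(0);
                    if ("token".equals(r.getLocalName())) {
                        current = new ParsedToken();
                    }
                } else if (event == XMLStreamConstants.CHARACTERS
                        || event == XMLStreamConstants.CDATA) {
                    // Text may arrive in several chunks, so accumulate it.
                    text.append(r.getText());
                } else if (event == XMLStreamConstants.END_ELEMENT) {
                    String name = r.getLocalName();
                    String value = text.toString().trim();
                    text.setLength(0);
                    if (current == null) {
                        continue; // closing </tokens>
                    }
                    if ("positionIncrement".equals(name)) {
                        current.positionIncrement = Integer.parseInt(value);
                    } else if ("term".equals(name)) {
                        current.term = value;
                    } else if ("type".equals(name)) {
                        current.type = value;
                    } else if ("startOffset".equals(name)) {
                        current.startOffset = Integer.parseInt(value);
                    } else if ("endOffset".equals(name)) {
                        current.endOffset = Integer.parseInt(value);
                    } else if ("flags".equals(name)) {
                        current.flags = Integer.parseInt(value);
                    } else if ("hex".equals(name)) {
                        current.payloadHex = value;
                    } else if ("token".equals(name)) {
                        tokens.add(current);
                        current = null;
                    }
                }
            }
            r.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Malformed token XML", e);
        }
        return tokens;
    }
}
```

Calling `TokenXmlReader.read(...)` on the example document yields a single token with term `term`, offsets 0 and 3, flags 65535 and payloadHex `fffefd`. A real analyzer would presumably emit tokens lazily from the stream rather than collecting them in a list first.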
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.