[ https://issues.apache.org/jira/browse/SOLR-2546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Koji Sekiguchi resolved SOLR-2546.
----------------------------------
Resolution: Invalid
> Using hl.useFastVectorHighlighter with copyfield multivalued "boosted" we get
> too much informations
> ---------------------------------------------------------------------------------------------------
>
> Key: SOLR-2546
> URL: https://issues.apache.org/jira/browse/SOLR-2546
> Project: Solr
> Issue Type: Bug
> Components: highlighter
> Affects Versions: 4.0
> Environment: running on linux centos distro with tomcat 5 server
> Reporter: Marc Drolet
>
> I search on a copyField, "Publisher_text", into which I have copied several
> fields, e.g. id, Name, url, email.
> I copied the Name field into that copyField 8 times to boost Name when I
> search on that copyField.
> When I search on that copyField and highlight it with
> hl.useFastVectorHighlighter enabled, the result is truncated and the returned
> string is repeated until hl.fragsize (default 100) is reached.
> Here is my query for this example:
> ?q=Publisher_text%3Aedi&start=0&rows=10&fl=Publisher_text&hl=on&hl.fl=Publisher_text&hl.useFastVectorHighlighter=on
> Here are the results I get:
> <result name="response" numFound="322" start="0">
> <doc>
> <arr name="Publisher_text">
> <str> </str>
> <str/>
> <str>Neil Houston [email protected]</str>
> <str>jyounes</str>
> <str>Rogers Digital Media [New]</str>
> <str>Rogers Digital Media [New]</str>
> <str>Rogers Digital Media [New]</str>
> <str>Rogers Digital Media [New]</str>
> <str>Rogers Digital Media [New]</str>
> <str>Rogers Digital Media [New]</str>
> <str>Rogers Digital Media [New]</str>
> <str>Rogers Digital Media [New]</str>
> <str>Rogers Digital Media</str>
> <str>1 Mount Pleasant Toronto Canada M4Y 2Y5 Ontario</str>
> <str>Corby Fine [email protected]</str>
> <str>2262</str>
> ....
> Here is the highlighting result I get:
> <lst name="highlighting">
> <lst name="Publisher_2262">
> <arr name="Publisher_text">
> <str>
> igital<span class="match"> Me</span>dia [New] Rogers Digital<span
> class="match"> Me</span>dia [New] Rogers Digital<span class="match">
> Me</span>dia [New] Rogers Digital<span class="match"> Me</span>dia [New]
> </str>
> </arr>
> </lst>
> You can see that the start of the string is truncated: it should start with
> "Rogers" but it starts at "igital".
> You can also see that the string is returned 4 times when it should be
> returned only once: "Rogers Digital<span class="match"> Me</span>dia [New]"
> You can also see that hl.tag.pre and hl.tag.post are not in the right
> place: <span class="match"> Me</span>dia should be M<span
> class="match">edi</span>a
> Here is the Publisher_text field definition from my schema:
> <field name="Publisher_text" type="text_wild" indexed="true"
> stored="true" multiValued="true" omitNorms="true" termVectors="true"
> termPositions="true" termOffsets="true"/>
> Here is my text_wild field type definition:
> <fieldType name="text_wild" class="solr.TextField" >
> <analyzer type="index">
> <tokenizer class="solr.NGramTokenizerFactory" minGramSize="3"
> maxGramSize="15" />
> <filter class="solr.LowerCaseFilterFactory"/>
> </analyzer>
> <analyzer type="query">
> <tokenizer class="solr.WhitespaceTokenizerFactory" />
> <filter class="solr.LowerCaseFilterFactory"/>
> </analyzer>
> </fieldType>
> When I remove hl.useFastVectorHighlighter, the query is slower but I get
> the right result:
> <lst name="highlighting">
> <lst name="Publisher_2262">
> <arr name="Publisher_text">
> <str>Rogers Digital<em> Me</em>dia [New]</str>
> </arr>
> </lst>
> I'm running on the nightly build: apache-solr-4.0-2011-05-16_08-24-17-src.tgz
> If you need more information, let me know.
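
As a rough illustration of the offsets the reporter expected, the following sketch enumerates lowercased character n-grams (sizes 3 to 15, as in the text_wild index analyzer) and wraps the gram matching the query term "edi". It is plain Python, not Solr code: the names char_ngrams and highlight are hypothetical, and it does not reproduce the token ordering of NGramTokenizerFactory or the behavior of either Solr highlighter.

```python
def char_ngrams(text, min_gram=3, max_gram=15):
    """Yield (gram, start, end) for every lowercased character n-gram.

    Assumption: a simplified stand-in for an n-gram tokenizer followed by
    a lowercase filter; real Solr token ordering may differ.
    """
    lowered = text.lower()
    for size in range(min_gram, max_gram + 1):
        for start in range(len(lowered) - size + 1):
            yield lowered[start:start + size], start, start + size


def highlight(text, term, pre='<span class="match">', post="</span>"):
    """Wrap every n-gram equal to the lowercased term in pre/post tags."""
    wanted = term.lower()
    spans = sorted({(s, e) for g, s, e in char_ngrams(text) if g == wanted})
    out, cursor = [], 0
    for start, end in spans:
        if start < cursor:  # skip matches overlapping one already emitted
            continue
        out.append(text[cursor:start])
        out.append(pre + text[start:end] + post)
        cursor = end
    out.append(text[cursor:])
    return "".join(out)


value = "Rogers Digital Media [New]"
print(highlight(value, "edi"))
# prints: Rogers Digital M<span class="match">edi</span>a [New]
```

The gram "edi" sits at character offsets 16 to 19 of the stored value, which is where the reporter expected the hl.tag.pre and hl.tag.post markers to land.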