Well, unfortunately, this is a trap that users do hit. If the user were required to specify the limit when creating a PostingsHighlighter, they would have to think about it and would realize they are in fact setting a limit.
Silent limits are dangerous because you don't offhand know what's wrong / why you see nothing getting highlighted.

Mike McCandless

http://blog.mikemccandless.com

On Tue, Oct 15, 2013 at 9:42 AM, Robert Muir <rcm...@gmail.com> wrote:
> I strongly disagree: there is no trap, its a reasonable default for
> good summarization, and the behavior is no different than the other
> highlighters here.
>
> Typically people *do* care about performance and its important to have
> a clean simple API too.
>
> In my opinion increasing this limit is very esoteric: usually
> sentences that deep do not summarize the document well.
>
>
>
> On Tue, Oct 15, 2013 at 9:38 AM, Michael McCandless
> <luc...@mikemccandless.com> wrote:
>> Maybe we should make the max length a required argument to
>> PostingsHighlighter ctor?
>>
>> Because it's trappy now, since you don't realize offhand that it's
>> silently enforcing a limit ...
>>
>> Mike McCandless
>>
>> http://blog.mikemccandless.com
>>
>>
>> On Tue, Oct 15, 2013 at 9:31 AM, Robert Muir <rcm...@gmail.com> wrote:
>>> Thanks Jon. Ill add some stuff to the javadocs here to try to make it
>>> more obvious.
>>>
>>> On Tue, Oct 15, 2013 at 5:54 AM, Jon Stewart
>>> <j...@lightboxtechnologies.com> wrote:
>>>> Awesome, that did it! I didn't realize that DEFAULT_MAX_LENGTH was
>>>> only 10,000. I've now upped it to 16MB (I'm not doing the usual thing
>>>> and performance is not a particular concern).
>>>>
>>>> Thanks,
>>>>
>>>> Jon
>>>>
>>>>
>>>> On Mon, Oct 14, 2013 at 9:58 PM, Robert Muir <rcm...@gmail.com> wrote:
>>>>> are your documents large?
>>>>>
>>>>> try PostingsHighlighter(int) ctor with a larger value than
>>>>> DEFAULT_MAX_LENGTH.
>>>>>
>>>>> sounds like the passages you see with matches are very deep into the
>>>>> document and its just hitting the default limit and returning the
>>>>> default summarization (getEmptyHighlight())
>>>>>
>>>>> otherwise, please open a JIRA issue :)
>>>>>
>>>>> On Mon, Oct 14, 2013 at 9:32 PM, Jon Stewart
>>>>> <j...@lightboxtechnologies.com> wrote:
>>>>>> I upgraded to 4.5. Same results, unfortunately. Most docs in the
>>>>>> result set will have a Passage where numMatches() > 0, but some do
>>>>>> not. In these cases, the Passage array's length is greater than zero.
>>>>>>
>>>>>>
>>>>>> Jon
>>>>>>
>>>>>>
>>>>>> On Mon, Oct 14, 2013 at 5:24 PM, Robert Muir <rcm...@gmail.com> wrote:
>>>>>>> did you try the latest release? There are some bugs fixed...
>>>>>>>
>>>>>>> On Mon, Oct 14, 2013 at 2:11 PM, Jon Stewart
>>>>>>> <j...@lightboxtechnologies.com> wrote:
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> I've observed that when using PostingsHighlighter in Lucene 4.4 that
>>>>>>>> some of the responsive documents in TopDocs will have zero matches in
>>>>>>>> the associated array of Passage objects. I.e., in the call of
>>>>>>>> PassageFormatter.format(), there will be some calls where none of the
>>>>>>>> Passage objects in the array will have matches. I've seen this on a
>>>>>>>> simple one-word query, where the word clearly exists in the Document's
>>>>>>>> text for the field (and the Document is included in the TopDocs result
>>>>>>>> set).
>>>>>>>>
>>>>>>>> Any ideas?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Jon
>>>>>>>> --
>>>>>>>> Jon Stewart, Principal
>>>>>>>> (646) 719-0317 | j...@lightboxtechnologies.com | Arlington, VA
>>>>>>>>
>>>>>>>> ---------------------------------------------------------------------
>>>>>>>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>>>>>>>> For additional commands, e-mail: java-user-h...@lucene.apache.org
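
For anyone reading this thread later, here is a rough sketch of why a silent length limit produces zero matches for a term that clearly exists in the document. This is a toy model, not Lucene's PostingsHighlighter code: the class SilentLimitDemo and the method countMatches are hypothetical names made up for illustration, and the only detail taken from the thread is the 10,000 default. It assumes the highlighter simply ignores everything past the first maxLength characters.

```java
/** Toy model only: NOT Lucene's PostingsHighlighter implementation. */
final class SilentLimitDemo {
    // Same value the thread reports for DEFAULT_MAX_LENGTH (10,000).
    static final int DEFAULT_MAX_LENGTH = 10_000;

    /**
     * Counts non-overlapping occurrences of term, looking only at the
     * first maxLength characters of content (the assumed silent limit).
     */
    static int countMatches(String content, String term, int maxLength) {
        if (term.isEmpty()) {
            throw new IllegalArgumentException("term must be non-empty");
        }
        if (maxLength < 0) {
            throw new IllegalArgumentException("maxLength must be >= 0");
        }
        // Everything past maxLength is dropped without any warning.
        String window = content.substring(0, Math.min(content.length(), maxLength));
        int count = 0;
        int from = 0;
        while ((from = window.indexOf(term, from)) >= 0) {
            count++;
            from += term.length();
        }
        return count;
    }

    public static void main(String[] args) {
        // Build a document whose only match sits past the 10,000-char mark.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 3000; i++) {
            sb.append("filler "); // 7 chars each, 21,000 chars in total
        }
        sb.append("needle");
        String doc = sb.toString();

        // Default limit: the match is never seen.
        System.out.println(countMatches(doc, "needle", DEFAULT_MAX_LENGTH));
        // Raised limit (16MB, as Jon chose): the match is found.
        System.out.println(countMatches(doc, "needle", 16 * 1024 * 1024));
    }
}
```

In the real API the corresponding fix is the one Robert suggested above: pass a larger value to the PostingsHighlighter(int) ctor instead of relying on DEFAULT_MAX_LENGTH.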