Cheng,
What version of Jackrabbit are you running?
-Christopher
Cheng Zhang wrote:
> Thanks, Jukka. Please find attached my version of HTMLParser.java.
>
>
> ----- Original Message ----
> From: Jukka Zitting <[email protected]>
> To: [email protected]
> Sent: Sunday, January 4, 2009 1:48:10 AM
> Subject: Re: search results
>
> Hi,
>
> On Sun, Jan 4, 2009 at 3:08 AM, Cheng Zhang <[email protected]> wrote:
>
>> It turns out that org.apache.jackrabbit.extractor.HTMLParser eats all
>> digits: in the filterAndJoin method, all non-letters are removed.
>> Does anybody have any idea why we do this? IMO, indexing "hf100" makes
>> more sense than indexing "hf".
>>
>
> I don't recall any specific reason why digits should be dropped. I'd
> be happy to apply the fix if you've already fixed this and would like
> to attach the patch to Jira.
>
>
>> Or is there any way I can configure Jackrabbit to use my HTMLParser
>> instead of the default?
>>
>
> Look at the textFilterClasses parameter in the <SearchIndex/>
> configuration of your repository.xml and workspace.xml files.
>
> BR,
>
> Jukka Zitting
>
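
For readers following the thread, here is a minimal sketch of the behaviour under discussion. It is not the actual Jackrabbit code: the class TokenFilterSketch and the method filterText are hypothetical names, and the only assumption taken from the thread is that the filter keeps letters and drops everything else. Switching the test from Character.isLetter to Character.isLetterOrDigit is one plausible way to keep "hf100" intact.

```java
public class TokenFilterSketch {

    // Hypothetical stand-in for the filtering step described in the thread.
    // keepDigits = false mimics the reported behaviour (letters only);
    // keepDigits = true shows the proposed behaviour (letters and digits).
    static String filterText(String text, boolean keepDigits) {
        StringBuilder out = new StringBuilder();
        boolean pendingSpace = false;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            boolean keep = keepDigits
                    ? Character.isLetterOrDigit(c)
                    : Character.isLetter(c);
            if (keep) {
                // Collapse any run of whitespace into a single separator.
                if (pendingSpace && out.length() > 0) {
                    out.append(' ');
                }
                out.append(c);
                pendingSpace = false;
            } else if (Character.isWhitespace(c)) {
                pendingSpace = true;
            }
            // Any other character (digits when keepDigits is false,
            // punctuation in both modes) is silently dropped.
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(filterText("hf100 printer", false)); // hf printer
        System.out.println(filterText("hf100 printer", true));  // hf100 printer
    }
}
```

With the letters-only test, a search for "hf100" cannot match because only "hf" reaches the index, which is consistent with the symptom reported above.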