€ 0.02: When indexing code, "++" is a stop term, and it might show up in 
English text as well. 'C' is a not very descriptive but perfectly valid 
variable name. '#' is used in some old Morse transcripts, I think. I am 
not going to die or get fired over this, but I'd suggest not including 
those tokens in a standard anything.
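
To make the concern concrete, here is a rough sketch of how the two 
grammars would disagree. It does not use Lucene at all: the class name 
TokenizerSketch, the OLD and NEW patterns, and the tokenize helper are 
made-up stand-ins for "the grammar as it is" and "the grammar with C++ 
and C# added", and the real StandardAnalyzer grammar is more involved 
than either regex.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-ins for two analyzer grammars; this is NOT Lucene's API.
class TokenizerSketch {
    // Assumed "old" grammar: runs of letters or digits only.
    static final Pattern OLD = Pattern.compile("[A-Za-z0-9]+");
    // Assumed "new" grammar: also keeps "C++" and "C#" as single tokens.
    static final Pattern NEW = Pattern.compile("[Cc](?:\\+\\+|#)|[A-Za-z0-9]+");

    // Collect every match of the grammar, lowercased.
    static List<String> tokenize(Pattern grammar, String text) {
        List<String> tokens = new ArrayList<>();
        Matcher m = grammar.matcher(text);
        while (m.find()) {
            tokens.add(m.group().toLowerCase(Locale.ROOT));
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Index written with the old grammar, query parsed with the new one.
        List<String> indexed = tokenize(OLD, "We use C++ and C# here");
        List<String> query = tokenize(NEW, "C++");
        System.out.println(indexed);                    // [we, use, c, and, c, here]
        System.out.println(query);                      // [c++]
        System.out.println(indexed.containsAll(query)); // false

        // A variable named c in source code is swallowed by the new grammar.
        System.out.println(tokenize(NEW, "i = c++;"));  // [i, c++]
    }
}
```

The first three lines of output show the upgrade-without-re-indexing 
problem Erik raises in the quoted mail: the old index only holds "c", so 
a query analyzed into "c++" finds nothing. The last line shows my point 
about code, where the variable 'c' followed by an increment turns into 
the token "c++".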

Erik Hatcher wrote:

> I personally don't have a problem with that change, however I don't 
> like changing such things as they can lead to unexpected and confusing 
> issues later. Suppose someone upgrades their version of Lucene without 
> re-indexing and now queries that used to work no longer work? (sure, I 
> agree it is wise to re-index if you upgrade Lucene).
>
> Perhaps others could chime in on whether this change would adversely 
> affect them or if this is a desirable change?
>
> Erik
>
>
>
> On Jan 17, 2005, at 4:51 AM, Chris Lamprecht wrote:
>
>> Erik, Paul, Daniel,
>>
>> I submitted a testcase --
>> http://issues.apache.org/bugzilla/show_bug.cgi?id=33134
>>
>> On a related note, what do you all think about updating the
>> StandardAnalyzer grammar to treat "C#" and "C++" as tokens? It's a
>> small modification to the grammar -- NutchAnalysis.jj has it.
>>
>> -Chris
>>
>> On Mon, 17 Jan 2005 03:23:41 -0500, Erik Hatcher
>> <[EMAIL PROTECTED]> wrote:
>>
>>> I don't see any tests of StandardAnalyzer either. Your contribution
>>> would be most welcome. There are tests that use StandardAnalyzer, but
>>> not to test it directly.
>>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: [EMAIL PROTECTED]
>> For additional commands, e-mail: [EMAIL PROTECTED]


