On 29/01/2011 01:45, Koji Sekiguchi wrote:
(11/01/25 2:14), Paul Taylor wrote:
On 22/01/2011 15:43, Koji Sekiguchi wrote:
(11/01/20 22:19), Paul Taylor wrote:
Trying to extend MappingCharFilter so that it only changes a token
if the length of the token
matches the length of singleMatch in NormalizeCharMap (currently
the singleMatch just has to be
found in the token I want ut to match the whole token). Can this be
done it sounds simple enough but
I cannot make any headway understanding the MappingCharFilter
source code
thanks Paul
Paul,
Can you give us a concrete input/output (you wanted) with mapping table
so that I can understand what you want?
Thanks,
Koji
Sure
charConvertMap.add("!!!","ApostropheApostropheApostrophe");
charConvertMap.add("*** ***","StarStarStar");
charConvertMap.add("!","Apostrophe");
Normally, punctuation gets removed during index and searching which
is what I want for good search
results but when the token only contains specific punctuation strings
I don't want to remove the
punctuation because it would make it impossible to match, so I
convert it to a textual representation.
As it stands in the 3rd case '!' will be preserved wherever it is
found, so to get a good match on
'Wow!' you would have to search for 'Wow!. But I want you to be able
to search for 'Wow' and it
return 'Wow!' which is the case if "!" isn't in the char convert map,
but if you searched for '!' I
want it to return the token which is just '!' which is only the case
if the value is added to the map.
I need to do this because the text we are indexing and searching are
short strings representing an
music artist name (there is an artist called !!!)
thanks Paul
Hi Paul,
Still I'm not sure I understand your issue correctly, but if you want:
query="Wow!" result="Wow!"
query="Wow" result="Wow!"
query="!" result="Wow!"
query="!!!" result="!!!"
does the following maps solve your problem?
(I assume you use Whitespace-type-Tokenizer here)
charConvertMap.add("!!!","ApostropheApostropheApostrophe");
charConvertMap.add("!"," Apostrophe"); // there is a space in front
of "!"
Koji
No, the list of names your solution would convert all cases of
apostrophe which is not what I want to, and I need to do this for is
much larger than the two examples I give here,so you cannot rely on the
order they are added
in.
Is it possible to help me with the original question, how do I subclass
MaapingCharFilter so that it only changes complete matching tokens.
i.e if my charconvertmap contained
charConvertMap.add("!!!","ApostropheApostropheApostrophe");
it would convert a token of !!! to 'ApostropheApostropheApostrophe' but
a token of 'Hello!!!' becomes Hello
Paul
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org