It may be a tokenization issue: if the apostrophe is causing a word break,
"Int'l" is split into several tokens and your custom stem is never matched.
What does this give you: cts:tokenize(cts:stem("Int'l"))?
Do things work as you expect for a custom stem that doesn't have a
punctuation character in it?
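
To see where the tokenizer breaks the word, something like the following
sketch may help. It only lists each token of "Int'l" with its token class;
the "en" language argument is an assumption based on your English
dictionary.

```xquery
xquery version "1.0-ml";

(: Label each token of "Int'l" as word, punctuation, or whitespace. :)
for $token in cts:tokenize("Int'l", "en")
return
  if ($token instance of cts:word)
  then fn:concat("word: ", $token)
  else if ($token instance of cts:punctuation)
  then fn:concat("punctuation: ", $token)
  else fn:concat("space: ", $token)
```

If the apostrophe comes back as its own punctuation token, the stemmer never
sees "Int'l" as a single word, which would explain why the dictionary entry
is not applied.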
A workaround is to define a field with a custom tokenizer override that
makes the apostrophe a word character. The override applies only to that
specific field, however, and not to word queries in general.
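
A rough sketch of such an override using the Admin API is below. The
database name "my-database" and the field name "abbrev" are placeholders I
made up, and the field is assumed to exist already; adjust both to your
setup and try it on a test database first.

```xquery
xquery version "1.0-ml";

import module namespace admin = "http://marklogic.com/xdmp/admin"
  at "/MarkLogic/admin.xqy";

(: "my-database" and "abbrev" are hypothetical names, not from your setup. :)
let $config   := admin:get-configuration()
let $db       := xdmp:database("my-database")
(: Treat the apostrophe as a word character within this field only. :)
let $override := admin:database-tokenizer-override("'", "word")
let $config   := admin:database-add-field-tokenizer-override(
                   $config, $db, "abbrev", $override)
return admin:save-configuration($config)
```

Once the database has reindexed, a field query such as
cts:field-word-query("abbrev", "Int'l") should see "Int'l" as one word. I
believe cts:tokenize also accepts a field name as an optional third
argument, which would let you check the field's tokenization directly, but
please confirm that against the documentation for your version.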
Regardless, you should probably report a bug to ML support.
//Mary
On Wed, 22 Jul 2015 08:02:33 -0700, Rhodes, David (LNG-CON)
<[email protected]> wrote:
> I am trying to use a custom dictionary to extend the set of stemmed
> words.
>
> I am using MarkLogic 7.0, and have been following the documentation
> guides in Chapters 17 and 18:
> http://docs.marklogic.com/7.0/guide/search-dev/stemming
> http://docs.marklogic.com/7.0/guide/search-dev/custom-dictionaries
>
> I noted that there are two ways to see if words are resolving to their
> stems:
>
> cts:stem(word) returns the stems of word
>
> and
>
> cts:contains(word, stem) returns true if these two terms resolve to the
> same stem
>
> I confirmed that both of these work for terms that are in the default
> dictionary (e.g., run and running, bite and bitten)
>
> I have added a custom dictionary that adds "Int'l" as a word with
> "International" as its stem.
>
> cdict:dictionary-write("en",$dict)
>
> With that dictionary added as the custom dictionary for English,
> cts:stem works but cts:contains does not.
> cts:stem("Int'l") returns International
> cts:contains("Int'l", "International") returns false
>
> I reindexed my database, since I understand that my dictionary entry
> means that all documents containing "Int'l" should now be indexed under
> "International".
>
> cts:contains("Int'l", "International") still returns false
> Furthermore, in the real search work flow that I am doing, searches for
> "Int'l" do not return documents containing "International" (But searches
> for "bitten" do return documents containing "bite").
>
> My database indexes are set to Stemmed Searches = Basic, and Word
> Searches = False.
>
> I think that stemming can be a powerful feature for my work flow, if I
> can just get it to work. Thank you for any advice you can offer.
>
> David