Hi Yun,

I completely forgot about custom dictionaries (thanks, Justin!). You can find 
more detail here: http://docs.marklogic.com/guide/search-dev/custom-dictionaries. 
In a nutshell, a custom dictionary is a file that lets you override the 
stemming behavior of particular existing terms, and teach the stemmer how to 
stem words it doesn’t know yet.

It is not entirely clear to me how you would use that to provide decompounding 
stemming, but it is worth a look at least.

Cheers,
Geert

From: <[email protected]> on behalf of Geert Josten <[email protected]>
Reply-To: MarkLogic Developer Discussion <[email protected]>
Date: Friday, August 28, 2015 at 7:52 AM
To: MarkLogic Developer Discussion <[email protected]>
Subject: Re: [MarkLogic Dev General] How to search run-in words?

Hi Yun,

If these had been real compound words (in Dutch, `board game` is written as 
one word, `bordspel`), you could have used decompounding stemming. But that 
would not work for misspelled words like the ones below.

I imagine you would want to be able to search on `tax`, and find `tax asset`, 
right? The simplest solution would be to search with wildcards, like `tax*`.
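
To make the wildcard idea concrete, here is a minimal sketch in plain Python. It is not MarkLogic code: it uses the standard library's shell-style matcher as a stand-in for a wildcarded word search, and the helper name `wildcard_matches` is made up for illustration. The tokens are the run-in samples from the original question.

```python
from fnmatch import fnmatchcase

# Run-in tokens taken from the sample in the original question.
tokens = ["taxasset", "riseto", "taxbenefit", "anincome",
          "decreasefor", "fabricatorfor"]

def wildcard_matches(pattern, words):
    """Return the words that match a shell-style wildcard pattern.

    Hypothetical helper: it only imitates the effect of a wildcard
    search and says nothing about how MarkLogic resolves wildcards.
    """
    return [w for w in words if fnmatchcase(w, pattern)]

prefix_hits = wildcard_matches("tax*", tokens)     # trailing wildcard
infix_hits = wildcard_matches("*income*", tokens)  # wildcard on both sides
exact_hits = wildcard_matches("income", tokens)    # no wildcard at all

print(prefix_hits)  # ['taxasset', 'taxbenefit']
print(infix_hits)   # ['anincome']
print(exact_hits)   # []
```

The sketch also shows the limit of a trailing wildcard: `tax*` picks up `taxasset` and `taxbenefit`, but a term that sits at the end of a run-in token, such as `income` in `anincome`, is only found with a leading wildcard as well.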

Cheers,
Geert

From: <[email protected]> on behalf of "Yang, Yun" <[email protected]>
Reply-To: MarkLogic Developer Discussion <[email protected]>
Date: Friday, August 28, 2015 at 6:13 AM
To: "[email protected]" <[email protected]>
Subject: [MarkLogic Dev General] How to search run-in words?

All,

Is there an easy way to search on run-in words? We have some files in which 
words run together, as in the sample below. Can they be treated as separate 
words?

Sample:

run-in words     should be
-------------    --------------
taxasset         tax asset
riseto           rise to
taxbenefit       tax benefit
anincome         an income
decreasefor      decrease for
fabricatorfor    fabricator for


Thanks,

Yun


