Hi Yun,

If these had been real compound words (in Dutch, `board game` is written as 
one word, `bordspel`), you could have used decompounding stemming. But that 
would not work for misspelled words like the ones below.

I imagine you would want to be able to search on `tax` and find `tax asset`, 
right? The simplest solution would be to search with wildcards, like `tax*`.
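
To make the trade-off concrete, here is a small Python sketch of what a wildcard pattern would and would not catch. It is not MarkLogic code: it only mimics wildcard matching with the standard `fnmatch` module, and both the helper name `wildcard_matches` and the token list (including `taxi`) are made up for illustration.

```python
from fnmatch import fnmatchcase

# Hypothetical tokens: run-in words from the sample, plus "taxi"
# to show that a trailing wildcard can also pull in unrelated words.
tokens = ["taxasset", "taxbenefit", "riseto", "anincome", "taxi"]


def wildcard_matches(pattern, words):
    """Return the words that match a shell-style wildcard pattern (case-sensitive)."""
    return [w for w in words if fnmatchcase(w, pattern)]


# A trailing wildcard finds the run-in words that start with "tax".
print(wildcard_matches("tax*", tokens))

# Finding the second half of a run-in word needs a leading wildcard instead.
print(wildcard_matches("*asset", tokens))

# A plain term without wildcards matches none of the run-in tokens.
print(wildcard_matches("income", tokens))
```

So `tax*` would find `taxasset` and `taxbenefit`, but it would also match words like `taxi`, and a search on `asset` would still need a leading wildcard such as `*asset`.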

Cheers,
Geert

From: <[email protected]> on behalf of "Yang, Yun" <[email protected]>
Reply-To: MarkLogic Developer Discussion <[email protected]>
Date: Friday, August 28, 2015 at 6:13 AM
To: "[email protected]" <[email protected]>
Subject: [MarkLogic Dev General] How to search run-in words?

All,

Is there an easy way to search on run-in words? We have some files in which 
words run together, like the ones below. Can they be treated as separate 
words?

Sample:

run-in words      should be
taxasset          tax asset
riseto            rise to
taxbenefit        tax benefit
anincome          an income
decreasefor       decrease for
fabricatorfor     fabricator for


Thanks,

Yun


_______________________________________________
General mailing list
[email protected]
Manage your subscription at: 
http://developer.marklogic.com/mailman/listinfo/general
