Hi all, I'm having trouble splitting concatenated English text into words, specifically domain names. For example:

boysandgirls.com -> boy(s)|and|girl(s)|.com
haveaniceday.net -> have|a|nice|day|.net
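To make the goal concrete, here is a minimal sketch of the behaviour I'm after. It uses plain dictionary lookup with dynamic programming, not OpenNLP. The word list `WORDS` and the function names `segment` and `split_domain` are made up for illustration, and the vocabulary is a toy one chosen so that each example has only one possible split. It also keeps plurals as single tokens ("boys") rather than marking the "(s)" separately.

```python
# Hypothetical toy vocabulary; a real one would come from a large word list.
WORDS = {"boy", "boys", "and", "girl", "girls", "have", "a", "nice", "day"}


def segment(text, words):
    """Return the split of `text` into the fewest dictionary words, or None."""
    # best[i] holds the shortest word list covering text[:i], or None if
    # text[:i] cannot be built from dictionary words.
    best = [None] * (len(text) + 1)
    best[0] = []
    for end in range(1, len(text) + 1):
        for start in range(end):
            piece = text[start:end]
            if best[start] is not None and piece in words:
                candidate = best[start] + [piece]
                if best[end] is None or len(candidate) < len(best[end]):
                    best[end] = candidate
    return best[len(text)]


def split_domain(domain, words=WORDS):
    """Split a domain name into words plus its final '.tld' suffix."""
    name, dot, tld = domain.rpartition(".")
    if not dot:  # no dot at all: treat the whole string as the name
        name, tld = domain, ""
    parts = segment(name.lower(), words)
    if parts is None:  # no full split found: leave the name untouched
        parts = [name]
    return parts + ([dot + tld] if dot else [])


print("|".join(split_domain("boysandgirls.com")))   # boys|and|girls|.com
print("|".join(split_domain("haveaniceday.net")))   # have|a|nice|day|.net
```

With a realistic vocabulary this gets ambiguous quickly (for instance "a nice" versus "an ice" if both are in the list), so counting words alone would not be enough and something like word frequencies would be needed to pick between candidate splits.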
Can I use OpenNLP to do this? I checked the OpenNLP documentation, and the "Learnable Tokenizer" looks promising, but I couldn't get it to work. Any help is appreciated.