Dear all,

I would like to know whether it is possible to get a list of ngrams in which
hyphenated words are kept as single tokens, perhaps by handling the hyphen
during the tokenization process.

For example, I want to get these bigrams:
- call-connected signal
- clear-back signal
- clear-forward signal

Instead of the two bigrams I currently get for each one:
- call<>connected<>179 2608 527 
  connected<>signal<>189 320 9176  

- clear<>back<>283 1115 733 
  back<>signal<>157 380 9176 

- clear<>forward<>632 1115 877 
  forward<>signal<>493 1547 9176 
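
To make the tokenization I have in mind concrete, here is a minimal sketch in
Python. It is only an illustration, not the tool that produced the output
above: the token pattern and the function name `bigrams` are my own
assumptions, and it prints only the joint count of each bigram, not the
individual word frequencies.

```python
import re
from collections import Counter

# Assumed token pattern: a run of word characters, optionally followed by
# further runs joined by single hyphens, so "clear-back" stays one token.
TOKEN_RE = re.compile(r"\w+(?:-\w+)*")

def bigrams(text):
    """Count adjacent token pairs, keeping hyphenated words whole."""
    tokens = TOKEN_RE.findall(text.lower())
    return Counter(zip(tokens, tokens[1:]))

if __name__ == "__main__":
    sample = "The clear-back signal follows the call-connected signal."
    for (first, second), n in bigrams(sample).items():
        # Same "<>"-separated layout as the output quoted above.
        print(f"{first}<>{second}<>{n}")
```

With a pattern like this, "clear-back signal" gives one bigram instead of
"clear back" and "back signal".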

Thanks a lot,

Mercè

