Hi All!
Let's say I have a filter that produces new tokens based on the original ones. How bad would it be if my filter set the start offset of each new token to 0 and the end offset to the length of the token?

An example (based on the phrase "How are you?"):

Original token: [you?] (8,12)
New tokens: [you] (0,3) [?] (0,1)

It wouldn't be too hard to calculate the right numbers for left-to-right languages, and it is a bit more challenging for right-to-left ones, but for mixed text it is quite hard.

Thanks.
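
To make the offsets concrete, here is a rough Python sketch of what I mean by the "right" numbers. It is not my actual filter: the function name split_token and the word/punctuation split rule are made up for illustration, and it assumes offsets are plain character indices into the original string as it is stored, not positions as the text is displayed.

```python
import re

def split_token(text, start, end):
    """Split one token into word and punctuation sub-tokens.

    Hypothetical helper for illustration only. Returns a list of
    (sub_text, sub_start, sub_end) tuples whose offsets point into
    the original input rather than into the token itself.
    """
    # Assumption: the token text is exactly the slice [start, end)
    # of the original input, so lengths must agree.
    assert end - start == len(text)
    pieces = []
    # Runs of word characters, or runs of non-word, non-space characters.
    for m in re.finditer(r"\w+|[^\w\s]+", text):
        # Shift the position inside the token by the token's own start.
        pieces.append((m.group(), start + m.start(), start + m.end()))
    return pieces

print(split_token("you?", 8, 12))
# [('you', 8, 11), ('?', 11, 12)]
```

Under that assumption the new tokens would be [you] (8,11) and [?] (11,12) instead of (0,3) and (0,1). The shift only works as long as the filter does not change the length of the token text.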
