I like to analyse text. My method was something like
words = text.split(), which splits the text into space-separated
units. Then I tried the Python NLTK library, which has a lot
of features I want, but using `word_tokenize' gives a different
answer.
What gives?
--
m..
On 2018-08-07, Stefan Ram wrote:
> Steven D'Aprano writes:
>>In natural language, words are more complicated than just space-separated
>>units. Some languages don't use spaces as a word delimiter.
>
> Even above, the word »units« is neither directly preceded
> nor directly followed by a space.