Re: Re: Suggestion: Finding missing words in dictionaries via web scraping and NLP

2017-08-17 Thread Andrej Warkentin
By "missing words in dictionaries", do you mean that if "teh" was used as an archaic spelling of "tea" in a work of Shakespeare (a completely made-up, hypothetical example), we should add "teh" to the dictionary and no longer flag it as a misspelled word? Of course not. The result o

Suggestion: Finding missing words in dictionaries via web scraping and natural language processing

2017-08-17 Thread Andrej Warkentin
Hello, in a talk at the PyData Berlin meetup I saw this project: https://github.com/lusy/hora-de-decir-bye-bye , where Spanish articles are scraped and searched for English words. To identify English words, she used the dictionaries from OpenOffice and compared scraped words to the d
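
The dictionary comparison mentioned in this message could look roughly like the sketch below. It is not taken from the linked project. The function names `load_dic_words` and `missing_words`, the `min_count` threshold, and the assumption that the word list is a Hunspell-style .dic file (one entry per line, optional flags after a slash) are all illustrative. Affix rules are ignored, so inflected forms would show up as false positives.

```python
import re
from collections import Counter


def load_dic_words(dic_lines):
    """Collect base words from Hunspell-style .dic lines (hypothetical helper).

    Assumption: each line is "word" or "word/FLAGS"; affix rules are ignored.
    """
    words = set()
    for line in dic_lines:
        entry = line.strip()
        # Skip blank lines and the leading entry-count line.
        if not entry or entry.isdigit():
            continue
        # Drop affix flags after the slash, e.g. "house/MS" -> "house".
        words.add(entry.split("/", 1)[0].lower())
    return words


def missing_words(text, known_words, min_count=2):
    """Return {word: count} for words in text that are absent from known_words.

    min_count is an assumed threshold that filters out one-off typos.
    """
    # Runs of Unicode letters only; digits and underscores are excluded.
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    counts = Counter(tokens)
    return {
        word: n
        for word, n in counts.items()
        if word not in known_words and n >= min_count
    }
```

With a scraped text and a loaded word list, `missing_words(text, known)` returns candidate words that would still need manual review, since frequent misspellings pass the same filter as genuinely missing words.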