I suppose I should mention that I have a prototype wrapper of Google's gumbo HTML parsing library in the works: https://github.com/porterjamesj/GumboParser.jl
It's not on METADATA and I wouldn't consider the API stable, but everything seems to work pretty well so far. If you really want to use it now I would just look at my code and vendor whatever you need into your project, as I will probably make many breaking changes to what's there now.

On Wednesday, June 4, 2014 5:47:38 AM UTC-5, paul analyst wrote:
>
> How to read and change the content of web pages to the vector ["word1", "word2", "word3", ",,,", "wordlast"]?
>
> Paul
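
For the question as asked, if a real parser isn't needed, a crude dependency-free sketch might look like the following. `page_words` is a made-up helper name, not part of GumboParser.jl. It assumes the page has already been downloaded into a string and that the Julia version in use accepts the `pattern => replacement` form of `replace`. Stripping tags with regexes breaks on malformed markup and does not decode entities such as `&amp;`, so treat it as a stopgap rather than a substitute for a proper HTML parser.

```julia
# Hypothetical helper (not from GumboParser.jl): crude regex-based tag stripping.
function page_words(html::AbstractString)
    # Drop <script> and <style> elements together with their contents.
    text = replace(html, r"<(script|style)\b[^>]*>.*?</\1\s*>"is => " ")
    # Replace every remaining tag with a space so adjacent words stay separated.
    text = replace(text, r"<[^>]*>" => " ")
    # split with no delimiter splits on whitespace and drops empty strings.
    return String.(split(text))
end

html = "<html><body><h1>word1 word2</h1><p>word3 <b>wordlast</b></p></body></html>"
words = page_words(html)
```

Here `words` is a `Vector{String}` holding the visible words of the page in document order.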