Hi Talat,

On Sat, May 3, 2014 at 4:35 AM, <[email protected]> wrote:
> Now used parser plugins nekohtml doesnt parse correctly.

What is wrong with it? Are there any issues in Jira to back this up?

> When I tested in huge website site, it leaves html tags.

That's pretty vague. Do you have any more details? Can this be fixed in the existing parser plugins?

> IMHO our parser is little bit old.

Which one? I don't know which parser you mean. Is it possible to upgrade it?

> After doing some research, I found Jsoup[1] and Gumbo[2]
> parser. I did some test on broken websites. I saw gumbo and jsoup
> parsed very similar Google's parser.

So what are the benefits? If we have a clear-cut argument, then let's go for it. If not, then maybe your time would be better invested elsewhere. It's up to you, I suppose :)

