Hello
I am using Nutch 2.3 and ran into a problem with some Arabic-content sites.
This URL provides its title through a tag inside the <body>,
but the getTitle code stops after </head> and concludes that there is no
title.
I thought for a while about a good way to get this title and figured out that I
can modify "getTextHelper" in the parser-html plugin so that it fills two
StringBuilders, one for content and one for title, which removes the need for the getTitle function ...
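
To make the idea concrete, here is a minimal, standalone sketch of a single traversal that fills two StringBuilders. It is not the actual parser-html code: the class TextAndTitleSketch, the Result holder, and the collect and extract methods are hypothetical names, it uses only the JDK XML parser (so the input must be well-formed), and it assumes the tag in the <body> is a <title> element.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class TextAndTitleSketch {

    /** Hypothetical holder for the two buffers filled in one traversal. */
    static final class Result {
        final StringBuilder text = new StringBuilder();
        final StringBuilder title = new StringBuilder();
    }

    /** Walks the whole tree once; it does not stop at the end of <head>. */
    static void collect(Node node, Result out, boolean insideTitle) {
        short type = node.getNodeType();
        if (type == Node.ELEMENT_NODE) {
            String name = node.getNodeName();
            // Skip elements whose text is not page content.
            if ("script".equalsIgnoreCase(name) || "style".equalsIgnoreCase(name)) {
                return;
            }
            // Assumption: the title lives in a <title> element, wherever it is.
            if ("title".equalsIgnoreCase(name)) {
                insideTitle = true;
            }
        } else if (type == Node.TEXT_NODE) {
            String s = node.getNodeValue().trim();
            if (!s.isEmpty()) {
                // Route the text to the title buffer or the content buffer.
                StringBuilder target = insideTitle ? out.title : out.text;
                if (target.length() > 0) {
                    target.append(' ');
                }
                target.append(s);
            }
        }
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            collect(children.item(i), out, insideTitle);
        }
    }

    /** Parses well-formed markup with the JDK parser and runs the traversal. */
    static Result extract(String xhtml) {
        try {
            Node root = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xhtml)))
                    .getDocumentElement();
            Result out = new Result();
            collect(root, out, false);
            return out;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse input", e);
        }
    }

    public static void main(String[] args) {
        // Sample page with an empty <head> and the title inside <body>.
        Result r = extract("<html><head></head><body>"
                + "<title>Page title</title><p>Hello</p><p>world</p>"
                + "</body></html>");
        System.out.println("title=" + r.title);
        System.out.println("text=" + r.text);
    }
}
```

In this sketch the title text is kept out of the content buffer, and every <title> element found contributes to the title buffer; whether only the first one should count is a design choice left open here.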
I thought I should report this to you.
Thank you for everything.
