Yes, 1.4 has some fixes for bad content. It strips away bad UTF-8 sequences.
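
For anyone stuck on an older release, here is a rough sketch of what that kind of stripping can look like. This is not the actual Nutch code: the class `TextCleaner` and its methods are hypothetical names, and it assumes the text is already decoded into a Java String, so it filters code points (unpaired surrogates, Unicode noncharacters, and control characters other than tab, LF and CR) rather than raw bytes.

```java
// Hypothetical helper for illustration only; not taken from Nutch or Solr.
final class TextCleaner {
    private TextCleaner() {}

    // Returns a copy of the input with disallowed code points removed.
    static String stripInvalid(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); ) {
            int cp = input.codePointAt(i);
            i += Character.charCount(cp);
            if (isAllowed(cp)) {
                out.appendCodePoint(cp);
            }
        }
        return out.toString();
    }

    // Assumed filtering rules; adjust to whatever the indexer rejects.
    static boolean isAllowed(int cp) {
        // Unpaired surrogates (codePointAt returns them as-is).
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;
        // Unicode noncharacters U+FDD0..U+FDEF.
        if (cp >= 0xFDD0 && cp <= 0xFDEF) return false;
        // Unicode noncharacters ending in FFFE or FFFF in every plane.
        if ((cp & 0xFFFE) == 0xFFFE) return false;
        // Control characters, keeping tab, line feed and carriage return.
        if (cp < 0x20) return cp == 0x9 || cp == 0xA || cp == 0xD;
        return true;
    }
}
```

Calling something like `TextCleaner.stripInvalid(text)` on each field value before building the document to index is one place such a filter could go.
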

On Wednesday 14 December 2011 14:57:40 remi tassing wrote:
> I'm using Nutch-1.2. Solr-3.4 & 3.5 don't work but 1.4 works well!

--
Markus Jelsma - CTO - Openindex