Re: Unable to index on Hadoop 3.2.0 with 1.16

2020-08-12 Thread Sebastian Nagel
Hi Joe,
> I eliminated it when I updated the index-writers.xml for the solr_indexer_1
> to use only a single URL.
Thanks for the hint. I'm able to reproduce the error by adding an overlong URL to
Could you open an issue to fix this on https://issues.apache.org/jira/projects/NUTCH ?

Re: Unable to index on Hadoop 3.2.0 with 1.16

2020-08-12 Thread Gilvary, Joseph
Hi, I wasn't on the list when this discussion happened, so I hope this will thread correctly in archives. I linked to the archive below and tried to include enough here to ensure searchers can find it if this won't thread. I was getting an error with Nutch 1.17. I never used 1.16, but

Re: Unable to index on Hadoop 3.2.0 with 1.16

2019-10-22 Thread Sebastian Nagel
Hi Markus, any updates on this? Just to make sure the issue gets resolved.
Thanks, Sebastian
On 14.10.19 17:08, Markus Jelsma wrote:
Hello, We're upgrading our stuff to 1.16 and got a peculiar problem when we started indexing:
2019-10-14 13:50:30,586 WARN [main]

Re: Unable to index on Hadoop 3.2.0 with 1.16

2019-10-14 Thread Sebastian Nagel
Hi Markus, I've tested in pseudo-distributed mode with Hadoop 3.2.1, including indexing into Solr. It worked. Could be a dependency version issue similar to that causing NUTCH-2706. But that's only an assumption. Since the IndexWriters.describe() is for help only, I would just deactivate this

Unable to index on Hadoop 3.2.0 with 1.16

2019-10-14 Thread Markus Jelsma
Hello, We're upgrading our stuff to 1.16 and got a peculiar problem when we started indexing:
2019-10-14 13:50:30,586 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.lang.IllegalStateException: text width is less than 1, was <-41> at