Hi @ the lovely mcf community out there,
In our setup we run ManifoldCF (2.3) behind a corporate HTTP proxy server and
we are trying to crawl specific web pages on the internet.
We run into java.net.UnknownHostException because the connector tries to
resolve the IP address of the hostname. This fails because our network setup
does not allow direct DNS lookups for internet hosts, and the JDK's
InetAddress.getByName() call relies on the system's DNS lookup mechanisms. All
internet traffic goes through the corporate HTTP proxy server, which does all
necessary DNS resolution on its side.
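
To illustrate what I mean, here is a minimal sketch using only the plain JDK.
It is not ManifoldCF code; the class name, the helper names, and the proxy
host and port are placeholders I made up. It shows that a target address can
be built without triggering a local DNS lookup, which is roughly the behaviour
we would need so that name resolution is left to the proxy.

```java
import java.net.InetSocketAddress;
import java.net.Proxy;

// Hypothetical illustration class, not part of ManifoldCF.
class ProxyLookupSketch {

    // Builds a target address WITHOUT a DNS lookup. In contrast,
    // InetAddress.getByName(host) asks the system resolver and throws
    // UnknownHostException when the name cannot be resolved locally.
    public static InetSocketAddress targetWithoutLookup(String host, int port) {
        return InetSocketAddress.createUnresolved(host, port);
    }

    // Describes an HTTP proxy; host and port are placeholder values
    // supplied by the caller. The proxy name is also left unresolved here.
    public static Proxy corporateProxy(String proxyHost, int proxyPort) {
        return new Proxy(Proxy.Type.HTTP,
                InetSocketAddress.createUnresolved(proxyHost, proxyPort));
    }

    public static void main(String[] args) {
        InetSocketAddress target = targetWithoutLookup("www.example.com", 80);
        Proxy proxy = corporateProxy("proxy.example.invalid", 3128);

        // No network access has happened up to this point.
        System.out.println("target unresolved: " + target.isUnresolved());
        System.out.println("proxy type: " + proxy.type());
    }
}
```

If the connector handed the unresolved hostname to the proxy in this way
instead of calling InetAddress.getByName() itself, I would expect the lookup
failure to go away, but I have not verified this against the connector code.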
Can you think of any other (more elegant) solution besides adding the records
to /etc/hosts on the crawler's machine?
Many thanks in advance,
Markus