Hi @ the lovely mcf community out there,
 
In our setup we run ManifoldCF (2.3) behind a corporate HTTP proxy server, and
we are trying to crawl specific web pages on the internet.
 
We run into java.net.UnknownHostException because the connector tries to
resolve the IP address of the hostname. This fails because our network setup
does not allow direct DNS lookups for internet hosts, and the JDK's
InetAddress.getByName() call relies on the system's DNS lookup mechanisms. All
internet traffic goes through the corporate HTTP proxy server, which does all
necessary DNS resolution on its side.
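
To illustrate the difference I mean, here is a minimal Java sketch. It is not
ManifoldCF code; the class name, method name, and host are made up for the
example. It only shows that the JDK can hold a target host name without
looking it up locally, which is what would be needed if the proxy is supposed
to do the resolution. Whether the connector can be made to work this way is
exactly what I don't know.

```java
import java.net.InetSocketAddress;

public class ProxyLookupSketch {

    // Hypothetical helper: wraps host and port WITHOUT triggering a DNS
    // lookup. By contrast, InetAddress.getByName(host) asks the local
    // resolver and throws UnknownHostException when that lookup fails.
    static InetSocketAddress unresolvedTarget(String host, int port) {
        return InetSocketAddress.createUnresolved(host, port);
    }

    public static void main(String[] args) {
        // Placeholder host name, used for illustration only.
        InetSocketAddress target = unresolvedTarget("www.example.invalid", 80);

        // The host name is kept as a plain string; no IP address is attached.
        System.out.println("host       = " + target.getHostString());
        System.out.println("unresolved = " + target.isUnresolved());
    }
}
```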
 
Can you think of any other (more elegant) solution besides adding the records
to /etc/hosts on the crawler's machine?
 
Many thanks in advance,
Markus
 
 
