The patch has been committed and will be part of the 0.6 release.

Karl

On Sun, Jul 8, 2012 at 9:54 AM, Karl Wright <[email protected]> wrote:
> Thanks for all of your work on this. I'll be able to commit this patch
> tonight.
>
> Karl
>
> Sent from my Windows Phone
> ________________________________
> From: Jan van Haarst
> Sent: 7/8/2012 6:40 AM
> To: [email protected]
> Subject: Re: Crawling behind an ISA proxy (iis 7.5)
>
> Dear All,
>
> We are now able to connect to the IIS proxy. Thanks to the logging
> facilities Karl added, we were able to see that this is the fix:
>
> Index: connectors/webcrawler/connector/src/main/java/org/apache/manifoldcf/crawler/connectors/webcrawler/WebcrawlerConnector.java
> ===================================================================
> --- connectors/webcrawler/connector/src/main/java/org/apache/manifoldcf/crawler/connectors/webcrawler/WebcrawlerConnector.java (revision 1357379)
> +++ connectors/webcrawler/connector/src/main/java/org/apache/manifoldcf/crawler/connectors/webcrawler/WebcrawlerConnector.java (working copy)
> @@ -361,7 +361,7 @@
>      String emailAddress = params.getParameter(WebcrawlerConfig.PARAMETER_EMAIL);
>      if (emailAddress == null)
>        throw new ManifoldCFException("Missing email address");
> -    userAgent = "ApacheManifoldCFWebCrawler; "+emailAddress+")";
> +    userAgent = "Mozilla/5.0 (ApacheManifoldCFWebCrawler; "+emailAddress+")";
>      from = emailAddress;
>
>      x = params.getParameter(WebcrawlerConfig.PARAMETER_ROBOTSUSAGE);
>
> Yes, this is weird: a proxy shouldn't fail on User-Agent settings, but
> apparently this one does.
> Even Google apparently does this:
> http://www.useragentstring.com/pages/Googlebot/
> Now we 'just' have to get the crawling working, but the main (unique)
> hurdle has now been taken!
>
> Karl, a big Thank You for your help, and for the openssl s_client that
> enabled us to debug this.
>
> Dag,
> Jan
>
> On Thu, Jun 28, 2012 at 11:05 PM, Jan van Haarst <[email protected]> wrote:
>>
>> On Thu, Jun 28, 2012 at 11:26 AM, Karl Wright <[email protected]> wrote:
>>>
>>> I was wondering if you'd picked up and tried the patch for
>>> CONNECTORS-483. This patch adds official proxy support for the Web
>>> Connector. Alternatively, you could try to build and run with trunk
>>> code.
>>>
>>> Karl
>>
>>
>> I'm going the building from trunk way, and all seems to go well up to the
>> creation of the zip and tar.gz files.
>> Is there anything special to do after running the build process like this?
>>
>> ant clean clean-core-deps clean-deps && ant make-core-deps make-deps build && ant image
>>
>> Did I miss anything?
>> If not, I'll replace the old binary installation with my source-build one,
>> and see where it leads me.
>>
>> --
>> Dag,
>> Jan
>
>
> --
> Dag,
> Jan
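
For anyone reading this thread later, here is a minimal, standalone sketch of what the patched line produces. It is not the ManifoldCF code itself: the class name UserAgentSketch, the helper buildUserAgent, and the example address are hypothetical and only illustrate the string format from the diff, where the product token "Mozilla/5.0" now comes first and the opening parenthesis is balanced.

```java
class UserAgentSketch {
    // Hypothetical helper (not part of ManifoldCF): mirrors the string
    // concatenation on the patched line of WebcrawlerConnector.java.
    static String buildUserAgent(String emailAddress) {
        if (emailAddress == null) {
            // The connector throws ManifoldCFException here; a standard
            // exception is used so this sketch has no dependencies.
            throw new IllegalArgumentException("Missing email address");
        }
        // "Mozilla/5.0" leads, followed by a parenthesized comment holding
        // the crawler name and the contact address.
        return "Mozilla/5.0 (ApacheManifoldCFWebCrawler; " + emailAddress + ")";
    }

    public static void main(String[] args) {
        // Example address is made up for illustration.
        System.out.println(buildUserAgent("crawler-admin@example.com"));
        // prints: Mozilla/5.0 (ApacheManifoldCFWebCrawler; crawler-admin@example.com)
    }
}
```

The old value had a closing parenthesis with no opening one and no leading product token, which may be what the proxy objected to; the thread does not confirm the exact reason.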
