Nailed it. Stepping through in Eclipse will help out a lot. Have a great weekend folks :-)

On May 1, 2014 11:40 AM, "Chris Mattmann" <chris.mattm...@gmail.com> wrote:
> Hey Lewis,
>
> That's b/c Crawler doesn't do HTTP connections.
> PushPull is the component where that occurs. We
> specifically made Crawler only handle local data,
> and refactored the protocol layer/functionality
> into PushPull; they operate through a shared
> directory structure (a 'staging' dir) and through
> Crawler preconditions and Actions.
>
> Scope out PushPull and then we can discuss.
>
> Thanks dude.
>
> Cheers,
> Chris
>
> ------------------------
> Chris Mattmann
> chris.mattm...@gmail.com
>
>
>
>
> -----Original Message-----
> From: Lewis John Mcgibbney <lewis.mcgibb...@gmail.com>
> Reply-To: <user@oodt.apache.org>
> Date: Thursday, May 1, 2014 10:35 AM
> To: <user@oodt.apache.org>
> Subject: CAS Crawler Crawling Code
>
> >Hi Folks,
> >I'm sitting jumping between ProductCrawler and StdIngester trying to
> >pinpoint _exactly_ where product fetching actually happens.
> >I'm aware of the triple-headed nature of crawler workflows, e.g.
> >preIngestion, postIngestionSuccess and postIngestionFailure... I can see
> >the logic within the ProductCrawler code... what I cannot locate is where
> >HTTP/transport socket connections are created and used.
> >
> >Can anyone please point this out?
> >Thanks
> >Lewis
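For anyone finding this thread in the archive: here is a minimal, self-contained Java sketch of the split Chris describes, where PushPull owns the transport sockets and lands bytes in a shared staging directory, and the Crawler only ever walks that local directory. The class and method names (StagingHandoffSketch, fetchRemoteProduct, ingestLocalFile) and the example URL are hypothetical placeholders for illustration, not actual OODT APIs; for the real thing, see ProductCrawler/StdIngester in cas-crawler and the protocol layer in cas-pushpull.

import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

public class StagingHandoffSketch {

    // PushPull's role: speak the transport protocol (HTTP, FTP, ...)
    // and drop the fetched bytes into the shared staging directory.
    static Path fetchRemoteProduct(URI remote, Path stagingDir) throws IOException {
        Path target = stagingDir.resolve(Paths.get(remote.getPath()).getFileName());
        try (InputStream in = remote.toURL().openStream()) {
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
        return target;
    }

    // Crawler's role: walk the local staging dir only -- no sockets here.
    // This is why no HTTP/transport code shows up in ProductCrawler.
    static void crawlStagingDir(Path stagingDir) throws IOException {
        try (DirectoryStream<Path> products = Files.newDirectoryStream(stagingDir)) {
            for (Path product : products) {
                // In real OODT, preIngest preconditions/actions run here,
                // then StdIngester-style ingestion of the local file.
                ingestLocalFile(product);
            }
        }
    }

    // Placeholder for ingestion of an already-local product file.
    static void ingestLocalFile(Path product) {
        System.out.println("Ingesting local product: " + product);
    }

    public static void main(String[] args) throws IOException {
        Path staging = Files.createTempDirectory("staging");
        fetchRemoteProduct(URI.create("https://example.com/data/granule.dat"), staging);
        crawlStagingDir(staging);
    }
}

The point of the sketch is the directory handoff: the only coupling between the two components is the staging dir, which is why you can step through ProductCrawler in Eclipse end to end without ever hitting a socket.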