Thanks Daniel, that sounds like a good idea and I will have a look at that.
But I would also be interested in instrumenting the call that crawls the actual URL, so I can put some monitoring code before and after it. Do you know how the actual crawl is done? Is it done via Twisted? It does not look like httplib is being used for that.

Thanks
Philipp

On Thursday, May 21, 2015 at 10:29:44 PM UTC+2, Daniel Fockler wrote:
>
> Hey,
>
> Not sure exactly what you are looking for, but you can implement a Scrapy
> Downloader Middleware with a process_request function; Scrapy will pass
> each request into that function so you can examine it. Here are the docs
> for that:
>
> http://scrapy.readthedocs.org/en/latest/topics/downloader-middleware.html
>
> On Thursday, May 21, 2015 at 7:03:35 AM UTC-7, Philipp Bussche wrote:
>>
>> Hi there,
>> I am working on some monitoring for my Python/Scrapy deployment using
>> one of the commercial APM tools.
>> I was able to instrument the parsing of the response as well as the
>> pipeline which pushes the items into an ElasticSearch instance.
>> You can see in the attached screenshot how that is visualized in the
>> tool.
>> I would now also like to see the outgoing calls that Scrapy is making
>> through the downloader to actually crawl the HTTP pages (which is
>> obviously happening before parsing and pipelining).
>> But I can't figure out where in the code the actual HTTP call is made,
>> so that I could put my monitoring hook around it.
>> Could you guys please point me to the class that is actually doing the
>> HTTP calls?
>>
>> Thanks
>> Philipp
>>

--
You received this message because you are subscribed to the Google Groups "scrapy-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/scrapy-users.
For more options, visit https://groups.google.com/d/optout.
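[Editor's note] Daniel's downloader-middleware suggestion could look roughly like the sketch below. The class and setting names are illustrative, not from the thread; Scrapy calls process_request before the downloader sends a request and process_response when the response comes back, so wrapping the two gives per-request timing without digging into the Twisted download handlers.

```python
import time


class TimingDownloaderMiddleware:
    """Hypothetical monitoring middleware (not part of Scrapy itself).

    process_request runs before the request is downloaded;
    process_response runs after the response is received.
    """

    def process_request(self, request, spider):
        # Stash the start time on the request; request.meta travels
        # with the request through the downloader.
        request.meta["_monitor_start"] = time.monotonic()
        return None  # returning None lets Scrapy continue as normal

    def process_response(self, request, response, spider):
        start = request.meta.pop("_monitor_start", None)
        if start is not None:
            elapsed = time.monotonic() - start
            # Replace this print with the APM tool's reporting hook.
            print("%s took %.3fs" % (request.url, elapsed))
        return response  # must return the response (or a new one)
```

It would then be enabled in settings.py with something like DOWNLOADER_MIDDLEWARES = {"myproject.middlewares.TimingDownloaderMiddleware": 543} (path and priority are examples).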
