Thanks Daniel,

that sounds like a good idea and I will have a look at it.

But I would also be interested in instrumenting the call that crawls the 
actual URL, so I can put some monitoring code before and after it.
Do you know how the actual crawl is done? Is it done via Twisted? 
It does not look like httplib is being used for that.

Thanks
Philipp

On Thursday, May 21, 2015 at 10:29:44 PM UTC+2, Daniel Fockler wrote:
>
> Hey,
>
> Not sure exactly what you are looking for, but you can implement a Scrapy 
> Downloader Middleware with a process_request method; Scrapy will pass 
> each request into that function so you can examine it. Here are the docs 
> for that.
>
> http://scrapy.readthedocs.org/en/latest/topics/downloader-middleware.html
>
> On Thursday, May 21, 2015 at 7:03:35 AM UTC-7, Philipp Bussche wrote:
>>
>> Hi there,
>> I am working on some monitoring for my python/scrapy deployment using one 
>> of the commercial APM tools.
>> I was able to instrument the parsing of the response as well as the 
>> pipeline which pushes the items into an ElasticSearch instance.
>> You can see in the attached screenshot how that is visualized in the tool.
>> I would now also like to see the outgoing calls that Scrapy is making 
>> through the downloader to actually crawl the http pages (which is obviously 
>> happening before parsing and pipelining).
>> But I can't figure out where in the code the actual http call is made so 
>> that I could put my monitoring hook around it.
>> Could you guys please point me to the class that is actually making the 
>> HTTP calls?
>>
>> Thanks
>> Philipp
>>
>

-- 
You received this message because you are subscribed to the Google Groups 
"scrapy-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/scrapy-users.
For more options, visit https://groups.google.com/d/optout.