Of course, I fight with this issue for a month, submit a request for
help on the Internet, and find the cause 2 days later...
I managed to reproduce the issue in Firefox by clicking more and faster.
It seems that the _flowExecutionKey parameter has a server-side limit on how
many times it can be used (or how fast - I don't know yet). I will have to
work around it somehow.
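One possible starting point for such a workaround, as a minimal sketch only: pull the _flowExecutionKey value out of each URL and count how often it has been used, so the spider can stop following links with a key that is probably spent. The names flow_key and FlowKeyBudget and the max_uses value are hypothetical; the real server-side limit (and whether it is a count or a rate) is not known.

```python
from collections import Counter
from urllib.parse import parse_qs, urlparse

FLOW_PARAM = "_flowExecutionKey"  # parameter name as seen in the site's URLs


def flow_key(url):
    """Return the _flowExecutionKey value from a URL, or None if absent."""
    values = parse_qs(urlparse(url).query).get(FLOW_PARAM)
    return values[0] if values else None


class FlowKeyBudget:
    """Count uses of each key. max_uses is a guess, not the real limit."""

    def __init__(self, max_uses=5):
        self.max_uses = max_uses
        self.used = Counter()

    def allow(self, url):
        """Record one use of the URL's key; return False once it is spent."""
        key = flow_key(url)
        if key is None:
            return True  # no key in the URL, nothing to track
        if self.used[key] >= self.max_uses:
            return False
        self.used[key] += 1
        return True
```

When allow() returns False, the spider could re-request the listing page to pick up fresh links instead of following the stale one. If the limit turns out to be about speed rather than count, lowering Scrapy's CONCURRENT_REQUESTS and raising DOWNLOAD_DELAY may be the simpler fix.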
On Monday, 10 October 2016 11:46:08 UTC+2, user Mateusz wrote:
> I have a spider for the site openlife.pl. It is my pension fund's site,
> where I log in and can view the history of payments, value of funds, etc.
> I wanted to scrape the history of my payments and the fees taken, and
> succeeded. About 400 operations in total. Operations are listed in a
> <table> with all the basic info and a URL to a details page (unused for
> now). The table lists only 25 entries; the rest are on subsequent "next
> page" pages. I read the table, find the URL for the next page, and go
> there with the same handler. Works 100%.
> Later I wanted to scrape data from the details URLs. It seemed
> straightforward and actually was - I managed to write this too. Except it
> didn't work 100%. When downloading details is enabled, I get 50-100
> operations (of 400) and fewer than half of them have details.
> It turned out that no URL on the site leads directly to the page
> displayed; each one is redirected first. Example of a proper redirect:
> 1. '
> 2. '
> However, a lot of URLs are redirected like so:
> 1. '
> 2. '
> 3. '
> 4. '
> Steps 2 and 3 carry no details, so step 4 gives the main page instead of
> the detail page. Scraping fails there because the page has none of the
> data expected by the handler. Steps 2 and 3 are common to all failed
> items. I didn't notice such redirection when browsing manually.
> What can cause such redirects? How can I avoid them?
> Source code can be viewed in full here:
> Methods of interest are on_account_history and
> Code also contains my attempts to solve the issue, including two custom
> downloader middlewares that don't help.
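The failed items described above can at least be detected before parsing. A minimal sketch, assuming Scrapy's RedirectMiddleware is enabled: it records every URL a request passed through in response.meta["redirect_urls"], so a proper detail page (steps 1-2) has one entry and a failed one (steps 1-4) has three. The helper names, the expected_hops default and the placeholder URLs in the usage note are hypothetical.

```python
def redirect_hops(meta):
    """Number of redirects recorded in a Scrapy-style meta dict."""
    return len(meta.get("redirect_urls", []))


def landed_on_wrong_page(meta, expected_hops=1):
    """True when the response was redirected more often than expected."""
    return redirect_hops(meta) > expected_hops
```

In a callback this could be used as `if landed_on_wrong_page(response.meta): ...` to log or retry the item instead of running the detail parser on the main page.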
Visit this group at https://groups.google.com/group/scrapy-users.