301 redirects, RFPDupeFilter and URL parameter order

crawler Tue, 16 Dec 2014 04:51:39 -0800

I have a site that answers example.com?a=1&b=2 with a 301 redirect to 
example.com?b=2&a=1
This causes the following problem:
- Scrapy finds the URL example.com?b=2&a=1 and puts it in the queue;
- Scrapy then rewrites the URL to example.com?a=1&b=2 and sends the request;
- the site redirects to example.com?b=2&a=1;
- Scrapy gets the URL example.com?b=2&a=1 and filters it out as a 
duplicate.
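
To show why the two URLs collide, here is a minimal sketch that uses only the Python standard library. It assumes the duplicate filter compares URLs after sorting their query parameters, which is my reading of the behavior above. The names canonical_form and is_duplicate are hypothetical helpers for illustration, not Scrapy's API.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def canonical_form(url):
    """Hypothetical helper: return the URL with its query parameters sorted."""
    parts = urlsplit(url)
    pairs = sorted(parse_qsl(parts.query, keep_blank_values=True))
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(pairs), parts.fragment)
    )


seen = set()


def is_duplicate(url):
    """Hypothetical stand-in for a dupe filter keyed on the canonical form."""
    key = canonical_form(url)
    if key in seen:
        return True
    seen.add(key)
    return False


requested = "http://example.com/?a=1&b=2"   # what gets sent after reordering
redirected = "http://example.com/?b=2&a=1"  # where the 301 points

first = is_duplicate(requested)    # first time this key is seen
second = is_duplicate(redirected)  # same key after sorting, so it is dropped
print(first, second)
```

Under that assumption both URLs reduce to the same key, so the redirect target is treated as already seen even though it was never actually downloaded.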


What should I do? I can't disable RFPDupeFilter, because the site also has 
genuine duplicate links.
I can't change the site's behavior either.

