Scrapy should still crawl the URL once, though. Do you think Scrapy isn't crawling b=2&a=1 even once? Can you provide some evidence to support this (output, debug messages, preferably through pastebin, as formatting in email is hard to read)?
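
For reference, here is a rough sketch of why the two URLs in the quoted message collide in the duplicate filter. It is not Scrapy's actual code: Scrapy canonicalizes URLs (which sorts the query arguments) before fingerprinting a request, and the stdlib-only helpers below (`canonical_url`, `fingerprint`, `SeenFilter`) are hypothetical stand-ins that only mimic that behavior.

```python
import hashlib
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def canonical_url(url):
    """Return the URL with its query arguments sorted.

    Hypothetical stand-in for the canonicalization Scrapy applies
    before fingerprinting a request.
    """
    parts = urlsplit(url)
    pairs = sorted(parse_qsl(parts.query, keep_blank_values=True))
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(pairs), ""))


def fingerprint(url):
    """Hash the canonical form, so argument order does not matter."""
    return hashlib.sha1(canonical_url(url).encode("utf-8")).hexdigest()


class SeenFilter:
    """Simplified duplicate filter: remembers the fingerprints it has seen."""

    def __init__(self):
        self.seen = set()

    def request_seen(self, url):
        fp = fingerprint(url)
        if fp in self.seen:
            return True  # would be dropped as a duplicate
        self.seen.add(fp)
        return False  # first time: would be crawled


dupefilter = SeenFilter()
# The URL found on the page is let through the first time...
first = dupefilter.request_seen("http://example.com/?b=2&a=1")
# ...but a later request differing only in argument order is a duplicate.
second = dupefilter.request_seen("http://example.com/?a=1&b=2")
print(first, second)
```

Under these assumptions the first request passes and only the later one is filtered, which is why the URL should still be fetched once. If a particular request really has to bypass the filter, creating it with `dont_filter=True` is one option.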
On Mon, Dec 15, 2014 at 2:57 PM, crawler <[email protected]> wrote:
>
> I have a site that performs 301 redirect from example.com?a=1&b=2 to
> example.com?b=2&a=1
> So, I got trouble:
> - Scrapy found url example.com?b=2&a=1 and put in a queue;
> - then Scrapy changed URL to example.com?a=1&b=2 and sent request;
> - site performed redirect to example.com?b=2&a=1;
> - Scrapy got URL example.com?b=2&a=1 and filtered this URL out as a
> duplicate.
>
> What should I do? I can't disable RFPDupeFilter, because there are real
> duplicate links.
> I can't change that site's behavior.
>
> --
> You received this message because you are subscribed to the Google Groups
> "scrapy-users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at http://groups.google.com/group/scrapy-users.
> For more options, visit https://groups.google.com/d/optout.
