Hello Scrapy users,
We released Scrapy 1.4.0 last Thursday and we hope you will like it. It brings a bunch of bug fixes, but also a handful of new features.

response.follow: the new kid in town

Check out the new response.follow <https://doc.scrapy.org/en/latest/topics/request-response.html#scrapy.http.Response.follow> shortcut method to properly build Request objects in your callbacks. It is the new recommended way to do that: it’s shorter to write, and more correct. So, instead of:

    for href in response.css('li.page a::attr(href)').extract():
        url = response.urljoin(href)
        yield scrapy.Request(url, self.parse, encoding=response.encoding)

you can now write this:

    for a in response.css('li.page a'):
        yield response.follow(a, self.parse)

FTP in Python 3

Scrapy finally supports FTP in Python 3, and even adds support for anonymous FTP sessions. Just make sure you are using at least Twisted 17.1.

Link extractors

Link extractors also got some love regarding leading and trailing whitespace. Their behavior is now much closer to what your regular desktop browser does when following hyperlinks.

Oh, and we disabled the default canonicalization of URLs for extracted links. It was causing more trouble for users than anything.

Referrer policy

Handling of the “Referer” HTTP header is now driven by a customizable Referrer Policy, as defined by the W3C <https://www.w3.org/TR/referrer-policy/>. Check out the details and security implications in the dedicated docs section <https://docs.scrapy.org/en/latest/topics/spider-middleware.html#std:setting-REFERRER_POLICY>.

Pretty-printing your items

Scrapy 1.4 also has a new option for pretty-printing items when you export to JSON or XML. By default, you still have items on their own line. But you can also get a more human-readable output with a non-negative FEED_EXPORT_INDENT <https://docs.scrapy.org/en/latest/topics/feed-exports.html#std:setting-FEED_EXPORT_INDENT>.
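
Like any Scrapy setting, FEED_EXPORT_INDENT can be set once per project rather than on every run, and the same goes for REFERRER_POLICY from the previous section. Here is a minimal sketch, assuming a project with the usual settings.py module; the values are only illustrative, and "same-origin" is just one of the W3C policy names, not a recommendation:

```python
# settings.py of a hypothetical Scrapy project (values are illustrative)

# Pretty-print exported JSON/XML items with two spaces per nesting level.
FEED_EXPORT_INDENT = 2

# Pick a referrer policy by its W3C name (assumed choice for this sketch).
REFERRER_POLICY = "same-origin"
```

A value passed with -s on the command line takes precedence over the one in settings.py, so the file can hold your default and the command line can override it for a single crawl.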

To get pretty-printed JSON with an indentation of two spaces, run:

    $ scrapy crawl yourspider -o items.json -s FEED_EXPORT_INDENT=2

We recommend that all users upgrade to Scrapy 1.4.0.

Pip users:

    $ pip install --upgrade scrapy

Conda users:

    $ conda install -c conda-forge scrapy=1.4.0

Check out the release notes <https://docs.scrapy.org/en/latest/news.html#scrapy-1-4-0-2017-05-18> for the full changelog.

Happy scraping!

/Paul, for the Scrapy team