Dear all, I have the following problem, for which I couldn't find a 
solution on the Internet or work one out myself. :(

 

I would like to write an RSS feed to a file with PyRSS2Gen from within a 
Scrapy item pipeline. The problem is that the pipeline code runs with only 
one Scrapy item at a time, so I cannot loop over the items while building 
PyRSS2Gen.RSS2 and write the language, copyright, title, and so on only 
once beforehand (see the comment in the code example).

The header is repeated with each item in the generated RSS file. The file 
is currently opened with the "a" (append) option; it should really be "w".

 

How can I write the RSS file without repeating the header? Where is the 
best place in the Scrapy framework to put this code, so that it collects 
all items and writes them to the file with a single header? Thanks for any 
hints!


#Code as seen in pipelines.py

import datetime
import PyRSS2Gen

def write_to_rss(item):

    # This should be done only once per rss-file write operation
    rss = PyRSS2Gen.RSS2(
        language = "en-US",
        copyright = "None",
        title = "My Feed",
        link = "http://www.mysite.com/",
        # description is a required argument of PyRSS2Gen.RSS2;
        # the value here is only a placeholder
        description = "My Feed",
        lastBuildDate = datetime.datetime.utcnow(),

        # This should be done for each Scrapy item; without Scrapy this
        # part loops over each entry in the rss feed.
        items = [
            PyRSS2Gen.RSSItem(
                title = str(item['headline']),
                description = str(item['article_content']),
            ),
        ])
    # In the original the "w" option is used to replace the feed, i.e. to
    # have the correct lastBuildDate
    rss.write_xml(open("pyrss2gen.xml", "a"))

class WriteToRSS(object):
    def process_item(self, item, spider):
        write_to_rss(item)
        return item

It would also be possible to append items to the rss variable before the 
whole variable gets written to the file. PyRSS2Gen then wraps the items in 
XML and places the header before them in the RSS feed. But I don't know 
where and how to place this code in Scrapy, or how to call it from the 
WriteToRSS class.

    # Add item to the rss feed
    rss.items.append(PyRSS2Gen.RSSItem(
        title = str(item['headline']),
        description = str(item['article_content'])))...
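
One possible arrangement, as a minimal sketch rather than a definitive 
answer: besides process_item, a Scrapy item pipeline may also define 
open_spider and close_spider, which Scrapy calls once when the spider 
starts and once when it finishes. The items could be collected in 
process_item and the file written a single time in close_spider. To keep 
the sketch runnable without extra packages, the feed is written here with 
the standard library's xml.etree.ElementTree instead of PyRSS2Gen; the 
names CollectToRSS, write_feed and feed_path are made up for illustration. 
With PyRSS2Gen, the body of write_feed would presumably be the RSS2(...) 
call from above with the full item list, followed by write_xml on a file 
opened with "w".

```python
import datetime
import email.utils
import xml.etree.ElementTree as ET


def write_feed(entries, path):
    """Write one feed file: a single channel header, then all items."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # Channel header, written exactly once per file.
    ET.SubElement(channel, "title").text = "My Feed"
    ET.SubElement(channel, "link").text = "http://www.mysite.com/"
    ET.SubElement(channel, "language").text = "en-US"
    now = datetime.datetime.now(datetime.timezone.utc)
    ET.SubElement(channel, "lastBuildDate").text = email.utils.format_datetime(
        now, usegmt=True)
    # One <item> element per collected entry.
    for headline, content in entries:
        node = ET.SubElement(channel, "item")
        ET.SubElement(node, "title").text = headline
        ET.SubElement(node, "description").text = content
    # Writing to a path replaces any previous file instead of appending.
    ET.ElementTree(rss).write(path, encoding="utf-8", xml_declaration=True)


class CollectToRSS(object):
    # Hypothetical attribute so the output location can be changed.
    feed_path = "pyrss2gen.xml"

    def open_spider(self, spider):
        # Called once when the spider starts: begin with an empty list.
        self.entries = []

    def process_item(self, item, spider):
        # Called once per item: only remember it, do not write yet.
        self.entries.append(
            (str(item["headline"]), str(item["article_content"])))
        return item

    def close_spider(self, spider):
        # Called once when the spider finishes: write everything in one go.
        write_feed(self.entries, self.feed_path)
```

The trade-off of this approach is that all items stay in memory until the 
spider closes, which should be fine for a feed of modest size.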

