Are you doing anything other than parsing the XML? Also, how big is the Atom feed? > 10MB?
Also, are you using a DOM parser or a stream-based one? Stream-based will be a bit faster. Finally, how long does it take you to parse, say, 100 Atom entries using your code? I would expect the time to be on the order of <1 ms.

> -----Original Message-----
> From: [email protected] [mailto:google-
> [email protected]] On Behalf Of Sérgio Nunes
> Sent: Tuesday, February 24, 2009 6:04 AM
> To: Google App Engine
> Subject: [google-appengine] Advice on dealing with high CPU consumption
> in fetch + parse script
>
> Hi,
>
> I would like to have some advice on how to deal with a CPU-consuming
> script.
> The script simply fetches an Atom XML file (using urlfetch) and then
> parses each item using both minidom and BeautifulSoup. The Atom file
> typically has 50 entries.
>
> It seems that spawning a process for each N entries to be parsed would
> be the best option. However, I think this is not possible with GAE.
>
> The Atom file is being retrieved every hour. I could reduce the number
> of entries to be parsed by increasing the frequency of urlfetch calls.
> The trade-off seems to be between more calls to urlfetch with fewer
> items to parse, or fewer calls to urlfetch with more items to parse.
>
> Any other option I am missing?
> In a nutshell, what is the best (optimized and scalable) way to
> periodically fetch and parse an Atom feed?
>
> Thanks in advance for any comments,
> --
> Sérgio Nunes
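For what it's worth, here is a minimal sketch of the stream-based approach suggested above, using the standard library's `xml.etree.ElementTree.iterparse` instead of minidom + BeautifulSoup. The feed built here is synthetic (the real feed's element structure may differ), and the entry fields extracted (`title`, `link`) are just illustrative:

```python
import time
from io import BytesIO
from xml.etree.ElementTree import iterparse

ATOM_NS = "{http://www.w3.org/2005/Atom}"

def parse_entries(xml_bytes):
    """Collect (title, href) pairs, freeing each entry as it is parsed."""
    entries = []
    for event, elem in iterparse(BytesIO(xml_bytes), events=("end",)):
        if elem.tag == ATOM_NS + "entry":
            title = elem.findtext(ATOM_NS + "title")
            link_el = elem.find(ATOM_NS + "link")
            href = link_el.get("href") if link_el is not None else None
            entries.append((title, href))
            elem.clear()  # discard the subtree so memory stays flat
    return entries

# Build a synthetic 100-entry Atom feed to time the parse.
entry_tmpl = (
    '<entry><title>Entry %d</title>'
    '<link href="http://example.com/%d"/></entry>'
)
feed = (
    '<feed xmlns="http://www.w3.org/2005/Atom">'
    + "".join(entry_tmpl % (i, i) for i in range(100))
    + "</feed>"
).encode("utf-8")

start = time.time()
entries = parse_entries(feed)
elapsed_ms = (time.time() - start) * 1000
print("parsed %d entries in %.2f ms" % (len(entries), elapsed_ms))
```

Running something like this against your actual feed should tell you quickly whether the parsing itself is the CPU sink, or whether the cost is elsewhere (e.g. in BeautifulSoup re-parsing each entry's content).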
