Re: [backstage] Introducing Chipwrapper search for UK newspapers
On 01/09/07, Martin Belam [EMAIL PROTECTED] wrote:

Yeah, mostly Pipes to process the RSS feeds, and the Google Custom Search Engine. There's also some very crude Perl of my own to add 'Newspaper: Some newspaper headline' into the RSS before it gets passed to Feedburner, and to make the 'headline buzz' feed.

Apologies on and off list for delay in replying to people. At the moment in Crete I'm getting 7.2 Kbps online in 30 second bursts.

Ho ho ho I had the same problem on Crete - but they had brilliant broadband on Kos...

m

On 31/08/07, Ian Forrester [EMAIL PROTECTED] wrote:

Nice work Martin, I'll add it to the prototype list on Monday. What's it built using? Just pipes and Google custom search?

Ian Forrester

This e-mail is: [ x ] private; [ ] ask first; [ ] bloggable

Senior Producer, BBC Backstage
BC5 C3, Media Village, 201 Wood Lane, London W12 7TP
e: [EMAIL PROTECTED]
p: +44 (0)2080083965

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Martin Belam
Sent: 30 August 2007 22:12
To: backstage@lists.bbc.co.uk
Subject: [backstage] Introducing Chipwrapper search for UK newspapers

Hi all,

I wanted to introduce to the list a site I've been working on, and invite you to have a play with the feeds being produced, and maybe help make some new tools for it.

Called Chipwrapper, it is intended to be a hub for searching purely UK newspapers and UK news sources.

http://www.chipwrapper.co.uk

OK, I know, I know, the Chipwrapper metaphor doesn't work once you add TV news. Originally it was strictly newspapers only, but it just seemed weird to be searching UK news and not see links from the BBC, ITN and Sky.

The homepage is a headline aggregator and a Google Custom Search Engine which only brings back results from the major UK newspapers, plus the TV news giants.
astonishingly long link which will break in your mail client
http://www.google.com/custom?cx=003036505619348485408%3Aminejdg5pkecof=AH%3Aleft%3BALC%3A%2366%3BBGC%3A%23FF%3BCX%3AChipwrapper%3BDIV%3A%23BB%3BFORID%3A0%3BGALT%3A%23003300%3BGFNT%3A%2366%3BGIMP%3A%2366%3BL%3Ahttp%3A%2F%2Fwww.chipwrapper.co.uk%2Fimages%2Fchipwrapper-logo-small.jpg%3BLC%3A%2366%3BLH%3A35%3BLP%3A1%3BS%3Ahttp%3A%2F%2Fwww.chipwrapper.co.uk%3BT%3A%2333%3BVLC%3A%23002200%3Bq=BBC+backstage
/astonishingly long link which will break in your mail client

I plan to add regional and local newspapers to the results later in the year.

There are Opensearch plugins and a custom Google Toolbar button for the service.

http://www.chipwrapper.co.uk/tools/browser_search_plugins.shtml
http://www.chipwrapper.co.uk/tools/google_toolbar_buttons.shtml

There are also some RSS feeds for news headlines, sport headlines and football headlines - with some rugby-flavoured stuff to come to tie-in with the upcoming world cup.

http://www.chipwrapper.co.uk/tools/rss_feeds.shtml

There's also a Headline Buzz feature. It uses a longer Yahoo! Pipe which takes ten headlines for each source - http://pipes.yahoo.com/pipes/pipe.run?_id=QKDz_ihT3BGSheQho_NLYQ_render=rss - and then analyses the most popular words. The top 7 words (at the moment) appear on the Chipwrapper homepage as the Headline Buzz links, but there is also a headline buzz RSS feed. This has all of the words (minus stop words like 'the', 'of' etc) that appear more than 3 times in the set of headlines in popularity order. It refreshes every hour.

http://feeds.feedburner.com/chipwrapper-buzz

There's a page on the site about making DIY stuff, with links to all the feeds and the original Yahoo! Pipes I've used to mash-up the newspaper content in one place.
http://www.chipwrapper.co.uk/tools/make_stuff.shtml

So far, apart from the cost of registering the domain and my own time, I've done everything using free (as in didn't cost me money) tools and free (as in I've republished it but am not quite sure how The Sun's lawyers are going to take it) content.

There's lots of things that I've thought of, but don't have the ability and/or time to do or learn about - like mash-ups with maps, tracking headline changes over time, email alerts on topics - which maybe some of you guys and gals might want to play with?

Of course, any feedback on what is there already is very welcome on or off-list - [EMAIL PROTECTED] - plus does anyone know of a really good ready-to-download text file of 'stop' words, because at the moment I'm having to build it up by hand?

I'm going to be in London for most of September and October with my BBC hat back on for a bit, so hopefully I might see/meet some of you at something suitably geeky during the course of that.

all the best,
martin

--
Martin Belam - http://www.currybet.net
-
Sent via the backstage.bbc.co.uk discussion group. To unsubscribe, please visit http
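
For anyone wanting to play with the Headline Buzz idea described above, here is a rough Python sketch of the counting step only. It is not Martin's Perl or the Yahoo! Pipe, just one way the described behaviour (drop stop words, keep words that appear more than 3 times, order by popularity, optionally take the top 7) might be reproduced. The function name headline_buzz, the tiny STOP_WORDS set and the tokenising rule are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical, hand-built stop list; the real list is not given in the thread.
STOP_WORDS = {"the", "of", "a", "an", "and", "in", "on", "to",
              "for", "is", "as", "at", "by", "with"}


def headline_buzz(headlines, stop_words=STOP_WORDS, min_count=4, top_n=None):
    """Return (word, count) pairs for words seen at least min_count times.

    min_count=4 mirrors "appear more than 3 times"; results are ordered
    most frequent first, with ties broken alphabetically.
    """
    counts = Counter()
    for headline in headlines:
        # Assumed tokeniser: lower-case runs of letters, allowing apostrophes.
        words = re.findall(r"[a-z]+(?:'[a-z]+)*", headline.lower())
        counts.update(w for w in words if w not in stop_words)
    ranked = sorted(
        ((word, count) for word, count in counts.items() if count >= min_count),
        key=lambda pair: (-pair[1], pair[0]),
    )
    return ranked if top_n is None else ranked[:top_n]


if __name__ == "__main__":
    sample = [
        "Brown to meet Bush",
        "Brown speaks on the economy",
        "Bush and Brown in talks",
        "Brown: the talks",
        "Bush visits",
        "Bush talks",
        "Bush again",
    ]
    # The homepage shows the top 7 words; the feed would use top_n=None.
    print(headline_buzz(sample, top_n=7))
```

Calling it with top_n=7 would correspond to the homepage links, and leaving top_n unset to the full buzz feed. A downloaded stop word file could be swapped in by passing a different set as stop_words.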
Re: [backstage] Introducing Chipwrapper search for UK newspapers
One of the problems I had with the datestamps is that certain newspapers' CMS systems date all of their entries to being published on January 1st 1971 or some such arbitrary figure - but possibly I can do some more work on the Pipes.

m

On 31/08/2007, Mario Menti [EMAIL PROTECTED] wrote:

Hi Martin,

this is cool, and I was immediately thinking of feeding the headlines to twitter using twitterfeed. However, some of the feeds, e.g. http://feeds.feedburner.com/chipwrapper (the chipwrapper uk newspaper headlines) don't seem to contain any date stamps, so won't work with twitterfeed which needs the time stamps to know if an item is new and/or has been posted previously. Any chance you could add this?

Cheers,
Mario.

--
Martin Belam - http://www.currybet.net
-
Sent via the backstage.bbc.co.uk discussion group. To unsubscribe, please visit http://backstage.bbc.co.uk/archives/2005/01/mailing_list.html. Unofficial list archive: http://www.mail-archive.com/backstage@lists.bbc.co.uk/
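
On the datestamp problem discussed above, one possible workaround is sketched below in Python. It is only a guess at an approach, not what the Pipes or twitterfeed actually do: treat a missing pubDate, or one earlier than some cutoff, as unreliable, and fall back to the time the item was first seen. The function name item_dates, the EARLIEST_PLAUSIBLE cutoff and the first_seen dictionary are all hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

# Assumption: anything dated before this is a CMS placeholder, not a real date.
EARLIEST_PLAUSIBLE = datetime(1995, 1, 1, tzinfo=timezone.utc)


def item_dates(rss_xml, first_seen, now=None):
    """Map each item's link to a usable timestamp.

    first_seen is a dict (link -> datetime) kept between runs. It is updated
    in place, so an item keeps the time at which it was first noticed.
    """
    now = now or datetime.now(timezone.utc)
    result = {}
    for item in ET.fromstring(rss_xml).iter("item"):
        link = (item.findtext("link") or "").strip()
        if not link:
            continue  # nothing stable to key the item on
        stamp = None
        raw = item.findtext("pubDate")
        if raw:
            try:
                stamp = parsedate_to_datetime(raw.strip())
            except (TypeError, ValueError):
                stamp = None  # unparseable date, treat as missing
        if stamp is not None and stamp.tzinfo is None:
            # Assume UTC when the feed gives no usable zone.
            stamp = stamp.replace(tzinfo=timezone.utc)
        if stamp is None or stamp < EARLIEST_PLAUSIBLE:
            # Missing or implausible: use (and remember) the first-seen time.
            stamp = first_seen.setdefault(link, now)
        result[link] = stamp
    return result
```

The first_seen dictionary would need to be saved between runs (a small file would do) so that an item with a bogus date does not look new every time the feed is refreshed.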