RE: Uploading the BBC programme catalogue to freebase (was RE: [backstage] Programme Catalogue vs. Freebase (was: BBC Programme Catalogue -any APIs yet?))
Hi Graeme Get in touch with me off list and we can sort this out. thanks Jem Stone. -Original Message- From: [EMAIL PROTECTED] on behalf of Robin Doran Sent: Tue 7/24/2007 9:50 PM To: backstage@lists.bbc.co.uk; backstage@lists.bbc.co.uk Subject: RE: Uploading the BBC programme catalogue to freebase (was RE: [backstage] Programme Catalogue vs. Freebase (was: BBC Programme Catalogue -any APIs yet?)) Hi Graeme, The robots.txt file has been accidentally dropped from the new release and we will be re-introducing it, this is due to initial concerns & complaints raised about personal data population in external search engines when the service was launched. On the subject of scraping the data, I've asked the catalogue.bbc.co.uk team to clarify the terms of use on the data to see if that will help answer your question but if you have a specific request then I would recommend using the Contact Us page http://catalogue.bbc.co.uk/catalogue/infax/contact Regards, From: [EMAIL PROTECTED] on behalf of Graeme West Sent: Tue 7/24/2007 20:39 To: backstage@lists.bbc.co.uk Subject: Re: Uploading the BBC programme catalogue to freebase (was RE: [backstage] Programme Catalogue vs. Freebase (was: BBC Programme Catalogue -any APIs yet?)) Hi all, Sorry to re-open an old thread - just wondering what the position is on scraping the catalogue.bbc.co.uk test site? I say this because I'm trying a little experiment - ingesting the whole catalogue into our Fedora repository ( http://www.fedora.info ) to be cross-referenced with the 200+ hours of BBC audio and video which we legally hold in our legacy repository as per our deposit agreement with the BBC ( http://www.spokenword.ac.uk/using-audio-video/copyright/ ). The reason I ask is that I've constructed a set of scripts which scrape the catalogue.bbc.co.uk archive's RDF files. 
I've already got a 'master' list of all programme URLs (the script to generate that took a pretty long time on a JANET connection), but having started the crawler grabbing the actual RDF streams for each programme, I can see that this is going to involve a pretty large amount of data transfer. FYI, my crawler uses Wget and respects robots.txt files. There's no robots.txt file on catalogue.bbc.co.uk so it seems to be fair game, but there is one on open.bbc.co.uk - I'm scraping from the former obviously. Clearly there's a licensing issue with copying the content but I'm only trying this as a technical experiment at this stage anyway - it will not be publicly available. -- Graeme West Spoken Word Services Glasgow Caledonian University Email: [EMAIL PROTECTED] <mailto:[EMAIL PROTECTED]> Project web site: http://www.spokenword.ac.uk/ <http://www.spokenword.ac.uk/> On 9 Jul 2007, at 21:30, Brendan Quinn wrote: I was considering entering a hack for Hack Day around that very thing. But then they went and made me one of the judges ;-) Wanna help? A simple set of scripts that scrape the archive (er I mean "call that big RESTful API") and post entries/updates to the freebase sandbox server would be an interesting experiment. I agree that freebase is an amazing resource, especially when the programme data is curated properly: compare http://www.freebase.com/view/?id=%239202a8c04000641f80012406 with http://open.bbc.co.uk/catalogue/infax/series/DOCTOR+WHO ! There may be some rights issues around what would basically amount to opening up the programme catalogue under the creative commons attribution license, where the attribution wouldn't go to the BBC but to Freebase... Brendan. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Oliver Cole Sent: 09 July 2007 20:51 To: backstage@lists.bbc.co.uk Subject: [backstage] Programme Catalogue vs. Freebase (was: BBC Programme Catalogue -any APIs yet?) 
I've been following the Programme Catalogue since it was announced, and its pretty interesting. I do however have a question for the BBC people on the list - have you considered simply uploading all the information to Freebase[1]? I can understand that you might want to keep it in house, but if you merged it with the wealth of information on Freebase you can do exponentially more. For example, if it was properly integrated you could run a query that would tell me how many of the contributors to Spooks series 2 were born in London. Regards, Oli [1] http://www.freebase.com - A very cool structured database, currently handling 2.3 million instances of 870 'types' - Sent vi
RE: Uploading the BBC programme catalogue to freebase (was RE: [backstage] Programme Catalogue vs. Freebase (was: BBC Programme Catalogue -any APIs yet?))
Hi Graeme, The robots.txt file has been accidentally dropped from the new release and we will be re-introducing it, this is due to initial concerns & complaints raised about personal data population in external search engines when the service was launched. On the subject of scraping the data, I've asked the catalogue.bbc.co.uk team to clarify the terms of use on the data to see if that will help answer your question but if you have a specific request then I would recommend using the Contact Us page http://catalogue.bbc.co.uk/catalogue/infax/contact Regards, From: [EMAIL PROTECTED] on behalf of Graeme West Sent: Tue 7/24/2007 20:39 To: backstage@lists.bbc.co.uk Subject: Re: Uploading the BBC programme catalogue to freebase (was RE: [backstage] Programme Catalogue vs. Freebase (was: BBC Programme Catalogue -any APIs yet?)) Hi all, Sorry to re-open an old thread - just wondering what the position is on scraping the catalogue.bbc.co.uk test site? I say this because I'm trying a little experiment - ingesting the whole catalogue into our Fedora repository ( http://www.fedora.info ) to be cross-referenced with the 200+ hours of BBC audio and video which we legally hold in our legacy repository as per our deposit agreement with the BBC ( http://www.spokenword.ac.uk/using-audio-video/copyright/ ). The reason I ask is that I've constructed a set of scripts which scrape the catalogue.bbc.co.uk archive's RDF files. I've already got a 'master' list of all programme URLs (the script to generate that took a pretty long time on a JANET connection), but having started the crawler grabbing the actual RDF streams for each programme, I can see that this is going to involve a pretty large amount of data transfer. FYI, my crawler uses Wget and respects robots.txt files. There's no robots.txt file on catalogue.bbc.co.uk so it seems to be fair game, but there is one on open.bbc.co.uk - I'm scraping from the former obviously. 
Clearly there's a licensing issue with copying the content but I'm only trying this as a technical experiment at this stage anyway - it will not be publicly available. -- Graeme West Spoken Word Services Glasgow Caledonian University Email: [EMAIL PROTECTED] <mailto:[EMAIL PROTECTED]> Project web site: http://www.spokenword.ac.uk/ <http://www.spokenword.ac.uk/> On 9 Jul 2007, at 21:30, Brendan Quinn wrote: I was considering entering a hack for Hack Day around that very thing. But then they went and made me one of the judges ;-) Wanna help? A simple set of scripts that scrape the archive (er I mean "call that big RESTful API") and post entries/updates to the freebase sandbox server would be an interesting experiment. I agree that freebase is an amazing resource, especially when the programme data is curated properly: compare http://www.freebase.com/view/?id=%239202a8c04000641f80012406 with http://open.bbc.co.uk/catalogue/infax/series/DOCTOR+WHO ! There may be some rights issues around what would basically amount to opening up the programme catalogue under the creative commons attribution license, where the attribution wouldn't go to the BBC but to Freebase... Brendan. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Oliver Cole Sent: 09 July 2007 20:51 To: backstage@lists.bbc.co.uk Subject: [backstage] Programme Catalogue vs. Freebase (was: BBC Programme Catalogue -any APIs yet?) I've been following the Programme Catalogue since it was announced, and its pretty interesting. I do however have a question for the BBC people on the list - have you considered simply uploading all the information to Freebase[1]? I can understand that you might want to keep it in house, but if you merged it with the wealth of information on Freebase you can do exponentially more. For example, if it was properly integrated you could run a query that would tell me how many of the contributors to Spooks series 2 were born in London. 
Regards, Oli [1] http://www.freebase.com - A very cool structured database, currently handling 2.3 million instances of 870 'types' - Sent via the backstage.bbc.co.uk discussion group. To unsubscribe, please visit http://backstage.bbc.co.uk/archives/2005/01/mailing_list.html. Unofficial list archive: http://www.mail-archive.com/backstage@lists.bbc.co.uk/ Email has been scanned for viruses by Altman Technologies' email management service - www.altman.co.uk/emailsystems
Re: Uploading the BBC programme catalogue to freebase (was RE: [backstage] Programme Catalogue vs. Freebase (was: BBC Programme Catalogue -any APIs yet?))
Hi all, Sorry to re-open an old thread - just wondering what the position is on scraping the catalogue.bbc.co.uk test site? I say this because I'm trying a little experiment - ingesting the whole catalogue into our Fedora repository ( http://www.fedora.info ) to be cross-referenced with the 200+ hours of BBC audio and video which we legally hold in our legacy repository as per our deposit agreement with the BBC ( http://www.spokenword.ac.uk/using-audio-video/copyright/ ). The reason I ask is that I've constructed a set of scripts which scrape the catalogue.bbc.co.uk archive's RDF files. I've already got a 'master' list of all programme URLs (the script to generate that took a pretty long time on a JANET connection), but having started the crawler grabbing the actual RDF streams for each programme, I can see that this is going to involve a pretty large amount of data transfer. FYI, my crawler uses Wget and respects robots.txt files. There's no robots.txt file on catalogue.bbc.co.uk so it seems to be fair game, but there is one on open.bbc.co.uk - I'm scraping from the former obviously. Clearly there's a licensing issue with copying the content but I'm only trying this as a technical experiment at this stage anyway - it will not be publicly available. -- Graeme West Spoken Word Services Glasgow Caledonian University Email: [EMAIL PROTECTED] Project web site: http://www.spokenword.ac.uk/ On 9 Jul 2007, at 21:30, Brendan Quinn wrote: I was considering entering a hack for Hack Day around that very thing. But then they went and made me one of the judges ;-) Wanna help? A simple set of scripts that scrape the archive (er I mean "call that big RESTful API") and post entries/updates to the freebase sandbox server would be an interesting experiment. 
I agree that freebase is an amazing resource, especially when the programme data is curated properly: compare http://www.freebase.com/view/?id=%239202a8c04000641f80012406 with http://open.bbc.co.uk/catalogue/infax/series/DOCTOR+WHO ! There may be some rights issues around what would basically amount to opening up the programme catalogue under the creative commons attribution license, where the attribution wouldn't go to the BBC but to Freebase... Brendan. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Oliver Cole Sent: 09 July 2007 20:51 To: backstage@lists.bbc.co.uk Subject: [backstage] Programme Catalogue vs. Freebase (was: BBC Programme Catalogue -any APIs yet?) I've been following the Programme Catalogue since it was announced, and it's pretty interesting. I do however have a question for the BBC people on the list - have you considered simply uploading all the information to Freebase[1]? I can understand that you might want to keep it in house, but if you merged it with the wealth of information on Freebase you can do exponentially more. For example, if it was properly integrated you could run a query that would tell me how many of the contributors to Spooks series 2 were born in London. Regards, Oli [1] http://www.freebase.com - A very cool structured database, currently handling 2.3 million instances of 870 'types' - Sent via the backstage.bbc.co.uk discussion group. To unsubscribe, please visit http://backstage.bbc.co.uk/archives/2005/01/mailing_list.html. Unofficial list archive: http://www.mail-archive.com/backstage@lists.bbc.co.uk/ Email has been scanned for viruses by Altman Technologies' email management service - www.altman.co.uk/emailsystems
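For readers following the robots.txt point in Graeme's message, the check a polite crawler makes before fetching a URL can be sketched in Python with the standard library's `urllib.robotparser`. This is only an illustration under stated assumptions: the helper name `may_fetch`, the `Wget` user-agent string and the sample rule are invented here, and the thread does not say what the BBC's robots.txt actually contains.

```python
from urllib.robotparser import RobotFileParser


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url.

    robots_txt is the text of a robots.txt file that has already been
    downloaded; an empty string stands for "no rules published".
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rule, NOT the real BBC file: block the catalogue pages.
SAMPLE_RULES = "User-agent: *\nDisallow: /catalogue/\n"
PROGRAMME_URL = "http://catalogue.bbc.co.uk/catalogue/infax/programme/LLVM414K"

if __name__ == "__main__":
    # With no rules published, the crawler is free to fetch.
    print(may_fetch("", "Wget", PROGRAMME_URL))
    # Once a Disallow rule covers the path, the same fetch is refused.
    print(may_fetch(SAMPLE_RULES, "Wget", PROGRAMME_URL))
```

The rules are passed in as text rather than fetched, so the sketch runs without touching any BBC server. A crawl started while the file is missing would need to re-check it once the file is reinstated.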
Re: [backstage] Re: Uploading the BBC programme catalogue to freebase (was RE: [backstage] Programme Catalogue vs. Freebase (was: BBC Programme Catalogue -any APIs yet?))
On 7/10/07, Tom Loosemore <[EMAIL PROTECTED]> wrote: it'd be a much easier sell chez auntie if Freebase itself didn't demand attribution for any use of the content and data within it... why should the BBC give up attribution (its form of pseudo revenue) on its data and hand over attribution of credit and link juice to Freebase? Why does Freebase insist on its own attribution licence for 3rd party use of all content & data other people upload? As I was pondering the answer to this question, it struck me that it's not clear on what basis Freebase.com exists - by which I mean Freebase.com the entity, rather than the content and data within it. Is Freebase.com a charity? Is it a not-for-profit? Is it fully commercial? Or is it just in its understandably confused early stages? The FAQ is not particularly forthcoming. http://www.freebase.com/signin/faq Not much chance of Auntie handing over its data to someone else to apply their - valuable - attribution licence to with so much uncertainty - however cool it would be from a tech and product and play angle. On 10/07/07, Michael Smethurst <[EMAIL PROTECTED]> wrote: > Just a reminder that the top of the pops data is available under creative commons and has an xml representation (that could probably do with some work) and has musicbrainz ids and musicbrainz has been uploaded to freebase in its entirety > > Unfortunately it's under an attribution licence but like tom says about dr who, totp is "clearly [a] BBC programme[s], I'm not sure it's the end of > the world..." > > It's something I've been meaning to do for weeks but keep getting bored reading the api docs. If you wanna give it a go and need anything more from me shout...
> > http://bbc-hackday.dyndns.org:2821/ > > > > > -Original Message- > From: [EMAIL PROTECTED] on behalf of Tom Loosemore > Sent: Mon 7/9/2007 10:48 PM > To: backstage@lists.bbc.co.uk > Subject: Re: [backstage] Re: Uploading the BBC programme catalogue to freebase (was RE: [backstage] Programme Catalogue vs. Freebase (was: BBC Programme Catalogue -any APIs yet?)) > > On 09/07/07, Oliver Cole <[EMAIL PROTECTED]> wrote: > > On Mon, 2007-07-09 at 21:30 +0100, Brendan Quinn wrote: > > > I was considering entering a hack for Hack Day around that very thing. > > > But then they went and made me one of the judges ;-) > > > > > > Wanna help? A simple set of scripts that scrape the archive (er I mean > > > "call that big RESTful API") and post entries/updates to the freebase > > > sandbox server would be an interesting experiment. > > > > I've not yet (bulk) posted data on Freebase - I'll take a look at this > > when I'm more au fait with it. > > > > > compare > > > http://www.freebase.com/view/?id=%239202a8c04000641f80012406 > > > with > > > http://open.bbc.co.uk/catalogue/infax/series/DOCTOR+WHO > > > ! > > > > Freebase is still in alpha as far as I know - those who can't see the > > first link can see a screenshot at: > > http://cornflakes.imen.org.uk/~oli/DrWho.png > > > > Those who are particularly interested can feel free to ask me for one of > > my remaining 4 invites - and I imagine Brendan has some too. > > > > > There may be some rights issues around what would basically amount to > > > opening up the programme catalogue under the creative commons > > > attribution license, where the attribution wouldn't go to the BBC but to > > > Freebase... > > > > Well, the RDF for the catalogue links to > > http://backstage.bbc.co.uk/archives/2005/05/api_licence.html: > > > > "The BBC grants to You a ... non-sublicensable right to copy..." > > > > Further: > > > > d. 
not publish, distribute or otherwise make the APIs available, > > (including in any Work You create), in a way that would enable other > > people to download or use the APIs other than as set out in this > > Licence. > > standard backstage API licence - it was the only one lying around at > the time... (nov 2005) > > > I don't see any legal way that we can export the data to Freebase and > > relicense it as CC-BY. > > yeah... the attribution back to BBC kinda matters... though given the > programmes are clearly BBC programmes, I'm not sure it's the end of > the world... > > > > Would you be able to get the appropriate BBC people to get this done? > > I'll do a bit of lobbying... > - > Sent via the backstage.bbc.co.uk discussion group. To unsubscribe, please visit http://backstage.bbc.co.uk/archives/2005/01/mailing_list.html. Unoff
Re: [backstage] Re: Uploading the BBC programme catalogue to freebase (was RE: [backstage] Programme Catalogue vs. Freebase (was: BBC Programme Catalogue -any APIs yet?))
it'd be a much easier sell chez auntie if Freebase itself didn't demand attribution for any use of the content and data within it... why should the BBC give up attribution (its form of pseudo revenue) on its data and hand over attribution of credit and link juice to Freebase? Why does Freebase insist on its own attribution licence for 3rd party use of all content & data other people upload? As I was pondering the answer to this question, it struck me that it's not clear on what basis Freebase.com exists - by which I mean Freebase.com the entity, rather than the content and data within it. Is Freebase.com a charity? Is it a not-for-profit? Is it fully commercial? Or is it just in its understandably confused early stages? The FAQ is not particularly forthcoming. http://www.freebase.com/signin/faq Not much chance of Auntie handing over its data to someone else to apply their - valuable - attribution licence to with so much uncertainty - however cool it would be from a tech and product and play angle. On 10/07/07, Michael Smethurst <[EMAIL PROTECTED]> wrote: Just a reminder that the top of the pops data is available under creative commons and has an xml representation (that could probably do with some work) and has musicbrainz ids and musicbrainz has been uploaded to freebase in its entirety Unfortunately it's under an attribution licence but like tom says about dr who, totp is "clearly [a] BBC programme[s], I'm not sure it's the end of the world..." It's something I've been meaning to do for weeks but keep getting bored reading the api docs. If you wanna give it a go and need anything more from me shout... http://bbc-hackday.dyndns.org:2821/ -Original Message- From: [EMAIL PROTECTED] on behalf of Tom Loosemore Sent: Mon 7/9/2007 10:48 PM To: backstage@lists.bbc.co.uk Subject: Re: [backstage] Re: Uploading the BBC programme catalogue to freebase (was RE: [backstage] Programme Catalogue vs.
Freebase (was: BBC Programme Catalogue -any APIs yet?)) On 09/07/07, Oliver Cole <[EMAIL PROTECTED]> wrote: > On Mon, 2007-07-09 at 21:30 +0100, Brendan Quinn wrote: > > I was considering entering a hack for Hack Day around that very thing. > > But then they went and made me one of the judges ;-) > > > > Wanna help? A simple set of scripts that scrape the archive (er I mean > > "call that big RESTful API") and post entries/updates to the freebase > > sandbox server would be an interesting experiment. > > I've not yet (bulk) posted data on Freebase - I'll take a look at this > when I'm more au fait with it. > > > compare > > http://www.freebase.com/view/?id=%239202a8c04000641f80012406 > > with > > http://open.bbc.co.uk/catalogue/infax/series/DOCTOR+WHO > > ! > > Freebase is still in alpha as far as I know - those who can't see the > first link can see a screenshot at: > http://cornflakes.imen.org.uk/~oli/DrWho.png > > Those who are particularly interested can feel free to ask me for one of > my remaining 4 invites - and I imagine Brendan has some too. > > > There may be some rights issues around what would basically amount to > > opening up the programme catalogue under the creative commons > > attribution license, where the attribution wouldn't go to the BBC but to > > Freebase... > > Well, the RDF for the catalogue links to > http://backstage.bbc.co.uk/archives/2005/05/api_licence.html: > > "The BBC grants to You a ... non-sublicensable right to copy..." > > Further: > > d. not publish, distribute or otherwise make the APIs available, > (including in any Work You create), in a way that would enable other > people to download or use the APIs other than as set out in this > Licence. standard backstage API licence - it was the only one lying around at the time... (nov 2005) > I don't see any legal way that we can export the data to Freebase and > relicense it as CC-BY. yeah... the attribution back to BBC kinda matters... 
though given the programmes are clearly BBC programmes, I'm not sure it's the end of the world... > Would you be able to get the appropriate BBC people to get this done? I'll do a bit of lobbying... - Sent via the backstage.bbc.co.uk discussion group. To unsubscribe, please visit http://backstage.bbc.co.uk/archives/2005/01/mailing_list.html. Unofficial list archive: http://www.mail-archive.com/backstage@lists.bbc.co.uk/
RE: [backstage] Re: Uploading the BBC programme catalogue to freebase (was RE: [backstage] Programme Catalogue vs. Freebase (was: BBC Programme Catalogue -any APIs yet?))
one final thought. getting both the programme catalogue and totp into freebase would allow us to combine http://catalogue.bbc.co.uk/catalogue/infax/programme/LLVM414K and http://bbc-hackday.dyndns.org:2821/totp/episode/vj3n (the only episode I remember) and start to make links from bbc programmes to musicbrainz which would be a good thing -Original Message- From: [EMAIL PROTECTED] on behalf of Michael Smethurst Sent: Tue 7/10/2007 8:20 AM To: backstage@lists.bbc.co.uk Subject: RE: [backstage] Re: Uploading the BBC programme catalogue to freebase (was RE: [backstage] Programme Catalogue vs. Freebase (was: BBC Programme Catalogue -any APIs yet?)) Just a reminder that the top of the pops data is available under creative commons and has an xml representation (that could probably do with some work) and has musicbrainz ids and musicbrainz has been uploaded to freebase in its entirety Unfortunately it's under an attribution licence but like tom says about dr who, totp is "clearly [a] BBC programme[s], I'm not sure it's the end of the world..." It's something I've been meaning to do for weeks but keep getting bored reading the api docs. If you wanna give it a go and need anything more from me shout... http://bbc-hackday.dyndns.org:2821/ -Original Message- From: [EMAIL PROTECTED] on behalf of Tom Loosemore Sent: Mon 7/9/2007 10:48 PM To: backstage@lists.bbc.co.uk Subject: Re: [backstage] Re: Uploading the BBC programme catalogue to freebase (was RE: [backstage] Programme Catalogue vs. Freebase (was: BBC Programme Catalogue -any APIs yet?)) On 09/07/07, Oliver Cole <[EMAIL PROTECTED]> wrote: > On Mon, 2007-07-09 at 21:30 +0100, Brendan Quinn wrote: > > I was considering entering a hack for Hack Day around that very thing. > > But then they went and made me one of the judges ;-) > > > > Wanna help?
A simple set of scripts that scrape the archive (er I mean > > "call that big RESTful API") and post entries/updates to the freebase > > sandbox server would be an interesting experiment. > > I've not yet (bulk) posted data on Freebase - I'll take a look at this > when I'm more au fait with it. > > > compare > > http://www.freebase.com/view/?id=%239202a8c04000641f80012406 > > with > > http://open.bbc.co.uk/catalogue/infax/series/DOCTOR+WHO > > ! > > Freebase is still in alpha as far as I know - those who can't see the > first link can see a screenshot at: > http://cornflakes.imen.org.uk/~oli/DrWho.png > > Those who are particularly interested can feel free to ask me for one of > my remaining 4 invites - and I imagine Brendan has some too. > > > There may be some rights issues around what would basically amount to > > opening up the programme catalogue under the creative commons > > attribution license, where the attribution wouldn't go to the BBC but to > > Freebase... > > Well, the RDF for the catalogue links to > http://backstage.bbc.co.uk/archives/2005/05/api_licence.html: > > "The BBC grants to You a ... non-sublicensable right to copy..." > > Further: > > d. not publish, distribute or otherwise make the APIs available, > (including in any Work You create), in a way that would enable other > people to download or use the APIs other than as set out in this > Licence. standard backstage API licence - it was the only one lying around at the time... (nov 2005) > I don't see any legal way that we can export the data to Freebase and > relicense it as CC-BY. yeah... the attribution back to BBC kinda matters... though given the programmes are clearly BBC programmes, I'm not sure it's the end of the world... > Would you be able to get the appropriate BBC people to get this done? I'll do a bit of lobbying... - Sent via the backstage.bbc.co.uk discussion group. To unsubscribe, please visit http://backstage.bbc.co.uk/archives/2005/01/mailing_list.html. 
Unofficial list archive: http://www.mail-archive.com/backstage@lists.bbc.co.uk/
RE: [backstage] Re: Uploading the BBC programme catalogue to freebase (was RE: [backstage] Programme Catalogue vs. Freebase (was: BBC Programme Catalogue -any APIs yet?))
Just a reminder that the top of the pops data is available under creative commons and has an xml representation (that could probably do with some work) and has musicbrainz ids and musicbrainz has been uploaded to freebase in its entirety Unfortunately it's under an attribution licence but like tom says about dr who, totp is "clearly [a] BBC programme[s], I'm not sure it's the end of the world..." It's something I've been meaning to do for weeks but keep getting bored reading the api docs. If you wanna give it a go and need anything more from me shout... http://bbc-hackday.dyndns.org:2821/ -Original Message- From: [EMAIL PROTECTED] on behalf of Tom Loosemore Sent: Mon 7/9/2007 10:48 PM To: backstage@lists.bbc.co.uk Subject: Re: [backstage] Re: Uploading the BBC programme catalogue to freebase (was RE: [backstage] Programme Catalogue vs. Freebase (was: BBC Programme Catalogue -any APIs yet?)) On 09/07/07, Oliver Cole <[EMAIL PROTECTED]> wrote: > On Mon, 2007-07-09 at 21:30 +0100, Brendan Quinn wrote: > > I was considering entering a hack for Hack Day around that very thing. > > But then they went and made me one of the judges ;-) > > > > Wanna help? A simple set of scripts that scrape the archive (er I mean > > "call that big RESTful API") and post entries/updates to the freebase > > sandbox server would be an interesting experiment. > > I've not yet (bulk) posted data on Freebase - I'll take a look at this > when I'm more au fait with it. > > > compare > > http://www.freebase.com/view/?id=%239202a8c04000641f80012406 > > with > > http://open.bbc.co.uk/catalogue/infax/series/DOCTOR+WHO > > ! > > Freebase is still in alpha as far as I know - those who can't see the > first link can see a screenshot at: > http://cornflakes.imen.org.uk/~oli/DrWho.png > > Those who are particularly interested can feel free to ask me for one of > my remaining 4 invites - and I imagine Brendan has some too.
> > > There may be some rights issues around what would basically amount to > > opening up the programme catalogue under the creative commons > > attribution license, where the attribution wouldn't go to the BBC but to > > Freebase... > > Well, the RDF for the catalogue links to > http://backstage.bbc.co.uk/archives/2005/05/api_licence.html: > > "The BBC grants to You a ... non-sublicensable right to copy..." > > Further: > > d. not publish, distribute or otherwise make the APIs available, > (including in any Work You create), in a way that would enable other > people to download or use the APIs other than as set out in this > Licence. standard backstage API licence - it was the only one lying around at the time... (nov 2005) > I don't see any legal way that we can export the data to Freebase and > relicense it as CC-BY. yeah... the attribution back to BBC kinda matters... though given the programmes are clearly BBC programmes, I'm not sure it's the end of the world... > Would you be able to get the appropriate BBC people to get this done? I'll do a bit of lobbying... - Sent via the backstage.bbc.co.uk discussion group. To unsubscribe, please visit http://backstage.bbc.co.uk/archives/2005/01/mailing_list.html. Unofficial list archive: http://www.mail-archive.com/backstage@lists.bbc.co.uk/
Re: [backstage] Re: Uploading the BBC programme catalogue to freebase (was RE: [backstage] Programme Catalogue vs. Freebase (was: BBC Programme Catalogue -any APIs yet?))
On 09/07/07, Oliver Cole <[EMAIL PROTECTED]> wrote: On Mon, 2007-07-09 at 21:30 +0100, Brendan Quinn wrote: > I was considering entering a hack for Hack Day around that very thing. > But then they went and made me one of the judges ;-) > > Wanna help? A simple set of scripts that scrape the archive (er I mean > "call that big RESTful API") and post entries/updates to the freebase > sandbox server would be an interesting experiment. I've not yet (bulk) posted data on Freebase - I'll take a look at this when I'm more au fait with it. > compare > http://www.freebase.com/view/?id=%239202a8c04000641f80012406 > with > http://open.bbc.co.uk/catalogue/infax/series/DOCTOR+WHO > ! Freebase is still in alpha as far as I know - those who can't see the first link can see a screenshot at: http://cornflakes.imen.org.uk/~oli/DrWho.png Those who are particularly interested can feel free to ask me for one of my remaining 4 invites - and I imagine Brendan has some too. > There may be some rights issues around what would basically amount to > opening up the programme catalogue under the creative commons > attribution license, where the attribution wouldn't go to the BBC but to > Freebase... Well, the RDF for the catalogue links to http://backstage.bbc.co.uk/archives/2005/05/api_licence.html: "The BBC grants to You a ... non-sublicensable right to copy..." Further: d. not publish, distribute or otherwise make the APIs available, (including in any Work You create), in a way that would enable other people to download or use the APIs other than as set out in this Licence. standard backstage API licence - it was the only one lying around at the time... (nov 2005) I don't see any legal way that we can export the data to Freebase and relicense it as CC-BY. yeah... the attribution back to BBC kinda matters... though given the programmes are clearly BBC programmes, I'm not sure it's the end of the world... Would you be able to get the appropriate BBC people to get this done? 
I'll do a bit of lobbying... - Sent via the backstage.bbc.co.uk discussion group. To unsubscribe, please visit http://backstage.bbc.co.uk/archives/2005/01/mailing_list.html. Unofficial list archive: http://www.mail-archive.com/backstage@lists.bbc.co.uk/
[backstage] RE: Uploading the BBC programme catalogue to freebase (was RE: [backstage] Programme Catalogue vs. Freebase (was: BBC Programme Catalogue -any APIs yet?))
On Mon, 2007-07-09 at 22:05 +0100, Chris Sizemore wrote:
> holy synonymous concepts, batman...
> point is, it would be easy to "merge" these on freebase, nearly
> impossible directly in the BBC Programme Catalogue context...

Indeed, Freebase is superior in this regard.

> arguably, the BBC has done its part by making the Catalogue data
> available via RDF and Atom? if freebase is a useful (interim)
> destination for this data, isn't the assumption that the community
> will make it happen? (hint, hint?)

I believe the community would make it happen if the license allowed it
to happen. The problem is that the BBC took the obvious step of applying
the Backstage API license to the data on the Catalogue, which is more of
a database than an API...

See my other post for license discussion.

Regards,
Oli
[backstage] Re: Uploading the BBC programme catalogue to freebase (was RE: [backstage] Programme Catalogue vs. Freebase (was: BBC Programme Catalogue -any APIs yet?))
On Mon, 2007-07-09 at 21:30 +0100, Brendan Quinn wrote:
> I was considering entering a hack for Hack Day around that very thing.
> But then they went and made me one of the judges ;-)
>
> Wanna help? A simple set of scripts that scrape the archive (er I mean
> "call that big RESTful API") and post entries/updates to the freebase
> sandbox server would be an interesting experiment.

I've not yet (bulk) posted data on Freebase - I'll take a look at this
when I'm more au fait with it.

> compare
> http://www.freebase.com/view/?id=%239202a8c04000641f80012406
> with
> http://open.bbc.co.uk/catalogue/infax/series/DOCTOR+WHO
> !

Freebase is still in alpha as far as I know - those who can't see the
first link can see a screenshot at:
http://cornflakes.imen.org.uk/~oli/DrWho.png

Those who are particularly interested can feel free to ask me for one of
my remaining 4 invites - and I imagine Brendan has some too.

> There may be some rights issues around what would basically amount to
> opening up the programme catalogue under the creative commons
> attribution license, where the attribution wouldn't go to the BBC but
> to Freebase...

Well, the RDF for the catalogue links to
http://backstage.bbc.co.uk/archives/2005/05/api_licence.html:

"The BBC grants to You a ... non-sublicensable right to copy..."

Further:

d. not publish, distribute or otherwise make the APIs available,
(including in any Work You create), in a way that would enable other
people to download or use the APIs other than as set out in this
Licence.

I don't see any legal way that we can export the data to Freebase and
relicense it as CC-BY. I don't think it can be done without a
relicensing of the catalogue - I guess it's lucky you didn't go ahead
and write that script at Hack day :)

Would you be able to get the appropriate BBC people to get this done?

Regards,
Oli
RE: Uploading the BBC programme catalogue to freebase (was RE: [backstage] Programme Catalogue vs. Freebase (was: BBC Programme Catalogue -any APIs yet?))
http://catalogue.bbc.co.uk/catalogue/infax/series/DR+WHO

holy synonymous concepts, batman...

(http://open.bbc.co.uk/catalogue/infax/series/DOCTOR+WHO)

point is, it would be easy to "merge" these on freebase, nearly
impossible directly in the BBC Programme Catalogue context...

suppose this all has to do with the different purposes of the 2
products...

arguably, the BBC has done its part by making the Catalogue data
available via RDF and Atom? if freebase is a useful (interim)
destination for this data, isn't the assumption that the community will
make it happen? (hint, hint?)

best--

--cs
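To make the "synonymous concepts" point concrete, here is a minimal sketch of how the two series URLs above could be grouped under one key. It is an illustration only: the abbreviation table and the function names are hypothetical, and nothing here reflects how either the Catalogue or Freebase actually merges records.

```python
import re
from collections import defaultdict

# Hypothetical alias table: abbreviations expanded before titles are compared.
ABBREVIATIONS = {"DR": "DOCTOR"}


def series_key(url: str) -> str:
    """Derive a normalised series title from a catalogue series URL."""
    # The last path segment carries the title, e.g. "DR+WHO".
    slug = url.rstrip("/").rsplit("/", 1)[-1]
    words = re.split(r"[+\s]+", slug.upper())
    return " ".join(ABBREVIATIONS.get(w, w) for w in words if w)


def group_series(urls):
    """Group URLs whose normalised titles match."""
    groups = defaultdict(list)
    for url in urls:
        groups[series_key(url)].append(url)
    return dict(groups)


if __name__ == "__main__":
    merged = group_series([
        "http://catalogue.bbc.co.uk/catalogue/infax/series/DR+WHO",
        "http://open.bbc.co.uk/catalogue/infax/series/DOCTOR+WHO",
    ])
    for title, members in merged.items():
        print(title, len(members))
```

A hand-maintained alias table like this only catches the abbreviations someone has thought to list, which is part of why a community-curated store is attractive for this kind of merge.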
Uploading the BBC programme catalogue to freebase (was RE: [backstage] Programme Catalogue vs. Freebase (was: BBC Programme Catalogue -any APIs yet?))
I was considering entering a hack for Hack Day around that very thing.
But then they went and made me one of the judges ;-)

Wanna help? A simple set of scripts that scrape the archive (er I mean
"call that big RESTful API") and post entries/updates to the freebase
sandbox server would be an interesting experiment.

I agree that freebase is an amazing resource, especially when the
programme data is curated properly:

compare
http://www.freebase.com/view/?id=%239202a8c04000641f80012406
with
http://open.bbc.co.uk/catalogue/infax/series/DOCTOR+WHO
!

There may be some rights issues around what would basically amount to
opening up the programme catalogue under the creative commons
attribution license, where the attribution wouldn't go to the BBC but to
Freebase...

Brendan.
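The query that started this thread (how many contributors to Spooks series 2 were born in London) is essentially a join between catalogue credits and biographical data held elsewhere. The sketch below shows that join over in-memory records. Every record, name and place in it is an invented placeholder, the field names are assumptions, and it does not use any Freebase or Catalogue API.

```python
# Placeholder credits, standing in for data taken from the catalogue.
# All names are invented for illustration.
catalogue_credits = [
    {"series": "Spooks", "series_number": 2, "contributor": "Contributor A"},
    {"series": "Spooks", "series_number": 2, "contributor": "Contributor B"},
    {"series": "Spooks", "series_number": 1, "contributor": "Contributor C"},
    {"series": "Spooks", "series_number": 2, "contributor": "Contributor A"},
]

# Placeholder birthplaces, standing in for data held in a second source.
birthplaces = {
    "Contributor A": "London",
    "Contributor B": "Glasgow",
    "Contributor C": "London",
}


def count_born_in(credits, places, series, series_number, city):
    """Count distinct contributors to one series run who were born in `city`."""
    people = {
        c["contributor"]
        for c in credits
        if c["series"] == series and c["series_number"] == series_number
    }
    return sum(1 for person in people if places.get(person) == city)


if __name__ == "__main__":
    print(count_born_in(catalogue_credits, birthplaces, "Spooks", 2, "London"))
```

The set comprehension de-duplicates contributors credited on more than one episode, so each person is counted once.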
[backstage] Programme Catalogue vs. Freebase (was: BBC Programme Catalogue - any APIs yet?)
I've been following the Programme Catalogue since it was announced, and
it's pretty interesting.

I do however have a question for the BBC people on the list - have you
considered simply uploading all the information to Freebase[1]? I can
understand that you might want to keep it in house, but if you merged it
with the wealth of information on Freebase you can do exponentially
more.

For example, if it was properly integrated you could run a query that
would tell me how many of the contributors to Spooks series 2 were born
in London.

Regards,
Oli

[1] http://www.freebase.com - A very cool structured database, currently
handling 2.3 million instances of 870 'types'

On Mon, 2007-07-09 at 17:42 +0100, Jonathan Powell wrote:
> On 7/9/07, Tom Loosemore <[EMAIL PROTECTED]> wrote:
> > > On to my questions:
> > > Has anyone yet been able to create an API around the BBC Programme
> > > Catalogue? It seems this would be the best data source to use so
> > > far.
> >
> > the BBC Programme Catalogue is already one big restful API... which
> > may be enough for your needs, depending...
> >
> > replace 'infax' with 'xml' in any url and see what you get back
> >
> > eg
> > http://open.bbc.co.uk/catalogue/xml/programme/ICYD984E
> > http://open.bbc.co.uk/catalogue/xml/on_this_day/2003/8/13
> > http://open.bbc.co.uk/catalogue/xml/contributor/2221
>
> Ah! Excellent. Very useful, especially if you know the exact name or
> ID number of the program/actor/etc. Might prove a little tricky if
> you've got incomplete data on a show... but I might be tempted to
> spend some time tonight putting this in a .Net assembly :)
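The tip quoted above (swap 'infax' for 'xml' in a catalogue URL) can be sketched in a few lines. This is only an illustration of the URL rewrite described in the thread: the function name is made up, and it says nothing about what the server returns or whether fetching in bulk is permitted.

```python
from urllib.parse import urlsplit, urlunsplit


def to_xml_url(url: str) -> str:
    """Swap the 'infax' path segment for 'xml', leaving the rest untouched."""
    parts = urlsplit(url)
    # Replace only a whole path segment named 'infax', not substrings
    # that happen to appear in a title or query string.
    segments = ["xml" if s == "infax" else s for s in parts.path.split("/")]
    return urlunsplit(parts._replace(path="/".join(segments)))


if __name__ == "__main__":
    print(to_xml_url("http://open.bbc.co.uk/catalogue/infax/series/DOCTOR+WHO"))
```

Working on path segments rather than a plain string replace avoids mangling a URL where 'infax' appears somewhere other than the view selector.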