RE: [backstage] Competition - Ideas but no time

2005-09-05 Thread David Tattersall
Luke,

Wow, I was thinking of something along the lines of Bayesian filtering too!

I was thinking that - if Bayesian filters can be trained to pick out spam,
how about television programmes?

Of course, spam contains a lot more information (headers, formatting, etc.),
whereas you'd have just a title and a few sentences of description for each
programme.

What I had in mind was marking certain programmes as favourites; the app
would then tell you if any favourites are coming up in the next week, and
suggest other programmes you might like.
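
A minimal sketch of the "favourites coming up in the next week" check,
assuming the listings have already been parsed into simple records. The
Programme fields and the upcoming_favourites helper are hypothetical names
used for illustration; they are not part of the BBC feed.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Programme:
    # Hypothetical record; a real listing entry would carry more fields.
    title: str
    start: datetime


def upcoming_favourites(listings, favourite_titles, now, days=7):
    """Return favourite programmes starting within `days` of `now`, soonest first."""
    # Compare titles case-insensitively so "horizon" matches "Horizon".
    wanted = {title.casefold() for title in favourite_titles}
    horizon = now + timedelta(days=days)
    hits = [p for p in listings
            if p.title.casefold() in wanted and now <= p.start < horizon]
    return sorted(hits, key=lambda p: p.start)
```

Matching on title alone is the crudest possible choice; matching on a stable
programme identifier would probably be more robust if one is available.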

David

-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Luke Dicken
Sent: 05 September 2005 14:15
To: backstage@lists.bbc.co.uk
Subject: RE: [backstage] Competition - Ideas but no time

My background is fairly heavily AI-slanted so that's the sort of area I've
been coming up with ideas. I think the best one I've had so far is to take
the data and couple it with a Bayesian classification system. By using a
reasonable set of training data (a previous week's listings for
example) and a decent heuristic you should be able to create an AI system
that can predictively suggest forthcoming programmes. You would also need
some ancillary odds and sods like user tracking to cater on a per-person
basis. With the basic system implemented the heuristic function could be
tweaked to make it more accurate if necessary, the same base system could
also be used with multiple functions - perhaps one could have a larger
emphasis on timing than content etc. The coding for this most likely
wouldn't be all that intense since it's primarily mathematical, although
storing the results of profiling would give it a reasonable amount of
overhead (or would require a bit of ninja-ing) since you would be having to
do a certain amount of natural language analysis on the data.

I don't know about you guys, but something that highlighted programs I might
be interested in would certainly be of benefit to me, saving me trawling
through the listings.



-
Sent via the backstage.bbc.co.uk discussion group.  To unsubscribe, please 
visit http://backstage.bbc.co.uk/archives/2005/01/mailing_list.html.  
Unofficial list archive: http://www.mail-archive.com/backstage@lists.bbc.co.uk/


RE: [backstage] Competition - Ideas but no time

2005-09-05 Thread Jared Williams
 
> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Luke Dicken
> Sent: 05 September 2005 17:15
> To: backstage@lists.bbc.co.uk
> Subject: RE: [backstage] Competition - Ideas but no time
> 
> > Had an idea for a crude method for this, basically comparing genres
> > of programs, and counting the matches.
> > (Should be fairly trivial in SQL). Could be used for "Find other 
> > programs like this program", or Finding Thrillers or whatever.
> 
> Transporting the XML into SQL is going to introduce a fair 
> bit of bloat though surely? Wouldn't it be better running 
> your queries through DOM or SAX?
> 

The current importer is 220-odd lines of PHP5 using the libxml-based DOM
extension.

It imports the Service, ProgramInformation, GroupInformation, and
ProgramLocation TVA sections. It doesn't import everything, just what I
think is (currently) useful.

I think DOM and SAX would be too slow for querying, since they lack permanent
indexes and would have to do cross-document queries.
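
To illustrate the import step, here is a minimal sketch in Python rather
than PHP5. It is not the importer described above. The sample XML is a
hypothetical, namespace-free simplification loosely modelled on TV-Anytime,
and the table and column names are assumptions made for the example.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical, namespace-free sample; real TVA documents are far richer.
SAMPLE = """
<TVAMain>
  <ProgramInformation programId="crid://example/1">
    <Title>Programme A</Title>
    <Genre href="urn:example:genre:drama"/>
    <Genre href="urn:example:genre:thriller"/>
  </ProgramInformation>
  <ProgramInformation programId="crid://example/2">
    <Title>Programme B</Title>
    <Genre href="urn:example:genre:science"/>
  </ProgramInformation>
</TVAMain>
"""


def import_programmes(xml_text, conn):
    """Copy programme titles and genres from the XML into two tables."""
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS programme (crid TEXT PRIMARY KEY, title TEXT);
        CREATE TABLE IF NOT EXISTS programmeGenre (crid TEXT, genre TEXT);
        CREATE INDEX IF NOT EXISTS idxGenre ON programmeGenre (genre);
    """)
    root = ET.fromstring(xml_text)
    for info in root.iter("ProgramInformation"):
        crid = info.get("programId")
        conn.execute("INSERT OR REPLACE INTO programme VALUES (?, ?)",
                     (crid, info.findtext("Title", default="")))
        # One row per genre, so genres can be indexed and joined on later.
        conn.executemany("INSERT INTO programmeGenre VALUES (?, ?)",
                         [(crid, g.get("href")) for g in info.iter("Genre")])
    conn.commit()
```

Only the fields needed for querying are copied, which mirrors the "just
what is useful" approach and keeps the database small.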

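Currently I have queries like

SELECT MIN(startTime), MAX(startTime) FROM
  (SELECT MAX(startTime) AS startTime
     FROM scheduleEvent
    WHERE serviceId IN ($set)
      AND startTime < :startTime
    GROUP BY serviceId) AS latest

For each service in the set, this takes the start time of the most recent
programme that began before :startTime, then returns the earliest and latest
of those. That lets me adjust the time window so that it covers all currently
airing programs for a set of services, which would be a little painful with
DOM or SAX.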
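
A runnable sketch of the same query against an in-memory SQLite database,
written in Python. The scheduleEvent columns beyond serviceId and startTime,
the sample rows, and the airing_window helper are assumptions made for
illustration. Times are stored as ISO-style strings so that string comparison
matches chronological order.

```python
import sqlite3


def airing_window(conn, service_ids, at_time):
    """Return (earliest, latest) start time of the programmes airing at `at_time`."""
    # One "?" per service id, so the ids are bound rather than interpolated.
    placeholders = ", ".join("?" for _ in service_ids)
    sql = f"""
        SELECT MIN(startTime), MAX(startTime) FROM
          (SELECT MAX(startTime) AS startTime
             FROM scheduleEvent
            WHERE serviceId IN ({placeholders})
              AND startTime < ?
            GROUP BY serviceId) AS latest
    """
    return conn.execute(sql, (*service_ids, at_time)).fetchone()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scheduleEvent (serviceId TEXT, startTime TEXT, title TEXT)")
conn.executemany("INSERT INTO scheduleEvent VALUES (?, ?, ?)", [
    ("service1", "2005-09-05 18:00", "Programme A"),
    ("service1", "2005-09-05 19:00", "Programme B"),
    ("service1", "2005-09-05 20:00", "Programme C"),
    ("service2", "2005-09-05 17:30", "Programme D"),
    ("service2", "2005-09-05 20:00", "Programme E"),
])

print(airing_window(conn, ["service1", "service2"], "2005-09-05 19:30"))
```

With these rows, the programme airing at 19:30 on service1 started at 19:00
and the one on service2 started at 17:30, so the window runs from 17:30 to
19:00.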

Jared 



RE: [backstage] Competition - Ideas but no time

2005-09-05 Thread Luke Dicken
> At 13:53 +0100 5/9/05, Nick Crossland wrote:
> >In the interests of making the competition more inclusive 
> for those that
> >don't have either the knowledge or time to make a working 
> prototype, perhaps
> >for future competitions you might just ask for a single sided written
> >proposal for a concept.  Perhaps the prize could include 
> having a prototype
> >built?
> 
> Yes, but some ideas are "NP hard", and hence, not feasibly computable.
> 
> Gordo

Yeah, I have to agree: you can't open it up to the point where you can enter
an airy-fairy idea and stand a chance of winning. On the other hand,
perhaps there might be scope for a category with a lesser prize for
written proposals with technical background / references. This would
stop the general "Wouldn't it be nice if..." entries but allow people to
research ideas or use their own knowledge without having to commit the
resources required to produce a working prototype. I don't really know how
you would formalise the distinction though...



Re: [backstage] Competition - Ideas but no time

2005-09-05 Thread Robin Berjon

Luke Dicken wrote:
>> Had an idea for a crude method for this, basically comparing
>> genres of programs, and counting the matches.
>> (Should be fairly trivial in SQL). Could be used for "Find
>> other programs like this program", or Finding Thrillers or whatever.
>
> Transporting the XML into SQL is going to introduce a fair bit of bloat
> though surely? Wouldn't it be better running your queries through DOM or
> SAX?


Hmmm, DOM or SAX for this would probably quickly turn into something 
rather unpleasant, and a little dull besides. But yes, using XPath, 
for instance, should work (especially since the BBC only uses the 2002 
TVA, so you avoid the madness of trying to query TVA data intermixed 
with 2004 data, where any attempt at processing gets really, really 
ugly). Other options could include XQuery or (after a transformation 
into RDF) SPARQL.
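
As a rough illustration of the XPath route, here is a minimal Python sketch
using the limited XPath subset in the standard library's ElementTree. The
sample document is a hypothetical, namespace-free simplification; real TVA
data is namespaced and the element layout here is an assumption.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data, not taken from the BBC feed.
SAMPLE = """
<TVAMain>
  <ProgramInformation programId="crid://example/a">
    <Title>Programme A</Title>
    <Genre href="urn:example:genre:thriller"/>
  </ProgramInformation>
  <ProgramInformation programId="crid://example/b">
    <Title>Programme B</Title>
    <Genre href="urn:example:genre:science"/>
  </ProgramInformation>
  <ProgramInformation programId="crid://example/c">
    <Title>Programme C</Title>
    <Genre href="urn:example:genre:drama"/>
    <Genre href="urn:example:genre:thriller"/>
  </ProgramInformation>
</TVAMain>
"""


def titles_with_genre(xml_text, genre_href):
    """Return titles of programmes tagged with `genre_href`, in document order."""
    root = ET.fromstring(xml_text)
    titles = []
    # Path expressions select the elements; the genre test is done in Python.
    for info in root.findall(".//ProgramInformation"):
        hrefs = {genre.get("href") for genre in info.findall("./Genre")}
        if genre_href in hrefs:
            titles.append(info.findtext("./Title"))
    return titles
```

Every call walks the whole document, which is acceptable for one file but
grows costly across a week of listings.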


The big advantage of SQL here though is that it can be indexed.

--
Robin Berjon
  Senior Research Scientist
  Expway, http://expway.com/




RE: [backstage] Competition - Ideas but no time

2005-09-05 Thread Luke Dicken
> Had an idea for a crude method for this, basically comparing 
> genres of programs, and counting the matches. 
> (Should be fairly trivial in SQL). Could be used for "Find 
> other programs like this program", or Finding Thrillers or whatever.

Transporting the XML into SQL is going to introduce a fair bit of bloat
though surely? Wouldn't it be better running your queries through DOM or
SAX?



RE: [backstage] Competition - Ideas but no time

2005-09-05 Thread Jared Williams


 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Luke Dicken
> Sent: 05 September 2005 14:15
> To: backstage@lists.bbc.co.uk
> Subject: RE: [backstage] Competition - Ideas but no time
> 
> My background is fairly heavily AI-slanted so that's the sort 
> of area I've been coming up with ideas. I think the best one 
> I've had so far is to take the data and couple it with a 
> Bayesian classification system. By using a reasonable set of 
> training data (a previous week's listings for
> example) and a decent heuristic you should be able to create 
> an AI system that can predictively suggest forthcoming 
> programmes. You would also need some ancillary odds and sods 
> like user tracking to cater on a per-person basis. With the 
> basic system implemented the heuristic function could be 
> tweaked to make it more accurate if necessary, the same base 
> system could also be used with multiple functions - perhaps 
> one could have a larger emphasis on timing than content etc. 
> The coding for this most likely wouldn't be all that intense 
> since it's primarily mathematical, although storing the 
> results of profiling would give it a reasonable amount of 
> overhead (or would require a bit of ninja-ing) since you 
> would be having to do a certain amount of natural language 
> analysis on the data.

I had an idea for a crude method for this: basically comparing the genres of 
programs and counting the matches. 
(Should be fairly trivial in SQL). It could be used for "Find other programs 
like this program", or finding thrillers, or whatever.
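
A minimal sketch of the genre-matching idea, using SQLite from Python. The
programmeGenre table, its columns, and the sample rows are assumptions made
for the example, and it assumes each (crid, genre) pair is stored only once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE programmeGenre (crid TEXT, genre TEXT);
    INSERT INTO programmeGenre VALUES
      ('crid://example/a', 'drama'),
      ('crid://example/a', 'thriller'),
      ('crid://example/a', 'crime'),
      ('crid://example/b', 'drama'),
      ('crid://example/b', 'thriller'),
      ('crid://example/c', 'thriller'),
      ('crid://example/d', 'science');
""")

# Self-join on genre: every row pairs the chosen programme with another
# programme that shares one genre; COUNT(*) is the number of shared genres.
SIMILAR_SQL = """
    SELECT other.crid, COUNT(*) AS matches
      FROM programmeGenre AS mine
      JOIN programmeGenre AS other
        ON other.genre = mine.genre AND other.crid <> mine.crid
     WHERE mine.crid = ?
     GROUP BY other.crid
     ORDER BY matches DESC, other.crid
"""


def similar_programmes(conn, crid):
    """Return (crid, shared genre count) pairs, best match first."""
    return conn.execute(SIMILAR_SQL, (crid,)).fetchall()


print(similar_programmes(conn, "crid://example/a"))
```

Programmes with no genre in common simply do not appear in the result, so
the list doubles as a "find other programs like this program" filter.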

> I don't know about you guys, but something that highlighted 
> programs I might be interested in would certainly be of 
> benefit to me, saving me trawling through the listings.

Yes, that would be nice. I'm slowly implementing a web application to do 
something along these lines.

I also plan to provide the "highlighted programs" in other formats. The most 
interesting would be iCal (though RSS would also be useful), so that calendar 
clients (Sunbird, Rainlendar, etc.) can subscribe to the iCal feed and 
provide alerts and such. 
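
A minimal sketch of what generating such an iCalendar feed could look like
in Python. The ical_feed helper, the PRODID value, and the tuple layout are
hypothetical. It expects timezone-aware datetimes and does not fold long
lines, which a complete implementation would need to do.

```python
from datetime import datetime, timedelta, timezone

STAMP = "%Y%m%dT%H%M%SZ"


def escape_text(value):
    """Escape backslash, semicolon, comma and newline for an iCalendar TEXT value."""
    return (value.replace("\\", "\\\\").replace(";", "\\;")
                 .replace(",", "\\,").replace("\n", "\\n"))


def ical_feed(programmes, now):
    """Build a calendar from (uid, title, start, minutes) tuples."""
    lines = ["BEGIN:VCALENDAR", "VERSION:2.0",
             "PRODID:-//example//highlighted programmes//EN"]
    for uid, title, start, minutes in programmes:
        end = start + timedelta(minutes=minutes)
        lines += [
            "BEGIN:VEVENT",
            f"UID:{uid}",
            # All times are written in UTC, hence the trailing "Z".
            f"DTSTAMP:{now.astimezone(timezone.utc).strftime(STAMP)}",
            f"DTSTART:{start.astimezone(timezone.utc).strftime(STAMP)}",
            f"DTEND:{end.astimezone(timezone.utc).strftime(STAMP)}",
            f"SUMMARY:{escape_text(title)}",
            "END:VEVENT",
        ]
    lines.append("END:VCALENDAR")
    # iCalendar content lines are terminated by CRLF.
    return "\r\n".join(lines) + "\r\n"
```

Using a stable identifier for each programme as the UID would let calendar
clients update an event in place when the schedule changes.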

Then there is sharing of programs between multiple users, so they can 
subscribe to each other's feeds.

http://homepage.ntlworld.com/jared.williams/php5/image001.png shows how far I 
am (not very, and Mozilla currently has problems with it, which is a pain).

The red line is a clock hand that ticks across with the current time.



RE: [backstage] Competition - Ideas but no time

2005-09-05 Thread Luke Dicken
> >I don't know about you guys, but something that highlighted 
> programs I
> >might be interested in would certainly be of benefit to me, saving me
> >trawling through the listings.
> >
> 
> Recommender systems are common now (Amazon for example) and have a 
> decent history (e.g. RINGO from MIT).
> 
> Is the training data rich enough?
> 
> I watch two to seven programmes a week, apart from random channel 
> switching, for example, and I tend to find them on BBC3 and BBC4 
> (first, before they head off to BBC2).
> 
> Gordo

The general (unproven, not particularly closely thought through) idea I
had was to build a feature vector based on polygram (n-gram) analysis of
item entries within the XML, with different weighting based on location
within the elements - for example, words in a Genre element would have
more weight than words within a Synopsis. Really this is glorified
keyword searching, but with the keywords being dynamically generated
based on a user's perceived preferences, and with a lot of fuzziness
built in. The only possible problem is that the terseness of the
programme synopsis could lead to a situation where the language implies
something that isn't explicitly stated. Can't think of an example of this
off the top of my head, but it's why I'd consider bi- or poly-gram
analysis as opposed to unigram.
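
A minimal sketch of the weighted feature vector, assuming each programme has
already been reduced to a dictionary of text fields. The field names and the
weights are hypothetical and would need tuning against real data.

```python
import re
from collections import Counter

# Hypothetical weights: words in Genre count more than words in Synopsis.
FIELD_WEIGHTS = {"genre": 3.0, "title": 2.0, "synopsis": 1.0}


def ngrams(text, n):
    """Return the word n-grams of `text`, lower-cased."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def feature_vector(programme, max_n=2):
    """Sum field weights over every unigram up to `max_n`-gram in each field."""
    vector = Counter()
    for field, weight in FIELD_WEIGHTS.items():
        text = programme.get(field, "")
        for n in range(1, max_n + 1):
            for gram in ngrams(text, n):
                vector[gram] += weight
    return vector
```

For example, with genre "crime drama" and synopsis "A crime is solved", the
unigram "crime" scores 4.0 (3.0 from the genre plus 1.0 from the synopsis),
while the bigram "crime drama" scores 3.0.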

As for the same programmes coming round again, that would be down to
individual implementation style, I guess. Your options are to ignore the
repeat, to not suggest it again, or to try to track whether it ought to be
suggestable by prompting the user to record what he/she watched
previously (assuming that the CRID for a programme doesn't change when it
switches channel).

I think really it's something that you can debate the ins and outs of till
the cows come home, but until someone is able to sit down and try it,
you can't judge how well it will/won't work.



RE: [backstage] Competition - Ideas but no time

2005-09-05 Thread Gordon Joly


> I don't know about you guys, but something that highlighted programs I
> might be interested in would certainly be of benefit to me, saving me
> trawling through the listings.



Recommender systems are common now (Amazon for example) and have a 
decent history (e.g. RINGO from MIT).


Is the training data rich enough?

I watch two to seven programmes a week, apart from random channel 
switching, for example, and I tend to find them on BBC3 and BBC4 
(first, before they head off to BBC2).


Gordo

--
"Think Feynman"/
http://pobox.com/~gordo/
[EMAIL PROTECTED]///


RE: [backstage] Competition - Ideas but no time

2005-09-05 Thread Gordon Joly

At 13:53 +0100 5/9/05, Nick Crossland wrote:

> In the interests of making the competition more inclusive for those that
> don't have either the knowledge or time to make a working prototype, perhaps
> for future competitions you might just ask for a single sided written
> proposal for a concept.  Perhaps the prize could include having a prototype
> built?


Yes, but some ideas are "NP hard", and hence, not feasibly computable.

Gordo

--
"Think Feynman"/
http://pobox.com/~gordo/
[EMAIL PROTECTED]///


RE: [backstage] Competition - Ideas but no time

2005-09-05 Thread Luke Dicken
My background is fairly heavily AI-slanted, so that's the sort of area
I've been coming up with ideas in. I think the best one I've had so far is
to take the data and couple it with a Bayesian classification system. By
using a reasonable set of training data (a previous week's listings, for
example) and a decent heuristic, you should be able to create an AI
system that can predictively suggest forthcoming programmes. You would
also need some ancillary odds and sods like user tracking to cater on a
per-person basis. With the basic system implemented, the heuristic
function could be tweaked to make it more accurate if necessary. The
same base system could also be used with multiple functions - perhaps
one could put more emphasis on timing than on content, etc. The coding
for this most likely wouldn't be all that intense since it's primarily
mathematical, although storing the results of profiling would give it a
reasonable amount of overhead (or would require a bit of ninja-ing),
since you would have to do a certain amount of natural language
analysis on the data.
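
A minimal sketch of the Bayesian classification idea, assuming the only
evidence is the words in each programme's title and description. This is a
plain multinomial naive Bayes with add-one smoothing, which is one possible
reading of the proposal rather than a worked-out design. The class name and
the "liked"/"disliked" labels are hypothetical.

```python
import math
import re
from collections import Counter


class ProgrammeClassifier:
    """Multinomial naive Bayes over words, with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = {"liked": Counter(), "disliked": Counter()}
        self.doc_counts = Counter()

    @staticmethod
    def tokens(text):
        return re.findall(r"[a-z']+", text.lower())

    def train(self, text, label):
        """Record one listing the user marked as "liked" or "disliked"."""
        self.word_counts[label].update(self.tokens(text))
        self.doc_counts[label] += 1

    def score(self, text):
        """Return the log-odds that `text` is liked; positive means suggest it."""
        vocab = set(self.word_counts["liked"]) | set(self.word_counts["disliked"])
        total_docs = sum(self.doc_counts.values())
        log_odds = 0.0
        for label, sign in (("liked", 1), ("disliked", -1)):
            counts = self.word_counts[label]
            # The extra +1 reserves one slot for words never seen in training.
            denom = sum(counts.values()) + len(vocab) + 1
            # Smoothed class prior, then one smoothed likelihood per token.
            log_p = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
            for word in self.tokens(text):
                log_p += math.log((counts[word] + 1) / denom)
            log_odds += sign * log_p
        return log_odds
```

Training on a previous week's listings would mean calling train once per
programme the user marked, then ranking the coming week by score.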

I don't know about you guys, but something that highlighted programs I
might be interested in would certainly be of benefit to me, saving me
trawling through the listings.


--

"Those are my principles. If you don't like them, I have others."

Groucho Marx




RE: [backstage] Competition - Ideas but no time

2005-09-05 Thread Nick Crossland
In the interests of making the competition more inclusive for those that
don't have either the knowledge or time to make a working prototype, perhaps
for future competitions you might just ask for a single sided written
proposal for a concept.  Perhaps the prize could include having a prototype
built?

That way, once freed from the restriction of having to build the thing in
their spare time, hopefully more people can enter and can come up with ideas
a bit wilder and more imaginative than if they actually had to program the
bloomin' things!

More inspiration, less perspiration.







Re: [backstage] Competition - Ideas but no time

2005-09-05 Thread Chris Gilbert
I'm pretty sure I speak for the majority of people here when I say that we  
are always interested to hear ideas.


Why not throw them into the discussion and see what happens?

--
Chris Gilbert

07966 077 486
[EMAIL PROTECTED]



On 5 Sep 2005, at 12:14, Luke Dicken wrote:



> I have some ideas for projects but no time to fulfill them, is there
> anyone out there who has time but no ideas?
>
> Luke

