Re: [backstage] Today's TV-Anytime tar is broken

2006-03-07 Thread Hywel Williams

At 11:26 07/03/2006 +, you wrote:

Hi,

Is there any news of this as there doesn't appear to be a file for today :-(

Apart from this problem, the data is really great and I'm currently 
hacking Perl TV::Anytime to make use of the additional data that is hidden 
in these files.  I'll make a patch available once it is fully working.


There appears to be a problem with the FTP link used to get the data from 
the database machine to the web site itself.  I've flagged the problem to 
our internet services people - hopefully the problem will be resolved 
soon.  It seems to have corrupted yesterday's upload and completely failed 
on today's.


Hywel

I tried a manual push - that also failed. 


-
Sent via the backstage.bbc.co.uk discussion group.  To unsubscribe, please 
visit http://backstage.bbc.co.uk/archives/2005/01/mailing_list.html.  
Unofficial list archive: http://www.mail-archive.com/backstage@lists.bbc.co.uk/


Re: [backstage] BBC TV Listings Feed

2006-01-24 Thread Hywel Williams

At 13:50 24/01/2006 +, you wrote:

Hi,

The TV Listing feed isn't available yet for today (24-01-2006).

Just wondered if you're having any problems today generating the feed.


Sorry - I inadvertently damaged the crontab that calls the daily data 
generator yesterday which meant it didn't run this morning.  It's now fixed 
and the script was manually run to ensure that there's a set of valid data 
for today.


Hywel 




Re: [backstage] typo/error in TV-Anytime data

2005-12-19 Thread Hywel Williams

At 13:36 17/12/2005 +, you wrote:

Hi,

I think I found a typo/error in the latest TV-Anytime data:

In 20051223BBCOne_pi.xml, there's this line...

<Synopsis length='short'><![CDATA[The Long Goodbye: Owen returns to work, 
eager to start a new life with Chrissie. Elliot reveals his dark side when 
he loses a patient. Mickie realises that she needs to move on with her 
life. [ADS,SL]]]></Synopsis>


I believe ADS,SL should be AD,S,SL (or AD,SL)? The TV::Anytime 
parser chokes on the ADS...


It looks like a typo.  The synopses are typed in by hand and, inevitably, 
errors occasionally occur.  Very often these errors are spotted and 
corrected in later database updates by the schedulers.
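
Until such a typo is corrected upstream, a consumer of the data can be made
more forgiving. The following is a minimal sketch in Python, not part of
TV::Anytime; it assumes the trailing bracketed list only ever uses the codes
AD, S and SL (the ones seen in this thread) and splits run-together codes
such as ADS greedily. The function names are hypothetical.

```python
import re

# Flag codes seen in this thread; the full set used by the schedulers is an
# assumption.  "SL" is listed before "S" so that it is matched as one code.
KNOWN_FLAGS = ("AD", "SL", "S")


def split_flags(token):
    """Split one comma-separated token into known flags, e.g. 'ADS' -> AD, S."""
    flags = []
    rest = token.strip().upper()
    while rest:
        for flag in KNOWN_FLAGS:
            if rest.startswith(flag):
                flags.append(flag)
                rest = rest[len(flag):]
                break
        else:
            raise ValueError("unrecognised flag text: %r" % rest)
    return flags


def parse_synopsis(text):
    """Return (synopsis_without_flags, list_of_flags) for one synopsis string."""
    match = re.search(r"\[([A-Za-z, ]+)\]\s*$", text)
    if not match:
        return text.strip(), []
    flags = []
    for token in match.group(1).split(","):
        for flag in split_flags(token):
            if flag not in flags:
                flags.append(flag)
    return text[:match.start()].strip(), flags
```

Anything inside the brackets that cannot be split into known codes raises a
ValueError rather than being guessed at.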


Hywel 




RE: [backstage] Timezone bug?

2005-10-31 Thread Hywel Williams


At 11:45 31/10/2005 +, you wrote:
From an internal perspective

Yes! Separate out series, episode numbers and episode titles into clear
fields. It's really useful if you were to use TV-Anytime data to populate
programme support sites like, say, bbc.co.uk/buffy or /theoffice - and
becomes really important for following ongoing series. It's something that
the Radio Times do in a very patchy way.

TV-Anytime can represent episode numbers using the EpisodeOf tag, e.g.

<EpisodeOf index='2'/>

would denote the second episode in a series. The problem, however, is that
although this information is available within the BBC, it's not provided in
the data stream I use to generate the TV-Anytime files, with the exception
of BBC7, who put this information at the end of the synopsis. This data
source was intended for the generation of EPG guides - TV-Anytime, on the
other hand, has a much bigger remit. We enhance the basic guide data with as
much information as possible to provide the TV-Anytime data - but if we're
not provided with the information in the first place, there's only so much
that I can do!

You could, for instance, make a query engine that lets you build a personal
schedule around watching a show in story order across all the various
channels and repeats

That's exactly what the tag is meant for - to ensure the order of a series
is maintained by an automated recording/displaying service.
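
As a rough illustration of how a client might use the EpisodeOf index to put
programmes into story order, here is a minimal Python sketch. It assumes a
simplified, namespace-free fragment in which EpisodeOf sits directly inside
each ProgramInformation element; the sample CRIDs, titles and the function
name are hypothetical, and real TV-Anytime files are namespaced.

```python
import xml.etree.ElementTree as ET

# Simplified, namespace-free fragment for illustration only.  The CRIDs and
# titles are made up; the current feed does not carry EpisodeOf at all.
SAMPLE = """
<ProgramInformationTable>
  <ProgramInformation programId="crid://example.invalid/b">
    <BasicDescription><Title>Example Series</Title></BasicDescription>
    <EpisodeOf index="2"/>
  </ProgramInformation>
  <ProgramInformation programId="crid://example.invalid/film">
    <BasicDescription><Title>Example Film</Title></BasicDescription>
  </ProgramInformation>
  <ProgramInformation programId="crid://example.invalid/a">
    <BasicDescription><Title>Example Series</Title></BasicDescription>
    <EpisodeOf index="1"/>
  </ProgramInformation>
</ProgramInformationTable>
"""


def story_order(xml_text):
    """Return the CRIDs of programmes that have an episode index, in order."""
    root = ET.fromstring(xml_text)
    episodes = []
    for prog in root.iter("ProgramInformation"):
        episode_of = prog.find("EpisodeOf")
        if episode_of is None or episode_of.get("index") is None:
            continue  # no episode number supplied for this programme
        episodes.append((int(episode_of.get("index")), prog.get("programId")))
    return [crid for _, crid in sorted(episodes)]
```

Programmes without an index are simply skipped, which is how most of the
current data would behave given that the index is not supplied upstream.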
Hywel




RE: [backstage] Timezone bug?

2005-10-31 Thread Hywel Williams


At 14:32 31/10/2005 +, you wrote:
[Taking this back on list as it needs the attention of the TV-Anytime guys]

LOL - If anything, I'm Ms Web Editorial - I'm a content producer with a
small grasp of technical stuff, who just happens to be reasonably good at
translating technical stuff into useful content ideas. I've done a lot of
making TV programme based content on bbc.co.uk, and a bit of thinking about
data models for that.

The set-to-record from trailer is a great idea - and you're right, it should
be possible / doable. The problem is, I think, that CRID isn't widely used
across the whole industry - I don't know if TV-Anytime is used by Sky?

There's also a problem that somehow, the trails (BBC speak for trailer)
would have to be synched with some kind of CRID broadcasting widget. I may
be wrong, but I think that trails are added on a very ad-hoc basis in the
transmission suite - they can be added, swapped or removed at the last
second (literally!) to keep the schedules running on time, and are played in
from digibeta. (This might be from hard disk these days... it's a year or
two since I've chatted to anyone in TX (BBC speak for Transmission) in the
bar.)

Oy! Kingswood boys! What do you think to this, and can you explain whether
the idea would work, and what the issues are, in terms I can follow?

We already have a demonstration of programme selection by trailer, using a
modified Pioneer Freeview box and a transport stream containing TV-Anytime
data and signalling. This was on show to the public at 2004's IBC as well as
being demonstrated many times to industry (e.g. BBC R&D's Open Days this
year).

There are many obstacles to getting this as a live service of course - the
least of these being the way trailers are changed at the last minute - but
we've shown it can be done with the existing TVA standard and technology.
We've also experimented with linking to the live playout system and getting
triggers for trailers that way.

Hywel





Re: [backstage] TV Data bug..

2005-10-30 Thread Hywel Williams

At 11:04 30/10/2005, you wrote:

Hi,

Program:

crid://bbc.co.uk/1103146355

and

crid://bbc.co.uk/1103146353

Are the same program, both on BBC 4


Unfortunately, to get 100% accuracy, the programmes would all have to be 
matched by hand, possibly by the schedulers.  This isn't currently done, 
and repeat programmes are found using a simple algorithm, which 
occasionally fails, as it has in this case. The opposite can also happen, 
when two programmes are matched as repeats when they're not - usually 
caused by generic synopses.


There is also a known issue with films that traverse the news, with 
the two halves not being paired correctly.


This is a test environment - it's believed that far more accurate 
matching will be done should a full broadcast TV-Anytime service be set up.


Hywel




Re: [backstage] Timezone bug?

2005-10-30 Thread Hywel Williams

At 14:23 30/10/2005, you wrote:

Hello, backstagers:
  It looks like the data generator on the BBC end is having timezone
issues -- all data I have for today (from the 20051029 tarball) is an
hour off.  For a while I thought that it was on my end, but no, the
actual XML files get it wrong.

Also, I noticed that at least for some shows, the time in the ProgramURL
does not agree with the time in the PublishedStartTime by a few minutes.
What does this difference represent?  Are the PublishedStartTimes simply
inaccurate but look nice, or is there something more interesting going
on?

Also, rather unrelatedly, would it be possible to get the name of the
episode in a separate element from the free-text descriptions?

Thank you,
-=- James Mastros



I can't see anything wrong with the times in the data sets, either for 
today or in the 29th's tarball.  Remember that they're all in UTC 
(hence the Z at the end of the time), so anything up until 31st 
October appears to be out of synch by an hour.  I'll look into it in 
more detail tomorrow to ensure there isn't a problem, but at the end 
of the day, it's up to the end application to sort out daylight saving.
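
For an end application, the conversion from the UTC times in the files to UK
local time might look like the minimal Python sketch below. It hand-codes
the UK rule (clocks are one hour ahead from 01:00 UTC on the last Sunday of
March until 01:00 UTC on the last Sunday of October) to stay dependency-free;
in Python 3.9 or later the standard zoneinfo module with "Europe/London" is
the more usual choice. The function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def parse_utc(stamp):
    """Parse a timestamp such as '2005-10-29T18:00:00Z' as an aware UTC datetime."""
    parsed = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def last_sunday_0100_utc(year, month):
    """01:00 UTC on the last Sunday of the month (used for March and October)."""
    day = datetime(year, month + 1, 1, 1, 0, tzinfo=timezone.utc) - timedelta(days=1)
    while day.weekday() != 6:  # Monday is 0, Sunday is 6
        day -= timedelta(days=1)
    return day


def to_uk_local(utc_dt):
    """Convert an aware UTC datetime to UK local time (GMT or BST)."""
    bst_start = last_sunday_0100_utc(utc_dt.year, 3)
    bst_end = last_sunday_0100_utc(utc_dt.year, 10)
    if bst_start <= utc_dt < bst_end:
        zone = timezone(timedelta(hours=1), "BST")
    else:
        zone = timezone(timedelta(0), "GMT")
    return utc_dt.astimezone(zone)
```

With this, a programme listed at 2005-10-29T18:00:00Z is shown at 19:00 local
time, while one listed at 2005-10-31T18:00:00Z is shown at 18:00.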


The time difference you're seeing is the difference between the 
published time and the actual time the playout server was going to 
show the programme. Sometimes programmes may start early and at other 
times late.  More often than not this is not accidental - prerecorded 
BBC television services mostly come off servers these days, so the 
timing can be down to the second.


Having said that, the figures you're seeing are only a snapshot of 
the estimated start times at 8am, when the files are generated.  
These times shift and change throughout the day.  The only anchors 
you'll notice are the 1, 6 and 10 o'clock news on BBC One, which 
start at the exact time published.


In the future, a TV-Anytime service may be broadcast along with a 
tv/radio digital service or provided as a live Internet feed and this 
exact time could be used to, say start a PVR or stream recorder.


As to separating the name - I'll look into the TV-Anytime standard to 
see if there's a field for this (I don't recollect one) when I have 
that hefty tome on my desk at work tomorrow.  The synopsis provided 
is in fact a slightly cleaned up version of the one that goes out to 
Sky and Freeview, so to extract the title from the synopsis, I'd have 
to somehow work out which part of the text is the episode title.  If 
there's a pattern, it's easy to write a filter.  Where different 
schedulers do it in different ways, that's where it gets complicated.
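
As an illustration of the easy case, the sketch below (Python) assumes one
particular pattern: a short episode title at the very start of the synopsis,
followed by a colon and a space, as in "The Long Goodbye: Owen returns to
work...". The pattern, the 60-character limit and the function name are
assumptions, not something the feed guarantees.

```python
import re

# Assumed pattern: "<Episode Title>: <rest of synopsis>", where the title is
# short, starts with a capital letter and contains no sentence punctuation.
TITLE_PATTERN = re.compile(
    r"^(?P<title>[A-Z][^:.!?]{0,60}): (?P<rest>.+)$", re.DOTALL
)


def split_episode_title(synopsis):
    """Return (episode_title_or_None, remaining_synopsis)."""
    text = synopsis.strip()
    match = TITLE_PATTERN.match(text)
    if not match:
        return None, text
    return match.group("title"), match.group("rest")
```

A synopsis that happens to open with a short phrase and a colon would be
misread as having a title, which is exactly where differing scheduler habits
make this complicated.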


Hywel




[backstage] TV-Anytime regional opt-out files

2005-10-26 Thread Hywel Williams
In view of the stagnant thread, I thought I'd add something new to the 
TV-Anytime data so that people can have something new to experiment with.


It hasn't been rigorously tested and may include some errors - but this is 
a stab at representing regional opt-outs using TV-Anytime.  At the moment 
I'm only providing English regional opt-outs for BBC One, but hopefully it 
won't be too much extra work to extend to the Nations and BBC2 Nations.


Here's the way I've implemented it:

1.  In the ServiceInformationTable, the main BBCOne feed is still London - 
this actually reflects how the BBC have provided sustaining feeds to the 
regions for a long time.


2. The English regions are also in the ServiceInformationTable - these have 
an extra element describing the parent service as being the BBCOne 
feed.  In other words, when there's no data provided specifically for the 
regional feed, assume the parent feed's data is to be used.


3. ProgramInformationTable:  The metadata for all the regions is found in 
the BBCOne file.  As always, these are indexed by their CRIDs.


4. ProgramLocationTable: Again, a single BBCOne file is provided.  The 
Schedule table for the BBCOne sustaining service is listed first, 
followed by the regional opt-outs.  Only the programmes that differ are 
listed here; otherwise the parent service's schedule is assumed.


5. ContentReferencingTable:  Here, for each CRID, the locations for all 
regions are listed. When services are opted out, they're not listed under 
the main BBCOne service.


This isn't the only way to represent regional opt-outs in files, but I 
thought I'd experiment with this format, to see if it's a viable method of 
distributing a complete service list for a broadcaster which has regional 
opt-outs.
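
The fallback described in points 2 and 4 can be sketched roughly as follows
(Python). The sketch assumes each schedule has already been read into a
mapping from start time to CRID and that an opt-out occupies the same slot as
the parent programme it replaces; opt-outs that split or overlap parent slots
are not handled. The function name and sample CRIDs are hypothetical.

```python
def regional_schedule(parent, opt_outs):
    """Combine the parent (sustaining) schedule with one region's opt-outs.

    Both arguments map a start time string to a CRID.  Wherever the region
    lists a programme, it replaces the parent's entry for that slot; every
    other slot falls back to the parent service.
    """
    merged = dict(parent)
    merged.update(opt_outs)
    # ISO 8601 timestamps in the same format sort correctly as plain strings.
    return [(start, merged[start]) for start in sorted(merged)]


# Hypothetical data: the region replaces only the 18:30 slot.
PARENT = {
    "2005-10-25T18:00:00Z": "crid://example.invalid/news",
    "2005-10-25T18:30:00Z": "crid://example.invalid/london-news",
    "2005-10-25T19:00:00Z": "crid://example.invalid/network-show",
}
REGION = {
    "2005-10-25T18:30:00Z": "crid://example.invalid/regional-news",
}
```

Calling regional_schedule(PARENT, REGION) yields the three slots in time
order, with only the 18:30 entry taken from the region.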


The first file in this format is called 20051025.tar.gz and can be found 
in the test sub-directory of the place where the data is usually put.  The 
data automatically provided every day hasn't changed.


For now, I'll only produce an occasional set of files in that directory for 
you to get the gist.  If someone requires a more regular service, that can 
be arranged...


Hywel



[backstage] New, more experimental data to play with

2005-09-23 Thread Hywel Williams

Folks,

A request was made a while ago that a separate feed of data that's closer 
to what we're using on the development end be made available, which may be 
more liable to instant change as new features appear. This has now been done.


I haven't yet automated the generation of this feed so for now it won't be 
updated daily, but this is also in the pipeline. The files can be found in 
a sub-directory of the tvradio directory:

http://backstage.bbc.co.uk/feeds/tvradio/test/

Currently, there are only two major differences between this feed and the 
main Backstage one - this one now uses the more recent 2004 classification 
schema and also supports almost all local BBC Radio stations.


One issue with regard to local radio stations is that occasionally, sister 
stations get together to broadcast the same programmes.  Ideally, and by 
definition, these programmes should have the same CRID, regardless of where 
they're broadcast, however this is not currently supported.  Each station 
in the current model is treated completely separately, but I'm looking into 
the possibility of providing a more accurate model in future (and with this 
in mind, regional telly schedules would be desirable too, again something 
I'm looking into).


Hywel



Re: [backstage] BUG: dodgy data..

2005-09-18 Thread Hywel Williams

At 01:42 18/09/2005, you wrote:

Hi,

Program crids: 285169583, 285170277 and 285171923 are in fact all the same
program, Two Pints of Lager and a Packet of Crisps, all with the same
description: Purgatory: Gaz tries to get over his guilt by doing one special
thing for Donna. Janet holds a party for Louise's graduation.

285171923 also seems to have 21 audio channels - which although impressive
might not be true :)

Cheers

Leo



Thanks for keeping us informed about these anomalies - most 
appreciated.  Many of these happen unfortunately further up the chain 
from where the data is generated but these two certainly sound like 
they shouldn't have slipped the net.


The first one is strange - a programme with the same synopsis and 
title on BBC Three should have been allocated the same CRID - I 
suspect that isn't a chain problem and should hopefully be 
something I can track down and fix myself.
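
To make the idea concrete, here is a minimal Python sketch of matching on
title and synopsis. It is not the algorithm used to generate the feed, only
an illustration of the rule stated above; the tuple layout, the whitespace
and case normalisation, and the function names are assumptions.

```python
from collections import defaultdict


def normalise(text):
    """Lower-case and collapse whitespace so trivial differences don't matter."""
    return " ".join(text.lower().split())


def group_repeats(broadcasts):
    """Group CRIDs whose title and synopsis both match.

    broadcasts is an iterable of (crid, title, synopsis) tuples.  Returns a
    list of CRID lists, one per group that contains more than one entry.
    """
    groups = defaultdict(list)
    for crid, title, synopsis in broadcasts:
        groups[(normalise(title), normalise(synopsis))].append(crid)
    return [crids for crids in groups.values() if len(crids) > 1]
```

Fed the three CRIDs reported above with their shared title and synopsis,
this returns them as a single group. The same rule also explains the
opposite failure: two different programmes with a generic synopsis would be
grouped together.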


I see no reason why any programme should have 21 channels of audio 
(well, I think I've worked out what's going on - read on)!  Any 
errors that come through from our scheduling department that are 
obviously bogus (e.g. 5.1 surround or any other signification other 
than stereo or mono) should be trapped and given a default value 
within the permitted range.  I'll check the code on Monday to see how 
on earth that happened! Almost certainly a case statement going wrong 
and giving it both 2 and 1 channels, I suspect...
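
If that suspicion is right, one way to rule out a value such as 21 is to map
each incoming label to exactly one number through a lookup, rather than
through branches that can each emit output. A minimal Python sketch, in which
the label strings, the default of 2 and the function name are all
assumptions:

```python
# Permitted values for this feed: mono or stereo only.
PERMITTED_CHANNELS = {"mono": 1, "stereo": 2}

# Assumed default; the thread does not say which value is actually used.
DEFAULT_CHANNELS = 2


def num_of_channels(signification):
    """Return exactly one channel count for a scheduling-system audio label."""
    key = signification.strip().lower()
    return PERMITTED_CHANNELS.get(key, DEFAULT_CHANNELS)
```

Because the function returns a single integer, a label that matches neither
entry (5.1 surround, say) falls back to the default instead of producing two
digits.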


--
Hywel Williams

Backstage Email Account @ Home 




Re: [backstage] Some changes to the TV-Anytime data

2005-09-15 Thread Hywel Williams

At 11:15 15/09/2005 +0100, you wrote:

At 18:32 14/09/2005 +0100, you wrote:
Some bug fixes and modifications have been made to the data set - these 
are outlined below.


2. When audio description is being provided for a programme, this is 
indicated with an extra AudioAttributes entry with an AudioLanguage 
purpose of type 1 (Audio description for the visually impaired):

<AudioAttributes>
 <NumOfChannels>1</NumOfChannels>
 <AudioLanguage purpose='urn:tva:metadata:cs:AudioPurposeCS:2004:1'>EN-UK</AudioLanguage>
</AudioAttributes>


<delurk> Ahh - this reminds me of something I meant to ask! </delurk>

From other interest areas, I know the Beeb are involved in the W3C Timed 
Text initiative. I think one of the contacts is David Kirby in the R&D 
section of the BBC.


For those of you who don't know, this is an XML format for storing 
subtitles or captions (if you're in the USA).


So here's a thought : Supposing each broadcast program is subtitled - most 
are, to meet disability guidelines for deaf viewers.  How about publishing 
the subtitle files as part of backstage, in W3C TT format (maybe in the 
final draft format until the complete spec is ratified).


Then in the TV data feeds, publish a linkage to the subtitle file.  So, 
not actually embedded in the TV data feeds, but as a separate dataset. Now 
we'd have something similar to the google video and Blinkx text searches.


I do this on my Captionkit website, but of course it's a manual process to 
enter the captions in the editor.


As the BBC already pay people to create and store this data, it's 
definitely available and I think it would be a massive benefit to publish 
it. If you did that I'd for-sure drum up a demo project in under a week 
(probably a PHP API) to make use of those, and link a subtitle search to 
program information for repeats etc.


Food for thought ?



It's an excellent idea, and technically quite possible. The big hurdle 
unfortunately is rights.


Obviously, a broadcaster wouldn't publish subtitles for programmes that 
haven't been shown yet - the tabloid press would have a field day 
publishing the plot of EastEnders days before it's broadcast!


After/during broadcast? Definitely technically possible and has already 
been demonstrated on both pre-recorded and live output, but at the moment 
the rights to do so are prohibitive for a public service.


Hywel  




Re: [backstage] Some changes to the TV-Anytime data

2005-09-15 Thread Hywel Williams

At 10:23 15/09/2005 +0200, you wrote:

Are you just changing the genres or are you also introducing some 2004 
elements with namespace and all? TVA's atrocious practice of changing 
namespaces when they change versions could introduce all manners of 
problems for implementations if the upgrade isn't planned.


Since the genre schemas are backwards compatible, the initial change would 
probably be cosmetic in just changing the 2002 to 2004 in the data.  New 
genres introduced in the 2004 schema will eventually be used.


In fact, 2004 descriptors are already in use for AudioPurpose in the audio 
description information for the simple reason that they didn't exist in the 
2002 classification schema!
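
Software that wants to survive the switch could compare classification terms
without regard to the edition year. The Python sketch below assumes term URNs
of the shape seen in the audio description example
(urn:tva:metadata:cs:AudioPurposeCS:2004:1); the ContentCS scheme name and
term 3.1 used in the usage note are illustrative assumptions, and treating
the same term in both editions as equal relies on the backwards
compatibility described above.

```python
import re

# Shape assumed from the AudioPurposeCS example: scheme name, edition year,
# then the term identifier.
TERM_PATTERN = re.compile(
    r"^urn:tva:metadata:cs:(?P<scheme>[A-Za-z]+):(?P<year>\d{4}):(?P<term>[\w.]+)$"
)


def parse_term(urn):
    """Split a classification URN into (scheme, year, term)."""
    match = TERM_PATTERN.match(urn)
    if not match:
        raise ValueError("not a recognised classification URN: %r" % urn)
    return match.group("scheme"), int(match.group("year")), match.group("term")


def same_term(urn_a, urn_b):
    """True if both URNs name the same scheme and term, ignoring the year."""
    scheme_a, _, term_a = parse_term(urn_a)
    scheme_b, _, term_b = parse_term(urn_b)
    return (scheme_a, term_a) == (scheme_b, term_b)
```

For example, same_term("urn:tva:metadata:cs:ContentCS:2002:3.1",
"urn:tva:metadata:cs:ContentCS:2004:3.1") is true, so a genre lookup keyed
this way would not notice the cosmetic change of year.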


Hywel 




[backstage] Some changes to the TV-Anytime data

2005-09-14 Thread Hywel Williams
Some bug fixes and modifications have been made to the data set - these are 
outlined below.


1. The character set being used is now correctly identified as 
ISO-8859-9.  This was always the case but the ISO-8859-1 ident was 
incorrectly displayed.


2. When audio description is being provided for a programme, this is 
indicated with an extra AudioAttributes entry with an AudioLanguage purpose 
of type 1 (Audio description for the visually impaired):

<AudioAttributes>
 <NumOfChannels>1</NumOfChannels>
 <AudioLanguage purpose='urn:tva:metadata:cs:AudioPurposeCS:2004:1'>EN-UK</AudioLanguage>
</AudioAttributes>

3. The missing ServiceGenre items in the ServiceInformationTable have now 
been added.


4. CBeebies' identifier was misspelt as CeeBeebies in the 
ServiceInformationTable (and everywhere else!)  This has now been 
corrected.  Note that this has also affected the filename used to represent 
the channel's data.
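
As an illustration of change 2, a client could detect audio description by
looking for an AudioLanguage element with that purpose. This is a minimal
Python sketch over a simplified, namespace-free fragment; the enclosing
AVAttributes element in the sample and the function name are assumptions,
and real files are namespaced.

```python
import xml.etree.ElementTree as ET

# Purpose URN quoted in change 2 (audio description for the visually impaired).
AUDIO_DESCRIPTION = "urn:tva:metadata:cs:AudioPurposeCS:2004:1"

# Simplified sample: one ordinary audio entry plus the extra description entry.
SAMPLE = """
<AVAttributes>
  <AudioAttributes>
    <NumOfChannels>2</NumOfChannels>
  </AudioAttributes>
  <AudioAttributes>
    <NumOfChannels>1</NumOfChannels>
    <AudioLanguage purpose='urn:tva:metadata:cs:AudioPurposeCS:2004:1'>EN-UK</AudioLanguage>
  </AudioAttributes>
</AVAttributes>
"""


def has_audio_description(xml_text):
    """True if any AudioLanguage element carries the audio description purpose."""
    root = ET.fromstring(xml_text)
    for language in root.iter("AudioLanguage"):
        if language.get("purpose") == AUDIO_DESCRIPTION:
            return True
    return False
```

Matching on the full URN means the check would need revisiting if a later
edition of the classification scheme changes the year in the identifier.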


-

Some forthcoming changes/issues.

1. It has been pointed out that the format of the DVB locator used in the 
ContentReferencing and ProgramLocation tables doesn't conform to the DVB 
standard!  Essentially, the divider between the start date/time and the 
duration should use -- rather than / to be DVB compliant. 
e.g.  dvb://233a.4000.4740;[EMAIL PROTECTED]:00:00Z--PT01H00M


While this is dead easy to fix, it may cause some software already written 
to use the current method to fail so this is forewarning that this bug will 
be fixed soon.
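
Software that needs to cope with both the current and the corrected form
could accept either divider during the changeover. A minimal Python sketch;
the function name is hypothetical, and the locators in the usage note use a
made-up event part (0001@2005-09-14T12:00:00Z) because the archived example
above is partly obscured.

```python
def split_locator(locator):
    """Split a locator into (everything_before_the_duration, duration).

    Accepts the DVB-compliant '--' divider as well as the '/' divider used
    by the feed before the fix.
    """
    if "--" in locator:
        head, duration = locator.rsplit("--", 1)
    else:
        head, duration = locator.rsplit("/", 1)
    # Durations in the feed look like PT01H00M; anything else is rejected.
    if not duration.startswith("PT"):
        raise ValueError("no duration found in %r" % locator)
    return head, duration
```

Both "dvb://233a.4000.4740;0001@2005-09-14T12:00:00Z--PT01H00M" and the same
string with "/" before the duration split into the same two parts, so code
written this way keeps working on either side of the fix.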


2. More channels have been requested.  I'm currently testing the inclusion 
of almost all BBC regional radio stations in the schedules.   The main 
issue is that at present they don't represent occasions when more than one 
regional station is broadcasting the same programme.


3. Currently, the 2002 genre classification is being used instead of the 
more recent 2004 one.  Again, this is very easy to change since the two are 
very similar, but it could cause problems for software expecting the 2002 
version.

