Hi Pete,

Before I address any of this: I think there is still some confusion,
and I'd like to clear it up so we don't talk past each other. The
"descriptions" that Alex and I have been talking about in this thread
are not in audio format. Instead, they are text, provided to the
browser in exactly the same way as captions.
Here is an example of such a description file:
http://www.annodex.net/~silvia/itext/elephants_dream/audiodescription.srt
It has cues like the following:

00:00:00,000 --> 00:00:05,000
The orange open movie project presents

00:00:05,010 --> 00:00:12,000
Introductory titles are showing on the background of a water pool with
fishes swimming and mechanical objects lying on a stone floor.

00:00:12,010 --> 00:00:14,800
elephants dream

00:00:26,100 --> 00:00:28,206
Two people stand on a small bridge.

They aren't actually useful unless they are voiced in parallel with
the video that is playing, rendered during the time interval that the
cue specifies.
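
To make the wiring concrete, here is a minimal TypeScript sketch of
how such a description file would be attached to a video, exactly
like a caption file but with a different kind (the language code is a
placeholder):

  // Minimal sketch: attach a text description file to a video
  // element in the same way a caption file would be attached,
  // only with a different kind.
  const video = document.querySelector('video') as HTMLVideoElement;
  const track = document.createElement('track');
  track.kind = 'descriptions';         // 'captions' for a caption file
  track.src = 'audiodescription.srt';  // a cue file like the one above
  track.srclang = 'en';
  video.appendChild(track);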


On Tue, Jun 21, 2011 at 4:52 AM, Pete Brunet <[email protected]> wrote:
> Hi Sylvia, We probably have more to learn from you than you from us :-)

Well, we have to work together to solve this problem. :-)

> I think even in the case of HTML5 nothing has changed for those who are
> either deaf or blind but not both, i.e. an additional mode can be used to
> compensate for the sense that is impaired, e.g. captions for the deaf and
> audio descriptions for the blind.  However, in the case of those who are
> deaf/blind then a tactile mode is needed.  One solution is to make captions
> available to the screen reader (with its Braille support) and the audio
> descriptions available as text to the screen reader.
>
> Are there other scenarios besides use by those who are deaf/blind where text
> descriptions are needed?

In the HTML spec, there is mention of hands-free applications that
could make use of this, too. But I suspect that would also require an
additional screen-reader-like application.


> If the only need for text descriptions is to provide access for those who
> are deaf/blind, what are the current solutions?  Transcripts?

For deaf-blind people I believe transcripts are the solution, and
they remain the solution, even though making both captions and text
descriptions available to the screen reader (and thus to Braille
output) would serve deaf-blind people, too. I regard that as a minor
use case, though.

>  Are the
> existing solutions insufficient enough to justify the engineering effort
> associated with text descriptions and stream control?

Text descriptions are for blind people in general, not just for
deaf-blind people.

The current solution is audio descriptions, and they are much harder
to produce than text descriptions. So, in the interest of making more
video content accessible, text descriptions were introduced to help
achieve that. Both mechanisms, audio descriptions and text
descriptions, are supported in HTML5.


>  From the business
> perspectives that development managers are bound by the cost of design and
> implementation would not be justifiable for such a small user base, unless
> there is a legal requirement.  Is there (or will there soon be) a legal
> requirement to provide text descriptions?

I assume you are talking about development managers of accessibility
software? I believe the implementation of such a feature is indeed a
business decision, and screen readers are free to compete on the
grounds of one having more a11y features than another. However, we
are here only indirectly talking about software: we are really
talking about a general, standardised means of making an HTML5
feature available to AT. I believe this standardisation is well worth
the effort, so that different screen readers don't implement support
for text descriptions in different ways when they do decide to
implement it.


> Regarding the descriptions keyword of the kind attribute, the document says
> that it's meant for use when visual capability is unavailable and gives the
> example of driving or blind users and also mentions that the text is meant
> for synthesis.  However, for those who are blind (and not deaf/blind) audio
> can still be heard and thus there is no need for a text version of an audio
> description.  And for deaf blind users synthesis is not needed - tactile
> output (Braille) is needed.

I hope the explanation above makes clear why text descriptions are
different from audio descriptions and why support for both is
required. Audio descriptions will indeed already work with the
current HTML5 specification. But we want to make the simpler
authoring task of creating text descriptions an effective means of
delivering accessible video to blind users. Note that where text
descriptions are available, we would *not* expect audio descriptions
to also be available.

>>> At least at this point I'm not in favor of the media control methods.
>>> Developers should provide accessible GUI controls.  The developer would have
>>> to implement the access in any case and having access through the GUI would
>>> eliminate adding the code for these new methods on both sides of the
>>> interface.  If the app developer does a correct implementation of the GUI
>>> there would be no extra coding required in ATs.
>>
>> I guess the idea here was that there may be situations where AT needs
>> to overrule what is happening in the UI, for example when there are
>> audio and video resources that start autoplaying on a newly opened
>> page. However, I am not quite clear on this point either.
>
> I believe the AT user would be in the same situation as a non-AT user, i.e.
> all users would use the same means to stop autoplaying (if such means were
> available).

That is probably true: a browser setting to generally disallow
autoplaying or a shortcut key in the browser to stop any and all media
elements that are autoplaying would be a nice browser feature for any
user.

Just to clarify: I cannot explain why we need the API in 2.7.1
https://wiki.mozilla.org/Accessibility/IA2_1.3#Control_video.2Faudio .

I do think, however, that we need the interface in 2.7.2
https://wiki.mozilla.org/Accessibility/IA2_1.3#Text_cues .

Note that I created the second interface in that section because I
believe AT needs to know the start time, end time, and exact text of
the to-be-read cue. I included "id" so we can keep an identifier,
though that may not be necessary. I also included functions to grab
both the HTML version and the plain text version of the cue, so that
markup can be rendered differently: "em", for example, can create
emphasis in the voicing, and navigation markers can be used to jump
from earlier details in a cue to later ones. I am just guessing,
though, at what kinds of information may be useful for the screen
reader to receive.
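
To illustrate, here is a rough TypeScript-flavoured sketch of the
per-cue information I have in mind; the names are purely my own
invention, not the actual IA2 IDL:

  // Illustrative only: made-up names, not the actual IA2 IDL.
  interface DescriptionCueInfo {
    id: string;               // identifier; possibly unnecessary
    startTime: number;        // seconds: when voicing should begin
    endTime: number;          // seconds: when voicing has to be done
    getTextAsHTML(): string;  // markup version, e.g. "em" for emphasis
    getTextAsPlain(): string; // plain text version for simple voicing
  }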

Also note that AT needs to listen for the "cuechange" event on the
video's description track, and needs access to set and unset the
"pauseOnExit" IDL attribute of the cue from the screen reader.

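As a rough sketch of the semantics at the HTML5 level (a screen
reader would of course get at this through IA2 rather than through
script; the element selection and track lookup are just placeholders):

  // Sketch only: illustrates the HTML5-level behaviour that would
  // need to be mirrored through IA2.
  const video = document.querySelector('video') as HTMLVideoElement;
  const descTrack = Array.from(video.textTracks)
    .find(t => t.kind === 'descriptions');

  if (descTrack) {
    descTrack.addEventListener('cuechange', () => {
      for (const cue of Array.from(descTrack.activeCues ?? [])) {
        // Pause the video when the cue ends so a long voicing can
        // finish before playback continues.
        cue.pauseOnExit = true;
        // ...hand cue.startTime, cue.endTime and the cue text to AT...
      }
    });
  }
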
I hope I've been able to clarify a few things...

Cheers,
Silvia.
_______________________________________________
Accessibility-ia2 mailing list
[email protected]
https://lists.linux-foundation.org/mailman/listinfo/accessibility-ia2