There is a type system in the GALE Multi-Modal Example in the Sandbox that
has been used for processing audio. We created an AudioSpan type whose
begin and end features are seconds (float) from the start of a block of
audio that was referenced via the SofaDataUri. Speech recognizers annotated
words on AudioSpans in the audio view, and the words were later combined
into a text string in another view for further textual processing.
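A minimal sketch of what such a type declaration might look like in a UIMA type system descriptor (the type name, namespace, and description text here are illustrative, not taken from the actual GALE example; note that AudioSpan extends uima.cas.AnnotationBase rather than uima.tcas.Annotation, because its begin/end are float seconds rather than integer character offsets):

```xml
<typeSystemDescription xmlns="http://uima.apache.org/resourceSpecifier">
  <name>AudioSpanTypeSystem</name>
  <types>
    <typeDescription>
      <!-- Hypothetical name; the real GALE descriptor may use a different one -->
      <name>example.AudioSpan</name>
      <description>A span of audio; offsets are seconds from the start
        of the audio block referenced by the view's sofa URI.</description>
      <!-- AnnotationBase supertype: ties the FS to its view (sofa)
           without imposing integer begin/end character offsets -->
      <supertypeName>uima.cas.AnnotationBase</supertypeName>
      <features>
        <featureDescription>
          <name>begin</name>
          <description>Start offset in seconds.</description>
          <rangeTypeName>uima.cas.Float</rangeTypeName>
        </featureDescription>
        <featureDescription>
          <name>end</name>
          <description>End offset in seconds.</description>
          <rangeTypeName>uima.cas.Float</rangeTypeName>
        </featureDescription>
      </features>
    </typeDescription>
  </types>
</typeSystemDescription>
```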

~Burn
