Hi Eric,

 

If it’s something that FFMPEG extracted, I suggest checking out:

 

http://wiki.apache.org/tika/FFMPEGParser 

 

If it’s something where you want to classify what’s going on in the video
using Tensorflow, see:

 

https://wiki.apache.org/tika/TikaAndVisionVideo 

 

Hope they help.

 

Cheers,

Chris

 

 

 

 

From: Eric Pugh <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Monday, January 21, 2019 at 6:50 AM
To: "[email protected]" <[email protected]>
Subject: Extracting Subtitles from Video Files?

 

Hi all, thought I would toss out this inquiry! Has anyone used Tika to 
extract subtitles from typical video files? I’ve done some research, and it 
appears the common formats, .SRT, .SBV, .VTT, and even a plain text format, all 
look like slightly different versions of the below (taken from a .SRT file):

 

00:00:01,160 --> 00:00:04,729

Welcome to the presentation

on basic addition.

 

They have a time range, and then the corresponding text. It seems like a 
great use case for Tika would be to handle the various types of embedded 
closed captioning files, and emit them in a single standard structure.
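
To make the "single standard structure" idea concrete, here is a minimal 
sketch in Python. It is not Tika code: the names Cue and parse_cues are 
hypothetical, and the pattern only covers SRT-style timing lines (comma before 
the milliseconds) and VTT-style lines that include the hours field (period 
before the milliseconds). Other formats would need their own patterns.

```python
import re
from dataclasses import dataclass
from typing import List

# Matches "HH:MM:SS,mmm --> HH:MM:SS,mmm" (SRT style) and the same shape
# with "." before the milliseconds (VTT style, hours included).
_TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


@dataclass
class Cue:
    """Hypothetical normalized caption: a time range plus its text."""
    start_ms: int
    end_ms: int
    text: str


def _to_ms(hours: str, minutes: str, seconds: str, millis: str) -> int:
    """Convert timestamp fields to milliseconds."""
    return ((int(hours) * 60 + int(minutes)) * 60 + int(seconds)) * 1000 + int(millis)


def parse_cues(raw: str) -> List[Cue]:
    """Parse caption text into Cue objects, one per blank-line-separated block."""
    cues = []
    for block in re.split(r"\n\s*\n", raw.strip()):
        lines = [line.strip() for line in block.splitlines()]
        for i, line in enumerate(lines):
            match = _TIMING.search(line)
            if match:
                fields = match.groups()
                # Everything after the timing line is caption text; anything
                # before it (such as an SRT sequence number) is ignored.
                text = " ".join(l for l in lines[i + 1:] if l)
                cues.append(Cue(_to_ms(*fields[:4]), _to_ms(*fields[4:]), text))
                break
    return cues


SAMPLE = """1
00:00:01,160 --> 00:00:04,729
Welcome to the presentation
on basic addition.
"""

for cue in parse_cues(SAMPLE):
    print(cue.start_ms, cue.end_ms, cue.text)
```

Storing times as integer milliseconds is just one possible choice; it keeps 
the structure independent of each format's timestamp punctuation.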

 

Before I get too far down the path, thought I would see if anyone else has done 
this in the open source space!

 

Eric

 

 

 

_______________________

Eric Pugh | Founder & CEO | OpenSource Connections, LLC | 434.466.1467 | 
http://www.opensourceconnections.com | My Free/Busy  

Co-Author: Apache Solr Enterprise Search Server, 3rd Ed 

This e-mail and all contents, including attachments, is considered to be 
Company Confidential unless explicitly stated otherwise, regardless of whether 
attachments are marked as such.

 
