[jira] [Commented] (TIKA-2322) Video labeling using existing ObjectRecognition

ASF GitHub Bot (JIRA) Fri, 28 Apr 2017 02:10:07 -0700

    [ 
https://issues.apache.org/jira/browse/TIKA-2322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15988463#comment-15988463
 ]


ASF GitHub Bot commented on TIKA-2322:
--------------------------------------

smadha commented on issue #168: fix for TIKA-2322 contributed by [email protected]
URL: https://github.com/apache/tika/pull/168#issuecomment-297948356
 
 
   I tried a few more things, but OpenCV still fails to read videos inside 
Docker. It seems that ffmpeg is not installed in the way OpenCV requires.
   
   I am attaching the logs I get when building OpenCV while logged in to the 
Docker container over ssh. 
[ssh_docker_opencv_build.txt](https://github.com/apache/tika/files/963888/ssh_docker_opencv_build.txt)
   
   After my changes the logs show the ffmpeg/avformat headers being found 
(`Looking for libavformat/avformat.h - found`), but OpenCV is still not 
built with ffmpeg support:
   ```
   -- Looking for linux/videodev.h
   -- Looking for linux/videodev.h - not found
   -- Looking for linux/videodev2.h
   -- Looking for linux/videodev2.h - found
   -- Looking for sys/videoio.h
   -- Looking for sys/videoio.h - not found
   -- Looking for libavformat/avformat.h
   -- Looking for libavformat/avformat.h - found
   -- Looking for ffmpeg/avformat.h
   -- Looking for ffmpeg/avformat.h - found
   ```
   The lines below should show YES against FFMPEG. I think the only thing left 
to try is to build ffmpeg from source.
   
   ```
   --   Video I/O:
   --     DC1394 1.x:                  NO
   --     DC1394 2.x:                  NO
   --     FFMPEG:                      NO
   --       codec:                     YES (ver )
   --       format:                    YES (ver )
   --       util:                      YES (ver )
   --       swscale:                   NO
   --       resample:                  NO
   --       gentoo-style:              YES
   --     GStreamer:                   NO
   --     OpenNI:                      NO
   --     OpenNI PrimeSensor Modules:  NO
   --     OpenNI2:                     NO
   --     PvAPI:                       NO
   --     GigEVisionSDK:               NO
   --     UniCap:                      NO
   --     UniCap ucil:                 NO
   ```
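
The same check can be made against an installed build, since `cv2.getBuildInformation()` returns a summary in a similar layout. Below is a minimal sketch that pulls out the top-level FFMPEG entry. The helper name `ffmpeg_status` is hypothetical, and the line layout is assumed to match the log above.

```python
import re


def ffmpeg_status(build_info):
    """Return the value of the top-level FFMPEG entry ('YES', 'NO', ...),
    or None if the summary has no such line.

    build_info is the text of an OpenCV build summary, either the CMake
    log (lines prefixed with "--") or cv2.getBuildInformation().
    """
    for line in build_info.splitlines():
        # Allow an optional leading "--" as printed by CMake.
        match = re.match(r"^\s*(?:--)?\s*FFMPEG:\s+(\S+)", line)
        if match:
            return match.group(1).upper()
    return None


if __name__ == "__main__":
    try:
        import cv2
    except ImportError:
        print("cv2 is not importable")
    else:
        print("FFMPEG:", ffmpeg_status(cv2.getBuildInformation()))
```

Running this inside the container after each rebuild is quicker than scanning the whole CMake log by eye.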
   
   
   Once OpenCV is installed, you can test it as follows.
   Shell -
   ```shell
   # -L follows the redirect GitHub issues for ?raw=true links
   curl -L -o testVideoMp4.mp4 \
     "https://github.com/smadha/tika/blob/TIKA-2322/tika-parsers/src/test/resources/test-documents/testVideoMp4.mp4?raw=true"
 
   ```
   python -
   ```python
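   import cv2

   # Try to open the downloaded test video with OpenCV.
   cap = cv2.VideoCapture('testVideoMp4.mp4')
   # Should print True once OpenCV can read the video.
   print(cap.isOpened())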
   ```
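
A slightly stricter variant of the test above also tries to decode one frame, so a capture that opens but cannot decode is caught as well. This is only a sketch: `can_read_first_frame` is a hypothetical helper, and it accepts any object exposing `isOpened()`, `read()` and `release()`, as `cv2.VideoCapture` does.

```python
def can_read_first_frame(capture):
    """Return True only if the capture is open and yields a first frame.

    The capture is always released before returning.
    """
    try:
        if not capture.isOpened():
            return False
        ok, frame = capture.read()
        return bool(ok) and frame is not None
    finally:
        capture.release()


# Usage with OpenCV (assumes testVideoMp4.mp4 was downloaded as above):
#   import cv2
#   print(can_read_first_frame(cv2.VideoCapture('testVideoMp4.mp4')))
```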
   
   I think either a small change in my approach will fix this, or we need 
to replace 
https://github.com/smadha/tika/blob/TIKA-2322/tika-parsers/src/main/resources/org/apache/tika/parser/recognition/tf/InceptionVideoRestDockerfile#L26-L32
 with an ffmpeg build from source.
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


> Video labeling using existing ObjectRecognition
> -----------------------------------------------
>
>                 Key: TIKA-2322
>                 URL: https://issues.apache.org/jira/browse/TIKA-2322
>             Project: Tika
>          Issue Type: Improvement
>          Components: parser
>            Reporter: Madhav Sharan
>            Assignee: Chris A. Mattmann
>              Labels: memex
>             Fix For: 1.15
>
>
> Currently Tika supports ObjectRecognition for images. I am proposing to extend 
> this to support videos. 
> The idea is -
> 1. Extract frames from the video and run IncV3 to get labels for these frames. 
> 2. Average the confidence scores of each label across the frames. 
> 3. Return the results sorted by confidence score. 
> I am writing code for different modes of frame extraction -
> 1. Extract the center image.
> 2. Extract a frame after every fixed interval.
> 3. Extract N frames equally spaced across the video.
> We used this approach in [0]. The code is in [1].
> [0] https://github.com/USCDataScience/hadoop-pot
> [1] https://github.com/USCDataScience/video-recognition
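
The frame selection modes and the averaging step from the issue description can be sketched roughly as follows. This is not the code from the pull request: all function names are hypothetical, frames are identified by index, and each label is assumed to be averaged over the frames in which it appears (the description does not say whether frames without the label count as zero).

```python
from collections import defaultdict


def center_frame(total_frames):
    """Mode 1: index of the single center frame."""
    return [total_frames // 2] if total_frames > 0 else []


def fixed_interval_frames(total_frames, step):
    """Mode 2: every step-th frame index, starting at 0."""
    if step <= 0:
        raise ValueError("step must be positive")
    return list(range(0, total_frames, step))


def n_even_frames(total_frames, n):
    """Mode 3: n frame indices spread evenly across the video."""
    if total_frames <= 0 or n <= 0:
        return []
    n = min(n, total_frames)
    return [(i * total_frames) // n for i in range(n)]


def average_labels(per_frame_scores):
    """Average confidence per label and sort by it, highest first.

    per_frame_scores is a list of {label: confidence} dicts, one per frame.
    Assumption: a label is averaged over the frames that contain it.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for scores in per_frame_scores:
        for label, confidence in scores.items():
            totals[label] += confidence
            counts[label] += 1
    averaged = {label: totals[label] / counts[label] for label in totals}
    return sorted(averaged.items(), key=lambda item: item[1], reverse=True)
```

In practice the per-frame dicts would come from the recognition service applied to each selected frame.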



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
