#11387: dnn_detect filter won't work with yolo4-tiny model when both anchors and
labels filenames are defined
-------------------------------------+-------------------------------------
             Reporter:  Leandro      |                     Type:  defect
  Santiago                           |
               Status:  new          |                 Priority:  normal
            Component:  avfilter     |                  Version:
                                     |  unspecified
             Keywords:               |               Blocked By:
             Blocking:               |  Reproduced by developer:  0
Analyzed by developer:  0            |
-------------------------------------+-------------------------------------
 Summary of the bug:

 System: Manjaro Linux stable (latest as of Dec 30th, 2024).

 OpenVino version: 2024.6.0.

 FFmpeg version:

 {{{
 ffmpeg version N-118193-g5f38c82536 Copyright (c) 2000-2024 the FFmpeg developers
 built with gcc 14.2.1 (GCC) 20240910
 configuration: --enable-libopenvino --enable-libharfbuzz --enable-libfribidi --enable-libfreetype --enable-libfontconfig --enable-openssl
 libavutil      59. 53.100 / 59. 53.100
 libavcodec     61. 28.100 / 61. 28.100
 libavformat    61.  9.102 / 61.  9.102
 libavdevice    61.  4.100 / 61.  4.100
 libavfilter    10.  6.101 / 10.  6.101
 libswscale      8. 13.100 /  8. 13.100
 libswresample   5.  4.100 /  5.  4.100
 }}}

 How to reproduce:

 Install the `openvino-dev` Python package (together with `tensorflow`) to
 download the models:

 {{{
 pip install openvino-dev tensorflow
 }}}

 Then download and convert the `yolo-v4-tiny-tf` model, and fetch the
 labels file:

 {{{
 omz_downloader --name yolo-v4-tiny-tf
 omz_converter --name yolo-v4-tiny-tf
 wget https://raw.githubusercontent.com/openvinotoolkit/open_model_zoo/refs/heads/master/data/dataset_classes/coco_80cl.txt
 }}}

 Then run ffplay on any video containing several objects that this model
 should detect, drawing rectangles and labels on the detected objects:

 {{{
 ffplay \
  https://videos.pexels.com/video-files/5222540/5222540-uhd_3840_2160_30fps.mp4 \
  -vf 'dnn_detect=dnn_backend=openvino:model=public/yolo-v4-tiny-tf/FP32/yolo-v4-tiny-tf.xml:input=image_input:confidence=0.4:model_type=yolov4:anchors=81&82&135&169&344&319:labels=coco_80cl.txt:async=1:nb_classes=80,drawbox=box_source=side_data_detection_bboxes:color=yellow,drawtext=text_source=side_data_detection_bboxes:fontcolor=yellow:bordercolor=yellow:fontsize=40,showinfo'
 }}}
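
 Because the `-vf` argument above is one long string, it is easy to lose
 track of which option belongs to which filter. The following is a minimal
 Python sketch that assembles the same filtergraph from its parts. The
 helper name `build_filter` is hypothetical (it is not part of FFmpeg), and
 the sketch assumes that no option value needs escaping, which holds for
 the values used in this ticket.

```python
def build_filter(name, options):
    """Return 'name=key=value:key=value' (hypothetical helper, no escaping)."""
    return name + "=" + ":".join(f"{key}={value}" for key, value in options.items())


# Anchor values copied from the command in this ticket; dnn_detect expects
# them joined with '&'.
anchors = [81, 82, 135, 169, 344, 319]

dnn_detect = build_filter("dnn_detect", {
    "dnn_backend": "openvino",
    "model": "public/yolo-v4-tiny-tf/FP32/yolo-v4-tiny-tf.xml",
    "input": "image_input",
    "confidence": "0.4",
    "model_type": "yolov4",
    "anchors": "&".join(str(a) for a in anchors),
    "labels": "coco_80cl.txt",
    "async": "1",
    "nb_classes": "80",
})

drawbox = build_filter("drawbox", {
    "box_source": "side_data_detection_bboxes",
    "color": "yellow",
})

drawtext = build_filter("drawtext", {
    "text_source": "side_data_detection_bboxes",
    "fontcolor": "yellow",
    "bordercolor": "yellow",
    "fontsize": "40",
})

# Filters in a chain are separated by commas; showinfo takes no options here.
filtergraph = ",".join([dnn_detect, drawbox, drawtext, "showinfo"])
print(filtergraph)
```

 The printed string can be passed to `-vf` inside single quotes, as in the
 command above, so that the shell does not interpret the `&` characters.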

 You'll see many log lines like this:

 {{{
 [Parsed_dnn_detect_0 @ 0x785bd2f21680] anchors is not set
 }}}

 The `anchors=` option does not seem to reach the `dnn_detect` filter, even
 though it is given on the command line, and anchors are required by the
 `yolov4` model type.

 The expected behaviour is that the drawbox and drawtext filters draw on
 the image, and that information about the detected objects is logged to
 the terminal:

 {{{
 ...
 [Parsed_showinfo_3 @ 0x743b02f22b00] index: 0,  region: (145, 1042) -> (740, 1495), label: car, confidence: 9918/10000.
 [Parsed_showinfo_3 @ 0x743b02f22b00] index: 1,  region: (551, 893) -> (551, 893), label: person, confidence: 4277/10000.
 [Parsed_showinfo_3 @ 0x743b02f22b00] index: 2,  region: (791, 1012) -> (791, 1012), label: person, confidence: 4069/10000.
 [Parsed_showinfo_3 @ 0x743b02f22b00] index: 3,  region: (1375, 1055) -> (1375, 1055), label: person, confidence: 5944/10000.
 [Parsed_showinfo_3 @ 0x743b02f22b00] index: 4,  region: (1505, 1065) -> (1505, 1065), label: person, confidence: 7363/10000.
 [Parsed_showinfo_3 @ 0x743b02f22b00] index: 5,  region: (794, 1011) -> (794, 1011), label: person, confidence: 8378/10000.
 [Parsed_showinfo_3 @ 0x743b02f22b00] index: 6,  region: (915, 1010) -> (915, 1010), label: person, confidence: 8011/10000.
 [Parsed_showinfo_3 @ 0x743b02f22b00] index: 7,  region: (1088, 1117) -> (1088, 1117), label: person, confidence: 9511/10000.
 [Parsed_showinfo_3 @ 0x743b02f22b00] index: 8,  region: (1385, 1052) -> (1385, 1052), label: person, confidence: 7692/10000.
 [Parsed_showinfo_3 @ 0x743b02f22b00] index: 9,  region: (1644, 1172) -> (1644, 1172), label: person, confidence: 9132/10000.
 [Parsed_showinfo_3 @ 0x743b02f22b00] index: 10, region: (1801, 1173) -> (1801, 1173), label: person, confidence: 9828/10000.
 [Parsed_showinfo_3 @ 0x743b02f22b00] index: 11, region: (2480, 1299) -> (2480, 1299), label: person, confidence: 9496/10000.
 [Parsed_showinfo_3 @ 0x743b02f22b00] index: 12, region: (414, 1239) -> (414, 1239), label: car, confidence: 8610/10000.
 [Parsed_showinfo_3 @ 0x743b02f22b00] index: 13, region: (422, 1265) -> (422, 1265), label: car, confidence: 9608/10000.
 [Parsed_showinfo_3 @ 0x743b02f22b00] index: 14, region: (452, 1266) -> (452, 1266), label: car, confidence: 9239/10000.
 ...
 }}}
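
 To check a captured log against the expected output above, the detection
 lines can be parsed mechanically. The following is a rough Python sketch
 that assumes the exact line layout shown in this ticket; the names
 `DETECTION_RE` and `parse_detections` are made up for illustration. It
 tolerates a line break after `->`, so it also works on logs that were
 wrapped by an email client.

```python
import re

# Pattern modelled on the showinfo lines quoted in this ticket; \s* also
# matches newlines, so wrapped log lines are handled.
DETECTION_RE = re.compile(
    r"index:\s*(\d+),\s*region:\s*\((\d+),\s*(\d+)\)\s*->\s*\((\d+),\s*(\d+)\),"
    r"\s*label:\s*(.+?),\s*confidence:\s*(\d+)/(\d+)"
)


def parse_detections(log_text):
    """Return one dict per detection entry found in log_text."""
    detections = []
    for match in DETECTION_RE.finditer(log_text):
        index, x0, y0, x1, y1 = (int(match.group(i)) for i in range(1, 6))
        numerator, denominator = int(match.group(7)), int(match.group(8))
        detections.append({
            "index": index,
            "region": (x0, y0, x1, y1),
            "label": match.group(6),
            "confidence": numerator / denominator,
        })
    return detections


# First entry of the expected output, wrapped as in the original report.
sample = (
    "[Parsed_showinfo_3 @ 0x743b02f22b00] index: 0,  region: (145, 1042) ->\n"
    " (740, 1495), label: car, confidence: 9918/10000.\n"
)

for detection in parse_detections(sample):
    print(detection)
```

 An empty result for a run that should detect objects, together with the
 `anchors is not set` messages, would match the failure described here.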
-- 
Ticket URL: <https://trac.ffmpeg.org/ticket/11387>
FFmpeg <https://ffmpeg.org>
FFmpeg issue tracker