#11387: dnn_detect filter won't work with yolo4-tiny model when both anchors and labels filenames are defined
-------------------------------------+-------------------------------------
Reporter:   Leandro Santiago         | Type:                     defect
Status:     new                      | Priority:                 normal
Component:  avfilter                 | Version:                  unspecified
Keywords:                            | Blocked By:
Blocking:                            | Reproduced by developer:  0
Analyzed by developer:  0            |
-------------------------------------+-------------------------------------
Summary of the bug:

System: Manjaro Linux stable (latest as of Dec 30th 2024).

OpenVINO version: 2024.6.0.

FFmpeg version:

{{{
ffmpeg version N-118193-g5f38c82536 Copyright (c) 2000-2024 the FFmpeg developers
  built with gcc 14.2.1 (GCC) 20240910
  configuration: --enable-libopenvino --enable-libharfbuzz --enable-libfribidi --enable-libfreetype --enable-libfontconfig --enable-openssl
  libavutil      59. 53.100 / 59. 53.100
  libavcodec     61. 28.100 / 61. 28.100
  libavformat    61.  9.102 / 61.  9.102
  libavdevice    61.  4.100 / 61.  4.100
  libavfilter    10.  6.101 / 10.  6.101
  libswscale      8. 13.100 /  8. 13.100
  libswresample   5.  4.100 /  5.  4.100
}}}

How to reproduce:

Install the `openvino-dev` Python package to download the models:

{{{
pip install openvino-dev tensorflow
}}}

Then download and convert the `yolo-v4-tiny-tf` model and fetch the labels file:

{{{
omz_downloader --name yolo-v4-tiny-tf
omz_converter --name yolo-v4-tiny-tf
wget https://raw.githubusercontent.com/openvinotoolkit/open_model_zoo/refs/heads/master/data/dataset_classes/coco_80cl.txt
}}}

Then run ffplay on an arbitrary video containing several objects that this model should detect, drawing rectangles and labels on the detected objects:

{{{
ffplay \
  https://videos.pexels.com/video-files/5222540/5222540-uhd_3840_2160_30fps.mp4 \
  -vf 'dnn_detect=dnn_backend=openvino:model=public/yolo-v4-tiny-tf/FP32/yolo-v4-tiny-tf.xml:input=image_input:confidence=0.4:model_type=yolov4:anchors=81&82&135&169&344&319:labels=coco_80cl.txt:async=1:nb_classes=80,drawbox=box_source=side_data_detection_bboxes:color=yellow,drawtext=text_source=side_data_detection_bboxes:fontcolor=yellow:bordercolor=yellow:fontsize=40,showinfo'
}}}

You'll see many log lines like this:

{{{
[Parsed_dnn_detect_0 @ 0x785bd2f21680] anchors is not set
}}}

The `anchors=` option of `dnn_detect` does not seem to reach the filter, even though anchors are required by the `yolov4` model type.
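
To check the symptom without watching the player window, the log can be captured to a file and counted. The following is only a minimal sketch and not part of the original reproduction steps: the function names `count_anchor_warnings` and `count_detections` and the log file name `ffplay.log` are hypothetical, and the patterns assume nothing beyond the message formats quoted in this ticket.

```shell
#!/bin/sh
# Hypothetical helpers for inspecting a captured ffplay log.
# They match only the message formats quoted in this ticket.

# Print how many "anchors is not set" lines the log file "$1" contains.
count_anchor_warnings() {
    awk '/anchors is not set/ { n++ } END { print n + 0 }' "$1"
}

# Print how many detection lines (label plus confidence) the log file "$1" contains.
count_detections() {
    awk '/label: .*, confidence: / { n++ } END { print n + 0 }' "$1"
}
```

Since ffplay writes its log to stderr, redirecting it with `2> ffplay.log` on the command above and then running `count_anchor_warnings ffplay.log` and `count_detections ffplay.log` should give a non-zero warning count and zero detections while the bug is present.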

The expected behaviour is that the drawbox and drawtext filters draw on the image, and that information about the detected objects is logged to the terminal:

{{{
...
[Parsed_showinfo_3 @ 0x743b02f22b00] index: 0, region: (145, 1042) -> (740, 1495), label: car, confidence: 9918/10000.
[Parsed_showinfo_3 @ 0x743b02f22b00] index: 1, region: (551, 893) -> (551, 893), label: person, confidence: 4277/10000.
[Parsed_showinfo_3 @ 0x743b02f22b00] index: 2, region: (791, 1012) -> (791, 1012), label: person, confidence: 4069/10000.
[Parsed_showinfo_3 @ 0x743b02f22b00] index: 3, region: (1375, 1055) -> (1375, 1055), label: person, confidence: 5944/10000.
[Parsed_showinfo_3 @ 0x743b02f22b00] index: 4, region: (1505, 1065) -> (1505, 1065), label: person, confidence: 7363/10000.
[Parsed_showinfo_3 @ 0x743b02f22b00] index: 5, region: (794, 1011) -> (794, 1011), label: person, confidence: 8378/10000.
[Parsed_showinfo_3 @ 0x743b02f22b00] index: 6, region: (915, 1010) -> (915, 1010), label: person, confidence: 8011/10000.
[Parsed_showinfo_3 @ 0x743b02f22b00] index: 7, region: (1088, 1117) -> (1088, 1117), label: person, confidence: 9511/10000.
[Parsed_showinfo_3 @ 0x743b02f22b00] index: 8, region: (1385, 1052) -> (1385, 1052), label: person, confidence: 7692/10000.
[Parsed_showinfo_3 @ 0x743b02f22b00] index: 9, region: (1644, 1172) -> (1644, 1172), label: person, confidence: 9132/10000.
[Parsed_showinfo_3 @ 0x743b02f22b00] index: 10, region: (1801, 1173) -> (1801, 1173), label: person, confidence: 9828/10000.
[Parsed_showinfo_3 @ 0x743b02f22b00] index: 11, region: (2480, 1299) -> (2480, 1299), label: person, confidence: 9496/10000.
[Parsed_showinfo_3 @ 0x743b02f22b00] index: 12, region: (414, 1239) -> (414, 1239), label: car, confidence: 8610/10000.
[Parsed_showinfo_3 @ 0x743b02f22b00] index: 13, region: (422, 1265) -> (422, 1265), label: car, confidence: 9608/10000.
[Parsed_showinfo_3 @ 0x743b02f22b00] index: 14, region: (452, 1266) -> (452, 1266), label: car, confidence: 9239/10000.
...
}}}

--
Ticket URL: <https://trac.ffmpeg.org/ticket/11387>
FFmpeg <https://ffmpeg.org>
FFmpeg issue tracker