Re: [FFmpeg-devel] [PATCH V2 6/6] lavfi/dnn_classify: add filter dnn_classify for classification based on detection bounding boxes

2021-05-04 Thread Guo, Yejun


> -----Original Message-----
> From: Guo, Yejun 
> Sent: April 29, 2021 21:37
> To: ffmpeg-devel@ffmpeg.org
> Cc: Guo, Yejun 
> Subject: [PATCH V2 6/6] lavfi/dnn_classify: add filter dnn_classify for classification based on detection bounding boxes
> 
will push tomorrow if there's no objection, thanks.


[FFmpeg-devel] [PATCH V2 6/6] lavfi/dnn_classify: add filter dnn_classify for classification based on detection bounding boxes

2021-04-29 Thread Guo, Yejun
Classification is done on every detection bounding box in the frame's side data,
which is the result of object detection (filter dnn_detect).

Please refer to the commit log of dnn_detect for the detection material,
and see below for the classification material.

- download material for classification:
wget https://github.com/guoyejun/ffmpeg_dnn/raw/main/models/openvino/2021.1/emotions-recognition-retail-0003.bin
wget https://github.com/guoyejun/ffmpeg_dnn/raw/main/models/openvino/2021.1/emotions-recognition-retail-0003.xml
wget https://github.com/guoyejun/ffmpeg_dnn/raw/main/models/openvino/2021.1/emotions-recognition-retail-0003.label

- run command as:
./ffmpeg -i cici.jpg -vf dnn_detect=dnn_backend=openvino:model=face-detection-adas-0001.xml:input=data:output=detection_out:confidence=0.6:labels=face-detection-adas-0001.label,dnn_classify=dnn_backend=openvino:model=emotions-recognition-retail-0003.xml:input=data:output=prob_emotion:confidence=0.3:labels=emotions-recognition-retail-0003.label:target=face,showinfo -f null -

We'll see the detection and classification results as below:
[Parsed_showinfo_2 @ 0x55b7d25e77c0]   side data - detection bounding boxes:
[Parsed_showinfo_2 @ 0x55b7d25e77c0] source: face-detection-adas-0001.xml, emotions-recognition-retail-0003.xml
[Parsed_showinfo_2 @ 0x55b7d25e77c0] index: 0,  region: (1005, 813) -> (1086, 905), label: face, confidence: 10000/10000.
[Parsed_showinfo_2 @ 0x55b7d25e77c0]            classify:  label: happy, confidence: 6757/10000.
[Parsed_showinfo_2 @ 0x55b7d25e77c0] index: 1,  region: (888, 839) -> (967, 926), label: face, confidence: 6917/10000.
[Parsed_showinfo_2 @ 0x55b7d25e77c0]            classify:  label: anger, confidence: 4320/10000.
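
For reference, the classify labels and confidences that showinfo prints above are
carried on the frame as detection bounding box side data. Below is a minimal sketch
of a consumer reading that side data, assuming the AVDetectionBBox fields and the
av_get_detection_bbox() helper from libavutil/detection_bbox.h as of this patch set
(illustration only, not code from this patch):

#include <stdio.h>
#include <libavutil/detection_bbox.h>
#include <libavutil/frame.h>
#include <libavutil/rational.h>

/* Print the detection and classification results attached to a frame.
 * Sketch only: assumes the AV_FRAME_DATA_DETECTION_BBOXES side data
 * produced by dnn_detect and extended by dnn_classify. */
static void dump_bboxes(const AVFrame *frame)
{
    AVFrameSideData *sd =
        av_frame_get_side_data(frame, AV_FRAME_DATA_DETECTION_BBOXES);
    AVDetectionBBoxHeader *header;
    uint32_t i, j;

    if (!sd)
        return;
    header = (AVDetectionBBoxHeader *)sd->data;

    for (i = 0; i < header->nb_bboxes; i++) {
        const AVDetectionBBox *bbox = av_get_detection_bbox(header, i);

        printf("region (%d, %d) -> (%d, %d), label %s, confidence %.4f\n",
               bbox->x, bbox->y, bbox->x + bbox->w, bbox->y + bbox->h,
               bbox->detect_label, av_q2d(bbox->detect_confidence));

        /* dnn_classify appends one entry per classification whose score
         * passed the filter's confidence threshold */
        for (j = 0; j < bbox->classify_count; j++)
            printf("    classify: %s, confidence %.4f\n",
                   bbox->classify_labels[j],
                   av_q2d(bbox->classify_confidences[j]));
    }
}

The confidences are stored as rationals, which showinfo prints as num/den;
av_q2d() above converts them to a plain float.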

Signed-off-by: Guo, Yejun 
---
The main change in V2 of this patch set is a rebase onto the latest code,
resolving the conflicts.

 configure |   1 +
 doc/filters.texi  |  39 
 libavfilter/Makefile  |   1 +
 libavfilter/allfilters.c  |   1 +
 libavfilter/vf_dnn_classify.c | 330 ++
 5 files changed, 372 insertions(+)
 create mode 100644 libavfilter/vf_dnn_classify.c

diff --git a/configure b/configure
index 820f719a32..9f2dfaf2d4 100755
--- a/configure
+++ b/configure
@@ -3550,6 +3550,7 @@ derain_filter_select="dnn"
 deshake_filter_select="pixelutils"
 deshake_opencl_filter_deps="opencl"
 dilation_opencl_filter_deps="opencl"
+dnn_classify_filter_select="dnn"
 dnn_detect_filter_select="dnn"
 dnn_processing_filter_select="dnn"
 drawtext_filter_deps="libfreetype"
diff --git a/doc/filters.texi b/doc/filters.texi
index 36e35a175b..b405cc5dfb 100644
--- a/doc/filters.texi
+++ b/doc/filters.texi
@@ -10127,6 +10127,45 @@ ffmpeg -i INPUT -f lavfi -i nullsrc=hd720,geq='r=128+80*(sin(sqrt((X-W/2)*(X-W/2
 @end example
 @end itemize
 
+@section dnn_classify
+
+Do classification with deep neural networks based on bounding boxes.
+
+The filter accepts the following options:
+
+@table @option
+@item dnn_backend
+Specify which DNN backend to use for model loading and execution. This option
+accepts only openvino now; the tensorflow backend will be added later.
+
+@item model
+Set path to model file specifying network architecture and its parameters.
+Note that different backends use different file formats.
+
+@item input
+Set the input name of the dnn network.
+
+@item output
+Set the output name of the dnn network.
+
+@item confidence
+Set the confidence threshold (default: 0.5).
+
+@item labels
+Set path to label file specifying the mapping between label id and name.
+Each label name is written on one line; trailing spaces and empty lines are skipped.
+The first line is the name of label id 0,
+and the second line is the name of label id 1, etc.
+The label id is used as the name if the label file is not provided.
+
+@item backend_configs
+Set the configs to be passed into the backend.
+
+For the tensorflow backend, you can set its configs with the @option{sess_config}
+option; please use tools/python/tf_sess_config.py to get the configs for your system.
+
+@end table
+
 @section dnn_detect
 
 Do object detection with deep neural networks.
diff --git a/libavfilter/Makefile b/libavfilter/Makefile
index 5a287364b0..6c22d0404e 100644
--- a/libavfilter/Makefile
+++ b/libavfilter/Makefile
@@ -243,6 +243,7 @@ OBJS-$(CONFIG_DILATION_FILTER)   += vf_neighbor.o
 OBJS-$(CONFIG_DILATION_OPENCL_FILTER)+= vf_neighbor_opencl.o opencl.o \
 opencl/neighbor.o
 OBJS-$(CONFIG_DISPLACE_FILTER)   += vf_displace.o framesync.o
+OBJS-$(CONFIG_DNN_CLASSIFY_FILTER)   += vf_dnn_classify.o
 OBJS-$(CONFIG_DNN_DETECT_FILTER) += vf_dnn_detect.o
 OBJS-$(CONFIG_DNN_PROCESSING_FILTER) += vf_dnn_processing.o
 OBJS-$(CONFIG_DOUBLEWEAVE_FILTER)+= vf_weave.o
diff --git a/libavfilter/allfilters.c b/libavfilter/allfilters.c
index 931d7dbb0d..87c3661cf4 100644
--- a/libavfilter/allfilters.c
+++ b/libavfilter/allfilters.c