Re: [FFmpeg-devel] A few filter questions

2014-07-20 Thread Clément Bœsch
On Fri, Jul 18, 2014 at 12:38:43PM +0200, Gerion Entrup wrote:
 On Thursday, 17 July 2014 at 17:24:35, Clément Bœsch wrote:
  On Thu, Jul 17, 2014 at 04:56:08PM +0200, Gerion Entrup wrote:
  [...]
  
Also, you still have the string metadata possibility (git grep SET_META
libavfilter).
   
   Hmm, thank you, I will take a look at it. If I see it right, it is used to
   fill a dictionary per frame with some kind of data?
  
  Strings only, so you'll have to find a serialization somehow. Maybe simply
  an ascii hex string or something. But yeah, it just allows you to map some
  key → value string pairs to the frames passing by in the filter.
  
  How large is the information to store per frame?
 82 bytes per frame for the finesignature.
 It could be split again into three parts: a one-byte confidence, a 5-byte
 words vector, and a 76-byte framesignature, something like:
 struct finesignature {
     uint8_t confidence;
     uint8_t words[5];
     uint8_t framesignature[76];
 };
 152 bytes per 90 frames for the coarsesignature.
 (Note that there are two coarsesignatures with an offset of 45 frames:
 0-89
 45-134
 90-179
 ...)
 
 If I see it right, there are two possibilities:
 Write as raw chars in the output (looks crappy, but needs the same amount of
 memory).
 Write as ascii hex in the output (looks nice, but needs twice as much memory).

It won't be encoded in the output (at least I'm not sure which muxer would
store this metadata), so the bandwidth issue is not a problem. An ascii hex
string would be nice IMO: it's extremely small, would display fine, and would
be easily parsable.
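
For illustration, a minimal sketch of what that could look like inside the
filter. The key name and the helper are made up; av_dict_set() on
frame->metadata is the actual mechanism:

  #include <errno.h>
  #include <stdint.h>
  #include <stdio.h>
  #include "libavutil/dict.h"
  #include "libavutil/error.h"
  #include "libavutil/frame.h"

  /* Serialize the 82-byte finesignature as an ascii hex string and
   * attach it to the frame as per-frame metadata. */
  static int attach_finesignature(AVFrame *frame,
                                  const uint8_t *sig, size_t len)
  {
      char hex[2 * 82 + 1];
      size_t i;

      hex[0] = '\0';
      if (len > 82)
          return AVERROR(EINVAL);
      for (i = 0; i < len; i++)
          snprintf(hex + 2 * i, 3, "%02x", sig[i]);
      return av_dict_set(&frame->metadata, "lavfi.signature.fine", hex, 0);
  }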

[...]

-- 
Clément B.




Re: [FFmpeg-devel] A few filter questions

2014-07-18 Thread Gerion Entrup
On Thursday, 17 July 2014 at 17:24:35, Clément Bœsch wrote:
 On Thu, Jul 17, 2014 at 04:56:08PM +0200, Gerion Entrup wrote:
 [...]
 
   Also, you still have the string metadata possibility (git grep SET_META
   libavfilter).
  
  Hmm, thank you, I will take a look at it. If I see it right, it is used to
  fill a dictionary per frame with some kind of data?
 
 Strings only, so you'll have to find a serialization somehow. Maybe simply
 an ascii hex string or something. But yeah, it just allows you to map some
 key → value string pairs to the frames passing by in the filter.
 
 How large is the information to store per frame?
82 bytes per frame for the finesignature.
It could be split again into three parts: a one-byte confidence, a 5-byte
words vector, and a 76-byte framesignature, something like:
struct finesignature {
    uint8_t confidence;
    uint8_t words[5];
    uint8_t framesignature[76];
};
152 bytes per 90 frames for the coarsesignature.
(Note that there are two coarsesignatures with an offset of 45 frames:
0-89
45-134
90-179
...)

If I see it right, there are two possibilities:
Write as raw chars in the output (looks crappy, but needs the same amount of
memory).
Write as ascii hex in the output (looks nice, but needs twice as much memory).

 
 [...]
 
   stdout/stderr really isn't a good thing. Using metadata is way better
   because you can output it from ffprobe and parse it in various output
   formats (XML, CSV, JSON, ...).
  
  Sounds good…
 
 tools/normalize.py makes use of such a feature if you want examples (that's
 the -of option of ffprobe).
Ok.
 
 [...]
 
   Am I understanding your question right?
  
  No ;), but anyway thanks for your answer. In your 2nd method, is your filter
  a VV->V filter? Am I right that this filter can then also take only one
  stream? Said another way: can a VV->V filter also behave as a V->V filter?
 Yes, fieldmatch is a (complex) example of this. But typically it's simply
 a filter with dynamic inputs, based on the user input. The simplest
 example would be the split filter. Look at it for an example of dynamic
 allocation of the number of outputs based on the user input (-vf split=4 is
 a V->VVVV filter).
Hmm, interesting code, thank you.
 
 [...]
 
   Check tools/normalize.py, it's using ebur128 and the metadata system.
  
  That's what I mean. Someone has to write an external script which calls
  ffmpeg/ffprobe two times, parses stdout of the first call and passes it to
  the filter options of the second call. As I see it, there is no direct way
  to do something like:
  ffmpeg -i foo -af volume=mode=autodetect normalized.opus
 
 We have had this discussion several times regarding real time with that
 filter. If we do a 2-pass, that's simply because it's more efficient.
 Typically, some live normalization could be done easily (we had patches for
 this): ebur128 already attaches some metadata to frames, so a following
 filter such as volume could reuse them, something like -filter_complex
 ebur128=metadata=1,volume=metadata.
 
 [...]



[FFmpeg-devel] A few filter questions

2014-07-17 Thread Gerion Entrup
Good day,

I'm currently working on a video signature filter for ffmpeg. This allows you
to fingerprint videos.
This fingerprint is built up of 9mb/s of bits or 2-3 mb/s bits compressed.

In this context, a few questions come to mind:
- Should I print this whole bitstream to stdout/stderr at the end? Or would it
be a better choice to make a dedicated stream out of this? But which type of
stream would this be?
  (btw, the video signature algorithm needs 90 consecutive frames, so I can
theoretically write something somewhere every 90 frames.)
- If I print the whole bitstream to stdout/stderr (my current implementation),
is there a possibility to use this later in an external program? The only
other globally analyzing filter I found is volumedetect. At the end, this
filter prints the calculated results to the console via print_stats. Is there
a possibility within the API for an external program to use these values, or
do I have to grep the output?
  A similar example is AcoustID (a fingerprinting technique for audio).
Currently chromaprint (the AcoustID library) provides an executable (fpcalc)
to calculate the AcoustID. It uses FFmpeg to decode the audio and then its own
library to calculate the fingerprint. The better way, I think, would be to
have an ffmpeg filter for this. But is it possible to use the calculated
number in an external program without grepping the output?

Another thing that came to my mind: can a filter force other filters into the
filterchain? As I see it, when I force GRAY8 only in my filter, it
automatically enables the scale filter, too. The reason I ask is the lookup
for my filter. Currently my filter analyzes a video and then produces a lot of
numbers. To compare two videos and decide whether they match or not, these
numbers have to be compared. I see three possibilities:
1. Write a VV->V filter. Reimplement (copy) the code from the V->V signature
filter and give a boolean as output (match or no match).
2. Take the V->V filter and write a python (or whatever) script that fetches
the output and then calculates the rest.
3. Write a VV->V filter, but enforce that the normal signature filter is
executed first on both streams, use the result and then calculate the matching
type. Unfortunately I have no idea how to do this and whether it is possible
at all. Can you give me advice?

The last possibility would also allow something like two-pass volume
normalisation. Currently there are a volumedetect and a volume filter. To
normalize, one could run volumedetect, then fetch the output and put the
values into the volume filter, but I currently don't see a way to do this
automatically, directly in ffmpeg.
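
Manually, the two-pass dance looks something like this today (the 5dB gain
would be read off volumedetect's max_volume line from the first pass):
  ffmpeg -i in.wav -af volumedetect -f null -
  ffmpeg -i in.wav -af volume=5dB out.wav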

(Once the filter is in a good state, I will try to bring it upstream.)

Best,
Gerion



Re: [FFmpeg-devel] A few filter questions

2014-07-17 Thread Clément Bœsch
On Thu, Jul 17, 2014 at 12:33:41PM +0200, Gerion Entrup wrote:
 Good day,
 
 I'm currently working on a video signature filter for ffmpeg. This allows
 you to fingerprint videos.

Oh, nice.

 This fingerprint is built up of 9mb/s of bits or 2-3 mb/s bits compressed.
 
 In this context, a few questions come to mind:
 - Should I print this whole bitstream to stdout/stderr at the end? Or would
 it be a better choice to make a dedicated stream out of this? But which type
 of stream would this be?

What does the fingerprint look like? Could it make sense as a gray video
output, a fractal, or maybe some kind of audio signal?

Also, you still have the string metadata possibility (git grep SET_META
libavfilter).

 (btw, the video signature algorithm needs 90 consecutive frames, so I can
 theoretically write something somewhere every 90 frames.)

Do you cache all these frames or just update your caches/stats & drop
them?

 - If I print the whole bitstream to stdout/stderr (my current
 implementation), is there a possibility to use this later in an external
 program? The only other globally analyzing filter I found is volumedetect.
 At the end, this filter prints the calculated results to the console via
 print_stats. Is there a possibility within the API for an external program
 to use these values, or do I have to grep the output?

stdout/stderr really isn't a good thing. Using metadata is way better
because you can output it from ffprobe and parse it in various output
formats (XML, CSV, JSON, ...).

Another solution I can now think of is to simply pass an output file as an
option to the filter. That's typically how we do the 2-pass thing with the
vidstab filters.
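
E.g. (result/input are the relevant vidstabdetect/vidstabtransform options;
the file name is arbitrary):
  ffmpeg -i in.mkv -vf vidstabdetect=result=transforms.trf -f null -
  ffmpeg -i in.mkv -vf vidstabtransform=input=transforms.trf out.mkv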

[...]
 Another thing that came to my mind: can a filter force other filters into
 the filterchain? As I see it, when I force GRAY8 only in my filter, it
 automatically enables the scale filter, too.

Some filters are inserted automatically for conversion & constraints, but
that's not decided by the filters but by the framework itself.

 The reason I ask is the lookup for my filter. Currently my filter analyzes
 a video and then produces a lot of numbers. To compare two videos and decide
 whether they match or not, these numbers have to be compared. I see three
 possibilities:
 1. Write a VV->V filter. Reimplement (copy) the code from the V->V signature
 filter and give a boolean as output (match or no match).
 2. Take the V->V filter and write a python (or whatever) script that fetches
 the output and then calculates the rest.
 3. Write a VV->V filter, but enforce that the normal signature filter is
 executed first on both streams, use the result and then calculate the
 matching type. Unfortunately I have no idea how to do this and whether it is
 possible at all. Can you give me advice?
 

So if you output a file in the filter itself:
  ffmpeg -i video   -vf fingerprint=video.sig -f null -
  ffmpeg -i another -vf fingerprint=video.sig:check=1 -f null -

Or if you save the signature stream in a video (in gray8 for instance):
  ffmpeg -i video   -vf fingerprint -c:v ffv1 sig.nut
  ffmpeg -i another -i sig.nut -vf '[0][1] fingerprint=mode=check' -f null -

The 2nd method is better because it doesn't require file handling in the
library, and it also allows stuff like using a diff filter (if you also
apply fingerprint - not with mode=check - on `another`).

Am I understanding your question right?

 The last possibility would also allow something like two-pass volume
 normalisation. Currently there are a volumedetect and a volume filter. To
 normalize, one could run volumedetect, then fetch the output and put the
 values into the volume filter, but I currently don't see a way to do this
 automatically, directly in ffmpeg.

Check tools/normalize.py, it's using ebur128 and the metadata system.

 
 (Once the filter is in a good state, I will try to bring it upstream.)
 

Cool

 Best,
 Gerion
 

-- 
Clément B.




Re: [FFmpeg-devel] A few filter questions

2014-07-17 Thread Gerion Entrup
On Thursday, 17 July 2014 at 13:00:13, Clément Bœsch wrote:
 On Thu, Jul 17, 2014 at 12:33:41PM +0200, Gerion Entrup wrote:
  Good day,
  
  I'm currently working on a video signature filter for ffmpeg. This allows
  you to fingerprint videos.
 
 Oh, nice.
 
  This fingerprint is built up of 9mb/s of bits or 2-3 mb/s bits compressed.
Argh, fail, sorry. I meant: 9 MB per hour of video (and 2-3 MB per hour
compressed).
  
  In this context, a few questions come to mind:
  - Should I print this whole bitstream to stdout/stderr at the end? Or
  would it be a better choice to make a dedicated stream out of this? But
  which type of stream would this be?
 
 What does the fingerprint look like? Could it make sense as a gray video
 output, a fractal, or maybe some kind of audio signal?
There are finesignatures per frame and coarsesignatures per 90 finesignatures.
A coarsesignature consists of binarized histograms (0 or 1 possible as count).
A finesignature is mainly a vector of 380 difference values between -128 and
127, which are ternarized into 0, 1, or 2.
(See the MPEG-7 standard for more details.)
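
Just to illustrate the ternarization step (the threshold is a placeholder
here; the real thresholds come from the standard):

  #include <stdint.h>

  /* Map a difference value d in [-128,127] to {0,1,2}; th stands in
   * for the threshold defined in the MPEG-7 spec. */
  static uint8_t ternarize(int d, int th)
  {
      if (d <= -th)
          return 0;
      if (d >= th)
          return 2;
      return 1;
  }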

I doubt this makes a good video or audio stream.
Interpreting it as video would definitely make sense in some way, but metadata
looks more useful.

 
 Also, you still have the string metadata possibility (git grep SET_META
 libavfilter).
Hmm, thank you, I will take a look at it. If I see it right, it is used to fill 
a dictionary per frame with some kind of data?

 
  (btw, the video signature algorithm needs 90 consecutive frames, so I can
  theoretically write something somewhere every 90 frames.)
 
 Do you cache all these frames or just update your caches/stats & drop
 them?
ATM I don't cache the frames, but the whole signature. As said above, the
coarsesignatures (the part which needs the 90 frames) are calculated only from
the finesignatures (and the finesignatures are cached anyway).
 
  - If I print the whole bitstream to stdout/stderr (my current
  implementation), is there a possibility to use this later in an external
  program? The only other globally analyzing filter I found is volumedetect.
  At the end, this filter prints the calculated results to the console via
  print_stats. Is there a possibility within the API for an external
  program to use these values, or do I have to grep the output?
 
 stdout/stderr really isn't a good thing. Using metadata is way better
 because you can output it from ffprobe and parse it in various output
 formats (XML, CSV, JSON, ...).
Sounds good…
 
 Another solution I can now think of is to simply pass an output file as an
 option to the filter. That's typically how we do the 2-pass thing with the
 vidstab filters.
I don't like output files. If you want to write a program that performs a
lookup of signatures stored somewhere in a database, and this program uses
ffmpeg internally and then always has to write a file and read it again, it's
not that elegant.
(btw, an example of such a program is MusicBrainz Picard, but for AcoustID
;))
 
 [...]
 
  Another thing that came to my mind: can a filter force other filters into
  the filterchain? As I see it, when I force GRAY8 only in my filter, it
  automatically enables the scale filter, too.
 
 Some filters are inserted automatically for conversion & constraints, but
 that's not decided by the filters but by the framework itself.
 
  The reason I ask is the lookup for my filter. Currently my filter analyzes
  a video and then produces a lot of numbers. To compare two videos and
  decide whether they match or not, these numbers have to be compared. I see
  three possibilities:
  1. Write a VV->V filter. Reimplement (copy) the code from the V->V
  signature filter and give a boolean as output (match or no match).
  2. Take the V->V filter and write a python (or whatever) script that
  fetches the output and then calculates the rest.
  3. Write a VV->V filter, but enforce that the normal signature filter is
  executed first on both streams, use the result and then calculate the
  matching type. Unfortunately I have no idea how to do this and whether it
  is possible at all. Can you give me advice?
 
 So if you output a file in the filter itself:
   ffmpeg -i video   -vf fingerprint=video.sig -f null -
   ffmpeg -i another -vf fingerprint=video.sig:check=1 -f null -
 
 Or if you save the signature stream in a video (in gray8 for instance):
   ffmpeg -i video   -vf fingerprint -c:v ffv1 sig.nut
   ffmpeg -i another -i sig.nut -vf '[0][1] fingerprint=mode=check' -f null -
 
 The 2nd method is better because it doesn't require file handling in the
 library, and it also allows stuff like using a diff filter (if you also
 apply fingerprint - not with mode=check - on `another`).
 
 Am I understanding your question right?
No ;), but anyway thanks for your answer. In your 2nd method, is your filter a
VV->V filter? Am I right that this filter can then also take only one stream?
Said another way: can a VV->V filter also behave as a V->V filter?

Re: [FFmpeg-devel] A few filter questions

2014-07-17 Thread Clément Bœsch
On Thu, Jul 17, 2014 at 04:56:08PM +0200, Gerion Entrup wrote:
[...]
  Also, you still have the string metadata possibility (git grep SET_META
  libavfilter).
 Hmm, thank you, I will take a look at it. If I see it right, it is used to
 fill a dictionary per frame with some kind of data?
 

Strings only, so you'll have to find a serialization somehow. Maybe simply
an ascii hex string or something. But yeah, it just allows you to map some
key → value string pairs to the frames passing by in the filter.

How large is the information to store per frame?

[...]
  stdout/stderr really isn't a good thing. Using metadata is way better
  because you can output it from ffprobe and parse it in various output
  formats (XML, CSV, JSON, ...).
 Sounds good…

tools/normalize.py makes use of such a feature if you want examples (that's
the -of option of ffprobe).
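
Assuming your filter exported a hypothetical lavfi.signature.fine key (filter
and key names made up here), dumping it per frame would look like:
  ffprobe -v error -of json \
          -show_entries frame_tags=lavfi.signature.fine \
          -f lavfi "movie=input.mkv,signature"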

[...]
  Am I understanding your question right?
 No ;), but anyway thanks for your answer. In your 2nd method, is your filter
 a VV->V filter? Am I right that this filter can then also take only one
 stream?
 Said another way: can a VV->V filter also behave as a V->V filter?

Yes, fieldmatch is a (complex) example of this. But typically it's simply
a filter with dynamic inputs, based on the user input. The simplest
example would be the split filter. Look at it for an example of dynamic
allocation of the number of outputs based on the user input (-vf split=4 is
a V->VVVV filter).
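
A rough sketch of that pattern for inputs (split does the same for outputs;
FingerprintContext and its nb_inputs option are hypothetical, and
ff_insert_inpad() is the internal helper this would rely on):

  #include "libavutil/attributes.h"
  #include "libavutil/avstring.h"
  #include "avfilter.h"
  #include "internal.h"

  typedef struct FingerprintContext {
      const AVClass *class;
      int nb_inputs;              /* hypothetical user option */
  } FingerprintContext;

  /* Create one video input pad per requested input, the way split
   * creates its outputs dynamically. */
  static av_cold int init(AVFilterContext *ctx)
  {
      FingerprintContext *s = ctx->priv;
      int i, ret;

      for (i = 0; i < s->nb_inputs; i++) {
          AVFilterPad pad = { 0 };

          pad.type = AVMEDIA_TYPE_VIDEO;
          pad.name = av_asprintf("input%d", i);
          if (!pad.name)
              return AVERROR(ENOMEM);
          if ((ret = ff_insert_inpad(ctx, i, &pad)) < 0)
              return ret;
      }
      return 0;
  }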

[...]
  Check tools/normalize.py, it's using ebur128 and the metadata system.
 That's what I mean. Someone has to write an external script which calls
 ffmpeg/ffprobe two times, parses stdout of the first call and passes it to
 the filter options of the second call. As I see it, there is no direct way
 to do something like:
 ffmpeg -i foo -af volume=mode=autodetect normalized.opus

We have had this discussion several times regarding real time with that
filter. If we do a 2-pass, that's simply because it's more efficient.
Typically, some live normalization could be done easily (we had patches for
this): ebur128 already attaches some metadata to frames, so a following
filter such as volume could reuse them, something like -filter_complex
ebur128=metadata=1,volume=metadata.
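
To give an idea, the consuming filter could read it per frame roughly like
this (lavfi.r128.I is one of the keys ebur128 really sets; deriving a gain
from the value is left out):

  #include <stdlib.h>
  #include "libavutil/dict.h"
  #include "libavutil/frame.h"

  /* Return the integrated loudness attached by ebur128=metadata=1,
   * or `fallback` if the frame carries no such metadata. */
  static double frame_loudness(const AVFrame *frame, double fallback)
  {
      AVDictionaryEntry *e = av_dict_get(frame->metadata, "lavfi.r128.I",
                                         NULL, 0);
      return e ? strtod(e->value, NULL) : fallback;
  }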

[...]

-- 
Clément B.




Re: [FFmpeg-devel] A few filter questions

2014-07-17 Thread Nicolas George
On nonidi, 29 Messidor, year CCXXII, Clément Bœsch wrote:
 We have had this discussion several times regarding real time with that
 filter. If we do a 2-pass, that's simply because it's more efficient.
 Typically, some live normalization could be done easily (we had patches for
 this): ebur128 already attaches some metadata to frames, so a following
 filter such as volume could reuse them, something like -filter_complex
 ebur128=metadata=1,volume=metadata.

I believe you are wrong in this paragraph: we do two passes for
normalization because that is the only way of doing it without distortion:
the level of volume adjustment depends on the whole stream.

Normalization can be done in a single pass, with distortion, but currently
no filter is capable of smoothing the measurements computed by ebur128 to
make the distortion inaudible. Patches welcome.

Regards,

-- 
  Nicolas George

