Re: [FFmpeg-devel] AVWriter API

2020-01-14 Thread Nicolas George
Hi. Thanks for looking into it with an open mind.

Hendrik Leppkes (12020-01-12):
> While this argument sounds sensible on the surface, I can't help but
> wonder if it really makes sense.
> Sure, some returned strings may be able to be wrapped straight into
> whatever structure they use. But what about  const char* strings? Will
> we need to allocate a buffer and copy those in the future? Considering
> I can practically use those inline in logging printfs now, that would
> be an extreme increase in complexity.

I want to apply consistency with pragmatism, not with dogmatism. If what
we have at hand is a static const char *, the of course that is what we
return: it is the simplest and the most efficient. It is especially true
for enum → string functions: av_foobar_get_string(). (Note that you
can't use them really directly in printf, since "%s", NULL can
segfault.)

But in this case, let's also provide the av_foobar_write() variant,
because all it costs is a macro call in the header:
FF_DEFINE_WRITE_FROM_TO_STRING(av_foobar, AVFoobar, "unknown")

But as soon as a function needs to build a string in a buffer, let us
use this, because it is a very simple solution to let the programme
choose what kind of buffer to use.

> Also, what happens to strings that are an input to our libraries? Will
> those no longer be char pointers, possibly malloc'ed (depending on
> where they are used)?

For this, there are too many cases, I have yet to find a unifying
solution. Note that a lot of string implementations can give read-only
access to their internal buffer, which means we can accept const char *
and let the application decide how to allocate it.

> Additionally, any widely used custom string class will have a direct
> conversion from/to char pointers, so is this really such a huge burden
> on application developers? I work on Windows, where everything is
> wchar_t, and I never thought this was a problem, since I have to deal
> with this for every library. In fact the proposed API won't really
> help, since the entire API is still designed around UTF-8, so what I
> would do is wrap the dynbuf writter and just convert to wide-char when
> the string is complete. In short, I do exactly what I do now, just
> more complicated.

I am not sure what your point is about wchar_t, but if you think it
would be useful to provide AVWBufWriter and AVWDynbufWriter,
counterparts to AVBufWriter and AVDynbufWriter that output into wchar_t
buffers, it is entirely possible to do it.

Sure, we can continue doing what we have been doing: allocate a string,
fill it, allocate a new string, convert, allocate yet another string,
copy. This is what Java programmers do all day. But if we use Java
patterns in C, we get the drawbacks of Java patterns plus the fact that
the C allocator is not optimized for that.

My proposal is not only rooted in convenience, it is also rooted in
efficiency. String operations are not part of the inner loops of codecs,
but we still make an awful lot of it. For example,
av_get_channel_layout_string() can be called by verbose code for every
audio frame (cf. af_ashowinfo), that can make thousands of times per
second for low-latency codecs. It's a shame to do a malloc() only to
free it immediately and have the log layer discard the string anyway.

You may not have noticed this, but when you use av_dynbuf_writer(),
which is the functional equivalent of returning a mallocated buffer and
the recommended way, it will first use a buffer on stack, and only use
malloc() if it is not enough. Anton's implementation of
av_channel_layout_describe() always makes a malloc; with AVWriter, it
would only do so if there are more than 200 channels or the caller wants
a permanent copy of the string.

I think this consideration and the factoring of error check are enough
to warrant the little bit of complication for the caller. And the
possibility to output to many kinds of strings comes on top of that.

Regards,

-- 
  Nicolas George


signature.asc
Description: PGP signature
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

Re: [FFmpeg-devel] AVWriter API

2020-01-12 Thread Hendrik Leppkes
On Sun, Jan 12, 2020 at 5:45 PM Josh de Kock  wrote:
>
>
> Hi,
>
> I have very little time to work on FFmpeg lately but this thread caught
> my interest.
>
> Nicolas George writes:
> > Anton Khirnov (12020-01-07):
> >> And most importantly - strings are simple. Every C programmer (worth the
> >> name) understands strings. But now you'd be forcing every API user to
> >> understand and remember this new API.
> >
> > [...]
> >
> > We will not play the game of reinventing yet another data structure for
> > strings, but let us not play the game of pretending the issue does not
> > exist: it only makes the problem more stinging.
>
> I think this is really important, AVWriter allows the user to make use
> of whichever implementation of strings their application already uses
> and not have to deal with yet-another-string-type.
>

While this argument sounds sensible on the surface, I can't help but
wonder if it really makes sense.
Sure, some returned strings may be able to be wrapped straight into
whatever structure they use. But what about  const char* strings? Will
we need to allocate a buffer and copy those in the future? Considering
I can practically use those inline in logging printfs now, that would
be an extreme increase in complexity.

Also, what happens to strings that are an input to our libraries? Will
those no longer be char pointers, possibly malloc'ed (depending on
where they are used)?

It sounds to me like you won't really be able to get rid of char
pointers, static or malloc'ed, without introducing a lot of pain
points. Sure, typical output functions may give you AVWriter things,
but other things will still  be char*.

Additionally, any widely used custom string class will have a direct
conversion from/to char pointers, so is this really such a huge burden
on application developers? I work on Windows, where everything is
wchar_t, and I never thought this was a problem, since I have to deal
with this for every library. In fact the proposed API won't really
help, since the entire API is still designed around UTF-8, so what I
would do is wrap the dynbuf writter and just convert to wide-char when
the string is complete. In short, I do exactly what I do now, just
more complicated.

Every library outputs strings in a certain manner, its either
char-pointer strings that you get to free, or their own string-type. I
can't help but think that this feels worse and more complicated then
our own string type.

- Hendrik
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

Re: [FFmpeg-devel] AVWriter API

2020-01-12 Thread Josh de Kock

Hi,

I have very little time to work on FFmpeg lately but this thread caught
my interest.

Nicolas George writes:
> Anton Khirnov (12020-01-07):
>> And most importantly - strings are simple. Every C programmer (worth the
>> name) understands strings. But now you'd be forcing every API user to
>> understand and remember this new API.
>
> [...]
>
> We will not play the game of reinventing yet another data structure for
> strings, but let us not play the game of pretending the issue does not
> exist: it only makes the problem more stinging.

I think this is really important, AVWriter allows the user to make use
of whichever implementation of strings their application already uses
and not have to deal with yet-another-string-type.

> Who wants a mallocated char pointer, really? Who things it is a good
> idea for a string?
>
> Not applications that use another allocator than malloc(), obviously.

Not relevant to this thread but: How would an application not using the
traditional malloc() allocator work use FFmpeg at all if it didn't 'just
work' by setting MALLOC_PREFIX?

> [...]
>
> Etc. I hope you see my point and I do not need to add Java and PHP to
> these examples.

It's an issue that everyone is making their own string type, we do
definitely not want to do this (one of the reasons I think this API is
in the right direction).

> By returning a mallocated char *, you are saying to the authors of these
> applications "screw you, I don't care about your use case". If you don't
> care about most applications' use cases, don't design public APIs.

I guess the argument against this would be 'well, you can just copy it
to whichever string implementation you need', which maybe isn't ideal
but it 'works' currently.

> This AVWriter API addresses all that. If the caller wants a mallocated
> char *, they can get it, with almost no extra code. But if they want
> something else, they can get it.

Important that we won't lose any functionality with this new API.

> And even if it is used only for a mallocated char *, this code still
> brings the benefit of factoring error checks. Error checks are "not to
> be avoided", but they are not to be littered all over the code either:
> they are often tricky, with the risk of leaks or double frees, and they
> are almost never tested. Any change that centralizes them is good to
> take, and this one of the things AVWriter does, centralize error checks.

Deferring error checks properly and centralising them means they can be
better handled by a user. This is a good approach.

> As for the extra complexity. Yes, there is some. There is some
> significant complexity in the implementation, of course. If it is
> anybody's problem, it's mine.
>
> There is a little complexity for the caller. Look at the code: very very
> little complexity. Much less than what is required learning to use any
> FFmpeg API anyway.
>
> There is no complexity for the callee. If anything, using AVWriter makes
> the callee code simpler.

I don't think the complexity is a big issue here.

> I understand that you may be reluctant to embrace creative APIs, but
> please try to go beyond that reluctance to see the actual benefits and
> costs.

For the most part I think the concept is good, and it definitely solves
a problem which exists but the million dollar question is:

Are strings a big enough issue at the moment that we should introduce a
new API to fix them?

Though I guess if someone is willing to implement a fix, then the answer
to this doesn't even matter at all, and should always be 'yes'--you have
my support for this set, at least.

--
Josh
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

Re: [FFmpeg-devel] AVWriter API

2020-01-12 Thread Nicolas George
Anton Khirnov (12020-01-07):
> Replying to this thread, since it contains the motivation.

Thanks.

> And most importantly - strings are simple. Every C programmer (worth the
> name) understands strings. But now you'd be forcing every API user to
> understand and remember this new API.

I'll reply to only that, because it contains the root of the problem.
No, strings are not simple. Strings only appear simple when you don't do
anything much with them. But as soon as you do, you need extra
complexity to implement them properly and efficiently.

And if you don't realize this is true, just observe that strings are so
simple that about every other library and most languages implement their
own version of strings, with the specific optimizations they think they
need.

Of course, FFmpeg does not do much with strings, and therefore can
pretend they are simple, but we are not talking about inside FFmpeg
here, we are talking about interacting with an application, an
application that probably does something with strings.

We will not play the game of reinventing yet another data structure for
strings, but let us not play the game of pretending the issue does not
exist: it only makes the problem more stinging.

Who wants a mallocated char pointer, really? Who things it is a good
idea for a string?

Not applications that use another allocator than malloc(), obviously.

Not Gtk+ applications, because they want a GString.

Not C++ application, because they want a boost string or something.

Not djb-style applications, because they want to use stralloc.

Not internationalized applications, because they want wchar_t.

Not Perl bindings, because they want to use a scalar value.

Etc. I hope you see my point and I do not need to add Java and PHP to
these examples.

By returning a mallocated char *, you are saying to the authors of these
applications "screw you, I don't care about your use case". If you don't
care about most applications' use cases, don't design public APIs.

This AVWriter API addresses all that. If the caller wants a mallocated
char *, they can get it, with almost no extra code. But if they want
something else, they can get it.

And even if it is used only for a mallocated char *, this code still
brings the benefit of factoring error checks. Error checks are "not to
be avoided", but they are not to be littered all over the code either:
they are often tricky, with the risk of leaks or double frees, and they
are almost never tested. Any change that centralizes them is good to
take, and this one of the things AVWriter does, centralize error checks.

As for the extra complexity. Yes, there is some. There is some
significant complexity in the implementation, of course. If it is
anybody's problem, it's mine.

There is a little complexity for the caller. Look at the code: very very
little complexity. Much less than what is required learning to use any
FFmpeg API anyway.

There is no complexity for the callee. If anything, using AVWriter makes
the callee code simpler.

I understand that you may be reluctant to embrace creative APIs, but
please try to go beyond that reluctance to see the actual benefits and
costs.

Regards,

-- 
  Nicolas George


signature.asc
Description: PGP signature
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

Re: [FFmpeg-devel] AVWriter API

2020-01-07 Thread Anton Khirnov
Replying to this thread, since it contains the motivation.

Quoting Nicolas George (2019-12-08 15:25:43)
> Hi.
> 
> [ TL;DR: a lightweight oo API for functions that need to return a
> string, designed to be as convenient as possible. ]

TL;DR: sorry, I do not like this. It seems to me you are adding a
substantial amount of new complexity in an awkward place, that ends up
being worse than the problem it's supposed to address.

> 
> Yesterday, in my comments about the new channel layout API:
> https://ffmpeg.org/pipermail/ffmpeg-devel/2019-December/254020.html
> I evoked the following issue:
> 
> We have many functions that return strings to the caller:
> 
> char *av_get_sample_fmt_string(char *buf, int buf_size, enum AVSampleFormat 
> sample_fmt);
> const char *av_get_media_type_string(enum AVMediaType media_type);
> char *av_base64_encode(char *out, int out_size, const uint8_t *in, int 
> in_size);
> int av_dict_get_string(const AVDictionary *m, char **buffer,
>const char key_val_sep, const char pairs_sep);
> int av_strerror(int errnum, char *errbuf, size_t errbuf_size);
> void av_hash_final_hex(struct AVHashContext *ctx, uint8_t *dst, int size);
> const char *av_get_known_color_name(int color_idx, const uint8_t **rgb);
> const char *av_color_range_name(enum AVColorRange range);
> char *av_timecode_make_mpeg_tc_string(char *buf, uint32_t tc25bit);
> static inline char *av_ts_make_string(char *buf, int64_t ts)
> void avcodec_string(char *buf, int buf_size, AVCodecContext *enc, int encode);
> 
> The simplest of them are "enum → static string" functions, but what do
> they do with an invalid value: return NULL or a meaningful string like
> "unknown"?
> 
> A lot of them demand a pair of "char *buf, size_t buf_size" arguments to
> return the string. What is a reasonable size for the buffer? How do we
> handle the cases where the buffer is too small?
> 
> When the string is likely to be long, they dynamically allocate the
> buffer.
> 
> This is an inconsistent mess.

You assume every function that produces a string for the caller is
supposed to be consistent. I do not agree with that - the functions
serve different purposes and different patterns may be appropriate for
each of those. A static string, an informative message and serialized
structured data are all different things.

And most importantly - strings are simple. Every C programmer (worth the
name) understands strings. But now you'd be forcing every API user to
understand and remember this new API.

> And unsatisfactory on top of it: most of
> the time, we will just copy the buffer to another buffer, write it to a
> file or dump it to a log. The string function may need a big buffer to
> build the string while it is at no point necessary to have it in full.
> 
> I have been wanting to find a clean solution for this issue for a long
> time. And now I have. (The AVBPrint was a preliminary attempt at that.)
> 
> But before polishing it, I wanted to know the opinions of my fellow
> developers on this issue. The design is a bit unconventional (which is
> what enables it to be convenient), I suspect some people may object
> because of that.
> 
> I have attached the header file so people can have a look at it.
> 
> The idea is to implement the bare minimum of a very simple object
> system. 
> 
> To return a string (or a binary blob, that changes nothing), the
> functions take an AVWriter object and just write to it using a
> fprintf()-like function or other useful functions.
> 
> The AVWriter object is abstract and can actually write in several
> different ways. av_buf_writer_array(buf) will just wrap "char buf[42]"
> and write in it. av_dynbuf_writer() prepares a dynamic buffer.
> av_log_writer(obj) writes to av_log().
> 
> User application can implement their own AVWriter types: write directly
> in a GUI text buffer or to syslog.
> 
> I wanted to avoid as much as possible adding new necessary error checks.
> I think I have succeeded.
> 
> av_foobar_to_string(buf, sizeof(buf), foobar);
> 
> becomes
> 
> av_foobar_write(av_buf_writer_array(buf), foobar);
> 
> which has a little more letters but is actually simpler.

Error checks are not an evil to be avoided. On the contrary, where error
checks are necessary (like in memory allocation), we want to shove them
in the caller's face as strongly as we can.

And if the buffer was static and not large enough, again we want to
shout very loudly that the result may be invalid. Especially if the
result is structured and is supposed to be interpreted somehow.

-- 
Anton Khirnov
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

Re: [FFmpeg-devel] AVWriter API

2019-12-21 Thread Nicolas George
Paul B Mahol (12019-12-21):
> I doubt that is very good idea or conclusion.

If you have technical or practical thoughts about my proposal, please
share them in reply to the first message.

-- 
  Nicolas George
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

Re: [FFmpeg-devel] AVWriter API

2019-12-21 Thread Paul B Mahol
On 12/21/19, Nicolas George  wrote:
> Nicolas George (12019-12-08):
>> [ TL;DR: a lightweight oo API for functions that need to return a
>> string, designed to be as convenient as possible. ]
>
> Judging on the lack of reaction to this mail, there seems to be no
> strong feeling against this. I therefore intend to move ahead.
>

I doubt that is very good idea or conclusion.

> Regards,
>
> --
>   Nicolas George
>
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

Re: [FFmpeg-devel] AVWriter API

2019-12-21 Thread Nicolas George
Nicolas George (12019-12-08):
> [ TL;DR: a lightweight oo API for functions that need to return a
> string, designed to be as convenient as possible. ]

Judging on the lack of reaction to this mail, there seems to be no
strong feeling against this. I therefore intend to move ahead.

Regards,

-- 
  Nicolas George


signature.asc
Description: PGP signature
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

Re: [FFmpeg-devel] AVWriter API

2019-12-16 Thread Nicolas George
Nicolas George (12019-12-08):
> [ TL;DR: a lightweight oo API for functions that need to return a
> string, designed to be as convenient as possible. ]

Ping?

Regards,

-- 
  Nicolas George


signature.asc
Description: PGP signature
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

[FFmpeg-devel] AVWriter API

2019-12-08 Thread Nicolas George
Hi.

[ TL;DR: a lightweight oo API for functions that need to return a
string, designed to be as convenient as possible. ]

Yesterday, in my comments about the new channel layout API:
https://ffmpeg.org/pipermail/ffmpeg-devel/2019-December/254020.html
I evoked the following issue:

We have many functions that return strings to the caller:

char *av_get_sample_fmt_string(char *buf, int buf_size, enum AVSampleFormat 
sample_fmt);
const char *av_get_media_type_string(enum AVMediaType media_type);
char *av_base64_encode(char *out, int out_size, const uint8_t *in, int in_size);
int av_dict_get_string(const AVDictionary *m, char **buffer,
   const char key_val_sep, const char pairs_sep);
int av_strerror(int errnum, char *errbuf, size_t errbuf_size);
void av_hash_final_hex(struct AVHashContext *ctx, uint8_t *dst, int size);
const char *av_get_known_color_name(int color_idx, const uint8_t **rgb);
const char *av_color_range_name(enum AVColorRange range);
char *av_timecode_make_mpeg_tc_string(char *buf, uint32_t tc25bit);
static inline char *av_ts_make_string(char *buf, int64_t ts)
void avcodec_string(char *buf, int buf_size, AVCodecContext *enc, int encode);

The simplest of them are "enum → static string" functions, but what do
they do with an invalid value: return NULL or a meaningful string like
"unknown"?

A lot of them demand a pair of "char *buf, size_t buf_size" arguments to
return the string. What is a reasonable size for the buffer? How do we
handle the cases where the buffer is too small?

When the string is likely to be long, they dynamically allocate the
buffer.

This is an inconsistent mess. And unsatisfactory on top of it: most of
the time, we will just copy the buffer to another buffer, write it to a
file or dump it to a log. The string function may need a big buffer to
build the string while it is at no point necessary to have it in full.

I have been wanting to find a clean solution for this issue for a long
time. And now I have. (The AVBPrint was a preliminary attempt at that.)

But before polishing it, I wanted to know the opinions of my fellow
developers on this issue. The design is a bit unconventional (which is
what enables it to be convenient), I suspect some people may object
because of that.

I have attached the header file so people can have a look at it.

The idea is to implement the bare minimum of a very simple object
system. 

To return a string (or a binary blob, that changes nothing), the
functions take an AVWriter object and just write to it using a
fprintf()-like function or other useful functions.

The AVWriter object is abstract and can actually write in several
different ways. av_buf_writer_array(buf) will just wrap "char buf[42]"
and write in it. av_dynbuf_writer() prepares a dynamic buffer.
av_log_writer(obj) writes to av_log().

User application can implement their own AVWriter types: write directly
in a GUI text buffer or to syslog.

I wanted to avoid as much as possible adding new necessary error checks.
I think I have succeeded.

av_foobar_to_string(buf, sizeof(buf), foobar);

becomes

av_foobar_write(av_buf_writer_array(buf), foobar);

which has a little more letters but is actually simpler.

But it is really more interesting than that because if we keep the same
AVWriter for several call, it will just concatenate, allowing to build a
message.

Another thing that is not yet in the header: have all the
av_foobar_write() function take an extra "unsigned flags" argument. And
have a few transverse flags, especially:

#define AV_WRITER_FALLBACK (1U << 16)

When this flag is set, the function writes a fall-back string for
unknown values ("unknown", "?", "invalid", depends on the type).
Otherwise, it prints nothing.

Before working more on integrating this, I would like a little feedback,
please.

Regards,

-- 
  Nicolas George
/*
 * Copyright (c) 2019 The FFmpeg project
 *
 * This file is part of FFmpeg.
 *
 * FFmpeg is free software; you can redistribute it and/or
 * modify it under the terms of the GNU Lesser General Public License
 * as published by the Free Software Foundation; either
 * version 2.1 of the License, or (at your option) any later version.
 *
 * FFmpeg is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU Lesser General Public License for more details.
 *
 * You should have received a copy of the GNU Lesser General Public License
 * along with FFmpeg; if not, write to the Free Software Foundation, Inc.,
 * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
 */

#include 
#include 

#include "bprint.h"

/**
 * @defgroup av_writer AVWriter
 *
 * Object-oriented API to write strings and binary data.
 *
 * @{
 */

typedef struct AVWriterMethods AVWriterMethods;

/**
 * Opaque object to write strings and binary data.
 *
 * An AVWriter is meant to allow to build and return string (and