Re: [FFmpeg-devel] AVWriter API
Hi. Thanks for looking into it with an open mind. Hendrik Leppkes (12020-01-12): > While this argument sounds sensible on the surface, I can't help but > wonder if it really makes sense. > Sure, some returned strings may be able to be wrapped straight into > whatever structure they use. But what about const char* strings? Will > we need to allocate a buffer and copy those in the future? Considering > I can practically use those inline in logging printfs now, that would > be an extreme increase in complexity. I want to apply consistency with pragmatism, not with dogmatism. If what we have at hand is a static const char *, the of course that is what we return: it is the simplest and the most efficient. It is especially true for enum → string functions: av_foobar_get_string(). (Note that you can't use them really directly in printf, since "%s", NULL can segfault.) But in this case, let's also provide the av_foobar_write() variant, because all it costs is a macro call in the header: FF_DEFINE_WRITE_FROM_TO_STRING(av_foobar, AVFoobar, "unknown") But as soon as a function needs to build a string in a buffer, let us use this, because it is a very simple solution to let the programme choose what kind of buffer to use. > Also, what happens to strings that are an input to our libraries? Will > those no longer be char pointers, possibly malloc'ed (depending on > where they are used)? For this, there are too many cases, I have yet to find a unifying solution. Note that a lot of string implementations can give read-only access to their internal buffer, which means we can accept const char * and let the application decide how to allocate it. > Additionally, any widely used custom string class will have a direct > conversion from/to char pointers, so is this really such a huge burden > on application developers? I work on Windows, where everything is > wchar_t, and I never thought this was a problem, since I have to deal > with this for every library. In fact the proposed API won't really > help, since the entire API is still designed around UTF-8, so what I > would do is wrap the dynbuf writter and just convert to wide-char when > the string is complete. In short, I do exactly what I do now, just > more complicated. I am not sure what your point is about wchar_t, but if you think it would be useful to provide AVWBufWriter and AVWDynbufWriter, counterparts to AVBufWriter and AVDynbufWriter that output into wchar_t buffers, it is entirely possible to do it. Sure, we can continue doing what we have been doing: allocate a string, fill it, allocate a new string, convert, allocate yet another string, copy. This is what Java programmers do all day. But if we use Java patterns in C, we get the drawbacks of Java patterns plus the fact that the C allocator is not optimized for that. My proposal is not only rooted in convenience, it is also rooted in efficiency. String operations are not part of the inner loops of codecs, but we still make an awful lot of it. For example, av_get_channel_layout_string() can be called by verbose code for every audio frame (cf. af_ashowinfo), that can make thousands of times per second for low-latency codecs. It's a shame to do a malloc() only to free it immediately and have the log layer discard the string anyway. You may not have noticed this, but when you use av_dynbuf_writer(), which is the functional equivalent of returning a mallocated buffer and the recommended way, it will first use a buffer on stack, and only use malloc() if it is not enough. Anton's implementation of av_channel_layout_describe() always makes a malloc; with AVWriter, it would only do so if there are more than 200 channels or the caller wants a permanent copy of the string. I think this consideration and the factoring of error check are enough to warrant the little bit of complication for the caller. And the possibility to output to many kinds of strings comes on top of that. Regards, -- Nicolas George signature.asc Description: PGP signature ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-devel] AVWriter API
On Sun, Jan 12, 2020 at 5:45 PM Josh de Kock wrote: > > > Hi, > > I have very little time to work on FFmpeg lately but this thread caught > my interest. > > Nicolas George writes: > > Anton Khirnov (12020-01-07): > >> And most importantly - strings are simple. Every C programmer (worth the > >> name) understands strings. But now you'd be forcing every API user to > >> understand and remember this new API. > > > > [...] > > > > We will not play the game of reinventing yet another data structure for > > strings, but let us not play the game of pretending the issue does not > > exist: it only makes the problem more stinging. > > I think this is really important, AVWriter allows the user to make use > of whichever implementation of strings their application already uses > and not have to deal with yet-another-string-type. > While this argument sounds sensible on the surface, I can't help but wonder if it really makes sense. Sure, some returned strings may be able to be wrapped straight into whatever structure they use. But what about const char* strings? Will we need to allocate a buffer and copy those in the future? Considering I can practically use those inline in logging printfs now, that would be an extreme increase in complexity. Also, what happens to strings that are an input to our libraries? Will those no longer be char pointers, possibly malloc'ed (depending on where they are used)? It sounds to me like you won't really be able to get rid of char pointers, static or malloc'ed, without introducing a lot of pain points. Sure, typical output functions may give you AVWriter things, but other things will still be char*. Additionally, any widely used custom string class will have a direct conversion from/to char pointers, so is this really such a huge burden on application developers? I work on Windows, where everything is wchar_t, and I never thought this was a problem, since I have to deal with this for every library. In fact the proposed API won't really help, since the entire API is still designed around UTF-8, so what I would do is wrap the dynbuf writter and just convert to wide-char when the string is complete. In short, I do exactly what I do now, just more complicated. Every library outputs strings in a certain manner, its either char-pointer strings that you get to free, or their own string-type. I can't help but think that this feels worse and more complicated then our own string type. - Hendrik ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-devel] AVWriter API
Hi, I have very little time to work on FFmpeg lately but this thread caught my interest. Nicolas George writes: > Anton Khirnov (12020-01-07): >> And most importantly - strings are simple. Every C programmer (worth the >> name) understands strings. But now you'd be forcing every API user to >> understand and remember this new API. > > [...] > > We will not play the game of reinventing yet another data structure for > strings, but let us not play the game of pretending the issue does not > exist: it only makes the problem more stinging. I think this is really important, AVWriter allows the user to make use of whichever implementation of strings their application already uses and not have to deal with yet-another-string-type. > Who wants a mallocated char pointer, really? Who things it is a good > idea for a string? > > Not applications that use another allocator than malloc(), obviously. Not relevant to this thread but: How would an application not using the traditional malloc() allocator work use FFmpeg at all if it didn't 'just work' by setting MALLOC_PREFIX? > [...] > > Etc. I hope you see my point and I do not need to add Java and PHP to > these examples. It's an issue that everyone is making their own string type, we do definitely not want to do this (one of the reasons I think this API is in the right direction). > By returning a mallocated char *, you are saying to the authors of these > applications "screw you, I don't care about your use case". If you don't > care about most applications' use cases, don't design public APIs. I guess the argument against this would be 'well, you can just copy it to whichever string implementation you need', which maybe isn't ideal but it 'works' currently. > This AVWriter API addresses all that. If the caller wants a mallocated > char *, they can get it, with almost no extra code. But if they want > something else, they can get it. Important that we won't lose any functionality with this new API. > And even if it is used only for a mallocated char *, this code still > brings the benefit of factoring error checks. Error checks are "not to > be avoided", but they are not to be littered all over the code either: > they are often tricky, with the risk of leaks or double frees, and they > are almost never tested. Any change that centralizes them is good to > take, and this one of the things AVWriter does, centralize error checks. Deferring error checks properly and centralising them means they can be better handled by a user. This is a good approach. > As for the extra complexity. Yes, there is some. There is some > significant complexity in the implementation, of course. If it is > anybody's problem, it's mine. > > There is a little complexity for the caller. Look at the code: very very > little complexity. Much less than what is required learning to use any > FFmpeg API anyway. > > There is no complexity for the callee. If anything, using AVWriter makes > the callee code simpler. I don't think the complexity is a big issue here. > I understand that you may be reluctant to embrace creative APIs, but > please try to go beyond that reluctance to see the actual benefits and > costs. For the most part I think the concept is good, and it definitely solves a problem which exists but the million dollar question is: Are strings a big enough issue at the moment that we should introduce a new API to fix them? Though I guess if someone is willing to implement a fix, then the answer to this doesn't even matter at all, and should always be 'yes'--you have my support for this set, at least. -- Josh ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-devel] AVWriter API
Anton Khirnov (12020-01-07): > Replying to this thread, since it contains the motivation. Thanks. > And most importantly - strings are simple. Every C programmer (worth the > name) understands strings. But now you'd be forcing every API user to > understand and remember this new API. I'll reply to only that, because it contains the root of the problem. No, strings are not simple. Strings only appear simple when you don't do anything much with them. But as soon as you do, you need extra complexity to implement them properly and efficiently. And if you don't realize this is true, just observe that strings are so simple that about every other library and most languages implement their own version of strings, with the specific optimizations they think they need. Of course, FFmpeg does not do much with strings, and therefore can pretend they are simple, but we are not talking about inside FFmpeg here, we are talking about interacting with an application, an application that probably does something with strings. We will not play the game of reinventing yet another data structure for strings, but let us not play the game of pretending the issue does not exist: it only makes the problem more stinging. Who wants a mallocated char pointer, really? Who things it is a good idea for a string? Not applications that use another allocator than malloc(), obviously. Not Gtk+ applications, because they want a GString. Not C++ application, because they want a boost string or something. Not djb-style applications, because they want to use stralloc. Not internationalized applications, because they want wchar_t. Not Perl bindings, because they want to use a scalar value. Etc. I hope you see my point and I do not need to add Java and PHP to these examples. By returning a mallocated char *, you are saying to the authors of these applications "screw you, I don't care about your use case". If you don't care about most applications' use cases, don't design public APIs. This AVWriter API addresses all that. If the caller wants a mallocated char *, they can get it, with almost no extra code. But if they want something else, they can get it. And even if it is used only for a mallocated char *, this code still brings the benefit of factoring error checks. Error checks are "not to be avoided", but they are not to be littered all over the code either: they are often tricky, with the risk of leaks or double frees, and they are almost never tested. Any change that centralizes them is good to take, and this one of the things AVWriter does, centralize error checks. As for the extra complexity. Yes, there is some. There is some significant complexity in the implementation, of course. If it is anybody's problem, it's mine. There is a little complexity for the caller. Look at the code: very very little complexity. Much less than what is required learning to use any FFmpeg API anyway. There is no complexity for the callee. If anything, using AVWriter makes the callee code simpler. I understand that you may be reluctant to embrace creative APIs, but please try to go beyond that reluctance to see the actual benefits and costs. Regards, -- Nicolas George signature.asc Description: PGP signature ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-devel] AVWriter API
Replying to this thread, since it contains the motivation. Quoting Nicolas George (2019-12-08 15:25:43) > Hi. > > [ TL;DR: a lightweight oo API for functions that need to return a > string, designed to be as convenient as possible. ] TL;DR: sorry, I do not like this. It seems to me you are adding a substantial amount of new complexity in an awkward place, that ends up being worse than the problem it's supposed to address. > > Yesterday, in my comments about the new channel layout API: > https://ffmpeg.org/pipermail/ffmpeg-devel/2019-December/254020.html > I evoked the following issue: > > We have many functions that return strings to the caller: > > char *av_get_sample_fmt_string(char *buf, int buf_size, enum AVSampleFormat > sample_fmt); > const char *av_get_media_type_string(enum AVMediaType media_type); > char *av_base64_encode(char *out, int out_size, const uint8_t *in, int > in_size); > int av_dict_get_string(const AVDictionary *m, char **buffer, >const char key_val_sep, const char pairs_sep); > int av_strerror(int errnum, char *errbuf, size_t errbuf_size); > void av_hash_final_hex(struct AVHashContext *ctx, uint8_t *dst, int size); > const char *av_get_known_color_name(int color_idx, const uint8_t **rgb); > const char *av_color_range_name(enum AVColorRange range); > char *av_timecode_make_mpeg_tc_string(char *buf, uint32_t tc25bit); > static inline char *av_ts_make_string(char *buf, int64_t ts) > void avcodec_string(char *buf, int buf_size, AVCodecContext *enc, int encode); > > The simplest of them are "enum → static string" functions, but what do > they do with an invalid value: return NULL or a meaningful string like > "unknown"? > > A lot of them demand a pair of "char *buf, size_t buf_size" arguments to > return the string. What is a reasonable size for the buffer? How do we > handle the cases where the buffer is too small? > > When the string is likely to be long, they dynamically allocate the > buffer. > > This is an inconsistent mess. You assume every function that produces a string for the caller is supposed to be consistent. I do not agree with that - the functions serve different purposes and different patterns may be appropriate for each of those. A static string, an informative message and serialized structured data are all different things. And most importantly - strings are simple. Every C programmer (worth the name) understands strings. But now you'd be forcing every API user to understand and remember this new API. > And unsatisfactory on top of it: most of > the time, we will just copy the buffer to another buffer, write it to a > file or dump it to a log. The string function may need a big buffer to > build the string while it is at no point necessary to have it in full. > > I have been wanting to find a clean solution for this issue for a long > time. And now I have. (The AVBPrint was a preliminary attempt at that.) > > But before polishing it, I wanted to know the opinions of my fellow > developers on this issue. The design is a bit unconventional (which is > what enables it to be convenient), I suspect some people may object > because of that. > > I have attached the header file so people can have a look at it. > > The idea is to implement the bare minimum of a very simple object > system. > > To return a string (or a binary blob, that changes nothing), the > functions take an AVWriter object and just write to it using a > fprintf()-like function or other useful functions. > > The AVWriter object is abstract and can actually write in several > different ways. av_buf_writer_array(buf) will just wrap "char buf[42]" > and write in it. av_dynbuf_writer() prepares a dynamic buffer. > av_log_writer(obj) writes to av_log(). > > User application can implement their own AVWriter types: write directly > in a GUI text buffer or to syslog. > > I wanted to avoid as much as possible adding new necessary error checks. > I think I have succeeded. > > av_foobar_to_string(buf, sizeof(buf), foobar); > > becomes > > av_foobar_write(av_buf_writer_array(buf), foobar); > > which has a little more letters but is actually simpler. Error checks are not an evil to be avoided. On the contrary, where error checks are necessary (like in memory allocation), we want to shove them in the caller's face as strongly as we can. And if the buffer was static and not large enough, again we want to shout very loudly that the result may be invalid. Especially if the result is structured and is supposed to be interpreted somehow. -- Anton Khirnov ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-devel] AVWriter API
Paul B Mahol (12019-12-21): > I doubt that is very good idea or conclusion. If you have technical or practical thoughts about my proposal, please share them in reply to the first message. -- Nicolas George ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-devel] AVWriter API
On 12/21/19, Nicolas George wrote: > Nicolas George (12019-12-08): >> [ TL;DR: a lightweight oo API for functions that need to return a >> string, designed to be as convenient as possible. ] > > Judging on the lack of reaction to this mail, there seems to be no > strong feeling against this. I therefore intend to move ahead. > I doubt that is very good idea or conclusion. > Regards, > > -- > Nicolas George > ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-devel] AVWriter API
Nicolas George (12019-12-08): > [ TL;DR: a lightweight oo API for functions that need to return a > string, designed to be as convenient as possible. ] Judging on the lack of reaction to this mail, there seems to be no strong feeling against this. I therefore intend to move ahead. Regards, -- Nicolas George signature.asc Description: PGP signature ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-devel] AVWriter API
Nicolas George (12019-12-08): > [ TL;DR: a lightweight oo API for functions that need to return a > string, designed to be as convenient as possible. ] Ping? Regards, -- Nicolas George signature.asc Description: PGP signature ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
[FFmpeg-devel] AVWriter API
Hi. [ TL;DR: a lightweight oo API for functions that need to return a string, designed to be as convenient as possible. ] Yesterday, in my comments about the new channel layout API: https://ffmpeg.org/pipermail/ffmpeg-devel/2019-December/254020.html I evoked the following issue: We have many functions that return strings to the caller: char *av_get_sample_fmt_string(char *buf, int buf_size, enum AVSampleFormat sample_fmt); const char *av_get_media_type_string(enum AVMediaType media_type); char *av_base64_encode(char *out, int out_size, const uint8_t *in, int in_size); int av_dict_get_string(const AVDictionary *m, char **buffer, const char key_val_sep, const char pairs_sep); int av_strerror(int errnum, char *errbuf, size_t errbuf_size); void av_hash_final_hex(struct AVHashContext *ctx, uint8_t *dst, int size); const char *av_get_known_color_name(int color_idx, const uint8_t **rgb); const char *av_color_range_name(enum AVColorRange range); char *av_timecode_make_mpeg_tc_string(char *buf, uint32_t tc25bit); static inline char *av_ts_make_string(char *buf, int64_t ts) void avcodec_string(char *buf, int buf_size, AVCodecContext *enc, int encode); The simplest of them are "enum → static string" functions, but what do they do with an invalid value: return NULL or a meaningful string like "unknown"? A lot of them demand a pair of "char *buf, size_t buf_size" arguments to return the string. What is a reasonable size for the buffer? How do we handle the cases where the buffer is too small? When the string is likely to be long, they dynamically allocate the buffer. This is an inconsistent mess. And unsatisfactory on top of it: most of the time, we will just copy the buffer to another buffer, write it to a file or dump it to a log. The string function may need a big buffer to build the string while it is at no point necessary to have it in full. I have been wanting to find a clean solution for this issue for a long time. And now I have. (The AVBPrint was a preliminary attempt at that.) But before polishing it, I wanted to know the opinions of my fellow developers on this issue. The design is a bit unconventional (which is what enables it to be convenient), I suspect some people may object because of that. I have attached the header file so people can have a look at it. The idea is to implement the bare minimum of a very simple object system. To return a string (or a binary blob, that changes nothing), the functions take an AVWriter object and just write to it using a fprintf()-like function or other useful functions. The AVWriter object is abstract and can actually write in several different ways. av_buf_writer_array(buf) will just wrap "char buf[42]" and write in it. av_dynbuf_writer() prepares a dynamic buffer. av_log_writer(obj) writes to av_log(). User application can implement their own AVWriter types: write directly in a GUI text buffer or to syslog. I wanted to avoid as much as possible adding new necessary error checks. I think I have succeeded. av_foobar_to_string(buf, sizeof(buf), foobar); becomes av_foobar_write(av_buf_writer_array(buf), foobar); which has a little more letters but is actually simpler. But it is really more interesting than that because if we keep the same AVWriter for several call, it will just concatenate, allowing to build a message. Another thing that is not yet in the header: have all the av_foobar_write() function take an extra "unsigned flags" argument. And have a few transverse flags, especially: #define AV_WRITER_FALLBACK (1U << 16) When this flag is set, the function writes a fall-back string for unknown values ("unknown", "?", "invalid", depends on the type). Otherwise, it prints nothing. Before working more on integrating this, I would like a little feedback, please. Regards, -- Nicolas George /* * Copyright (c) 2019 The FFmpeg project * * This file is part of FFmpeg. * * FFmpeg is free software; you can redistribute it and/or * modify it under the terms of the GNU Lesser General Public License * as published by the Free Software Foundation; either * version 2.1 of the License, or (at your option) any later version. * * FFmpeg is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU Lesser General Public License for more details. * * You should have received a copy of the GNU Lesser General Public License * along with FFmpeg; if not, write to the Free Software Foundation, Inc., * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA */ #include #include #include "bprint.h" /** * @defgroup av_writer AVWriter * * Object-oriented API to write strings and binary data. * * @{ */ typedef struct AVWriterMethods AVWriterMethods; /** * Opaque object to write strings and binary data. * * An AVWriter is meant to allow to build and return string (and