On Sat, Feb 9, 2019 at 7:39 PM Carl Eugen Hoyos <ceffm...@gmail.com> wrote:
>
> 2019-02-09 17:42 GMT+01:00, Marton Balint <c...@passwd.hu>:
> > On Sat, 9 Feb 2019, Carl Eugen Hoyos wrote:
> >
> >> From 9033f0a18727a7a576c4cc06b9985d6d922d46ad Mon Sep 17 00:00:00 2001
> >> From: Carl Eugen Hoyos <ceffm...@gmail.com>
> >> Date: Sat, 9 Feb 2019 00:49:51 +0100
> >> Subject: [PATCH] lavf/mpegts: Convert service_name and service_provider to
> >>  utf-8.
> >>
> >> Fixes ticket #6320.
> >> ---
> >>  libavformat/mpegts.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
> >>  1 file changed, 48 insertions(+)
> >>
> >> diff --git a/libavformat/mpegts.c b/libavformat/mpegts.c
> >> index b04fd7b..1e27500 100644
> >> --- a/libavformat/mpegts.c
> >> +++ b/libavformat/mpegts.c
> >> @@ -37,6 +37,9 @@
> >>  #include "avio_internal.h"
> >>  #include "mpeg.h"
> >>  #include "isom.h"
> >> +#if CONFIG_ICONV
> >> +#include <iconv.h>
> >> +#endif
> >>
> >>  /* maximum size in which we look for synchronization if
> >>   * synchronization is lost */
> >> @@ -674,6 +677,51 @@ static char *getstr8(const uint8_t **pp, const uint8_t *p_end)
> >>          return NULL;
> >>      if (len > p_end - p)
> >>          return NULL;
> >> +#if CONFIG_ICONV
> >> +    if (len && *p < 0x20) {
> >> +        char iso8859[] = "ISO-8859-00";
> >> +        const char *encodings[] = {
> >> +            "ISO6937", "ISO-8859-5", "ISO-8859-6", "ISO-8859-7", "ISO-8859-8",
> >> +            "ISO-8859-9", "ISO-8859-10", "ISO-8859-11", "", "ISO-8859-13",
> >> +            "ISO-8859-14", "ISO-8859-15", "", "", "", "",
> >> +            "", "ISO-10646", "KSC_5601", "GB2312", "ISO-10646", "UTF-8", "",
> >> +            "", "", "", "", "", "", "", "", ""
> >> +        };
> >> +        iconv_t cd;
> >> +        char *in, *out;
> >> +        size_t inlen = len - 1, outlen = inlen * 6 + 1;
> >> +        if (len >= 3 && p[0] == 0x10 && !p[1] && p[2] && p[2] <= 0xf && p[2] != 0xc) {
> >> +            if (p[2] < 10) {
> >> +                iso8859[9] += p[2];
> >> +                iso8859[10] = 0;
> >> +            } else {
> >> +                iso8859[9]++;
> >> +                iso8859[10] += p[2] - 10;
> >> +            }
> >
> > I think this would be much more readable:
> >
> > char iso8859[16];
> > snprintf(iso8859, sizeof(iso8859), "ISO-8859-%d", p[2]);
>
> Definitely, new patch attached.
>
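
For illustration, the snprintf() suggestion quoted above could be wrapped up roughly as follows. This is only a sketch under stated assumptions: the helper name dvb_iso8859_name() and its signature are hypothetical and do not appear in the patch. The prefix check mirrors the condition in the quoted hunk (bytes 0x10 0x00 N, with N in 1..15 and 12 rejected).

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper, not part of the patch: build the charset name
 * for a string that starts with the three-byte prefix 0x10 0x00 N.
 * Returns 0 on success and -1 if the prefix is not a valid selector. */
static int dvb_iso8859_name(const uint8_t *p, size_t len,
                            char *dst, size_t dst_size)
{
    /* Same validity check as the quoted hunk: N must be 1..15, and 12
     * is rejected. */
    if (len < 3 || p[0] != 0x10 || p[1] != 0 ||
        !p[2] || p[2] > 0xf || p[2] == 0xc)
        return -1;

    /* One formatted write replaces the manual digit arithmetic on
     * iso8859[9] and iso8859[10]. */
    snprintf(dst, dst_size, "ISO-8859-%d", p[2]);
    return 0;
}
```

With a 16-byte destination buffer, as in the suggestion, the longest possible result ("ISO-8859-15" plus the terminator, 12 bytes) fits without truncation.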

Idea-wise I like this. We generally promise that our metadata is UTF-8,
but with broadcast streams we have not really kept that promise :) .
This fixes quite a bit of that, which is nice.

I checked this against the samples I have on hand, and it does not seem
to break my planned integration of ARIB STD-B24 text decoding to UTF-8.
It only changes the place where I will have to integrate it, so as not
to do a double conversion.

In other words, good work.

Jan
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel