Re: [vdr] bad characters in epg.data
> On 06 Dec 2015, at 20:55, Peter Münster wrote:
>
> On Wed, Dec 02 2015, Klaus Schmidinger wrote:
>
>>> C S19.2E-133-3-263 SVM - GR\326D
>>>
>>> Would it be possible/easy to patch vdr to filter out such errors?
>>> What is the right function to look at?
>>
>> Take a look at StripControlCharacters() or cEvent::FixEpgBugs() in epg.c.
>
> It seems that these functions only take care of the title and the
> description, but not the channel name.

Sorry, I missed that.

> Finally, I've patched vdr like this:
>
> --8<---cut here---start->8---
> --- epg.c~ 2013-12-28 12:33:08.0 +0100
> +++ epg.c 2015-12-06 15:54:58.312233837 +0100
> @@ -1064,11 +1064,32 @@
>       }
>  }
>
> +static char *StripFunny8bitCharacters(const char *src)
> +{
> +  static char dest[100];
> +  strn0cpy(dest, src, 100);
> +  char *s = dest;
> +  int len = strlen(s);
> +  while (len > 0) {
> +    int l = Utf8CharLen(s);
> +    uchar *p = (uchar *)s;
> +    if (l == 1 && *p > 0x7F) { // this is not utf-8
> +      memmove(s, p + 1, len); // we also copy the terminating 0!
> +      len--;
> +      l = 0;
> +    }
> +    s += l;
> +    len -= l;
> +  }
> +  return dest;
> +}
> +
>  void cSchedule::Dump(FILE *f, const char *Prefix, eDumpMode DumpMode, time_t AtTime) const
>  {
>    cChannel *channel = Channels.GetByChannelID(channelID, true);
>    if (channel) {
> -     fprintf(f, "%sC %s %s\n", Prefix, *channel->GetChannelID().ToString(), channel->Name());
> +     fprintf(f, "%sC %s %s\n", Prefix, *channel->GetChannelID().ToString(),
> +             StripFunny8bitCharacters(channel->Name()));
>       const cEvent *p;
>       switch (DumpMode) {
>         case dmAll: {
> --8<---cut here---end--->8---
>
> It seems to work.
> Would it be possible to integrate this patch into vdr?

Well, first we should investigate why this isn’t set correctly in libsi/si.c.
That’s the place where such fixes should actually be done.
I’ll look into this once I have my VDR development environment up and running
at my new place…

Klaus

___
vdr mailing list
vdr@linuxtv.org
http://www.linuxtv.org/cgi-bin/mailman/listinfo/vdr
Re: [vdr] bad characters in epg.data
On Wed, Dec 02 2015, Klaus Schmidinger wrote:

>> C S19.2E-133-3-263 SVM - GR\326D
>>
>> Would it be possible/easy to patch vdr to filter out such errors?
>> What is the right function to look at?
>
> Take a look at StripControlCharacters() or cEvent::FixEpgBugs() in epg.c.

It seems that these functions only take care of the title and the
description, but not the channel name.

Finally, I've patched vdr like this:

--8<---cut here---start->8---
--- epg.c~ 2013-12-28 12:33:08.0 +0100
+++ epg.c 2015-12-06 15:54:58.312233837 +0100
@@ -1064,11 +1064,32 @@
      }
 }

+static char *StripFunny8bitCharacters(const char *src)
+{
+  static char dest[100];
+  strn0cpy(dest, src, 100);
+  char *s = dest;
+  int len = strlen(s);
+  while (len > 0) {
+    int l = Utf8CharLen(s);
+    uchar *p = (uchar *)s;
+    if (l == 1 && *p > 0x7F) { // this is not utf-8
+      memmove(s, p + 1, len); // we also copy the terminating 0!
+      len--;
+      l = 0;
+    }
+    s += l;
+    len -= l;
+  }
+  return dest;
+}
+
 void cSchedule::Dump(FILE *f, const char *Prefix, eDumpMode DumpMode, time_t AtTime) const
 {
   cChannel *channel = Channels.GetByChannelID(channelID, true);
   if (channel) {
-     fprintf(f, "%sC %s %s\n", Prefix, *channel->GetChannelID().ToString(), channel->Name());
+     fprintf(f, "%sC %s %s\n", Prefix, *channel->GetChannelID().ToString(),
+             StripFunny8bitCharacters(channel->Name()));
      const cEvent *p;
      switch (DumpMode) {
        case dmAll: {
--8<---cut here---end--->8---

It seems to work.
Would it be possible to integrate this patch into vdr?

--
Peter
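For readers who want to try the filtering idea outside of VDR, here is a minimal standalone sketch in plain C. It is not the patch above and not VDR code: `utf8_seq_len()` is a hypothetical stand-in for VDR's `Utf8CharLen()`, and `strip_stray_8bit()` is a name made up for this example. It assumes that a byte above 0x7F which does not start a structurally well-formed multi-byte sequence should simply be dropped, and it does not reject overlong or otherwise invalid code points.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for VDR's Utf8CharLen(): returns the length of the
   structurally well-formed UTF-8 sequence starting at s, or 1 otherwise.
   The && chains stop at the terminating 0, so no byte past it is read. */
static size_t utf8_seq_len(const unsigned char *s)
{
    if ((s[0] & 0xE0) == 0xC0 && (s[1] & 0xC0) == 0x80)
        return 2;
    if ((s[0] & 0xF0) == 0xE0 && (s[1] & 0xC0) == 0x80 &&
        (s[2] & 0xC0) == 0x80)
        return 3;
    if ((s[0] & 0xF8) == 0xF0 && (s[1] & 0xC0) == 0x80 &&
        (s[2] & 0xC0) == 0x80 && (s[3] & 0xC0) == 0x80)
        return 4;
    return 1;
}

/* Copies src into dst, dropping every byte above 0x7F that is not part of
   a well-formed multi-byte sequence. Output is truncated at a sequence
   boundary if dst is too small. Returns dst. */
static char *strip_stray_8bit(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);
    const unsigned char *in = (const unsigned char *)src;
    size_t out = 0;
    while (*in != '\0') {
        size_t n = utf8_seq_len(in);
        if (n == 1 && *in > 0x7F) {
            in++;               /* stray 8-bit byte: skip it */
            continue;
        }
        if (out + n >= dst_size)
            break;              /* no room for the sequence plus the 0 */
        memcpy(dst + out, in, n);
        out += n;
        in += n;
    }
    dst[out] = '\0';
    return dst;
}
```

With a buffer `char buf[32]`, calling `strip_stray_8bit(buf, sizeof buf, "SVM - GR\326D")` leaves `"SVM - GRD"` in `buf`, while a correctly encoded `"GR\303\226D"` passes through unchanged. The sketch takes a caller-supplied buffer instead of a function-local static one, so results from two calls cannot overwrite each other; that is a design choice of this example, not something the patch does.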
Re: [vdr] bad characters in epg.data
On Sun, Nov 29 2015, Klaus Schmidinger wrote:

> Have you tried this (from the VDR “INSTALL” file)?
>
> Workaround for providers not encoding their DVB SI table strings correctly

Thanks, I've tried it. But unfortunately, today I got this line:

C S19.2E-133-3-263 SVM - GR\326D

with the same 0xD6 character...

Would it be possible/easy to patch vdr to filter out such errors?
What is the right function to look at?

TIA for any help,
--
Peter
Re: [vdr] bad characters in epg.data
> On 02 Dec 2015, at 20:45, Peter Münster wrote:
>
> On Sun, Nov 29 2015, Klaus Schmidinger wrote:
>
>> Have you tried this (from the VDR “INSTALL” file)?
>>
>> Workaround for providers not encoding their DVB SI table strings correctly
>
> Thanks, I've tried it. But unfortunately, today I got this line:
>
> C S19.2E-133-3-263 SVM - GR\326D
>
> with the same 0xD6 character...
>
> Would it be possible/easy to patch vdr to filter out such errors?
> What is the right function to look at?

Take a look at StripControlCharacters() or cEvent::FixEpgBugs() in epg.c.

Klaus
Re: [vdr] bad characters in epg.data
> On 29 Nov 2015, at 14:04, Peter Münster wrote:
>
> Hi,
>
> It seems that the encoding of the epg.data file is UTF-8, but
> sometimes there are lines like this:
>
> C S19.2E-133-3-263 18:00 GRÖD - RBS
>
> The "Ö" is a single byte (0xD6), which does not seem to conform to
> UTF-8 encoding.
>
> How can I avoid such characters in the epg.data file?

Have you tried this (from the VDR “INSTALL” file)?

Workaround for providers not encoding their DVB SI table strings correctly
--------------------------------------------------------------------------

According to "ETSI EN 300 468" the default character set for SI data is
ISO6937. But unfortunately some broadcasters actually use ISO-8859-9 or
other encodings, but fail to correctly announce that. Users who want to set
the default character set to something different can do this by using the
command line option --chartab with something like ISO-8859-9.

Klaus
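To illustrate why the lone 0xD6 byte stands out: in ISO-8859-1 (and also in ISO-8859-9, which differs from it in only six positions, none of them 0xD6) that byte is "Ö", and a correct conversion to UTF-8 turns it into the two bytes 0xC3 0x96. The following plain C sketch shows that conversion for ISO-8859-1 only. The function name `latin1_to_utf8()` is made up for this example and is not part of VDR or libsi; it is meant as an illustration of the byte arithmetic, not as a description of how VDR converts SI strings.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Converts ISO-8859-1 text to UTF-8. Every byte below 0x80 is copied as is;
   every byte from 0x80 to 0xFF becomes a two-byte sequence. Output is
   truncated at a character boundary if dst is too small.
   Returns the number of bytes written, not counting the terminating 0. */
static size_t latin1_to_utf8(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);
    size_t out = 0;
    for (const unsigned char *in = (const unsigned char *)src; *in != '\0'; in++) {
        if (*in < 0x80) {
            if (out + 1 >= dst_size)
                break;
            dst[out++] = (char)*in;
        } else {
            if (out + 2 >= dst_size)
                break;
            dst[out++] = (char)(0xC0 | (*in >> 6));   /* lead byte: 110000xx */
            dst[out++] = (char)(0x80 | (*in & 0x3F)); /* trail byte: 10xxxxxx */
        }
    }
    dst[out] = '\0';
    return out;
}
```

For the channel name from this thread, `latin1_to_utf8(buf, sizeof buf, "GR\326D")` writes the five bytes `G R 0xC3 0x96 D`. A raw 0xD6 in epg.data is therefore consistent with a string that was never converted from its 8-bit source encoding.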