Re: [vdr] trouble with asprintf

2008-02-11 Thread Udo Richter
Wolfgang Rohdewald wrote:
>   char *s;
>   asprintf(&s,"%ld-%.9s",random(),artist.original());
>
> segfaults only if illegal utf8 chars appear in artist.original()
>
> asprintf returns -1, so s is nothing that could be freed,
> and this gives a nice backtrace:

So it's basically just freeing an uninitialized pointer.

Well, that leads to the question whether s is unchanged in case of a -1
error return, and whether this would work:

char *s = NULL;
asprintf(&s,"%ld-%.9s",random(),artist.original());

Cheers,

Udo


___
vdr mailing list
vdr@linuxtv.org
http://www.linuxtv.org/cgi-bin/mailman/listinfo/vdr


Re: [vdr] trouble with asprintf

2008-02-11 Thread Ludwig Nussel
Udo Richter wrote:
> Wolfgang Rohdewald wrote:
> >  char *s;
> >  asprintf(&s,"%ld-%.9s",random(),artist.original());
> >
> > segfaults only if illegal utf8 chars appear in artist.original()
> >
> > asprintf returns -1, so s is nothing that could be freed,
> > and this gives a nice backtrace:
>
> So it's basically just freeing an uninitialized pointer.
>
> Well, that leads to the question whether s is unchanged in case of a -1
> error return, and whether this would work:
>
>   char *s = NULL;
>   asprintf(&s,"%ld-%.9s",random(),artist.original());

The manpage explicitly says that the content of s is undefined in
case of error. So even if it works you can't really count on it. You
can't get around checking the return value.
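A minimal sketch of such a checked call, assuming glibc's asprintf (hence _GNU_SOURCE). The helper name build_id and its parameters are made up for illustration and are not taken from the muggle code. On failure it returns NULL instead of handing back whatever asprintf left in the pointer:

```c
#define _GNU_SOURCE   /* asprintf is a GNU/BSD extension */
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: format "<number>-<first 9 bytes of artist>".
 * Returns a malloc'ed string, or NULL if asprintf reported an error. */
char *build_id(long number, const char *artist)
{
    char *s = NULL;

    assert(artist != NULL);
    if (asprintf(&s, "%ld-%.9s", number, artist) < 0) {
        /* After a failure the man page calls the pointer's value
         * undefined, so it is neither used nor freed here. */
        return NULL;
    }
    return s;   /* the caller must free() this */
}
```

The caller then only has to test for NULL before using or freeing the result.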

cu
Ludwig

-- 
 (o_   Ludwig Nussel
 //\   SUSE LINUX Products GmbH, Development
 V_/_  http://www.suse.de/





Re: [vdr] trouble with asprintf

2008-02-11 Thread Darren Salt
I demand that Ludwig Nussel may or may not have written...

> Darren Salt wrote:
> > I demand that Ludwig Nussel may or may not have written...
> > [snip]
> > > asprintf needs to check for multibyte characters to not cut them in
> > > the middle and produce invalid output.
> > No - it's encoding-neutral. [...]

> Try the following with 'LANG=C' and 'LANG=de_DE.UTF-8'. You will notice
> that in the latter case it will not cut the umlaut.
[snip code - hmm, dodgy use of printf]

Interesting. It omits it entirely. But the rest of my point still stands - it
still counts bytes.

-- 
| Darren Salt| linux or ds at  | nr. Ashington, | Toon
| RISC OS, Linux | youmustbejoking,demon,co,uk | Northumberland | Army
| + Burn less waste. Use less packaging. Waste less. USE FEWER RESOURCES.

This message was brought to you using only 100% recycled electrons.



Re: [vdr] trouble with asprintf

2008-02-11 Thread Ludwig Nussel
Darren Salt wrote:
> I demand that Ludwig Nussel may or may not have written...
>
> [snip]
> > asprintf needs to check for multibyte characters to not cut them in
> > the middle and produce invalid output.
>
> No - it's encoding-neutral. What you want is your own version which does that

Try the following with 'LANG=C' and 'LANG=de_DE.UTF-8'. You will
notice that in the latter case it will not cut the umlaut.

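#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <locale.h>

int main(void)
{
    char* buffer;
    char artist[] = "Haegar";
    int ret;
    setlocale(LC_ALL, "");
    /* overwrite "ae" with the two-byte UTF-8 encoding of a-umlaut */
    artist[1]=0xc3;
    artist[2]=0xa4;
    ret = asprintf(&buffer,"%.2s\n",artist);
    printf("%d bytes\n", ret);
    if (ret < 0)
        return 1; /* buffer is undefined on error: don't print or free it */
    printf(buffer); /* non-literal format; buffer contains no '%' here */
    free(buffer);
    return 0;
}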

cu
Ludwig

-- 
 (o_   Ludwig Nussel
 //\   
 V_/_  http://www.suse.de/
SUSE LINUX Products GmbH, GF: Markus Rex, HRB 16746 (AG Nuernberg)







Re: [vdr] trouble with asprintf

2008-02-11 Thread Wolfgang Rohdewald
On Monday, 11 February 2008, Udo Richter wrote:
> Well, that leads to the question whether s is unchanged in case of a -1
> error return, and whether this would work:

I can confirm that. The man page however says the value will be undefined.

My current understanding is:

1. Don't forget to call setlocale! Normally setlocale(LC_ALL,"")

2. If the locale is UTF-8, asprintf returns -1 if the string contains illegal
UTF-8 characters anywhere.

3. This and out of memory are the only reasons I know for result -1. The
man page for asprintf says there could be other errors than out of memory
but mentions none.

4. If the result is -1, the buffer pointer stays unchanged in my tests, but
the man page says its value is undefined.

5. If the locale is UTF-8 and a maximum length is defined as in %.9s, and if
%.9s would cut a multibyte char, the partial char is left out, so fewer than
9 bytes are used. See the example from Ludwig Nussel.

What I don't know is where in the man pages this is explained - I did not
find anything about it, neither in man asprintf nor in man printf.
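Since points 2 and 5 depend on the locale and on the libc in use, one option is to do the truncation by hand, in the spirit of the "your own version" remark earlier in the thread. The following is only a sketch. The names utf8_prefix_len and format_artist are invented here, and the helper looks at bit patterns only, without validating the string:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: largest length <= max_bytes that does not end in
 * the middle of a UTF-8 multibyte sequence. Does not validate the input. */
size_t utf8_prefix_len(const char *s, size_t max_bytes)
{
    size_t len = 0;

    assert(s != NULL);
    while (len < max_bytes && s[len] != '\0')
        len++;
    /* If the byte right after the cut is a continuation byte (10xxxxxx),
     * the cut is inside a character: back up to that character's start. */
    while (len > 0 && ((unsigned char)s[len] & 0xC0) == 0x80)
        len--;
    return len;
}

/* Hypothetical helper: write at most max_bytes bytes of artist into buf,
 * passing the adjusted byte count as the precision via "%.*s".
 * The cast assumes max_bytes is small enough to fit in an int. */
int format_artist(char *buf, size_t size, const char *artist, size_t max_bytes)
{
    int n = (int)utf8_prefix_len(artist, max_bytes);

    return snprintf(buf, size, "%.*s", n, artist);
}
```

With the bytes 'H' 0xC3 0xA4 'g' 'a' 'r' and a limit of 2, the helper backs up to 1 byte, so printf never sees a precision that splits the umlaut.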

-- 
Wolfgang

___
vdr mailing list
vdr@linuxtv.org
http://www.linuxtv.org/cgi-bin/mailman/listinfo/vdr


Re: [vdr] trouble with asprintf

2008-02-11 Thread Ludwig Nussel
Wolfgang Rohdewald wrote:
> My problem code:
>
> mgDb::Build_cddbid(const mgSQLString artist) const
> {
>   char *s;
>   asprintf(&s,"%ld-%.9s",random(),artist.original());
>
> segfaults only if illegal utf8 chars appear in artist.original()
>
> asprintf returns -1, so s is nothing that could be freed,
> and this gives a nice backtrace:
>
> Program received signal SIGSEGV, Segmentation fault.
> [Switching to Thread -1319449712 (LWP 22989)]
> 0xb7bf57ea in free () from /lib/tls/i686/cmov/libc.so.6
> (gdb) bt
> #0  0xb7bf57ea in free () from /lib/tls/i686/cmov/libc.so.6
> #1  0xb7986908 in mgDb::Build_cddbid (this=0x86ed8e8, [EMAIL PROTECTED]) at
> mg_db.c:1023

As you can see it doesn't segfault on asprintf but on free().

> If I change %.9s to %s, everything is fine.
>
> I cannot easily simplify that, if I try like this, it works:
>
> char artist[50];
> strcpy(artist,"Celine Dion");
> artist[1]=0xe9;
> asprintf(&buffer,"%ld-%.9s",random(),artist);
> printf(buffer);
> free(buffer);

if(asprintf(...) >= 0)
{
    printf(...);
    free(...);
}

Or just use normal snprintf, as the number of characters to print is
fixed anyway, so you don't need a variable-sized buffer.
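A sketch of that snprintf variant, under the assumption that a long needs at most 20 characters (sign plus 19 digits on 64-bit systems). The names ID_BUF_SIZE and format_id are invented for this example:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical size: up to 20 characters for the long, one '-',
 * 9 bytes of artist and the terminating NUL. */
enum { ID_BUF_SIZE = 20 + 1 + 9 + 1 };

/* Returns 0 on success, -1 if snprintf failed or the output did not fit. */
int format_id(char *buf, size_t size, long number, const char *artist)
{
    int n;

    assert(buf != NULL && artist != NULL);
    n = snprintf(buf, size, "%ld-%.9s", number, artist);
    if (n < 0 || (size_t)n >= size)
        return -1;
    return 0;
}
```

Nothing is allocated, so there is nothing to free on the error path, but the return value still has to be checked.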

cu
Ludwig


-- 
 (o_   Ludwig Nussel
 //\   
 V_/_  http://www.suse.de/
SUSE LINUX Products GmbH, GF: Markus Rex, HRB 16746 (AG Nuernberg)




Re: [vdr] trouble with asprintf

2008-02-11 Thread Ludwig Nussel
Wolfgang Rohdewald wrote:
> since asprintf leads to segfaults if fed with incorrect UTF-8 characters,

It's not asprintf that segfaults but the subsequent call to free() on an
uninitialized pointer.

> I wanted to write a wrapper function which would then check the return value
> of asprintf. However I have a problem with the variable argument list and
> the va_* macros. Using gdb shows that, in the following example, in
>
> res=asprintf (strp, fmt, ap);
>
> ap is interpreted not as a list of arguments but as an integer.

Use vasprintf.

> int
> msprintf(char **strp, const char *fmt, ...)
> {
> va_list ap;
> int res;
> va_start (ap, fmt);
> res=asprintf (strp, fmt, ap);
> va_end (ap);
> }

Even if you use vasprintf to make the function actually work you
still need to check the return value of vasprintf otherwise this
wrapper would be kind of useless.
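Put together, such a wrapper could look roughly like this. It is a sketch and not the code that ended up in muggle or VDR. The error message text and the choice to set *strp to NULL on failure are assumptions made for this example:

```c
#define _GNU_SOURCE   /* vasprintf is a GNU/BSD extension */
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Like asprintf, but on failure it prints a message and sets *strp to
 * NULL, so that a later free(*strp) is harmless. */
int msprintf(char **strp, const char *fmt, ...)
{
    va_list ap;
    int res;

    assert(strp != NULL && fmt != NULL);
    va_start(ap, fmt);
    res = vasprintf(strp, fmt, ap);   /* takes a va_list, unlike asprintf */
    va_end(ap);

    if (res < 0) {
        fprintf(stderr, "msprintf: formatting \"%s\" failed\n", fmt);
        *strp = NULL;
    }
    return res;
}
```

Callers that ignore the return value still get a well-defined pointer, and the message on stderr shows which format string was involved.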

cu
Ludwig

-- 
 (o_   Ludwig Nussel
 //\   
 V_/_  http://www.suse.de/
SUSE LINUX Products GmbH, GF: Markus Rex, HRB 16746 (AG Nuernberg)




Re: [vdr] trouble with asprintf

2008-02-11 Thread Ludwig Nussel
Udo Richter wrote:
> Wolfgang Rohdewald wrote:
> > since asprintf leads to segfaults if fed with incorrect UTF-8 characters,
> > I wanted to write a wrapper function which would then check the return value
> > of asprintf.
>
> I never understood what the problem is with utf8 and asprintf, since
> utf8 is mostly ASCIIZ backwards compatible, and asprintf probably
> doesn't even know the difference between utf8 and ascii. What special
> handling does asprintf do with utf8? Is there some example that causes the
> trouble?

asprintf needs to check for multibyte characters to not cut them in
the middle and produce invalid output.

cu
Ludwig

-- 
 (o_   Ludwig Nussel
 //\   
 V_/_  http://www.suse.de/
SUSE LINUX Products GmbH, GF: Markus Rex, HRB 16746 (AG Nuernberg)







Re: [vdr] trouble with asprintf

2008-02-11 Thread Wolfgang Rohdewald
On Monday, 11 February 2008, Ludwig Nussel wrote:
> As you can see it doesn't segfault on asprintf but on free().

I did see that. I did not say it segfaults, but it does lead
to segfaults.

> if(asprintf(...) >= 0)
> {
>   printf(...);
>   free(...);
> }

I do not want to change dozens of places like that. I just want
one single point which can emit an error message so I can then
see what has to be done for each individual place. Most of the
asprintf calls will never get into trouble anyway. But if a user
reports a problem I prefer an error message over some vague description.

> Or just use normal snprintf as the amount of characters to print is
> fixed anyways so you don't need a variable sized buffer.

This is just a minimal sample. The real code has variable length
strings.

On Monday, 11 February 2008, Ludwig Nussel wrote:
> Even if you use vasprintf to make the function actually work you
> still need to check the return value of vasprintf otherwise this
> wrapper would be kind of useless.

Of course. See above.

-- 
Wolfgang



Re: [vdr] trouble with asprintf

2008-02-10 Thread Klaus Schmidinger
On 02/10/08 16:06, Wolfgang Rohdewald wrote:
> Hi,
>
> I am making the muggle plugin work with UTF-8 and have a little problem:
>
> since asprintf leads to segfaults if fed with incorrect UTF-8 characters,
> I wanted to write a wrapper function which would then check the return value
> of asprintf. However I have a problem with the variable argument list and
> the va_* macros. Using gdb shows that, in the following example, in
>
> res=asprintf (strp, fmt, ap);
>
> ap is interpreted not as a list of arguments but as an integer.
>
> What is wrong here?
>
> BTW I am quite sure that vdr will sometimes coredump since it never checks the
> return value of asprintf. One suspect would be if somebody used a latin1
> charset and had special characters like äöü in file names and then changes
> to utf-8 without converting file names to utf-8. If vdr then passes such
> a file name to asprintf, corrupted memory results. Might be difficult
> to debug remotely.

You could use VDR's cString::sprintf() instead.
This is probably also what I am going to do in the VDR core code,
to avoid asprintf() altogether. The single leftover vasprintf()
call in cString::sprintf() can then be made safe.

Klaus



Re: [vdr] trouble with asprintf

2008-02-10 Thread Wolfgang Rohdewald
On Sunday, 10 February 2008, Klaus Schmidinger wrote:
> You could use VDR's cString::sprintf() instead.
> This is probably also what I am going to do in the VDR core code,
> to avoid asprintf() altogether. The single leftover vasprintf()
> call in cString::sprintf() can then be made safe.

vasprintf was a good hint - I only had to change asprintf to vasprintf,
same arguments. Now it works as expected.

I will use my msprintf until you have made cString::sprintf() safe.

Thank you!

int
msprintf(char **strp, const char *fmt, ...)
{
    va_list ap;
    va_start (ap, fmt);
    int res=vasprintf (strp, fmt, ap);
    va_end (ap);
    return res;
}


-- 
Wolfgang



Re: [vdr] trouble with asprintf

2008-02-10 Thread Udo Richter
Wolfgang Rohdewald wrote:
> since asprintf leads to segfaults if fed with incorrect UTF-8 characters,
> I wanted to write a wrapper function which would then check the return value
> of asprintf.

I never understood what the problem is with utf8 and asprintf, since
utf8 is mostly ASCIIZ backwards compatible, and asprintf probably
doesn't even know the difference between utf8 and ascii. What special
handling does asprintf do with utf8? Is there some example that causes the
trouble?

Worst case I can imagine would be that there's an invalid 0 byte inside
a utf8 multibyte char, and even this would just result in a utf8
string that terminates with an incomplete char - and shouldn't handling
such crap be the job of whatever processes the utf8 string later on? At
least IMHO it would be wise to count any 0 byte as string end.


Cheers,

Udo



Re: [vdr] trouble with asprintf

2008-02-10 Thread Darren Salt
I demand that Wolfgang Rohdewald may or may not have written...

> On Sunday, 10 February 2008, Udo Richter wrote:
> > What special handling does asprintf with utf8? Is there some example that
> > causes the trouble?
> > Worst case I can imagine would be that there's an invalid 0 byte inside
> > an utf8 multibyte char

> printf and family sometimes have to count characters, so I suppose they
> have to scan UTF

No; they only ever count bytes. The encoding is irrelevant.
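For the default "C" locale this is easy to check with a small sketch like the one below. The function name cut_with_precision is made up for the example. It deliberately does not call setlocale(), so it shows nothing about what a particular libc does with an incomplete sequence in a UTF-8 locale:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Copy at most `precision` bytes of src into buf via "%.*s" and return
 * how many bytes ended up in buf. No setlocale() call is made, so this
 * runs in the default "C" locale. */
size_t cut_with_precision(char *buf, size_t size, const char *src, int precision)
{
    assert(buf != NULL && src != NULL && size > 0);
    buf[0] = '\0';
    snprintf(buf, size, "%.*s", precision, src);
    return strlen(buf);
}
```

Called with the bytes 'H' 0xC3 0xA4 'g' 'a' 'r' and a precision of 2, it keeps 'H' and the lone lead byte 0xC3, which is the cut through the middle of a character that the thread is about.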

[snip]
-- 
| Darren Salt| linux or ds at  | nr. Ashington, | Toon
| RISC OS, Linux | youmustbejoking,demon,co,uk | Northumberland | Army
| + Buy local produce. Try to walk or cycle. TRANSPORT CAUSES GLOBAL WARMING.

If a bus stops at a bus station, does work stop at a workstation?
