[issue7330] PyUnicode_FromFormat segfault

Ray.Allen Sun, 01 Aug 2010 02:28:02 -0700

Ray.Allen <ysj....@gmail.com> added the comment:

Here is the patch. It adds support for width and precision formatters in 
PyUnicode_FromFormat() for the types %s, %S, %R, %V, %U and %A, and it also 
fixes two bugs (at least I believe they are bugs):



1. According to the PyUnicode_FromFormat() doc: 
http://docs.python.org/dev/py3k/c-api/unicode.html?highlight=pyunicode_fromformat#PyUnicode_FromFormat,
"%A" should produce the result of ascii(). But in the existing code I can only 
find the part that calls ascii(object) and calculates the space needed for the 
result; nothing appends the ascii() output to the result. My own test confirms 
that %A doesn't work. Here is the simple test function:
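static PyObject *
getstr(PyObject *self, PyObject *args)
{
    const char *s = "hello world";
    PyObject *unicode = PyUnicode_FromString(s);
    PyObject *result;

    if (unicode == NULL)
        return NULL;
    result = PyUnicode_FromFormat("%A", unicode);
    Py_DECREF(unicode);  /* the format call does not take over our reference */
    return result;
}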
This should return the result of calling ascii() with the object named 
*unicode* as its argument, that is, a unicode object with the string 
"'hello world'" (ascii(), like repr(), includes the quotes). But it actually 
returns a unicode object with the string "%A". This can be fixed by adding the 
following line:
                   case 'A':
in step 4.
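
The symptom can be imitated with a toy formatter. This is only a sketch and not CPython's code: toy_format and its handle_A flag are hypothetical names, and the flag stands in for the presence or absence of the 'A' case label. A specifier that the switch does not recognize ends up in the branch that copies the format text as-is.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy output pass (hypothetical, not CPython code): "%s" and, when
   handle_A is nonzero, "%A" are replaced by arg; anything else is
   copied verbatim, which is what an unhandled specifier looks like. */
static void
toy_format(char *out, size_t outsize, const char *fmt, const char *arg,
           int handle_A)
{
    size_t n = 0;

    assert(out != NULL && fmt != NULL && arg != NULL && outsize > 0);
    while (*fmt != '\0' && n + 1 < outsize) {
        int known = 0;
        if (fmt[0] == '%') {
            switch (fmt[1]) {
            case 'A':
                known = handle_A;   /* 0 simulates the missing case label */
                break;
            case 's':
                known = 1;
                break;
            default:
                break;
            }
        }
        if (known) {
            /* Append the argument text and skip the two-character spec. */
            const char *a;
            for (a = arg; *a != '\0' && n + 1 < outsize; a++)
                out[n++] = *a;
            fmt += 2;
        }
        else {
            /* Unrecognized input: copy the character unchanged. */
            out[n++] = *fmt++;
        }
    }
    out[n] = '\0';
}
```

Called with the format "%A" and handle_A set to 0, the buffer ends up holding the literal text "%A"; with handle_A set to 1 it holds the argument text instead.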


2. Another bug. Here is a piece of code from Objects/unicodeobject.c, 
PyUnicode_FromFormatV():

797          if (*f == '%') {
798  #ifdef HAVE_LONG_LONG
799              int longlongflag = 0;
800  #endif
801              const char* p = f;
802              width = 0;
803              while (ISDIGIT((unsigned)*f))
804                  width = (width*10) + *f++ - '0';


Here the variable *width* cannot be calculated correctly, because the while 
loop never executes: at this point *f is definitely '%', which is not a digit. 
So width is always 0. Currently this doesn't cause an error, since the 
following code ensures width >= MAX_LONG_CHARS:

834        case 'd': case 'u': case 'i': case 'x':
835            (void) va_arg(count, int);
836  #ifdef HAVE_LONG_LONG
837            if (longlongflag) {
838               if (width < MAX_LONG_LONG_CHARS)
839                    width = MAX_LONG_LONG_CHARS;
840            }
841            else
842  #endif
843                /* MAX_LONG_CHARS is enough to hold a 64-bit integer,
844                 including sign.  Decimal takes the most space.  This
845                 isn't enough for octal.  If a width is specified we
846                 need more (which we allocate later). */
847                if (width < MAX_LONG_CHARS)
848                    width = MAX_LONG_CHARS;

(Currently width and precision only apply to the integer types %d, %u, %i, %x, 
not to the string and object types %s, %S, %R, %A, %U, %V.)

To fix this, the following line:
801              const char* p = f;
should be:
801              const char* p = f++;
just like the similar loop in step 4, plus one more line:
                 f--;
after width has been calculated, to readjust the character pointer.
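
The effect of the width loop can be reproduced outside CPython. The following is a minimal standalone sketch, not the code from the patch: parse_width_buggy and parse_width_fixed are hypothetical helper names, and the standard isdigit() stands in for the ISDIGIT macro.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Buggy variant: f still points at '%', so the digit loop never runs
   and the width stays 0. */
static int
parse_width_buggy(const char *f)
{
    int width = 0;

    assert(f != NULL);
    while (isdigit((unsigned char)*f))
        width = (width * 10) + *f++ - '0';
    return width;
}

/* Fixed variant: step past '%' first, then read the digits. */
static int
parse_width_fixed(const char *f)
{
    int width = 0;

    assert(f != NULL);
    if (*f == '%')
        f++;
    while (isdigit((unsigned char)*f))
        width = (width * 10) + *f++ - '0';
    return width;
}
```

For the format spec "%20d", parse_width_buggy returns 0 and parse_width_fixed returns 20.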


My patch fixes both problems. I hope somebody can take a look at it.

----------
keywords: +patch
Added file: http://bugs.python.org/file18305/issue_7330.diff

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue7330>
_______________________________________