[issue7330] PyUnicode_FromFormat segfault

2011-03-03 Thread Ray.Allen

Ray.Allen ysj@gmail.com added the comment:

Here is the updated patch:

1. Work with the function parse_format_flags(), which was introduced in 
issue10829; the patch is simpler and clearer than before.
2. Change parse_format_flags() to set the precision value to -1 in the case of 
'%s', in order to distinguish it from '%.0s'.
3. Move the call of unicode_format_align() into step 3, in order to avoid a lot 
of code like 
n += width > PyUnicode_GET_SIZE(str) ? width : PyUnicode_GET_SIZE(str); 
(following haypo's comments).
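
The idea behind item 2 can be sketched in Python. This is only an illustration, not the code from the patch: the function below merely borrows the name parse_format_flags(), and its signature and return value are assumptions. It uses -1 as the "no precision given" sentinel so that '%s' and '%.0s' stay distinguishable. A bare '.' is treated like a missing precision here, which is an assumption chosen to match the '%.s' tests mentioned elsewhere in the thread (C printf would read it as 0).

```python
DIGITS = "0123456789"


def parse_format_flags(spec):
    """Hypothetical Python analogue: split '%5.2s' into (width, precision, rest)."""
    if not spec.startswith("%"):
        raise ValueError("spec must start with '%'")
    i = 1

    # Width: zero or more decimal digits directly after '%'.
    width = 0
    while i < len(spec) and spec[i] in DIGITS:
        width = width * 10 + int(spec[i])
        i += 1

    # Precision: -1 means "not given", which differs from an explicit 0.
    precision = -1
    if i < len(spec) and spec[i] == ".":
        i += 1
        digits_start = i
        value = 0
        while i < len(spec) and spec[i] in DIGITS:
            value = value * 10 + int(spec[i])
            i += 1
        if i > digits_start:  # assumption: a bare '.' leaves precision at -1
            precision = value

    return width, precision, spec[i:]


print(parse_format_flags("%s"))
print(parse_format_flags("%.0s"))
print(parse_format_flags("%5.2s"))
```

With the sentinel, a caller can truncate only when precision >= 0 and leave the string untouched otherwise.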

--
Added file: http://bugs.python.org/file20983/issue7330_2.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue7330
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue7330] PyUnicode_FromFormat segfault

2011-02-20 Thread Ray.Allen

Ray.Allen ysj@gmail.com added the comment:

>>> With your patch, %.200s truncates the input string to 200 *characters*, 
>>> but I think that it should truncate to 200 *bytes*, as printf does.
>> 
>> Sorry, I don't understand. The result of PyUnicode_FromFormatV() is a 
>> unicode object. Then how to truncate to 200 *bytes*?
>
> You can truncate the input char* on the call to PyUnicode_DecodeUTF8:
> pass a size smaller than strlen(s).


Now I wonder how we should treat the precision modifier of '%s'. First of all, 
PyUnicode_FromFormat() should behave like C printf(). In C printf(), the 
precision of %s specifies a maximum width for the displayed result. If the 
final result is longer than that value, it must be truncated. That means the 
precision is applied to the final result. Since Python's 
PyUnicode_FromFormat() produces unicode strings, the width and precision 
should be applied to the final unicode string result. The formatting is split 
into two stages: one converts each parameter to a unicode string, the other 
applies the width and precision to it. So I wonder whether we should apply the 
precision at the conversion stage, that is, in PyUnicode_DecodeUTF8(). In my 
opinion, precision should be applied not to the input chars but to the output 
unicode characters.

I hope I didn't misunderstand something.

So haypo, what's your opinion?
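
The two readings of the precision discussed above can be compared in Python. This is a sketch for illustration only: both helper names are hypothetical, and the "replace" error handler is used because the thread says PyUnicode_FromFormatV() decodes with it.

```python
def truncate_bytes_then_decode(data, precision):
    """Precision counts input bytes: cut the UTF-8 data, then decode."""
    return data[:precision].decode("utf-8", "replace")


def decode_then_truncate_chars(data, precision):
    """Precision counts output characters: decode, then cut the string."""
    return data.decode("utf-8", "replace")[:precision]


# 'é' and 'ö' each take two bytes in UTF-8, so the two readings disagree.
raw = "héllo wörld".encode("utf-8")

print(truncate_bytes_then_decode(raw, 5))
print(decode_then_truncate_chars(raw, 5))
```

For pure ASCII input both functions return the same string; they only diverge once a character needs more than one byte.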

--




[issue7330] PyUnicode_FromFormat segfault

2011-02-20 Thread Ray.Allen

Changes by Ray.Allen ysj@gmail.com:


Removed file: http://bugs.python.org/file20739/issue_7330.diff




[issue7330] PyUnicode_FromFormat segfault

2011-02-18 Thread Ray.Allen

Ray.Allen ysj@gmail.com added the comment:

> No you don't. You can copy a substring of the input string with
> Py_UNICODE_COPY: just pass a smaller length.

Oh, yes, I get your meaning now. I'll do that.


> You can truncate the input char* on the call to PyUnicode_DecodeUTF8:

Oh, what if the truncated char* cannot be decoded correctly? E.g. a two-byte 
character is split in the middle? 


> Yes, but I am no longer sure that it is the right thing to do.

If I understand correctly (my English ability is limited), your suggestion is 
to combine them, right? I'm afraid that combining them may make the code too 
complicated to write. The current 4 steps just divide the process into smaller 
and simpler pieces. I'm not sure.

--




[issue7330] PyUnicode_FromFormat segfault

2011-02-18 Thread STINNER Victor

STINNER Victor victor.stin...@haypocalc.com added the comment:

> Oh, what if the truncated char* cannot be decoded correctly?
> E.g. a two-byte character is split in the middle? 

Yes, but PyUnicode_FromFormatV() uses the UTF-8 decoder with the replace error 
handler, so the incomplete byte sequence will be replaced by � (it doesn't 
fail with an error). Example:

>>> "abc€".encode("utf-8")[:-1].decode("utf-8", "replace")
'abc�'

--




[issue7330] PyUnicode_FromFormat segfault

2011-02-18 Thread Ray.Allen

Ray.Allen ysj@gmail.com added the comment:

> Can you add tests for %.s? I would like to know if %.s is different than 
> %s :-)

Oh sorry, I made a mistake. There is no bug here. I have attached tests that 
show that '%.s' is the same as '%s'.


Here is the updated patch:
1. Changed the function name unicode_format() to 
1. Removed

- must be a sequence, not %200s,
+ must be a sequence, not %.200s,

in Python/ceval.c.

2. Removed the use of PySequence_GetSlice() in unicode_format_align() and 
refactored to optimize the process.

3. Added tests for '%.s' and '%s', as haypo wanted.


This is obviously not the final patch; it is just convenient for others to 
review. More things need to be discussed.

--
Added file: http://bugs.python.org/file20786/issue_7330.diff




[issue7330] PyUnicode_FromFormat segfault

2011-02-17 Thread Ray.Allen

Ray.Allen ysj@gmail.com added the comment:

Thanks haypo! 

> It looks like your patch fixes #10829: you should add tests for that, you can 
> just reuse the tests of my patch (attached to #10829).

Sorry, but I think my patch doesn't fix #10829. It seems like another issue. 
After applying my patch and adding the tests from #10829's patch, the tests do 
not pass. Or did I miss something?


> You should also avoid the creation of a temporary unicode object (it can be 
> slow if precision is large) using PySequence_GetSlice(). Py_UNICODE_COPY() 
> does already truncate the string because you can pass an arbitrary length.

In order to use Py_UNICODE_COPY, I have to create a unicode object with the 
required length first. I feel this has the same cost as using 
PySequence_GetSlice(), if I understand correctly.


> With your patch, %.200s truncates the input string to 200 *characters*, but 
> I think that it should truncate to 200 *bytes*, as printf does.

Sorry, I don't understand. The result of PyUnicode_FromFormatV() is a unicode 
object. Then how to truncate to 200 *bytes*? I think the %s formatter just 
indicates that the argument is a C-style char string; the result is always a 
unicode string, and the width and precision are to be applied after converting 
the C-style chars to a string. 


> I don't like this change because I hate having to manually compute string 
> lengths. It seems that it would be easier if you formatted strings directly 
> with width and precision at step 3, instead of doing it at step 4: then you 
> can just read the length of the formatted string, and it avoids having to 
> handle width/precision in two steps (which may be inconsistent :-/).

Do you mean combine step 3 and step 4 together? Currently step 3 just computes 
the biggest width value, and step 4 computes the exact width and does the real 
formatting work. Only by doing the real formatting can we get the exact width 
of a string. So I have to compute each width twice, in both step 3 and step 4. 
Is combining the two steps into one a good idea?


> In my opinion, the patch is a little bit too big. We may first commit the fix 
> on the code parsing the width and precision: fix #10829?

Again, I guess #10829 needs its own patch. 


> Can you add tests for %.s? I would like to know if %.s is different than 
> %s :-)

Err, '%.s' causes an unexpected result both with and without my patch. Maybe 
it's yet another bug?

--




[issue7330] PyUnicode_FromFormat segfault

2011-02-17 Thread Ray.Allen

Ray.Allen ysj@gmail.com added the comment:

> Do you mean combine step 3 and step 4 together? Currently step 3 just 
> computes the biggest width value, and step 4 computes the exact width and 
> does the real formatting work. Only by doing the real formatting can we get 
> the exact width of a string. So I have to compute each width twice, in both 
> step 3 and step 4. Is combining the two steps into one a good idea?

Sorry, here I mean:

Do you mean combine step 3 and step 4 together? Currently step 3 just computes 
the biggest width value, and step 4 computes the exact width and does the 
conversion work (by calling 
PyObject_Str()/PyObject_Repr()/PyObject_ASCII()/PyUnicode_DecodeUTF8() for 
%S/%R/%A/%s). Only by doing the conversion can we get the exact width of a 
string. So I have to compute each width twice, in both step 3 and step 4. Is 
combining the two steps into one a good idea?

--




[issue7330] PyUnicode_FromFormat segfault

2011-02-17 Thread STINNER Victor

STINNER Victor victor.stin...@haypocalc.com added the comment:

>> It looks like your patch fixes #10829: you should add tests for that, you 
>> can just reuse the tests of my patch (attached to #10829).
> 
> Sorry, but I think my patch doesn't fix #10829.

Ah ok, so don't add failing tests :-)

>> You should also avoid the creation of a temporary unicode object (it can be 
>> slow if precision is large) using PySequence_GetSlice(). Py_UNICODE_COPY() 
>> does already truncate the string because you can pass an arbitrary length.
> 
> In order to use Py_UNICODE_COPY, I have to create a unicode object with 
> required length first.

No you don't. You can copy a substring of the input string with
Py_UNICODE_COPY: just pass a smaller length.

>> With your patch, %.200s truncates the input string to 200 *characters*, 
>> but I think that it should truncate to 200 *bytes*, as printf does.
> 
> Sorry, I don't understand. The result of PyUnicode_FromFormatV() is a unicode 
> object. Then how to truncate to 200 *bytes*?

You can truncate the input char* on the call to PyUnicode_DecodeUTF8:
pass a size smaller than strlen(s).

case 's':
{
    /* UTF-8 */
    const char *s = va_arg(count, const char*);
    PyObject *str = PyUnicode_DecodeUTF8(s, strlen(s), "replace");
    if (!str)
        goto fail;
    n += PyUnicode_GET_SIZE(str);
    /* Remember the str and switch to the next slot */
    *callresult++ = str;
    break;
}

I don't know if we should truncate to a number of bytes, or a number of
characters.

>> I don't like this change because I hate having to manually compute string 
>> lengths. It seems that it would be easier if you formatted strings directly 
>> with width and precision at step 3, instead of doing it at step 4: then you 
>> can just read the length of the formatted string, and it avoids having to 
>> handle width/precision in two steps (which may be inconsistent :-/).
> 
> Do you mean combine step 3 and step 4 together? Currently step 3 just 
> computes the biggest width value, and step 4 computes the exact width and 
> does the real formatting work. Only by doing the real formatting can we get 
> the exact width of a string. So I have to compute each width twice, in both 
> step 3 and step 4. Is combining the two steps into one a good idea?

> Do you mean combine step 3 and step 4 together?

Yes, but I am no longer sure that it is the right thing to do.

>> Can you add tests for %.s? I would like to know if %.s is different 
>> than %s :-)
> 
> Err, '%.s' causes an unexpected result both with and without my patch. Maybe 
> it's yet another bug?

If the fix (always have the same behaviour) is short, it would be nice
to include it in your patch.

--




[issue7330] PyUnicode_FromFormat segfault

2011-02-13 Thread Ray.Allen

Changes by Ray.Allen ysj@gmail.com:


Removed file: http://bugs.python.org/file18305/issue_7330.diff




[issue7330] PyUnicode_FromFormat segfault

2011-02-13 Thread Ray.Allen

Changes by Ray.Allen ysj@gmail.com:


Removed file: http://bugs.python.org/file19132/issue_7330.diff




[issue7330] PyUnicode_FromFormat segfault

2011-02-13 Thread Ray.Allen

Changes by Ray.Allen ysj@gmail.com:


Removed file: http://bugs.python.org/file20731/issue_7330.diff




[issue7330] PyUnicode_FromFormat segfault

2011-02-11 Thread STINNER Victor

STINNER Victor victor.stin...@haypocalc.com added the comment:

It looks like your patch fixes #10829: you should add tests for that, you can 
just reuse the tests of my patch (attached to #10829).

---

unicode_format() looks suboptimal.

+memset(buffer, ' ', width);
+width_unicode = PyUnicode_FromStringAndSize(buffer, width);

You should avoid this byte string (buffer) and use memset() on the Unicode 
string directly. Something like:

Py_UNICODE *u;
Py_ssize_t i;
width_unicode = PyUnicode_FromUnicode(NULL, width);
u = PyUnicode_AS_UNICODE(width_unicode);
for (i = 0; i < width; i++) {
    *u = (Py_UNICODE)' ';
    u++;
}

You should also avoid the creation of a temporary unicode object (it can be 
slow if precision is large) using PySequence_GetSlice(). Py_UNICODE_COPY() does 
already truncate the string because you can pass an arbitrary length.
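
The behaviour being asked of the alignment helper can be sketched in Python. This is an illustration, not the patch's code: format_align() is a hypothetical name, and right-justifying with spaces is an assumption taken from printf's default. A single slice takes only the characters that are needed, which is the Python counterpart of passing a smaller length to Py_UNICODE_COPY().

```python
def format_align(text, width=0, precision=-1):
    """Hypothetical sketch: apply printf-style precision, then width, to a string."""
    if precision >= 0:
        # Keep at most `precision` characters; -1 means no limit.
        text = text[:precision]
    if len(text) < width:
        # Pad on the left with spaces, as printf does by default.
        text = " " * (width - len(text)) + text
    return text


print(repr(format_align("abc", width=5)))
print(repr(format_align("abcdef", precision=2)))
print(repr(format_align("abc", width=5, precision=2)))
```

Python's own "%5.2s" % "abc" gives the same result as the last call, which makes it a handy cross-check.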

---

I don't like unicode_format function name: it sounds like str.format() in 
Python. A suggestion: unicode_format_align

---

With your patch, %.200s truncates the input string to 200 *characters*, but I 
think that it should truncate to 200 *bytes*, as printf does.

---

-n += PyUnicode_GET_SIZE(str);
+n += width > PyUnicode_GET_SIZE(str) ? width : PyUnicode_GET_SIZE(str);

I don't like this change because I hate having to manually compute string 
lengths. It seems that it would be easier if you formatted strings directly 
with width and precision at step 3, instead of doing it at step 4: then you 
can just read the length of the formatted string, and it avoids having to 
handle width/precision in two steps (which may be inconsistent :-/).

---

Your patch implements %.100s (and %.100U): we might decide what to do with 
#10833 before committing your patch.

---

In my opinion, the patch is a little bit too big. We may first commit the fix 
on the code parsing the width and precision: fix #10829?

---

Can you add tests for %.s? I would like to know if %.s is different than 
%s :-)

---

- must be a sequence, not %200s,
+ must be a sequence, not %.200s,

Hum, I think that there are many other places where such a fix should be done. 
Nobody noticed this typo before because neither %.200s nor %200s was 
implemented (#10833).


---

Finally, do you really need to implement %200s, %2.5s and %.100s? I don't know, 
but I would be ok to commit the patch if you fix it for all of my remarks :-)

--




[issue7330] PyUnicode_FromFormat segfault

2011-02-10 Thread Ray.Allen

Changes by Ray.Allen ysj@gmail.com:


Removed file: http://bugs.python.org/file19131/issue_7330.diff




[issue7330] PyUnicode_FromFormat segfault

2011-02-10 Thread Ray.Allen

Ray.Allen ysj@gmail.com added the comment:

Thanks haypo!

Here is the updated patch; it adds tests for the width and precision modifiers 
of %S, %R, %A. However, I don't know how to add tests for %s, since when 
calling through ctypes I could not get the correct result as a Python object 
from PyUnicode_FromFormat() with '%s' in the format string.
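
The thread does not say why the ctypes call gave a wrong value. One possible cause, offered here purely as an assumption, is that ctypes treats a foreign function's result as a C int unless restype is set. The sketch below declares the result as py_object and passes a bytes argument for '%s'. It is CPython-specific, and the wrapper name from_format() is hypothetical.

```python
from ctypes import c_char_p, py_object, pythonapi

# Look up the C function in the running interpreter (CPython only).
_from_format = pythonapi.PyUnicode_FromFormat
_from_format.argtypes = (c_char_p,)  # only the fixed format parameter is declared
_from_format.restype = py_object     # the result is a new reference to a Python object


def from_format(fmt, *args):
    """Hypothetical wrapper: str arguments go as PyObject*, bytes as char*."""
    cargs = tuple(py_object(a) if isinstance(a, str) else a for a in args)
    return _from_format(fmt, *cargs)


print(from_format(b"%s", b"abc"))
print(from_format(b"%R", "abc"))
```

The extra arguments are variadic, so ctypes converts them with its default rules: bytes become char*, and py_object wraps an object pointer for %S, %R and %A.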

--
Added file: http://bugs.python.org/file20731/issue_7330.diff




[issue7330] PyUnicode_FromFormat segfault

2011-02-10 Thread Ray.Allen

Ray.Allen ysj@gmail.com added the comment:

Here's the complete patch, with added unit tests for the width and precision 
modifiers of the '%s' formatter of the PyUnicode_FromFormat() function.

--
Added file: http://bugs.python.org/file20739/issue_7330.diff




[issue7330] PyUnicode_FromFormat segfault

2011-02-01 Thread STINNER Victor

STINNER Victor victor.stin...@haypocalc.com added the comment:

I opened other tickets related to PyUnicode_FromFormatV:

 * #10833: Replace %.100s by %s in PyErr_Format(): the arbitrary limit of 500 
bytes is outdated
 * #10831: PyUnicode_FromFormatV() doesn't support %li, %lli, %zi
 * #10830: PyUnicode_FromFormatV("%c") doesn't support non-BMP characters on 
narrow build
 * #10829: PyUnicode_FromFormatV() bugs with "%" and "%%" format strings

(see also #10832: Add support of bytes objects in PyBytes_FromFormatV())

PyUnicode_FromFormatV() has now tests in test_unicode: issue_7330.diff should 
add new tests, at least to check that %20R doesn't crash.

--




[issue7330] PyUnicode_FromFormat segfault

2011-01-31 Thread Alexander Belopolsky

Changes by Alexander Belopolsky belopol...@users.sourceforge.net:


--
components: +Unicode
nosy: +haypo




[issue7330] PyUnicode_FromFormat segfault

2010-10-05 Thread Ray.Allen

Ray.Allen ysj@gmail.com added the comment:

I updated the patch. I hope somebody can review it.

--
Added file: http://bugs.python.org/file19131/issue_7330.diff




[issue7330] PyUnicode_FromFormat segfault

2010-10-05 Thread Ray.Allen

Ray.Allen ysj@gmail.com added the comment:

I updated the patch. I hope somebody can review it.

--
Added file: http://bugs.python.org/file19132/issue_7330.diff




[issue7330] PyUnicode_FromFormat segfault

2010-10-05 Thread Ray.Allen

Ray.Allen ysj@gmail.com added the comment:

Oops! Sorry for re-submitting the request...

--




[issue7330] PyUnicode_FromFormat segfault

2010-08-01 Thread Ray.Allen

Ray.Allen ysj@gmail.com added the comment:

Here is the patch. It adds support for width and precision modifiers in 
PyUnicode_FromFormat() for the types %s, %S, %R, %V, %U, %A, and it also fixes 
two bugs (at least I believe they are bugs):


1. According to the PyUnicode_FromFormat() doc: 
http://docs.python.org/dev/py3k/c-api/unicode.html?highlight=pyunicode_fromformat#PyUnicode_FromFormat,
%A should produce the result of ascii(). But in the existing code, I can only 
find code that calls ascii(object) and calculates the space needed for it, and 
nothing that appends the ascii() output to the result. Also, according to my 
simple test, %A doesn't work, as the following simple test function shows:
static PyObject *
getstr(PyObject *self, PyObject *args)
{
    const char *s = "hello world";
    PyObject *unicode = PyUnicode_FromString(s);
    return PyUnicode_FromFormat("%A", unicode);
}
This should return the result of calling ascii() with the object named 
*unicode* as its argument. The result should be a unicode object with the 
string "hello world". But it actually returns a unicode object with the string 
"%A". This can be fixed by adding the following line:
   case 'A':
in step 4.
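
For reference, this is what ascii() returns at the Python level, which is what %A is documented to produce. It is a small illustrative sketch; the variable names are arbitrary. Note that for a str argument the result includes the surrounding quotes, and non-ASCII characters are escaped.

```python
# ascii() behaves like repr() but escapes every non-ASCII character.
plain = ascii("hello world")
escaped = ascii("caf\u00e9")

print(plain)
print(escaped)
```

So a working %A applied to the test string would be expected to yield the quoted form of hello world rather than the literal format text.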


2. Another bug: here is a piece of code in Objects/unicodeobject.c, 
PyUnicode_FromFormatV():

797  if (*f == '%') {
798  #ifdef HAVE_LONG_LONG
799  int longlongflag = 0;
800  #endif
801  const char* p = f;
802  width = 0;
803  while (ISDIGIT((unsigned)*f))
804  width = (width*10) + *f++ - '0';


Here the variable *width* cannot be calculated correctly, because the while 
loop will never execute: *f is definitely '%' at this point! So the width is 
always 0. But currently this doesn't cause an error, since the following code 
ensures width >= MAX_LONG_CHARS:

834      case 'd': case 'u': case 'i': case 'x':
835          (void) va_arg(count, int);
836  #ifdef HAVE_LONG_LONG
837          if (longlongflag) {
838              if (width < MAX_LONG_LONG_CHARS)
839                  width = MAX_LONG_LONG_CHARS;
840          }
841          else
842  #endif
843          /* MAX_LONG_CHARS is enough to hold a 64-bit integer,
844             including sign.  Decimal takes the most space.  This
845             isn't enough for octal.  If a width is specified we
846             need more (which we allocate later). */
847          if (width < MAX_LONG_CHARS)
848              width = MAX_LONG_CHARS;

(Currently width and precision only apply to the integer types %d, %u, %i, %x, 
not to the string and object types %s, %S, %R, %A, %U, %V.)

To fix this, the following line:
801  const char* p = f;
should be:
801  const char* p = f++;
just as in the similar loop in step 4, and another line:
 f--;
should be added after calculating width, to restore the character pointer.
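
The effect of starting the digit loop on the '%' itself can be simulated in Python. This is a sketch of the logic only, not of the C code: scan_width() and its skip_percent flag are hypothetical names introduced for the illustration.

```python
def scan_width(fmt, start, skip_percent):
    """Parse the width of the directive whose '%' sits at fmt[start]."""
    # The buggy variant leaves the cursor on '%'; the fixed one steps past it.
    f = start + 1 if skip_percent else start
    width = 0
    while f < len(fmt) and fmt[f] in "0123456789":
        width = width * 10 + int(fmt[f])
        f += 1
    return width


buggy = scan_width("%20d", 0, skip_percent=False)
fixed = scan_width("%20d", 0, skip_percent=True)
print(buggy, fixed)
```

With the cursor left on '%', the loop condition fails at once and the width stays 0, which matches the behaviour described above.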


My patch fixes these two problems. I hope somebody can take a look at it.

--
keywords: +patch
Added file: http://bugs.python.org/file18305/issue_7330.diff




[issue7330] PyUnicode_FromFormat segfault

2010-07-30 Thread Ray.Allen

Ray.Allen ysj@gmail.com added the comment:

Is this really worth fixing?

--




[issue7330] PyUnicode_FromFormat segfault

2010-07-29 Thread Ray.Allen

Ray.Allen ysj@gmail.com added the comment:

You can write "%20s" as an argument for PyUnicode_FromFormat(), but it has no 
effect. The width and precision modifiers are not intended to apply to string 
formatting (%s, %S, %R, %A); they only apply to integers (%d, %u, %i, %x). 
Though you can write "%20s", you cannot write "%20S", "%20R" or "%20A".


There can be several fixes:

1. Make the presence of width and precision modifiers on %s, %S, %R, %A raise 
an exception, such as ValueError, instead of a segmentation fault.
2. Make the presence of width and precision modifiers on %s, %S, %R, %A have 
no effect, just like the current %s.
3. Make the presence of width and precision modifiers on %s, %S, %R, %A have 
the correct effect, like %r and %s in string formatting in Python code.
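
The Python-level behaviour that option 3 refers to can be shown directly with the % operator. This is a short illustration of how width and precision act on %s and %r in Python code; the variable names are arbitrary.

```python
# Width pads on the left with spaces up to the requested total length.
padded = "%20s" % "abc"

# Precision keeps at most that many characters of the converted string.
truncated = "%.2s" % "abcdef"

# Both together: repr("abc") is 'abc' with quotes; precision 2 keeps "'a",
# then width 5 pads it.
both = "%5.2r" % "abc"

print(repr(padded))
print(repr(truncated))
print(repr(both))
```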


Thanks to Eric for his ideas. Now I'm sure I prefer the last fix. I will work 
out a patch for this.

--




[issue7330] PyUnicode_FromFormat segfault

2010-07-28 Thread Ray.Allen

Ray.Allen ysj@gmail.com added the comment:

I feel it's not proper to allow a width restriction on the types %S, %R, %A. 
These types correspond to PyObject_Str(), PyObject_Repr() and PyObject_ASCII() 
respectively, whose results are usually a complete string representation of an 
object. If you put a width restriction on the string, it's likely that the 
resulting string is truncated and no longer has a complete meaning. If you 
really want to put a width restriction on the result, you can use %s instead, 
with one or two more lines to get the corresponding char* from the object.

--
nosy: +ysj.ray




[issue7330] PyUnicode_FromFormat segfault

2010-07-28 Thread Marc-Andre Lemburg

Marc-Andre Lemburg m...@egenix.com added the comment:

Ray.Allen wrote:
> 
> Ray.Allen ysj@gmail.com added the comment:
> 
> I feel it's not proper to allow a width restriction on the types %S, %R, %A. 
> These types correspond to PyObject_Str(), PyObject_Repr() and 
> PyObject_ASCII() respectively, whose results are usually a complete string 
> representation of an object. If you put a width restriction on the string, 
> it's likely that the resulting string is truncated and no longer has a 
> complete meaning. If you really want to put a width restriction on the 
> result, you can use %s instead, with one or two more lines to get the 
> corresponding char* from the object.

I agree with that, but don't feel strongly about not allowing this
use case.

If it's easy to support, why not have it? Otherwise, I'd be +1 on
adding a check and raising an error in case a width modifier is used
with these markers.

--
nosy: +lemburg




[issue7330] PyUnicode_FromFormat segfault

2010-07-28 Thread Eric Smith

Eric Smith e...@trueblade.com added the comment:

I think under the "we're all consenting adults" doctrine that it should be 
allowed. If you really want that behavior, why force the char*/%s dance at each 
call site when it's easy enough to do it in one place? I don't think anyone 
supplying a width would really be surprised that it would truncate the result 
and possibly break round-tripping through repr.

Besides, it's allowed in pure python code:
>>> '%.5r' % object()
'<obje'

--




[issue7330] PyUnicode_FromFormat segfault

2010-07-27 Thread Ron Adam

Changes by Ron Adam ron_a...@users.sourceforge.net:


--
nosy: +ron_adam
title: PyUnicode_FromFormat segfault when using widths. -> 
PyUnicode_FromFormat segfault




[issue7330] PyUnicode_FromFormat segfault

2010-07-27 Thread Ezio Melotti

Changes by Ezio Melotti ezio.melo...@gmail.com:


--
nosy: +ezio.melotti




[issue7330] PyUnicode_FromFormat segfault when using widths.

2009-11-15 Thread Mark Dickinson

New submission from Mark Dickinson dicki...@gmail.com:

There seems to be something wrong with the width handling code in 
PyUnicode_FromFormat;  or perhaps I'm misusing it.

To reproduce:  replace the line

   return PyUnicode_FromFormat("range(%R, %R)", r->start, r->stop);

in range_repr in Objects/rangeobject.c with

   return PyUnicode_FromFormat("range(%20R, %20R)", r->start, r->stop);

On my machine (OS X 10.6), this results in a segfault when invoking 
range_repr:

Python 3.2a0 (py3k:76311M, Nov 15 2009, 19:16:40) 
[GCC 4.2.1 (Apple Inc. build 5646)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> range(0, 10)
Segmentation fault

Perhaps these modifiers aren't supposed to be used with a width?

--
messages: 95306
nosy: mark.dickinson
severity: normal
status: open
title: PyUnicode_FromFormat segfault when using widths.
type: crash
versions: Python 3.2




[issue7330] PyUnicode_FromFormat segfault when using widths.

2009-11-15 Thread Eric Smith

Eric Smith e...@trueblade.com added the comment:

It looks like PyUnicode_FromFormatV is computing callcount incorrectly.
It's looking for 'S', 'R', or 'A' immediately following '%', before the
width. It seems to me it should be treating them the same as 's',
although I'll admit to not having looked at it closely enough to know
exactly what's going on.
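
The suspected miscount can be sketched in Python. This models the description above rather than the real C code: both function names are hypothetical, and the assumption is that the faulty scan only inspects the character directly after '%'.

```python
OBJECT_CONVERSIONS = "SRA"


def count_naive(fmt):
    """Count %S/%R/%A only when the letter directly follows '%'."""
    return sum(
        1
        for i in range(len(fmt) - 1)
        if fmt[i] == "%" and fmt[i + 1] in OBJECT_CONVERSIONS
    )


def count_skipping_flags(fmt):
    """Count %S/%R/%A after skipping any width and precision digits."""
    count = 0
    i = 0
    while i < len(fmt):
        if fmt[i] == "%":
            i += 1
            # Step over width and precision, e.g. the "20" in "%20R".
            while i < len(fmt) and fmt[i] in "0123456789.":
                i += 1
            if i < len(fmt) and fmt[i] in OBJECT_CONVERSIONS:
                count += 1
        i += 1
    return count


print(count_naive("range(%20R, %20R)"))
print(count_skipping_flags("range(%20R, %20R)"))
```

If the first pass counts no object conversions while a later pass still converts two objects, the result slots would be undersized, which is consistent with the reported crash.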

The whole routine could use some attention, I think.

--
nosy: +eric.smith
