Re: [Python-Dev] Typo.pl scan of Python 2.5 source code

2006-10-27 Thread Johnny Lee


I grabbed the latest Python 2.5 code via Subversion and ran my typo script on it.

Weeding out the obvious false positives and Neal's comments leaves about 129 typos.

See http://www.geocities.com/typopl/typoscan.htm

Should I enter the typos as bugs in the Python bug db?
J



Date: Fri, 22 Sep 2006 21:51:38 -0700
From: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Subject: Re: [Python-Dev] Typo.pl scan of Python 2.5 source code
CC: python-dev@python.org

On 9/22/06, Johnny Lee [EMAIL PROTECTED] wrote:
> Hello,
> My name is Johnny Lee. I have developed a *ahem* perl script which scans
> C/C++ source files for typos.

Hi Johnny.

Thanks for running your script, even if it is written in Perl and run on Windows. :-)

> The Python 2.5 typos can be classified into 7 types.
>
> 2) realloc overwrite src if NULL, i.e. p = realloc(p, new_size);
> If realloc() fails, it will return NULL. If you assign the return value to
> the same variable you passed into realloc,
> then you've overwritten the variable and possibly leaked the memory that the
> variable pointed to.

A bunch of these warnings were accurate and a bunch were not. There were 2 reasons for the false positives: 1) the pointer was aliased, thus not lost; 2) on failure, we exited (Parser/*.c).

> 4) if ((X!=0) || (X!=1))

These 2 cases occurred in binascii. I have no idea if the warning is right or the code is.

> 6) XX;;
> Just being anal here. Two semicolons in a row. Second one is extraneous.

I already checked in a fix for these on HEAD. Hard for even me to screw up those fixes. :-)

> 7) extraneous test for non-NULL ptr
> Several memory calls that free memory accept NULL ptrs.
> So testing for NULL before calling them is redundant and wastes code space.
> Now some codepaths may be time-critical, but probably not all, and smaller
> code usually helps.

I ignored these as I'm not certain all the platforms we run on accept free(NULL).

Below is my categorization of the warnings except #7. Hopefully someone will fix all the real problems in the first batch.

Thanks again!
n
--

# Problems
Objects\fileobject.c (338): realloc overwrite src if NULL; 17: file->f_setbuf=(char*)PyMem_Realloc(file->f_setbuf,bufsize)
Objects\fileobject.c (342): using PyMem_Realloc result w/no check 30: setvbuf(file->f_fp, file->f_setbuf, type, bufsize); [file->f_setbuf]
Objects\listobject.c (2619): using PyMem_MALLOC result w/no check 30: garbage[i] = selfitems[cur]; [garbage]
Parser\myreadline.c (144): realloc overwrite src if NULL; 17: p=(char*)PyMem_REALLOC(p,n+incr)
Modules\_csv.c (564): realloc overwrite src if NULL; 17: self->field=PyMem_Realloc(self->field,self->field_size)
Modules\_localemodule.c (366): realloc overwrite src if NULL; 17: buf=PyMem_Realloc(buf,n2)
Modules\_randommodule.c (290): realloc overwrite src if NULL; 17: key=(unsigned#long*)PyMem_Realloc(key,bigger*sizeof(*key))
Modules\arraymodule.c (1675): realloc overwrite src if NULL; 17: self->ob_item=(char*)PyMem_REALLOC(self->ob_item,itemsize*self->ob_size)
Modules\cPickle.c (536): realloc overwrite src if NULL; 17: self->buf=(char*)realloc(self->buf,n)
Modules\cPickle.c (592): realloc overwrite src if NULL; 17: self->buf=(char*)realloc(self->buf,bigger)
Modules\cPickle.c (4369): realloc overwrite src if NULL; 17: self->marks=(int*)realloc(self->marks,s*sizeof(int))
Modules\cStringIO.c (344): realloc overwrite src if NULL; 17: self->buf=(char*)realloc(self->buf,self->buf_size)
Modules\cStringIO.c (380): realloc overwrite src if NULL; 17: oself->buf=(char*)realloc(oself->buf,oself->buf_size)
Modules\_ctypes\_ctypes.c (2209): using PyMem_Malloc result w/no check 30: memset(obj->b_ptr, 0, dict->size); [obj->b_ptr]
Modules\_ctypes\callproc.c (1472): using PyMem_Malloc result w/no check 30: strcpy(conversion_mode_encoding, coding); [conversion_mode_encoding]
Modules\_ctypes\callproc.c (1478): using PyMem_Malloc result w/no check 30: strcpy(conversion_mode_errors, mode); [conversion_mode_errors]
Modules\_ctypes\stgdict.c (362): using PyMem_Malloc result w/no check 30: memset(stgdict->ffi_type_pointer.elements, 0, [stgdict->ffi_type_pointer.elements]
Modules\_ctypes\stgdict.c (376): using PyMem_Malloc result w/no check 30: memset(stgdict->ffi_type_pointer.elements, 0, [stgdict->ffi_type_pointer.elements]

# No idea if the code or tool is right.
Modules\binascii.c (1161)
Modules\binascii.c (1231)

# Platform specific files. I didn't review and won't fix without testing.
Python\thread_lwp.h (107): using malloc result w/no check 30: lock->lock_locked = 0; [lock]
Python\thread_os2.h (141): using malloc result w/no check 30: (long)sem)); [sem]
Python\thread_os2.h (155): using malloc result w/no check 30: lock->is_set = 0; [lock]
Python\thread_pth.h (133): using malloc result w/no check 30: memset((void *)lock, '\0', sizeof(pth_lock)); [lock]
Python\thread_solaris.h (48): using malloc result w/no check 30: funcarg->func = func; [funcarg]
Python\thread_solaris.h (133): using malloc result w/no check 30: if(mutex_init(lock,USYNC_THREAD,0)) [lock]

# Who cares about

[Python-Dev] Typo.pl scan of Python 2.5 source code

2006-09-22 Thread Johnny Lee





Hello,

My name is Johnny Lee. I have developed a *ahem* perl script which scans C/C++ source files for typos.

I ran the typo.pl script on the released Python 2.5 source code. The scan took about two minutes and produced ~340 typos.

After spending about 13 minutes weeding out the obvious false positives, 149 typos remain.

One of the pros/cons of the script is that it doesn't need to be integrated into the build process to work. It just searches for files with typical C/C++ source code file extensions and scans them. The downside is that if a source file is not included in the build process, then the script is scanning an irrelevant file. Unless you aid the script via some parameters, it will scan all the code, even stuff inside #ifdef's that wouldn't normally be compiled.

You can access the list of typos from http://www.geocities.com/typopl/typoscan.htm

The Perl 1999 paper can be read at http://www.geocities.com/typopl/index.htm

I've mapped the Python memory-related calls PyMem_Malloc, PyMem_Realloc, etc. to the same behaviour as the C std library malloc, realloc, etc. since Include\pymem.h seems to map them to those calls. If that assumption is not valid, then you can ignore typos that involve those PyMem_XXX calls.

The Python 2.5 typos can be classified into 7 types.

1) if (X = 0)
Assignment within an if statement. Typically a false positive, but sometimes it catches something valid.
In Python's case, the one typo is:
    if (status = ERROR_MORE_DATA)
but the previous statement returns an error code into the status variable, which this assignment then overwrites.

2) realloc overwrite src if NULL, i.e. p = realloc(p, new_size);
If realloc() fails, it will return NULL. If you assign the return value to the same variable you passed into realloc, then you've overwritten the variable and possibly leaked the memory that the variable pointed to.

3) if (CreateFileMapping == IHV)
On Win32, the CreateFileMapping() API will return NULL on failure, not INVALID_HANDLE_VALUE. The Python code does not check for NULL though.

4) if ((X!=0) || (X!=1))
The problem with code of this type is that it doesn't work. In the Python case, we have in a large if statement:
    quotetabs && ((data[in]!='\t')||(data[in]!=' '))
Now if data[in] == '\t', then it will fail the first data[in] comparison but it will pass the second one, so the inner test is always true. Typically you want "&&" not "||".

5) using API result w/no check
There are several APIs that should be checked for success before using the returned ptrs/cookies, i.e. malloc, realloc, and fopen among others.

6) XX;;
Just being anal here. Two semicolons in a row. Second one is extraneous.

7) extraneous test for non-NULL ptr
Several memory calls that free memory accept NULL ptrs. So testing for NULL before calling them is redundant and wastes code space. Now some codepaths may be time-critical, but probably not all, and smaller code usually helps.

If you have any questions or comments, feel free to email. I hope this scan is useful.

Thanks for your time,
J
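
For illustration only, types 2 and 4 above can be sketched in a few lines of C. This is a minimal sketch and not code from the Python source: grow_buffer, not_tab_or_space_buggy and not_tab_or_space_fixed are hypothetical names invented for this example. The first function keeps the realloc() result in a temporary so the original pointer survives a failure; the other two contrast the always-true "||" form with the "&&" form that was probably intended.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper (not from the Python source): grow *buf to new_size
 * bytes without losing the old block if realloc() fails.
 * Assumes buf is non-NULL and new_size > 0.
 * Returns 0 on success, -1 on failure; on failure *buf is left unchanged. */
int grow_buffer(char **buf, size_t new_size)
{
    char *tmp;

    assert(buf != NULL && new_size > 0);

    tmp = realloc(*buf, new_size);  /* keep the result in a temporary */
    if (tmp == NULL)
        return -1;                  /* *buf still points at the old block */

    *buf = tmp;                     /* commit only after success */
    return 0;
}

/* Type 4 as flagged: c cannot equal both '\t' and ' ', so at least one
 * of the two comparisons is true and the result is 1 for every c. */
int not_tab_or_space_buggy(char c)
{
    return (c != '\t') || (c != ' ');
}

/* Probable intent: true only when c is neither a tab nor a space. */
int not_tab_or_space_fixed(char c)
{
    return (c != '\t') && (c != ' ');
}
```

A caller that gets -1 back from grow_buffer still owns the old buffer and can free it or keep using it, which is exactly what the p = realloc(p, new_size) form makes impossible.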
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev