Hello folks,

SUMMARY: I am offering to implement fixes in PIL to make its handling of 16-bit image types less vexing. Are the PIL maintainers interested in such help? I have a few questions to ask, and then I could get to work if a maintainer would give me a preliminary go-ahead.

THE FULL STORY: I'm trying to use PIL for a scientific imaging library. Unfortunately, many of the images I need to deal with are 16-bit grayscale image files, an image mode that PIL does not handle well. Thus I will need either to write work-arounds in my code or to fix PIL itself. I would prefer the latter.

I previously sent a patch to this list to allow for proper reading of 16-bit TIFF files; however, that patch is very simple-minded, and I think that with a few more changes PIL could support 16-bit grayscale files in a much better way.

However, there's a major philosophical issue that needs to be addressed first! Specifically, does mode 'I;16' mean '16-bit little-endian integers' or '16-bit native integers'? In practice the code uses the former definition; however, Imaging.h declares the latter to be the case.

The real issue is that all other multi-byte types like 'I' and 'F' are stored in native byte order in memory, regardless of how they are read in. Thus, both users and developers are abstracted from the question of byte ordering. (Except that developers still need to care about endian-ness at serialization time.) However, 16-bit image types are not so insulated, which is what makes them so vexing in PIL. This doubles the work of adding 16-bit compatibility to any image function, because you have to write the compatibility twice, once for each endian-ness. Also, writing a function to deal with a particular byte ordering which may or may not be native is both error-prone and inefficient.

At root, the problem is one of nomenclature. Pack.c and Unpack.c make a distinction between 'raw modes' like 'I;32' (which implicitly means 32-bit little-endian) and normal user-level 'modes' like 'I' (which means 32-bit native-endian). However, the use of 'I;16' as a user-level image mode has clouded the issue, because even at the user level it means 'little-endian'.

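
To make the distinction concrete, here is a minimal sketch that uses only Python's standard struct module, not PIL itself. It shows the same two bytes read as little-endian, big-endian, and native 16-bit integers, plus a hypothetical helper (to_native_u16 is a name invented for this illustration, not a PIL function) that re-packs file-order samples into native order, which is roughly what an unpacker would have to do if 'I;16' were a native-order mode.

```python
import struct
import sys

# Two bytes as they might sit in a file or a raw buffer.
raw = bytes([0x34, 0x12])

little = struct.unpack("<H", raw)[0]  # read as little-endian: 0x1234
big = struct.unpack(">H", raw)[0]     # read as big-endian:    0x3412
native = struct.unpack("=H", raw)[0]  # depends on sys.byteorder


def to_native_u16(data: bytes, file_order: str) -> bytes:
    """Re-pack unsigned 16-bit samples from file_order into native order.

    Hypothetical helper for illustration only; it is not part of PIL.
    """
    if file_order not in ("little", "big"):
        raise ValueError("file_order must be 'little' or 'big'")
    if len(data) % 2:
        raise ValueError("buffer length must be a multiple of 2")
    count = len(data) // 2
    prefix = "<" if file_order == "little" else ">"
    values = struct.unpack(f"{prefix}{count}H", data)
    # "=" means native byte order with standard sizes (2 bytes per H).
    return struct.pack(f"={count}H", *values)


if __name__ == "__main__":
    print(sys.byteorder, hex(little), hex(big), hex(native))
```

Once the data has been converted on the way in, any code that works on the in-memory buffer only ever sees native-order values, which is the insulation that 'I' and 'F' already enjoy.
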
These subtle differences in meaning cause a lot of the 16-bit manipulation bugs that I've seen in PIL so far. I think that Imaging.h is correct in that 'I;16' ought to be treated as native byte order when it is used as an image mode (just as 'I' is). However, as a raw mode, 'I;16' needs to mean 'little-endian', just as 'I;32' means 'little-endian'. This change wouldn't be too hard to make, and it would be (mostly) backwards compatible. However, having one name for two different entities (a 'mode' with native order and a 'raw mode' with little-endian order) is likely to be very confusing and a source of future bugs.

It seems like the better solution would be to add a new '16-bit unsigned native byte order' image type to PIL -- maybe 'S' for 'short' -- and reserve 'I;16' and 'I;16B' strictly for raw modes. The only problem is that this would break some older code that relies on these 'experimental' features.

Is anyone interested in discussing these options (and several other bugs in PIL's image packing and unpacking that I've discovered while looking at the code)? I'm happy to take this on, since these are changes I need to make anyway for my project, and I'd rather see them in the PIL trunk than in my own fork.

Thanks,

Zach Pincus
Program in Biomedical Informatics and Department of Biochemistry
Stanford University School of Medicine

_______________________________________________
Image-SIG maillist - Image-SIG@python.org
http://mail.python.org/mailman/listinfo/image-sig