On Monday, January 11, 2010 4:50 PM, Ted Kremenek wrote:
> On Jan 11, 2010, at 1:24 PM, Ken Dyck wrote:
>
> >> I'm also concerned about the dimensionality here. Why did we choose
> >> 'Chars' instead of 'Bytes'?
> >
> > The short answer is that it reflects how getTypeSizeInChars()
> > calculates its value. It divides the bit size of the type by the bit
> > size of the char type, so calling them CharUnits seemed more accurate
> > than ByteUnits. The aim is to eventually support character widths
> > other than 8.
> >
> > What specifically are you concerned about?
>
> I'm concerned that the uses of getTypeSize() / 8 always want
> the size in bytes, not chars (if the size of chars differs
> from the size of bytes). Code that expects
> getTypeSizeInChars() to return the size in bytes (which is
> all the cases in libAnalysis) will get the wrong results.
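
To make the difference concrete, here is a rough standalone sketch of
the two calculations being compared. It is not clang code: sizeInChars()
and sizeInOctets() are hypothetical helpers made up for illustration,
and the 16-bit char target is an assumption chosen to show where the
results diverge.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for this discussion; none of these are clang APIs.

// Size in units of the target's char type: the bit size of the type divided
// by the bit width of char, mirroring the calculation described above for
// getTypeSizeInChars().
inline std::uint64_t sizeInChars(std::uint64_t typeSizeBits,
                                 std::uint64_t charWidthBits) {
  // Assumption: every type is a whole number of chars wide.
  assert(charWidthBits != 0 && typeSizeBits % charWidthBits == 0);
  return typeSizeBits / charWidthBits;
}

// Size in 8-bit units, which is what the "getTypeSize() / 8" idiom computes
// regardless of how wide char is on the target.
inline std::uint64_t sizeInOctets(std::uint64_t typeSizeBits) {
  return typeSizeBits / 8;
}
```

With an 8-bit char, both helpers give 4 for a 32-bit type. With the
assumed 16-bit char, sizeInChars(32, 16) gives 2 while sizeInOctets(32)
still gives 4, so code that treats the two as interchangeable would
disagree with itself on such a target.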

Just to get the terminology straight here, when we talk about bytes do
we mean:

 A. an 8-bit value,
 B. the smallest addressable unit of memory on a machine,
 C. an addressable unit of data storage large enough to hold any member
    of the basic character set of the execution environment (C99), or
 D. something else?

However we define byte, it seems at least theoretically possible for the
character type to have a different width, so I think Ted makes a valid
point. If there is code that expects a size in bytes (however defined),
perhaps we need to add another API.

As a clang newbie, I find it difficult to determine whether a literal 8
means the width of a byte or that of a character, so I'm relying on you
guys for reviews.

So far, I have been approaching the problem with definition C above and
the simplifying assumption that clang enforces byte width == char width,
even if neither is 8. This allows characters and bytes to be used
interchangeably.

-Ken
