Re: [Harbour] BYTE -> UCHAR patch

Viktor Szakáts Sun, 08 Feb 2009 08:53:48 -0800

Hi Przemek,
Now I think I understand your concept, plus that your
main concern (for now) is that BYTE's signedness isn't
currently well defined. Of course I didn't want to change
anything that could break compatibility.


Here are my draft steps for switching to a properly typed core.
I keyed this in yesterday, and your types seem to fit nicely
into the concept; I agree on the naming, too:

--- [ draft / brainstorm stage ]
Safe path to proper typing:
1. Copy all current Harbour-defined generic (legacy) types to an
   "hb"-prefixed version.
2. Convert all current usages of Harbour-defined generic (legacy) types
   to the "hb"-prefixed versions (essentially search and replace).
3. Move Clipper compatibility types into *.api/extend.h/clipdefs.h.
4. Add a new define to enable/disable legacy types. Leave it enabled by
   default for a transition period. Harbour developers should explicitly
   disable legacy types locally by setting HB_USER_CFLAGS appropriately.
5. Add Harbour abstract types (all "hb"-prefixed) for all important
   entities in Harbour that don't yet have one but would be nice to
   control independently.
   Favour plain ANSI C types (unsigned char, int, long, unsigned int,
   unsigned long) _where possible_. size_t should be handled like any
   other foreign type that external packages might happen to use: we
   should explicitly cast to/from it.
   Possible list of such abstract types:
   - Harbour character (text) (currently: char, BYTE, SCHAR, UCHAR)
     -> HB_CHAR ?
   - Harbour character (binary/raw) (currently: char, BYTE, SCHAR, UCHAR)
     -> HB_BYTE ?
   - Harbour string (text) (currently: char *, BYTE *, SCHAR *, UCHAR *)
     -> PHB_CHAR / HB_CHAR * ?
   - Harbour string (binary/raw) (currently: char *, BYTE *, SCHAR *,
     UCHAR *) -> PHB_BYTE / HB_BYTE * ?
   - Harbour string/array/hash length / index (currently: ULONG)
     -> HB_POS ? (== long) / long
   - Harbour integer num (currently: int) -> OKAY
   - Harbour long num (currently: long ?) -> OKAY
   - Harbour longlong num (currently: HB_LONG ?) -> OKAY
   - Harbour float (currently: double) -> OKAY
   - Harbour date (currently: long) -> HB_DATELONG
   - Harbour time (currently: LONG) -> HB_TIMELONG / long
   - Harbour GT color (currently: BYTE) -> HB_COLOR ? int ?
   - Harbour GT attribute (currently: BYTE) -> HB_ATTR (existing type) ?
     int ?
   - Harbour internal indexes (pcode pointer, param count, line no,
     class/method ID) (currently: USHORT | LONG) -> ?
   - probably more, please extend. (colors, screen positions, other
     internal types...)
6. Begin converting HB_-prefixed _generic_ types to abstract ones.
   This isn't automatic, and isn't done in one pass.
   It's rather a process, but at this point the code is already
   free of legacy types, so we can take our time to further raise
   code quality and consistency by moving to abstract ones.
---
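
To make steps 4 and 5 more concrete, here is a minimal sketch in C. It is
only an illustration of the idea, not a proposed patch: the macro name
HB_LEGACY_TYPES_OFF, the typedef choices and the two hb_sketch_* helpers
are all assumptions made up for this example.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical switch (step 4): legacy names stay available unless a
   developer opts out, e.g. with HB_USER_CFLAGS=-DHB_LEGACY_TYPES_OFF. */
#ifndef HB_LEGACY_TYPES_OFF
typedef unsigned char BYTE;      /* legacy generic names, kept for the */
typedef unsigned long ULONG;     /* transition period                  */
#endif

/* Hypothetical abstract types (step 5), built on plain ANSI C types. */
typedef char          HB_CHAR;   /* text character                      */
typedef unsigned char HB_BYTE;   /* raw/binary octet                    */
typedef long          HB_POS;    /* string/array/hash length or index   */

/* Text API: takes HB_CHAR *, so string literals need no cast. */
static HB_POS hb_sketch_textlen( const HB_CHAR * szText )
{
   assert( szText != NULL );
   return ( HB_POS ) strlen( szText );
}

/* Binary API: sums raw octets; the unsigned type keeps values in 0..255. */
static unsigned long hb_sketch_bytesum( const HB_BYTE * pData, HB_POS nLen )
{
   unsigned long ulSum = 0;
   HB_POS n;

   assert( pData != NULL || nLen == 0 );
   for( n = 0; n < nLen; ++n )
      ulSum += pData[ n ];
   return ulSum;
}
```

Code that only uses the abstract names compiles the same way with or
without the legacy switch, which is what would let the legacy names be
turned off locally during the transition.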

Brgds,
Viktor

On Sun, Feb 8, 2009 at 4:48 PM, Przemyslaw Czerpak <[email protected]> wrote:

> On Sat, 07 Feb 2009, Szakáts Viktor wrote:
>
> Hi Viktor,
>
> > > I suggest changing all 'BYTE *' used as file names in the Harbour
> > > FS API and similar functions to 'char *', so as not to replicate
> > > the problems which come from Clipper, e.g. from CL53 header files:
> > >   typedef unsigned char BYTE;
> > >   typedef BYTE far * BYTEP;
> > >   extern void _retc(BYTEP);
> > >   extern void _retclen(char far *, unsigned int);
> > > You may find similar problems also in CL52.
> > > It forces explicit casting in C++ mode if char is signed. Such casting
> > > has a bad side effect: it can hide typos or even errors that are
> > > normally easy to catch at compile time, e.g. pFile wrongly used
> > > instead of szFile:
> > >   PHB_ITEM pFile = hb_param( 1, HB_IT_STRING );
> > >   char * szFile = hb_itemGetCPtr( pFile );
> > >   HB_FHANDLE hFile = hb_fsOpen( ( BYTE * ) pFile, ... );
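
The effect described above can be reproduced outside Harbour. The sketch
below uses made-up stand-ins (ITEM, sketch_open, sketch_open_legacy and
sketch_demo are hypothetical names, not Harbour API) to show why a
'char *' parameter lets the compiler catch the mix-up while a forced
'BYTE *' cast hides it.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned char BYTE;                     /* as in the CL53 headers      */
typedef struct { const char * szValue; } ITEM;  /* hypothetical item stand-in  */

/* Hypothetical open function typed the way file names are proposed to be:
   plain char *. Returns 0 on success, -1 on a NULL or empty name. */
static int sketch_open( const char * szFile )
{
   return ( szFile != NULL && szFile[ 0 ] != '\0' ) ? 0 : -1;
}

/* Legacy-style signature: BYTE * forces callers to cast their char * names. */
static int sketch_open_legacy( const BYTE * pFile )
{
   return sketch_open( ( const char * ) pFile );
}

static int sketch_demo( void )
{
   ITEM item = { "test.dbf" };
   ITEM * pFile = &item;
   const char * szFile = pFile->szValue;

   assert( szFile != NULL );

   /* sketch_open( pFile );
      Rejected by C++ compilers and diagnosed by C compilers as an
      incompatible pointer type, so the typo is caught at compile time. */

   /* sketch_open_legacy( ( const BYTE * ) pFile );
      Compiles silently because of the cast, but passes the bytes of the
      item structure instead of the file name. */

   return sketch_open( szFile );   /* correct call, no cast required */
}
```

With the 'char *' signature the correct call needs no cast at all, so any
cast that does appear at a call site becomes a signal worth reviewing.
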
> > Yes, I see your point.
> > I'll try to fix that before committing anything, but first
> > I have some questions.
> > Am I right in assuming that we will definitely take the UTF-8
> > route then (for filenames, too), and in that case 'char' may
> > just hold UTF-8 strings in the future? (Had we chosen UTF16,
> > we might need to centrally redefine 'char' to double bytes in
> > the future, so such an abstract type might have its benefits.)
>
> I do not think we can decide now. UTF-8 allows us to keep all
> existing functions with 'char' as is, but in some cases U16 (it's
> not exactly UTF16) is better because it's a fixed-length format.
> Anyhow, I do not think we can fully drop the pure char API. It can
> still be useful for different libraries, so maybe we will have both.
> I would like to return to this subject when we have the CDP API
> and are working on Unicode support.
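
The trade-off mentioned here (UTF-8 fits into existing 'char' functions,
a fixed-length format gives direct indexing) can be illustrated with a
small sketch. The function name sketch_utf8_len is hypothetical and the
code assumes well-formed UTF-8 input; it is not part of any Harbour API.

```c
#include <assert.h>
#include <stddef.h>

/* Counts code points in a NUL-terminated UTF-8 string by skipping
   continuation bytes (bit pattern 10xxxxxx). Assumes well-formed input.
   The whole string has to be scanned, whereas in a fixed-length format
   the character count is simply the element count. */
static size_t sketch_utf8_len( const char * szText )
{
   size_t nCount = 0;

   assert( szText != NULL );
   for( ; *szText != '\0'; ++szText )
   {
      /* Cast to unsigned char so the bit test does not depend on
         whether plain char is signed. */
      if( ( ( unsigned char ) *szText & 0xC0 ) != 0x80 )
         ++nCount;
   }
   return nCount;
}
```

The string still travels through an ordinary 'const char *' parameter,
which is why UTF-8 would let existing 'char' based functions stay as they
are, at the cost of byte length and character count no longer matching.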
>
> > > In your modifications you replaced BYTE * used as file names in the
> > > Harbour FS API with UCHAR *. File names are text strings and I want
> > > to use simple char * for them, and I want to change BYTE used as a
> > > synonym for an 8-bit unsigned integer to UCHAR.
> > Is this true for GT strings, too? (to convert them to char, not UCHAR)
>
> It will depend on the functions.
> Here we will have interaction with Unicode, too.
> Now characters and strings are passed as BYTEs.
> Let's leave it for now. Here we should also make a few other
> modifications which will break backward binary compatibility, so it
> should be grouped with other modifications.
>
> > Q1: Having Unicode support in mind, shouldn't we use some distinct
> > marking for 'text' (char *) data?
>
> Using 'char *' already does that.
>
> > Q2: May I change UCHAR / SCHAR to HB_UCHAR / HB_SCHAR,
> > as part of moving our special types to our own namespace?
> > (UCHAR is also a Microsoft type name.)
>
> Please not now. We have many different types and we haven't decided
> what the final version should look like and what namespace we should
> use. For now I suggest making only basic modifications which do not
> break backward binary compatibility, and then freezing modifications
> to release a new stable version with MT support. When this version
> is ready we can start the discussion on core API modifications
> and begin to implement them. Such modifications may take a long time,
> even more than a year, and people will need a stable version with MT,
> so we have to release it first. Probably it should be version 1.2.
> After releasing 1.2 we can start a new branch with 1.3 and begin to
> introduce the agreed modifications. If backward source code and/or
> binary compatibility is seriously broken by the new API (I expect it
> in this case) then the next stable release should probably have number
> 2.0 to distinguish between the 1.x and 2.x branches and APIs.
>
>
> > Q3: Can we define the rules for different string/char types, so that
> > everyone speaks the same language here:
> > - HB_UCHAR, HB_SCHAR: ... ?
> > - char / HB_TEXT?: Harbour character/string (with future UTF-8 support) ?
> > - BYTE: ... Harbour raw binary data or other simple numeric BYTE ?
>
> The sign of BYTE is platform/C compiler dependent, so it should not be
> used internally as a number holder, because programmers do not know
> whether it will be mapped to signed or unsigned char. This type should
> rather be used to mark data where the sign is undefined.
> I do not know yet what namespace we will use and how we will represent
> Unicode strings, so I do not think we should make any such modifications
> now. Please remember that each time you change some definitions in
> core code you force code updates in 3rd-party projects. Making such
> modifications with important reasons and then reverting them or changing
> them to something else is bound to be a serious problem for 3rd-party
> code developers. We have to know our final goal well before we start
> anything like that.
>
> > Maybe that would help for everyone to see more clearly.
>
> If we change it once again in a few months, then certainly not.
>
> best regards,
> Przemek
>
> PS. For new types I suggest using something like:
>
>   ANSI C types:
>      void,
>      [ [un]signed ] char, [ [un]signed ] short, [ [un]signed ] int,
>      [ [un]signed ] long, double,
>
>      [ [un]signed ] long long is not supported by some platforms /
>      C compilers, so it should not be used
>
>   harbour overloaded types:
>      hbChar, hbSChar, hbUChar, hbShort, hbUShort, hbInt, hbUInt,
>      hbLong, hbULong, hbLongLong, hbULongLong, hbDouble,
>      hbMaxInt, hbMaxDouble,
>      hbCounter, hbSize, hbPtrDiff, hbPtrVal,
>      hbPointer,
>      hbWChar // for future wide character representation
>
>   harbour strict bit types:
>      hbI8, hbU8, hbI16, hbU16, hbI32, hbU32, hbI64, hbU64
>
>   Types which depend on internal HVM/compilation settings:
>      hbMaxVMInt - maximal integer which can be stored in an HVM item
>                   (HB_IT_LONG). It's the current HB_LONG; usually it
>                   will be the same as hbMaxInt unless for some reason
>                   it is reduced, e.g. a compiler may support 128-bit
>                   integers as hbMaxInt but we may not use them for
>                   HB_IT_LONG due to the performance reduction.
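
One possible way to back the strict bit types from the list above is the
C99 <stdint.h> header, where it is available. This is only a sketch under
that assumption: the thread does not say how these types would be
defined, and SKETCH_STATIC_ASSERT, sketch_u16_add and sketch_to_u8 are
hypothetical helpers.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical mapping of the proposed strict bit types onto C99 types. */
typedef int8_t   hbI8;
typedef uint8_t  hbU8;
typedef int16_t  hbI16;
typedef uint16_t hbU16;
typedef int32_t  hbI32;
typedef uint32_t hbU32;
typedef int64_t  hbI64;
typedef uint64_t hbU64;

/* Compile-time size check that also works before C11: the array size
   becomes -1, and compilation fails, when the condition is false. */
#define SKETCH_STATIC_ASSERT( cond, name ) \
   typedef char sketch_assert_##name[ ( cond ) ? 1 : -1 ]

SKETCH_STATIC_ASSERT( sizeof( hbU8 )  * CHAR_BIT ==  8, u8_is_8_bits );
SKETCH_STATIC_ASSERT( sizeof( hbU16 ) * CHAR_BIT == 16, u16_is_16_bits );
SKETCH_STATIC_ASSERT( sizeof( hbU32 ) * CHAR_BIT == 32, u32_is_32_bits );
SKETCH_STATIC_ASSERT( sizeof( hbU64 ) * CHAR_BIT == 64, u64_is_64_bits );

/* Addition on a strict 16-bit type wraps modulo 2^16 on every platform. */
static hbU16 sketch_u16_add( hbU16 a, hbU16 b )
{
   return ( hbU16 ) ( a + b );
}

/* Narrowing helper: the caller must pass a value that fits in 8 bits. */
static hbU8 sketch_to_u8( int nValue )
{
   assert( nValue >= 0 && nValue <= 255 );
   return ( hbU8 ) nValue;
}
```

On compilers without <stdint.h> the same names would need per-platform
typedefs, and the static checks above would still catch a wrong mapping
at compile time.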
>
> It's not the full list, just a starting proposal. After 1.2 we can
> discuss the full list of types and the new API. I will also want to
> update some public structures (e.g. RDD ones). All these modifications
> can make Harbour unstable for some time, and when they are finally
> ready, people using 3rd-party libraries will have to wait for those
> libraries to be updated and released. Without 1.2 they will have a
> very serious problem.
>
> best regards,
> Przemek
> _______________________________________________
> Harbour mailing list
> [email protected]
> http://lists.harbour-project.org/mailman/listinfo/harbour
>

