> No, I mean fpc/rtl/units/creumap.pp that afaik generates statically
> linkable units from ISO files that plug into charset.
>
> >What I see is, that charset is not finished.
>
> Then finish it.

Really, I don't know what to think of that order. I want to contribute to the 
project with my knowledge of more than 20 years of programming in Pascal (since 
the very first days of Turbo Pascal). I surely will not waste my time 
completing an algorithm that I think is inappropriate for the problem!

>
> > It just offers a rudimentary way to read in the unicode.org text
> > files, and some functions to find a mapping and convert one
> > character.  No support for complete string conversions, or UTF-8,
> > UTF-16, UTF-32.
>
> Then make a good proposal to fix this. Preferably with patches.
>

That does not really make sense when, after applying the patch, nothing of the 
original is left.


>
> > The tables are created dynamically via getmem and stored in a linked list.
> > Every character is stored in a record, tunicodecharmapping, where Unicode
> > is only defined as word, not cardinal.  Thus UTF-32 is not supported,
> > and UTF-16 surrogates neither.
>
> UTF32 is nowhere supported at all with FPC atm, and to be honest, I don't
> see a reason to start now.  The unicode Delphis also don't provide a type
> for it.  It is simply not the most practical format, and the few places

Is that really a reason not to start supporting it? I don't think so. I even 
think it is a reason to support it: Delphi does not have full Unicode support, 
FPC will have.


> where it is typically used, like complex string routines and the like, can
> survive on hard-coded, hand-optimized code.  (IOW it is not really a user type)
>
> Since, despite what people think, UTF32 is extremely wasteful, and still
> not free from problems (code points vs chars, denormalized sequences etc)

UTF-32 is out there in the world, and yes, it is wasteful. So what? Is that a 
reason to ignore it?

> Well, one of the reasons is that the unit is mainly used for embedded
> applications (which includes DOS and win9x nowadays) or special cases
> (like  very, very compatible installers), since on normal targets the OS
> routines are used.

These routines do not support all of the code pages. Further, the aim of a 
library is not to wrap some OS routines but to deliver functionality that 
helps the developer solve his problem. Developers need solutions, not fine 
words about how clean and lightweight the libraries are.

> Nevertheless, I don't want to hide behind that. Certainly, charset is
> pretty much a one-off effort and can be improved. But please, when
> reengineering, keep in mind that the "special" uses are the main ones.
>
> But if everybody tries to roll something new instead of improving
> existing functionality then we are getting nowhere.

And if everybody holds on to algorithms which have been identified as being 
inappropriate for the problem, you are getting nowhere either.

>
> > Charset has absolutely no support to handle endianness of UTF-16 and
> > UTF-32 strings.
>
> I would add separate special functions for that. No need to bog down the
> standard functions that do the bulk of the work.  IOW special functions
> that do input validation at the perimeter, and functions that only do
> internal conversions (e.g. that you could base the widestring manager
> on)
>
> > With static tables, I mean a table in a const section, compiled and
> > linked into the code.
>
> Have a look at creumap. If you had looked up where and how (c)charset is
> used, you would have noticed
>
> (see e.g. compiler/cp*)

I have noticed... and now? That doesn't improve the algorithm. Perhaps it is 
better to think through the right data structures first than to write down 
some trivial lines of code and then to proclaim that these have to stay like 
that until the end of days.



Sorry at this point for these hard words. I really appreciate the work done by 
the FPC and Lazarus teams. It is a great piece of work and I think it will have 
a great future. It is out of that conviction that I would like to contribute my 
small part to the project.

But M. van de Voort, I will not continue the discussion on this level and in 
this tone.

My first intention was to improve LConvEncoding. I still think this 
functionality has to be in the RTL, but I also said at the beginning of this 
thread that it is up to the core developers to decide whether it can be 
integrated there. Mattias Gaertner agreed to this, and even named a COMPLETE 
conversion unit in his post. Felipe Monteiro de Carvalho also agreed.

If others now think that this is not wanted, no problem for me: the unit may 
stay in the LCL, and I can live with that very well.





--
_______________________________________________
Lazarus mailing list
[email protected]
http://lists.lazarus.freepascal.org/mailman/listinfo/lazarus
