Re: [Lazarus] rewriting of LConvEncoding

waldo kitty Thu, 23 Sep 2010 15:41:26 -0700

On 9/23/2010 18:08, Guy Fink wrote:

No, I mean fpc/rtl/units/creumap.pp that afaik generates statically
linkable units from ISO files that plug in to charset.

What I see is that charset is not finished.


Then finish it.


Really, I don't know what to think of that order.

i don't see that as an "order"... more like "this is FOSS. you are free to fix things that you see and contribute them back to the community." ;)

I want to contribute to the project with my knowledge of more than 20
years programming in Pascal (since the very first days of Turbo Pascal).


ahhh... another oldtimer amongst us :P

I surely will not waste my time on completing an algorithm that I think is
inappropriate for the problem!


FWIW: *I* can understand that :)

It just offers a rudimentary way to read in the unicode.org
text files, and some functions to find a mapping and convert
one character.  No support for complete string conversions,
or UTF-8, UTF-16, UTF-32.


Then make a good proposal to fix this. Preferably with patches.


Does not really make sense when after applying the patch there is nothing left 
from the original.

that would depend on the "fix" wouldn't it? it also depends on the core "guidance team" and if they accept the fix... remember one goal is to not break existing functionality... maybe your fix/enhancement would/should use different unit and include names so as to not break what's already being used in thousands of projects? that way the core team can bring it in gradually and/or existing projects can convert to it on their time and needs ;)

The tables are created dynamically via getmem and stored in a linked list.
Every character is stored in a record : tunicodecharmapping, where Unicode
is only defined as word, not cardinal.  Thus UTF32 is not supported,
UTF16 surrogates neither.
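
For readers following along, here is a minimal sketch of why a 16-bit field cannot hold every code point. It is written in Python and is not taken from charset or LConvEncoding; the function name `to_utf16_units` is made up for illustration. Anything above U+FFFF needs two UTF-16 code units (a surrogate pair), so a mapping record that stores the Unicode value as a single word has no room for it.

```python
def to_utf16_units(code_point):
    """Return the UTF-16 code units (16-bit values) for one Unicode code point."""
    # Reject values outside the Unicode range and lone surrogates.
    if not 0 <= code_point <= 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("not a Unicode scalar value")
    # Code points in the Basic Multilingual Plane fit in one 16-bit unit.
    if code_point <= 0xFFFF:
        return [code_point]
    # Everything else is split into a high and a low surrogate.
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits
    return [high, low]
```

A value such as U+1F600 comes back as two units, which is exactly the case a word-sized field cannot represent.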


UTF32 is nowhere supported at all with FPC atm, and to be honest, I don't
see a reason to start now.  The Unicode Delphis also don't provide a type
for it.  It is simply the most practical format, and the few places


Is that really a reason not to start support for it? I don't think so. I even
think it is a reason to support it, Delphi does not have full Unicode support,
FPC will have.

one must also remember and take into account that delphi compatibility is a goal... that FPC and Laz have the option and ability to move further than delphi is a plus but delphi compatibility is still a requirement...

and what happens when delphi does add such capability? will your fix/enhancement be "updated" to match delphi?

where it is typically used, like complex string routines and the like, can
survive on hardcoded, hand-optimized code.  (IOW it is not really a user type)

Since despite what people think, UTF32 is extremely wasteful, and still
not free from problems (codepoints vs chars, denormalized sequences etc)


UTF32 is there in the world, and yes it is wasteful.. And so what? Is that a
reason to ignore it?

on the surface, i'd say "no" but another question that comes to mind is why invest time in it if it goes nowhere?

Well, one of the reasons is that the unit is mainly used for embedded
applications (which includes DOS and win9x nowadays) or special cases
(like  very, very compatible installers), since on normal targets the OS
routines are used.


These routines do not support all of the codepages. Further, the aim of a
library is not to wrap some OS routines but to deliver functionality to the
developer to help him solve his problem. Developers need solutions, not good
words of how clean and lightweight the libraries are.

i tend to agree with this, on the surface... however, much is actually done by providing wrappers so that existing functionality and compatibility can be maintained...

Nevertheless, I don't want to hide behind that. Certainly, charset is
pretty much a one-off effort and can be improved. But please, when
reengineering, keep in mind that the "special" uses are the main ones.

But if everybody tries to roll something new instead of improving
existing functionality then we are getting nowhere.


And if everybody holds on to algorithms which have been identified as being not
appropriate to the problem you are getting nowhere either.

+1

Charset has absolutely no support to handle endianness of UTF-16 and UTF-32
strings.


I would add separate special functions for that. No need to bog down the
standard functions that do the bulk of the work.  IOW special functions
that do input validation at the perimeter, and functions that only do
internal conversions (e.g. that you could base the widestring manager on)
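
The split suggested above could look roughly like the following. This is a sketch in Python under stated assumptions, not the design of charset or of any FPC unit: both function names are hypothetical, and little-endian is picked arbitrarily as the "internal" byte order. The perimeter routine checks the length, looks for a byte order mark and swaps bytes if needed; the internal routine does no checking at all and only converts.

```python
import codecs


def decode_utf16_native(data):
    """Internal routine: assumes BOM-free, little-endian, even-length input."""
    return data.decode("utf-16-le")


def decode_utf16_external(data, default_big_endian=True):
    """Perimeter routine: validates, resolves byte order, then delegates."""
    if len(data) % 2 != 0:
        raise ValueError("UTF-16 input must have an even number of bytes")
    if data.startswith(codecs.BOM_UTF16_LE):
        # Already in the internal byte order; just drop the BOM.
        return decode_utf16_native(data[2:])
    if data.startswith(codecs.BOM_UTF16_BE):
        big_endian, data = True, data[2:]
    else:
        # No BOM present: fall back to the caller's assumption.
        big_endian = default_big_endian
    if big_endian:
        # Swap each byte pair so the internal routine sees little-endian data.
        swapped = bytearray(data)
        swapped[0::2], swapped[1::2] = data[1::2], data[0::2]
        data = bytes(swapped)
    return decode_utf16_native(data)
```

Only code that reads data from outside would call `decode_utf16_external`; anything working on strings that are already in the internal form would call the cheaper routine directly.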

With static tables, I mean a table in a const-section, compiled and
linked into the code.

one must remember that FPC's and Laz's smartlinking stuff is nowhere near like that in TP/BP or delphi... but i'm also not sure if that fits with what you are saying, either...

Have a look at creumap. If you had looked up where and how (c)charset is
used, you would have noticed

(see e.g. compiler/cp*


I have noticed... and now? Doesn't improve the algorithm. Perhaps it is better
first to think over the right datastructures than to write down some trivial
lines of code and to propagate that these have to stay now like that till the
end of the days.

Sorry at this point for these hard words. I really appreciate the work done by
the FPC and Lazarus-Team. It is a great piece of work and I think it will have
a great future. It is out of that thinking that I would like to contribute my
small part to the project.

But M. van de Voort, I will not continue the discussion on this level and in
this tone.


i'm not sure the perceived "tone" is what you seem to take it as...

My first intention was to improve LConvEncoding. I still think this functionality
has to be in the RTL, but I also said at the beginning of this thread that it is
up to the core developers to decide if it can be integrated there. Mattias Gaertner
approved of this, and even named a COMPLETE conversion unit in his post. Felipe
Monteiro de Carvalho also agreed.

If now others think that this is not wanted, no problem for me, the unit may stay
in the LCL, I can live with that very well.

since this is FOSS, i'd say to move forward keeping the above in mind and not hurting what is already out there... use what you can and write new stuff that may (eventually) replace it... that's the FOSS way from what i've seen over the years :)


--
_______________________________________________
Lazarus mailing list
[email protected]
http://lists.lazarus.freepascal.org/mailman/listinfo/lazarus
