On 9/23/2010 18:08, Guy Fink wrote:
No, I mean fpc/rtl/units/creumap.pp that afaik generates statically
linkable units from ISO files that plug into charset.
What I see is that charset is not finished.
Then finish it.
Really, I don't know what to think of that order.
i don't see that as an "order"... more like "this is FOSS. you are free to fix
things that you see and contribute them back to the community." ;)
I want to contribute to the project with my knowledge of more than 20
years programming in Pascal (since the very first days of Turbo Pascal).
ahhh... another oldtimer amongst us :P
I surely will not waste my time on completing an algorithm that I think is
inappropriate for the problem!
FWIW: *I* can understand that :)
It just offers a rudimentary way to read in the unicode.org
textfiles, and some functions to find a mapping and convert
one character. No support for complete string conversions,
or UTF-8, UTF-16, UTF-32.
Then make a good proposal to fix this. Preferably with patches.
Does not really make sense when after applying the patch there is nothing left
from the original.
that would depend on the "fix" wouldn't it? it also depends on the core
"guidance team" and if they accept the fix... remember one goal is to not break
existing functionality... maybe your fix/enhancement would/should use different
unit and include names so as to not break what's already being used in thousands
of projects? that way the core team can bring it in gradually and/or existing
projects can convert to it on their time and needs ;)
The tables are created dynamically via getmem and stored in a linked list.
Every character is stored in a record : tunicodecharmapping, where Unicode
is only defined as word, not cardinal. Thus UTF32 is not supported,
UTF16 surrogates neither.
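To make the limitation concrete, here is a minimal sketch of why a 16-bit field is not enough: a code point above U+FFFF needs either a 32-bit value or a UTF-16 surrogate pair. The helper name ToSurrogates is hypothetical and is not part of charset; only the arithmetic is the standard UTF-16 encoding.

```pascal
program SurrogateSketch;
{$mode objfpc}

{ Hypothetical helper, not part of the charset unit: encodes a code point
  in the range U+10000..U+10FFFF as a UTF-16 surrogate pair. }
procedure ToSurrogates(CodePoint: Cardinal; out AHigh, ALow: Word);
var
  Offset: Cardinal;
begin
  Offset := CodePoint - $10000;               { 20 significant bits remain }
  AHigh := Word($D800 or (Offset shr 10));    { upper 10 bits }
  ALow  := Word($DC00 or (Offset and $3FF));  { lower 10 bits }
end;

var
  HighUnit, LowUnit: Word;
begin
  { U+1F600 does not fit in a Word, so it needs two code units. }
  ToSurrogates($1F600, HighUnit, LowUnit);
  WriteLn(HexStr(HighUnit, 4), ' ', HexStr(LowUnit, 4));  { D83D DE00 }
end.
```

A mapping record that stores the Unicode value as a Word can hold neither the full code point nor both halves of the pair.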
UTF32 is nowhere supported at all with FPC atm, and to be honest, I don't
see a reason to start now. The Unicode Delphis also don't provide a type
for it. It is simply the most practical format, and the few places
Is that really a reason not to start support for it? I don't think so. I even
think it is a reason to support it, Delphi does not have full Unicode support,
FPC will have.
one must also remember and take into account that delphi compatibility is a
goal... that FPC and Laz have the option and ability to move further than delphi
is a plus but delphi compatibility is still a requirement...
and what happens when delphi does add such capability? will your fix/enhancement
be "updated" to match delphi?
where it is typically used, like complex string routines and the like, can
survive on hardcoded, hand-optimized code. (IOW it is not really a user type)
Since despite what people think, UTF32 is extremely wasteful, and still
not free from problems (codepoints vs chars, denormalized sequences etc)
UTF32 is there in the world, and yes, it is wasteful. And so what? Is that a
reason to ignore it?
on the surface, i'd say "no" but another question that comes to mind is why
invest time in it if it goes nowhere?
Well, one of the reasons is that the unit is mainly used for embedded
applications (which includes DOS and win9x nowadays) or special cases
(like very, very compatible installers), since on normal targets the OS
routines are used.
These routines do not support all of the codepages. Further, the aim of a
library is not to wrap some OS routines but to deliver functionality to the
developer to help him solve his problem. Developers need solutions, not good
words of how clean and lightweight the libraries are.
i tend to agree with this, on the surface... however, much is actually done by
providing wrappers so that existing functionality and compatibility can be
maintained...
Nevertheless, I don't want to hide behind that. Certainly, charset is
pretty much a one-off effort and can be improved. But please, when
reengineering, keep in mind that the "special" uses are the main ones.
But if everybody tries to roll something new instead of improving
existing functionality then we are getting nowhere.
And if everybody holds on to algorithms which have been identified as not being
appropriate to the problem, you are getting nowhere either.
+1
Charset has absolutely no support for handling the endianness of UTF-16 and
UTF-32 strings.
I would add separate special functions for that. No need to bog down the
standard functions that do the bulk of the work. IOW, special functions
that do input validation at the perimeter, and functions that only do
internal conversions (e.g. ones that you could base the widestring manager
on).
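A minimal sketch of that split, assuming the input starts with a byte order mark: the perimeter routine normalizes the byte order once, so the internal conversion routines can assume native order. SwapUtf16Units and BOMSwapped are hypothetical names, not existing RTL identifiers.

```pascal
program EndianSketch;
{$mode objfpc}

const
  BOMSwapped = $FFFE;  { U+FEFF byte order mark read with the wrong byte order }

{ Hypothetical perimeter helper: swaps the two bytes of every code unit. }
procedure SwapUtf16Units(var Data: array of Word);
var
  I: Integer;
begin
  for I := Low(Data) to High(Data) do
    Data[I] := Word((Data[I] shl 8) or (Data[I] shr 8));
end;

var
  Buffer: array[0..1] of Word = (BOMSwapped, $4100);
begin
  { Normalize once on input; internal routines then assume native order. }
  if Buffer[0] = BOMSwapped then
    SwapUtf16Units(Buffer);
  WriteLn(HexStr(Buffer[0], 4), ' ', HexStr(Buffer[1], 4));  { FEFF 0041 }
end.
```

Keeping the swap out of the conversion loop is the point: the bulk routines stay free of per-character byte-order checks.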
With static tables, I mean a table in a const-section, compiled and
linked into the code.
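As a rough illustration of what such a static table could look like, here is a sketch with a small excerpt of the ISO-8859-15 mapping as a typed constant. The names Latin9Excerpt and LookupLatin9 are hypothetical, and a real table would cover all 256 byte values rather than nine.

```pascal
program StaticTableSketch;
{$mode objfpc}

const
  FirstByte = $A0;
  LastByte  = $A8;
  { Excerpt of the ISO-8859-15 to Unicode mapping, compiled into the binary. }
  Latin9Excerpt: array[FirstByte..LastByte] of Word = (
    $00A0, $00A1, $00A2, $00A3, $20AC, $00A5, $0160, $00A7, $0161
  );

{ Hypothetical lookup: returns False for bytes outside the excerpt. }
function LookupLatin9(B: Byte; out CodePoint: Word): Boolean;
begin
  Result := (B >= FirstByte) and (B <= LastByte);
  if Result then
    CodePoint := Latin9Excerpt[B]
  else
    CodePoint := 0;
end;

var
  CP: Word;
begin
  if LookupLatin9($A4, CP) then
    WriteLn(HexStr(CP, 4));  { 20AC, the euro sign }
end.
```

The lookup is a plain array index with no getmem and no list traversal, which is the contrast with the dynamically built linked list.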
one must remember that FPC's and Laz's smartlinking stuff is nowhere near like
that in TP/BP or delphi... but i'm also not sure if that fits with what you are
saying, either...
Have a look at creumap. If you had looked up where and how (c)charset is
used, you would have noticed
(see e.g. compiler/cp*
I have noticed... and now? Doesn't improve the algorithm. Perhaps it is better
first to think over the right data structures than to write down some trivial
lines of code and to proclaim that these have to stay like that until the
end of days.
Sorry at this point for these hard words. I really appreciate the work done by
the FPC and Lazarus-Team. It is a great piece of work and I think it will have
a great future. It is out of that thinking that I would like to contribute my
small part to the project.
But M. van de Voort, I will not continue the discussion on this level and in
this tone.
i'm not sure the perceived "tone" is what you seem to take it as...
My first intention was to improve LConvEncoding. I still think this
functionality has to be in the RTL, but I also said at the beginning of this
thread that it is up to the core developers to decide if it can be integrated
there. Mattias Gaertner approved of this, and even named a COMPLETE conversion
unit in his post. Felipe Monteiro de Carvalho also agreed.
If now others think that this is not wanted, no problem for me, the unit may
stay in the LCL, I can live with that very well.
since this is FOSS, i'd say to move forward keeping the above in mind and not
hurting what is already out there... use what you can and write new stuff that
may (eventually) replace it... that's the FOSS way from what i've seen over the
years :)