Re: [fpc-devel] problem with is operator

2005-04-18 Thread DrDiettrich
Linuxer Wang wrote:
 
 Hello,
 
 Can anybody tell me how I can determine which specific type an instance
 of a class is?

Check the ClassType or ClassName.

 The is operator seems to behave oddly when interfaces are used.

Add a GetObject method to your interfaces that returns the object
implementing the interface. Then you can use the class-specific checks.
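The suggested pattern can be sketched like this; the names (IHasObject, TMyImpl) are made up for illustration, not from any standard library:

```pascal
{$mode objfpc}
type
  IHasObject = interface
    function GetObject: TObject;
  end;

  TMyImpl = class(TInterfacedObject, IHasObject)
    function GetObject: TObject;
  end;

function TMyImpl.GetObject: TObject;
begin
  Result := Self;  // hand out the implementing object
end;

var
  Intf: IHasObject;
begin
  Intf := TMyImpl.Create;
  // Class-specific check on the implementing object, not the interface:
  if Intf.GetObject is TMyImpl then
    WriteLn(Intf.GetObject.ClassName);
end.
```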

DoDi


___
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel


Re: [fpc-devel] integer, cardinal

2005-04-18 Thread DrDiettrich
Ales Katona wrote:

 I think the Pascal type system requires a bit of an overhaul when it
 comes to integers.
 
 First of all, Integer should be size independent, that is, xy bits
 depending on the platform. All others should be specific.

I agree with an application-wide integer/cardinal type, but there should
also be a way to request integers of a minimum size, as required by a
specific program. One could use range types for that purpose, which the
compiler can widen to the next optimal byte count, but doing so should
not result in excessive range checking.
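As a small illustration of the range-type idea (the exact sizes are compiler-dependent; the comments state what FPC typically does, not a guarantee):

```pascal
// The compiler stores a range type in the smallest suitable byte count.
type
  TByteRange = 0..255;       // fits in one byte
  TWordRange = -1000..1000;  // needs two bytes
begin
  WriteLn(SizeOf(TByteRange));  // typically 1
  WriteLn(SizeOf(TWordRange));  // typically 2
end.
```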

 Second, we should force people in a friendly way to use more readable
 names like sint32, uint64, etc. rather than cardinal

Not necessarily, as long as integer and cardinal by themselves are
sufficient.

What's the reason for using explicitly sized variables of different
sizes? I can see only one purpose: the I/O of records in binary files.
But with regard to differing byte orders, reading structured blocks of
binary data is not portable anyway.
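The binary-file use case, and the byte-order caveat, can be sketched as follows; THeader and SwapLongWord are illustrative names, not library routines:

```pascal
// Explicitly sized fields for binary record I/O, plus a byte swap for
// records written with the opposite byte order.
type
  THeader = packed record
    Magic: LongWord;  // 32 bit; byte order depends on the writer
    Count: Word;      // 16 bit
  end;

function SwapLongWord(x: LongWord): LongWord;
begin
  Result := (x shl 24) or ((x and $FF00) shl 8) or
            ((x shr 8) and $FF00) or (x shr 24);
end;

var
  H: THeader;
begin
  H.Magic := $AABBCCDD;
  WriteLn(HexStr(SwapLongWord(H.Magic), 8));  // DDCCBBAA
end.
```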

 In a few years, when 64 bits are normal, what will cardinal become? Who
 knows..

32-bit integers will remain sufficient for most purposes. Arrays with
more elements are not very likely in the next few years, and 64-bit
integers will be restricted to a few special purposes (e.g. file sizes).

DoDi





Re: [fpc-devel] Quick patch for bug 3762

2005-03-31 Thread DrDiettrich
Tomas Hajny wrote:

 The main problem is that there's a lot of platform-independent
 functionality in the Crt unit which is re-implemented for every
 platform again and again. The best solution would be to throw all
 the individual implementations away completely and implement
 cross-platform Crt unit based on capabilities provided by units
 Keyboard and Video (possibly missing functionalities within these
 units necessary for Crt could be either handled by platform
 specific include file, or by extending current Keyboard and/or
 Video).

From the discussion I have the impression that there are problems in
separating keyboard and mouse events. If I'm right, I'd suggest
distinguishing platforms with separate mouse and keyboard events from
platforms with common events for both devices. The platform-independent
implementation should then assume and use separate queues, as provided
(split if required) by the platform-specific code.

DoDi




Re: [fpc-devel] First benchmark of Abbrevia 4.0

2005-03-30 Thread DrDiettrich
Michael Van Canneyt wrote:
 
 On Sun, 27 Mar 2005, DrDiettrich wrote:
 
  A friend of mine has just tested my archiver, with the following
  results for a TAR containing a million files:
 
  PowerArchiver: 530 minutes.
  My Unarch: 160 minutes.
 
 Huh ?
 Who creates archives with a million files?
 Who creates a million files in the first place?!

It's a CD archive, with descriptions of all known music CDs
(FreeDB/expd).

  I hope to get the original .tgz archive soon, in order to test it with
  GNU gzip and tar as well. The time may decrease again when the log is
  redirected into a file...
 
 It should, definitely if the test was run on Windows;

You guessed it ;-)


 If you need testers, I'm always prepared to help.

Later, when I've ported the zip support. There are a lot of things to
explore, like the actual contents of time stamps and other OS-specific
fields in the file descriptors. The current version is written for
Delphi and Windows; a port to FPC and other OSes or machines will
require some adaptations. This is where somebody could jump in...

Hmm, the ~4 core units (90 KB) should be sufficiently stable for an FPC
and Linux port, if somebody is interested in such stuff. The
documentation (9 KB HTML) is not up to date, but my code is intended to
be self-explanatory ;-)

Another thing is the grid control for a GUI (~50 KB), based on
TCustomGrid...

DoDi





Re: [fpc-devel] Friend classes?

2005-03-30 Thread DrDiettrich
Marco van de Voort wrote:

  Do you mean that only one level of dependencies must be checked with
  uses, whereas even the indirectly #included files must be checked for
  changes?
 
 You always have to do the #include. Always. Precompiled headers are
 possible but not trivial, since they require a comparison of the entire
 preprocessor state to see whether a header needs recompilation.

With precompiled headers it's almost like with compiled units: every
compiled file contains a list of dependent files, with dates and/or
checksums. If something has changed, the affected unit or module must be
recompiled. If nothing has changed, the preprocessor state cannot have
changed either.

   That allows the compiler to auto-find the compilation order..

A compilation order only applies to units. A Pascal compiler or linker
starts with the root unit, and then descends into all used units. C
projects instead consist of a single (flat) list of modules, which can
be compiled in any order.


  A C project doesn't require any compilation order, every module can be
  compiled independently from other modules.
 
 Implementation: yes. Header: no.

Hmm, perhaps we have different ideas of precompiled header files. Let me
explain what BC 3.1 did:

Header files are not precompiled individually, one by one; instead there
must be something like a root module (*.c) which contains a list of
#includes. These included files are precompiled together, in the given
order. Multiple modules can share the same precompiled headers only if
the modules #include the same header files in the same sequence.

The precompiled headers, for all modules of a project, reside in the
same precompiled file. This file is in fact a library, containing
distinct sets of precompiled headers of different extent or order. IOW,
modules with exactly the same #includes, in the same sequence, share a
single precompiled header module. The file may be organized as a tree of
header modules, with bifurcations where different files are #included
after an identical preceding sequence.

Borland also introduced #pragma hdrstop, which delimits the list of
precompiled header files. All modules with the same #includes before
this pragma can share the same header module; any #includes that follow
are not precompiled. This feature allows a reasonable number of header
files to be precompiled into one header module. AFAIR it was easier to
#include exactly the same header files in all modules of a project,
regardless of the actual needs of each module. The resulting precompiled
header file, for a Win3.1 project, could then be as small as 20 MB, in
contrast to 80 MB for fewer, but differing, #includes in the source
modules.

Regardless of such optimization, as long as the header files themselves
are not touched, and the #includes in the source modules are unchanged,
a recompilation of the header files is never required.

DoDi




Re: [fpc-devel] Friend classes?

2005-03-29 Thread DrDiettrich
Marco van de Voort wrote:

  The definitions of templates, inline procedures or macros do not
  contribute to the size of a compiled module by themselves, but only
  when they are *used* in code modules.
 
 That goes for all routines.

I'm not sure what you mean. A global procedure, exported in the
interface section of a unit, must always be compiled. Only the linker
can determine which procedures are really used in a project, and which
ones can be omitted. A specific procedure may be used in one project but
unused in another. Think of the standard units like System...


  In comparison with C/C++, uses summarizes the #include of header files
  and the dependency checks of Make.
 
 The important difference (in TP/Delphi/FPC) is that preprocessor state
 doesn't follow USES statements.

Do you mean that only one level of dependencies must be checked with
uses, whereas even the indirectly #included files must be checked for
changes?

 That allows the compiler to auto-find the compilation order..

What's the importance of a compilation order?

  Many properties make Pascal compilers faster than C/C++ compilers. The
  effect of uses is equivalent to C/C++ precompiled header files.
 
 The effect of units is that they are a lot safer (more guaranteed) and
 easier to implement than precompiled header files, and auto-building is
 also a lot easier (no explicit manual compile order needs to be
 enforced).

Again I don't understand what you mean (compile order???) :-(

A C project doesn't require any compilation order; every module can be
compiled independently of the others. The problem is not the order but
the condition: *when* does a module have to be recompiled.


The only *disadvantage* of units is the current Pascal compilers, which
cannot handle circular unit references :-(
 
 It could in theory deal with some forms of circular records. Specially in
 the case of e.g. classes.

I can imagine some kind of extended forward declarations that reach into
other units. In most cases it's sufficient, for compilation, that the
general kind of a named type is known, particularly for pointers and
references. Then it's possible to lay out records, classes or other
structured data types, including parameter lists of subroutines, without
exact knowledge of the referenced types. Knowledge of the exact types is
usually required only when members of some referenced complex type come
into play, which does not normally happen in declarations (the interface
part of units).

Perhaps I should give some examples:

type
  A = record a1: B; end;
  B = record b1: A; end;

Such a construct is of course illegal, because of the infinite
recursion. The same construct would be perfectly acceptable with
pointers:

type
  A = record a1: ^B; end;
  B = record b1: ^A; end;

My idea for extended external declarations:

type
  C = class in OtherUnit;
or
  OtherUnit.C = class;

Now the compiler can insert a reference to OtherUnit.C for every
occurrence of C in the remainder of the interface section. OtherUnit can
now be moved from the uses list of the interface section into the uses
list of the implementation section; or, in the most comfortable case,
it's imported implicitly.


 Just recompile until the CRCs don't change anymore. This allows units
 that circularly import each other, but have no real circular dependency
 of types, to compile. Or even if the circular reference only involves
 types whose sizes are known (like class etc. types that are only 4
 bytes).

I would be happy with the latter case and no recompilation at all.


 However it is of course questionable if it is worth the trouble to implement
 this, and make it bullet proof. Maybe somebody who does graph theory as
 a hobby?

I don't qualify as a graph-theory guru, even though I have already
implemented the analysis of control flow graphs in my decompilers. In
the case of circular unit references I'd ignore the unit dependencies as
such, and instead concentrate on the type definitions themselves. There
I'd use a list of type names, with some fixed attributes (unit, size,
immediate type...), and a reference to the detailed definition of the
base type in the case of more complex types. Then the existence of an
(illegal) loop can be detected from a Nil reference whenever access to
the detailed definition is actually required. The unit attribute in the
fixed part of a type definition is needed to distinguish definitions of
the same name in different units, and it can also be used to complete
forward (or external) type references when the defining unit is imported
later.


 IIRC TP could handle mild circular references and so can FPC. I don't know
 if this is still the case.

Sounds interesting, do you have more information?

DoDi





[fpc-devel] First benchmark of Abbrevia 4.0

2005-03-29 Thread DrDiettrich
A friend of mine has just tested my archiver, with the following results
for a TAR containing a million files:

PowerArchiver: 530 minutes.
My Unarch: 160 minutes.

I hope to get the original .tgz archive soon, in order to test it with
GNU gzip and tar as well. The time may decrease again when the log is
redirected into a file...

Currently I'm implementing a ZIP module, as the last proof of concept
for the extractors. This is a very hard job, because half of all
Abbrevia modules are involved and have to be updated. Then comes the
final step: the creation of new archives.

DoDi




Re: [fpc-devel] Friend classes?

2005-03-26 Thread DrDiettrich
Micha Nelissen wrote:

  Perhaps you missed that in C/C++ the preprocessor is typically (99.999%)
  used to include header files, not source files. This is comparable to
  Pascal uses, not to {$Include}!
 
 What's the difference between a 'header' file and a source file? Header
 files often contain template classes with inline implemented methods.
 The preprocessor is also used for macros.

The definitions of templates, inline procedures or macros do not
contribute to the size of a compiled module by themselves, but only when
they are *used* in code modules.


 Please look again at 'uses'; it works on a more abstract level than
 just recompiling all units depended upon. That's where the speed of
 compiling Pascal comes from.

In comparison with C/C++, uses combines the #include of header files
with the dependency checks of Make. But it's not found only in Pascal;
Java, for example, has imports for almost the same purpose. The original
Wirth Pascal had no uses; it was added later.

Many properties make Pascal compilers faster than C/C++ compilers. The
effect of uses is equivalent to C/C++ precompiled header files.


  The only *disadvantage* of units is the current Pascal compilers,
  which cannot handle circular unit references :-(
 
 No, it's an advantage: it makes the code and design clearer, plus it
 increases the speed of compilation a *lot*.

I wouldn't call a design clearer when it requires implementing most of a
project in a single unit :-(

I also doubt the speed increase, as long as nobody has tried to write a
Pascal compiler that can handle circular unit references.

DoDi





Re: [fpc-devel] Friend classes?

2005-03-24 Thread DrDiettrich
Florian Klaempfl wrote:

 C++ creates one monster module in this case as well.
 
 
  I disagree. Neither the declarations (interface, header files) nor the
  definitions (implementation, code modules) must reside in one file.
 
 How the sources are split doesn't matter. The compiler handles them as
 one file and creates one object module. The compiler doesn't even see
 that the files are split; the preprocessor concatenates everything.

Please let me quote the original statement:

In porting C++ code to Pascal I often stumbled into circular unit
references. Then the only solution is a monster unit that implements all
the related classes at once, whereas the C++ implementation can be split
into appropriate modules.

In one case I had about 10 C++ modules, which implemented one class
each. These modules can be compiled separately, into the same number of
object files.

In Pascal I had to merge all these modules into a single unit, due to
circular unit references. This monster unit then compiles into one
object file, of course.

Perhaps you missed that in C/C++ the preprocessor is typically (99.999%)
used to include header files, not source files. This is comparable to
Pascal uses, not to {$Include}!


 Units are a higher-level concept than the include files of C++, but
 units can often, though not necessarily, reflect the include file
 structure.

There are many concepts regarding the relationship between declarations
(interfaces) and definitions (implementations). In C/C++ the two are
separate and strictly unrelated; in Modula they are separated into
strictly related files; in Pascal or Java both are combined into single
unit (class) files. I cannot see any levels in these concepts. Adding
namespaces results in a 3D model of declarations, namespaces, and
implementations.

For compound Pascal units, the most important *difference* is that
everything declared in the interface section of a unit must be
implemented in the implementation section of the *same* unit. This
requirement leaves no room for inconsistent sets of interfaces (header
files) and implementations (source modules).

The only *disadvantage* of units is the current Pascal compilers, which
cannot handle circular unit references :-(

DoDi





Re: [fpc-devel] Friend classes?

2005-03-22 Thread DrDiettrich
Micha Nelissen wrote:

 The real question is: was the design of the code ok ?

Some dependencies cannot be removed by a redesign :-(

In some cases interfaces can help instead of classes, because they don't
need to be implemented in the same unit. But then the classes have to be
implemented somewhere as well.

IMO it's important to keep classes encapsulated in distinct units, so
that no unintended interaction (use of private/protected members of
other classes...) can occur, as is possible in a single unit.

 Circular references makes code harder to understand.
 Layers are easier to understand.

I see no comprehension problem here. When two objects refer to each
other, but are not otherwise functionally related, each can be
understood without knowing about the other. It should also be possible
to implement all such objects separately.

DoDi





Re: [fpc-devel] Friend classes?

2005-03-20 Thread DrDiettrich
Ales Katona wrote:

 C++ requires friend only because it lacks the idea of modularity.
 Since all classes are apart, they need some way to tell each other I
 can use you. In Pascal you simply put them into one unit.

That's why the C++ model is better: there is no requirement to implement
related classes in a single module.

In porting C++ code to Pascal I often stumbled into circular unit
references. Then the only solution is a monster unit that implements all
the related classes at once, whereas the C++ implementation can be split
into appropriate modules. Even in Java it's possible to implement every
class in a dedicated unit, regardless of class relationships, and
without a need for separate header files. That's what I call a good
modular concept.


Perhaps I dislike Pascal include files only because they are poorly
supported by the Delphi IDE. With better support it would be possible to
split the implementation into multiple dedicated include files, which
could be thought of as implementation files, following e.g. the Modula
model. Lazarus offers better support for include files, but
unfortunately it currently groups the overview of types, constants etc.
by their respective clauses; I hope for better grouping options in the
near future, so that e.g. all types of a unit can reside in a single
group. I have already considered contributing to the Lazarus
development, but currently I have other projects with higher priority...

DoDi




Re: [fpc-devel] Friend classes?

2005-03-20 Thread DrDiettrich
Michael Van Canneyt wrote:

 Sorry, but I fail to see the problem? The above makes all protected
 members of a class visible. This is normal; that is why they are
 protected. If you want to avoid that, make the members private. Then
 they are visible only in the same unit, even to other classes (for
 'cooperation'), but not outside the unit, not even in a descendant.

I.e. protected means to you that there are means to bypass that
protection?

Hmm, I think that I can update my mental class model accordingly ;-)

Thanks
  DoDi





Re: [fpc-devel] Local procedures as procedural parameter

2005-03-19 Thread DrDiettrich
Jonas Maebe wrote:

  Couldn't the framepointer be last parameter in all modes ?
 
 That would still require some ugly hacks inside the parameter passing
 code, because of the fact that on x86 the callee removes the parameters
 from the stack and not the caller (so you'd still need a check
 comparing that framepointer with nil and not pushing it if it is nil).

From the many contributions I got the following picture:

1) Local subroutines expect a framepointer, regardless of whether they
are called from their enclosing subroutine or from outside (as a
callback). This parameter must be passed, and possibly removed from the
stack, in every call to, and return from, a local subroutine. All the
technical details are already specified by the current handling of local
calls to local subroutines.

2) The compiler must check every call for local/global subroutines, in
order to supply or omit the framepointer. The only notable difference is
between static and dynamic checks: direct calls can be handled at
compile time, as is, whereas indirect calls would require a distinction
at runtime, based on whether the actual callback procedure is local or
global.

So I wonder why there should be any problem, besides the generation of
code for the runtime checks?



3) Procedural parameters already come in two flavours, for global
subroutines and for methods. The of object type has an additional Self
pointer field. It should not be hard to extend the base type with a
corresponding Frame pointer field. This extension is only necessary for
parameters, not for variables of procedural type, because the compiler
should (already!) reject attempts to store references to local
subroutines in variables. Of course this change will break compatibility
with older binary code; such code must be recompiled with the new
compiler version.

Such a change may allow for really smart callbacks, where global or
local subroutines can be used for the same parameter, and methods can be
used without an explicit of object clause. In the latter case distinct
values must be used for the object (Self) pointer of methods, to
distinguish between methods with Self=nil and ordinary subroutines with
no Self pointer at all.

The extension of the calling conventions may be a nasty task, but the
compiler already implements a number of calling conventions (cdecl,
pascal...) and variations (method vs. ordinary subroutine, direct vs.
indirect call). Only the very last case needs an update, so that
indirect calls are checked for local/global subroutines at runtime. As
mentioned above (3), additional dynamic support can distinguish and
handle the global/local and subroutine/method variations appropriately,
as 4 distinct cases.

Since such smart callbacks require some amount of code, increasing the
size and runtime of the compiled code, it's up to the compiler writers
to add a compiler mode, option, or flag to enable/disable smart
callbacks and the corresponding code generation.


Did I miss anything?

What's your opinion on smart callbacks?
I mentioned this new construct based on my current observations with
comparison functions in sorting procedures. Wouldn't it be nice if one
could supply any kind of comparison function (local/global,
subroutine/method) to a single sorting procedure? Then the comparison
function could check flags and other values in its own runtime
environment, e.g. to distinguish between ascending and descending sort
order, in a thread-safe way.
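The sorting scenario can be sketched with nested procedural types, a feature FPC did gain in later versions behind a mode switch; this is an illustration of the idea under that assumption, not the 2005 state of the compiler:

```pascal
{$mode objfpc}
{$modeswitch nestedprocvars}
type
  TCompare = function(a, b: Integer): Integer is nested;

procedure Sort(var A: array of Integer; Cmp: TCompare);
var
  i, j, t: Integer;
begin
  for i := 0 to High(A) - 1 do          // simple bubble sort
    for j := 0 to High(A) - 1 - i do
      if Cmp(A[j], A[j + 1]) > 0 then
      begin
        t := A[j]; A[j] := A[j + 1]; A[j + 1] := t;
      end;
end;

procedure Demo(Descending: Boolean);
var
  Data: array[0..3] of Integer;
  i: Integer;

  function Compare(a, b: Integer): Integer;
  begin
    // The local function reads Descending from its enclosing frame.
    if Descending then Result := b - a else Result := a - b;
  end;

begin
  Data[0] := 3; Data[1] := 1; Data[2] := 4; Data[3] := 2;
  Sort(Data, @Compare);
  for i := 0 to High(Data) do
    Write(Data[i], ' ');
  WriteLn;
end;

begin
  Demo(False);  // 1 2 3 4
  Demo(True);   // 4 3 2 1
end.
```

The comparison function here is exactly the kind of frame-aware callback described above: the sort-order flag lives in the caller's stack frame, so each call is thread-safe without global state.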

DoDi





Re: [fpc-devel] Local procedures as procedural parameter

2005-03-18 Thread DrDiettrich
[EMAIL PROTECTED] wrote:
 
  Let me add some more thoughts about procedural types:
 
  - I like the ability to declare procedural types, the ISO convention
  looks like one of the many incredible C hacks to me :-(
 
 But it is standard pascal. And we need to support those zillion lines of
 code out there, written in standard pascal.

Agreed, as long as I'm not forced myself to use constructs that I don't
like ;-)

  - For the restricted use of local subroutines as procedural parameters I
  could imagine a const prefix in the accepting procedure declaration:
 
  procedure my_fun(const pf: tfun);
 
 This will be unclear imo, I would prefer a directive which tells what it
 really is about.

Such an explicit directive would not be portable, unless introduced by
some accepted standard.

  Hmm, the hidden frame parameter still will make a difference with local
  subroutines. At least in Pascal calling convention, where the arguments
  are popped by the called subroutine, not by the caller...
 
 The pascal calling convention is not used on most modern processors, since
 parameters instead are passed in registers.

I'd be careful with most: the most frequently used modern processor has
anything but a modern architecture. But unfortunately the bundling of
bad hardware with bad software seems to be what the consumer market
appreciates :-(

DoDi





Re: [fpc-devel] Local procedures as procedural parameter

2005-03-15 Thread DrDiettrich
Michael Van Canneyt wrote:

 1. What happens if f would use a variable from somefun, and f is called
 when somefun is no longer executed ?

This cannot happen when the function parameter cannot be stored
anywhere.

 2. I see no difference whatsoever between typ_fun and iso_fun, except
 the use of an extra type, which, in my opinion, does not change
 anything to the usage or code of these functions. If one is allowed,
 the other should be allowed as well.

As I understand the ISO convention, it is simply meant to disallow the
creation of variables of the procedural type, which would be required to
store such a reference. In Pascal two type declarations are distinct,
even if they look the same.


Let me add some more thoughts about procedural types:

- I like the ability to declare procedural types; the ISO convention
looks like one of the many incredible C hacks to me :-(

- For the restricted use of local subroutines as procedural parameters I
could imagine a const prefix in the accepting procedure declaration:

procedure my_fun(const pf: tfun);

The const prefix here means that the procedure pf can only be called,
but cannot be stored in a variable. The compiler can then ensure that
local subroutines are passed only as const arguments. This syntax
requires no changes to the parser. The compiler message for a missing
const could be a warning instead of an error, to prevent
compatibility/portability problems. Other compilers should ignore the
const, so that accordingly modified source code should still be
portable?

Hmm, the hidden frame parameter will still make a difference with local
subroutines. At least in the Pascal calling convention, where the
arguments are popped by the called subroutine, not by the caller...


- I'd also appreciate being able to define procedures based on
procedural types. Currently a change to a procedural type requires
updates of all derived procedure declarations. Something like:

myproc: tfun =
begin
  blabla
end;

IMO such a definition would better implement strict Pascal typing,
rather than only a type equivalence determined by the procedure
signature.

Unfortunately(?) this syntax is incompatible with procedural variables,
in that it would disallow creating such variables. A procedure prefix
would make the definition look like a function returning a TProcType.
But perhaps somebody has a better idea?

DoDi





[fpc-devel] Friend classes?

2005-03-15 Thread DrDiettrich
I just came across code that uses protected members of other classes,
defined in other units. In Delphi this is possible via a declaration
like:

type TFriendClass = class(TNotMyClass);

After this declaration the protected (and private?) members of
TNotMyClass are accessible, using TFriendClass as a type alias.
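The trick in use looks like this; OtherUnit, TNotMyClass and FValue are made-up names for illustration:

```pascal
// OtherUnit declares (in its interface section):
//   TNotMyClass = class
//   protected
//     FValue: Integer;
//   end;

uses OtherUnit;

type
  TFriendClass = class(TNotMyClass);  // empty descendant in *this* unit

procedure Poke(Obj: TNotMyClass);
begin
  // Protected members are visible through a descendant declared in the
  // current unit, hence the cast:
  TFriendClass(Obj).FValue := 42;
end;
```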

Now my questions:

1) Is such behaviour also available with FPC? In non-Delphi modes?

2) How portable are such friend classes? Does there exist some (other)
standard for this purpose?

3) What alternatives could be used, so that class members need not all
be made public when they are to be used only in specific related units?

DoDi




Re: [fpc-devel] utf8 reading

2005-03-12 Thread DrDiettrich
Uberto Barbini wrote:

 Using utf-8 natively is, I think, impossible, because of the encoding.

Support might be implemented like (or within) MBCS support.

 Please note that at every Borland conference there is someone asking for
 Unicode support since Delphi2...

Not only for Delphi ;-)

 There are several open source libraries for managing Unicode strings in
 Delphi, but they are implemented as standard classes, not as refcounted
 first-class citizens like long strings.

It's not easy to find a solution suitable for everybody. There are so
many character encodings that a single class or data type will hardly
cover all of them. Windows users may be happy with utf-16
(WideChar/WideString), because that's supported by the OS and some of
its standard controls, but other OSes have different models and support.

As long as strings are created, used, and stored by a single
application, I'd suggest using utf-8 for the external (disk file)
representation, and WideString inside the application. Then only two
procedures are required, to convert between utf-8 and WideString.
Strings from other sources then have to be converted into WideString by
the appropriate procedure, where the coder is responsible for selecting
the appropriate conversion; a general library of such conversion
procedures can then be created and maintained for use in Pascal
programs, or the coder can use his preferred open source library.

Some coders may prefer WideString in disk files as well, if utf-8 files
would be bigger in their native or preferred language. Of course an
application can also continue to use AnsiString instead of WideString,
if Unicode support is not required. All these selections are up to the
coder; the required data types and conversions are already supported.

A compiler may support an Unicode switch, that maps the general data
types Char and String into either AnsiChar/AnsiString or
WideChar/WideString, in order to support easily portable code. The
switch may be extended to map WideChar into utf-16, utf-32, utf-64, or
whatever will become available in future compiler versions. A similar
effect can be achieved by user-defined data types TChar and TString in
portable code, with the possible problem that there exists no standard
unit where these data types can be defined unambiguously, throughout
whole projects.
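Such user-defined mapping types could be declared in a shared unit along these lines (the UNICODE symbol and the unit name are invented for illustration):

```pascal
unit PortableText;  { hypothetical shared unit }

interface

{$IFDEF UNICODE}
type
  TChar   = WideChar;
  TString = WideString;
{$ELSE}
type
  TChar   = AnsiChar;
  TString = AnsiString;
{$ENDIF}

implementation

end.
```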

DoDi





Re: [fpc-devel] Improving Ref Counting

2005-03-04 Thread DrDiettrich
peter green wrote:

 surely this also means
 
 1: there has to be rtti for every field in every class so the compiler can
 follow the paths from the class

That's almost required, not only for classes but for all data types that
contain references (pointers) to managed objects.

It's not necessarily RTTI where such information resides, the compiler
(Delphi, at least) already maintains such tables for fields with
reference counted data types. These tables are used in the Initialize
and Finalize procedures of every data structure.


 2: you can't safely cast between classes and untyped pointers or integers
 (because those refs would be invisible to the gc)

It's a matter of the compiler specific handling of managed objects.
Remember that dynamic strings (AnsiString), arrays etc. and their casts
already have to be handled by the compiler.

You'll encounter no problems with pointers as long as the objects are
kept alive by references which the GC recognizes. Consider an AnsiString
and a PChar pointing into the string: the pointer will become invalid
when the string is moved or destroyed. Likewise a pointer to an object
will become invalid when the object is destroyed. These are situations
which you'll have to consider already, even without GC.
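A minimal illustration of the AnsiString/PChar hazard described above:

```pascal
var
  S: AnsiString;
  P: PChar;
begin
  S := 'hello';
  P := PChar(S);       { P points into S's character data }
  S := S + ' world';   { the string data may be reallocated and moved... }
  { ...so P may now dangle - a hazard with or without GC }
end.
```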


 3: the GC can't work with pointers to things for every class

GC doesn't have to know about pointer types, it's sufficient to
determine that an object at an address (pointer value) is referenced and
consequently is alive. When such an object has fields with further
references, then GC certainly must know about the data type of the
object.

That's why managed objects usually have a somewhat fixed standard
layout, so that GC can find the class type and layout of the object from
e.g. the VMT pointer. Alternatively managed objects can have an
additional (hidden) reference to the layout information, stored before
the beginning of the object. Such a preamble is also used by classic
memory managers, though with different contents, so that the MM can
return the memory to the free memory pool when the object is freed with
e.g. FreeMem, Dispose or Destroy. At this level GC and traditional MM
are not very different ;-)
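The hidden preamble mentioned above might be pictured like this (a purely illustrative layout; the field names and types are invented):

```pascal
type
  PObjectPreamble = ^TObjectPreamble;
  TObjectPreamble = record
    LayoutInfo: Pointer;  { class/layout description, used by the GC }
    BlockSize: Cardinal;  { block size, as a classic MM would record it }
  end;
  { The preamble sits immediately before the object, so from an object
    pointer Obj the GC could reach it as:
      PObjectPreamble(PByte(Obj) - SizeOf(TObjectPreamble)) }
```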

DoDi




Re: [fpc-devel] Improving Ref Counting

2005-03-01 Thread DrDiettrich
Jamie McCracken wrote:

 A GC needs to trace an object's references to see if anything still
 points to it. How else can it decide whether an object is no longer in use?

GC starts from known alive object references, in static or local
variables, and follows the references in these objects to further
objects. Unused objects never occur in this search, so they need no
special marking or other treatment.

Imagine objects represented as tiles, tied together by reference ropes.
Lift one living tile, and all other living tiles will follow. The tiles
without references from living tiles will be left on the floor. GC does
nothing but lift the known alive tiles, and then sweeps the garbage from
the floor. The references in static and local variables are known to be
live; their locations are the only thing a garbage collector has to
know.



 It's very expensive. getmem is quite expensive, and you need it for every
 reference this way.

Okay then use Tlist with preallocation of say half a dozen references - 
that should be efficient for 99% of cases for an individual object's 
references.


Do you realize how much memory your suggestions would consume? Talking
about objects, each one increased by a list of references?

I agree that there exist only few references to most objects, perhaps
less than 2 references on average, but the management of the
corresponding lists will cost either time (getmem/freemem) or space
(preallocation).

DoDi




Re: [fpc-devel] Improving Ref Counting

2005-02-28 Thread DrDiettrich
Jamie McCracken wrote:

 A GC needs to trace an object's references to see if anything still
 points to it. How else can it decide whether an object is no longer in use?

GC starts from known alive object references, in static or local
variables, and follows the references in these objects to further
objects. Unused objects never occur in this search, so they need no
special marking or other treatment.

Imagine objects represented as tiles, tied together by reference ropes.
Lift one living tile, and all other living tiles will follow. The tiles
without references from living tiles will be left on the floor. GC does
nothing but lift the known alive tiles, and then sweeps the garbage from
the floor. The references in static and local variables are known to be
live; their locations are the only thing a garbage collector has to
know.



 It's very expensive. getmem is quite expensive, and you need it for every
 reference this way.

Okay then use Tlist with preallocation of say half a dozen references - 
that should be efficient for 99% of cases for an individual object's 
references.


Do you realize how much memory your suggestions would consume? Talking
about objects, each one increased by a list of references?

I agree that there exist only few references to most objects, perhaps
less than 2 on average, but the management of the corresponding lists
will cost either time (getmem/freemem) or space (preallocation).

DoDi





Re: [fpc-devel] Modernising Pascal

2005-02-27 Thread DrDiettrich
peter green wrote:

 one thing i have heared (i don't use garbage collected languages much so
 i've never seen this myself) is that the GC gets some CPU time and starts
 causing stuff to be swapped in. This slows other threads down which gives
 the GC more CPU and causes more stuff to be swapped in.

GC is started either when the system is (almost) idle, or when no more
memory is available for some request. In the first case nothing is
blocked by GC, and in the second case everything is blocked by
out-of-memory, until the GC has collected free memory.

 another effect that i have seen is when CPU usage is high a garbage
 collected app can suck up all the systems memory. It can get all the memory
 it needs for itself because it can force run its garbage collector when it
 runs out but no other app on the system can.

When GC runs in idle time, no other apps exist that could be blocked.
When GC must swap something into memory, then the system is equipped
with too little RAM for the current situation; in such a state
everything runs slowly, due to permanent swapping. I know that people
often discuss how to reduce swapping overhead, and they typically all
miss the simple solution: extend the RAM so that swapping does not
occur at all.

 some objects may use resources other than memory which are more valuable and
 need to be freed asap.

Destructors (or finalizers) can be called to release no longer required
resources, even when GC is used. When it's known or suspected that an
object could be destroyed, the GC can be invoked immediately. GC for
itself doesn't eliminate all resource woes, of course. In special
situations it may make sense to support GC with additional code, just
like with any other memory management method (see below).

 refcounting doesn't suffer from any of theese issues

Reference counting has more critical problems, like zombie objects. GC
will always find and remove all dead objects, whereas reference counting
never can eliminate bidirectionally linked objects, as exist in a GUI
(parent - child controls).
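The parent/child situation can be reduced to a few lines; with counted interface references, neither object's count ever reaches zero. A sketch ({$mode objfpc} assumed, names invented):

```pascal
{$mode objfpc}
type
  TNode = class(TInterfacedObject)
  public
    Parent, Child: IInterface;  { reference counted fields }
  end;

var
  P, C: TNode;
  IP, IC: IInterface;
begin
  P := TNode.Create;
  C := TNode.Create;
  IP := P;            { P's count = 1 }
  IC := C;            { C's count = 1 }
  P.Child := IC;      { P -> C: C's count = 2 }
  C.Parent := IP;     { C -> P: P's count = 2 }
  IP := nil;          { counts drop back to 1... }
  IC := nil;
  { ...but never to 0: the cycle keeps both zombies alive.
    A tracing GC would reclaim them, reference counting cannot. }
end.
```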

 also note that use of const parameters can eliminate a huge amount of
 refcounting overhead.

I'm not sure about the real impact of const. And even if the use of
const parameters results in a significant runtime reduction, this only
indicates that much more runtime could be saved by not using reference
counting at all!

DoDi





Re: [fpc-devel] Improving Ref Counting

2005-02-27 Thread DrDiettrich
Peter Vreman wrote:

 Why are you looking at GC/Refcounting when the problem is the try..finally?
 It is better to rewrite the try..finally code using the C++ ABI for
 exception handling.

Where do you see improvements in the C++ ABI? Or even differences?

Windows implements this ABI, and every language should use it in the
intended way. Perhaps FPC implements a different method on other
systems?

DoDi





Re: [fpc-devel] Improving Ref Counting

2005-02-27 Thread DrDiettrich
Jamie McCracken wrote:

 Rather than continuing the GC stuff which seems fruitless I thought it
 might be better to improve what we have with ref counting (whilst taking
 a leaf out of the GC book as well).

A reasonable attempt.


 2) Have a single linked list that contains references to all managed
 objects (ansi strings, variants etc)

A pointer to the next object could be part of every managed object.

 3) Each managed object also has a single linked list that contains
 addresses of all pointers that reference it.

Now you replace a simple reference counter with a list of pointers?
What a waste of memory and runtime :-(

Consider what will happen with every assignment to a (managed) string
variable.

 5) dec ref count means removing the pointer's address from that list.
 When that list is empty the object is freed. (a check for eliminating
 circular refs should also be done if applicable)

A check for circular references is just what GC never has to do. That's
why GC can execute in linear time. It also cannot be determined when a
check for circular references is required, as soon as more than one
object has to be managed. This means that this check must be run very
often, with low probability that it will find discardable object
clusters.


 6) Whenever an exception is thrown, wait until its either handled or
 fully propagated and then perform some garbage collection. (traverse the
 single linked list of all managed objects and for each object check
 whether anything that references it is still valid and delete if
 appropriate).

This requires knowing which objects are in/valid - how is that determined?


Your ideas are not bad at all, so let me refine your model:
{Notes: 
object in the following text means *managed* object.
A client object contains a reference to a server object.
}


1) Every object is accompanied by a list of references to other objects.
This list is required for the maintenance of the reference lists. The
list can become part of the RTTI, as a list of offsets to reference
fields in the objects of the same type.

2) All objects include a pointer to the next object. (Your 2).

3) Every object has a list of client objects (it is the list header). 
A reference (list record) consists of a pointer to the referenced
object and a pointer to the next reference to the same (server)
object. (Your 3)

4) Subroutines include a list of references to managed objects. This
list is used to determine which references become invalid, after normal
or abnormal exit from a subroutine. (Your 6)
This list can be optimized by allocating all such variables in a
contiguous stack area, to eliminate linked-list management overhead.
Subroutine parameters must be managed similarly, but they can't be
reordered. The lists are static, part of the subroutine signature.


Now let's replace (3) by this:

5) Non-local variables are collected in another list of references.

and, oh wonder, we have all what's required to implement a GC!

In your model the reference lists (3) must be updated with every
assignment to a reference variable, and after exit from a subroutine.
Adding a reference (inc ref) doesn't cost much, but removing a reference
(dec ref) from the linked list can become expensive.

In a mark/sweep GC an additional mark bit in every object is required.
There exist enough usable bits, e.g. the low bits in every object
reference. In the first pass (mark) the list of static (non-local)
variables is examined, and all referenced objects are marked (alive).
Then the active subroutine list (stack) is traversed, and all referenced
objects are marked. Finally the list of all objects is traversed and
split into a "live" and an "unknown" list. Every marked object is put
into the "live" list, and all objects that it references are marked
alive as well. In a simple implementation the remaining "unknown" object
list can be traversed until it contains no more alive members. More
sophisticated algorithms may do that recursively, where recursion is
limited by the number of yet unmarked objects. The key point is the
linear traversal of the lists, without searches. In the final pass
(sweep) the mark bits in the "live" list are cleared, and it becomes
(is) the list of all remaining objects (2), and the objects in the
"unknown" list are discarded.

This may sound complicated, but all that happens in linear time. When
swapping will occur during GC, the same will occur in your model, when
the reference lists are updated. Now it should be clear why frequent
updates of the management information should be avoided.
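For concreteness, the mark phase and a simple sweep could be sketched like this; TManagedObj and its fields are invented, and the recursive Mark stands in for the list-splitting variant described above:

```pascal
type
  PManagedObj = ^TManagedObj;
  TManagedObj = record
    Next: PManagedObj;             { (2): chain of all managed objects }
    Marked: Boolean;               { mark bit (could be a stolen pointer bit) }
    Refs: array of PManagedObj;    { references held by this object }
  end;

{ Mark: follow references from a known-alive root. }
procedure Mark(Obj: PManagedObj);
var
  I: Integer;
begin
  if (Obj = nil) or Obj^.Marked then
    Exit;
  Obj^.Marked := True;
  for I := 0 to High(Obj^.Refs) do
    Mark(Obj^.Refs[I]);
end;

{ Sweep: keep the marked objects, discard the rest; returns the new
  list of all remaining objects with the mark bits cleared. }
function Sweep(AllObjects: PManagedObj): PManagedObj;
var
  Obj, Next, Alive: PManagedObj;
begin
  Alive := nil;
  Obj := AllObjects;
  while Obj <> nil do
  begin
    Next := Obj^.Next;
    if Obj^.Marked then
    begin
      Obj^.Marked := False;
      Obj^.Next := Alive;
      Alive := Obj;
    end
    else
      Dispose(Obj);                { unreferenced: garbage }
    Obj := Next;
  end;
  Result := Alive;
end;
```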


I'm not perfect, any comments on above are appreciated.

DoDi





Re: [fpc-devel] TShiftState as enum

2005-02-20 Thread DrDiettrich
Alexey Barkovoy wrote:

 Delphi dowsn't allow sets with ordinal values larger than 255 too:

That's incorrect.

 Borland Delphi  Version 13.0  Copyright (c) 1983,99 Inprise Corporation
 1.pas(2) Error: Sets may have at most 256 elements

Sets are restricted to a maximum of 256 members, but the ordinal values
of the members can be higher.

DoDi




Re: [fpc-devel] Abbrevia and Delphi compatibility

2005-02-17 Thread DrDiettrich
Marco van de Voort wrote:
 
  2) Sets with minimal size, at least with 1 and 2 bytes for replacement
  of Byte and Word types.
 
  I consider both features as vital in translations from C to Pascal, and
  in the detection and elimination of bugs. Will it be possible to add
  these features to FPC, this year?
 
 I'm interested in the last one. What is exactly the problem with it, except
 foreign data, like API declarations (which are rare anyway), and binary
 files?

The size restriction applies almost exclusively to records stored in
files, whereas in-memory data structures usually can be realigned at will.

The use of enums and sets allows for safer code, instead of working with
bitmasks and general ordinal data types and constants. According (set)
variables can be found in many data structures, so that it would be nice
to retype such fields as appropriate sets in Pascal code.
 
 The C argument is a bit doubtful, since there are more such problems (like
 bitpacking, not being able to exactly duplicate any union), so 1:1 remapping
 is not really possible anyway.

Even C bitfields can be emulated perfectly in Pascal, with Object types
and properties.
Named C unions can be emulated by variant records in Pascal, only
anonymous unions sometimes require changes to the code, or the use of
objects with properties. But also some (legacy) Pascal constructs (file
open mode...) could be much nicer and safer with structured data types.
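For example, the C bitfield `struct { unsigned lo : 4; unsigned hi : 4; };` could be emulated like this, in the Delphi form that the Object type supports (type and method names invented):

```pascal
type
  TPackedFlags = object
  private
    FBits: Byte;
    function GetLo: Byte;
    procedure SetLo(AValue: Byte);
    function GetHi: Byte;
    procedure SetHi(AValue: Byte);
  public
    property Lo: Byte read GetLo write SetLo;
    property Hi: Byte read GetHi write SetHi;
  end;

function TPackedFlags.GetLo: Byte;
begin
  Result := FBits and $0F;                          { low nibble }
end;

procedure TPackedFlags.SetLo(AValue: Byte);
begin
  FBits := (FBits and $F0) or (AValue and $0F);
end;

function TPackedFlags.GetHi: Byte;
begin
  Result := FBits shr 4;                            { high nibble }
end;

procedure TPackedFlags.SetHi(AValue: Byte);
begin
  FBits := (FBits and $0F) or ((AValue and $0F) shl 4);
end;
```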

DoDi





Re: [fpc-devel] Abbrevia and Delphi compatibility

2005-02-17 Thread DrDiettrich
Peter Vreman wrote:

  1) Properties for Object type.
 
 Since which Delphi version is a property allowed in an normal object?

I'm not sure, but at least D4 supports such properties.

 Currently FPC has special code that forbids property in objects for delphi
 mode.

Then it would be sufficient to remove that special code?

BTW, I didn't find object properties mentioned in the FPC documentation
(HTML). Properties only are described for Class types?


 We can't promise anything.

Granted ;-)

DoDi





Re: [fpc-devel] TShiftState as enum

2005-02-17 Thread DrDiettrich
Marco van de Voort wrote:
 
   TShiftState is defined as TShiftState = set of (...);
  
   How can I iterate through the enums? If not, can we split and add an
   enum:
  
   TShiftStateEnum = (...)
   TShiftState = set of TShiftStateEnum;
  
   ?
 
  Of course that is possible. It requires some imagination though (and a feel
  for obfuscated Pascal)
 
 The example fails in Delphi 6 btw, but works in FPC :-)
 Delphi does not allow low/high on sets
 
 (I vote to keep this FPC extension btw)

Can somebody enlighten me, what code exactly fails in D6?
What extension does FPC have, that Delphi doesn't have?

And what iteration is desired? I'd use: For Low(x) To High(x)...


Let me add some more notes, regarding Delphi compatibility:

Older Delphi (and TP?) versions implemented sets of subranges (e.g. set
of 510..515) by stripping unused bytes in front of the set. The lowest
bit in a set variable always had an ordinal value of 2^n, and the above
set would occupy 2 bytes, equaling a set of 504..519. In newer Delphi
versions the lowest bit exactly corresponds to the lowest ordinal value
in the range, the above set only occupies one byte (510..517).

AFAIR D6 introduced open enums, with arbitrary (discontinuous...)
values for the members. IMO this only is an attempt to closer reflect C
enums, but sets of such enumerated types are not allowed.


I do not ask for any compatibility in these cases, in detail when the
Delphi implementation differs amongst compiler versions.

DoDi




[fpc-devel] Abbrevia and Delphi compatibility

2005-02-15 Thread DrDiettrich
After eliminating dozens of bugs, I now have a working version of
Abbrevia for Delphi. Unfortunately this version is not usable with FPC,
primarily because FPC doesn't support properties for the Object type.

In my code I like to use Object for records with methods and properties,
which need no dynamic memory allocation. In Abbrevia this feature
allowed to eliminate near 100 methods of various classes, and simplified
more code. After all I found the following essential features missing in
FPC:

1) Properties for Object type.
2) Sets with minimal size, at least with 1 and 2 bytes for replacement
of Byte and Word types.

I consider both features as vital in translations from C to Pascal, and
in the detection and elimination of bugs. Will it be possible to add
these features to FPC, this year?

DoDi




Re: [fpc-devel] Bug in PWidechar and refcounting

2005-02-15 Thread DrDiettrich
Alexey Barkovoy wrote:
 PAnsiChar, PChar are just pointers and not garbage collected by compiler. But
 AnsiString and WideString are compiler managed types. So, as Peter mentioned,
 behaviour you are seeing is by design.

In Delphi WideString is not reference counted at all. The layout of the
string prefix is dictated by the Windows API (BSTR type), and AFAIK
Windows also doesn't use reference counting with this type, or with
OLEVariants containing wide strings.

DoDi





Re: [fpc-devel] File Dates

2005-01-28 Thread DrDiettrich
Michael Van Canneyt wrote:

  What time stamps are in use on the various platforms?
 
 Too various. I suggest using simply TDateTime. It has microsecond
 resolution, which should be more than enough. It offers the additional
 advantage that no transformations are necessary for display  compare
 routines. There are a lot of TDateTime routines in the RTL, they would
 all be at your disposal.

Okay, I'll use TDateTime internally, with the following questions:

FPC defines 1900-1-1 as the start date, whereas Delphi defines
1899-12-30 as the start date - note: neither 1900 nor dec. 31!
This requires different constants for converting a Unix date into
TDateTime, or portable procedures. What's the suggested way for such
conversions?
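For the Delphi epoch the conversion reduces to one well-known constant; a sketch (the function name is invented, and for a compiler with a different epoch only the constant changes):

```pascal
const
  { Days from the Delphi TDateTime epoch (1899-12-30)
    to the Unix epoch (1970-01-01). }
  UnixDateDelta = 25569;
  SecsPerDay    = 86400;

function UnixTimeToDelphiDateTime(UnixSecs: Int64): TDateTime;
begin
  Result := UnixDateDelta + UnixSecs / SecsPerDay;
end;
```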

The next question addresses UTC vs. local time. Usually file times are
displayed in local time. In archives either Unix dates (UTC) or FAT
dates (local time) can be found, so that conversions are required.
Unfortunately I couldn't find a definition of the time, as used in the
FPC SysUtils functions for file dates/times. Is it guaranteed, on all
platforms, that file dates in a TDateTime represent local time, and not
UTC?


Currently I'm extending the FileDate object into a FileSpec object, that
also holds the file attributes, file name, file size, and a file system
flag. I'm not yet sure how different file systems, as defined by gzip,
influence the file related information in gzip or other archives. One
such possible effect is the encoding (character set...) of the file
names. For now at least the methods for FAT and Unix file systems will
be implemented.

The FileSpec object will contain two methods for retrieving and setting
the file related information for disk files. FromFile will collect the
information about a file or directory on disk, for subsequent storage
in an archive. ToFile will apply the file attributes to a file after
extraction from an archive. Then only the conversion between the archive
information and the information in the FileSpec object has to be
implemented for each archive type. The internal information shall allow
for lossless handling of all file attributes, when the archive file
system equals the host system.
It would be nice to apply the file attributes just when a file is
created, instead of after closing a file, but I have no idea whether
this will be possible on all platforms?


The general archive interface will have at least two levels of
abstraction. At the highest level the archive formats will be handled by
corresponding archiver (compressor...) classes. At the lowest level the
encryption and compression methods are handled by further classes. All
available handlers (classes) register themselves at program start, so
that this registry can determine the appropriate handler for a given or
to-be-created file type. The selected file handler in turn can select
the appropriate handlers for compression and encryption. This separation
allows to add further file formats and compression/encryption methods
easily, without any impact on already existing code.
AFAIR Unix has some kind of registry for file types, based on file
extensions and characteristic bytes at the beginning of a file. Does
somebody know more about that registry, so that it could be integrated
into the intended registry for archive handlers?


The Abbrevia controls then can sit on top of that interface, after a
one-time adaptation; specialized components for various archive types
are no longer required. The Abbrevia maintainers didn't respond yet, and
I can understand that very well - nobody likes to hear that his
modifications of the original code are, ahem, crap. But I think that I
can adapt the Abbrevia controls to the new interface myself, though I'd
appreciate some assistance for the implementation of the Unix specific
procedures, and for testing of course. Hands up somebody out there?

DoDi




[fpc-devel] File Dates

2005-01-23 Thread DrDiettrich
Currently I'm trying to define an object for file dates. This object
shall allow comparing time stamps for files on disk and in archives,
and it shall also be usable to set time stamps for such files. Now I'm
undecided what unique internal date/time representation to use in such
an object.

For comparison the minimum resolution of the given time stamps should be
taken into account, so that computations don't result in different time
stamps for the same time. This restriction prohibits the use of
TDateTime, where floating point calculations can result in such
differences.

Do there exist usable data formats for such time stamps?

What time stamps are in use on the various platforms?

What methods would you expect with such an object, so that the date/time
information can be displayed and compared to other time stamps on the
host OS?


AFAIK Unix file time stamps are UTC in seconds since 1970. That format
would be safe for comparisons, calculations only are required for
display purposes (uncritical).

DOS (FAT) file time stamps are in local time, without an indication of
the time zone. Furthermore the date is a calendar date, so that
calculations are required to convert such time stamps into UTC. How to
accomplish a conversion into a unique UTC date, so that the stamps will
compare equal regardless of local time zones or daylight savings?

Win32 has another (NTFS) file time format, using UTC and elapsed time in
100 ns resolution. A translation from/into Unix time stamps should be
safe and easy?
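It should be: the FILETIME epoch is 1601-01-01 and the offset to the Unix epoch is a fixed constant, so the translation can be sketched as (function names invented):

```pascal
const
  { 100 ns intervals between 1601-01-01 (FILETIME epoch) and
    1970-01-01 (Unix epoch). }
  FileTimeEpochDiff = 116444736000000000;
  IntervalsPerSec   = 10000000;

function FileTimeToUnixTime(FileTime: Int64): Int64;
begin
  Result := (FileTime - FileTimeEpochDiff) div IntervalsPerSec;
end;

function UnixTimeToFileTime(UnixSecs: Int64): Int64;
begin
  Result := UnixSecs * IntervalsPerSec + FileTimeEpochDiff;
end;
```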

DoDi




Re: [fpc-devel] ansistrings and widestrings

2005-01-08 Thread DrDiettrich
peter green wrote:
 
 it should be noted that pascal classes are really not suited to doing
 strings.

IMO we should distinguish Strings, as containers, from Text as an
interpretation of data as, ahem, text of some language, in some
encoding, possibly with attributes...

 to do strings with classes you really need language features which fpc
 doesn't have.

Please explain?

 doing strings with non garbage collected heap based classes would make
 something that was as painfull to work with as pchars and that was totally
 different from any string handling pascal has seen before.

FPC has reference counted string and array types, so that GC is
available.

 just as pascal doesn't consider two strings with different cases to be equal
 it should probbablly not consider two strings of unicode code points to be
 equal unless they are binary equivilent.

That's one of the differences between strings and text. All comparable
data types must have associated comparison functions. For numbers and
strings the standard comparison functions are part of the language
(operators), which usually do a simple binary compare. For other data
types such operators can be defined as appropriate. It should be noted
that a comparison for anything but (strict) equality requires
interpretation rules for the data types. E.g. comparing even ordinal
numbers depends on the byte order of the machine, comparing strings
depends on many more attributes, like mappings for upper/lower case.
That's why a programming language, for itself, will supply only
primitive string comparisons, that have reasonable restrictions so
that an implementation should be possible for any platform.

 conversion between ansistring and widestring should be done by functions
 that take one and returns the other (use a const param to avoid the implicit
 try-finally) so that no limitations are put on how the conversion is done.

This applies to all string handling procedures. A modification of
non-const string parameters opens a can of worms (aliasing...)!

 Theese functions should be indirected through procvars so that the default
 fallback versions can be replaced by versions supplied by a unit which
 provides proper internationalisation.

(Inter)nationalization goes far beyond any standard features. Dealing
with natural languages IMO requires more than only dictionaries and
hard-coded translation rules. Every natural language can have their own
rules, how e.g. the words in a message must be modified or rearranged
when message arguments shall be inserted into the text.

IMO we must distinguish between the handling of Characters, Strings and
Text. For the alphabets (character sets) of natural languages it should
be possible to implement functions to compare and convert characters;
such support often is built into the OS, for selected languages. This is
the level where multibyte characters can come in, so that just a
Character can be different from any fixed-size data type, and that the
same Character can have multiple representations - remember your umlaut
example? Nonetheless the rules on the Character level at least are quite
well defined, so that it's possible to implement according standard
procedures for comparison and conversion. Of course these procedures
require parameters like the language and the encoding of the characters,
so that IMO exchangeable and configurable classes are the best containers
for characters.

Strings can be considered as arrays of Characters, so that the string
handling procedures can use the character handling procedures.
Everything else that requires more than processing a stream of
individual characters, is beyond the scope of standard procedures. Here
it can become problematic when a string just contains words from
different languages, because then an automatic detection of the language
and according rules can not be guaranteed. That's why I hold the
programmer liable for the correct description of whatever he puts into a
string object.

DoDi





Re: [fpc-devel] Portability Standards

2005-01-07 Thread DrDiettrich
Jonas Maebe wrote:

 Why? There are cvs clients for pretty much any OS out there. Or does
 your working OS not have a network connection?

I have a network connection, that I use rarely when I need to access
some device built into my old computer. Everything else goes over the
phone line.

But perhaps I missed some important features of the local CVS client?

 Good IDE's have cvs syncing integrated ;)

Lazarus?

DoDi





Re: [fpc-devel] ansistrings and widestrings

2005-01-07 Thread DrDiettrich
Florian Klaempfl wrote:

  The only universal international representation for strings is Unicode
  (currently 32 bit), that doesn't require any conversions.
 
 That's not true. E.g. the german umlauts can be represented by 2 chars
 when using UTF-32 (the char and the two dots), same apply to a lot of
 other languages.

Okay, this is where I didn't understand the difference between code
points and whatever else. In the umlaut and accented cases, doesn't a
unique glyph with a corresponding code exist that could be used in the
first place?
In other languages (Arabic...) the glyph may vary with the context, here
I have no idea how to compare such text, but the native writers
(speakers) of such glyphs should know ;-)

 Encoding isn't the main problem, you need dedicated procecures and
 functions for unicode comparision, upper/lower conversion etc.

Agreed, these will become the string class methods. It may be necessary
to partition Unicode into code pages, with different methods for
comparison etc.

In the worst case, if we cannot find or agree about a so-far unique
representation for text, an uncomparable value has to become a valid
result of a comparison.


 To achive this platfrom independend is very hard ...

How so? I agree that the existence of definitely compatible/portable OS
services is not guaranteed here. But once the methods have been
implemented for platforms that do not have such services at all, those
implementations can be used on all other platforms as well.


All in all I'd say that we do not intend to implement a text processing
or translation system. What we can do is define a string or text class
that contains text in a well defined form, for processing with all
specified methods. The key point is the import of text into an object of
any such class. If no appropriate class has been implemented, the import
is simply impossible. Inside, i.e. between these classes, all the
methods should work - perhaps with graceful "incomparable" or
"unconvertible" results, when somebody insists on using incompletely
implemented classes.
We don't want the impossible; the doable will be sufficient ;-)

DoDi


___
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel


Re: [fpc-devel] Portability Standards

2005-01-07 Thread DrDiettrich
Jonas Maebe wrote:
 
 On 6 jan 2005, at 22:21, DrDiettrich wrote:
 
  The FPC baseunix/unix units mimic more or less the POSIX standard
 
  As already mentioned, I couldn't find these units :-(
 
 They only exist for unix-like OS'es. They are not generic units which
 you can use to port software from *nix to Dos/Windows (although in
 theory they should compile on Windows with a POSIX-compliant libc
 installed, if some small include files with type definitions are made).

A port from C to Pascal requires implementing the OS specific
procedures for all targets, as far as possible. This requires knowledge
of the available target specific procedures that can be used on the
various platforms. In most cases the C code is written for Unix/POSIX,
so that only a connection to e.g. Windows must be implemented, somehow.
In other cases, like Abbrevia, the Unix connection must be implemented,
and that's when I need detailed information about the available FPC
libraries, even if I'm working on Windows.


In my discussion with Linux people, about the distribution of
FSF/GNU/open source... code, I had an idea for a more practical system
for writing and distributing portable code. My ideas were not understood
by the C people, perhaps I'll find a less biased audience here?

Most readers will know the procedures used to install portable C code on
e.g. Linux, from a *.src.tar.gz or the like. The most critical step is
./configure, where the package tries to find out all required
information about the host and target system. IMO it's not a good idea
to leave it to every single package to implement these tests. In the
case of C code it would be much simpler to support a set of common
header files, containing all ported data types and procedure prototypes.
These header files map the portable functions to the functions available
on the actual target, and provide defines for the available
functionality (memory mapped files just come to mind). These header
files can be distinct from the header files of the platform and
compiler; they only have to provide the appropriate mapping to the
available procedures. The difference from the current automake...
procedure is the simplified maintenance of these header files, required
exactly once for every target OS. Then not much is left that configure
or the author can or should do at all. The makefiles can use platform
specific macros for e.g. producing shared libraries, making libtool
superfluous.

The only decisions left to the author of portable software are about
the workarounds when a feature, like MM files, is not available on a
platform. In the simple case a make will fail with unresolved external
symbols; in the more elaborate case the code will check for the
availability of a feature, and use standard procedures on less
comfortably equipped platforms instead.

Even if the exact procedures are somewhat different for C and Pascal,
e.g. the standard header files correspond to Pascal units, the idea IMO
already is part of FPC. What remains to do is the creation and
maintenance of further portable units, in addition to (or extending) the
already existing portable System, Classes, SysUtils etc. units.


Wouldn't it be great to demonstrate to the C world how easy it can be to
write portable software, based on the FPC philosophy? Not that I want to
convince anybody to use Pascal instead of C, but I dream of open source
code that compiles and runs on every (supported) platform, without
autobloat and all that can of worms :-)

I would even accept that Windows is not an officially supported
platform, because then I would maintain the portable units for that
target myself. That wouldn't be really hard to do; it would only be
required to check the changes to the portable header files or units
(interface sections), and to implement the new or changed functionality
as appropriate for Windows. Nothing more would be required to turn
Windows or any other best-hated <g> OS into a supported platform for
really portable code ;-)

DoDi



___
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel


Re: [fpc-devel] Portability Standards

2005-01-06 Thread DrDiettrich
Michael Van Canneyt wrote:

 POSIX says nothing about pascal, it's basically a C interface.

To me POSIX means primarily the very different file handling, with
regards to devices, path separators, owner/group/world access rights
etc. This is what bites not only me when porting GNU software to
Windows.

 The FPC baseunix/unix units mimic more or less the POSIX standard

As already mentioned, I couldn't find these units :-(


 No. TStream. An archive stream does not have a handle.

It was so in my first approach. But you are right, an archive or
decompressor stream should use an input stream, as provided by the
calling application.


  That's why I choose the well defined C names for now, like size_t,
  off_t.
 
 1. They are not well-defined, but this is another story :-)

Hmm, I refer to the GNU libc documentation, so I thought that some types
and procedures are well specified there.

 2. I absolutely HATE underscores in types/variables/whatever
for documentation purposes it is a horror (I use TeX)

I also don't like underscores, but for porting IMO it's a *must* to
retain these names. How else should somebody know which Pascal names to
use for these types?


  Apropos optimization, in many FPC units I found constructs like:
if x = 0 then f(0) else f(x)
  In these cases the IF is superfluous.
 
 You can send a list of such things to any core member.

Okay.


  DWORD often is used for exactly 32 bits, so that Cardinal is not an
  appropriate replacement.
 
 Cardinal = 32 bits _always_. Also on 64 bit.

As I understand the Delphi documentation, Integer and Cardinal have an
unspecified size. In a 16 bit environment they have 16 bits, in 32 bit
environment 32 bits. Of course one could say that even on a 64 bit
machine the most frequently used numbers will have 32 bits as well...
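
The actual sizes are easy to verify on any given target; a short FPC
program (a trivial sketch, nothing here beyond standard types) prints
them:

```pascal
program SizeCheck;
{ Prints the sizes of the common integer types on the current target,
  to check whether Integer/Cardinal are 32 bits here. }
begin
  WriteLn('SizeOf(Integer)  = ', SizeOf(Integer));
  WriteLn('SizeOf(Cardinal) = ', SizeOf(Cardinal));
  WriteLn('SizeOf(Int64)    = ', SizeOf(Int64));
end.
```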


  Perhaps we should discuss the (new and old) data types explicitly, for a
  precise description of the intended use, naming, and general properties
  (signedness, fixed or minimum bit size...).
 
 The old are not subject to discussion.

It would make life easier, with regards to 64 bit machines, 64 bit file
sizes etc.  I definitely see no problems if the file-size type
(off_t) were retyped from LongInt into e.g. TFileSize in the TStream
class(es). This is one of the rare cases where one can learn from
Windows and C ;-)

DoDi



___
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel


Re: [fpc-devel] Portability Standards

2005-01-06 Thread DrDiettrich
Tomas Hajny wrote:

  I only don't know how to implement or check the other branches - is the
  Windows version of FPC equipped for crosscompilation?
 
 The compiler itself can compile for all platforms listed in help pages

The problem is that the installed FPC doesn't show any help - no files
installed. There's something wrong with that FPC version :-(
I could only install an older version, but AFAIR that one cannot cross
compile at all. The new version is out of reach to me, unless I get it
on a CD.

 Alternatively, you can compile with -s parameter, which gives you the
 generated assembler code for your source files and scripts necessary to
 assemble and link somewhere

That would be sufficient; I mainly need a syntax check. Best in
Lazarus, so that I can browse through the related source files...

DoDi



___
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel


Re: [fpc-devel] FPC 1.9.6 (a.k.a. 2.0.0-RC1) is out

2005-01-06 Thread DrDiettrich
Peter Vreman wrote:
 
  If no major bugs are found, version 2.0 will be released in a short
  timeframe.

Only one major problem so far. Congratulations! :-)

  So let me list the problems I encountered with a preceding version:
 
  - Mouse buttons inoperative (W2K, no menu selection...)
  - \\ contained in all stored paths, IMO due to input of the FPC
  directory ending with an \.
 
 Please only report bugs on the latest version.

The above bugs still apply :-(

Here are the bugs with W2K:

- The mouse buttons don't work. The clicked character is highlighted,
but that's all. Applies to installer and IDE.

- The colors in the install dialog make the (inactive) text almost
unreadable.

- When text is entered for the path, the previous text is entirely
erased.
- This is why I entered the path with a trailing backslash, whereupon
the fpc.cfg contained \/ in all paths. When the installer suggests using
backslashes, and then adds further parts with forward slashes, then it
should check for both. The display of the path to enter, for using
fpc, shows all forward slashes, i.e. // instead of the stored \/.

Now for Win98, same machine:

- In the "directory already exists" dialog the buttons are empty, no
graphics, no text. (a common problem?)
- Same problem with the paths in fpc.cfg.
+ The mouse works, so far.
+ The colors make the text better readable.
+ The path text is not erased, can be edited.

Let me know when the installer is updated, then I will check this again.
Somebody else should check with WinXP!


Some more instructions would be nice for the Win32/separate
distribution. The readme.txt refers to the full archive, with no hints
about the separated files. It should be noted that the download of
install.exe and (what?) more is required. The units-* are almost self
descriptive, but what about the other archives?


Some tool (Netscape?) changed the dots in the filenames into
underscores, so that nothing but the docs were installed in my first
try. I also got an empty install.dat, due to unreported connection
problems with the ftp server. If I hadn't compared the file sizes, I
would have had no idea what had gone wrong. These problems IMO are also
worth a note in the readme. Windows users are a bit stupid, you know?
;-)

I still have to download the IDE, until now everything seems to work :-)
Are make and as/ld required as well?

DoDi


___
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel


Re: [fpc-devel] FPC 1.9.6 (a.k.a. 2.0.0-RC1) is out

2005-01-04 Thread DrDiettrich
[EMAIL PROTECTED] wrote:

 If no major bugs are found, version 2.0 will be released in a short
 timeframe.

I would like to test the Win32 distribution, but the archives are too
big for a download (narrow band) :-(

So let me list the problems I encountered with a preceding version:

- Mouse buttons inoperative (W2K, wireless USB mouse: no menu
selection...)
- Install dialog: last tab not reachable (no mouse support, Alt-C -
Continue!)
- \\ contained in all stored paths, IMO due to input of the FPC
directory ending with an \.

DoDi



___
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel


Re: [fpc-devel] Portability Standards

2005-01-04 Thread DrDiettrich
Tomas Hajny wrote:

  {$ifndef unix}
  {$i abiuwin.inc} // more to follow later: e.g. Mac OS, Netware etc.
  {$else}
  {$i abiulin.inc}
  {$endif}
 
 There's at least one (IMHO not worse at least) alternative to that (already
 used in FPC itself among others) - keep the include file name the same, place
 it in separate directories for different targets (like UNIX, WIN32, etc.) and
 provide the right paths using a Makefile (see e.g. fcl/Makefile.fpc + Makefile
 in our source tree). The advantage of this approach is that you don't need any
 ifdefs at all (especially when there are more targets supported it could get
 somewhat messy).

A good idea - at least for FPC ;-)
For Delphi it does no harm to implement just the Windows specific part,
in a dedicated directory.

I only don't know how to implement or check the other branches - is the
Windows version of FPC equipped for crosscompilation?

DoDi



___
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel


Re: [fpc-devel] Abbrevia Port (was: Portability Standards)

2005-01-04 Thread DrDiettrich
Michael Van Canneyt wrote:

  Question: What's preferrable, a direct port of the Abbrevia library, or
  a new and better portable design instead, that interfaces with the not
  otherwise available worker classes as implemented in Abbrevia?
 
 Second option.

Here's my general idea of an Abbrevia compatible redesign:

The working name for this new project is Directories+DirectoryItems,
with a di prefix for the unit names etc. The project unifies the
handling of archives, compressed and encrypted files.

The basic objects are Directories and DirectoryItems. This should allow
to cover the file systems of all platforms, as well as archive files,
networks etc.

Directories are DirectoryItems themselves (subdirectories), in general
containers for DirectoryItems, with according management functions. One
method allows to enumerate all contained DirectoryItems. A callback
function can process the items and signal abort (when the file is found)
or recurse (into a subdirectory). This IMO is a better replacement for
the FindFirst/FindNext/FindClose crap, applicable also to the contents
of archive files.

Other DirectoryItems are files and links.
Links transparently wrap the linked file for file related operations,
and have additional methods for managing the links themselves
(redirect...).
File items can be opened and closed, open files have an according Stream
object/property.
Archive files must be mappable into Directories, somehow. A Mount method
might return the appropriate Directory object for the files inside an
archive.

Question:
On Posix (Unix?) file systems the ownership (UID, GID) as well as
specific file (executable...) and directory (sticky...) attributes are
important, when extracting files from archives. Does FPC already provide
according portable access and management procedures?

DoDi



___
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel


Re: [fpc-devel] Portability Standards

2005-01-04 Thread DrDiettrich
Marco van de Voort wrote:

 You might also want to have a look at
 
 http://www.stack.nl/~marcov/porting.pdf
 
 and
 
 http://www.stack.nl/~marcov/unixrtl.pdf

Ah, thanks :-)

 There are 4 cases for Unix:
 
 1 Kylix
 2 FPC/Linux/x86 reusing Kylix libc code.
 3 FPC/Linux/x86 using general FPC unix code
 4 Other FPC Unix targets using general FPC unix code
 
 Since 1-2 is an easy port often, so I think in general FPC/Linux/x86 should
 remain switchable between libc and FPC-UNIX mode.

How important is Kylix at all? Isn't it almost dead now?


 I think it would wise to not use exceptions too deep in the code.
 Compression is one of those things where OOP and exception overhead still
 count.

I'd use exceptions only for fatal errors, which can be caught in the
application code. The cost of raising such exceptions is negligible,
and catching outside the compressor does not affect the efficiency of
the worker code. The worker code can become simpler this way, because
error checks at every level then are superfluous. OTOH all local
resources then must be protected by try-finally blocks...
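
The resulting pattern, sketched with an invented exception class and
procedure name (BufSize and the worker loop are placeholders):

```pascal
uses
  SysUtils, Classes;

const
  BufSize = 64 * 1024;  { placeholder buffer size }

type
  { One general exception class for all de/compression errors. }
  EDecompressionError = class(Exception);

procedure DecompressStream(Src, Dst: TStream);
var
  Buf: PByte;
begin
  GetMem(Buf, BufSize);
  try
    { ... worker loop: read from Src, decompress into Buf, write to
      Dst. No error-code checks at every level - a fatal condition
      simply raises EDecompressionError, caught in the application. }
  finally
    FreeMem(Buf);  { the local resource is released on any exit path }
  end;
end;
```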

DoDi



___
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel


Re: [fpc-devel] Portability Standards

2005-01-04 Thread DrDiettrich
Michael Van Canneyt wrote:

 The FPC units are not POSIX, hence, UNIX.
 (long threads have already been spent on that, and it is a done deal)

I don't want to resurrect a discussion, but can somebody give me an idea
how UNIX and POSIX are different, with regards to FPC?

  Question: What's preferrable, a direct port of the Abbrevia library, or
  a new and better portable design instead, that interfaces with the not
  otherwise available worker classes as implemented in Abbrevia?
 
 Second option.

I have already started with a revival of my TXStream classes, now with
hopefully better portability. One problem is the required changes and
extensions to the TStream base class. The new stream classes will be
either input or output streams, and will have I/O buffers used for
encryption and compression. What's the recommended base class for this?
THandleStream?

  3) Data Types
  off_t now replaces the Integer/LongInt/Int64 mess, for all file size
  related values. I took this name from the C standard, but perhaps it
  should be replaced by some Pascal name?
 
 I think the best way - for now - is to define a set of basic types
 in some base unit:
 - TFileHandle
 - TSeekOffset
 - ...
 Which should be used consistently throughout the system.

That's why I chose the well defined C names for now, like size_t and
off_t.
off_t could be replaced by TFileSize, a name that makes clear why one
moves from 32 to 64 bits for huge files, whereas TSeekOffset also
indicates that the type must be signed, for moves in both directions.
size_t need not necessarily become a new type. I used it for the amount
of data in Read and Write, where (signed) 32 bits are sufficient on 32
bit machines, with regards to buffer size in memory.
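
A possible set of aliases, in a single base unit; the names and the
concrete mappings are suggestions only, not an existing FPC API:

```pascal
type
  { Suggested portable aliases for the C-derived types. }
  TFileSize   = Int64;    { off_t: large enough for huge files }
  TSeekOffset = Int64;    { signed, for seeks in both directions }
  TIOSize     = LongInt;  { size_t: length of a memory buffer;
                            32 bits suffice on a 32 bit machine }
```

Retargeting all stream classes then means changing exactly these
definitions, instead of hunting down every Integer/LongInt/Int64 use.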

 These base types can then be mapped on a per-os basis to whatever fits best.
 (or to basic FPC types, when (not if!) FPC introduces them.

The best base unit IMO is where the low level I/O procedures are
defined, which wrap the OS I/O calls (FileRead...).

 The current implementation is done for compatibility with Delphi, and
 for optimization.

Hmm, compatibility with which Delphi version? ;-)

Apropos optimization, in many FPC units I found constructs like:
  if x = 0 then f(0) else f(x)
In these cases the IF is superfluous.

  DWORD will be replaced by cUInt32 (or equivalent - what?), as far as
  this type is used outside dedicated MSWINDOWS code.
 
 The native pascal type is 'Cardinal'. But see the remark about basic types.

DWORD often is used for exactly 32 bits, so that Cardinal is not an
appropriate replacement.

Perhaps we should discuss the (new and old) data types explicitly, for a
precise description of the intended use, naming, and general properties
(signedness, fixed or minimum bit size...).

DoDi



___
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel


Re: [fpc-devel] Portability Standards

2005-01-02 Thread DrDiettrich
Michael Van Canneyt wrote:

  1) Target Dependencies

 Agreed 100%.
 In general, a component suite should have all os-dependent code in a single
 unit, presenting the rest of the suite with a uniform API.

Fine :-)
But how should that code be implemented? For various target platforms?


  2) Code Checks

 Why, you can simply cross-compile ?

No, I can't, neither in Delphi nor (AFAIK) in the Windows version of
fpc.


  3) Standard Units

 Compiler defines the endianness.

Correct, but this definition must be evaluated to produce the
appropriate code. E.g. RPM defines a function htonl (read:
HostToNetworkLong), which is used to convert Long values in network
byte order (big endian) into the host byte order. Such a function
should reside in a common unit, where the according implementation
depends on the endianness of the host machine. It should also have a
handy name, so that it's almost obvious how to replace htonl in the
ported code.
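
A sketch of such a function follows. The ENDIAN_BIG conditional symbol
is assumed here; check which endianness define the actual FPC release
provides before relying on it:

```pascal
{ Network byte order is big endian. On a big endian host this is a
  no-op; on a little endian host the four bytes are swapped. }
function HostToNetworkLong(Value: LongWord): LongWord;
begin
  {$IFDEF ENDIAN_BIG}
  Result := Value;
  {$ELSE}
  Result := ((Value and $000000FF) shl 24) or
            ((Value and $0000FF00) shl 8)  or
            ((Value and $00FF0000) shr 8)  or
            ((Value and $FF000000) shr 24);
  {$ENDIF}
end;
```

Because the branch is resolved at compile time, the big endian build
carries no runtime cost at all.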

 System unit has 'guaranteed' types.

Fine, but these names often are not very descriptive, like ShortInt and
SmallInt. Names like cUInt16 make it more obvious that the type has a
fixed size and is unsigned.  More standard C names have to be mapped to
the predefined types. E.g. in Abbrevia the file size type (IMO) was
initially Integer, later conditionally extended to LongInt and then to
Int64, depending on the capabilities of the compiler and platform. Here
it can be seen that replacing some type by the currently appropriate
System standard name is not a good idea. A distinct data type like
off_t (standard C) would make all related adaptations much simpler,
with exactly one definition of that data type.

  - OS specific data types and procedures
 
 ? This is not portable.

When porting code, that was written for a specific platform, the names
of the procedures and related data types should not be changed. Instead
these procedures should be declared in a general unit, together with the
according implementations for other platforms. Please note the different
handling of OS functions and types, where the names should not be
changed, in contrast to the htonl function mentioned above, that should
be renamed to the according (to be defined) common procedure name.

  The best compromise might be a mix of both methods. The interface can be
  fully procedural, and objects with virtual methods are used only in the
  implementation part, and only when this approach makes sense. Perhaps
  somebody finds another way to achieve the same goal, with less
  disadvantages?
 
 
 I don't think you should worry about virtuals.

I don't worry at all, but many people (assembler freaks...) will. That's
why I want to collect opinions before proceeding.

 For archiving purposes,
 it's the compression/decompression algorithms that will take most time,
 and tha actual reading/writing from stream.

This is a rather obvious case, but other cases exist...


  I'm willing to demonstrate my ideas in a redesign and extension of
  Abbrevia, so that we have a more concrete base for further discussions. But
  before starting with that work I would like to hear some encouraging <g>
  opinions or suggestions.
 
 I think you can do this. I will be pleased to help where I can.
 But send a proposal before you start, I wouldn't want to you
 end up rewriting half your code after a discussion. ;-)

After a closer look at Abbrevia I'm not sure whether it's a good example
for porting code. The recent Kylix extension is crap, and also the
preceding conversions into Windows specific procedures and Pascal
standard types must be reverted to the original definitions. That's why
I'd prefer to use the better ported and portable code from e.g. zlib and
bzip2, instead of again porting the according miserable code from
Abbrevia. More to follow...

DoDi



___
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel


Re: [fpc-devel] Portability Standards

2005-01-02 Thread DrDiettrich
Michael Van Canneyt wrote:

  I'm willing to demonstrate my ideas in a redesign and extension of
  Abbrevia, so that we have a more concrete base for further discussions. But
  before starting with that work I would like to hear some encouraging <g>
  opinions or suggestions.
 
 I think you can do this. I will be pleased to help where I can.
 But send a proposal before you start, I wouldn't want to you
 end up rewriting half your code after a discussion. ;-)

Here some comments and questions about the Abbrevia port.

1) Conditionals
I've redesigned the AbDefine.inc file, with two important changes:

In the first place a DEBUG symbol allows to replace everything else by
another AbDebug.inc file. This switch will allow (me) to define the OS
etc. at will, for debugging purposes. Defining this symbol and the file
contents is entirely up to the user, the file will not occur in CVS or
any distribution.

In the next place the FPC symbol is used to replace the Borland specific
part by an AbFpc.inc file. This file, with an appropriate name, can be
contributed by some FPC user, I don't care for this conditional branch
for now.

Question: Rename all files? All lowercase...?

The LINUX symbol can clash with the according FPC symbol. It should be
renamed e.g. into KYLIX - does somebody know what Kylix defines for
itself?

Question: How to use the FPC provided target symbols?
I'm not happy with the UNIX keyword; I'd use POSIX instead. But that's
only my opinion, I don't want to introduce another symbol.
I'm also not familiar with the differences between the Unix/POSIX
systems. Currently I make no such distinctions myself, I only
separate the existing code into conditional MSWINDOWS and UNIX parts.


2) File Restructuring
I've separated the spaghetti code in AbUtils.pas into distinct MSWINDOWS
and UNIX sections, each containing complete procedures. These sections
could be moved into dedicated OS specific include files - what's the
preferred way?

It may be desirable to use another dedicated unit for strictly platform
dependent procedures, apart from AbUtils?

According to my first impression of the Abbrevia coding conventions I'd
prefer to use the better ported and portable code in the already
existing paszlib and bzip2 libraries, instead of porting the according
Abbrevia implementations. Perhaps the whole de/compressor part of
Abbrevia could be reduced to the base classes, from which the new
wrappers for the FPC ports would inherit.

Question: What's preferable, a direct port of the Abbrevia library, or
a new and better portable design instead, that interfaces with the not
otherwise available worker classes as implemented in Abbrevia?

A direct port IMO requires more work than a redesign.
In a redesign much Windows/Delphi/Kylix specific stuff could also be
removed or replaced by more portable procedures. The entire file
handling stuff in Abbrevia IMO is too Windows-centric; the Linux/Kylix
extensions are only patchwork that tries to map the Windows file
handling to the very different POSIX file handling, which can result in
loss of file attributes on POSIX platforms.


3) Data Types
off_t now replaces the Integer/LongInt/Int64 mess, for all file size
related values. I took this name from the C standard, but perhaps it
should be replaced by some Pascal name?
IMO this type also should be used in the TStream classes, replacing the
current conditional distinction (seek64bit) between 32 and 64 bit
methods. Why write two implementations, when the compiler can use the
actual type definition immediately?

DWORD will be replaced by cUInt32 (or equivalent - what?), as far as
this type is used outside dedicated MSWINDOWS code.

The 16 bit Word data type IMO should not be used in 32 bit code, except
where required in data structures with a fixed layout.
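
A typical case of such a fixed layout is an on-disk structure; as an
illustration, a simplified and incomplete fragment of the ZIP local
file header (the record name is invented):

```pascal
type
  { "packed" prevents the compiler from inserting alignment gaps,
    so the record matches the on-disk byte layout exactly. }
  TZipLocalHeader = packed record
    Signature:   LongWord;  { $04034B50 in ZIP files }
    VersionNeed: Word;      { version needed to extract }
    Flags:       Word;      { general purpose bit flags }
    Method:      Word;      { compression method }
    ModTime:     Word;      { last modification time (DOS format) }
    ModDate:     Word;      { last modification date (DOS format) }
    { ... further fields (CRC-32, sizes, name length) omitted ... }
  end;
```

Here the 16 bit fields are dictated by the file format; everywhere
else the natural register-sized types are preferable.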

Similar considerations and decisions will be required during porting, I
only presented these examples as a concrete base for a general
discussion.


4) Exceptions, Error Codes etc.
IMO it would be sufficient to use one general exception class for the
de/compression errors. An application cannot draw any advantage from
many specialized exception classes, and exception handlers inside a
specific class can use the class specific error code in the exception
object.

For the same reasons I think that public error codes only should be
specified for those error conditions that can or must be handled in some
specific way. These common codes then should be used by all
de/compressor classes.

DoDi


___
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel


Re: [fpc-devel] Interface to compressed files and archives

2005-01-01 Thread DrDiettrich
[EMAIL PROTECTED] wrote:

  E.g.: gzip.xyz, is this based on a gzip unit or a gzip variable or...
 
 Does this matter to you ?
 
 Normally one never uses a fully qualified identifier.

And that can become a problem when a variable and a unit have the same
name. That's why I prefer not only to prefix type names with T, but
also unit names with u, form (unit) names with f, etc. As prefixes
for specific kinds of units seem to be in use by other people as well,
why not prefix all units?

 Only when a possible name conflict exists, which
 - Should be very rare, and avoided in the first place.
 - In such cases it will be obvious from the context.

Okay, name clashes between unit and variable names should be detectable
easily. But then a decision has to be made, which of both names should
stay unchanged, and which one to decorate. My preference then is to
decorate the unit names, because these occur less frequently in source
code, and almost only in obvious Uses clauses.

I know that my private prefix style is a bit uncommon, as is my coding
style (indentation...). In shareable contributions I'm willing to follow
the more widely accepted standards, of course :-)

DoDi



___
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel


Re: [fpc-devel] compiler bug?

2005-01-01 Thread DrDiettrich
Nico Aragón wrote:
 
   IIRC, any non-zero value is evaluated as True for a Boolean variable.
 
  You should not guess about any implementation.
 
 I don't. Do I?

Yes, you do. How can you know what bit pattern is stored in a boolean
variable? Using typecasts may result in silent type conversion,
returning something different from the actually stored pattern.

The boolean data type is distinct from other data types, so that it's
wild guessing that every compiler and compiler version will treat
boolean values and variables in the same way, as observed when using a
specific version of an specific compiler.

For completeness: many people also consider ByteBool etc. as being
boolean. These types have been made compatible with Boolean
*intentionally*, with the documented behaviour that any non-zero value
will evaluate to True, under certain circumstances. But you may test
yourself what happens when you compare such a variable with a
non-boolean value - it's not perfectly specified when the nonzero=true
evaluation will be effective. Try this with various compilers:

var b: ByteBool;
...
case b of
  True:  ...
  False: ...
  42:    ...
  else   ...
end;

It's unspecified which compiler will accept the True and False constants
here at all...

DoDi



___
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel


Re: [fpc-devel] compiler bug?

2004-12-31 Thread DrDiettrich
Nico Aragón wrote:

 IIRC, any non-zero value is evaluated as True for a Boolean variable.

You should not guess about any implementation. Forcing out-of-range
values into strictly typed variables is a user bug, at the full risk of
(and shame on) that user.

Who's to blame when somebody applies FillChar on a pointer variable, and
the program consequently crashes?

  else
 WriteLn('Other');

This should better read:
  WriteLn('corrupt data space!!!'); Panic;

DoDi



___
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel


Re: [fpc-devel] Interface to compressed files and archives

2004-12-31 Thread DrDiettrich
Marco van de Voort wrote:

 Better have a separate way. Otherwise you can't set e.g. a compressionlevel
 for that stream, _or_ you have to have lots of different constructors.

Compressors can require any kind and number of arguments, that must be
reflected somewhere, e.g. in the specific constructor. It's not
guaranteed that a compressor will have usable defaults for all
parameters.

For now only decompression will be supported by a general function,
where the decompressor (class) can be selected from the file extension
or header; the decompressor should then be able to determine the
appropriate parameters from the concrete data source.

 One other thing to keep in mind (iirc) is that some algo's require the
 uncompressed size to unpack, and some the compressed size. So probably
 your interface has to support both.

Good to know. But the uncompressed size cannot be known in the general
case, so such information must be supplied together with the
compressed data. The wrapper processor then must know how to retrieve
the unpacked size and the data from its own input stream.

 And use a 64-bit size and an endianness indicator if possible.

The 64 bit issue should be handled by the basic TStream class, where
the corresponding data types for size_t, off_t (or equivalents) should
also be defined appropriately. The endianness of the target system can
also be selected at compile time - but how?
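As far as I can tell, FPC predefines ENDIAN_LITTLE or ENDIAN_BIG for
the target, so a stream wrapper can fix the byte order of on-disk data
at compile time. A sketch, assuming the file data is little-endian:

```pascal
{ convert a little-endian 32 bit value from a file to native order }
function LEtoN32(x: LongWord): LongWord;
begin
{$IFDEF ENDIAN_BIG}
  Result := (x shl 24) or ((x and $FF00) shl 8)
         or ((x shr 8) and $FF00) or (x shr 24);
{$ELSE}
  Result := x;  { little-endian target: nothing to do }
{$ENDIF}
end;
```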


 Search for zfs (zip filesystem). It was FPC compat for a while.

Can you be a bit more specific? I traditionally have problems with
searching :-(
My first search resulted in 6000 hits, mostly related to zip drives. My
second search retrieved two messages about errors in a StarKit...
package.

  I already
  decided to replace my own stdc unit by the FPC libc unit, with
  hopefully no changes to that unit.
 
 Then change them again to use BaseUnix/Unix :-)

I could locate 2 BaseUnix.pp units, for Linux and BSD. But I'm
developing under Windows :-(

What I really need is a unit with commonly used type names, so that all
modules ported from C to Pascal share the same types. The
module-specific types are less important, because these are not shared
with other modules; nonetheless even those types should be defined
based on common types, like cUInt8.
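Such a unit could be as simple as the following sketch (all names are
made up for illustration, patterned on cUInt8 above):

```pascal
unit uCTypes;  { hypothetical shared home for C-compatible types }

interface

type
  cInt8   = ShortInt;   cUInt8  = Byte;
  cInt16  = SmallInt;   cUInt16 = Word;
  cInt32  = LongInt;    cUInt32 = LongWord;
  cInt64  = Int64;      cUInt64 = QWord;

  { module-specific types then build on these: }
  z_off_t = cInt32;

implementation

end.
```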

 The libc unit is not a base unit of FPC, but exists merely for Kylix
 porting purposes, since it is pretty much only a direct x86 glibc
 translation, and not a general POSIX abstraction. It is only supported for
 Linux/x86.

Good to know. I already wondered how the implementation could work on
Windows.

 As said, a portable subset is available in units baseunix/unix

Where? I couldn't find any such directory and unit in the FPC sources
:-(

A related question: which (source...) directories must I add to the
search path, for using FPC and (primarily) Lazarus on Windows?

 See http://www.freepascal.org/docs-html/rtl/  for some docs
 http://www.stack.nl/~marcov/unixrtl.pdf

Just downloaded, but I could find a /rtl directory neither on the
server nor in the expanded docs-html.zip.

DoDi




Re: [fpc-devel] Interface to compressed files and archives

2004-12-31 Thread DrDiettrich
[EMAIL PROTECTED] wrote:

 Naming a unit with 'u' standard does not seem useful to me, but this is
 a matter of taste.
...
 All other files are assumed to be units.
 (projects/packages have distinct extensions anyway)

No problem at the directory level, but how are the names of units,
types, variables etc. to be distinguished in qualified references?
E.g. gzip.xyz - is this based on a gzip unit, or a gzip variable, or...?
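The ambiguity in a nutshell (gzip is a hypothetical unit here; whether
a compiler accepts the shadowing declaration at all already differs
between dialects):

```pascal
program Ambiguity;

uses
  gzip;  { assume this unit exports a variable xyz }

var
  gzip: record xyz: Integer; end;  { shadows the unit name }

begin
  gzip.xyz := 1;  { now resolves to the local record; the unit's xyz
                    is no longer reachable by qualification }
end.
```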

DoDi





[fpc-devel] Interface to compressed files and archives

2004-12-29 Thread DrDiettrich
Hi there,

I'm new to this list and want to introduce myself and my intended
contributions to FreePascal.

My name is Dr. Hans-Peter Diettrich, and I live in Flensburg (Germany).
For brevity I usually sign my messages as DoDi. My main interests are
decompilers and (tools for) porting code. Usually I work with Delphi,
but this behaviour may change ;-)

Recently I came across some interesting library modules of FPC that I
want to use in my own projects. Some of these modules deserve updates,
in general and for use with Delphi, and I want to contribute the
resulting work to the FPC community.

Currently I'm implementing an RPM clone for Windows, which in
particular should support source RPMs better than the original RPM
does. For this I have to deal with compressed files in various formats
(gzip, bzip2) and archive files (cpio, tar...). I've already updated or
implemented some of these modules; now I want to define a common
interface and API for compressed and archive streams, based on
TStreams. The zstream unit is dedicated to a single compressor, but it
has a handy name. How should I name a more general unit - would
zstreams be acceptable?

My idea of a general (de-)compression interface is as follows:

In the general decompression unit a list of all available compressors
is maintained; every implemented and used compressor adds itself to
this list in the initialization section of its main unit.

Then a general Open or Decompress procedure can determine which
decompressor to use for a given stream, and can create the appropriate
decompressor object. For compressors it may be better to create the
object directly, according to the desired compression format, in which
case the required arguments can also be passed to the constructor of
that class in the appropriate form.
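The registry idea from the two paragraphs above could be sketched like
this (all names hypothetical):

```pascal
unit uDecompress;  { hypothetical sketch of the proposed registry }

{$mode objfpc}{$H+}

interface

uses
  Classes;

type
  TDecompressorClass = class of TStream;

  TCompressorInfo = record
    Ext: string;                     { e.g. '.gz' }
    Magic: array[0..1] of Byte;      { header signature, e.g. $1F $8B }
    StreamClass: TDecompressorClass;
  end;

var
  Compressors: array of TCompressorInfo;  { the global registry }

procedure RegisterCompressor(const Info: TCompressorInfo);

implementation

procedure RegisterCompressor(const Info: TCompressorInfo);
begin
  SetLength(Compressors, Length(Compressors) + 1);
  Compressors[High(Compressors)] := Info;
end;

end.
```

Each compressor unit would then call RegisterCompressor from its
initialization section, and a general Open routine would match the
stream header against Magic (or the file name against Ext).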

The use of the de/compression stream objects should be obvious: Read or
Write is called until EOF. The legacy C code of the compressors is
based on error codes and conditions that must be checked after almost
every call to an internal function, and which are available as the
final result after the information is fully processed. I want to modify
that model so that errors raise the predefined stream exceptions. This
approach will simplify, and make more transparent, the existing code as
well as the application code. It also allows hiding the
compressor-specific error codes from the application. Such a change
will be incompatible with the inherited decompressor APIs, but does
anybody see a need to further support alternative and specialized
access to de/compressors, beyond the stream support?
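The proposed error model can be centralized in a single check; a
sketch (the name CheckCode and the sign convention of the legacy
result codes are assumptions):

```pascal
uses
  SysUtils, Classes;

{ translate legacy C result codes into stream exceptions in one place }
procedure CheckCode(Code: Integer);
begin
  if Code < 0 then
    raise EStreamError.CreateFmt('decompressor error %d', [Code]);
end;
```

Every internal call then becomes CheckCode(inflate(...)) or similar,
and the application sees only the stream exception.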

If we can agree about the above details, I plan to convert the gzip,
bzip2 and zip modules to that common interface. I'm also willing to
update further modules for use of that interface, provided that the
modules already exist as Pascal source code.

---

Archive files deserve a more elaborate API, so that the files in an
archive can be extracted to individual files or streams. There has
already been a suggestion to define something like a virtual file
system interface for archive files. I suspect that something like this
already exists for use in the GUI browsers of both Linux and Windows.
This may deserve some research before an accordingly compatible
interface can be defined. For now I'm waiting for contributions from
the OS gurus before proceeding with this approach.

A much simpler interface could be based on enumeration and callback
procedures, which allow processing existing archive files sequentially.
It may also be possible to create a directory tree for an archive, but
for now I will leave such an implementation to somebody else ;-)
For the creation of new archive files, methods are required to add
files to the archive directory. The simplest approach will be based on
physical (existing) files, whose attributes the archiver can retrieve
from the existing file system. Then the application code need not care
about all the related details.
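The enumeration/callback interface could be sketched as follows (all
names hypothetical):

```pascal
uses
  Classes, SysUtils;

type
  TArchiveEntry = record
    Name: string;
    Size: Int64;
    ModTime: TDateTime;
  end;

  { invoked once per archive member; Data is positioned at the
    member's contents }
  TEntryCallback = procedure(const Entry: TArchiveEntry;
                             Data: TStream) of object;

{ walks the archive sequentially, calling OnEntry for every member }
procedure EnumArchive(Archive: TStream; OnEntry: TEntryCallback);
```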

---

Now you should have the big picture of my intended activities. Many
more questions will arise as I proceed with my work. I have already
decided to replace my own stdc unit by the FPC libc unit, with
hopefully no changes to that unit. For further compatibility it will be
necessary to find compromises between my coding style and the style
used by the FPC community. E.g. I prefer to prefix all my units with a
'u', so that the base names remain available for procedures or
variables. I also use upper case characters in the unit names, which
may not be appreciated by users from the Unix world. As a compromise it
may be possible to use a 'lib' prefix, but this may conflict with
existing library names (libz...). Any ideas?


I'll stop now and thank you for your patient reading. Feel free to
modify the subject or to open new threads for discussing details.

Happy New Year
  DoDi

