[fpc-devel] Unicode resourcestrings
Hi, Is there a way in current FPC to have unicode or wide resourcestrings? Thanks, Martin ___ fpc-devel maillist - fpc-devel@lists.freepascal.org http://lists.freepascal.org/mailman/listinfo/fpc-devel
Re: [fpc-devel] Patch, font rendering on Arm-Linux devices.
Are enumeration types 1 or 4 bytes in Delphi? If they are one byte, the layout looks quite different (and I'm not sure about all the types used here; some seem to be sets, some enumerations). At first glance, though, it seems they used packed records either to ensure minimum size or to get a known record layout (maybe they even used the structure in some assembly module?), and also aligned the fields manually to avoid unaligned access issues.

Vinzent.
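The size question can be checked rather than guessed. The following is a minimal sketch for FPC, assuming the {$PACKENUM} directive is used to set the storage size of enumerations; the type names are made up for illustration and are not the VirtualTreeView types. As far as I know, Delphi defaults to one byte per enumeration while FPC's own modes default to four, but that is worth verifying on the compiler at hand.

```pascal
program EnumSizeProbe;
{$mode objfpc}

type
  {$PACKENUM 1}
  TCheckKindSmall = (cksNone, cksBox, cksRadio);   // stored in 1 byte
  {$PACKENUM 4}
  TCheckKindLarge = (cklNone, cklBox, cklRadio);   // stored in 4 bytes

begin
  // Hypothetical types; only the directive differs between them.
  WriteLn('PACKENUM 1: ', SizeOf(TCheckKindSmall), ' byte(s)');
  WriteLn('PACKENUM 4: ', SizeOf(TCheckKindLarge), ' byte(s)');
end.
```

If a packed record is shared between Delphi and FPC builds, the enumeration size changes every following field offset, so it seems safer to pin it explicitly than to rely on the mode default.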
Re: [fpc-devel] Patch, font rendering on Arm-Linux devices.
Luiz Americo Pereira Camara wrote:
> TVirtualNodePacked = packed record Index,//Offset 0 ChildCount: Cardinal; //Offset 4 NodeHeight: Word; //Offset 8 States: TVirtualNodeStates; //Offset 10 * Align: Byte; //Offset 14 ** CheckState: TCheckState; //Offset 15 ** CheckType: TCheckType; //Offset 16 Dummy: Byte; //Offset 17TotalCount: Cardinal; //Offset 18 * [...]

TVirtualNodePacked = packed record
  Index,                       // Offset 0
  ChildCount: Cardinal;        // Offset 4
  NodeHeight: Word;            // Offset 8
  States: TVirtualNodeStates;  // Offset 10 *
  Align: Byte;                 // Offset 14 **
  CheckState: TCheckState;     // Offset 15 **
  CheckType: TCheckType;       // Offset 16
  Dummy: Byte;                 // Offset 17
  TotalCount: Cardinal;        // Offset 18 *
  [...]

The mail editor scrambled the record structure. I hope it is clearer this time.

Luiz
Re: [fpc-devel] Patch, font rendering on Arm-Linux devices.
Daniël Mantione wrote:
> On Tue, 26 Feb 2008, Luiz Americo Pereira Camara wrote:
> > Yury Sidorov wrote:
> > > The patch removes packed record for some platforms. IMO packed can be removed for all platforms. It will gain some speed.
> >
> > I'd like to understand this issue better. Why are non-packed records faster?
>
> Cache thrashing. One of the most underestimated performance killers in modern software.
>
> > Does the difference occur at memory allocation or at memory access?
>
> Memory access. What happens is that the non-packed version causes more cache misses. A cache miss costs many cycles on a modern CPU; a misaligned read just costs an extra memory access (which is fast if cached) on x86, and an extra load instruction on ARM. This is much cheaper than a cache miss.

Thanks for all the explanations. I'm sure that the change is worthwhile.

One more question: VirtualTreeView tries to align the fields of the (packed) record on dword boundaries by grouping smaller (one- or two-byte) fields together or by adding dummy fields. Does this trick avoid the unaligned memory accesses?

The real beast:

TVirtualNodePacked = packed record
  Index,                       // Offset 0
  ChildCount: Cardinal;        // Offset 4
  NodeHeight: Word;            // Offset 8
  States: TVirtualNodeStates;  // Offset 10 *
  Align: Byte;                 // Offset 14 **
  CheckState: TCheckState;     // Offset 15 **
  CheckType: TCheckType;       // Offset 16
  Dummy: Byte;                 // Offset 17
  TotalCount: Cardinal;        // Offset 18 *
  [...]

As far as I understand, the fields marked with * cause an unaligned access because they are not on a dword boundary. Right? The fields marked with ** are not dword aligned either, but since they are one-byte fields there is no unaligned access. Right?

And what about 64-bit systems? Should the fields be qword aligned, or is dword alignment still sufficient?

Luiz
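The offsets in the record above can be verified mechanically. This is a rough sketch, not the VirtualTreeView declaration: TNodeSketch is a hypothetical stand-in in which LongWord replaces the four-byte set TVirtualNodeStates and Byte replaces the two enumerations (assumed to be one byte each). It prints each field's offset and whether that offset is a multiple of the field's own size.

```pascal
program OffsetProbe;
{$mode objfpc}

type
  // Hypothetical stand-in for TVirtualNodePacked (see assumptions above).
  TNodeSketch = packed record
    Index, ChildCount: Cardinal;
    NodeHeight: Word;
    States: LongWord;
    Align: Byte;
    CheckState: Byte;
    CheckType: Byte;
    Dummy: Byte;
    TotalCount: Cardinal;
  end;

var
  N: TNodeSketch;

// Byte offset of a field inside N, computed from the two addresses.
function OffsetOf(Field: Pointer): PtrUInt;
begin
  Result := PtrUInt(Field) - PtrUInt(@N);
end;

// Print the offset and say whether it is a multiple of the field size.
procedure Report(const FieldName: string; Field: Pointer; Size: PtrUInt);
var
  Ofs: PtrUInt;
begin
  Ofs := OffsetOf(Field);
  Write(FieldName, ': offset ', Ofs, ', size ', Size);
  if Ofs mod Size = 0 then
    WriteLn(' (naturally aligned)')
  else
    WriteLn(' (misaligned)');
end;

begin
  Report('Index',      @N.Index,      SizeOf(N.Index));
  Report('ChildCount', @N.ChildCount, SizeOf(N.ChildCount));
  Report('NodeHeight', @N.NodeHeight, SizeOf(N.NodeHeight));
  Report('States',     @N.States,     SizeOf(N.States));
  Report('Align',      @N.Align,      SizeOf(N.Align));
  Report('TotalCount', @N.TotalCount, SizeOf(N.TotalCount));
end.
```

Under these assumptions States (offset 10) and TotalCount (offset 18) come out misaligned, while the one-byte fields cannot be misaligned at all. The usual rule of thumb is natural alignment, each field on a multiple of its own size, so a Cardinal would still want only a four-byte boundary on a 64-bit target.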
Re: [fpc-devel] Lazarus: A new widgest set
On Tuesday 19 February 2008 16.55:16 Martin Schreiber wrote:
> On Tuesday 19 February 2008 15.53:16 Michael Schnell wrote:
> > > If you compile the SVN trunk version with -dmse_with_ifi you will get
> > > the MSEifi components in the component palette.
> >
> > Of course I really would like to help beta-testing this. Unfortunately,
> > due to a firewall jail I am working in, I can't access an SVN.
>
> You can't use opensource projects without SVN access, you must solve the
> problem.
>
> > Have I been correct assuming that I can do a "secondary" GUI using
> > MSE(-ifi), i.e. taking a normal (existing) Delphi or Lazarus program
> > that does feature its normal GUI and add some MSE code (and widget
> > definitions) plus a transport channel and then I can create controls
> > that are visible on the screen of the remote machine. Moreover when the
> > remote user "clicks" a control that had been defined in that way, an
> > event should be triggered (in a thread [...] enable event driven
> > programming in a thread [...] or in the main thread).
>
> Correct. I never tried a Delphi or Lazarus application as server, I use
> MSEgui or MSEnogui applications.
>
> > Have I been correct assuming that either a Pascal program or a browser
> > plugin (is that Java code ?) can be used as a target of the transport
> > channel, and both should show a user interface that had been defined by
> > the master program ?
>
> Correct. The browser plugin doesn't exist up to now. I think it will be a
> Pascal dll/so.
>
> > It would be great if you could send me an example (at best a windows
> > exe, using http) and the browser plugin, so that I can see what MSE can
> > do.
>
> I'll see what I can do but not in the next days.

I made a demo of MSEifi with a server and a client connected by pipes.
Win32 binaries:
http://msedocumenting.svn.sourceforge.net/viewvc/msedocumenting/mse/trunk/help/tutorials/mseifi/ifipipedemo/bin/i386-win32/ifipipedemoclient.exe?view=log
and
http://msedocumenting.svn.sourceforge.net/viewvc/msedocumenting/mse/trunk/help/tutorials/mseifi/ifipipedemo/bin/i386-win32/ifipipedemoserver.exe?view=log

Download ifipipedemoclient.exe and ifipipedemoserver.exe into a directory of your choice, cd into that directory, run ifipipedemoclient.exe, click 'connect'.

Screenshot: http://www.homepage.bluewin.ch/msegui/pics/mseifidemo.png

Martin
Re: [fpc-devel] Freepascal in microcontrollers
On Thursday 28 February 2008 13:09, Michael Schnell wrote:
> > Yes. That's what {$BIT_ORDER} would stand for (still, it would not
> > change *byte* order).
>
> I don't understand this. I don't think the bit order within a byte is
> to be considered changing.

Well, the question is whether the first bit in a record is the leftmost or the rightmost bit. It's a matter of interpretation. But as Jonas pointed out, the order of the bits may change depending on the endianness (assuming I didn't misunderstand him).

> I would call the issue "byte-order" (thus I'd prefer something
> like {$BIT_PACKED_BYTE_ORDER} or {$BIT_PACKED_ENDIAN}).

It's not byte order. If I declare:

|bitpacked record
|   X : Byte;
|   Y : Byte;
|end;

X will still be at the lowest address and Y will be at @X + 1. The issue arises when I say:

|bitpacked record
|   X : Boolean;
|   Y : Boolean;
|   Z : Two_Bit_Enum;
|end;

Assuming bit 0 is the LSB, does the compiler access bits 0 and 1 (low order first) for X and Y, or does it choose bits 7 and 6 (high order first)? And how would it interpret a specific value for Z? At least two interpretations are possible:

   X:7, Y:6, Z[5:4]   or   X:0, Y:1, Z[3:2]

ASCII graphic:

   |X|Y|Z|Z|-|-|-|-|
   |-|-|-|-|Z|Z|Y|X|

Ok, I guess the issue with the enum is none, because the LSB is still at the right place on the data bus, no matter how you look at it. So forget that. ;)

Of course, there are more nasty things like

|bitpacked record
|   X : Boolean;
|   Y : Byte;
|end;

where a single value would cross the byte boundary... *headscratch*

I guess there's a reason why endianness issues are not automatically handled by the compiler. :D

Vinzent.
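Which of the two interpretations a given compiler and target actually uses can be probed by setting one field and looking at the raw byte. A small sketch along those lines; TTwoBitEnum and TFlags are illustrative names, and the output is deliberately not predicted here because it depends on the target.

```pascal
program BitOrderProbe;
{$mode objfpc}

type
  TTwoBitEnum = (tbZero, tbOne, tbTwo, tbThree);   // four values, two bits

  TFlags = bitpacked record
    X: Boolean;
    Y: Boolean;
    Z: TTwoBitEnum;
  end;

var
  F: TFlags;
  Raw: Byte absolute F;   // raw view of the first byte of the record

begin
  // Set only X and show which bit of the byte changed.
  FillChar(F, SizeOf(F), 0);
  F.X := True;
  WriteLn('only X set:        $', HexStr(Raw, 2));

  // Set only Z to its highest value and show the two bits it occupies.
  FillChar(F, SizeOf(F), 0);
  F.Z := tbThree;
  WriteLn('only Z = tbThree:  $', HexStr(Raw, 2));
end.
```

Running this on a little-endian and a big-endian target side by side would show directly whether X lands in bit 0 or bit 7.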
Re: [fpc-devel] Patch, font rendering on Arm-Linux devices.
On Thu, 28 Feb 2008, Michael Schnell wrote:
> > An ARM does not have such logic and will suffer cache miss after cache miss.
>
> Nonetheless the count of word transfers between memory and the cache would be smaller with packed records, which might result in a lot faster execution (of course depending on the layout of the record, speed of the memory, speed of the processor, type of operations done with the records, ...)

That is exactly what I wanted to explain: even on ARM the lower number of cache misses might pay for the (higher) cost of an unaligned load.

Daniël
Re: [fpc-devel] Patch, font rendering on Arm-Linux devices.
> An ARM does not have such logic and will suffer cache miss after cache miss.

Nonetheless the count of word transfers between memory and the cache would be smaller with packed records, which might result in a lot faster execution (of course depending on the layout of the record, speed of the memory, speed of the processor, type of operations done with the records, ...)

-Michael
Re: [fpc-devel] Freepascal in microcontrollers
> Yes. That's what {$BIT_ORDER} would stand for (still, it would not change *byte* order).

I don't understand this. I don't think the bit order within a byte is to be considered changing.

I would call the issue "byte-order" (thus I'd prefer something like {$BIT_PACKED_BYTE_ORDER} or {$BIT_PACKED_ENDIAN}).

-Michael
Re: [fpc-devel] Patch, font rendering on Arm-Linux devices.
On Thu, 28 Feb 2008, Yury Sidorov wrote:
> > Yes, but if you have an array of them (as we have in this case), considerably more of these records will fit in the cache. Therefore you will have considerably fewer cache misses. This becomes even more serious when the processor in question does not have prefetching; in such a case, traversing the array will cause cache miss after cache miss, and a smaller array will then have fewer of these misses.
>
> You are right. An array of packed records is a bit more efficient than an array of non-packed records, at least on modern x86 CPUs. I ran some benchmarks and got, on a Core Duo:
>
>   2070 ms - for non-packed
>   1910 ms - for packed
>
> But for CPUs which do not support misaligned data access, packed records are speed killers and need to be used as the last resort.

I'm not 100% sure about this. Your Core Duo has an array traversal detector which activates prefetching. An ARM does not have such logic and will suffer cache miss after cache miss. However, it is certain that a manual unaligned load is more expensive on ARM than a hardware unaligned load on x86.

> Also, if a record is not an element of a large array, it is better to declare it as non-packed for all CPUs.

Yes.

Daniël
Re: [fpc-devel] Freepascal in microcontrollers
On Thursday 28 February 2008 12:17, Michael Schnell wrote:
> > Enable_Mode : Enable_Set; // bit 14 .. 15/leftmost bits
>
> With an x86 the "leftmost bits" will be in the "rightmost" (second)
> of the two bytes,
>
> with an 68K the "leftmost bits" will be in the "leftmost" (first) of
> the two bytes,

Yes, bad example, because we already crossed the byte boundary. The real question about leftmost and rightmost was on which "data line" each bit would appear. (Usually it's called most significant bit, but as we're talking about hardware bits, not numbers, I wouldn't use that term. In this context, no bit is necessarily more significant than the other.)

> So the two can't communicate this record via files or via network.
>
> If you want to have them understand each other, you need to define
> the endianness of the record independently of that of the processor.

Yes. That's what {$BIT_ORDER} would stand for (still, it would not change *byte* order).

> Enumerated types don't help here.

They weren't meant to solve the issue, they were meant to help to understand the issue I was trying to point out. The question was how to interpret the enumeration values if their bit order could/would differ from that of the record they're in. Should they be put in as is or swapped accordingly?

Vinzent.
Re: [fpc-devel] Freepascal in microcontrollers
> Enable_Mode : Enable_Set; // bit 14 .. 15/leftmost bits

With an x86 the "leftmost bits" will be in the "rightmost" (second) of the two bytes,

with an 68K the "leftmost bits" will be in the "leftmost" (first) of the two bytes,

So the two can't communicate this record via files or via network.

If you want to have them understand each other, you need to define the endianness of the record independently of that of the processor.

Enumerated types don't help here.

-Michael
Re: [fpc-devel] Freepascal in microcontrollers
On Thursday 28 February 2008 11:28, Michael Schnell wrote:
> > AFAICS, it would be useful for bitpacked records only, so it could
> > appear anywhere where a {$PACKRECORDS} directive or similar can
> > appear currently.
>
> IMHO it would only be useful (allowed with, regarded by) bitpacked
> record, as any other data representation is supposed to be optimized
> for speed according to the processor architecture.

Hmm, not necessarily. I frequently use enumeration types to express the meaning of a set of hardware bits. So thinking about it, what if I'd use enumerations in a bitpacked record? Maybe like this:

-- 8< --
type // 2 bits.
   Enable_Set = (Dont_Care := 0,  // 00
                 Disable   := 1,  // 01
                 Enable    := 3); // 11

type
   Control = bitpacked record
      Continuous_Mode   : Boolean;    // bit 0/rightmost bit
      Alternate_Compare : Boolean;    // bit 1
      ...
      Enable_Mode       : Enable_Set; // bit 14 .. 15/leftmost bits
   end;
-- 8< --

I don't know if FPC can pack enumerations into a bitpacked record at all, but if it does, it might consider the bit order here, too. Consider something like:

-- 8< --
var
   My_Set    : Enable_Set;
   My_Record : Control;

My_Set := My_Record.Enable_Mode;
-- 8< --

How should the assignment be handled if the bit order for that bitpacked record is changed?

Vinzent.
Re: [fpc-devel] Patch, font rendering on Arm-Linux devices.
Daniël Mantione wrote:
> > On Thursday 28 February 2008 09:16, Daniël Mantione wrote:
> > > Memory access. What happens is that the non-packed version causes
> > > more cache misses.
> >
> > Please elaborate. If the (unaligned) data is crossing a cache-line, thus
> > causing two full cache-line reads, I'd understand that, but once it's
> > in the cache, it wouldn't matter anymore?
>
> Yes, but if you have an array of them (as we have in this case), considerably more of these records will fit in the cache. Therefore you will have considerably fewer cache misses. This becomes even more serious when the processor in question does not have prefetching; in such a case, traversing the array will cause cache miss after cache miss, and a smaller array will then have fewer of these misses.

You are right. An array of packed records is a bit more efficient than an array of non-packed records, at least on modern x86 CPUs. I ran some benchmarks and got, on a Core Duo:

  2070 ms - for non-packed
  1910 ms - for packed

But for CPUs which do not support misaligned data access, packed records are speed killers and need to be used as the last resort.

Also, if a record is not an element of a large array, it is better to declare it as non-packed for all CPUs.

Yury.
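The benchmark itself was not posted, so the following is only a guess at its general shape: sum one field over a large array, once with a packed element type and once with the default layout. All names, the element layout and the sizes are assumptions; GetTickCount64 from SysUtils is assumed to be available in the FPC version used. Timings will vary widely with CPU, cache size and optimisation level.

```pascal
program PackedBench;
{$mode objfpc}

uses
  SysUtils;

const
  Count  = 1000000;   // elements per array (assumed, not from the thread)
  Passes = 50;        // repeat the traversal to get measurable times

type
  TItemPacked = packed record
    Tag: Byte;
    Value: LongWord;   // misaligned in the packed layout
  end;

  TItemAligned = record
    Tag: Byte;
    Value: LongWord;   // padded to its natural alignment
  end;

var
  P: array of TItemPacked;
  A: array of TItemAligned;
  I, Pass: LongInt;
  SumP, SumA, T0: QWord;

begin
  SetLength(P, Count);
  SetLength(A, Count);
  for I := 0 to Count - 1 do
  begin
    P[I].Value := I;
    A[I].Value := I;
  end;

  SumP := 0;
  T0 := GetTickCount64;
  for Pass := 1 to Passes do
    for I := 0 to Count - 1 do
      SumP := SumP + P[I].Value;
  WriteLn('packed:     ', GetTickCount64 - T0, ' ms');

  SumA := 0;
  T0 := GetTickCount64;
  for Pass := 1 to Passes do
    for I := 0 to Count - 1 do
      SumA := SumA + A[I].Value;
  WriteLn('non-packed: ', GetTickCount64 - T0, ' ms');

  // Both traversals must compute the same sum.
  WriteLn('sums equal: ', SumP = SumA);
end.
```

A single record or a small array would fit in the cache either way, which is why the effect only shows up for large arrays.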
Re: [fpc-devel] Freepascal in microcontrollers
On 28 Feb 2008, at 11:17, Daniël Mantione wrote:
> On Thu, 28 Feb 2008, Jonas Maebe wrote:
> > It's not about Linux vs. Windows, it's about FPC 2.2.0 vs FPC 3.4.0, coupled with the fact that bitpacked records as currently defined are not usable for defining a specific layout.
>
> It is completely normal that a record written by a program written in FPC 2.2 can be read by FPC 3.4.

A regularly packed record, yes. "Non-packed" records: not at all. In fact, their layout changed in some circumstances in FPC 2.3.1 compared to earlier versions. As to bitpacked records:

> If you design a feature, after a grace time, it should be kept backward compatible.

Not if they are described like this in the manual (ref.tex, line 1204):

***
Note that the internals of the bitpacking are opaque: they can change at any time in the future. What is more: the internal packing depends on the endianness of the platform for which the compilation is done, and no conversion between platforms is possible. This makes bitpacked structures unsuitable for storing on disk or transport over networks. The format is however the same as the one used by the GNU Pascal Compiler, and we aim to retain this compatibility in the future.
***

The same goes for the internal format of sets (which also changed a while ago), and should also go for the layout (as far as the part which is normally invisible to the programmer goes) and reference counting of ansistrings/interfaces etc.

> I haven't heard the argument why bitpacked records should be exempt from this.

Because they were not designed/implemented with binary portability/compatibility in mind, and doing so allows freedom to optimize them, make them compatible on any platform with the custom format there if any (e.g. how debuggers expect them to be laid out), etc. If you don't have to fix something in concrete, it's always a good idea not to do so because it'll only come back later to haunt you.

That does not mean you have to actively try to change every opaque structure in every release in order to break backwards compatibility, but it does give you the freedom to do so when it's useful.

As I said before: if you want to define something which has a predictable layout, you have to do so specifically and give the programmer the means to do so. If it is not clear what the compiler will/may do from just reading the declaration and active compiler directives, it's almost by definition improper to rely on any current implementation details.

Jonas
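Until such a means exists, one way to get a layout that does not depend on the compiler's bitpacking is to build the value with explicit shifts and masks. The sketch below reuses the bit positions from the Control example earlier in the thread (bit 0, bit 1, bits 14..15); the function and constant names are made up for illustration and this is not an FPC facility.

```pascal
program ExplicitLayout;
{$mode objfpc}

type
  TEnableSet = (esDontCare = 0, esDisable = 1, esEnable = 3);

const
  ContinuousModeBit   = 0;    // bit 0, rightmost
  AlternateCompareBit = 1;    // bit 1
  EnableModeShift     = 14;   // bits 14..15, leftmost
  EnableModeMask      = $3;

// Build the 16-bit register value; every bit position is spelled out here,
// so nothing depends on how the compiler lays out a bitpacked record.
function EncodeControl(Continuous, Alternate: Boolean; Mode: TEnableSet): Word;
begin
  Result := 0;
  if Continuous then
    Result := Result or (1 shl ContinuousModeBit);
  if Alternate then
    Result := Result or (1 shl AlternateCompareBit);
  Result := Result or Word((Ord(Mode) and EnableModeMask) shl EnableModeShift);
end;

// Extract the two-bit enable field again.
function DecodeMode(Value: Word): TEnableSet;
begin
  Result := TEnableSet((Value shr EnableModeShift) and EnableModeMask);
end;

begin
  WriteLn('$', HexStr(EncodeControl(True, False, esEnable), 4));   // $C001
  WriteLn(DecodeMode($C001) = esEnable);                           // TRUE
end.
```

The resulting Word is still stored in host byte order, so writing it to a file or socket would additionally need a byte-order conversion.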
Re: [fpc-devel] Patch, font rendering on Arm-Linux devices.
On Thursday 28 February 2008 11:25, Daniël Mantione wrote:
> On Thu, 28 Feb 2008, Vinzent Hoefler wrote:
> > On Thursday 28 February 2008 09:16, Daniël Mantione wrote:
> > > Memory access. What happens is that the non-packed version causes
> > > more cache misses.

OMG. I'm so confused. ;) I read "that the packed version causes more cache misses" here. That was the part where I didn't understand why.

> > Please elaborate. If the (unaligned) data is crossing a cache-line,
> > thus causing two full cache-line reads, I'd understand that, but
> > once it's in the cache, it wouldn't matter anymore?
>
> Yes, but if you have an array of them (as we have in this case),
> considerably more of these records will fit in the cache.

Yes, that's what I figured, so I'm on the same path as you here, it seems, but tracing back the discussion it read:

-- 8< --
> I'd like to understand more this issue.
> Why are non packed records faster?

Cache thrashing. One of the most underestimated performance killers in modern software.

> The difference occurs at memory allocation or at memory access?

Memory access. What happens is that the non-packed version causes more cache misses.
-- 8< --

The first part tells me non-packed records are faster, but the second line tells me that the non-packed version also causes more cache misses, thus is slower. That got me confused, I think.

Of course, the net result only depends on the benchmark you're using. ;)

Vinzent.
Re: [fpc-devel] Freepascal in microcontrollers
> AFAICS, it would be useful for bitpacked records only, so it could appear anywhere where a {$PACKRECORDS} directive or similar can appear currently.

IMHO it would only be useful (allowed with, regarded by) bitpacked record, as any other data representation is supposed to be optimized for speed according to the processor architecture.

-Michael
Re: [fpc-devel] Patch, font rendering on Arm-Linux devices.
On Thu, 28 Feb 2008, Vinzent Hoefler wrote:
> On Thursday 28 February 2008 09:16, Daniël Mantione wrote:
> > Memory access. What happens is that the non-packed version causes more cache misses.
>
> Please elaborate. If the (unaligned) data is crossing a cache-line, thus causing two full cache-line reads, I'd understand that, but once it's in the cache, it wouldn't matter anymore?

Yes, but if you have an array of them (as we have in this case), considerably more of these records will fit in the cache. Therefore you will have considerably fewer cache misses. This becomes even more serious when the processor in question does not have prefetching; in such a case, traversing the array will cause cache miss after cache miss, and a smaller array will then have fewer of these misses.

Daniël
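The "more records fit in the cache" argument can be put into numbers. A small sketch under stated assumptions: a 64-byte cache line (common, but not universal), a made-up three-field record, and {$PACKRECORDS 4} to make the aligned layout explicit rather than relying on the platform default.

```pascal
program CacheFootprint;
{$mode objfpc}

const
  CacheLineSize = 64;       // assumption: typical line size, check your CPU
  ElementCount  = 100000;   // assumed array length

type
  TItemPacked = packed record
    Tag: Byte;
    Value: LongWord;
    Flag: Byte;
  end;

  {$PACKRECORDS 4}
  TItemAligned = record
    Tag: Byte;
    Value: LongWord;
    Flag: Byte;
  end;
  {$PACKRECORDS DEFAULT}

// Number of cache lines needed to hold ElementCount contiguous records.
function LinesTouched(ElementSize: PtrUInt): PtrUInt;
begin
  Result := (ElementSize * ElementCount + CacheLineSize - 1) div CacheLineSize;
end;

begin
  WriteLn('packed:  ', SizeOf(TItemPacked), ' bytes/record, ',
          LinesTouched(SizeOf(TItemPacked)), ' cache lines');
  WriteLn('aligned: ', SizeOf(TItemAligned), ' bytes/record, ',
          LinesTouched(SizeOf(TItemAligned)), ' cache lines');
end.
```

A full traversal has to pull in every one of those lines at least once, so the ratio of the two line counts is a rough upper bound on how much the packed layout can save in cache traffic.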
Re: [fpc-devel] Freepascal in microcontrollers
On Thu, 28 Feb 2008, Jonas Maebe wrote:
> On 28 Feb 2008, at 08:19, Daniël Mantione wrote:
> > As long as the compiler is consistent between platforms, it is okay. Differences between little/big endian are acceptable because this is the only situation where we require the coder to manually intervene and write two code paths (usually a simple endian conversion). We don't force the coder to make different code paths between e.g. Linux/Windows, nor should we.
>
> It's not about Linux vs. Windows, it's about FPC 2.2.0 vs FPC 3.4.0, coupled with the fact that bitpacked records as currently defined are not usable for defining a specific layout.

It is completely normal that a record written by a program written in FPC 2.2 can be read by FPC 3.4. If you design a feature, after a grace time, it should be kept backward compatible. I haven't heard the argument why bitpacked records should be exempt from this.

Daniël
Re: [fpc-devel] Freepascal in microcontrollers
On 28 Feb 2008, at 08:19, Daniël Mantione wrote:
> As long as the compiler is consistent between platforms, it is okay. Differences between little/big endian are acceptable because this is the only situation where we require the coder to manually intervene and write two code paths (usually a simple endian conversion). We don't force the coder to make different code paths between e.g. Linux/Windows, nor should we.

It's not about Linux vs. Windows, it's about FPC 2.2.0 vs FPC 3.4.0, coupled with the fact that bitpacked records as currently defined are not usable for defining a specific layout. For that sort of functionality, you need extensions anyway. If someone wants that functionality, it's better to create such extensions so the programmer can in fact specify everything and implement that, rather than adding constraints to the implementation of bitpacked records.

Jonas
Re: [fpc-devel] Freepascal in microcontrollers
On Thursday 28 February 2008 10:01, Michael Schnell wrote:
> > {$BITORDER LOW_ORDER_FIRST}
> > {$BITORDER HIGH_ORDER_FIRST}
>
> Where can this be used ? What exactly does it mean ?

Well, call it a proposal (of course, the names are strongly influenced by personal language preferences). AFAICS, it would be useful for bitpacked records only, so it could appear anywhere a {$PACKRECORDS} directive or similar can appear currently.

Vinzent.
Re: [fpc-devel] Patch, font rendering on Arm-Linux devices.
> internally the processor still has to have separate "8 bit" data paths and do shifting to reorder the bytes.

This is a barrel shifter in the data path that is integrated in the queue and does not take an additional execution cycle.

-Michael
Re: [fpc-devel] Patch, font rendering on Arm-Linux devices.
Michael Schnell wrote:
> If it accesses a misaligned 32 bit value it does two accesses (not 4): e.g. once 8 bit and once 24 bit (when reading, each of the accesses is the same 32 bit, anyway).

Logically you should think about it how I explained. That Intel did an optimization to make the speed impact less is a different issue: internally the processor still has to have separate "8 bit" data paths and do shifting to reorder the bytes. Perhaps this behaviour is specified in their optimization documents, or maybe you have the VHDL source? :-)

> Transferring data from/to the 1st level cache imposes a lot more delay than the misaligned access. Thus if there are many instances of a record variable that are used for calculation, it might be much faster to use the packed version. If there are only a few, usually the unpacked version should be faster.

Show me the benchmark results ;-)

Micha
Re: [fpc-devel] Patch, font rendering on Arm-Linux devices.
Micha Nelissen wrote:
> In addition to what the others said, think of it like your 32 bit processor suddenly being a 8 bit processor: it has to manually load 4 times 8 bit, arrange them into a 32 bit value, and only then use it. With non packed, it can use the value directly.

With an x86 no additional code needs to be created by the compiler, as it _can_ do misaligned accesses (there are other processors that can't and need more code).

If it accesses a misaligned 32 bit value it does two accesses (not 4): e.g. once 8 bit and once 24 bit (when reading, each of the accesses is the same 32 bit, anyway). But all this is only internal in the core of the chip and thus _very_ fast, as the chip contains a (1st level) cache and same is connected to the second level cache (also within the chip) with a 128 bit or wider data path.

Transferring data from/to the 1st level cache imposes a lot more delay than the misaligned access. Thus if there are many instances of a record variable that are used for calculation, it might be much faster to use the packed version. If there are only a few, usually the unpacked version should be faster.

-Michael
Re: [fpc-devel] Freepascal in microcontrollers
> {$BITORDER LOW_ORDER_FIRST}
> {$BITORDER HIGH_ORDER_FIRST}

Where can this be used? What exactly does it mean?

-Michael
Re: [fpc-devel] Patch, font rendering on Arm-Linux devices.
> On x86 processors it's usually only a speed penalty (or has anyone ever seen the AC flag turned on?), on other processors you may even have to workaround exceptions (i.e. bus errors), because the processor simply refuses to read or write unaligned data.

It is not even guaranteed (or even common) that a misaligned access on a processor that can only do aligned memory accesses can be cured by an exception. That is why the compiler needs to create complex code for the potentially misaligned elements of a packed record. All C compilers do this and I am positive that FPC does it, too. So no problem here (beyond the additional cycles needed when working with packed records).

A real problem comes up if you manipulate a pointer to a (supposedly aligned) multi-byte variable to make it point to an odd address. This will make the program crash on certain processors (not PCs, not "big" 68Ks, but "small" 68Ks).

-Michael
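The pointer case can be sidestepped by copying the bytes instead of dereferencing a cast pointer. A minimal sketch, assuming a byte buffer that holds a little-endian 32-bit value at an odd offset; ReadLongWordAt is a hypothetical helper, while Move and LEtoN are the System unit routines.

```pascal
program UnalignedRead;
{$mode objfpc}

var
  // A 32-bit value $12345678 stored little-endian at the odd offset 1.
  Buffer: array[0..7] of Byte = ($00, $78, $56, $34, $12, $00, $00, $00);
  Value: LongWord;

// Copy four bytes from an arbitrary (possibly odd) offset into an aligned
// local variable; Move copes with any source alignment.
function ReadLongWordAt(Offset: LongInt): LongWord;
begin
  Result := 0;
  Move(Buffer[Offset], Result, SizeOf(Result));
end;

begin
  // The risky variant would be: Value := PLongWord(@Buffer[1])^;
  Value := LEtoN(ReadLongWordAt(1));   // stored little-endian -> host order
  WriteLn('$', HexStr(Value, 8));      // $12345678
end.
```

The copy costs a few cycles everywhere, but it behaves the same on processors that trap on misaligned loads and on those that do not.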
Re: [fpc-devel] Freepascal in microcontrollers
On Thursday 28 February 2008 09:51, Micha Nelissen wrote:
> Well we have procedures to do byte swapping, but none to do bit
> swapping. It's also very inefficient AFAIK; while changing the
> compiler's definition of which bit to use is "free".

{$BITORDER LOW_ORDER_FIRST}
{$BITORDER HIGH_ORDER_FIRST}

?
Re: [fpc-devel] Patch, font rendering on Arm-Linux devices.
Luiz Americo Pereira Camara wrote:
> Why are non packed records faster?
> The difference occurs at memory allocation or at memory access?

In addition to what the others said, think of it like your 32 bit processor suddenly being an 8 bit processor: it has to manually load 4 times 8 bit, arrange them into a 32 bit value, and only then use it. With non-packed, it can use the value directly.

Micha
Re: [fpc-devel] Freepascal in microcontrollers
Daniël Mantione wrote:
> On Thu, 28 Feb 2008, Micha Nelissen wrote:
> > Jonas said the bits were swapped, not the bytes. So PPC32 (1 shl 30) becomes (1 shl 6) on Intel (actual), while it should be (1 shl 1) (expected use by compiler). It's both the "second bit", but it's in different places.
>
> Okay, but does this have impact on the discussion? I mean it makes manual endian conversion a bit more tricky (also need to swap bits around), but doesn't change the fact that you manually need to do endian conversion.

Well, we have procedures to do byte swapping, but none to do bit swapping. It's also very inefficient AFAIK, while changing the compiler's definition of which bit to use is "free".

Micha
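For reference, a manual bit swap within a byte could look like the sketch below. ReverseBits is a hypothetical helper written for this illustration, not an RTL routine; it shows the eight shift-and-mask steps that make the manual route comparatively expensive.

```pascal
program BitReverse;
{$mode objfpc}

// Reverse the order of the eight bits in a byte: bit 0 <-> bit 7,
// bit 1 <-> bit 6, and so on.
function ReverseBits(B: Byte): Byte;
var
  I: LongInt;
begin
  Result := 0;
  for I := 0 to 7 do
  begin
    // Shift the result left and append the current lowest bit of B.
    Result := Byte(Result shl 1) or (B and 1);
    B := B shr 1;
  end;
end;

begin
  // The "second bit" example from the mail: bit 1 ends up as bit 6.
  WriteLn(ReverseBits(1 shl 1) = (1 shl 6));   // TRUE
end.
```

A table lookup with 256 entries would be faster than the loop, but either way the work is done at run time for every value, which is the contrast with a compile-time bit order setting.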
Re: [fpc-devel] Patch, font rendering on Arm-Linux devices.
On Thursday 28 February 2008 09:16, Daniël Mantione wrote:
> Memory access. What happens is that the non-packed version causes
> more cache misses.

Please elaborate. If the (unaligned) data is crossing a cache-line, thus causing two full cache-line reads, I'd understand that, but once it's in the cache, it wouldn't matter anymore?

IOW: How can a packed (thus smaller) record cause more cache misses than a better aligned (but bigger) one? That it can in certain circumstances, I'd understand, but as a general rule?

Vinzent.
Re: [fpc-devel] Patch, font rendering on Arm-Linux devices.
On Tuesday 26 February 2008 17:27, Luiz Americo Pereira Camara wrote:
> Yury Sidorov wrote:
> > The patch removes packed record for some platforms.
> > IMO packed can be removed for all platforms. It will gain some
> > speed.
>
> I'd like to understand more this issue.
> Why are non packed records faster?
> The difference occurs at memory allocation or at memory access?

At memory access. On x86 processors it's usually only a speed penalty (or has anyone ever seen the AC flag turned on?), on other processors you may even have to work around exceptions (i.e. bus errors), because the processor simply refuses to read or write unaligned data.

And then the only way to circumvent the processor's refusal is to read/write the data byte by byte or mask it out, which is slower than just reading or writing it. Consider writing a value that straddles two 32-bit words where the processor can only access a single 32-bit value at an aligned address:

 *_ _ _ _*_ _ _ _
 |0|1|2|3|4|5|6|7|
     |_______|

Now the data you need spans bytes [2:5], but the processor can only read a full 32 bits either at position 0 (reading bytes [0:3]) or at position 4 (reading bytes [4:7]). You'd need to read both processor words, mask the data in the lower and upper half of each, and write back both words with the new data patched "in between" them.

So by now, no matter if the processor handles it for you or if the compiler inserts the necessary code to do it, even a simple increment is insanely expensive in terms of processor cycles.

Vinzent.
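The read-mask-patch-write sequence for bytes 2..5 can be written out explicitly. This is a sketch of what hardware or compiler-generated code has to do in some form, not actual FPC output. It assumes little-endian numbering, so bytes 2..3 are the high half of the first word and bytes 4..5 the low half of the second; StoreAtOffset2 is a made-up name.

```pascal
program SplitWrite;
{$mode objfpc}

var
  // Two aligned 32-bit words standing in for bytes 0..7.
  Words: array[0..1] of LongWord = ($FFFFFFFF, $FFFFFFFF);

// Store a 32-bit value at byte offset 2 using only aligned 32-bit
// reads and writes: keep the untouched halves, patch in the new ones.
procedure StoreAtOffset2(V: LongWord);
begin
  Words[0] := (Words[0] and $0000FFFF) or ((V and $FFFF) shl 16);
  Words[1] := (Words[1] and $FFFF0000) or (V shr 16);
end;

begin
  StoreAtOffset2($12345678);
  WriteLn(HexStr(Words[0], 8), ' ', HexStr(Words[1], 8));   // 5678FFFF FFFF1234
end.
```

Two loads, two masks, two shifts, two ORs and two stores replace what would be a single store for an aligned value.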
Re: [fpc-devel] Patch, font rendering on Arm-Linux devices.
> > Why are non packed records faster?
>
> Cache thrashing. One of the most underestimated performance killers in modern software.

Smaller (packed) records need less cache space and thus should be faster as far as the memory interface is concerned.

-Michael
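To put numbers on the size argument, a small Free Pascal sketch can print SizeOf for the same field list with and without "packed". The field list is invented for illustration, the 64-byte cache line is an assumption about typical hardware, and the size of the unpacked record depends on the target and the record alignment settings, so no output is claimed for it. Only the packed size (1 + 4 + 1 + 2 = 8 bytes) follows from the definition.

```pascal
program RecordSizeSketch;
{$mode objfpc}

const
  CacheLineBytes = 64;  // assumption: common on x86, not universal

type
  // Default layout: the compiler may insert padding for alignment.
  TPlainRec = record
    Tag: Byte;
    Count: LongWord;
    Kind: Byte;
    Height: Word;
  end;

  // Packed layout: no padding, so Count sits at the odd offset 1.
  TPackedRec = packed record
    Tag: Byte;
    Count: LongWord;
    Kind: Byte;
    Height: Word;
  end;

begin
  WriteLn('plain : ', SizeOf(TPlainRec), ' bytes, ',
          CacheLineBytes div SizeOf(TPlainRec), ' records per cache line');
  WriteLn('packed: ', SizeOf(TPackedRec), ' bytes, ',
          CacheLineBytes div SizeOf(TPackedRec), ' records per cache line');
end.
```

The packed variant fits more records into a cache line but leaves Count misaligned, which is exactly the trade-off argued about in this thread.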
Re: [fpc-devel] Freepascal in microcontrollers
> The only thing I want to guarantee is that blockwrite followed by blockread on a platform with the same endianness works and will work in the future. This (combined with ifdef based endian conversion) guarantees portability of structures to any platform. Or do I see this wrong?

IMHO, this is not enough. AFAIK, there is an FP version for a big-endian processor (68K); more can be crafted any time. It should be made easy to create communication systems independent of the architecture.

Of course you can't have binary compatibility with all values by default (due to performance considerations), but if we do have a "bitpacked" type that is not optimized for speed but for structure, it should be possible to use it for that purpose.

Moreover, communication via structures in a documented layout is very often needed (network, files, hardware, ...). There should be an easy way to craft a record type according to such documentation, whether it is documented to hold its multi-byte values in big-endian or little-endian representation.

"bitpacked" opened this Pandora's box, and it's an obvious request to go all the way :) .

(E.g.: TCP/IP defines big-endian, while PCs work little-endian internally, so everything needs to be converted.)

-Michael
Re: [fpc-devel] Freepascal in microcontrollers
On Thu, 28 Feb 2008, Micha Nelissen wrote:

> Daniël Mantione wrote:
> > To my knowledge there is no problem with the current implementation. Endian conversion is already the responsibility of the programmer. Therefore I don't see a need for changes on the compiler side.
>
> Jonas said the bits were swapped, not the bytes. So PPC32 (1 shl 30) becomes (1 shl 6) on Intel (actual), while it should be (1 shl 1) (expected use by compiler). It's both the "second bit", but it's in different places.

Okay, but does this have an impact on the discussion? I mean, it makes manual endian conversion a bit more tricky (you also need to swap bits around), but it doesn't change the fact that you need to do endian conversion manually.

Daniël
Re: [fpc-devel] Patch, font rendering on Arm-Linux devices.
On Tue, 26 Feb 2008, Luiz Americo Pereira Camara wrote:

> Yury Sidorov wrote:
> > The patch removes packed record for some platforms.
> > IMO packed can be removed for all platforms. It will gain some speed.
>
> I'd like to understand more this issue.
> Why are non packed records faster?

Cache thrashing. One of the most underestimated performance killers in modern software.

> The difference occurs at memory allocation or at memory access?

Memory access. What happens is that the non-packed version causes more cache misses. A cache miss costs many cycles on a modern CPU; a misaligned read just costs an extra memory access (which is fast if cached) on x86, and an extra load instruction on ARM. This is much cheaper than a cache miss.

Daniël
Re: [fpc-devel] Freepascal in microcontrollers
> To my knowledge there is no problem with the current implementation. Endian conversion is already the responsibility of the programmer. Therefore I don't see a need for changes on the compiler side.

It might be possible to define the individual bytes of a certain value in a bitpacked record to be located at certain bit positions by some complicated syntax, but if that is required for binary compatibility, that is not "nice" at all. A compiler option to optionally define the endianness would be much handier.

-Michael
Re: [fpc-devel] Freepascal in microcontrollers
On Thu, 28 Feb 2008, Michael Schnell wrote:

> > To my knowledge there is no problem with the current implementation. Endian conversion is already the responsibility of the programmer. Therefore I don't see a need for changes on the compiler side.
>
> I don't understand your meaning here. If a record with binary fields is defined and transferred to (e.g.) another instance of the same program running on another machine by network or within a file, or if it needs to be crafted according to a specification of a network block or file format or a hardware device, endianness needs to be taken care of, and it's not nice if the user needs to write active code to do this, if a "bitpacked" type is available that seemingly can be used for that issue.
>
> You even might want to define a record that just holds a single 16 bit value. Here it would be good to be able to define endianness of the bitpacked structure to make it compatible with the communication partner.

Fixed endianness of records could be a language extension; it might even relieve programmers of doing endian conversion (rather than using ifdefs, just state the endianness of your record), which would be much nicer.

However, the situation as it is, is that we do not have such a language feature and rely on manual endian conversion by means of ifdefs. There is no difference between normal, packed, or bitpacked records: you have to endian-convert them manually.

Daniël
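For readers who have not seen the ifdef-based approach, a minimal sketch could look like the following. It assumes an FPC version whose System unit provides SwapEndian and which defines ENDIAN_LITTLE on little-endian targets; the record TWireHeader, its fields, and the routine name are hypothetical and only serve the illustration.

```pascal
program EndianSketch;
{$mode objfpc}

type
  // Hypothetical header whose documented wire format is big-endian.
  TWireHeader = packed record
    Magic: Word;
    PayloadLen: LongWord;
  end;

// Convert between host byte order and big-endian wire order.
// A byte swap is its own inverse, so one routine serves both directions.
procedure SwapHeaderFields(var H: TWireHeader);
begin
{$IFDEF ENDIAN_LITTLE}
  H.Magic := SwapEndian(H.Magic);
  H.PayloadLen := SwapEndian(H.PayloadLen);
{$ENDIF}
  // On a big-endian host the fields already have wire order.
end;

var
  H: TWireHeader;
begin
  H.Magic := $1234;
  H.PayloadLen := $000000FF;
  SwapHeaderFields(H);               // host -> wire
  WriteLn(HexStr(PByte(@H)^, 2));    // first byte in memory: 12
  SwapHeaderFields(H);               // wire -> host
  WriteLn(HexStr(H.Magic, 4));       // 1234
end.
```

Every multi-byte field needs its own line in such a routine, which is the "active code" being objected to; a declared record endianness would make the routine unnecessary.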
Re: [fpc-devel] Patch, font rendering on Arm-Linux devices.
Yury Sidorov wrote:

> The patch removes packed record for some platforms.
> IMO packed can be removed for all platforms. It will gain some speed.

I'd like to understand this issue better. Why are non-packed records faster? Does the difference occur at memory allocation or at memory access?

The original (Delphi) VirtualTreeView uses packed records, and I'm considering removing them in the LCL port.

Luiz
Re: [fpc-devel] Freepascal in microcontrollers
> The compiler can only care about processor endianness. Having a known binary structure is something different from being usable for hardware access.

Right. AFAIK, even C does not do this in a language construct. But FP might be better than standard :) .

-Michael
Re: [fpc-devel] Freepascal in microcontrollers
Daniël Mantione wrote:

> To my knowledge there is no problem with the current implementation. Endian conversion is already the responsibility of the programmer. Therefore I don't see a need for changes on the compiler side.

Jonas said the bits were swapped, not the bytes. So PPC32 (1 shl 30) becomes (1 shl 6) on Intel (actual), while it should be (1 shl 1) (expected use by compiler). Both are the "second bit", but in different places.

Micha
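The arithmetic behind those three numbers can be checked with a short Free Pascal sketch. It rests on one reading of the statement above, namely that the big-endian side numbers bits from the most significant end while the little-endian side numbers them from the least significant end; that reading is an assumption, and the helper names are invented. SwapEndian is taken from the FPC System unit.

```pascal
program BitOrderSketch;
{$mode objfpc}

// Reverse the order of the 8 bits inside one byte.
function ReverseBits8(B: Byte): Byte;
var
  I: Integer;
begin
  Result := 0;
  for I := 0 to 7 do
    if (B and (1 shl I)) <> 0 then
      Result := Result or (1 shl (7 - I));
end;

// Reverse the bits inside each byte of a 32-bit value; byte order is kept.
function ReverseBitsPerByte(V: LongWord): LongWord;
var
  I: Integer;
begin
  Result := 0;
  for I := 0 to 3 do
    Result := Result or
      (LongWord(ReverseBits8((V shr (8 * I)) and $FF)) shl (8 * I));
end;

var
  PpcBit, Swapped, Corrected: LongWord;
begin
  PpcBit := LongWord(1) shl 30;              // "second bit" seen from the MSB end
  Swapped := SwapEndian(PpcBit);             // byte swap only
  Corrected := ReverseBitsPerByte(Swapped);  // plus bit reversal within each byte
  WriteLn(Swapped = LongWord(1) shl 6);      // TRUE: the "actual" value
  WriteLn(Corrected = LongWord(1) shl 1);    // TRUE: the "expected" value
end.
```

So a plain byte swap accounts for (1 shl 6), and the additional per-byte bit reversal is what takes it to (1 shl 1).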
Re: [fpc-devel] Freepascal in microcontrollers
On Thu, 28 Feb 2008, Michael Schnell wrote:

> > C-style bitpacking ("char c:1" and "int c:1" are often laid out differently in C depending on the previous fields,
>
> Not only this. C defines the layout as implementation dependent. I once was bitten by this when porting a networked project from a little-endian processor to a big-endian processor :( .
>
> Thus if we want binary portability of the structures we need to be better than C (optionally defining the structure as big-endian or little-endian on user request)

Why? The only thing I want to guarantee is that blockwrite followed by blockread on a platform with the same endianness works and will work in the future. This (combined with ifdef based endian conversion) guarantees portability of structures to any platform. Or do I see this wrong?

Daniël
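The guarantee asked for here, BlockWrite followed by BlockRead on the same platform, can be exercised with a short Free Pascal sketch. The record TSample, its fields, and the file name sample.bin are invented for the example; it shows only the same-platform round trip and says nothing about reading the file on a machine with different endianness.

```pascal
program BlockRoundTripSketch;
{$mode objfpc}

type
  // Hypothetical bitpacked record, 16 bits in total.
  TSample = bitpacked record
    Flag: Boolean;    // 1 bit
    Level: 0..7;      // 3 bits
    Code: 0..4095;    // 12 bits
  end;

// Write Rec to FileName with BlockWrite, then read it back with BlockRead.
function RoundTrip(const Rec: TSample; const FileName: string): TSample;
var
  F: file;
begin
  Assign(F, FileName);
  Rewrite(F, 1);                          // record size 1: counts are in bytes
  BlockWrite(F, Rec, SizeOf(Rec));
  Close(F);
  Reset(F, 1);
  BlockRead(F, Result, SizeOf(Result));
  Close(F);
end;

var
  A, B: TSample;
begin
  A.Flag := True;
  A.Level := 5;
  A.Code := 1234;
  B := RoundTrip(A, 'sample.bin');
  WriteLn(B.Flag, ' ', B.Level, ' ', B.Code);
end.
```

The fields come back unchanged because writer and reader share one compiler and one layout; the disagreement in the thread is about what should happen when they do not.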
Re: [fpc-devel] Freepascal in microcontrollers
> To my knowledge there is no problem with the current implementation. Endian conversion is already the responsibility of the programmer. Therefore I don't see a need for changes on the compiler side.

I don't understand your meaning here. If a record with binary fields is defined and transferred to (e.g.) another instance of the same program running on another machine by network or within a file, or if it needs to be crafted according to a specification of a network block or file format or a hardware device, endianness needs to be taken care of, and it's not nice if the user needs to write active code to do this when a "bitpacked" type is available that seemingly could be used for that purpose.

You might even want to define a record that just holds a single 16-bit value. Here it would be good to be able to define the endianness of the bitpacked structure to make it compatible with the communication partner.

-Michael