Re: Toolchain upgrade? (Was: Instructions cache flush on ARM)

2007-05-03 Thread Frantisek Dufka

Siarhei Siamashka wrote:


I'm not going to statically link with glibc, but only with libstdc++ (standard
c++ library). There are a few known tricks to make gcc link with libstdc++
statically, but dynamically with all the rest of libraries. One of them is
creating a symlink to libstdc++.a in some empty directory and specifying this
directory with the -L option on the gcc command line. When gcc starts linking,
it will be fooled into linking with the static libstdc++ library. But I guess just
killing libstdc++.so in scratchbox will do the job. After that, the
compiler theoretically should create binaries which should run with no
problems on the device even for c++ applications.


I used this trick for scummvm for IT2005. It works as long as your
program does not dynamically load (directly or indirectly) other C++
code that is also linked with libstdc++, e.g. plugins used by your
application or other libraries written in C++.

___
maemo-developers mailing list
maemo-developers@maemo.org
https://maemo.org/mailman/listinfo/maemo-developers


Re: Toolchain upgrade? (Was: Instructions cache flush on ARM)

2007-05-03 Thread Eero Tamminen
Hi,

ext Siarhei Siamashka wrote:
 Did you also read Intel docs? Unaligned access has some restrictions
 on x86 as well. Do you have an example of some practical case where
 hardware unaligned support from ARM11 would work worse than on x86?

No.  Would be nice if somebody would test it.


 The compiler should do the job aligning data for performance reasons
 (as it does on x86 as well). But if you happen to have some unaligned
 data in memory anyway, just reading it with some minor unavoidable
 performance penalty will be faster than reading data one byte at
 a time and combining it into a 32-bit or 16-bit value (instructions
 timings can be also found in this Technical Reference Manual).

 Enabling hardware unaligned access support should make explicit
 pointer conversion hacks that are sometimes used in not very portable
 C code work just like they do on x86. Which is a good thing in my
 opinion.

Same test could check whether the CPU alignment fixup mode has any
performance penalty also for aligned operations...


 The number of posts asking for help is also nonzero:
 http://www.internettablettalk.com/forums/showthread.php?t=2668

What!?!  People who are porting/re-making m68k games haven't
heard of alignment issues?  What is the world coming to...

Like ARM, M68k required things to be aligned and generated
buserror/SIGBUS if they weren't.  And these are not the only
architectures where things should be aligned.  If I remember
correctly, Alpha (the first 64-bit machine I tried, almost 10 years
ago) also required alignment.


 In addition, I remember having explained about alignment issues to a few 
 people on #maemo channel over all this time, they all came complaining
 about applications working on x86 but crashing on ARM.

 So in my opinion this problem really exists, even if it is not so significant.

The quality of the developers these days, sigh... ;-)

People really should learn to use a compiler, it tells
about these issues (signed/unsigned in addition to alignment)
when asked.  These warnings produce some false positives though,
so one doesn't want to keep them on all the time.

I'm more concerned about bugs in code being silently fixed
(wrongly) by the CPU instead of the developer getting notified about
the issues (with a crash).
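
A minimal sketch of an explicit check that keeps such bugs visible even
when the CPU would fix them up silently. The helper name is_aligned is
hypothetical and not from this thread; the compiler warnings mentioned
above presumably correspond to gcc options such as -Wcast-align and
-Wsign-compare.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns true if p is a multiple of `alignment`.
 * `alignment` must be a non-zero power of two (e.g. 2, 4, 8). */
bool is_aligned(const void *p, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return ((uintptr_t)p & (alignment - 1)) == 0;
}
```

In a debug build, something like assert(is_aligned(ptr, sizeof(uint32_t)))
placed before a cast to uint32_t * would make the program stop at the
faulty call site on x86 as well, not only on ARM.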


- Eero


Re: Toolchain upgrade? (Was: Instructions cache flush on ARM)

2007-05-03 Thread Riku Voipio

Siarhei Siamashka wrote:

Hm.  The toolchain might not be built with -pg support.
As to using gprof, that produces fairly unreliable results.
I'd recommend building Oprofile kernel and latest oprofile
user-space tools.



Maybe Oprofile is good, but gprof is better than nothing and does not require
recompiling the kernel.

  

OTOH you don't need to recompile apps with -pg to profile with oprofile.
One needs to just recompile the kernel once.

Well, I'm worried not about how to workaround ICE but about the overall
quality of the compiler. I wonder how many compiler related bugs are lurking
in maemo software but are not caught yet? 
  

Me too..

Did anybody try installing newer toolchains in scratchbox and use them with
maemo SDK? I just don't have much free time for these experiments and
don't want to break my installation of scratchbox which works now (more or
less acceptable)
  
No reason why it shouldn't work, one just needs to be sure the
glibc/libstdc++ built with it match closely enough. I tried the newer
CodeSourcery toolchain (based on gcc 4.1), and it mostly worked, except
for apps using certain deprecated syscalls (I've already forgotten which
ones).

http://www.scratchbox.org/wiki/ForeignToolchains


Enabling unaligned memory support will make life much easier for developers
unfamiliar with ARM platform. The number of applications for N800 should
grow, as fewer newbie developers will be turned away frustrated by
alignment bugs they have never heard about before.
  
Alignment issues exist for other RISC processors as well (hppa, alpha, ia64),
so most Linux programs to be ported to the N800 should already be fixed.  Also,
who says all maemo products will be ARM11 based? ARM9 isn't going to
disappear overnight, especially as it seems XScales will continue to evolve
as ARM9-based.

This is of course just objecting for the sake of it:) I don't have
anything against enabling unaligned memory support, so tested patches
would be appreciated ;)


Re: Toolchain upgrade? (Was: Instructions cache flush on ARM)

2007-05-02 Thread Eero Tamminen
Hi,

ext Siarhei Siamashka wrote:
 By the way, do you have any plans for upgrading toolchain? Either I'm
 extremely unlucky, or current toolchain is really very buggy.
 You can see the known issues from the GCC bugzilla.
 There are a few bugs in C++ support which have been fixed
 in gcc 3.4.6 (Maemo toolchain is 3.4.4) or 4.x.
 
 But doesn't current maemo toolchain have lots of modifications to
 backport EABI support which only officially appeared in gcc 4.x? 
 These modifications might have introduced some additional instability.

True, but GCC bugzilla might anyway be missing bugs existing in upstream
gcc too (especially ones that happen only on less common platforms such
as ARM). :-)


 It does not support -pg option properly (for profiling with gprof),
 Hm.  The toolchain might not be built with -pg support.
 As to using gprof, that produces fairly unreliable results.
 I'd recommend building Oprofile kernel and latest oprofile
 user-space tools.
 
 Maybe Oprofile is good, but gprof is better than nothing and does not require
 recompiling kernel.

On x86 I prefer valgrind/cachegrind/callgrind/kcachegrind as
that way one can browse the source code interactively with
the profiling information.  Getting to know how the source
really works is sometimes more useful than knowing the exact
bottleneckedness percentage of some function.

This is less useful for more hardware dependent features like video
though.  :-/


 also I encountered at least one internal compiler error and a couple of
 invalid code generation bugs already.
 C++ code generation?  Or C? (GCC bugzilla mentions only C++
 code generation issues)
 
 I have encountered the following problems on C code (MPlayer).
 
 ICE:
 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22177
 
 Definitely invalid code generation in inline asm (but the same bug 
 apparently shows up in gcc 4.1.1 as well):
 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31693
 
 Invalid code generation suspected:
 https://garage.maemo.org/tracker/index.php?func=detail&aid=254&group_id=54&atid=269
 https://garage.maemo.org/tracker/index.php?func=detail&aid=763&group_id=54&atid=269
 
 I did not investigate these last two problems thoroughly (this is
 probably some bad code in MPlayer with 'undefined behaviour' which works
 better on some compilers but breaks on the others), but they disappear when
 compiling with gcc 4.1.1 crosscompiler (outside scratchbox using gentoo
 crossdev).
 
 ICE you can get around by trying another optimization level
 (sometimes -Os or -O3 works where -O2 doesn't).
 
 Well, I'm worried not about how to workaround ICE but about the overall
 quality of the compiler. I wonder how many compiler related bugs are lurking
 in maemo software but are not caught yet? But again, maybe I'm just unlucky
 to get hit by more bugs than the others :)

I know only one issue where a bad code generation was suspected
(in C++ code), but it was never really verified.


 Did anybody try installing newer toolchains in scratchbox and use them with
 maemo SDK? I just don't have much free time for these experiments and 
 don't want to break my installation of scratchbox which works now (more or
 less acceptable)

Installing new toolchains for Sbox shouldn't be a problem (if it's
already available for it) and you can make a new Sbox target for each
toolchain you want to test.


 Building packages with new toolchain would probably need to have libstdc++
 linked statically for C++ applications to work on 770/N800, but otherwise
 everything should be fine.

Actually, you cannot really build static binaries with Glibc.
It links some stuff always dynamically (nss for example).
I don't know whether this is a problem in practice though.

(uClibc is better in this respect and produces significantly
smaller binaries too.)


 One more question is about the kernel, ARM11 seems to support unaligned
 memory access in hardware, but this feature is not enabled on N800.
 What do "seems", "to support" and "feature enabled" mean in
 the above clause?  Seems how?  And what is the result? Enabled what?
 
 "seems" is a standard disclaimer which means that I did not work with these
 features myself, only read this information from docs and can't be sure if I
 understood everything correctly :)
 
 ARM CPU is able to trap them?  Kernel could SIGBUS the co. processes?
 (as unaligned access has AFAIK undefined results on ARM, is often
 coding error and fixing those accesses on kernel side has definitive
 performance penalty)
 
 http://arm.com/documentation/ARMProcessor_Cores/index.html
 'ARM1136JF-S and ARM1136J-S r1p1 Technical Reference Manual'
 Chapter 4 'Unaligned and Mixed-Endian Data Access Support'

Did you read the section on ARMv6 unaligned data access restrictions?
Basically it doesn't work in all cases, the accesses are not atomic and
have performance implications.


 As ARM11 core used in N800 is little endian, does have floating point unit and
 supports unaligned memory access in hardware (which only needs to be 

Re: Toolchain upgrade? (Was: Instructions cache flush on ARM)

2007-05-02 Thread Siarhei Siamashka
On Wednesday 02 May 2007 18:48, Eero Tamminen wrote:
 On x86 I prefer valgrind/cachegrind/callgrind/kcachegrind as
 that way one can browse the source code interactively with
 the profiling information.  Getting to know how the source
 really works is sometimes more useful than knowing the exact
 bottleneckedness percentage of some function.

Sure, I'm also using valgrind/cachegrind/callgrind/kcachegrind in my 
work quite often. It's a very nice tool.

But callgrind's statistics do not provide information about floating point
math and integer divisions, so real results on ARM may be very different.

Also, cache behaviour on the Nokia 770's arm926ej-s core is very different
from cache behaviour on x86. The arm926ej-s does not allocate a cache line on
a write miss, whereas all x86 CPUs do. This makes a very big difference for
code which does lots of writes to uncached memory. Cachegrind only simulates
a write-allocate cache.

I created the following patch for simulating read-allocate behaviour 
in callgrind (for more precise arm926ej-s simulation):
http://ufo2000.sourceforge.net/files/vg-read-allocate-cache-patch.diff

Though the arm1136jf-s core in the N800 now supports write-allocate cache,
so this patch is not needed when optimizing for N800 :)
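
To illustrate the difference between the two allocation policies, here is
a toy direct-mapped cache model in C. It is only a sketch: it is not
Cachegrind code and not the patch linked above, the names (toy_cache,
toy_fill_misses) and the cache geometry are made up for the example, and
it counts misses only, without modelling write buffers or timing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical geometry: 256 lines of 32 bytes, direct mapped. */
enum { LINE_SIZE = 32, NUM_LINES = 256 };

typedef struct {
    bool          valid[NUM_LINES];
    uint32_t      tag[NUM_LINES];
    bool          write_allocate;   /* false = read-allocate only */
    unsigned long read_misses;
    unsigned long write_misses;
} toy_cache;

void toy_cache_init(toy_cache *c, bool write_allocate)
{
    assert(c != NULL);
    for (size_t i = 0; i < NUM_LINES; i++) {
        c->valid[i] = false;
        c->tag[i] = 0;
    }
    c->write_allocate = write_allocate;
    c->read_misses = 0;
    c->write_misses = 0;
}

/* Returns true on a hit; on a miss, fills the line only if `allocate`. */
static bool toy_lookup(toy_cache *c, uint32_t addr, bool allocate)
{
    uint32_t line = addr / LINE_SIZE;
    uint32_t idx = line % NUM_LINES;
    uint32_t tag = line / NUM_LINES;

    if (c->valid[idx] && c->tag[idx] == tag)
        return true;
    if (allocate) {
        c->valid[idx] = true;
        c->tag[idx] = tag;
    }
    return false;
}

/* Reads always allocate a line on a miss. */
void toy_cache_read(toy_cache *c, uint32_t addr)
{
    if (!toy_lookup(c, addr, true))
        c->read_misses++;
}

/* Writes allocate a line on a miss only with the write-allocate policy. */
void toy_cache_write(toy_cache *c, uint32_t addr)
{
    if (!toy_lookup(c, addr, c->write_allocate))
        c->write_misses++;
}

/* Writes n consecutive bytes into a cold cache, returns the write misses. */
unsigned long toy_fill_misses(bool write_allocate, uint32_t n)
{
    toy_cache c;
    toy_cache_init(&c, write_allocate);
    for (uint32_t addr = 0; addr < n; addr++)
        toy_cache_write(&c, addr);
    return c.write_misses;
}
```

In this model, filling a cold 4096-byte buffer byte by byte misses once per
32-byte line with write-allocate, but on every single write without it,
which is why a simulator that assumes write-allocate can be misleading for
write-heavy code on the arm926ej-s.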

  Did anybody try installing newer toolchains in scratchbox and use them
  with maemo SDK? I just don't have much free time for these experiments
  and don't want to break my installation of scratchbox which works now
  (more or less acceptable)

 Installing new toolchains for Sbox shouldn't be a problem (if it's
 already available for it) and you can make a new Sbox target for each
 toolchain you want to test.

Thanks, I'll try that. In my preliminary tests, mplayer becomes a few percent
faster for mpeg4 decoding when switching to gcc 4.1.1 (tested a build compiled
with a crosscompiler outside scratchbox, with no audio/video output except for
SDL, so not really useful for end users, but fine for benchmarking with
gprof).

  Building packages with new toolchain would probably need to have
  libstdc++ linked statically for C++ applications to work on 770/N800, but
  otherwise everything should be fine.

 Actually, you cannot really build static binaries with Glibc.
 It links some stuff always dynamically (nss for example).
 I don't know whether this is a problem in practice though.

I'm not going to statically link with glibc, but only with libstdc++ (standard
c++ library). There are a few known tricks to make gcc link with libstdc++
statically, but dynamically with all the rest of libraries. One of them is
creating a symlink to libstdc++.a in some empty directory and specifying this
directory with the -L option on the gcc command line. When gcc starts linking,
it will be fooled into linking with the static libstdc++ library. But I guess just
killing libstdc++.so in scratchbox will do the job. After that, the
compiler theoretically should create binaries which should run with no
problems on the device even for c++ applications.

  http://arm.com/documentation/ARMProcessor_Cores/index.html
  'ARM1136JF-S and ARM1136J-S r1p1 Technical Reference Manual'
  Chapter 4 'Unaligned and Mixed-Endian Data Access Support'

 Did you read the section on ARMv6 unaligned data access restrictions?
 Basically it doesn't work in all cases, the accesses are not atomic and
 have performance implications.

Did you also read Intel docs? Unaligned access has some restrictions
on x86 as well. Do you have an example of some practical case where 
hardware unaligned support from ARM11 would work worse than on x86?

The compiler should do the job aligning data for performance reasons (as it
does on x86 as well). But if you happen to have some unaligned data in 
memory anyway, just reading it with some minor unavoidable performance 
penalty will be faster than reading data one byte at a time and combining it
into a 32-bit or 16-bit value (instructions timings can be also found in this
Technical Reference Manual).

Enabling hardware unaligned access support should make explicit pointer
conversion hacks that are sometimes used in not very portable C code work
just like they do on x86. Which is a good thing in my opinion.
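
For comparison, a minimal sketch of the two portable ways to read a 32-bit
value from a possibly unaligned address, as alternatives to the pointer
cast hack. The function names are hypothetical. The cast itself is shown
only in a comment, since on ARM cores without (or with disabled) hardware
unaligned support it may fault or return unexpected data.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* The non-portable hack under discussion (deliberately not compiled):
 *
 *     uint32_t v = *(const uint32_t *)p;    // p may be unaligned
 */

/* Reads one byte at a time and combines the bytes as a little-endian
 * 32-bit value. Works for any alignment and any host byte order. */
uint32_t read_u32_le_bytes(const unsigned char *p)
{
    assert(p != NULL);
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Copies four bytes into an aligned variable; the result is in host byte
 * order. Compilers are free to turn this into a single load on targets
 * where unaligned loads are safe. */
uint32_t read_u32_native(const unsigned char *p)
{
    uint32_t v;
    assert(p != NULL);
    memcpy(&v, p, sizeof v);
    return v;
}
```

Both functions behave the same whether or not the kernel enables hardware
unaligned access; only their speed differs.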

  As ARM11 core used in N800 is little endian, does have floating point
  unit and supports unaligned memory access in hardware (which only needs
  to be enabled), it probably doesn't have any serious portability issues
  to be aware of anymore and vast majority of software initially developed
  for x86 should be easy to compile and run on it even without doing any
  modifications.

 Compiler aligns everything correctly if your code is correct.
 I think non-aligned code is bug and performance issue.

In the real world such buggy code unfortunately exists. And it works fine on
x86 which is probably the most widely used platform for software development.

  Enabling unaligned memory support will make life much easier for
  developers unfamiliar with ARM platform. The number of applications for
  

Toolchain upgrade? (Was: Instructions cache flush on ARM)

2007-04-30 Thread Siarhei Siamashka
On Tuesday 24 April 2007 10:56, you wrote:

  By the way, do you have any plans for upgrading toolchain? Either I'm
  extremely unlucky, or current toolchain is really very buggy.

 You can see the known issues from the GCC bugzilla.
 There are a few bugs in C++ support which have been fixed
 in gcc 3.4.6 (Maemo toolchain is 3.4.4) or 4.x.

But doesn't current maemo toolchain have lots of modifications to
backport EABI support which only officially appeared in gcc 4.x? 
These modifications might have introduced some additional instability.

  It does not support -pg option properly (for profiling with gprof),

 Hm.  The toolchain might not be built with -pg support.
 As to using gprof, that produces fairly unreliable results.
 I'd recommend building Oprofile kernel and latest oprofile
 user-space tools.

Maybe Oprofile is good, but gprof is better than nothing and does not require
recompiling the kernel.

  also I encountered at least one internal compiler error and a couple of
  invalid code generation bugs already.

 C++ code generation?  Or C? (GCC bugzilla mentions only C++
 code generation issues)

I have encountered the following problems on C code (MPlayer).

ICE:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22177

Definitely invalid code generation in inline asm (but the same bug 
apparently shows up in gcc 4.1.1 as well):
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31693

Invalid code generation suspected:
https://garage.maemo.org/tracker/index.php?func=detail&aid=254&group_id=54&atid=269
https://garage.maemo.org/tracker/index.php?func=detail&aid=763&group_id=54&atid=269

I did not investigate these last two problems thoroughly (this is
probably some bad code in MPlayer with 'undefined behaviour' which works
better on some compilers but breaks on the others), but they disappear when
compiling with gcc 4.1.1 crosscompiler (outside scratchbox using gentoo
crossdev).

 ICE you can get around by trying another optimization level
 (sometimes -Os or -O3 works where -O2 doesn't).

Well, I'm worried not about how to workaround ICE but about the overall
quality of the compiler. I wonder how many compiler related bugs are lurking
in maemo software but are not caught yet? But again, maybe I'm just unlucky
to get hit by more bugs than the others :)

Did anybody try installing newer toolchains in scratchbox and use them with
maemo SDK? I just don't have much free time for these experiments and 
don't want to break my installation of scratchbox which works now (more or
less acceptable)

Building packages with new toolchain would probably need to have libstdc++
linked statically for C++ applications to work on 770/N800, but otherwise
everything should be fine.

  One more question is about the kernel, ARM11 seems to support unaligned
  memory access in hardware, but this feature is not enabled on N800.

 What do "seems", "to support" and "feature enabled" mean in
 the above clause?  Seems how?  And what is the result? Enabled what?

"seems" is a standard disclaimer which means that I did not work with these
features myself, only read this information from docs and can't be sure if I
understood everything correctly :)

 ARM CPU is able to trap them?  Kernel could SIGBUS the co. processes?
 (as unaligned access has AFAIK undefined results on ARM, is often
 coding error and fixing those accesses on kernel side has definitive
 performance penalty)

http://arm.com/documentation/ARMProcessor_Cores/index.html
'ARM1136JF-S and ARM1136J-S r1p1 Technical Reference Manual'
Chapter 4 'Unaligned and Mixed-Endian Data Access Support'

As the ARM11 core used in the N800 is little endian, has a floating point unit
and supports unaligned memory access in hardware (which only needs to be
enabled), it probably doesn't have any serious portability issues to be aware
of anymore, and the vast majority of software initially developed for x86
should be easy to compile and run on it even without any modifications.

Enabling unaligned memory support will make life much easier for developers
unfamiliar with the ARM platform. The number of applications for N800 should
grow, as fewer newbie developers will be turned away frustrated by
alignment bugs they have never heard about before.

But this will be to some extent a bad thing for Nokia 770, as it will result
in more applications that are usable on N800, but buggy on 770.