Re: [maemo-developers] Improving Cairo performance on the N800

2007-01-17 Thread Siarhei Siamashka
On Tuesday 16 January 2007 12:08, Zeeshan Ali wrote:

  Now, the recently announced Nokia N800 is different from the 770 in
  various ways that are interesting for Cairo performance. I've got my
  eye on the ARMv6 SIMD instructions and the PowerVR MBX accelerator.

Yeah! Me too. The combined power of these two can make it possible
 to optimize a lot of nice free software out there for the N800 device.
  However, while the former is fully documented and the documentation is
 available to the general public, it doesn't have a lot to offer. ARMv6
 SIMD only operates on 32-bit words, and hence I find it unlikely that it
 can be used to optimize double fp emulation, in contrast to Intel
 Wireless MMX, which provides a big bunch of 64-bit SIMD instructions.
 OTOH, these few SIMD instructions
 can still be used to optimize a lot of code, but would it be a good
 idea for cairo if you need to convert the operand values to ints and
 the result(s) back to float?
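
To make "operating on 32-bit words" concrete: the ARMv6 SIMD instructions treat one 32-bit register as four 8-bit lanes or two 16-bit lanes. Below is a minimal sketch in portable C of what one such instruction, UQADD8 (unsigned saturating add of four packed bytes), computes. The function name is made up for illustration, and the loop is only an emulation of the result, not how the hardware or any library implements it.

```c
#include <stdint.h>

/* Emulates the result of ARMv6 UQADD8: each of the four bytes packed in
 * 'a' is added to the matching byte in 'b', and every lane saturates at
 * 0xFF instead of wrapping around. (Hypothetical helper name.) */
uint32_t uqadd8_emulated(uint32_t a, uint32_t b)
{
    uint32_t result = 0;
    for (int lane = 0; lane < 4; lane++) {
        uint32_t x = (a >> (8 * lane)) & 0xFFu;
        uint32_t y = (b >> (8 * lane)) & 0xFFu;
        uint32_t sum = x + y;
        if (sum > 0xFFu)
            sum = 0xFFu;            /* saturate this lane */
        result |= sum << (8 * lane);
    }
    return result;
}
```

This is the kind of operation that maps well onto 8-bit-per-channel pixel data (adding two ARGB pixels, for instance), which is why it may still be useful for integer pixel paths even if it does not help floating point code.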

Well, OMAP2420 seems to support floating point in hardware, so all this stuff
is probably not needed anymore :)

   I had already been thinking about utilizing ARMv6 before the N800 was
 released to the public. My proposed plan of attack for the community (and
 also the Nokia employees) is simply the following:

 1. Patch GCC to provide ARMv6 intrinsics. (1 MM at most)
 2. Patch liboil [1] to utilize these intrinsics when compiled for
 ARMv6 target (1-3 MM)
 3. Make all the software utilize liboil wherever appropriate or ARMv6
 intrinsics directly if needed.

The 3rd step would ensure that you are optimizing your software for
 all the platforms for which liboil provides optimizations. OTOH, one
 can skip step #1 and write liboil implementations in assembly.

I already made a little progress on this, and the result is two
 header files which provide inline functions abstracting the assembly
 instructions. A friend of mine was supposed to convert them to gcc
 intrinsics and patch gcc, but I never got around to finishing them.
 However, I am attaching the headers so anyone can use them as a
 starting point if he/she likes.

According to my tests, the performance improvement from using such header
files is minimal. They are easy to use, but the gain is generally small.

When I benchmarked idct performance, I also tested a C implementation with
some macros for fast armv5te 16-bit multiplication, out of curiosity. The
performance improvement was only about 5%, while handcrafted code improves
performance by as much as 50% (and still has potential for more
optimizations):
http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/2006-September/045837.html

A very similar minimal effect is obtained from using such macros in the
ffmpeg mp3 decoder.
 
The explanation is simple. The compiler is not able to schedule instructions
as well as a human, especially if it has some 'alien' parts of code inserted
in the flow of its instructions via inline asm. For example, this multiply
instruction takes 1 cycle to execute, but its result has 1 extra cycle of
latency on ARM9 (on ARM11 the latency is even higher, 2 cycles), so you can't
use it immediately in the next instruction. As gcc does not know about the
scheduling of such instructions when using just macros, it may try to use
the result immediately and suffer from a penalty of 1 or more cycles because
of a pipeline interlock.
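
To illustrate the kind of wrapper being discussed, here is a minimal sketch, not taken from the attached headers. It assumes the ARMv5TE SMULBB instruction (signed multiply of the bottom 16-bit halves) and a hypothetical build flag USE_ARM_SMULBB; without that flag a plain C fallback is used, so the code also builds on other hosts. The dot4 helper is likewise made up for illustration. It groups the independent multiplies ahead of the additions, but the compiler remains free to reorder them, which is exactly the limitation described above.

```c
#include <stdint.h>

/* Signed 16x16 -> 32 multiply of the low halfwords of a and b.
 * USE_ARM_SMULBB is a hypothetical build flag for ARMv5TE or later. */
static inline int32_t mul16(int32_t a, int32_t b)
{
#ifdef USE_ARM_SMULBB
    int32_t r;
    __asm__("smulbb %0, %1, %2" : "=r"(r) : "r"(a), "r"(b));
    return r;
#else
    return (int32_t)(int16_t)a * (int32_t)(int16_t)b;
#endif
}

/* Dot product of four 16-bit pairs. The four multiplies do not depend on
 * each other, so in hand-written assembly they can be issued back to back
 * and each result is only consumed after its latency has elapsed. In C we
 * can only write them in that order and hope the compiler keeps it. */
int32_t dot4(const int16_t *x, const int16_t *y)
{
    int32_t p0 = mul16(x[0], y[0]);
    int32_t p1 = mul16(x[1], y[1]);
    int32_t p2 = mul16(x[2], y[2]);
    int32_t p3 = mul16(x[3], y[3]);
    return (p0 + p1) + (p2 + p3);
}
```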

So if really good performance is required, nothing can beat handcrafted
assembly yet. Of course, it makes sense to profile the code and optimize only
the relatively small, time-critical leaf functions.

By the way, free software is really poorly optimized for ARM right now. For
example, SDL is not optimized for ARM, xserver is probably not optimized
either, and a lot of performance critical parts of code in various software
are still only implemented in C for ARM, while they have had x86 assembly
optimizations for a long time. Considering that Internet Tablets might face
tight competition from x86 UMPC devices in the near future, ARM powered
devices are at some disadvantage now. Is this something that we should try to
change? :-)
___
maemo-developers mailing list
maemo-developers@maemo.org
https://maemo.org/mailman/listinfo/maemo-developers


Re: [maemo-developers] Xvideo support for Nokia 770?

2007-01-17 Thread Siarhei Siamashka
On Wednesday 10 January 2007 01:51, Charles 'Buck' Krasic wrote:

 Siarhei Siamashka wrote:
  Actually I have been thinking about trying to implement Xvideo
  support on 770 for some time already. Now as N800 has Xvideo
  support, it would be nice to have it on 770 as well for better
  consistency and software compatibility.

 As you may recall, I was considering this back in August/September.
 I tried a few things, and reported some of my findings to this list.
 The code for all that is still available here:
 http://qstream.org/~krasic/770/dsp/

Yes, sure I remember. Thanks for doing these experiments and making 
the results available. It really helps to have more information around.

  I see the following possible options:
 
  1. Implement it just using ARM core and optimize it as much as
  possible (using dynamically generated code for scaling to get the
  best performance). Is quite a straightforward solution and only
  needs time to implement it.

 It is my impression that this might be the most attractive option.
 I noticed that TCPMP which seems to be the most performant player for
 the ARM uses this approach, and it is available under GPL, so it may
 be possible to adapt some of its code.

 In the long run, I would hope that integrating TCPMP scaling code into
 libswscale of the ffmpeg project might be the most elegant approach,
 since that seems to be the most performant/featureful/widely adopted
 open-source scaling code (but not yet on ARM).   For mplayer, it works
 out of the box, since libswscale actually originated from mplayer, and
 only recently migrated to ffmpeg.

I see, thanks for the information (I checked the TCPMP sources some time ago,
but was interested in the runtime cpu capabilities detection code and did not
look at the scaler at that time). Using TCPMP code may be an interesting
option. But I may still try to make my own scaler implementation for two
reasons:
1. TCPMP is covered by the GPL license, and most parts of ffmpeg are LGPL, so
it probably makes sense to make a clean room implementation of a JIT powered
scaler for ARM under the LGPL license.
2. I'm worried about the performance. Knowing how the cache and write buffer
work on the arm926 core, it is possible to tune generated code for it and get
the best performance possible. So the results can be better than for TCPMP.

I have just committed some initial assembly optimizations for the unscaled
yuv420p -> yuyv422 color format convertor to maemo mplayer SVN. It already
provides some performance improvement; for example, on my test video file
(640x480 resolution, 24 fps) I get the following results now:

BENCHMARKs: VC: 114.526s VO:  21.055s A:   0.000s Sys:   1.582s =  137.163s
BENCHMARK%: VC: 83.4962% VO: 15.3503% A:  0.0000% Sys:  1.1535% = 100.0000%

We can compare it with the older results (decoding time was also 
improved a bit since that time because of recent assembly optimizations 
for dequantizer):
http://maemo.org/pipermail/maemo-developers/2006-December/006646.html

BENCHMARKs: VC: 121.282s VO:  31.538s A:   0.000s Sys:   1.577s =  154.397s
BENCHMARK%: VC: 78.5517% VO: 20.4267% A:  0.0000% Sys:  1.0216% = 100.0000%

Most of the speed improvement in color conversion and video output (VO: part)
is gained just from loop unrolling and avoiding some extra instructions
that gcc emits when compiling C code, but using an STM (store multiple)
instruction to store 16 bytes at once at an aligned location [1] provides at
least a 10% performance improvement here.
If we estimate the memory copy speed here with the additional colorspace
conversion applied, it is about 70MB/s now for 640x480 24 fps video (though we
need to read a bit less data than we write here, so it is a bit different from
memcpy). And I have observed peak memcpy performance of about 110MB/s on the
Nokia 770. So this color convertor is quite close to the memory bandwidth
limit now. This code can be optimized more by processing two image lines at
once, so we can get rid of some data read instructions and improve
performance. Also, experimenting with prefetch reads may provide some
improvement.
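
For reference, the conversion itself is simple; the cost is almost entirely memory traffic. Below is a plain C sketch of an unscaled yuv420p -> yuyv422 conversion. It is not the code committed to maemo mplayer SVN; the function name and parameter layout are assumptions made for illustration. It also shows why processing two lines at once can save reads: both lines of a pair use the same chroma row.

```c
#include <assert.h>
#include <stdint.h>

/* Unscaled planar YUV 4:2:0 -> packed YUYV 4:2:2 (Y0 U Y1 V per two pixels).
 * Strides are in bytes. Hypothetical reference routine, one line at a time. */
void yuv420p_to_yuyv422(uint8_t *dst, int dst_stride,
                        const uint8_t *y, const uint8_t *u, const uint8_t *v,
                        int y_stride, int c_stride, int width, int height)
{
    assert(width % 2 == 0);          /* each output group covers 2 pixels */

    for (int row = 0; row < height; row++) {
        const uint8_t *ys = y + row * y_stride;
        /* Rows 2k and 2k+1 share chroma row k, so the same U and V bytes
         * are read twice when lines are converted one at a time. */
        const uint8_t *us = u + (row / 2) * c_stride;
        const uint8_t *vs = v + (row / 2) * c_stride;
        uint8_t *d = dst + row * dst_stride;

        for (int col = 0; col < width / 2; col++) {
            d[0] = ys[2 * col];      /* Y0 */
            d[1] = us[col];          /* U  */
            d[2] = ys[2 * col + 1];  /* Y1 */
            d[3] = vs[col];          /* V  */
            d += 4;
        }
    }
}
```

Per frame this reads 1.5 bytes and writes 2 bytes per pixel, which is the "read a bit less data than we write" asymmetry mentioned above.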

JIT generated code should have a bit worse performance, but not much. If we
decide to do 'nearest neighbour' scaling, the result should probably be as
fast as this nonscaled conversion. But I want to try a simplified variation
of bilinear scaling: each pixel in the destination buffer is either a copy of
some pixel in the source buffer or the average value of two pixels. This way
it should introduce at most two extra instructions for each byte of output:
an addition of two pixel color components and a right shift.
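
A minimal C sketch of that idea for a single line of one color component follows. The rule for choosing between "copy" and "average" is an assumption made for illustration (the fractional source position is split into quarters), as are the function name and the 16.16 fixed point stepping. A JIT would bake these decisions into straight-line generated code instead of evaluating them per pixel.

```c
#include <assert.h>
#include <stdint.h>

/* Scales one line of 8-bit samples from src_w to dst_w entries. Each output
 * is either a copy of one source sample or the average of two neighbours
 * (one add and one right shift). Hypothetical helper, not from any library. */
void scale_line_simple(uint8_t *dst, int dst_w, const uint8_t *src, int src_w)
{
    assert(dst_w > 0 && src_w > 0 && src_w < 65536);

    uint32_t step = ((uint32_t)src_w << 16) / (uint32_t)dst_w; /* 16.16 */
    uint32_t pos = 0;

    for (int x = 0; x < dst_w; x++, pos += step) {
        int i = (int)(pos >> 16);                /* left source sample */
        int j = (i + 1 < src_w) ? i + 1 : i;     /* right one, clamped */
        uint32_t frac = pos & 0xFFFFu;

        if (frac < 0x4000u)                      /* close to left sample */
            dst[x] = src[i];
        else if (frac < 0xC000u)                 /* near the midpoint */
            dst[x] = (uint8_t)((src[i] + src[j]) >> 1);
        else                                     /* close to right sample */
            dst[x] = src[j];
    }
}
```

For example, doubling the two-sample line {0, 100} gives {0, 50, 100, 100}: the second output is the average of both inputs, and the last one is clamped at the right edge.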

  2. Try using dsp tasks that already exist on the device and are
  used for dspfbsink. But the sources of gst plugins contain code
  that limits video resolution for dspfbsink. I wonder if this check
  was introduced artificially or it is the limitation of DSP scaler
  and it can't handle anything larger than that. Also I wonder if
  existing video scaler DSP task can support direct rendering [2].

 I tried direct rendering in the 

[maemo-developers] Re: Discussion of a possible project - offline calendar project

2007-01-17 Thread Ross Burton
On Tue, 2007-01-16 at 22:23 +0100, Patrick Ohly wrote:
   * the showstopper though were performance/timeout issues in the
 EDS-DBus libraries (see below)

Eek!

 Anyway, the problem is that after downloading 200 contacts into the
 Nokia 770 e_book_get_contacts() fails with a timeout error. I was able
 to work around that by using e_book_async_get_contacts(), only to find
 that now e_book_get_changes() suffers from the same problem. I suspect
 that it is a DBus method call which is expected to complete more quickly
 than it really does.

Yes, there is a timeout on DBus calls, which isn't that long.  If you
have a lot of contacts, EDS has to read every single one into memory,
create a DBus message and send it back to the client (which gets copied
a number of times with the current bus protocol).  If you profile it
you'll see that memcpy() is the bottleneck here: basically there is too
much data to copy, and not enough memory bandwidth.

The solution is to always use the async methods unless you are coding a
toy application.  If you want to get all contacts, ideally create a book
view -- this means you get informed of the contacts both asynchronously
and incrementally (which is much nicer to system performance as there is
no multi-megabyte message to parse and copy over the bus).  When you
call get_changes, use the async version.

Have you had this working against eds-dbus?  The e_book_get_changes()
method until now was untested in the DBus port, and although I hoped it
worked I hadn't verified it.  If it has been working, that is great
news.

Feel free to mail me any further in-depth questions off-list,
Ross
-- 
Ross Burton mail: [EMAIL PROTECTED]
  jabber: [EMAIL PROTECTED]
 www: http://www.burtonini.com./
 PGP Fingerprint: 1A21 F5B0 D8D0 CFE3 81D4 E25A 2D09 E447 D0B4 33DF





Re: [maemo-developers] examining n800 kernel

2007-01-17 Thread Kalle Valo
Frantisek Dufka [EMAIL PROTECTED] writes:

 WI-FI

 seems to be same chip as in (newer) N770 devices (?), similar firmware
 blobs (3825.arm, 3826.arm) probably newer versions. Hopefully the
 speed will be better than those 500KB/s on N770 thanks to rest of the
 system.

The SPI bus is faster, so WLAN is a bit faster. I have managed to get
7 Mbit/s (I guess about 800 KB/s) TCP downstream with iperf.

-- 
Kalle Valo



RE: [maemo-developers] examining n800 kernel

2007-01-17 Thread Jakub.Pavelek
 I've been looking at the n800 kernel source in bora repository to
 figure out what n800 is like comparing to the n770. Here is a summary
 of some things I found. As I don't have the device I may be wrong with
 something that could be easily verified.

 kernel is 2.6.18-omap1 - everybody probably knows that :-)
(snip)
 USB

 Seems to be 2.0, capable of high speed mode (480MBits), chip is
 TUSB6010 by TI. No usb host mode is compiled in the kernel. Usb host
 mode support was also removed from initfs (usb booting) so this may
 look bad.


  Looking at the shiny brochure-level documentation for the TUSB6010
  it looks like it has full support for USB-OTG, including a 5V charge
  pump for driving peripheral devices. Whether or not it's possible to
  write support for it due to available documentation is another matter.
  I really wish Nokia would step forward and do it since it looks like
  it could be perfect for host-mode.
  Thanks for doing the kernel writeup!

Larry

Hi,

AFAIK there is no USB-OTG in N800. The connector (HW) does not handle it
so no effort spent in the kernel either.

Br,

--jakub


[maemo-developers] Re: maemo-developers Digest, Vol 21, Issue 57

2007-01-17 Thread Pablo Chacin


Yes, I also noticed the flurry of dev projects after the
 announcement. Interesting.



I would say logical, not interesting. It is one thing to invest your
spare time in an open source project (I don't know about you guys, but I do
other things for a living), but also investing 400 euros is maybe too much
to ask of developers, who are creating the sustainability of the N800 as an
appealing platform for non-developers. Also, consider that the new N800 is
much better and more appealing than the 770, so many of us are now
considering it seriously.

By the way, I'm not currently in any project (nor do I own an N800 at
the moment), but I'm cooking up a project regarding gesture-based interfaces.
If initial research goes well, I'll let you know . . . and this will happen
regardless of that rebate :-D




Date: Tue, 16 Jan 2007 11:02:34 -0800

From: Ty Hoffman [EMAIL PROTECTED]
Subject: Re: [maemo-developers] n800 camera specs
To: Michael Wiktowy [EMAIL PROTECTED]
Cc: maemo developers mailing list maemo-developers@maemo.org
Message-ID: [EMAIL PROTECTED]
Content-Type: text/plain; charset=windows-1252; format=flowed

Michael Wiktowy wrote:
 On 1/16/07, Matt Clark [EMAIL PROTECTED] wrote:
 
  I take it Nokia is going to refund the €300/$300 price difference for
  people that bought an n800 already but are going to be in the dev program?

 Wow, that's a wonderfully well developed sense of entitlement you've
 got there.

 Getting beyond the easily misinterpreted intentions of mailing list
 participants, he does have a very good idea. If Nokia just sends 500
 worthy developers mail-in rebates for store-bought n800's then there
 is no issue of waiting anymore. It is probably the easiest thing to
 handle logistically on Nokia's side also.

 On another note, one beneficial thing that this delay has generated is
 a lot of developer activity on the 770 as devs fight to get noticed
 via project updates :]

 So as a simple user I say, Nokia, delay all you want ;]

 /Mike
 


Yes, I also noticed the flurry of dev projects after the announcement.
Interesting. Maybe the whole discount program is just a psychological
experiment on motivation and reward, and there are no discounted units.
Perhaps it's a first test for the 'bounty' program discussed
earlier...(please don't misinterpret my intentions! I'm just kidding!)

--Ty




[maemo-developers] gtk+-2.0 uninstallable

2007-01-17 Thread Saifa Saifa
Hi

With the new maemo 3.0, libgtk2.0+-dev is not installable with apt-get.
When I try to apt-get install libgtk+-2.0, it reports unmet dependencies on
libglib2.0-dev. It is looking for the pkg-config package, which is
installed by osso-af-settings, and the pkg-config version is 0.15.0,
which is the latest.
Building a basic hildon C program on maemo requires those packages.
How can I install these debs?

TIA
Saifa



RE: [maemo-developers] DSP programming

2007-01-17 Thread Simon Pickering
I've been trying to get started with DSP programming, following the
instructions summarised in this post. 

 I just took a brief look at what is required for programming
 for the DSP in the 770. It seems that building your own
 modules is possible with publicly available tools and
 documents. At least I got the demo_console from DSP gateway
 loaded and it seems to run OK.
 
 The following is what I did. Someone might be interested in 
 testing the instructions and maybe polish these into a proper 
 HOWTO to the maemo wiki.

snip

 3. Get the avs_kernel.out from your 770
 
 The dynamically loaded DSP modules are linked against a dummy 
 kernel object generated from the actual kernel. The DSP 
 kernel is in /etc/dsp directory. Just copy it to your host 
 and adjust dspgw-3.3-dsp/apps/demo_mod/Makefile to use the 
 770 avs_kernel.out to generate the dummy_kernel.obj instead 
 of using the default tinkernel.out. Now you should be able to 
 build the DSP side of the demo console with simple make.

After building the dsp side of the demos (demo_console or demo_fb), the next
step should be to run coff_unresolve on the resultant .out file, to remove
the dummy kernel, which has been linked in. So something like the following:

# create the dummy kernel
gen_dummy_kernel avs_kernel.out -o avs_kernel.obj

# Run make for the dsp demo in question 
# (compiles source and statically links in 
# the avs_kernel.obj dummy kernel)
make 

# remove the dummy kernel
coff_unresolve -s .tinkernel demo_console.out

Unfortunately, the code produced using this technique doesn't run. It
results in the eventual error message:

open: Device or resource busy

This is the same message as is received if you actually forget to run
coff_unresolve on the demo_console.out dsp task.

What does work, however, is simply renaming the unlinked demo_console.obj
file and placing that in the /lib/dsp/modules/ directory.

Have other people seen this problem? Has anyone else tried? I'd specifically
like to work out whether it's me making a mistake with coff_unresolve (as
this step will certainly be necessary for dsp tasks which are built using
more than one source file, as these can't be linked together without the
dummy kernel (avs_kernel.obj)).

Thanks,


Simon

P.S. The above steps are documented in the dsp_dld_spec13.pdf file from
dspgateway, in Chapter 5., Building a DSP dynamic task module.



Re: [maemo-developers] Help needed on setting up the development plarform for NOKIA N770/800

2007-01-17 Thread Raul Fernandes Herbster


Following are my questions:

1. I am a newbie to Linux. I have SUSE LINUX running in VMware Server on a
Windows PC. I would like to know if there are detailed steps on how I can
set up the Maemo development platform on SUSE LINUX.



http://www.maemo.org/platform/docs/howtos/Maemo_tutorial_bora.html#settingup

2. From the links on www.maemo.org, I understand that Maemo works on
Debian. Do I have to change my Linux distro to Debian only?



No, Debian http://www.debian.org/ or Ubuntu http://www.ubuntu.com/ are
recommended, but other fairly recent distributions should also work.

3. Please tell me which tools I have to install, in order.


The tools are listed in the tutorial (item 1).


Help is appreciated








--
Raul Fernandes Herbster
Embedded and Pervasive Computing Laboratory - embedded.dee.ufcg.edu.br
Electrical Engineering Department - DEE - www.dee.ufcg.edu.br
Electrical Engineering and Informatics Center - CEEI
Federal University of Campina Grande - UFCG - www.ufcg.edu.br
Caixa Postal 10105
58109-970 Campina Grande - PB - Brasil