Re: [maemo-developers] Optimized memory copying functions for Nokia 770
There seems to be no source for the functions in the tarball.

Tomas

Siarhei Siamashka wrote:

Hello All,

Here are the optimized memory copying functions for Nokia 770 (memset is more than twice as fast, memcpy improves by about 10-40% depending on the relative alignment of the data blocks).

http://ufo2000.sourceforge.net/files/fastmem-arm-20060312.tar.gz

These functions were created as an attempt to experiment with getting maximum memory bandwidth on Nokia 770 (powered by TI OMAP1710) and also to learn ARM assembler in the process. Getting maximum memory bandwidth utilization is needed for 2D games and probably other applications which need to process a lot of multimedia data. I'm particularly interested in getting the best performance for the Allegro game programming library (http://alleg.sourceforge.net) on Nokia 770 and that was the motivation for writing this code.

After a few experiments with reading/writing memory using a different data size for each memory access operation, it appears that using bigger chunks is much more important for writing than for reading: writing 16 bits per memory access is usually twice as fast as writing 8 bits, and 32-bit memory access is again twice as fast as 16-bit access. There is no such significant performance degradation for reading with smaller chunks, so optimizing reading seems to be less important.

After trying some other half-empirical experiments with writing to memory, it seems that the best memory bandwidth is achieved by using 16-byte burst writes aligned on a 16-byte boundary using the STM instruction. This seems to provide at least twice the memory bandwidth utilization of the standard 'memset' function on Nokia 770.

Having such fantastic results, I decided to try making some optimized functions that can serve as a replacement for the standard memset/memcpy functions. The aligned 16-byte write with the STM instruction is the core part of all these functions; all the rest of the code deals with leading/trailing unaligned data chunks.
It implements the following functions (see more detailed comments in the code):

memset8, memset16, memset32 - replacements for memset, optimized for different alignment
memcpy16, memcpy32 - replacements for memcpy, optimized for different alignment

A testing framework is included, which makes it possible to check that this code provides valid results and is also really fast. In order to run the tests, this file should be compiled as C source with the FASTMEM_ARM_TEST_FRAMEWORK macro defined.

Requirements for running this code: little endian ARM v4 compatible cpu

Results from my Nokia 770 are the following:

--- running correctness tests ---
all the correctness tests passed
--- running performance tests (memory bandwidth benchmark) ---:
memset() memory bandwidth: 121.22MB/s
memset8() memory bandwidth: 275.94MB/s
memcpy() memory bandwidth (perfectly aligned): 104.86MB/s
memcpy16() memory bandwidth (perfectly aligned): 113.98MB/s
memcpy() memory bandwidth (16-bit aligned): 70.37MB/s
memcpy16() memory bandwidth (16-bit aligned): 101.31MB/s
--- testing performance for random blocks (size 0-15 bytes) ---
memset time: 0.410
memset8 time: 0.260
--- testing performance for random blocks (size 0-511 bytes) ---
memset time: 2.360
memset8 time: 1.140

TODO:
1. implement memcpy8 function (direct replacement for memcpy)
2. provide big endian support (currently the code is little endian)
3. investigate possibilities for getting the best performance on short buffer sizes
4. better testing in real world and on different ARM based devices

I'm especially interested in getting feedback from running this code on different devices. It is quite possible that these functions are only optimal for OMAP1710, but do not provide any benefit on other devices.

Currently this code improves Allegro game programming library performance quite a lot (in my not yet finished patch), but it might also be used for SDL. It would be interesting to see whether using these functions can improve GTK performance as well.
In that case we could have a nice improvement in user interface responsiveness. As soon as a complete replacement for memcpy (memcpy8) is done, it can probably also be used as a patch for glibc to improve the performance of all programs automagically.

Waiting for feedback, suggestions and test results on other ARM devices (not only Nokia 770).

___
maemo-developers mailing list
maemo-developers@maemo.org
https://maemo.org/mailman/listinfo/maemo-developers
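The three-phase structure described in the message above (leading unaligned bytes, aligned 16-byte blocks, trailing bytes) can be sketched in portable C. This is only an illustration and not the code from fastmem-arm.h: the name sketch_memset8 is hypothetical, and the fixed-size memcpy stands in for the STM burst write of the real ARM assembler version.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical portable sketch, NOT the code from fastmem-arm.h.
 * Fills n bytes at dst with the byte value c in three phases. */
void *sketch_memset8(void *dst, int c, size_t n)
{
    uint8_t *p = dst;
    uint8_t byte = (uint8_t)c;

    /* Phase 1: single bytes until p sits on a 16-byte boundary. */
    while (n > 0 && ((uintptr_t)p & 15u) != 0) {
        *p++ = byte;
        n--;
    }

    /* Phase 2: whole 16-byte blocks at 16-byte aligned addresses.
     * The block is prepared once as four identical 32-bit words;
     * the message describes doing this step with one STM per block. */
    uint32_t block[4];
    block[0] = block[1] = block[2] = block[3] = byte * 0x01010101u;
    while (n >= 16) {
        memcpy(p, block, sizeof block);
        p += 16;
        n -= 16;
    }

    /* Phase 3: whatever is left after the last full block. */
    while (n > 0) {
        *p++ = byte;
        n--;
    }
    return dst;
}
```

Whether a compiler turns phase 2 into a burst store on a given ARM core is not guaranteed, which is presumably why the original uses inline assembler for that step.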
RE: [maemo-developers] Optimized memory copying functions for Nokia770
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of ext Tomas Frydrych
Sent: Tuesday, March 14, 2006 11:23
To: maemo-developers@maemo.org
Subject: Re: [maemo-developers] Optimized memory copying functions for Nokia770

There seems to be no source for the functions in the tarball.

The implementation is in macros in the .h file.

Best wishes,
Dirk.
Re: [maemo-developers] Optimized memory copying functions for Nokia 770
Tomas Frydrych wrote:

There seems to be no source for the functions in the tarball.

Siarhei Siamashka wrote:

Hello All, Here are the optimized memory copying functions for Nokia 770 (memset is more than twice as fast, memcpy improves by about 10-40% depending on the relative alignment of the data blocks). http://ufo2000.sourceforge.net/files/fastmem-arm-20060312.tar.gz ...

Like Dirk already replied, the implementation is in macros in the .h file. I'm sorry for not providing detailed instructions about using this tarball. Here they are:

# wget http://ufo2000.sourceforge.net/files/fastmem-arm-20060312.tar.gz
# tar -xzf fastmem-arm-20060312.tar.gz
# cd fastmem-arm

Now compile and run the test (in scratchbox using the sbrsh cpu transparency method):

# gcc -O2 -o fastmem-arm-test fastmem-arm-test.c
# ./fastmem-arm-test

If you want to use this optimized code in your programs, just add the fastmem-arm.h file to your project and the following line to your source files:

#include "fastmem-arm.h"

Now you can use these functions. They are provided as a set of macros using inline assembler, so they are all contained within the fastmem-arm.h file, which is their source. The simplest to use is 'memset8': it is a direct replacement for 'memset' and can be used instead of it to provide a huge performance boost. The functions are optimized for different alignments, for example:

uint16_t *memcpy16(uint16_t *dst, uint16_t *src, int count)

It copies only 16-bit buffers, but it can still be used for a fast copy of 16-bit pixel data (as Nokia 770 uses a 16-bit display). I can make a 'memcpy8' function later, but expect its sources to roughly double in size and become much more complicated (because of more complicated handling of leading/trailing bytes and 2 more relative alignment combinations). It will take some time.

Hope this information helps. Still waiting for feedback :)
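To illustrate how a function with the memcpy16 prototype quoted above might be used for 16-bit pixel data, here is a small sketch. It does not use fastmem-arm.h: memcpy16_fallback and blit16 are hypothetical names, the fallback simply forwards to the standard memcpy, and it assumes that count means the number of 16-bit elements rather than bytes, which the message does not state.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in with the same shape as the quoted memcpy16
 * prototype. Assumption: count is a number of 16-bit elements. */
static uint16_t *memcpy16_fallback(uint16_t *dst, const uint16_t *src, int count)
{
    if (count > 0)
        memcpy(dst, src, (size_t)count * sizeof *dst);
    return dst;
}

/* Copy a w x h rectangle of 16-bit pixels, one row per call.
 * Both pitches are given in pixels, not bytes. */
void blit16(uint16_t *dst, int dst_pitch,
            const uint16_t *src, int src_pitch, int w, int h)
{
    for (int y = 0; y < h; y++)
        memcpy16_fallback(dst + (size_t)y * dst_pitch,
                          src + (size_t)y * src_pitch, w);
}
```

On the device, the call inside the loop is where the memcpy16 macro from the tarball would go, since each row is known to be at least 16-bit aligned.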
[maemo-developers] Measuring power consumption of 770
Hi all.

We have a Nokia 770, and we want to study the effectiveness of some algorithms that reduce the amount of transferred data in order to reduce the overall energy consumption of the device. For this study, we need a way to measure the energy consumption of the device or, at least, an accurate measure of the battery charge.

Our idea was to remove the battery completely and measure the energy consumption with a multimeter through a cable connected to an electric socket. However, we noticed that the 770 does not work when the battery is not inserted, even if the device is connected to the electric plug with a cable.

Does anybody have an idea about how to make the Nokia 770 work without the battery (just with the electric cable) or how to make such a measurement?

Many thanks.

The best,
Claudio Scordino

Claudio Scordino
Computer Science Department
Ph.D. student
University of Pisa, Italy
Office: 341
Phone: +39 050 221 3137
e-mail: [EMAIL PROTECTED]
home-page: http://www.di.unipi.it/~scordino/
Re: [maemo-developers] Optimized memory copying functions for Nokia 770
Like Dirk already replied, the implementation is in macros in the .h file.

I see. That makes the comparison with memcpy somewhat unfair, since you are not actually providing replacement functions, so this would only make a difference for -O3 type optimisation (where you trade size for speed); it would be interesting to see what the performance difference is if you add the C prologue and epilogue.

BTW, you can instruct gcc to use an inlined assembler version of its memcpy and friends as well. I think -O3 includes this, but if I read bits/string.h correctly in my sbox, there are no such inlined functions on the arm, so there is certainly value in doing this.

Tomas
Re: [maemo-developers] Measuring power consumption of 770
Claudio Scordino wrote: Does anybody have an idea about how to make the Nokia 770 work without the battery (just with the electric cable) or how to make such a measurement ? When the battery is fully charged you can start measuring current in the cable since the battery is probably not used nor charged. Frantisek ___ maemo-developers mailing list maemo-developers@maemo.org https://maemo.org/mailman/listinfo/maemo-developers
Re: [maemo-developers] Optimized memory copying functions for Nokia 770
Hi,

That makes the comparison with memcpy somewhat unfair, since you are not actually providing replacement functions, so this would only make a difference for -O3 type optimisation (where you trade size for speed); it would be interesting to see what the performance difference is if you add the C prologue and epilogue.

One should also remember that inlining functions increases the code size. On trivially sized test programs this is not an issue, but in real programs it is, especially with the RAM and cache sizes that ARM has.

BTW, you can instruct gcc to use an inlined assembler version of its memcpy and friends as well. I think -O3 includes this, but if I read bits/string.h correctly in my sbox, there are no such inlined functions on the arm, so there is certainly value in doing this.

AFAIK gcc will use its own inline functions if the size is constant (it doesn't come from the C library then).

- Eero
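The remark about constant sizes can be checked with a pair of functions like the following. This is only a sketch with hypothetical names (rect16, copy_rect, copy_bytes): the first call has a size known at compile time, the second does not. Compiling with gcc -O2 -S and reading the generated assembler shows whether a call to memcpy remains in each case; the result depends on the compiler version and target, so none is claimed here.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 8-byte structure used only for this illustration. */
typedef struct { uint16_t x, y, w, h; } rect16;

/* The size argument is a compile-time constant (sizeof *dst), so the
 * compiler is free to expand the copy inline instead of calling libc. */
void copy_rect(rect16 *dst, const rect16 *src)
{
    memcpy(dst, src, sizeof *dst);
}

/* The size is only known at run time, so in general this ends up as
 * a real call into the C library's memcpy. */
void copy_bytes(void *dst, const void *src, size_t n)
{
    memcpy(dst, src, n);
}
```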
Re: [maemo-developers] Measuring power consumption of 770
On Tue, 2006-03-14 at 14:53 +0100, ext Claudio Scordino wrote:

Hi all. We have a Nokia 770, and we want to study the effectiveness of some algorithms that reduce the amount of transferred data in order to reduce the overall energy consumption of the device. For this study, we need a way to measure the energy consumption of the device or, at least, an accurate measure of the battery charge. Our idea was to remove the battery completely and measure the energy consumption with a multimeter through a cable connected to an electric socket. However, we noticed that the 770 does not work when the battery is not inserted, even if the device is connected to the electric plug with a cable. Does anybody have an idea about how to make the Nokia 770 work without the battery (just with the electric cable) or how to make such a measurement?

:-O Why don't you just wire a battery to the battery pins and connect your meters to these wires?

BTW, is there any public reference to these algorithms?

Many thanks. The best, Claudio Scordino

--
Cheers, Igor

Igor Stoppa (Nokia M - OSSO / Tampere)
Re: [maemo-developers] Optimized memory copying functions for Nokia 770
Tomas Frydrych wrote:

Like Dirk already replied, the implementation is in macros in the .h file.

I see. That makes the comparison with memcpy somewhat unfair, since you are not actually providing replacement functions, so this would only make a difference for -O3 type optimisation (where you trade size for speed); it would be interesting to see what the performance difference is if you add the C prologue and epilogue.

Memory bandwidth benchmarking is done on a 2MB memory block, so the prologue and epilogue code does not introduce any noticeable difference. I have not paid much attention to optimizing the prologue/epilogue code yet. It should make a difference on smaller buffer sizes, but it is on the TODO list.

BTW, you can instruct gcc to use an inlined assembler version of its memcpy and friends as well. I think -O3 includes this, but if I read bits/string.h correctly in my sbox, there are no such inlined functions on the arm, so there is certainly value in doing this.

Well, you have the source, so you can do your own benchmarks with -O2 or -O3 or even -O9 and post them here :)
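A bandwidth measurement of the kind discussed above can be sketched as follows. This is not the test framework from the tarball: time_fill and bandwidth_mb_per_s are hypothetical names, the timing uses the standard clock() function, and it is an assumption that one MB means 1024 * 1024 bytes in the posted results.

```c
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Convert a byte count and a duration into MB/s.
 * Assumption: 1 MB = 1024 * 1024 bytes. */
double bandwidth_mb_per_s(double bytes, double seconds)
{
    return bytes / (1024.0 * 1024.0) / seconds;
}

/* Return the CPU time in seconds spent on `reps` fills of buf.
 * size must be at least 1. Reading one byte back through a volatile
 * variable keeps the compiler from discarding the fills. */
double time_fill(unsigned char *buf, size_t size, int reps)
{
    volatile unsigned char sink = 0;
    clock_t start = clock();
    for (int i = 0; i < reps; i++) {
        memset(buf, i & 0xff, size);
        sink ^= buf[size - 1];
    }
    clock_t end = clock();
    (void)sink;
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

With a 2MB buffer and a few hundred repetitions, the fixed cost of each call (the prologue and epilogue mentioned above) is spread over about two million bytes per call, which is why it is not visible in this kind of benchmark. Passing 2.0 * 1024 * 1024 * reps and the measured time to bandwidth_mb_per_s gives a figure comparable in form to the posted ones.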
Re: [maemo-developers] Optimized memory copying functions for Nokia 770
Eero Tamminen wrote:

That makes the comparison with memcpy somewhat unfair, since you are not actually providing replacement functions, so this would only make a difference for -O3 type optimisation (where you trade size for speed); it would be interesting to see what the performance difference is if you add the C prologue and epilogue.

One should also remember that inlining functions increases the code size. On trivially sized test programs this is not an issue, but in real programs it is, especially with the RAM and cache sizes that ARM has.

Sometimes inlining makes sense, sometimes it does not. In my case (blitting code for the Allegro game programming library) it does, just quoting myself:

Also just improving glibc might not give the best results. Imagine code for blitting 16bpp bitmaps. It contains a tight loop copying pixels one line at a time. If we need to get the best performance possible, especially for small bitmaps with only a few horizontal pixels, the extra overhead caused by a memcpy function call and also the extra check for alignment (which is known to be 16-bit in this case) might make a noticeable difference. So directly inlining code from that 'memcpy16' macro will be better in this case.

By the way, I tried to search for asm optimized versions of memcpy for ARM platforms. I did not do that before because I mistakenly assumed the glibc memcpy/memset implementations to be already optimized as much as possible. It appears that there is a fast memcpy implementation in uclibc and there are many other implementations around as well. Seems like I tried to reinvent the wheel. Too bad if it turns out that spending the whole 2 days of the weekend was a useless waste of time :( Well, at least I did not try to steal someone else's code and 'copyright' it.

As I said before, my observations show that it is better to align writes on 16-byte boundaries, at least on Nokia 770.
The code I have posted is proof-of-concept code, and it shows that it is faster than the default memset/memcpy on the device. I'm going to compare my code with the uclibc implementation. If uclibc is in fact faster or has the same performance, I'll have to apologize for causing this mess and go away ashamed.

In any case, the performance of memcpy/memset on the default Nokia 770 image is far from optimal. And considering that the device is certainly not overpowered, improvements in this area would probably help. I just checked the GTK sources: memcpy is used in a lot of places, though I don't know whether it affects performance much. Is it something worth investigating by Nokia developers?
Re: [maemo-developers] How to keep app running indefinitely
On Tue, 2006-14-03 at 09:04 +0200, Kalle Valo wrote:

Steven Hill [EMAIL PROTECTED] writes:

Good. But actually disabling the idle timer is just a workaround. It seems that the real problem is the application crashing whenever a disconnect from a network happens.

I agree, but it is not clear what is happening. The messages I am seeing from my application are:

Received status_changed to DISCONNECTING notification for IAP shss
Received status_changed to IDLE for IAP shss
Segmentation fault

The OSSO_IAP_DISCONNECTED event in the iap_callback function does not appear to be reached, because it should print a brief message on entry.

Ok, so this points to a bug in libosso-ic.so (the library providing the IC API) and I'll have to investigate this more. Can you give any instructions how to reproduce it? I haven't seen this before so I suspect there's something special needed for triggering this bug. What 770 software version are you using?

Do you know where the status_changed messages are being generated?

It's printed from libosso-ic.so. To be precise from src/ic-api.c line 104 which is in debian source package osso-ic-oss.

I am running this app on (I believe) the most recent production version of the 770 (downloaded and installed using the Windows flasher - according to the Device information it is 3.2005.51-13). I have attached my network.c file and two header files. The connection code is based on the osso-ic example with a few changes and additions. Network connection is made with a call to get_connected.

Let me know if this helps...

Steve Hill

/*
 * This file is part of DayCareLogin
 *
 * Copyright (C) 2006 SH Scientific Systems
 *
 * This software is free software; you can redistribute it and/or
 * modify it under the terms of the GNU Lesser General Public License
 * as published by the Free Software Foundation; either version 2.1 of
 * the License, or (at your option) any later version.
 *
 * This software is distributed in the hope that it will be useful, but
 * WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
 * Lesser General Public License for more details.
 *
 * You should have received a copy of the GNU Lesser General Public
 * License along with this software; if not, write to the Free Software
 * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA
 * 02110-1301 USA
 *
 */

#ifndef LOGINAPP_APPDATA_H
#define LOGINAPP_APPDATA_H

#include <libosso.h>
#include "network.h"
#include <hildon-widgets/gtk-infoprint.h>
#include <hildon-widgets/hildon-app.h>
#include <sqlite.h>
#include <gtk/gtk.h>
#include <libgnomevfs/gnome-vfs.h>
#include <libgnomevfs/gnome-vfs-utils.h>

#define dbaseFileName "~/securetrak_data.db"
#define IPaddrFileName "~/defaultIP.txt"
#define XMLFileName "~/temp.xml"
#define OutXMLFileName "~/out.xml"
#define PINGTIME 3 /* Time in milliseconds between pings */

typedef struct _AppData AppData;
typedef struct _PwdData PwdData;
/*typedef struct _OutBufData OutBufData;*/
typedef struct _AppContext AppContext;
typedef struct _ChildChkWindow ChildChkWindow;
typedef struct _AttendCheck AttendCheck;
typedef struct _ThanksWindow ThanksWindow;
typedef struct _StaffWindow StaffWindow;

struct _StaffWindow {
    HildonAppView *staff_view;
    GtkWidget *Name_label;
    GtkWidget *Time_label;
    gchar *name;
    gchar *ID;
    gchar *In;
};

struct _ThanksWindow {
    HildonAppView *thankyou_view;
    GtkWidget *Comment_label;
    GtkWidget *Time_label;
};

struct _AttendCheck {
    gboolean in;            /* if TRUE, checkin in status */
    GnomeVFSHandle *xmlOut; /* Pointer to handle of file holding XML login/out info */
    gchar *gid;             /* The guardian ID as a string */
};

struct _AppContext {
    const char *host, *port;
    char iap_name[IAP_LENGTH];
    APPState appstate;
    READState readstate;
    GIOChannel *channel;
    guint read_watch, write_watch;
    gboolean server_connected;
    gboolean recent_io;
};

struct _PwdData {
    guint8 pwd_digits; /*Number of digits entered in password */
    gchar pwd[4];      /*The password string*/
};

struct _ChildChkWindow {
    HildonAppView *child_checkin_view;
    GtkWidget *Guardian_name;
    GtkWidget *Date_label;
    GtkWidget *Time_label;
    GtkTreeView *Checked_in_view;
    GtkTreeView *Checked_out_view;
    gchar *Guardian_id;
};

struct _AppData {
    HildonApp *app;                /*handle to application */
    AppContext *connection;        /* handle to the network connection context */
    HildonAppView *login_view;     /*handle to the login view */
    ChildChkWindow *child_checkin;
    ThanksWindow *thanks;
    StaffWindow *staff;
    gint Timeout_tag;              /*pointer to allow destruction of timeout callback */
    osso_context_t *osso;          /*handle to osso */
    PwdData* pwdPtr;               /*pointer to password information */
    GnomeVFSHandle *xmlOut;        /* Pointer to handle of file holding XML login/out info */
};

#endif

#include "appdata.h"
#include <osso-ic.h>
#include <libgnomevfs/gnome-vfs.h>
#include <libgnomevfs/gnome-vfs-utils.h>

extern
Re: [maemo-developers] Measuring power consumption of 770
On Tuesday 14 March 2006 16:31, Frantisek Dufka wrote:

Claudio Scordino wrote:

Does anybody have an idea about how to make the Nokia 770 work without the battery (just with the electric cable) or how to make such a measurement?

When the battery is fully charged you can start measuring current in the cable since the battery is probably not used nor charged.

It makes sense, but I want to be absolutely sure: I need an accurate measurement, so I need something better than "probably".

Thanks,
Claudio
[maemo-developers] CACAO
Hi,

The CACAO VM has executed the Knopflerfish OSGi (http://www.knopflerfish.org) test suite with one failure, the same as with Sable and Jam. All VMs use Classpath. No AWT tests were executed.

It's nice that CACAO has so many JIT ports. We haven't done any performance benchmarks yet with CACAO. It'll be nice to see CACAO and libgcj become maemo packages.

Best Regards,

--
Philippe Laporte
Software
Gatespace Telematics
Första Långgatan 18
41328 Göteborg
Sweden
Phone: +46 702 04 35 11
Fax: +46 31 24 16 50
Email: [EMAIL PROTECTED]
Re: [maemo-developers] Measuring power consumption of 770
Claudio Scordino [EMAIL PROTECTED] writes: The voltage and the current provided by the battery is much different from the values provided by an electric cable connected to a socket (220V and 50Hz here in Italy). That's why our measurement tools wouldn't work with such a small current... Umm ... any cheap multimeter will do, for both voltages. And, just as a reminder, if you can't measure small currents: the smaller one will be found at the 230V side (given the same power consumption), and you'll much more likely run into problems about what you're really measuring there. Use the battery and measure there. - Heike ___ maemo-developers mailing list maemo-developers@maemo.org https://maemo.org/mailman/listinfo/maemo-developers
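The point about the smaller current being on the mains side follows from I = P / V. A tiny sketch makes the difference concrete; the function name current_ma is hypothetical, and the 1 W draw and the 3.7 V battery voltage used in the note below are assumed round numbers for illustration, not measurements of a Nokia 770. Charger conversion losses are ignored.

```c
/* Current in milliamperes drawn at a given power and voltage,
 * using I = P / V. Losses in the charger are ignored. */
double current_ma(double power_w, double voltage_v)
{
    return power_w / voltage_v * 1000.0;
}
```

For an assumed 1 W load, current_ma(1.0, 3.7) is about 270 mA on the battery side, while current_ma(1.0, 230.0) is only about 4.3 mA on the mains side, which is much harder to measure accurately with a cheap meter.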
Re: [maemo-developers] Question about intercepting the HOME button
You are right, maemo-af-desktop handles the key. I have already written a patch that lets you switch off the home button by sending a D-Bus message to maemo-af-desktop. But you have to create a new rootfs and flash it to your Nokia. If you need the patch, send me an email. BTW, the patch is for the current Internet Tablet 2005 version of maemo.

Hi maemo hackers.

I want to handle the HOME button event and CONSUME it. I tried hacking HildonAppView to not register a gtk key event snooper to handle the HOME key (which seems to be an alias for GDK_F5). This didn't work. The app no longer handles the HOME key event, but some other process does. It seems the maemo_af_desktop process is catching this event somewhere else and doing its thing.

So my question is: where is the code in maemo_af_desktop that catches the GDK_F5 and turns it into some kind of message, so that I can comment it out?

thanks.
-re
Re: [maemo-developers] Question about intercepting the HOME button
I am using the 1.1 version of the maemo SDK release. Do I need something even newer, from svn?

-re

Kalle Vahlman wrote:

The key snooper was installed in hildon-home/hildon-home-main.c:hildon_home_main(), but it was removed at

2005-08-30 Karoliina Salminen [EMAIL PROTECTED]
* Patched bug # 9179
* hildon-home/hildon-home-main.c: Replaced keysnooper with key press/release event handlers for the main window. This fixes problems with menu key handling.

so in the 1.1 SDK at least it's gone (AFAICT). Simply upgrading your SDK should be enough. Not sure about the official images though.
Re: [maemo-developers] Question about intercepting the HOME button
Yes, I know that's how it works, which is why I'm asking how to hack it. And yes, I realize I want to do something evil, perhaps. But it's for my own personal app and amusement.

-re

Tapani Pälli wrote:

HOME-key is special. HOME button should always take you to home whatever the situation is (even if input is being grabbed by a menu). Therefore this key is handled in lower levels and Gtk application cannot do things with it.
Re: [maemo-developers] Question about intercepting the HOME button
On 3/15/06, Ramiro Estrugo [EMAIL PROTECTED] wrote:

So my question is: where is the code in maemo_af_desktop that catches the GDK_F5 and turns it into some kind of message, so that I can comment it out?

The key snooper was installed in hildon-home/hildon-home-main.c:hildon_home_main(), but it was removed at

2005-08-30 Karoliina Salminen [EMAIL PROTECTED]
* Patched bug # 9179
* hildon-home/hildon-home-main.c: Replaced keysnooper with key press/release event handlers for the main window. This fixes problems with menu key handling.

so in the 1.1 SDK at least it's gone (AFAICT). Simply upgrading your SDK should be enough. Not sure about the official images though.

--
Kalle Vahlman, [EMAIL PROTECTED]
Powered by http://movial.fi
Interesting stuff at http://syslog.movial.fi