Re: Opportunity for speedup
Cool! Bobby Powers wrote:

On Wed, Feb 11, 2009 at 2:01 AM, Mitch Bradley w...@laptop.org wrote: I just measured the time taken by the boot animation by the simple technique of renaming /usr/bin/rhgb-client so the initscripts can't find it.

how did you measure exactly? stopwatch? I'd like to recreate the tests. It sounds like you did this on a freshly flashed system?

Yes on both counts. Stopwatch on a freshly flashed os7.img. With the boot animation, OS build 7 (an older 8.2.1 candidate) takes 60 seconds from the first dot (indicating the OFW transfer to Linux) to the Sugar prompt for your name. Without it, 53 seconds. I repeated the test several times with consistent results. Clearly, it should be possible to display that amount of information in much less than 7 seconds. The boot animation code is in the OLPC domain, not the upstream domain, so replacing it should be relatively free of upstream politics. So if anybody is interested in implementing a relatively simple boot-time speedup, I offer this as low-hanging fruit. I suggest 1 second (the differential time between the animation and no-animation cases) as a reasonable target goal, assuming images of the complexity of the current ones. Arbitrary full-screen graphics might require more time, but speeding up the baseline case is a good starting point. Go wild.

So I've taken a first cut at this, implemented with the following design considerations (mostly from a conversation with Mitch):

- The Python client/server was reimplemented as several standalone C programs (boot-anim-start, boot-anim-client, and some cleanup in boot-anim-stop).
- A client and server were used before because there is state information that needs to be saved: we need to keep track of where in the animation we are. We can track this using offscreen memory in the framebuffer (it's 16 MB in size, and only the first ~2 MB are used for the onscreen graphics; my terminology might be off here). For state we really only need to keep track of two integers: one for the current frame number and another to store the offset of the next diff to apply.
- On startup we load an initial image into the framebuffer (the first 1200*900*2 bytes, since we use 2 bytes per pixel for color information), and then load in a series of changes to the framebuffer image (300 KB). This takes the form of a series of diffs.
- For each update (a valid call to boot-anim-client) we apply the next diff in the series to the onscreen image and update our state information.
- After applying the last diff we have (the end of the animation series), we freeze the DCON. (When I first attempted to freeze the DCON when z-boot-anim-stop was called, it left the screen in an inconsistent state, I believe because of X startup.)
- It's designed to be as light as possible, using syscalls instead of libc functions as much as possible (the only thing we use libc for is string comparison, which could be replaced with a local function). While it's written like this, I haven't worked on cutting down the linking (I need some guidance for that).

To reduce the execution footprint, you could try linking it against dietlibc, http://www.fefe.de/dietlibc/ . I'm not sure just how much time that would save; maybe it wouldn't be significant. But it's worth a try.

Comments and suggestions welcome :) I'd appreciate any testing as well as any code review. (The shutdown image appears to be broken, FYI. I haven't looked at that in depth; it's probably a one-line fix.)

RPMs (built with mock) are available at http://dev.laptop.org/~bobbyp/bootanim/ and source is available at http://dev.laptop.org/git?p=users/bobbyp/bootanim

-Bobby

___ Devel mailing list Devel@lists.laptop.org http://lists.laptop.org/listinfo/devel
Re: Guidance sought on collaboration techniques
On Thu, Feb 19, 2009 at 05:44, Gary C Martin g...@garycmartin.com wrote: Actually that raises a question: did Gadget make it into the 8.2.1 build? Or is this still a future maybe? I take it it is/would be a future Sugar feature/dependency?

It was not included in 8.2.1. It's implemented in Sucrose 0.83.x.

Regards, Morgan
Re: [Sugar-devel] [PATCH] webactivity: seed the XS cookie at startup
Martin Langhoff wrote: On Thu, Feb 19, 2009 at 7:20 PM, Simon Schampijer si...@schampijer.de wrote: Martin Langhoff wrote: On Tue, Feb 17, 2009 at 11:03 AM, Simon Schampijer si...@schampijer.de wrote: Well, your call - using the schoolserver url then?

The fqdn from the backup server or jabber server. Either will do until we fix the registration stuff.

Please state exactly which one you want - I want this to be your call.

Ok. Both are wrong, and I'm good at being stoned for picking a wrong setting. The jabber server is at least recorded as an fqdn, so let's use that. The backup server is recorded as u...@fqdn:path . And let's make sure we fix registration and these values in the next feature work cycle :-) For the time being, ejabberd, backup and moodle are quite tightly bound together -- this is not a good long-term thing, but a fact of life in the short term. cheers, m

Ok, I pushed to browse git master now. Do you have a way to test whether it is working fine against a schoolserver, or should I create you the 0.82 xo bundle? Cheers, Simon
Re: Application crashing with Bad Window error only on OLPC 767 build
Some more progress. I had used libsugarize.so from this link http://www.catmoran.com/olpc/libsugarize.so to sugarise my activity. When I first started developing the activity, it wouldn't run on the XO unless I preloaded this .so. On a hunch I removed this preload, and now my application launches on the XO when run from the activity circle. It works fine after this except for when I quit my application. If I don't preload this .so, then when I close my application it doesn't close at once; it goes into the activity-icon-blinking state for about half a minute before it closes. When I had preloaded the .so on the older Sugar builds, my application would quit almost immediately after closing, and I wouldn't get that blinking activity icon. Can anybody help me figure out how to solve this? Thanks, jbsp72

On Wed, Feb 18, 2009 at 8:49 PM, shivaprasad javali jbs...@gmail.com wrote: Finally some more information about what is happening. The application that I am running previously didn't use any UI packages and worked directly with windows through X calls. I had to add a xulrunner-based browser to it, which accepted only a gtk window and couldn't work with an X window directly. So I added the code for just the browser to use gtk. I switched to the old code (pre-gtk) and it works fine on the XO with the 767 sugar build. Then I tried adding the gtk_init() call to the initialization sequence and it crashed immediately. I guess the problem has something to do with both gtk and X windows working together. (Although I don't get why the problem should persist only when I launch my app from the activity bar and not when I run it from the terminal.) Can anybody help me figure out what the problem might be? Thanks, jbsp72

On Wed, Feb 18, 2009 at 5:49 AM, S Page i...@skierpage.com wrote: (private response, but if you learn more re-send to the OLPC devel list) Do you see anything in the logs after enabling warnings? See http://wiki.laptop.org/go/Attaching_Sugar_logs_to_tickets Then there's general debugging on Linux using strace, gdb, etc. I don't see much on the wiki about that.

shivaprasad javali wrote: I am trying to port an application to OLPC. It has a xulrunner-based browser which draws into a gtk window. It runs perfectly on OLPC builds 708 and 711, but when I try to run it on the 767 build it crashes with an X window system error with error code 3 (BadWindow). Also, the crash occurs only when I launch the activity from the Activity bar. If I open the Terminal activity and run it from there, it works fine. It crashes only when I run it from the Activity bar and on the 767 build. Could anybody tell me what changed from build 711 to 767 which made my application crash? Thanks, jbsp72
Re: Opportunity for speedup
mitch wrote: Bobby Powers wrote: - It's designed to be as light as possible, using syscalls instead of libc functions as much as possible (the only thing we use libc for is string comparison, which could be replaced with a local function). While it's written like this, I haven't worked on cutting down the linking (I need some guidance for that).

great stuff bobby -- i'm happy to help with any remaining details if you like.

To reduce the execution footprint, you could try linking it against dietlibc, http://www.fefe.de/dietlibc/ . I'm not sure just how much time that would save; maybe it wouldn't be significant. But it's worth a try.

my gut says that using the already-present glibc shared lib will be cheaper than introducing a new library, even if it's small and static. but you're right, it's worth a try.

and source is avail at http://dev.laptop.org/git?p=users/bobbyp/bootanim

i took a very brief look. as a favor to future maintainers, i think you could either a) merge boot-anim-start/client/stop and ul-warning into a single executable (much of the code is the same) or b) extract the common parts (e.g. initial_setup(), and the code that mmaps the framebuffer) into a boot-anim-utils.c or something like that. (and while i'm all for reducing dependencies, the XO has so much else going on that i don't think using string libraries or even stdio will affect things much in the greater scheme of things. so i'd have used fputs rather than write(2,...) for errors. but i understand the intent.)

paul

=- paul fox, p...@laptop.org
Re: Opportunity for speedup
I just measured the time taken by the boot animation by the simple technique of renaming /usr/bin/rhgb-client so the initscripts can't find it.

how did you measure exactly? stopwatch? I'd like to recreate the tests. It sounds like you did this on a freshly flashed system?

There were a number of tools used by some of the Fedora devs for boot speed when developing plymouth to replace the old RHGB system. It would be interesting to try plymouth in this (both text and graphical) to see what the comparison is like. It might be possible to get a lot of the wins that Fedora got with very little work, as plymouth has a full plugin system, so it shouldn't be hard to add the OLPC boot logos in.

Peter
Re: Guidance sought on collaboration techniques
Wade, I found out about schoolserver.media.mit.edu from the Wiki, and I got it set up successfully. It looks like a lot of stuff is better documented on the Wiki than it was in my pre-basement-flood days. I'll try to look there first in the future. I'm going to try to get Hello Mesh working this weekend and take it from there. Thanks again for your help.

James Simmons
Re: Guidance sought on collaboration techniques
Gary, I only have one XO. I do all my development work on a couple of Fedora 10 boxes running Sugar. These two boxes are connected to the same router using Ethernet cables. It has been my experience that the only way they can collaborate is through the jabber server, and that makes me think that the mesh networking is a function of the wireless networking built into the XO, and that two computers wired to the same router can't collaborate that way. I'd be interested to know if I'm right. Thanks, James Simmons

Gary C Martin wrote: Hmm, interesting. I've had no problems here with 3 XOs all seeing each other, either via Mesh, or the single AP I have here. For most of the last ~4 months I've usually had them all with a blank jabber server setting and have been test-collaborating locally. Actually, it's much more testable/repeatable now that the jabber server is not set by default; the default always seemed to be off-line, broken due to server load, or more recently, running some test Gadget build that prevented you from seeing anyone else. Actually that raises a question: did Gadget make it into the 8.2.1 build? Or is this still a future maybe? I take it it is/would be a future Sugar feature/dependency?

--Gary
Re: [IAEP] RFC: Supporting olpc-ish Deployments - Draft 1
Hi all. Thank you Michael and Pia for this. I have to say that although these questions and concerns are indeed needed, they only cover one side of the story, asking what would be best for OLPC to give or what resources OLPC can give. This is too centered on OLPC: deployments need more independence, the deployments run by governments need to have straightforward relations with the volunteers, and volunteer-driven small deployments need more independence to manage their own resources and to address and resolve the concerns and questions stated here. If the deployments manage to have more independence from OLPC central, I can assure you that OLPC resources wouldn't be so needed as they are now, and could be focused on other tasks. SugarLabs is taking this approach for its deployments: federated Local Labs with some common ground rules but with the maximum possible independence. This independence guarantees real empowerment and distribution of tasks and efforts; it's not only what SugarLabs can give to Local Labs but also what Local Labs can give to SugarLabs. In my opinion and experience this is the best way that SugarLabs can support deployments. For more info: http://sugarlabs.org/go/DeploymentTeam http://sugarlabs.org/go/Local_Labs

Rafael Ortiz

On Thu, Feb 19, 2009 at 12:14 AM, Michael Stone mich...@laptop.org wrote: Folks, Pia Waugh (greebo) and I have spent a fair bit of time in the last month talking and thinking about what we can do in the next few months to best support present and future olpc-ish deployments (typically with XOs, typically running Sugar), and we'd like to share some of our thoughts with you. These thoughts are presented in draft form in order to solicit your feedback, which is eagerly awaited and will likely be incorporated into future drafts. Regards, Michael -- 1.
Motivation

We think that many deployment-related needs are not being adequately met, particularly in the areas of:

* knowledge-sharing and the ability to benefit from others' mistakes.
* volume and quality of aid available for conducting deployments.
* bandwidth, latency, and SNR of channels to other communities which work with deployments; e.g. other deployments, educators, software teams, distributions, researchers, consultants, and volunteers.

2. Use Cases

We're particularly interested in addressing these situations and needs:

D1) I'm running a deployment...
    a) ...and I need help! Who shares my problem? Who can help me?
    b) ...and I want to do more! Who/what can I work with?
    c) ...and I want to share! Where do I go? What is needed?
D2) I need to talk to people deploying XOs.
    a) Where do I go?
    b) What can I expect?
D3) I'm working on a deployment plan.
    a) Where do I start?
    b) What have I forgotten?
    c) Am I using best practices?
    d) Can I get a review?
D4) I need to know...
    a) real deployment numbers, b) maps, c) examples, d) photos, e) techniques, f) contact info, ...

3. Existing Resources for Use Cases

Before we started, there were three basic mechanisms for addressing these use cases:

1) read the Deployment Guide and the Deployments page(s):
   http://wiki.laptop.org/go/Deployment_Guide
   http://wiki.laptop.org/go/Deployments
   http://wiki.laptop.org/go/Deployments_support
2) ask olpc-techsupp...@laptop.org. (Only available to large deployments?)
3) poke people on IRC.

These three mechanisms are problematic because none of them can be relied upon, alone or in combination, to adequately address any of the use cases listed above.

4.
New Resources for Use Cases

So far, we've created two new resources which help bridge the gap:

4) weekly deployment support meetings, with minutes at
   http://wiki.laptop.org/go/Deployment_meetings#Meeting_notes
   which get aggregated each month into
5) a Deployment FAQ, http://wiki.laptop.org/go/Deployment_FAQ
   similar in form and spirit to the G1G1 http://wiki.laptop.org/go/Support_FAQ

We think that these two new resources, in combination with the pre-existing resources, will help us provide the next level of support for our use cases.

5. Projects

We presently have several ongoing (interrelated) projects which you might like to become (more deeply) involved in:

P1) Keep improving the deployment support meetings -- so far, so good! -- your participation in these meetings is our best current source of new content for the Deployment FAQ and for...
P2) Organize material captured in the meetings as FAQ entries -- the meeting minutes are chronological, which is good for minutes, but not particularly
Re: Guidance sought on collaboration techniques
On 19.02.2009, at 16:32, James Simmons wrote: Gary, I only have one XO. I do all my development work on a couple of Fedora 10 boxes running Sugar. These two boxes are connected to the same router using Ethernet cables. It has been my experience that the only way they can collaborate is through the jabber server, and that makes me think that the mesh networking is a function of the wireless networking built into the XO and that two computers wired to the same router can't collaborate that way. I'd be interested to know if I'm right.

I'm positive collab does work in a LAN (non-mesh) without a server; that's how I usually test. But I also remember having had problems at times. For example, there are wireless access points that appear to suppress some inter-client communications.

- Bert -
Latest on Read Etexts and espeak
Tony, It looks like every day I make some progress. I finally figured out that the reason text-to-speech was not pausing between sentences and paragraphs was that I was stripping all the punctuation out of the text before sending it on to espeak. I have to create an XML document marked up with SSIP tags and pass that to speech-dispatcher instead of the raw text. The reason for this is that SD needs these tags so it can do a callback into my code before it speaks a word. That's the only way I can do word highlighting. In the process of creating this marked-up page I stripped out all the punctuation. Even my wife thought that was a stupid thing to do.

My highlighting still sometimes lags behind the words being spoken, especially when there is a string of short words. It tends to catch up on the longer words. I'm going to try to make the code that highlights the words more efficient. It may be doing some things it doesn't need to do.

This morning I noticed that if speech-dispatcher -d is running then mplayer can't open /dev/dsp, so I don't get any sound when I play a movie. spd-say continues to work. I don't know if this is an SD problem or an mplayer problem. I hope this doesn't mean that if we use SD on the XO then Tam Tam doesn't work. I have posted a question to the SD mailing list on this.

I should have a much improved Read Etexts published in the usual place some time this weekend. Thanks, James Simmons
Re: Application crashing with Bad Window error only on OLPC 767 build
Hey there, I think you're in fairly undocumented territory trying to develop a real activity using libsugarize.so. But to your credit, you seem to have a pretty good understanding of how the X protocol works (beyond mine anyway!).

Have you seen this page before? It describes how Sugar interacts with activities at the X window level and may help you remove the need for libsugarize.so. http://wiki.laptop.org/go/Low-level_Activity_API

Also, for debugging X errors like the Bad Window error in C programs, you might gain some use from this page. It will allow you to synchronize the X protocol and then get a stack trace at the exact location which generates the error. http://www.rahul.net/kenton/perrors.html

Personally I have seen errors like this from PyGTK, but it usually has to do with using some API outside the normal boundaries - like a Drawable blit with a negative area rectangle, for example.

Hope this helps! Wade
Re: Opportunity for speedup
I'd suggest just uncompressing the various image files and re-timing as a start. The initial implementation was uncompressed, but people complained about space usage on the emulator images (which are uncompressed). The current code supports both uncompressed and compressed image formats.

For uncompressed images, putting the bits on the screen is an mmap and memcpy, so I can't imagine any implementation being faster than that. (It's possible, of course, that what's stealing CPU is the shell's invocation of the client program; recoding just that little part in C should be trivial, since it does nothing but write to a socket, IIRC.)

Anyway, further benchmarking of the current implementation is probably worthwhile before a complete reimplementation is called for. But if you want to reimplement it from scratch, go nuts. --scott

-- ( http://cscott.net/ )
Re: Opportunity for speedup
C. Scott Ananian wrote: Anyway, further benchmarking of the current implementation is probably worthwhile before a complete reimplementation is called for. But if you want to reimplement it from scratch, go nuts.

It has already been reimplemented. The disk I/O time for 26 full-screen images is several seconds.
Re: Opportunity for speedup
On Thu, Feb 19, 2009 at 1:22 PM, C. Scott Ananian csc...@laptop.org wrote: For uncompressed images, putting the bits on the screen is an mmap and memcpy, so I can't imagine any implementation being faster than that.

I implemented an RLE compressor specifically for these 16-bit image files the last time this question came up. This can certainly be faster than memcpy, since we are talking about memory performance. GZip+RLE also beats plain GZip on size, again due to the contents of the images.

http://wadeb.com/rle.c
http://wadeb.com/unrle.c

-Wade
Re: Opportunity for speedup
On Thu, Feb 19, 2009 at 1:22 PM, C. Scott Ananian csc...@laptop.org wrote: Anyway, further benchmarking of the current implementation is probably worthwhile before a complete reimplementation is called for. But if you want to reimplement it from scratch, go nuts.

I already re-implemented it - it was a fun optimization project and introduction to lower-level systems programming. Using Mitch's D565 format to keep track of only the parts of the image that change cut down the implementation size significantly. It's now only 2 uncompressed images (frame00.565 and ul-warning.565), and 300 KB of differences for the animation sequence.

I understand reads from video memory (which I think is what the framebuffer is?) can be extremely slow, so it could turn out faster to open a D565 file, mmap it, and memcpy the several tens of kilobytes of differences to the framebuffer than it is to read those differences from one part of video memory to another. This is where benchmarking should give some clearer answers.

yours, Bobby
Re: Opportunity for speedup
2009/2/19 Wade Brainerd wad...@gmail.com: I implemented an RLE compressor specifically for these 16bit image files the last time this question came up. This can certainly be faster than memcpy since we are talking memory performance.

Can you explain this? I don't think I have enough knowledge to evaluate your claim.

bobby
Re: Opportunity for speedup
RLE (run length encoding) compresses sequences of identical pixels (runs) as count/value pairs, so abbccc would be stored as 1a 2b 3c. The decompressor looks like:

    while (cur < end) {
        unsigned short count = *cur++;
        unsigned short value = *cur++;
        while (count--)
            *dest++ = value;
    }

This can be faster than memcpy because you are reading significantly less memory than you would with memcpy, thus fewer cache misses are incurred. Because the startup images are mostly spans of solid colors, this kind of compression works very well. If that were not the case, say if there were a left-to-right gradient in the background, RLE would probably make things worse, thus you have to be careful when choosing it. But the smaller size on disk and in memory would probably improve performance in other ways as well.

Best, Wade

On Thu, Feb 19, 2009 at 1:49 PM, Bobby Powers bobbypow...@gmail.com wrote: Can you explain this? I don't think I have enough knowledge to evaluate your claim.
bobby
Re: Opportunity for speedup
Oh, and you can feed one of the 565 files through my 'rle.c' program to see the compression ratio firsthand.

-Wade
Re: Opportunity for speedup
On Thu, Feb 19, 2009 at 1:56 PM, Wade Brainerd wad...@gmail.com wrote: RLE (run length encoding) compresses sequences of identical pixels (runs) as value/count pairs. So abbccc would be stored as 1a 2b 3c. The decompressor looks like: while (cur < end) { unsigned short count = *cur++; unsigned short value = *cur++; while (count--) *dest++ = value; } This can be faster than memcpy because you are reading significantly less memory than you would with memcpy, thus fewer cache misses are incurred. Because the startup images are mostly spans of solid color, this kind of compression works very well. If that were not the case, say if there were a left-to-right gradient in the background, RLE would probably make things worse, thus you have to be careful when choosing it. But the smaller size on disk and in memory would probably improve performance in other ways as well. Best, Wade thanks, that makes sense ___ Devel mailing list Devel@lists.laptop.org http://lists.laptop.org/listinfo/devel
Re: [Server-devel] automatic olpc-update from XS
On Fri, Feb 20, 2009 at 1:34 AM, Daniel Drake d...@laptop.org wrote: Which bits did you think were missing? Quite a few bits. Happy to discuss over the phone or to draft more formal notes (that'd take a bit longer). Michael seems keen on working with us on developing this into an XS feature. I am not sure when I'll have time to attack this head-on but it's definitely something that I'll be working on. Right - very keen on catching up with Michael on plans (hoping we can aim for something that scratches your immediate itch, and also moves the game forward...) cheers, m -- martin.langh...@gmail.com mar...@laptop.org -- School Server Architect - ask interesting questions - don't get distracted with shiny stuff - working code first - http://wiki.laptop.org/go/User:Martinlanghoff ___ Server-devel mailing list server-de...@lists.laptop.org http://lists.laptop.org/listinfo/server-devel
Re: Opportunity for speedup
Bobby Powers wrote: On Thu, Feb 19, 2009 at 1:56 PM, Wade Brainerd wad...@gmail.com wrote: RLE (run length encoding) compresses sequences of identical pixels (runs) as value/count pairs. So abbccc would be stored as 1a 2b 3c. The decompressor looks like: while (cur < end) { unsigned short count = *cur++; unsigned short value = *cur++; while (count--) *dest++ = value; } This can be faster than memcpy because you are reading significantly less memory than you would with memcpy, thus fewer cache misses are incurred. Because the startup images are mostly spans of solid color, this kind of compression works very well. If that were not the case, say if there were a left-to-right gradient in the background, RLE would probably make things worse, thus you have to be careful when choosing it. But the smaller size on disk and in memory would probably improve performance in other ways as well. Best, Wade thanks, that makes sense We are already getting some portion of the possible compression by doing the I-frame-style delta encoding of the second and subsequent frames, but the RLE is still of some use. It does a good job of shrinking the first frame, and it halves the size of the delta wad. The first-frame shrink could also be accomplished by the trick of assuming an initial solid background and representing the first frame as a delta from that. In either case, it looks like RLE decoding might be a nice addition, as it reduces the size of the frames on disk from 1.2 MB to about 140 KB. ___ Devel mailing list Devel@lists.laptop.org http://lists.laptop.org/listinfo/devel
Re: Opportunity for speedup
da...@lang.hm wrote: if you have the diff of the images, do you need to read from the framebuffer at all? since you know what you put there, and know what you want to change, can't you just write your changed information to the right place? The framebuffer in this case is serving as persistent shared memory, thus avoiding the extra complexity of a client/server architecture to maintain the sequencing state. The extremely tiny (4K, one memory page) client program initially reads the first frame into the on-screen framebuf and the delta set into off-screen framebuffer memory. On subsequent invocations, the client copies another delta into the on-screen framebuf. If it is statically linked and uses only direct syscalls, the exec() overhead is minimal - no shell process instantiation, no script startup, no ld.so invocations, no mapping in shared libraries, no relocation. ___ Devel mailing list Devel@lists.laptop.org http://lists.laptop.org/listinfo/devel
Re: Opportunity for speedup
On Thu, 19 Feb 2009, Mitch Bradley wrote: da...@lang.hm wrote: if you have the diff of the images, do you need to read from the framebuffer at all? since you know what you put there, and know what you want to change, can't you just write your changed information to the right place? The framebuffer in this case is serving as persistent shared memory, thus avoiding the extra complexity of a client/server architecture to maintain the sequencing state. The extremely tiny (4K, one memory page) client program initially reads the first frame into the on-screen framebuf and the delta set into off-screen framebuffer memory. On subsequent invocations, the client copies another delta into the on-screen framebuf. If it is statically linked and uses only direct syscalls, the exec() overhead is minimal - no shell process instantiation, no script startup, no ld.so invocations, no mapping in shared libraries, no relocation. right, but why read the current framebuffer? you don't touch most of it, you aren't going to do anything different based on what's there (you are just going to overlay your new info there) so all you really need to do is to write the parts that need to change. David Lang ___ Devel mailing list Devel@lists.laptop.org http://lists.laptop.org/listinfo/devel
Re: Price point plus sales to individuals
On Feb 18, 2009 at 10:24 PM, John Watlington wrote: I don't see how a non-profit can do this, as it requires financing at risk, and staffing for uncertain demand. Let me know when you have the capital. Absolutely true, and I'm not sure the capital would be enough. It's difficult to imagine OLPC entering a retail channel, directly or indirectly. Perhaps I have misunderstood the intent, but that's what it seems like when we talk about individuals buying small volumes, perhaps simply to tinker with a cool machine. In the immediate future I don't see overwhelming evidence that OLPC should devote resources to satisfy every volume demand in every channel. The outcry over discontinuing Change The World was far greater than the willingness of people to put up money. Very often those willing to pay were small-volume resellers. OLPC is better off focused on its engineering, advocacy and implementation efforts AND supporting accompanying networks of olpc-phile communities. Running an effective developer program is fundamental to all the above. r. ___ Devel mailing list Devel@lists.laptop.org http://lists.laptop.org/listinfo/devel
Re: [Server-devel] automatic olpc-update from XS
2009/2/19 Martin Langhoff martin.langh...@gmail.com: On Fri, Feb 20, 2009 at 1:34 AM, Daniel Drake d...@laptop.org wrote: Which bits did you think were missing? Quite a few bits. Happy to discuss over the phone or to draft more formal notes (that'd take a bit longer). More formal notes would be good for sharing with others... I have hit one problem. olpc-update-query expects the response from the server to be signed by the OLPC private OATS key, so this shoots down the ability of running an antitheft server anywhere except OLPC. (unless we modify the client code) Well, we now have the ability to create a Paraguayan OATS key instead, but if we want to run this on the XS then it means we have to ship our private OATS key on all the XSes, which means that it is not very private. We could run one central paraguayan antitheft server which the XOs contact over the internet, but I was hoping to keep everything local. Any ideas? Daniel ___ Server-devel mailing list server-de...@lists.laptop.org http://lists.laptop.org/listinfo/server-devel
Re: Opportunity for speedup
da...@lang.hm wrote: right, but why read the current framebuffer? you don't touch most of it, you aren't going to do anything different based on what's there (you are just going to overlay your new info there) so all you really need to do is to write the parts that need to change. You don't read the on-screen part of the framebuffer. You copy delta data from off-screen framebuffer memory to portions of the on-screen framebuffer memory. On-screen vs. off-screen is irrelevant to the speed - read access to the memory that is reserved for display controller use is similarly slow in both cases. But considering that the delta data is small compared to the full images, it's worth it to store the deltas there, thus avoiding the overhead of the other alternatives for maintaining the context from one call to the next. Those alternatives are: a) Server process maintains context on behalf of a repeatedly-executed client process. This incurs the complexity of client-server architectures - setup/teardown, library overhead, interprocess communication, scheduling. b) Client program reads new delta data from a file on each invocation. This incurs the filesystem overhead of opening a file on each invocation (in comparison, the off-screen framebuffer solution requires only a single open() and a single read() on the first invocation). c) Client program reads the delta set into a shared memory segment and then reattaches to that segment on subsequent invocations. This is similar to the framebuffer approach except that it uses faster memory for the persistent storage. It might be a win from a speed perspective, but it is a bit more complex, requiring the program to deal with two memory objects instead of just one. The total amount of time that it could possibly save is about 50 ms, since that is the time it takes to read the delta set from the off-screen framebuffer.
And if we use the RLE encoding suggested by Wade, the amount of off-screen data is halved, so the best-case savings are reduced to 25 ms total. ___ Devel mailing list Devel@lists.laptop.org http://lists.laptop.org/listinfo/devel
Re: Price point plus sales to individuals
On Thu, 19 Feb 2009, Robert D. Fadel wrote: On Feb 18, 2009 at 10:24 PM, John Watlington wrote: I don't see how a non-profit can do this, as it requires financing at risk, and staffing for uncertain demand. Let me know when you have the capital. Absolutely true, and I'm not sure the capital would be enough. It's difficult to imagine OLPC entering a retail channel, directly or indirectly. Perhaps I have misunderstood the intent, but that's what it seems like when we talk about individuals buying small volumes, perhaps simply to tinker with a cool machine. In the immediate future I don't see overwhelming evidence that OLPC should devote resources to satisfy every volume demand in every channel. The outcry over discontinuing Change The World was far greater than the willingness of people to put up money. Very often those willing to pay were small-volume resellers. so you say that people need to pony up capital for this, but then dismiss those who were doing so. please pick one argument and stick with it. OLPC is better off focused on its engineering, advocacy and implementation efforts AND supporting accompanying networks of olpc-phile communities. Running an effective developer program is fundamental to all the above. if the systems aren't available, the advocacy will drop off. there would be no implementation effort. the engineering work has been completed, so no more effort is needed there. the only cost left is support. a good chunk of that can be absorbed by those 'small volume resellers' you are talking about, a good chunk is dealt with on the mailing lists (in large part by volunteers), but there is some that will end up on the OLPC plate. however, eliminating entire markets (with the advocates it would generate, the lessons that would be learned, and the wider experimentation that would happen) is a very high cost. as someone who has paid for 6 of these machines (plus the 6 'give 1' machines), I am _very_ disappointed to see this cut off.
what's worse is the unofficial "we'll end this any second" without any formal announcement (even a week or so later). David Lang ___ Devel mailing list Devel@lists.laptop.org http://lists.laptop.org/listinfo/devel
Re: Opportunity for speedup
On Thu, 19 Feb 2009, Mitch Bradley wrote: da...@lang.hm wrote: right, but why read the current framebuffer? you don't touch most of it, you aren't going to do anything different based on what's there (you are just going to overlay your new info there) so all you really need to do is to write the parts that need to change. You don't read the on-screen part of the framebuffer. You copy delta data from off-screen framebuffer memory to portions of the on-screen framebuffer memory. On-screen vs. off-screen is irrelevant to the speed - read access to the memory that is reserved for display controller use is similarly slow in both cases. But considering that the delta data is small compared to the full images, it's worth it to store the deltas there, thus avoiding the overhead of the other alternatives for maintaining the context from one call to the next. Those alternatives are: a) Server process maintains context on behalf of a repeatedly-executed client process. This incurs the complexity of client-server architectures - setup/teardown, library overhead, interprocess communication, scheduling. b) Client program reads new delta data from a file on each invocation. This incurs the filesystem overhead of opening a file on each invocation (in comparison, the off-screen framebuffer solution requires only a single open() and a single read() on the first invocation). c) Client program reads the delta set into a shared memory segment and then reattaches to that segment on subsequent invocations. This is similar to the framebuffer approach except that it uses faster memory for the persistent storage. It might be a win from a speed perspective, but it is a bit more complex, requiring the program to deal with two memory objects instead of just one. The total amount of time that it could possibly save is about 50 ms, since that is the time it takes to read the delta set from the off-screen framebuffer.
And if we use the RLE encoding suggested by Wade, the amount of off-screen data is halved, so the best-case savings are reduced to 25 ms total. d) compile the delta set into the client program. does this really need to be a general-purpose solution here? or is this really only used for this specific purpose? David Lang ___ Devel mailing list Devel@lists.laptop.org http://lists.laptop.org/listinfo/devel
Re: Opportunity for speedup
da...@lang.hm wrote: d) compile the delta set into the client program. That works, but 1) It requires more work from the VM system on each invocation of the client program, which is now 1.x MB instead of 4K. 2) If a deployment wants to change the image set, it needs a compiler toolchain instead of a (small) delta-encoding program. Speed-wise, (d) might be a wash, or perhaps even a slight win. It depends on how efficient the VM system is, and the effectiveness of the filesystem buffer cache at preventing re-reads of the client process image (paging directly from JFFS2 is not possible). The framebuffer hack avoids numerous assumptions about the effectiveness of clever but complex subsystems (e.g. the VM system, the filesystem buffer cache, the shared library mechanisms, zlib, JFFS2 compression, ...). ___ Devel mailing list Devel@lists.laptop.org http://lists.laptop.org/listinfo/devel
Re: [Sugar-devel] [PATCH] webactivity: seed the XS cookie at startup
On Thu, Feb 19, 2009 at 10:08 PM, Simon Schampijer si...@schampijer.de wrote: Ok, I pushed to browse git master now. Do you have a way to test if it is working fine against a schoolserver or should I create you the 0.82 xo bundle? I have a git checkout - but on the 8.2.x so master won't work there. I'm rebasing your patch on the sucrose-0.82 branch to test, replacing gconf calls with profile.foo... cheers, m -- martin.langh...@gmail.com mar...@laptop.org -- School Server Architect - ask interesting questions - don't get distracted with shiny stuff - working code first - http://wiki.laptop.org/go/User:Martinlanghoff ___ Devel mailing list Devel@lists.laptop.org http://lists.laptop.org/listinfo/devel
New joyride build 2657
http://xs-dev.laptop.org/~cscott/olpc/streams/joyride/build2657 Changes in build 2657 from build: 2656 Size delta: 0.00M -olpc-update 2.17-1 +olpc-update 2.18-1 --- Changes for olpc-update 2.18-1 from 2.17-1 --- + Support multiple keys + Support multiple keys -- This mail was automatically generated See http://dev.laptop.org/~rwh/announcer/joyride-pkgs.html for aggregate logs See http://dev.laptop.org/~rwh/announcer/joyride_vs_update1.html for a comparison ___ Devel mailing list Devel@lists.laptop.org http://lists.laptop.org/listinfo/devel
New staging build 33
http://xs-dev.laptop.org/~cscott/xo-1/streams/staging/build33 Changes in build 33 from build: 32 Size delta: 0.00M -olpc-update 2.17-1 +olpc-update 2.18-1 --- Changes for olpc-update 2.18-1 from 2.17-1 --- + Support multiple keys + Support multiple keys -- This mail was automatically generated See http://dev.laptop.org/~rwh/announcer/staging-pkgs.html for aggregate logs See http://dev.laptop.org/~rwh/announcer/joyride_vs_update1.html for a comparison ___ Devel mailing list Devel@lists.laptop.org http://lists.laptop.org/listinfo/devel
Re: RFC: Supporting olpc-ish Deployments - Draft 1
On Thu, Feb 19, 2009 at 6:14 PM, Michael Stone mich...@laptop.org wrote: 6. Questions: * Does this analysis hold water? Seems to make sense, and looks like the strategies are quite rational. * Is there anything we could spend our time on which would yield a greater return on investment? I think you (plural you) are doing a good job overall, but I'm on the sidelines. The only thing I'd suggest is to avoid building too many deployment-specific channels (irc, lists, etc). If the relevant discussions happen on the -dev channels, autistic devs like me are then forced to listen to the chatter of the most important members of the community. And that's a good thing. (ie: I'd prefer to see more traffic on server-devel -- and I'd actually rename it to 'server'.) * Are there any fixable roadblocks which prevent groups from participating? (e.g., pervasive use of IRC for meetings?) I'm not an irc junkie but I am hoping people know to _also_ ask on the list if they think I can help. On irc you get the answers of the people that are there, which may or may not intersect with the people who know about your question... cheers, m -- martin.langh...@gmail.com mar...@laptop.org -- School Server Architect - ask interesting questions - don't get distracted with shiny stuff - working code first - http://wiki.laptop.org/go/User:Martinlanghoff ___ Devel mailing list Devel@lists.laptop.org http://lists.laptop.org/listinfo/devel
On optimizing Theora
I have been testing libtheora-1.0 on a MP XO. On build 767, using F9's gcc-4.3, I compiled libtheora with CFLAGS=-march=geode. I tested encode, with the command time encoder_example -v 1 coastguard_cif.y4m /dev/null using the test video from http://media.xiph.org/video/derf/y4m/coastguard_qcif.y4m. This test ran in 44.15 +/- 0.15 seconds (all times are user time). I then tested decode, with the command time dump_video coastguard_cif1.ogv /dev/null using the ogg video that would be produced by the encoder above were it not redirected to /dev/null. This test ran in 4.60 +/- 0.05 seconds. I then repeated these tests after recompiling with -march=i586 -mtune=generic, which I assume are approximately the CFLAGS used by Fedora. The resultant times were 41.6 +/- 0.1 and 4.45 +/- 0.05. In conclusion, compiling libtheora with -march=geode causes it to run significantly (20 sigma, 7%) slower than -march=i586 -mtune=generic for encoding, and possibly slightly slower for decoding as well. GCC 4.3 evidently does not do a very good job of optimizing for geode. --Ben ___ Devel mailing list Devel@lists.laptop.org http://lists.laptop.org/listinfo/devel
Re: On optimizing Theora
On Fri, Feb 20, 2009 at 12:28:42AM -0500, Benjamin M. Schwartz wrote: GCC 4.3 evidently does not do a very good job of optimizing for geode. What percentage of CPU time was spent in libtheora? -- James Cameronmailto:qu...@us.netrek.org http://quozl.netrek.org/ ___ Devel mailing list Devel@lists.laptop.org http://lists.laptop.org/listinfo/devel
Re: [Server-devel] automatic olpc-update from XS
2009/2/18 Martin Langhoff martin.langh...@gmail.com: the original plan was to rig all that from the anti-theft proto. On the server side, there's a lot of work to do to make that happen. On the laptop side, it's unclear. For many months my emails to cscott on the matter went unanswered, I reviewed the code with mstone last Dec and it appeared to be lacking important bits. Which bits did you think were missing? I discussed this with Michael the other day and concluded that no XO-side changes were needed, but I might have missed something. We just need something on the XS side which implements the version-checking part of the antitheft protocol. No. If you find a usable path down that way, _fantastic_ and I'll be happy to merge it in. Michael seems keen on working with us on developing this into an XS feature. I am not sure when I'll have time to attack this head-on but it's definitely something that I'll be working on. Daniel ___ Server-devel mailing list Server-devel@lists.laptop.org http://lists.laptop.org/listinfo/server-devel