Re: Tile Cache Size
On Fri, Nov 12, 1999 at 05:32:10PM +0100, Marc Lehmann wrote:
> You saying that the tile system in Gimp is faster is not useful.
  ^^^
I didn't. Please don't leap into every discussion just to bait me, Marc; it's very annoying, and I somehow doubt that others are reading this list just to see whether you can bait me enough for me to really bite.

Nick.
Re: Re: Tile Cache Size
On Thu, Nov 11, 1999 at 12:56:58PM +0100, "Ewald R. de Wit" [EMAIL PROTECTED] wrote:
> Well, the algorithm involved is a simple 256-byte lookup table (or
> three of them, one for each of the RGB channels). There is not much one
> can screw up about it, performance- or precision-wise. The only
> difference to the gimp is that it is signal driven, i.e. the work is
> done in the gtk idle loop and the screen is updated as necessary.

Maybe the cycles are burnt here. Yes... anybody who wants to do a profile? ;()

> I consider any image that doesn't fit into RAM to be a pathological
> case and I don't really care about this sort of oddball performance.

Magazine covers rarely if ever fit into memory (unless you give up on undo).

> [...] image as a large file into memory. Chances are that the linear
> layout of this large file gives a much more efficient disk access
> pattern than the Gimp's current swap file.

In the case of brightness_contrast and similar operations, the layout is of no consequence to performance (the pixels are not connected and can be processed in any order).

--
Marc Lehmann [EMAIL PROTECTED]
The choice of a GNU generation
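For reference, the lookup-table scheme Ewald describes can be sketched as below. This is a hedged illustration, not his code: the linear brightness/contrast formula, the clamping, and the function names are assumptions; only the structure (a 256-entry table per channel, one indexed load per byte) comes from his description.

```c
#include <stddef.h>

/* Build a 256-entry table mapping input value -> adjusted value.
 * The formula here (scale around mid-grey, then shift, then clamp)
 * is an illustrative choice, not necessarily what Ewald used. */
static void build_lut(unsigned char lut[256], double brightness, double contrast)
{
    for (int i = 0; i < 256; i++) {
        double v = (i - 127.5) * contrast + 127.5 + brightness;
        if (v < 0.0)   v = 0.0;
        if (v > 255.0) v = 255.0;
        lut[i] = (unsigned char)(v + 0.5);
    }
}

/* Apply the table: one load per byte, no arithmetic in the inner loop.
 * For RGB data you would build three tables and index per channel. */
static void apply_lut(unsigned char *pixels, size_t n, const unsigned char lut[256])
{
    for (size_t i = 0; i < n; i++)
        pixels[i] = lut[pixels[i]];
}
```

There is indeed "not much one can screw up" here; the interesting performance question is everything around the loop (tile fetches, redraw scheduling), not the loop itself.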
Re: Re: Tile Cache Size
On Tue, Nov 09, 1999 at 11:59:58AM +1000, David Bonnell wrote:
> On Tue, 9 Nov 1999, Ewald R. de Wit wrote:
> > Anyway, today I went over the Gimp sources and noticed how
> > complicated the tile architecture makes things and I couldn't help
> > wondering why the heck it was put in. All it seems to do is to give
> > you an order of magnitude slower speed when dealing with large
> > images. And large images were supposed to be the very reason for a
> > tiling architecture.
> I'm afraid I have to agree with you on the performance WRT large
> images. I tried editing a couple of large images yesterday
> (10MB/600dpi) and it was painfully slow (Dual 300MHz PII, 128MB RAM).
> I've got a 20MB/1200dpi one I want to edit and I'm not looking forward
> to it!

Getting more than 128MB of RAM does help. Actually, the more the better, I can't think of too much. I have 380MB at work and manipulating large scans is _considerably_ faster than with a similar CPU but less RAM (which is kinda obvious anyway). Also, like someone pointed out, give Gimp as much tile cache as you have free RAM for. Sucks to have earthquakes in Taiwan just now kicking RAM prices up.. :(

Tuomas

--
.---( t i g e r t @ g i m p . o r g )---.
| some stuff at http://tigert.gimp.org/ |
`---'
Re: Tile Cache Size
Nick Lamb wrote on Tue, 09 Nov 1999:
> Why does my 7274 x 9985 RGB image (212743Kb of data by my calculations)
> result in the creation of a gimpswap which is up to 500Mb in size?

Where do you think the undo information can reside???

Uwe
--
mailto:[EMAIL PROTECTED]
http://rcswww.urz.tu-dresden.de/~koloska/
right now the web page is in german only
but this will change as time goes by ;-)
Re: Tile Cache Size
On Thu, Nov 11, 1999 at 01:40:28AM +0100, Uwe Koloska wrote:
> Nick Lamb wrote on Tue, 09 Nov 1999:
> > Why does my 7274 x 9985 RGB image (212743Kb of data by my
> > calculations) result in the creation of a gimpswap which is up to
> > 500Mb in size?
> Where do you think the undo information can reside???

I would guess that when I actually DO something, there will be an appropriate number of additional tiles allocated in the swapfile. Experience agrees. Give or take tile leaks, which, now that tigert has mentioned them, explain this problem very well...

Why? Where do YOU think the undo information goes?

Nick.
Re: Re: Tile Cache Size
Nick Lamb ([EMAIL PROTECTED]) wrote:
> Loading a large image (*): Wait about 2 mins, loader finishes, and now
> after a further couple of minutes the image is drawn; however, later
> performance is slightly faster than in the default case above.
>
> (*) A large image here is one which genuinely WILL NOT fit in memory,
> by any stretch of the imagination. It is a JPEG tiled TIFF (nasty!) of
> dimensions 7274x9985 and in full 24-bit colour.

Hi Nick,

That is one large image you have there. I'm using filmscans and PhotoCD (base 5) images, which are of the order of 2500x4000. On a 128 MB machine with a tile-cache-size of 80M everything is going quite slowly, and adjusting the tile-cache-size doesn't make things go faster for me.

If the image doesn't fit into memory then operating on it will take a long time anyway. IMHO it is silly to optimize for this case at the (very considerable) expense of the much more common image dimensions. I'm doing my own image processing using non-tiling images, and not only is it much more pleasant to program for, it is also much faster.

--
Ewald
Re: Re: Tile Cache Size
On Tue, Nov 09, 1999 at 01:27:29AM +0100, "Ewald R. de Wit" [EMAIL PROTECTED] wrote:
> 2. The fragmentation of tiles within the swap file. The sound of Gimp
> thrashing the harddisk suggests that this is a very big issue.

For which spatial indexing would be a solution.

> Anyway, today I went over the Gimp sources and noticed how complicated
> the tile architecture makes things and I couldn't help wondering why
> the heck it was put in. All it seems to do is to give you an order of
> magnitude slower speed when dealing with large images. And large
> images were supposed to be the very reason for a tiling architecture.

I really don't agree with you on the speed issue. Data is most often processed by tile, in which case the program will find an almost ideal situation, memory- and cache-wise.

--
Marc
Re: Re: Tile Cache Size
On Wed, Nov 03, 1999 at 05:40:03PM -0600, Tim Mooney [EMAIL PROTECTED] wrote:
> As far as I know, most Unix and Unix-like OSes will generally try to
> give you the space you're requesting as a contiguous chunk. In the
> case of files like a (e.g.) 40 Meg swap-file for the gimp, that may
> not be possible, even for a filesystem that is much less than 50%
> full. All it takes is "average" fragmentation to ruin the OS's ability
> to give you a contiguous chunk.

Then you have to convince me (you don't have to, but otherwise I won't believe you) that this is a problem... surely, each ~8MB chunk of a large file is fragmented on an ext2 file system _by definition_, but these are very large chunks.

> This means, I think, that all bets *are* off, at least regarding the
> gimp's ability to keep tiles "close" on-disk.

I think that on an OS that encourages fragmentation you are right. However, I can't think of a common OS in use that can run the gimp and has such a bad filesystem.

> The bigger the swap file, the less likely it is that the gimp will be
> able to do this. That's why I asked my original question.

But this is not at all a problem. For example, on my 8GB main (i.e. /usr, /home) partition, which I have already used for two years and which is 95% full (too full for the file system in question), I have 0.5% fragmentation. Only two files have fragmented chunks smaller than 8MB (the maximum).

In any case, since you think this is bad (and I think this is good), only hard data would settle this question. However, as a matter of fact, linux's swapping (the OS with the ext2 fs) is _far_ worse than _any_ application-side swapping.

--
Marc
Re: Re: Tile Cache Size
I rather think _you_ are missing the point (which is disk layout and minimizing seeks, and _not_ a better memory layout; the tile-based scheme lends itself naturally to spatial indexing, in fact it's already half the way there).

About HD: is there a way to do swap on demand to a partition? A daemon, lib or something? I think that is what high-end DBs do: handle their own data's physical space. Just an idea based on what I have heard. Discard at will.

GSR
Re: Re: Tile Cache Size
On Tue, Nov 09, 1999 at 12:04:14AM +0100, "Guillermo S. Romero / Familia Romero" [EMAIL PROTECTED] wrote:
> About HD: is there a way to do swap on demand to a partition? A
> daemon, lib or something?

Is this gimp-related (?) or do you want something like swapd? Or swap priorities?

> I think that is what high end DBs do, handle their own data physical

These high-end DBs also do not recommend this technique ;-)

--
Marc
Re: Re: Tile Cache Size
> > About HD: is there a way to do swap on demand to a partition? A
> > daemon, lib or something?
> Is this gimp-related (?) or do you want something like swapd? Or swap
> priorities?

I know what swapd and swap priorities are (I think I do: OS things, and how partitions are used). I am speaking about something that "owns" a partition (or filesystem or warp generator) and accepts requests from apps like "hey you! move this area of my RAM to your storage" and "quickly! give me back the data I gave you some minutes ago". So Gimp could use it instead of using OS things (swap or filesystem). I guess everybody will agree that a partition handled by one process (with high performance in mind) is a good solution. Well, now you can point out why my suggestion is 100% wrong. ;]

> > I think that is what high end DBs do, handle their own data physical
> These high-end DBs also do not recommend this technique ;-)

Really? Maybe time to get a book about DB implementation. Damn, so many things, so little time.

GSR
Re: Re: Tile Cache Size
On Tue, 9 Nov 1999, Guillermo S. Romero / Familia Romero wrote:
> So Gimp could use it, instead of using OS things (swap or filesystem).
> I guess everybody will agree that a partition handled by one process
> (with high performance in mind) is a good solution.

What's wrong with using mmap? You can even map a raw partition if you want (according to the man page; I've never tried it).

-Dave
--
http://www.flamingtext.com/
Re: Re: Tile Cache Size
Marc Lehmann ([EMAIL PROTECTED]) wrote:
> But this is not at all a problem. For example, on my 8GB main (i.e.
> /usr, /home) partition, which I have already used for two years and
> which is 95% full (too full for the file system in question), I have
> 0.5% fragmentation. Only two files have fragmented chunks smaller than
> 8MB (the maximum).

I think two things need to be clearly separated here:

1. The fragmentation of the swap file on the harddisk. I agree that this is a bit of a non-issue with the ext2 filesystem (even if the swapfile gets fragmented a bit, it's no big deal);

2. The fragmentation of tiles within the swap file. The sound of Gimp thrashing the harddisk suggests that this is a very big issue.

Anyway, today I went over the Gimp sources and noticed how complicated the tile architecture makes things, and I couldn't help wondering why the heck it was put in. All it seems to do is to give you an order of magnitude slower speed when dealing with large images. And large images were supposed to be the very reason for a tiling architecture.

--
Ewald
Re: Re: Tile Cache Size
On Tue, 9 Nov 1999, Ewald R. de Wit wrote:
> Anyway, today I went over the Gimp sources and noticed how complicated
> the tile architecture makes things and I couldn't help wondering why
> the heck it was put in. All it seems to do is to give you an order of
> magnitude slower speed when dealing with large images. And large
> images were supposed to be the very reason for a tiling architecture.

I'm afraid I have to agree with you on the performance WRT large images. I tried editing a couple of large images yesterday (10MB/600dpi) and it was painfully slow (Dual 300MHz PII, 128MB RAM). I've got a 20MB/1200dpi one I want to edit and I'm not looking forward to it!

-Dave
Re: Re: Tile Cache Size
On Tue, 9 Nov 1999, David Bonnell wrote:
> On Tue, 9 Nov 1999, Ewald R. de Wit wrote:
> > Anyway, today I went over the Gimp sources and noticed how
> > complicated the tile architecture makes things and I couldn't help
> > wondering why the heck it was put in. All it seems to do is to give
> > you an order of magnitude slower speed when dealing with large
> > images. And large images were supposed to be the very reason for a
> > tiling architecture.
> I'm afraid I have to agree with you on the performance WRT large
> images. I tried editing a couple of large images yesterday
> (10MB/600dpi) and it was painfully slow (Dual 300MHz PII, 128MB RAM).
> I've got a 20MB/1200dpi one I want to edit and I'm not looking forward
> to it!

Hmm. Are you setting the tile cache size to something reasonable? It will definitely suck with the default 10mb tile cache...

later,
Andrew Kieschnick
Re: Re: Tile Cache Size
On Tue, Nov 09, 1999 at 01:27:29AM +0100, Ewald R. de Wit wrote:
> Anyway, today I went over the Gimp sources and noticed how complicated
> the tile architecture makes things and I couldn't help wondering why
> the heck it was put in. All it seems to do is to give you an order of
> magnitude slower speed when dealing with large images. And large
> images were supposed to be the very reason for a tiling architecture.

I have no idea where this came from. Ewald, did you actually do any benchmarking, or just a few thought experiments? Computers do not behave in practice as it may seem they ideally should.

Here are some practical results from my real Gimp machine, a PII 300MHz with 64Mb of memory and ~64Mb of swap. This is with CVS Gimp.

If I configure Gimp to believe that it has as much memory as one might conceivably need, then the results are as follows:

Loading a large image (*): Wait 10 mins, get bored, try to kill, but the machine is in a swap death loop; after 5 more minutes, just as I log in as root from a serial console, X experiences resource starvation and so Gimp, Gnome, xterms and everything go into the drink.

If I configure Gimp with a large but not improbable tile cache (64Mb):

Loading a large image (*): Wait 5 mins, TIFF loader finishes; after a further 10 mins the image has been drawn at 10:1 reduction.

Now with the defaults as supplied (12Mb ISTR):

Loading a large image (*): Wait about 2 mins, loader finishes, and now after a further 2 or 3 mins the image has been drawn.

And finally with my preferred settings (20Mb):

Loading a large image (*): Wait about 2 mins, loader finishes, and now after a further couple of minutes the image is drawn; however, later performance is slightly faster than in the default case above.

(*) A large image here is one which genuinely WILL NOT fit in memory, by any stretch of the imagination. It is a JPEG tiled TIFF (nasty!) of dimensions 7274x9985 and in full 24-bit colour.

Nick.
Re: Re: Tile Cache Size
Further to my last post (and possibly related to Ewald's complaints too):

Why does my 7274 x 9985 RGB image (212743Kb of data by my calculations) result in the creation of a gimpswap which is up to 500Mb in size?

The performance for such images seems adequate to me (I can't compare PotatoShop because it refuses to load any of our larger images at all), but this is wasting a lot of active disk space :(

Nick.
Re: Re: Tile Cache Size
On Mon, 8 Nov 1999, Andrew Kieschnick wrote:
> Hmm. Are you setting the tile cache size to something reasonable? It
> will definitely suck with the default 10mb tile cache...

Bumped it up to 60MB and it's better. Something as simple as hiding one of the layers takes about 10 seconds to complete (took about 20-30 secs before, I think). Or moving a rectangular region of one layer (which should affect only the union of the pixels under the src and dest rectangles?) takes about 10 seconds too. (Again, was probably almost 20 seconds before increasing the tile cache size.)

-Dave
Re: Re: Tile Cache Size
In regard to: Re: Re: Tile Cache Size, Marc Lehmann said (at 1:05am on Nov...):
> On Mon, Nov 01, 1999 at 08:04:39PM -0600, Tim Mooney
> [EMAIL PROTECTED] wrote:
> > Wouldn't the situation be even worse, then, if we're going through
> > the filesystem and there's "average" fragmentation? You seem to be
> > assuming that the filesystem allocation will be contiguous (or at
> > least close) on disk, but can you really make that assumption?
> I don't care for windows, if you wanted to hear that ;-) If the OS
> does not make this assumption a fact (in most cases) then all bets
> are off anyway.

I wasn't talking about windoze, and I don't care for it either. ;-)

All filesystems have the notion of fragmentation. Some of them *encourage* it (UFS/BFFS, for example), and many of them *discourage* it, providing tools to defragment, coalescing the pockets of free space into large chunks of free space.

As far as I know, most Unix and Unix-like OSes will generally try to give you the space you're requesting as a contiguous chunk. In the case of files like a (e.g.) 40 Meg swap-file for the gimp, that may not be possible, even for a filesystem that is much less than 50% full. All it takes is "average" fragmentation to ruin the OS's ability to give you a contiguous chunk.

This means, I think, that all bets *are* off, at least regarding the gimp's ability to keep tiles "close" on-disk. The bigger the swap file, the less likely it is that the gimp will be able to do this. That's why I asked my original question.

Tim
--
Tim Mooney [EMAIL PROTECTED]
Information Technology Services (701) 231-1076 (Voice)
Room 242-J1, IACC Building (701) 231-8541 (Fax)
North Dakota State University, Fargo, ND 58105-5164
Re: [gimp-devel] Re: tile cache size
On Mon, Nov 01, 1999 at 06:58:09PM +0100, Simon Budig wrote:
> Austin Donnelly ([EMAIL PROTECTED]) wrote:
> > Idea: if the size is set to 0, make it mean "guess something good".
> > Out of the box gimp can come with it set to 0, and we just make the
> > algorithm pick something appropriate. That's the hard part.
> Just to start a discussion: What about trying to detect if it is a
> "private" machine with less than 5 regular users and then using 80% of
> the physical memory?

What? All our users have login names starting from nobody000 to nobody999! :) Also, 5 x 80% = 400%: a hypothetical situation of all 5 users using Gimp at the same time. :)

Just joking; this might have a point, but it is not trivial. But at least tell me what _I_ should use to avoid excess swapping and even crashing X.. Like I said, I have 256MB. X eats like 90MB (high res, high depth), Netscape bloats too (60MB is just a 'normal' case), etc., so, say, I have something like 100MB for Gimp, sometimes a bit more (Netscape will get swapped out when I work in Gimp).

Someone mentioned the tile cache should be something like half of the RAM available; does that make sense?

Thanks,
Tuomas
Re: [gimp-devel] Re: tile cache size
On Mon, Nov 01, 1999 at 12:05:26PM -0800, Tuomas Kuosmanen [EMAIL PROTECTED] wrote:
> But at least tell me what _I_ should use to avoid excess swapping and even

It's easy... try to detect the physical memory (on common platforms). Then use getrlimit to find out how much virtual memory we are allowed to use. Then take a bit less than the minimum of both values. If neither can be determined, use 10MB. In any case, process limits need to be set administratively on multi-user machines anyway, so they should be a good guideline.

> Someone mentioned the tile-cache should be something like half of the
> ram available, does that make sense?

This lacks a definition of "available". The best value is clearly 100% of the ram that is "available" ;)

--
Marc
Re: Re: Tile Cache Size
> This is not necessarily true. The system swap routine is optimized for
> arbitrary data. Gimp organizes its image data in tiles and may perform
> better in swapping those tiles, since they are a very special data
> structure.

Nor is it necessarily false.

> So the swapping routines could be optimized specially for this data (I
> have no idea if this is done currently) and perform better than the
> system's routines.

Yes, but Gimp swaps to files, while the system normally swaps to a partition, and if the admin is smart, to a fast disk whose main (only?) task is swapping, maybe even sharing swap among a group of disks. The kernel's swap is optimized (I hope it is, otherwise... argh!); we dunno about Gimp's. This is more a per-site than a theoretical thing, I know. On the machines I used, the best was to use the kernel's functions instead of Gimp's (especially if Gimp's swap file fills your home dir up to the quota limit). Is there a way to get a valid performance measurement?

> The NFS problem should be addressed. Can we detect somehow if the
> configured swap directory is an NFS directory and issue a warning?

Via the mount command. I think all Unixes can do that.

GSR
Re: Tile Cache Size
> This is totally wrong in the case of Linux (ok, not unix, but even
> more common).

Hehehe, then how will you describe my experiences with other non-unix systems? Do not waste your time trying: pathetic and noisy, just to start.

> With a better layout, gimp swapping should be able to beat virtual
> memory in all cases (or at least if partition writes are faster).

Thanks for the info. Then the bad performance of Gimp must be due to the fact that Gimp and the kernel were swapping at the same time (so the hd heads move from one point to another constantly).

> (Ok, 2.4 will fix most of linux's swapping mess and use a better
> layout on disk, but at the moment what I say holds).

Thanks, kernel developers.

Question: could Gimp use a file with few holes? In other words, reserve space in advance, say in chunks of some MB (so the OS tries to lay each big block out acceptably). And what about using partitions, like the OS or some CDR apps do?

GSR
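GSR's reserve-in-advance idea might look like the sketch below. Note the hedges: posix_fallocate() postdates this thread, so treat it as a modern stand-in for whatever preallocation mechanism was available; the 8MB chunk size and function name are arbitrary assumptions, and some filesystems do not support real preallocation at all.

```c
#include <fcntl.h>
#include <unistd.h>

#define CHUNK (8 * 1024 * 1024)  /* assumed reservation granularity */

/* Grow the swap file in whole multi-MB chunks so the filesystem gets a
 * chance to allocate each chunk contiguously, instead of letting the
 * file accrete holes tile by tile.  Returns 0 on success. */
static int reserve_swap(const char *path, off_t want)
{
    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return -1;

    /* round the reservation up to a whole number of chunks */
    off_t len = ((want + CHUNK - 1) / CHUNK) * CHUNK;
    int rc = posix_fallocate(fd, 0, len);  /* allocates real blocks */
    close(fd);
    return rc == 0 ? 0 : -1;
}
```

A plain ftruncate() would be cheaper but creates exactly the sparse holes GSR wants to avoid; real preallocation forces the filesystem to commit blocks up front.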
Re: Re: Tile Cache Size
In regard to: Re: Re: Tile Cache Size, Marc Lehmann said (at 10:35pm on Nov...):
> On Mon, Nov 01, 1999 at 10:22:08PM +0100, "Guillermo S. Romero /
> Familia Romero" [EMAIL PROTECTED] wrote:
> > Yes, but Gimp swaps to files, while the system normally swaps to a
> > partition, and if the admin is smart, to a fast disk whose main
> > (only?) task is swapping, maybe even sharing swap among a group of
> > disks. The kernel's swap is optimized (I hope it is, otherwise...
> > argh!); we dunno about Gimp's.
> The point is that the kernel keys the swap by memory address (physical
> or virtual does not matter), which means the keys are basically
> random. Gimp can use optimized ordering (e.g. keep tiles that are near
> each other close together on the medium) that no kernel can use. Once
> you start to seek, your performance is gone, _no matter_ how fast your
> physical swap may be (for linear r/w).

Wouldn't the situation be even worse, then, if we're going through the filesystem and there's "average" fragmentation? You seem to be assuming that the filesystem allocation will be contiguous (or at least close) on disk, but can you really make that assumption?

Tim
Re: Re: Tile Cache Size
[Lots of people writing barking mad things about tile swapping]

Look, you're all missing the point. Gimp does its own tile swapping not because it wants to control the layout on disk. As some of you have pointed out, this is futile. The only reason to swap a tile at a time is to do with controlling how far apart in memory neighbouring pixels are.

Consider a very wide image. If it is stored as a large linear array in memory (possibly paged by the OS to an OS-managed swap file), then the common operation of consulting a pixel above or below the current one results in needing to skip a large number of bytes through the linear array. This results in poor CPU cache performance. So we use a tiled memory layout.

Once the data is in a tiled representation in memory, there seems little point in converting it into a linear buffer before writing it to disk. This would certainly take more time than it would to just hand the data to the OS.

Now, the size of a tile cache (ie the number of tiles we'd like to be able to access as speedily as possible) should be a little over the number of tiles it takes to cover the width of the image. This is so that filters which iterate over every single pixel from left to right, top to bottom, perform better on the horizontal boundary between adjacent strips of tiles. Consider a 3x3 convolution (let's say a blur matrix). When the center of the matrix is at the top of the second row of tiles, the top of the matrix needs to reference the first row of tiles. It is helpful for performance to have this top row available. Which means caching ceil(img_width / tile_width) + 1 tiles.

And gentlemen, this is not rocket science. It's what undergraduates are normally taught in their basic "OS Functions" lectures. The gimp is a good example of why application-specific paging control can be a performance boost.

Now can we drop this silly subject, please?

Austin
Re: Tile Cache Size
Austin Donnelly wrote:
> [Lots of people writing barking mad things about tile swapping]
>
> Look, you're all missing the point. Gimp does its own tile swapping
> not because it wants to control the layout on disk. [...]
>
> And gentlemen, this is not rocket science. It's what undergraduates
> are normally taught in their basic "OS Functions" lectures. The gimp
> is a good example of why application-specific paging control can be a
> performance boost.
>
> Now can we drop this silly subject, please?

okay... now what is the best size to set it to?

/me runs *grin*

--
Garrett LeSage
Linux.com Art Director
[EMAIL PROTECTED] - http://linux.com/