On Wed, 21 Dec 2005 09:21:42 -0600 Nathan Ingersoll <[EMAIL PROTECTED]> babbled:
> On 12/20/05, The Rasterman Carsten Haitzler <[EMAIL PROTECTED]> wrote:
> >
> > just one thing - with efm it shouldn't be forking 1 process per image
> > all at once. it will only be keeping 1 forked child at a time - running
> > along generating for images without thumbs if they need one. the parent
> > just gets the child exit event then forks off another. so it's 1 fork
> > per image - and only per image that needs a thumb AND only once per
> > generation. i think you can safely assume even on the worst of posix
> > systems 1 fork is nothing compared to the workload of loading, scaling
> > then writing an image file :)
>
> Ahh, I should have read the code more closely. I saw the fork() at the
> top of _e_thumb_generate and assumed the worst. Thanks for clarifying
> that.

yeah. that'd just be silly to fork off possibly 100's of children for a
directory full of unthumbnailed images :)

> > anyway. i personally still favor the fork model as it requires no
> > pthreads, likely has no overhead compared to threads, has no
> > concurrency and cache issues, and is simple.
>
> Agreed.

excellent :)

> > what it does need is an ability to tune how many forked image
> > generators to allow at a time (efm allows only 1 so dual cpu systems
> > will be happy, more cpus won't benefit - ok maybe 3 as x is probably
> > involved, and you might say 4 if you let the kernel run IO cpu
> > instructions on a 4th cpu).
>
> I think single CPU systems could even benefit from spawning a few worker
> processes. Each process will reach a state where it's waiting on a read
> from the image file, so two or three thumbnailing processes could
> potentially interleave their resource usage pretty well. One sitting in
> a blocking read while the other is processing image data.

sure - to an extent it will. though by far the biggest gain will be in
"smart thumbnailing" ala epeg - ie for common image formats make special
fast paths.
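For illustration, here is a minimal sketch of the bounded fork model discussed above: the parent forks at most max_children workers, blocks until one exits, then forks the next. This is not efm's code. run_thumb_jobs, thumb_job_fn and demo_job are made-up names, and demo_job only stands in for "load, scale and write one thumbnail".

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical job callback: returns 0 on success. */
typedef int (*thumb_job_fn)(int index);

/* Placeholder job for the sketch: pretends that job 3 fails. */
static int
demo_job(int index)
{
   return (index == 3) ? -1 : 0;
}

/* Run njobs jobs with never more than max_children forked at once.
 * Returns the number of jobs whose child exited with status 0. */
static int
run_thumb_jobs(int njobs, int max_children, thumb_job_fn job)
{
   int next = 0, running = 0, ok = 0;

   assert(max_children > 0 && job);
   while (next < njobs || running > 0)
     {
        /* top up the pool while there is work and a free slot */
        while (next < njobs && running < max_children)
          {
             pid_t pid = fork();

             if (pid == 0) _exit(job(next) == 0 ? 0 : 1); /* child */
             if (pid < 0) break; /* fork failed: retry after a child exits */
             next++;
             running++;
          }
        if (running > 0)
          {
             int status;
             pid_t done = wait(&status); /* parent blocks for one child exit */

             if (done > 0)
               {
                  running--;
                  if (WIFEXITED(status) && WEXITSTATUS(status) == 0) ok++;
               }
             else if (errno != EINTR) break; /* unexpected wait() failure */
          }
        else if (next < njobs)
          {
             next++; /* fork failed with nothing running: skip this job */
          }
     }
   return ok;
}
```

With max_children set to 1 this mirrors the one-child-at-a-time behaviour described for efm; a larger value is the tuning knob asked for above, e.g. run_thumb_jobs(count, 3, job) for up to three workers.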
frankly jpeg is the biggest offender there with digital photos pushing 8
megapixels and above even now, and even your average camera doing 4-6
megapixels. i think we can improve on epeg by adding exif thumbnail
handling (if a thumbnail is already inlined in the jpeg as extended exif
info, use that if it's appropriate). anyway - i digress. :)

> > anyway - you DO have a very valid point for when 2 apps start
> > thumbnailing the same dir. we should definitely put in a locking
> > mechanism for that so either they share the workload, or the first guy
> > in gets "lock ownership" and drives the thumbnailing until he's done
> > and the other process sits and waits (maybe polling the lock file if we
> > use that mechanism - the owner could update the timestamp on the
> > lockfile whenever it generates something. or the lock file could
> > contain info such as the queue of ungenerated images to go... or maybe
> > the simplest case: all processes not owning the thumbnailing for that
> > dir hold off until the owner releases (hopefully not too long from now)
> > and then do a full update).
>
> I'm actually pretty surprised the fd.o spec didn't address locking at
> all, at least not that I saw the last time I read it.

indeed - i should read it again. it's been a few years since i last read
it, but i think they kind of thought the chance of concurrency low and
thus simply a performance issue in the end, and well - certain other
desktop environments pushing such things haven't been renowned for their
focus on efficiency in the past :)

> > anyway - i do agree that there is need to unify. i do also think there
> > are 3 levels here. 1. just generate thumbs and let calling process know
> > (either via a blocking api or a fork/event), 2. be able to ask for the
> > thumb path for any given file path, and 3. load thumb into a canvas
> > object (another level entirely). i do think you want to support
> > blocking and async - both.
> > really async can just be a wrapper on top of the blocking api.
>
> I think 1 and 2 are what I'm most concerned with atm. Following the fd.o
> spec (to a point, since its lack of jpeg support is just dumb) 3 is not
> a large issue since it's just loading a png or jpeg.

i forgot: thumbnail pruning. ie go through the thumbnail cache/stash and
find thumbnails that don't apply to any existing file :) need that.

> > imho it could do with:
> >
> > 1. add a file path to the thumb gen queue
> > 2. delete a file path from the queue
> > 3. begin queue processing
> > 4. pause/unpause queue processing
> > 5. end queue processing
> > 6. ask for thumb path from file path
> > 7. brute-force blocking-api generate thumb
> > 8. get "new thumb available" events
> > 9. set parallelism count (how many threads or forked children to
> >    allow at a time)
>
> I think we're on the same page here as far as features. So here's the
> idea I had in mind for implementation. First off, we have a lib that
> provides the blocking API with Epsilon. I spoke to atmos about all of
> this and he expressed a desire to keep Epsilon simple and not expand the
> functionality much at that level. So that could provide the lowest level
> blocking

sure. epsilon seems fine. i think some internals work to move deps to
runtime dlopens might be nice :)

> thumbnail generation based on MIME type with plugins (as mentioned in
> your next paragraph). To address the async aspect, we'd wrap Epsilon
> with the

plugins is definitely another way - either you just expand epsilon's
internals to dlopen() a new lib manually and handle it, if that lib
exists, or have a plugin do that. :)

> queue processing and event API with the features you mention above. Then
> to provide the async behavior (and address the locking problem), the lib
> could actually set up an IPC channel and fork off a small daemon. That
> daemon would

OR the lib EXECS a known existing utility daemon.
the user can run it at startup to keep it around, or it's auto-spawned
with -terminate 10 (after the last queued item is finished, wait for 10
seconds then quit) as you say below. this gives an option for some people
to have it always around - but the default being spawn as needed, then sit
in a small loop for up to 2 seconds or whatever trying to connect to the
thumbnailer ipc service, and if this fails, give up.

> then be available to all processes owned by that user and fork off
> processes responsible for the actual generation of the thumbnails. Since
> the daemon provides the only route to thumbnail generation we don't have
> to deal with locking or potential deadlocks and the race condition is
> eliminated. If the daemon is auto-started on demand, it could also exit
> after a configurable amount of inactivity. It also knows what's
> currently being processed so it can shortcut the requeueing of duplicate
> items.

sure. definitely. sounds good and proper.

> The only real downside I see to this is the IPC communication overhead,
> though entropy actually does this to communicate between threads and
> appears to do so w/o any significant performance impact. Any other
> concerns that I'm missing?

well if we generate the thumbnail path in the lib/client, look to see if
it exists - then just use that and never deal with a daemon service over
ipc, we will cut out the worst possible offender (client-server round
trips needing to ask the server for the thumbnail path). then we only
incur latency and ipc overhead when waiting on a thumbnail generation -
and really 99% of that work is in actually loading and decoding the image
file, scaling it and writing it out again. our problems occur when the
work to thumbnail becomes so little that the ipc does become significant
overhead (which only happens in the "already generated" case really -
which we have shortcutted client-side anyway)

> > locking can be implemented under the bonnet of such an api.
> > also one thing

but since it's a daemon service - locking is not needed as the server will
deal with it internally :) (ie it will only generate 1 thumb of 1 file on
the queue as it will do queue merging (if the same file is on the todo
queue twice the 2nd instance gets merged into the first and all clients
waiting on an answer for that file get the "done" reply when the first
instance is done - thus only 1 copy of a file on the queue at a time and
thus no concurrency issues)) :) just need to deal with standard daemon
issues (if a client disconnects, unref/remove all that client's queued
requests, etc.)

> > i think might be good here is that the lib actually dynamically adapts
> > to whatever libs it can find at RUNTIME - not compile-time. so if it
> > finds imlib2, it will use it. if it finds evas, it will use that, if it
> > finds epeg, it will use that - it can dlopen the libs just like the
> > runtime linker, and thus adapt to whatever is on the system at runtime
> > without compile time dependencies (installing more libs just gets
> > faster thumbnailing or more format support etc.). this allows us to
> > add other things in future under the hood (thumbnailing pdf's, html
> > files, text files, svg, etc.)
>
> Yeah, I think this all belongs at the lower level blocking API.

agreed. :)

> Nathan

-- 
------------- Codito, ergo sum - "I code, therefore I am" --------------
The Rasterman (Carsten Haitzler)    [EMAIL PROTECTED]
裸好多
Tokyo, Japan (東京 日本)

_______________________________________________
enlightenment-devel mailing list
enlightenment-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/enlightenment-devel
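
For illustration, the queue merging described in the thread (a file queued twice ends up as a single entry, and every waiting client gets the "done" reply when that entry finishes) could be sketched roughly as below. This is only a sketch under assumptions: thumb_queue, thumb_job, thumb_queue_add, thumb_queue_pop and the fixed sizes are invented for the example and do not come from Epsilon, efm or any existing daemon.

```c
#include <assert.h>
#include <string.h>

#define QUEUE_MAX 64   /* arbitrary limits chosen for the sketch */
#define PATH_LEN  256

/* One pending thumbnail job plus the number of clients waiting on it. */
typedef struct
{
   char path[PATH_LEN];
   int  waiters;
} thumb_job;

typedef struct
{
   thumb_job jobs[QUEUE_MAX];
   int       count;
} thumb_queue;

/* Queue a path. If it is already queued, merge into the existing entry
 * (one more waiter) instead of adding a second copy.
 * Returns 1 if a new entry was added, 0 if merged, -1 if it does not fit. */
static int
thumb_queue_add(thumb_queue *q, const char *path)
{
   int i;

   assert(q && path);
   for (i = 0; i < q->count; i++)
     {
        if (strcmp(q->jobs[i].path, path) == 0)
          {
             q->jobs[i].waiters++;
             return 0;
          }
     }
   if (q->count >= QUEUE_MAX || strlen(path) >= PATH_LEN) return -1;
   strcpy(q->jobs[q->count].path, path);
   q->jobs[q->count].waiters = 1;
   q->count++;
   return 1;
}

/* Take the oldest job off the queue and copy its path into out (which
 * must hold PATH_LEN bytes). Returns how many clients should get the
 * "done" reply for it, or 0 if the queue is empty. */
static int
thumb_queue_pop(thumb_queue *q, char *out)
{
   int waiters;

   assert(q && out);
   if (q->count == 0) return 0;
   strcpy(out, q->jobs[0].path);
   waiters = q->jobs[0].waiters;
   memmove(&q->jobs[0], &q->jobs[1],
           (size_t)(q->count - 1) * sizeof(thumb_job));
   q->count--;
   return waiters;
}
```

Because only the daemon touches this queue, there is at most one entry per file and no lock is needed around it. A real daemon would also track which client owns each waiter so that a disconnect can drop that client's requests.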