On Wed, 21 Dec 2005 09:21:42 -0600 Nathan Ingersoll <[EMAIL PROTECTED]>
babbled:

> On 12/20/05, The Rasterman Carsten Haitzler <[EMAIL PROTECTED]> wrote:
> >
> >
> > just one thing - with efm it shouldn't be forking 1 process per image all
> > at once. it will only be keeping 1 forked child at a time - running along
> > generating for images without thumbs if they need one. the parent just
> > gets the child exit event then forks off another. so it's 1 fork per image
> > - and only per image that needs a thumb AND only once per generation. i
> > think you can safely assume even on the worst of posix systems 1 fork is
> > nothing compared to the workload of loading, scaling then writing an image
> > file :)
> 
> Ahh, I should have read the code more closely. I saw the fork() at the top
> of the _e_thumb_generate and assumed the worst. Thanks for clarifying that.

yeah. that'd just be silly to fork off possibly 100's of children for a
directory full of unthumbnailed images :)
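
a minimal sketch of that fork model, in Python for brevity rather than efm's C. the names `thumbnail_with_forks`, `needs_thumb` and `generate_thumb` are hypothetical stand-ins, not efm functions, and the real code reacts to a child exit event from the main loop instead of blocking in wait(). `max_children` defaults to 1 to match what efm is described as doing; raising it is the tuning knob discussed further down.

```python
import os

def thumbnail_with_forks(paths, needs_thumb, generate_thumb, max_children=1):
    """Fork one child per image that needs a thumb, at most max_children at once.

    Returns the list of paths whose child exited with a non-zero status.
    """
    if max_children < 1:
        raise ValueError("max_children must be at least 1")
    pending = [p for p in paths if needs_thumb(p)]  # no fork for existing thumbs
    running = {}  # pid -> path of the image that child is working on
    failed = []
    while pending or running:
        # top up to the allowed number of children
        while pending and len(running) < max_children:
            path = pending.pop(0)
            pid = os.fork()
            if pid == 0:
                # child: do the heavy load/scale/write, then leave immediately
                status = 1
                try:
                    generate_thumb(path)
                    status = 0
                finally:
                    os._exit(status)
            running[pid] = path
        # parent: block until some child exits, then loop to fork the next one
        pid, wait_status = os.wait()
        path = running.pop(pid, None)
        if path is None:
            continue  # a child we did not start; ignore it
        if not (os.WIFEXITED(wait_status) and os.WEXITSTATUS(wait_status) == 0):
            failed.append(path)
    return failed
```

with `max_children=1` the parent never has more than one child alive, so a directory of 100 unthumbnailed images costs 100 forks in sequence, not 100 concurrent processes.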

> > anyway. i personally still favor the fork model as it requires no pthreads,
> > likely has no overhead compared to threads, has no concurrency and cache
> > issues, and is simple.
> 
> 
> Agreed.

excellent :)

> > what it does need is an ability to tune how many forked image
> > generators to allow at a time (efm allows only 1 so dual cpu systems will
> > be happy, more cpus won't benefit - ok maybe 3 as x is probably involved,
> > and you might say 4 if you let the kernel run IO cpu instructions on a 4th
> > cpu).
> 
> I think single CPU systems could even benefit from spawning a few worker
> processes. Each process will reach a state where it's waiting on a read from
> the image file, so two or three thumbnailing processes could potentially
> interleave their resource usage pretty well. One sitting in a blocking read
> while the other is processing image data.

sure - to an extent it will. though by far the biggest gain will be in "smart
thumbnailing" a la epeg - ie for common image formats make special fast paths.
frankly jpeg is the biggest offender there with digital photos pushing 8
megapixels and above even now, and even your average camera doing 4-6
megapixels. i think we can improve on epeg by adding exif thumbnail handling
(check if a thumbnail is already inlined in the jpeg as extended exif info and
use that if it's appropriate). anyway - i digress. :)

> > anyway - you DO have a very valid point for when 2 apps start thumbnailing
> > the same dir. we should definitely put in a locking mechanism for that so
> > either they share the workload, or the first guy in gets "lock ownership"
> > and drives the thumbnailing until he's done and the other process sits and
> > waits (maybe polling the lock file if we use that mechanism - the owner
> > could update the timestamp on the lockfile whenever it generates
> > something. or the lock file could contain info as to the queue of
> > ungenerated images to go... or maybe the simplest case: all processes not
> > owning the thumbnailing for that dir hold off until the owner releases
> > (hopefully not too long from now) and then do a full update).
> 
> 
> I'm actually pretty surprised the fd.o spec didn't address locking at all,
> at least not that I saw the last time I read it.

indeed - i should read it again. it's been a few years since i last read it,
but i think they kind of thought the chance of concurrency was low and thus
simply a performance issue in the end, and well - certain other desktop
environments pushing such things haven't been renowned for their focus on
efficiency in the past :)
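
for reference, the "lock ownership" idea quoted above could be sketched roughly like this (Python, not taken from any existing code). all function names are hypothetical. the lock file location is up to the caller, and the `stale_after` cutoff is an added assumption: the thread only says the owner could update the lock file's timestamp as it makes progress.

```python
import os
import time

def try_acquire_dir_lock(lock_path):
    """Atomically create the lock file; True means we own the thumbnailing."""
    try:
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except FileExistsError:
        return False  # somebody else got in first
    with os.fdopen(fd, "w") as f:
        f.write(str(os.getpid()))
    return True

def touch_lock(lock_path):
    """Owner calls this after each generated thumb to show it is still alive."""
    os.utime(lock_path, None)

def release_dir_lock(lock_path):
    os.unlink(lock_path)

def wait_for_release(lock_path, poll_interval=0.1, stale_after=30.0):
    """Poll until the lock file vanishes ("released") or stops being
    touched for stale_after seconds ("stale", an assumed policy)."""
    while True:
        try:
            age = time.time() - os.stat(lock_path).st_mtime
        except FileNotFoundError:
            return "released"
        if age > stale_after:
            return "stale"
        time.sleep(poll_interval)
```

a non-owner would call `wait_for_release()` and then do its full update of the directory, which is the "simplest case" in the quoted text.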

> > anyway - i do agree that there is need to unify. i do also think there are
> > 3 levels here. 1. just generate thumbs and let calling process know (either
> > via a blocking api or a fork/event), 2. be able to ask for the thumb path
> > for any given file path, and 3. load thumb into a canvas object (another
> > level entirely). i do think you want to support blocking and async - both.
> > really async can just be a wrapper on top of the blocking api.
> 
> I think 1 and 2 are what I'm most concerned with atm. Following the fd.o
> spec (to a point, since its lack of jpeg support is just dumb) 3 is not a
> large issue since it's just loading a png or jpeg.

i forgot: thumbnail pruning. ie go through thumbnail cache/stash and find
thumbnails that don't apply to any existing file :) need that.
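
a rough sketch of such a pruning pass, in Python. how a thumbnail records which file it came from is left to a caller-supplied function, `source_for_thumb`, which is hypothetical (as far as i recall the fd.o spec stores the original URI inside the png, but that is not parsed here).

```python
import os

def prune_thumbnails(thumb_dir, source_for_thumb):
    """Delete thumbnails whose original file no longer exists.

    source_for_thumb(thumb_path) returns the original file's path, or
    None when the origin is unknown. Returns the removed thumb paths.
    """
    removed = []
    for name in sorted(os.listdir(thumb_dir)):
        thumb = os.path.join(thumb_dir, name)
        if not os.path.isfile(thumb):
            continue
        source = source_for_thumb(thumb)
        if source is None:
            continue  # unknown origin: leave it alone rather than guess
        if not os.path.exists(source):
            os.remove(thumb)
            removed.append(thumb)
    return removed
```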

> > imho it could do with:
> >
> > 1. add a file path to the thumb gen queue
> > 2. delete a file path from the queue
> > 3. begin queue processing
> > 4. pause/unpause queue processing
> > 5. end queue processing
> > 6. ask for thumb path from file path
> > 7. brute-force blocking-api generate thumb
> > 8. get "new thumb available" events
> > 9. set parallelism count (how many threads or forked children to allow at
> >    a time)
> 
> 
> I think we're on the same page here as far as features. So here's the idea I
> had in mind for implementation. First off, we have a lib that provides the
> blocking API with Epsilon. I spoke to atmos about all of this and he
> expressed a desire to keep Epsilon simple and not expand the functionality
> much at that level. So that could provide the lowest level blocking

sure. epsilon seems fine. i think some internals work to move deps to
runtime dlopens might be nice :)

> thumbnail generation based on MIME type with plugins (as mentioned in your
> next paragraph). To address the async aspect, we'd wrap Epsilon with the

plugins are definitely another way - either you just expand epsilon's
internals to dlopen() a new lib manually and handle it, if that lib exists, or
have a plugin do that. :)

> queue processing and event API with the features you mention above. Then to
> provide the async behavior (and address the locking problem), the lib could
> actually setup an IPC channel and fork off a small daemon. That daemon would

OR the lib EXECS a known existing utility daemon. the user can run it at
startup to keep it around, or it's auto-spawned with the -terminate 10 (after
the last queued item is finished wait for 10 seconds then quit) as you say
below. this gives an option for some people to have it always around - but the
default being spawn as needed then sit in a small loop for up to 2 seconds or
whatever trying to connect to the thumbnailer ipc service and if this fails,
give up.
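
the client side of that could look roughly like the Python sketch below. the socket path, the `spawn_cmd` argument and the function name are all hypothetical (no such daemon exists yet), and a unix domain socket is just one assumed choice of ipc channel.

```python
import socket
import subprocess
import time

def connect_to_thumbnailer(sock_path, spawn_cmd=None, timeout=2.0, interval=0.05):
    """Try to reach the thumbnailer; spawn it once if it is not there.

    Returns a connected socket, or None if nothing answers within timeout.
    """
    deadline = time.monotonic() + timeout
    spawned = False
    while True:
        s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        try:
            s.connect(sock_path)
            return s
        except OSError:
            s.close()
        if spawn_cmd and not spawned:
            # exec the utility daemon once; a real one would detach itself
            subprocess.Popen(spawn_cmd)
            spawned = True
        if time.monotonic() >= deadline:
            return None  # give up, the caller can fall back to blocking generation
        time.sleep(interval)
```

if the user already runs the daemon at startup the first connect succeeds and nothing is spawned.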

> then be available to all processes owned by that user and fork off processes
> responsible for the actual generation of the thumbnails. Since the daemon
> provides the only route to thumbnail generation we don't have to deal with
> locking or potential deadlocks and the race condition is eliminated. If the
> daemon is auto-started on demand, it could also exit after a configurable
> amount of inactivity. It also knows what's currently being processed so it
> can shortcut the requeueing of duplicate items.

sure. definitely. sounds good and proper.

> The only real downside I see to this is the IPC communication overhead,
> though entropy actually does this to communicate between threads and appears
> to do so w/o any significant performance impact. Any other concerns that I'm
> missing?

well if we generate the thumbnail path in the lib/client, look to see if it
exists - then just use that and never deal with a daemon service over ipc, we
will cut out the worst possible offender (client-server round trips needing to
ask the server for thumbnail path). then we only incur latency and ipc
overhead when waiting on a thumbnail generation - and really 99% of that work
is in actually loading and decoding the image file, scaling it and writing it
out again. our problems occur when the work to thumbnail becomes so little the
ipc does become significant overhead (which only happens in the "already
generated" case really - which we have shortcutted client-side anyway)
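
a sketch of that client-side shortcut in Python. the md5-of-URI naming follows my reading of the fd.o thumbnail spec and should be treated as an assumption here. `request_generation` is a hypothetical callback standing in for the ipc request to the daemon, and checking whether the thumb is older than the file is left out.

```python
import hashlib
import os
from pathlib import Path

def thumb_path_for(file_path, cache_dir):
    """Compute where the thumb for file_path would live, without any ipc."""
    uri = Path(file_path).resolve().as_uri()  # e.g. file:///home/me/a.jpg
    digest = hashlib.md5(uri.encode("utf-8")).hexdigest()
    return os.path.join(cache_dir, digest + ".png")

def get_thumb(file_path, cache_dir, request_generation):
    """Return the thumb path, only talking to the daemon when it is missing."""
    thumb = thumb_path_for(file_path, cache_dir)
    if os.path.exists(thumb):
        return thumb  # "already generated" case: zero round trips
    # slow path: ipc cost is small next to decode/scale/write
    return request_generation(file_path, thumb)
```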

> > locking can be implemented under the bonnet of such an api. also one thing

but since it's a daemon service - locking is not needed as the server will
deal with it internally :) (ie it will only generate 1 thumb of 1 file on the
queue as it will do queue merging: if the same file is on the todo queue twice
the 2nd instance gets promoted to the first and all clients waiting on an
answer for that file get the "done" reply when the first instance is done -
thus only 1 copy of a file on the queue at a time and thus no concurrency
issues) :) just need to deal with standard daemon issues (if a client
disconnects, unref/remove all that client's queued requests, etc.)
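
the queue merging could be sketched like this (Python, purely illustrative, the class and method names are made up). tracking of the item currently being generated is left out to keep it short.

```python
from collections import OrderedDict

class ThumbQueue:
    """Todo queue holding each file at most once, with a waiter list per file."""

    def __init__(self):
        self._waiters = OrderedDict()  # path -> clients waiting on that path

    def request(self, client, path):
        """Queue path for client. Returns True if a new entry was added,
        False if the request was merged into an existing entry."""
        waiting = self._waiters.get(path)
        if waiting is not None:
            if client not in waiting:
                waiting.append(client)
            return False
        self._waiters[path] = [client]
        return True

    def next_path(self):
        """Oldest queued path, or None when the queue is empty."""
        return next(iter(self._waiters), None)

    def finish(self, path):
        """Drop path from the queue; returns the clients owed a "done" reply."""
        return self._waiters.pop(path, [])

    def disconnect(self, client):
        """Remove client's requests; entries nobody waits on are dropped."""
        for path in list(self._waiters):
            waiting = self._waiters[path]
            if client in waiting:
                waiting.remove(client)
            if not waiting:
                del self._waiters[path]
```

two clients asking for the same file end up as one queue entry with two waiters, so the file is only ever generated once.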

> > i think might be good here is that the lib actually dynamically adapts to
> > whatever libs it can find at RUNTIME - not compile-time. so if it finds
> > imlib2, it will use it. if it finds evas, it will use that, if it finds
> > epeg, it will use that - it can dlopen the libs just like the runtime
> > linker, and thus adapt to whatever is on the system at runtime without
> > compile time dependencies (installing more libs just gets faster
> > thumbnailing or more format support etc.). this allows us to add other
> > things in future under the hood (thumbnailing pdf's, html files, text
> > files, svg, etc.)
> 
> 
> Yeah, I think this all belongs at the lower level blocking API.

agreed. :)
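
to illustrate the runtime probing idea, here is a small Python sketch using ctypes, which goes through dlopen() underneath on linux. the real thing would be C calling dlopen()/dlsym() directly. the function name, the probe order and the exact library names passed to the lookup are assumptions.

```python
import ctypes
import ctypes.util

# libs named in the thread; the order here is an assumed preference
CANDIDATES = ("epeg", "evas", "Imlib2")

def probe_backends(names=CANDIDATES):
    """Return {name: loaded library handle} for every lib found at runtime."""
    found = {}
    for name in names:
        lib_file = ctypes.util.find_library(name)
        if lib_file is None:
            continue  # not installed: simply not offered as a backend
        try:
            found[name] = ctypes.CDLL(lib_file)
        except OSError:
            continue  # present but not loadable: treat as missing
    return found
```

installing another lib later changes the result of `probe_backends()` without rebuilding anything, which is the point being made above.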

> Nathan
> 


-- 
------------- Codito, ergo sum - "I code, therefore I am" --------------
The Rasterman (Carsten Haitzler)    [EMAIL PROTECTED]
裸好多
Tokyo, Japan (東京 日本)


_______________________________________________
enlightenment-devel mailing list
enlightenment-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/enlightenment-devel
