2009/12/17 Ingo Gambin <igam...@brilliant.de>

> Basically with already existing pdf-files that works fine, but after
> adding timing output to the whole procedure I realized that, although
> the extraction method has finished and the next methods are called, itext
> (or the system) is still writing the file to directory B. Therefore, when
> the browser tries to embed it (adobe plugin), it's not fully written and
> the adobe plugin cannot read it.
>

I'm surprised, as I didn't think itext used background threads in order to
write a document.  Certainly I've never had a problem with this, though I
may just have been lucky!  Is the directory on the same machine, or is it
remote?  If remote, network bandwidth/latency may be causing the issue you
describe.


> now I wonder how to solve that problem. So to my ideas:
>        a) only continue after extraction when the file is fully written
>                => Actually I have no idea about how to check that
>

I'd ask Bruno and co on the itext list how you would know that itext has
finished.
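
Independently of what iText does internally, one common way to make sure
a reader never sees a half-written file is to write to a temporary name
in the same directory and rename it only after the stream has been closed.
Below is a minimal sketch of that idea, not a statement about how iText
behaves. The class name SafePublish and the Writer callback are
hypothetical, and it assumes the file system supports an atomic rename
within one directory (it may not on some network shares).

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper: publish a file only once it is completely written. */
public final class SafePublish {
    private SafePublish() {}

    /** Hypothetical callback that produces the file content (e.g. the PDF extraction). */
    public interface Writer {
        void writeTo(OutputStream out) throws IOException;
    }

    public static void publish(Path target, Writer writer) throws IOException {
        Path dir = target.toAbsolutePath().getParent();
        // Temp file lives in the same directory so the final move is a plain rename.
        Path tmp = Files.createTempFile(dir, "partial-", ".tmp");
        try {
            // try-with-resources closes (and flushes) the stream before the move.
            try (OutputStream out = Files.newOutputStream(tmp)) {
                writer.writeTo(out);
            }
            // Assumption: the file system supports ATOMIC_MOVE; otherwise this throws
            // AtomicMoveNotSupportedException. Whether an existing target is replaced
            // is platform-specific.
            Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
        } finally {
            // No-op after a successful move; cleans up the partial file on failure.
            Files.deleteIfExists(tmp);
        }
    }
}
```

With this arrangement the file name the browser is pointed at only appears
once its content is complete, so the servlet can continue as soon as
publish() returns.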


>        b) have the servlet wait/sleep for maybe up to a second
>                => from what I found so far, sleeping a servlet is NOT
>                good and on the other hand what if one pdf-page I want
>                to extract is so big/has so many graphics in it that the
>                process takes longer than a second
>

"Not good" is a generalisation; let's look at the specifics.  First off,
you're not sleeping "a servlet"; you're sleeping a thread.  Sleeping a
thread means that the request being handled by that thread takes longer to
complete.  Therefore, the thread is returned to the pool later.  Therefore,
more threads are needed to achieve the same throughput.  This is only a
problem if your server can't handle the extra load.

I wouldn't just throw sleep calls around the code like they were confetti,
but I'll confess to having one place in my own code where an HTTP request has
to wait for an external executable to complete its task and write some rows
into a relational database.  Unfortunately the only approach here is to
poll: repeatedly try to retrieve the "I'm done" row, sleeping a while
between each test.  It works well enough, until we get a better
architectural solution to the problem.
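
The polling approach described above could be sketched roughly as follows.
This is an illustration only, not the actual code: the class name Poller,
the method waitUntil, and its parameters are assumptions, and the "is it
done yet?" test is passed in as a generic condition. A timeout bounds how
long the request thread can be held, which addresses the worry about a
fixed one-second sleep being too short or too long.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical helper: poll a condition, sleeping between tests, up to a timeout. */
public final class Poller {
    private Poller() {}

    /**
     * Returns true as soon as isDone reports true, or false if timeoutMillis
     * elapses first or the thread is interrupted while sleeping.
     */
    public static boolean waitUntil(BooleanSupplier isDone, long timeoutMillis, long intervalMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (isDone.getAsBoolean()) {
                return true;
            }
            long remainingNanos = deadline - System.nanoTime();
            if (remainingNanos <= 0) {
                return false; // gave up: the condition never became true in time
            }
            // Sleep for the interval, but never past the deadline.
            long sleepMillis = Math.min(intervalMillis, Math.max(1L, remainingNanos / 1_000_000L));
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag so the caller can see it, then stop waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }
}
```

A request handler might call something like
Poller.waitUntil(() -> doneRowExists(jobId), 5000, 100), where
doneRowExists is a hypothetical check for the "I'm done" marker, and
return an error page if the result is false.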

- Peter
