Re: [osmosis-dev] calling DataPostbox.release() multiple times

2012-01-18 Thread Richard Hansen

On 2012-01-18 07:23, Brett Henderson wrote:


On 16 January 2012 07:22, Richard Hansen  wrote:


On 2012-01-15 07:34, Brett Henderson wrote:


You're correct in saying that Releasable classes can be released
multiple times, however that is not typically how they're used
in the Osmosis pipeline.


Does it make sense to change the Releasable API definition to say
that release can only be called once?  Or might that break
multiple-sink stages like merge?


I've updated the Releasable API definition to say that implementations
should support release being called multiple times but that it isn't
mandatory and cannot be relied on by clients.


Cool.  With that change, no other changes should be necessary.


release() and outputRelease() have been designed to support re-use.  I
don't think I've tested it, but I believe it will work.  The two
variables you mention are initialized near the start of those methods
and only read after the Released flags have been set so their
initial state isn't important.  However I can see that is confusing so
I've modified their initial state to match their final state of true
(I'll check it in shortly).


I agree with that change.  The only other thing I would suggest is a 
comment indicating that the class is designed to support reuse.  (Or are 
all tasks supposed to be reusable?)




Your patch would probably work but I'd prefer to continue to support
re-use rather than to use that variable for a second purpose.


Good point.  With your Releasable contract change, the patch is 
unnecessary anyway.





An alternative is to increase the burden on the users of
DataPostbox. They could be required to either ensure that the output
thread restarts or avoid calling DataPostbox.release() once the
output thread exits. For example, EntityBuffer.run() could set an
'outputThreadIsRunning' boolean at the top and clear it at the end.
Then EntityBuffer.release() would be modified to only call
buffer.release() if outputThreadIsRunning is true.


Race conditions between threads are tough to avoid, and I believe the
above change would introduce one.  If the output thread hasn't entered
EntityBuffer.run yet and the outputThreadIsRunning flag hasn't been set,
but the calling thread encounters an early exception and calls
EntityBuffer.release, the calling thread will think that the output
thread isn't running and there is no need to perform release
functionality.  This would then lead to the output thread blocking when
it finally does start.


Yes, you are correct -- there is a race condition in that scheme. 
Concurrency is fun, isn't it?  :)




The way the class is constructed now, each thread will synchronise at
the initialize, complete and release methods.  If either thread skips
straight to release


Oh, I didn't realize that was possible.  That makes it hard if not 
impossible to support both reuse and multiple calls to release().



then the other thread detects that as an error
condition and aborts appropriately, but in all cases both threads will
synchronise around release before exiting.  The only reason to
synchronise around release is to support re-use which may never be used
in practice but it should work well.  We can't even rely on the
initialize methods being called because it is possible for a thread to
fail before that is called.

I'd rather leave the class as it is designed now.  It's a tradeoff
between supporting re-use and incurring the limitation that every call
to release must be matched by a call to outputRelease in another
thread.  I don't think it's possible to have it both ways.


I agree with your analysis.

Thanks for making the changes!

-Richard

___
osmosis-dev mailing list
osmosis-dev@openstreetmap.org
http://lists.openstreetmap.org/listinfo/osmosis-dev


Re: [osmosis-dev] Remove entities in a bounding box leave other intact

2012-01-18 Thread Jaume Figueras i Jové

Thank you Frederik,

worked perfectly.

On 01/16/2012 08:03 PM, Frederik Ramm wrote:

You can use the --bp task and define a polygon with an exclamation mark
before the polygon ID to cut something out, see
http://wiki.openstreetmap.org/wiki/Osmosis/Polygon_Filter_File_Format


Jaume.

___
osmosis-dev mailing list
osmosis-dev@openstreetmap.org
http://lists.openstreetmap.org/listinfo/osmosis-dev


Re: [osmosis-dev] calling DataPostbox.release() multiple times

2012-01-18 Thread Brett Henderson
On 16 January 2012 07:22, Richard Hansen  wrote:

> On 2012-01-15 07:34, Brett Henderson wrote:
>
>> You're correct in saying that Releasable classes can be released
>> multiple times, however that is not typically how they're used in the
>> Osmosis pipeline.
>>
>
> Does it make sense to change the Releasable API definition to say that
> release can only be called once?  Or might that break multiple-sink stages
> like merge?


I've updated the Releasable API definition to say that implementations
should support release being called multiple times but that it isn't
mandatory and cannot be relied on by clients.

Multiple sink stages like merge expose two separate sink instances to the
two upstream pipes (through MultiSink interface which allows multiple Sink
instances to be obtained) which feed into separate DataPostbox instances
internally.  Each sink instance only has its release method called once.
So a single release call limitation shouldn't break anything.


>
>  Looking at the code a bit more I think it might only need minor changes.
>>  I had thought that release() and outputRelease() changed the state to
>> equal that of a freshly constructed DataPostbox (to support reuse?), but
>> that's not true:  inputExit and outputExit are false after construction but
>> true after release().  So simply checking inputExit at the beginning of
>> release() should take care of it.  See the attached (untested) patch.
>>
>
release() and outputRelease() have been designed to support re-use.  I
don't think I've tested it, but I believe it will work.  The two variables
you mention are initialized near the start of those methods and only read
after the Released flags have been set so their initial state isn't
important.  However I can see that is confusing so I've modified their
initial state to match their final state of true (I'll check it in shortly).

Your patch would probably work but I'd prefer to continue to support re-use
rather than to use that variable for a second purpose.


>
> An alternative is to increase the burden on the users of DataPostbox. They
> could be required to either ensure that the output thread restarts or avoid
> calling DataPostbox.release() once the output thread exits. For example,
> EntityBuffer.run() could set an 'outputThreadIsRunning' boolean at the top
> and clear it at the end.  Then EntityBuffer.release() would be modified to
> only call buffer.release() if outputThreadIsRunning is true.


Race conditions between threads are tough to avoid, and I believe the above
change would introduce one.  If the output thread hasn't entered
EntityBuffer.run yet and the outputThreadIsRunning flag hasn't been set,
but the calling thread encounters an early exception and calls
EntityBuffer.release, the calling thread will think that the output thread
isn't running and there is no need to perform release functionality.  This
would then lead to the output thread blocking when it finally does start.

The way the class is constructed now, each thread will synchronise at the
initialize, complete and release methods.  If either thread skips straight
to release then the other thread detects that as an error condition and
aborts appropriately, but in all cases both threads will synchronise around
release before exiting.  The only reason to synchronise around release is
to support re-use which may never be used in practice but it should work
well.  We can't even rely on the initialize methods being called because it
is possible for a thread to fail before that is called.

I'd rather leave the class as it is designed now.  It's a tradeoff between
supporting re-use and incurring the limitation that every call to release
must be matched by a call to outputRelease in another thread.  I don't
think it's possible to have it both ways.

Brett
___
osmosis-dev mailing list
osmosis-dev@openstreetmap.org
http://lists.openstreetmap.org/listinfo/osmosis-dev