It is possible to modify in a way that is 100% compatible with existing 
"correct" practice, which is to lock and unlock the object. Lock the state when 
any parse/unparse call is in flight, unlock when none are in flight.

setters can then block waiting for the lock to be freed.

That's guaranteed to be compatible with any correct current practice.

I just hate to put all that complexity in there.
________________________________
From: Steve Lawrence <stephen.d.lawre...@gmail.com>
Sent: Wednesday, March 25, 2020 3:44 PM
To: dev@daffodil.apache.org <dev@daffodil.apache.org>
Subject: Re: Compiler.setExternalDFDLVariable(s) considered challenged

My one minor concern about the locking after parse/unparse is that it
changes behavior, and could potentially break things (I assume use after
lock is an assertion of some kind?).

It's definitely possible to use the DataProcessor in a safe manner while
mutating the state currently if someone is careful, so to not allow
mutating it after calling parse/unparse could potentially break someones
existing use of Daffodil.

What are your thoughts on @deprecating those functions, the deprecation
message can be something like "This is not thread-safe, be very careful
if you do not use it". And rather than locking and asserting, we could
just create a warning and say "You're using this in a potentially
dangerous manner, you should use the new API"?

- Steve

On 3/25/20 3:35 PM, Beckerle, Mike wrote:
> A fully functional API is probably good to have, and I'd like to define that 
> also and cut over to using that within Daffodil.
>
> We would need a dataProcessor.reset() method that creates a new dp which has 
> had all external vars, tunables, basically any of those formerly settable 
> things, forgotten so you can start clean.
>
> (Would clean() be a better name than reset() ?)
>
> This reset() subsumes my clone() method idea.
>
> I want to deprecate (literally with "@deprecated" annotations) the setters 
> generally, but my compromise is once you parse/unparse the object becomes 
> immutable permanently. So the setters are just a different way to 
> parameterize the object.
>
> The other thing that is impacted by this change it the 
> OpenDFDL/ibmCrossTester that implements these same APIs but drives IBM DFDL 
> with them.  That's ok for now. I can retrofit that later. It will still work, 
> just get deprecation warnings.
>
> ________________________________
> From: Steve Lawrence <stephen.d.lawre...@gmail.com>
> Sent: Wednesday, March 25, 2020 12:05 PM
> To: dev@daffodil.apache.org <dev@daffodil.apache.org>
> Subject: Re: Compiler.setExternalDFDLVariable(s) considered challenged
>
> This feels like the right approach. Thoughts on instead of having to
> manually clone() and having some sort of locking mechanism when parse is
> called, we just make a DataProcessor completely immutable, and setting a
> variable will implicitly create a copy with modified state, something like:
>
> val dp = dataProcessor
>            .withTunable("tunableFoo", "1")
>            .withVariable("variableFoo", 2")
>            .withVariable("variableBar", 2")
>
> It's not as java-y, but feels safer. I can imagine a case where someone
> sets a bunch of state, calls parse and locks things, and then way down
> the line they get a "data processor is locked" error because they tried
> to change some state.
>
>
> On 3/25/20 11:50 AM, Beckerle, Mike wrote:
>> Here's my concrete proposal to fix DataProcesor's API.
>>
>> We add a clone() method. It copies the object sharing the big/expensive 
>> stuff like the compiled schema but gives you a separate copy of all the 
>> state that is settable via setValidationMode, setExternalDFDLVariable, and 
>> setTunable.
>>
>> So you get a DataProcessor either from a ProcessorFactory, or from 
>> Compiler.reload().
>>
>> You set state.
>> You can call parse/unparse on various threads
>>
>> If you want to change state settings for another parse call, you must call 
>> clone() to get another instance of the DataProcessor. This will share the 
>> expensive/big stuff like the compiled schema, but give you a separate copy 
>> of all the state that is settable via setValidationMode, 
>> setExternalDFDLVariable, and setTunable.
>>
>> Then you set state
>> You can call parse/unparse on various threads.
>>
>> I hate to get into locking and such, but once you call parse/unparse, that 
>> should lock the object state, and any further setter calls should 
>> fail/throw. You should have to clone it to "change" the settings.
>>
>> This makes the use of the setters a java-ish way of achieving the same 
>> reentrancy one would get if there were no setters and simply more arguments 
>> to the parse/unparse calls.
>>
>> ________________________________
>> From: Beckerle, Mike <mbecke...@tresys.com>
>> Sent: Wednesday, March 25, 2020 11:29 AM
>> To: dev@daffodil.apache.org <dev@daffodil.apache.org>
>> Subject: Re: Compiler.setExternalDFDLVariable(s) considered challenged
>>
>> To respond to my own thread here, I think given that we allow multiple 
>> simultaneous calls to parse/unparse from different threads, we must say that 
>> the DataProcessor object is immutable once parse or unparse are called.
>>
>> I suppose we could say that it is mutable, but behavior is undetermined if 
>> any parse or unparse calls are active on any thread.
>>
>> But this is just asking for trouble IMHO.
>>
>> I think we started out with a stateful, non-thread capable API. The idea is 
>> that one thread would be invoking a data processor at a time. A data 
>> procesor was the state-block of an execution.
>>
>> The need to share compiled processor reloads, because the compile schemas 
>> were expensive to create, tempted us to allow multiple parse/unparse calls 
>> on different threads.
>>
>> Fact is, I think we should have said no to this, provided a 
>> DataProcessor.clone() to create instances that shared the reloaded compiled 
>> schema binary, but otherwise had separate state, and said that parse/unparse 
>> were synchronized methods on their DP instance.
>>
>> Instead we're in a 1/2 way world where we don't have a thread-reasonable API 
>> due to mutable state in what turns out to be a cross-thread shared object.
>>
>> ________________________________
>> From: Beckerle, Mike <mbecke...@tresys.com>
>> Sent: Wednesday, March 25, 2020 10:55 AM
>> To: dev@daffodil.apache.org <dev@daffodil.apache.org>
>> Subject: Compiler.setExternalDFDLVariable(s) considered challenged
>>
>> Why does the API for Daffodil have
>>
>> Compiler.setExternalDFDLVariable(...)
>> and
>> Compiler.setExternalDFDLVariables(...)
>>
>> on it.
>>
>> I believe we should deprecate this.
>>
>> Compilers are parameterized by some of the tunables I understand.
>>
>> But the external DFDL variables? These cannot affect compilation. The schema 
>> compiler needs to know statically the information about variables found in 
>> the schema itself in the dfdl:defineVariable statement annotations.
>>
>> But the compiler doesn't need external variable bindings. In fact if it did 
>> know and use them, it would be building assumptions into the compiled schema 
>> that it shouldn't be building in.
>>
>> Setting external var bindings on the Compiler just causes problems when that 
>> compiler instance is reused in a context where those settings aren't 
>> appropriate. (JIRA DAFFODIL-2302 is one such problem)
>>
>> I believe setExternalDFDLVariable(s) methods should be deprecated, and 
>> external variables bindings should be an optional argument to the 
>> parse/unparse methods.
>>
>> The setters cause thread safety issues because the DP is stateful, even 
>> though we want multiple calls to parse/unparse to be executable on different 
>> threads.
>>
>> Consider: if we allow ordinary setExternalDFDLVariables and add a 
>> resetExternalDFDLVariables to clear them, then imagine one wants to make two 
>> parse calls on separate threads with different external variables bindings:
>>
>> so on main thread.....
>>      dp.setExternalVariables(...bindings 1...)
>>      spawn the thread 1
>> on the thread 1
>>      dp.parse(....)
>> back on main thread
>>      dp.resetExternalVariables() // race condition. Did the parse call read 
>> the external variables before this reset or not?
>>      dp.setExternalVariables(...bindings 2....)
>>     .....
>>
>> However, if we make the external variable bindings an argument to parse, we 
>> avoid all of this.
>>
>> Alternatively, since DataProcessor has setExternalDFDLVariable, we can 
>> prohibit multiple calls on the same DataProcessor object simultaneously. We 
>> can provide a clone() method that preserves the loaded/reloaded processor, 
>> but constructs another DataProcessor object, thereby allowing separate 
>> external variables state per DataProcessor instance.
>>
>>
>> Comments?
>>
>>
>>
>>
>>
>> Mike Beckerle | Principal Engineer
>> [Owl Cyber Defense]<http://owlcyberdefense.com>
>> [cid:2a423dec-0558-414e-b369-14a24becc40f] is now a part of 
>> Owl<https://owlcyberdefense.com/news/owl-cyber-defense-tresys-technology-announce-merger/>
>> P +1-781-330-0412
>> W owlcyberdefense.com<http://www.owlcyberdefense.com>
>>
>
>

Reply via email to