Re: sbt testOnly not working for me

2020-03-25 Thread Steve Lawrence
sbt treats each argument as a separate sbt comand. So when you run:

  sbt testOnly **.Link16Test

because of the space, those are seen as two separate arguments, which is
equivalent to running these back to back:

  sbt testOnly
  sbt **.Link16Test

If you have a single command with a space in it, you need to wrap the
command in quotes so it's seen as a single argument, e.g.:

  sbt 'testOnly **.Link16Test'

On 3/25/20 5:37 PM, Beckerle, Mike wrote:
> 
> So due to bug DAFFODIL-2302, I have a schema where I can't run all the tests 
> at 
> once via 'sbt test'.
> 
> I wanted to run the tests separately via 'sbt testOnly **.Link16Test' and 
> 'sbt 
> testOnly **.Link16DERGTest' which are two different TDML test suites.
> 
> But when I do that, in every case it runs all the tests and therefore runs 
> into 
> the bug in DAFFODIL-2302, which will not be reassuring to a user.
> 
> Is there something I'm doing wrong that is making sbt testOnly not work?
> 
> Does this work for others?
> 
> Mike Beckerle | Principal Engineer
> Owl Cyber Defense 
> /is now a part of Owl 
> /
> P +1-781-330-0412
> W owlcyberdefense.com 
> 



sbt testOnly not working for me

2020-03-25 Thread Beckerle, Mike

So due to bug DAFFODIL-2302, I have a schema where I can't run all the tests at 
once via 'sbt test'.

I wanted to run the tests separately via 'sbt testOnly **.Link16Test' and 'sbt 
testOnly **.Link16DERGTest' which are two different TDML test suites.

But when I do that, in every case it runs all the tests and therefore runs into 
the bug in DAFFODIL-2302, which will not be reassuring to a user.

Is there something I'm doing wrong that is making sbt testOnly not work?

Does this work for others?

Mike Beckerle | Principal Engineer
[Owl Cyber Defense]
[cid:0886c233-19d0-494d-8264-481fc46e507e] is now a part of 
Owl
P +1-781-330-0412
W owlcyberdefense.com


Re: [DISCUSS] Daffodil 2.6.0 Release

2020-03-25 Thread Adams, Joshua
I did have interest, but it doesn't look like I'll have the time to do any 
Daffodil work for at least the next 2 weeks.  I have non-issue with someone 
else taking this one.

Josh

On Mar 25, 2020 3:56 PM, Olabusayo Kilo  wrote:
I'd like to have the below done for 2.6.0

https://issues.apache.org/jira/browse/DAFFODIL-2305 - Remove some unused
methods from Term.scala

https://issues.apache.org/jira/browse/DAFFODIL-2275 - Reduce code smells
and make some configuration changes to Sonarqube

I don't mind managing the release, although I think Josh A had expressed
interest last time.


On 3/25/20 3:24 PM, Beckerle, Mike wrote:
> I have two issues I'd like fixed for 2.6.0
>
> https://issues.apache.org/jira/browse/DAFFODIL-2302 - this is making it 
> difficult to update a Link16 schema that is in use by multiple parties.
>
> There's a second issue, which is that we still have this "NadaParsers are 
> supposed to optimize out" problem. Came up in this same schema again even 
> after the fix which supposedly addressed it. So I think the fix solved some, 
> but not all of this problem.  I will try to recreate this and reopen the 
> ticket if it is still happening.
>
> I don't want to manage the release, but I definitely think it's better if it 
> isn't Steve L., ... I could have my arm twisted.
> 
> From: Steve Lawrence 
> Sent: Wednesday, March 25, 2020 3:12 PM
> To: dev@daffodil.apache.org 
> Subject: [DISCUSS] Daffodil 2.6.0 Release
>
> With the last of the changes recently merged to support improved schema
> compilation speed and memory, I think now is a good time to start
> discussing the next release.
>
> 1) Does anyone have any changes that are close to being ready to merge
> or bugs that are worth delaying the next release until resolved?
>
> 2) Does anyone want to volunteer to be the release manager for 2.6.0?
> The release process [1] now uses containers for most of the heavy
> lifting, and since I did it last time it would be good to get someone
> else familiar with the new release process and make sure the container
> works for someone other than myself.
>
> - Steve
>
--
Best Regards
Lola K.



Re: [DISCUSS] Daffodil 2.6.0 Release

2020-03-25 Thread Olabusayo Kilo

I'd like to have the below done for 2.6.0

https://issues.apache.org/jira/browse/DAFFODIL-2305 - Remove some unused 
methods from Term.scala


https://issues.apache.org/jira/browse/DAFFODIL-2275 - Reduce code smells 
and make some configuration changes to Sonarqube


I don't mind managing the release, although I think Josh A had expressed 
interest last time.



On 3/25/20 3:24 PM, Beckerle, Mike wrote:

I have two issues I'd like fixed for 2.6.0

https://issues.apache.org/jira/browse/DAFFODIL-2302 - this is making it 
difficult to update a Link16 schema that is in use by multiple parties.

There's a second issue, which is that we still have this "NadaParsers are supposed 
to optimize out" problem. Came up in this same schema again even after the fix which 
supposedly addressed it. So I think the fix solved some, but not all of this problem.  I 
will try to recreate this and reopen the ticket if it is still happening.

I don't want to manage the release, but I definitely think it's better if it 
isn't Steve L., ... I could have my arm twisted.

From: Steve Lawrence 
Sent: Wednesday, March 25, 2020 3:12 PM
To: dev@daffodil.apache.org 
Subject: [DISCUSS] Daffodil 2.6.0 Release

With the last of the changes recently merged to support improved schema
compilation speed and memory, I think now is a good time to start
discussing the next release.

1) Does anyone have any changes that are close to being ready to merge
or bugs that are worth delaying the next release until resolved?

2) Does anyone want to volunteer to be the release manager for 2.6.0?
The release process [1] now uses containers for most of the heavy
lifting, and since I did it last time it would be good to get someone
else familiar with the new release process and make sure the container
works for someone other than myself.

- Steve


--
Best Regards
Lola K.



Re: Compiler.setExternalDFDLVariable(s) considered challenged

2020-03-25 Thread Beckerle, Mike
It is possible to modify in a way that is 100% compatible with existing 
"correct" practice, which is to lock and unlock the object. Lock the state when 
any parse/unparse call is in flight, unlock when none are in flight.

setters can then block waiting for the lock to be freed.

That's guaranteed to be compatible with any correct current practice.

I just hate to put all that complexity in there.

From: Steve Lawrence 
Sent: Wednesday, March 25, 2020 3:44 PM
To: dev@daffodil.apache.org 
Subject: Re: Compiler.setExternalDFDLVariable(s) considered challenged

My one minor concern about the locking after parse/unparse is that it
changes behavior, and could potentially break things (I assume use after
lock is an assertion of some kind?).

It's definitely possible to use the DataProcessor in a safe manner while
mutating the state currently if someone is careful, so to not allow
mutating it after calling parse/unparse could potentially break someones
existing use of Daffodil.

What are your thoughts on @deprecating those functions, the deprecation
message can be something like "This is not thread-safe, be very careful
if you do not use it". And rather than locking and asserting, we could
just create a warning and say "You're using this in a potentially
dangerous manner, you should use the new API"?

- Steve

On 3/25/20 3:35 PM, Beckerle, Mike wrote:
> A fully functional API is probably good to have, and I'd like to define that 
> also and cut over to using that within Daffodil.
>
> We would need a dataProcessor.reset() method that creates a new dp which has 
> had all external vars, tunables, basically any of those formerly settable 
> things, forgotten so you can start clean.
>
> (Would clean() be a better name than reset() ?)
>
> This reset() subsumes my clone() method idea.
>
> I want to deprecate (literally with "@deprecated" annotations) the setters 
> generally, but my compromise is once you parse/unparse the object becomes 
> immutable permanently. So the setters are just a different way to 
> parameterize the object.
>
> The other thing that is impacted by this change it the 
> OpenDFDL/ibmCrossTester that implements these same APIs but drives IBM DFDL 
> with them.  That's ok for now. I can retrofit that later. It will still work, 
> just get deprecation warnings.
>
> 
> From: Steve Lawrence 
> Sent: Wednesday, March 25, 2020 12:05 PM
> To: dev@daffodil.apache.org 
> Subject: Re: Compiler.setExternalDFDLVariable(s) considered challenged
>
> This feels like the right approach. Thoughts on instead of having to
> manually clone() and having some sort of locking mechanism when parse is
> called, we just make a DataProcessor completely immutable, and setting a
> variable will implicitly create a copy with modified state, something like:
>
> val dp = dataProcessor
>.withTunable("tunableFoo", "1")
>.withVariable("variableFoo", 2")
>.withVariable("variableBar", 2")
>
> It's not as java-y, but feels safer. I can imagine a case where someone
> sets a bunch of state, calls parse and locks things, and then way down
> the line they get a "data processor is locked" error because they tried
> to change some state.
>
>
> On 3/25/20 11:50 AM, Beckerle, Mike wrote:
>> Here's my concrete proposal to fix DataProcesor's API.
>>
>> We add a clone() method. It copies the object sharing the big/expensive 
>> stuff like the compiled schema but gives you a separate copy of all the 
>> state that is settable via setValidationMode, setExternalDFDLVariable, and 
>> setTunable.
>>
>> So you get a DataProcessor either from a ProcessorFactory, or from 
>> Compiler.reload().
>>
>> You set state.
>> You can call parse/unparse on various threads
>>
>> If you want to change state settings for another parse call, you must call 
>> clone() to get another instance of the DataProcessor. This will share the 
>> expensive/big stuff like the compiled schema, but give you a separate copy 
>> of all the state that is settable via setValidationMode, 
>> setExternalDFDLVariable, and setTunable.
>>
>> Then you set state
>> You can call parse/unparse on various threads.
>>
>> I hate to get into locking and such, but once you call parse/unparse, that 
>> should lock the object state, and any further setter calls should 
>> fail/throw. You should have to clone it to "change" the settings.
>>
>> This makes the use of the setters a java-ish way of achieving the same 
>> reentrancy one would get if there were no setters and simply more arguments 
>> to the parse/unparse calls.
>>
>> 
>> From: Beckerle, Mike 
>> Sent: Wednesday, March 25, 2020 11:29 AM
>> To: dev@daffodil.apache.org 
>> Subject: Re: Compiler.setExternalDFDLVariable(s) considered challenged
>>
>> To respond to my own thread here, I think given that we allow multiple 
>> simultaneous calls to parse/unparse from different threads, we must say that 
>> the 

Re: Compiler.setExternalDFDLVariable(s) considered challenged

2020-03-25 Thread Steve Lawrence
My one minor concern about the locking after parse/unparse is that it
changes behavior, and could potentially break things (I assume use after
lock is an assertion of some kind?).

It's definitely possible to use the DataProcessor in a safe manner while
mutating the state currently if someone is careful, so to not allow
mutating it after calling parse/unparse could potentially break someones
existing use of Daffodil.

What are your thoughts on @deprecating those functions, the deprecation
message can be something like "This is not thread-safe, be very careful
if you do not use it". And rather than locking and asserting, we could
just create a warning and say "You're using this in a potentially
dangerous manner, you should use the new API"?

- Steve

On 3/25/20 3:35 PM, Beckerle, Mike wrote:
> A fully functional API is probably good to have, and I'd like to define that 
> also and cut over to using that within Daffodil.
> 
> We would need a dataProcessor.reset() method that creates a new dp which has 
> had all external vars, tunables, basically any of those formerly settable 
> things, forgotten so you can start clean.
> 
> (Would clean() be a better name than reset() ?)
> 
> This reset() subsumes my clone() method idea.
> 
> I want to deprecate (literally with "@deprecated" annotations) the setters 
> generally, but my compromise is once you parse/unparse the object becomes 
> immutable permanently. So the setters are just a different way to 
> parameterize the object.
> 
> The other thing that is impacted by this change it the 
> OpenDFDL/ibmCrossTester that implements these same APIs but drives IBM DFDL 
> with them.  That's ok for now. I can retrofit that later. It will still work, 
> just get deprecation warnings.
> 
> 
> From: Steve Lawrence 
> Sent: Wednesday, March 25, 2020 12:05 PM
> To: dev@daffodil.apache.org 
> Subject: Re: Compiler.setExternalDFDLVariable(s) considered challenged
> 
> This feels like the right approach. Thoughts on instead of having to
> manually clone() and having some sort of locking mechanism when parse is
> called, we just make a DataProcessor completely immutable, and setting a
> variable will implicitly create a copy with modified state, something like:
> 
> val dp = dataProcessor
>.withTunable("tunableFoo", "1")
>.withVariable("variableFoo", 2")
>.withVariable("variableBar", 2")
> 
> It's not as java-y, but feels safer. I can imagine a case where someone
> sets a bunch of state, calls parse and locks things, and then way down
> the line they get a "data processor is locked" error because they tried
> to change some state.
> 
> 
> On 3/25/20 11:50 AM, Beckerle, Mike wrote:
>> Here's my concrete proposal to fix DataProcesor's API.
>>
>> We add a clone() method. It copies the object sharing the big/expensive 
>> stuff like the compiled schema but gives you a separate copy of all the 
>> state that is settable via setValidationMode, setExternalDFDLVariable, and 
>> setTunable.
>>
>> So you get a DataProcessor either from a ProcessorFactory, or from 
>> Compiler.reload().
>>
>> You set state.
>> You can call parse/unparse on various threads
>>
>> If you want to change state settings for another parse call, you must call 
>> clone() to get another instance of the DataProcessor. This will share the 
>> expensive/big stuff like the compiled schema, but give you a separate copy 
>> of all the state that is settable via setValidationMode, 
>> setExternalDFDLVariable, and setTunable.
>>
>> Then you set state
>> You can call parse/unparse on various threads.
>>
>> I hate to get into locking and such, but once you call parse/unparse, that 
>> should lock the object state, and any further setter calls should 
>> fail/throw. You should have to clone it to "change" the settings.
>>
>> This makes the use of the setters a java-ish way of achieving the same 
>> reentrancy one would get if there were no setters and simply more arguments 
>> to the parse/unparse calls.
>>
>> 
>> From: Beckerle, Mike 
>> Sent: Wednesday, March 25, 2020 11:29 AM
>> To: dev@daffodil.apache.org 
>> Subject: Re: Compiler.setExternalDFDLVariable(s) considered challenged
>>
>> To respond to my own thread here, I think given that we allow multiple 
>> simultaneous calls to parse/unparse from different threads, we must say that 
>> the DataProcessor object is immutable once parse or unparse are called.
>>
>> I suppose we could say that it is mutable, but behavior is undetermined if 
>> any parse or unparse calls are active on any thread.
>>
>> But this is just asking for trouble IMHO.
>>
>> I think we started out with a stateful, non-thread capable API. The idea is 
>> that one thread would be invoking a data processor at a time. A data 
>> procesor was the state-block of an execution.
>>
>> The need to share compiled processor reloads, because the compile schemas 
>> were expensive to create, tempted us to allow 

Re: Compiler.setExternalDFDLVariable(s) considered challenged

2020-03-25 Thread Beckerle, Mike
A fully functional API is probably good to have, and I'd like to define that 
also and cut over to using that within Daffodil.

We would need a dataProcessor.reset() method that creates a new dp which has 
had all external vars, tunables, basically any of those formerly settable 
things, forgotten so you can start clean.

(Would clean() be a better name than reset() ?)

This reset() subsumes my clone() method idea.

I want to deprecate (literally with "@deprecated" annotations) the setters 
generally, but my compromise is once you parse/unparse the object becomes 
immutable permanently. So the setters are just a different way to parameterize 
the object.

The other thing that is impacted by this change it the OpenDFDL/ibmCrossTester 
that implements these same APIs but drives IBM DFDL with them.  That's ok for 
now. I can retrofit that later. It will still work, just get deprecation 
warnings.


From: Steve Lawrence 
Sent: Wednesday, March 25, 2020 12:05 PM
To: dev@daffodil.apache.org 
Subject: Re: Compiler.setExternalDFDLVariable(s) considered challenged

This feels like the right approach. Thoughts on instead of having to
manually clone() and having some sort of locking mechanism when parse is
called, we just make a DataProcessor completely immutable, and setting a
variable will implicitly create a copy with modified state, something like:

val dp = dataProcessor
   .withTunable("tunableFoo", "1")
   .withVariable("variableFoo", 2")
   .withVariable("variableBar", 2")

It's not as java-y, but feels safer. I can imagine a case where someone
sets a bunch of state, calls parse and locks things, and then way down
the line they get a "data processor is locked" error because they tried
to change some state.


On 3/25/20 11:50 AM, Beckerle, Mike wrote:
> Here's my concrete proposal to fix DataProcesor's API.
>
> We add a clone() method. It copies the object sharing the big/expensive stuff 
> like the compiled schema but gives you a separate copy of all the state that 
> is settable via setValidationMode, setExternalDFDLVariable, and setTunable.
>
> So you get a DataProcessor either from a ProcessorFactory, or from 
> Compiler.reload().
>
> You set state.
> You can call parse/unparse on various threads
>
> If you want to change state settings for another parse call, you must call 
> clone() to get another instance of the DataProcessor. This will share the 
> expensive/big stuff like the compiled schema, but give you a separate copy of 
> all the state that is settable via setValidationMode, 
> setExternalDFDLVariable, and setTunable.
>
> Then you set state
> You can call parse/unparse on various threads.
>
> I hate to get into locking and such, but once you call parse/unparse, that 
> should lock the object state, and any further setter calls should fail/throw. 
> You should have to clone it to "change" the settings.
>
> This makes the use of the setters a java-ish way of achieving the same 
> reentrancy one would get if there were no setters and simply more arguments 
> to the parse/unparse calls.
>
> 
> From: Beckerle, Mike 
> Sent: Wednesday, March 25, 2020 11:29 AM
> To: dev@daffodil.apache.org 
> Subject: Re: Compiler.setExternalDFDLVariable(s) considered challenged
>
> To respond to my own thread here, I think given that we allow multiple 
> simultaneous calls to parse/unparse from different threads, we must say that 
> the DataProcessor object is immutable once parse or unparse are called.
>
> I suppose we could say that it is mutable, but behavior is undetermined if 
> any parse or unparse calls are active on any thread.
>
> But this is just asking for trouble IMHO.
>
> I think we started out with a stateful, non-thread capable API. The idea is 
> that one thread would be invoking a data processor at a time. A data procesor 
> was the state-block of an execution.
>
> The need to share compiled processor reloads, because the compile schemas 
> were expensive to create, tempted us to allow multiple parse/unparse calls on 
> different threads.
>
> Fact is, I think we should have said no to this, provided a 
> DataProcessor.clone() to create instances that shared the reloaded compiled 
> schema binary, but otherwise had separate state, and said that parse/unparse 
> were synchronized methods on their DP instance.
>
> Instead we're in a 1/2 way world where we don't have a thread-reasonable API 
> due to mutable state in what turns out to be a cross-thread shared object.
>
> 
> From: Beckerle, Mike 
> Sent: Wednesday, March 25, 2020 10:55 AM
> To: dev@daffodil.apache.org 
> Subject: Compiler.setExternalDFDLVariable(s) considered challenged
>
> Why does the API for Daffodil have
>
> Compiler.setExternalDFDLVariable(...)
> and
> Compiler.setExternalDFDLVariables(...)
>
> on it.
>
> I believe we should deprecate this.
>
> Compilers are parameterized by some of the tunables I understand.
>
> But 

Re: [DISCUSS] Daffodil 2.6.0 Release

2020-03-25 Thread Beckerle, Mike
I have two issues I'd like fixed for 2.6.0

https://issues.apache.org/jira/browse/DAFFODIL-2302 - this is making it 
difficult to update a Link16 schema that is in use by multiple parties.

There's a second issue, which is that we still have this "NadaParsers are 
supposed to optimize out" problem. Came up in this same schema again even after 
the fix which supposedly addressed it. So I think the fix solved some, but not 
all of this problem.  I will try to recreate this and reopen the ticket if it 
is still happening.

I don't want to manage the release, but I definitely think it's better if it 
isn't Steve L., ... I could have my arm twisted.

From: Steve Lawrence 
Sent: Wednesday, March 25, 2020 3:12 PM
To: dev@daffodil.apache.org 
Subject: [DISCUSS] Daffodil 2.6.0 Release

With the last of the changes recently merged to support improved schema
compilation speed and memory, I think now is a good time to start
discussing the next release.

1) Does anyone have any changes that are close to being ready to merge
or bugs that are worth delaying the next release until resolved?

2) Does anyone want to volunteer to be the release manager for 2.6.0?
The release process [1] now uses containers for most of the heavy
lifting, and since I did it last time it would be good to get someone
else familiar with the new release process and make sure the container
works for someone other than myself.

- Steve


Re: Compiler.setExternalDFDLVariable(s) considered challenged

2020-03-25 Thread Steve Lawrence
This feels like the right approach. Thoughts on instead of having to
manually clone() and having some sort of locking mechanism when parse is
called, we just make a DataProcessor completely immutable, and setting a
variable will implicitly create a copy with modified state, something like:

val dp = dataProcessor
   .withTunable("tunableFoo", "1")
   .withVariable("variableFoo", 2")
   .withVariable("variableBar", 2")

It's not as java-y, but feels safer. I can imagine a case where someone
sets a bunch of state, calls parse and locks things, and then way down
the line they get a "data processor is locked" error because they tried
to change some state.


On 3/25/20 11:50 AM, Beckerle, Mike wrote:
> Here's my concrete proposal to fix DataProcesor's API.
> 
> We add a clone() method. It copies the object sharing the big/expensive stuff 
> like the compiled schema but gives you a separate copy of all the state that 
> is settable via setValidationMode, setExternalDFDLVariable, and setTunable.
> 
> So you get a DataProcessor either from a ProcessorFactory, or from 
> Compiler.reload().
> 
> You set state.
> You can call parse/unparse on various threads
> 
> If you want to change state settings for another parse call, you must call 
> clone() to get another instance of the DataProcessor. This will share the 
> expensive/big stuff like the compiled schema, but give you a separate copy of 
> all the state that is settable via setValidationMode, 
> setExternalDFDLVariable, and setTunable.
> 
> Then you set state
> You can call parse/unparse on various threads.
> 
> I hate to get into locking and such, but once you call parse/unparse, that 
> should lock the object state, and any further setter calls should fail/throw. 
> You should have to clone it to "change" the settings.
> 
> This makes the use of the setters a java-ish way of achieving the same 
> reentrancy one would get if there were no setters and simply more arguments 
> to the parse/unparse calls.
> 
> 
> From: Beckerle, Mike 
> Sent: Wednesday, March 25, 2020 11:29 AM
> To: dev@daffodil.apache.org 
> Subject: Re: Compiler.setExternalDFDLVariable(s) considered challenged
> 
> To respond to my own thread here, I think given that we allow multiple 
> simultaneous calls to parse/unparse from different threads, we must say that 
> the DataProcessor object is immutable once parse or unparse are called.
> 
> I suppose we could say that it is mutable, but behavior is undetermined if 
> any parse or unparse calls are active on any thread.
> 
> But this is just asking for trouble IMHO.
> 
> I think we started out with a stateful, non-thread capable API. The idea is 
> that one thread would be invoking a data processor at a time. A data procesor 
> was the state-block of an execution.
> 
> The need to share compiled processor reloads, because the compile schemas 
> were expensive to create, tempted us to allow multiple parse/unparse calls on 
> different threads.
> 
> Fact is, I think we should have said no to this, provided a 
> DataProcessor.clone() to create instances that shared the reloaded compiled 
> schema binary, but otherwise had separate state, and said that parse/unparse 
> were synchronized methods on their DP instance.
> 
> Instead we're in a 1/2 way world where we don't have a thread-reasonable API 
> due to mutable state in what turns out to be a cross-thread shared object.
> 
> 
> From: Beckerle, Mike 
> Sent: Wednesday, March 25, 2020 10:55 AM
> To: dev@daffodil.apache.org 
> Subject: Compiler.setExternalDFDLVariable(s) considered challenged
> 
> Why does the API for Daffodil have
> 
> Compiler.setExternalDFDLVariable(...)
> and
> Compiler.setExternalDFDLVariables(...)
> 
> on it.
> 
> I believe we should deprecate this.
> 
> Compilers are parameterized by some of the tunables I understand.
> 
> But the external DFDL variables? These cannot affect compilation. The schema 
> compiler needs to know statically the information about variables found in 
> the schema itself in the dfdl:defineVariable statement annotations.
> 
> But the compiler doesn't need external variable bindings. In fact if it did 
> know and use them, it would be building assumptions into the compiled schema 
> that it shouldn't be building in.
> 
> Setting external var bindings on the Compiler just causes problems when that 
> compiler instance is reused in a context where those settings aren't 
> appropriate. (JIRA DAFFODIL-2302 is one such problem)
> 
> I believe setExternalDFDLVariable(s) methods should be deprecated, and 
> external variables bindings should be an optional argument to the 
> parse/unparse methods.
> 
> The setters cause thread safety issues because the DP is stateful, even 
> though we want multiple calls to parse/unparse to be executable on different 
> threads.
> 
> Consider: if we allow ordinary setExternalDFDLVariables and add a 
> resetExternalDFDLVariables to clear 

Re: Compiler.setExternalDFDLVariable(s) considered challenged

2020-03-25 Thread Steve Lawrence
Agreed on deprecating setting variables in the Compiler. It really just
doesn't make sense to be doing that.

Definitely seems reasonable to add the ability to set variables when
calling parse/unparse as that's the most threadsafe way to handle it,
and feels natural to me. Note that we still have other mutable state in
a DataProcessor like tuanbles, validationeMode, debbuging, that we might
want to think about as well. Though those do feel a little different
than variables.

The idea of having to call a clone() method to change a DataProcessors
variables seems a little unintuitive to me. But if we take a more
functional approach and make it entirely non-mutable, it maybe seems
better, something like:

Any thoughts on what the API might like going the parse route vs the
DataProcessor route?


On 3/25/20 10:55 AM, Beckerle, Mike wrote:
> Why does the API for Daffodil have
> 
> Compiler.setExternalDFDLVariable(...)
> and
> Compiler.setExternalDFDLVariables(...)
> 
> on it.
> 
> I believe we should deprecate this.
> 
> Compilers are parameterized by some of the tunables I understand.
> 
> But the external DFDL variables? These cannot affect compilation. The schema 
> compiler needs to know statically the information about variables found in 
> the 
> schema itself in the dfdl:defineVariable statement annotations.
> 
> But the compiler doesn't need external variable bindings. In fact if it did 
> know 
> and use them, it would be building assumptions into the compiled schema that 
> it 
> shouldn't be building in.
> 
> Setting external var bindings on the Compiler just causes problems when that 
> compiler instance is reused in a context where those settings aren't 
> appropriate. (JIRA DAFFODIL-2302 is one such problem)
> 
> I believe setExternalDFDLVariable(s) methods should be deprecated, and 
> external 
> variables bindings should be an optional argument to the parse/unparse 
> methods.
> 
> The setters cause thread safety issues because the DP is stateful, even 
> though 
> we want multiple calls to parse/unparse to be executable on different threads.
> 
> Consider: if we allow ordinary setExternalDFDLVariables and add a 
> resetExternalDFDLVariables to clear them, then imagine one wants to make two 
> parse calls on separate threads with different external variables bindings:
> 
> so on main thread.
>   dp.setExternalVariables(...bindings 1...)
>   spawn the thread 1
> on the thread 1
>   dp.parse()
> back on main thread
>   dp.resetExternalVariables() // race condition. Did the parse call read 
> the 
> external variables before this reset or not?
>   dp.setExternalVariables(...bindings 2)
>  .
> 
> However, if we make the external variable bindings an argument to parse, we 
> avoid all of this.
> 
> Alternatively, since DataProcessor has setExternalDFDLVariable, we can 
> prohibit 
> multiple calls on the same DataProcessor object simultaneously. We can 
> provide a 
> clone() method that preserves the loaded/reloaded processor, but constructs 
> another DataProcessor object, thereby allowing separate external variables 
> state 
> per DataProcessor instance.
> 
> 
> Comments?
> 
> 
> 
> 
> 
> Mike Beckerle | Principal Engineer
> Owl Cyber Defense 
> /is now a part of Owl 
> /
> P +1-781-330-0412
> W owlcyberdefense.com 
> 



Re: Compiler.setExternalDFDLVariable(s) considered challenged

2020-03-25 Thread Beckerle, Mike
Here's my concrete proposal to fix DataProcesor's API.

We add a clone() method. It copies the object sharing the big/expensive stuff 
like the compiled schema but gives you a separate copy of all the state that is 
settable via setValidationMode, setExternalDFDLVariable, and setTunable.

So you get a DataProcessor either from a ProcessorFactory, or from 
Compiler.reload().

You set state.
You can call parse/unparse on various threads

If you want to change state settings for another parse call, you must call 
clone() to get another instance of the DataProcessor. This will share the 
expensive/big stuff like the compiled schema, but give you a separate copy of 
all the state that is settable via setValidationMode, setExternalDFDLVariable, 
and setTunable.

Then you set state
You can call parse/unparse on various threads.

I hate to get into locking and such, but once you call parse/unparse, that 
should lock the object state, and any further setter calls should fail/throw. 
You should have to clone it to "change" the settings.

This makes the use of the setters a java-ish way of achieving the same 
reentrancy one would get if there were no setters and simply more arguments to 
the parse/unparse calls.


From: Beckerle, Mike 
Sent: Wednesday, March 25, 2020 11:29 AM
To: dev@daffodil.apache.org 
Subject: Re: Compiler.setExternalDFDLVariable(s) considered challenged

To respond to my own thread here, I think given that we allow multiple 
simultaneous calls to parse/unparse from different threads, we must say that 
the DataProcessor object is immutable once parse or unparse are called.

I suppose we could say that it is mutable, but behavior is undetermined if any 
parse or unparse calls are active on any thread.

But this is just asking for trouble IMHO.

I think we started out with a stateful, non-thread capable API. The idea is 
that one thread would be invoking a data processor at a time. A data procesor 
was the state-block of an execution.

The need to share compiled processor reloads, because the compile schemas were 
expensive to create, tempted us to allow multiple parse/unparse calls on 
different threads.

Fact is, I think we should have said no to this, provided a 
DataProcessor.clone() to create instances that shared the reloaded compiled 
schema binary, but otherwise had separate state, and said that parse/unparse 
were synchronized methods on their DP instance.

Instead we're in a 1/2 way world where we don't have a thread-reasonable API 
due to mutable state in what turns out to be a cross-thread shared object.


From: Beckerle, Mike 
Sent: Wednesday, March 25, 2020 10:55 AM
To: dev@daffodil.apache.org 
Subject: Compiler.setExternalDFDLVariable(s) considered challenged

Why does the API for Daffodil have

Compiler.setExternalDFDLVariable(...)
and
Compiler.setExternalDFDLVariables(...)

on it.

I believe we should deprecate this.

Compilers are parameterized by some of the tunables I understand.

But the external DFDL variables? These cannot affect compilation. The schema 
compiler needs to know statically the information about variables found in the 
schema itself in the dfdl:defineVariable statement annotations.

But the compiler doesn't need external variable bindings. In fact if it did 
know and use them, it would be building assumptions into the compiled schema 
that it shouldn't be building in.

Setting external var bindings on the Compiler just causes problems when that 
compiler instance is reused in a context where those settings aren't 
appropriate. (JIRA DAFFODIL-2302 is one such problem)

I believe setExternalDFDLVariable(s) methods should be deprecated, and external 
variables bindings should be an optional argument to the parse/unparse methods.

The setters cause thread safety issues because the DP is stateful, even though 
we want multiple calls to parse/unparse to be executable on different threads.

Consider: if we allow ordinary setExternalDFDLVariables and add a 
resetExternalDFDLVariables to clear them, then imagine one wants to make two 
parse calls on separate threads with different external variables bindings:

so on main thread.
 dp.setExternalVariables(...bindings 1...)
 spawn the thread 1
on the thread 1
 dp.parse()
back on main thread
 dp.resetExternalVariables() // race condition. Did the parse call read the 
external variables before this reset or not?
 dp.setExternalVariables(...bindings 2)
.

However, if we make the external variable bindings an argument to parse, we 
avoid all of this.

Alternatively, since DataProcessor has setExternalDFDLVariable, we can prohibit 
multiple calls on the same DataProcessor object simultaneously. We can provide 
a clone() method that preserves the loaded/reloaded processor, but constructs 
another DataProcessor object, thereby allowing separate external variables 
state per DataProcessor instance.


Comments?





Mike Beckerle 

Re: Compiler.setExternalDFDLVariable(s) considered challenged

2020-03-25 Thread Beckerle, Mike
To respond to my own thread here, I think given that we allow multiple 
simultaneous calls to parse/unparse from different threads, we must say that 
the DataProcessor object is immutable once parse or unparse are called.

I suppose we could say that it is mutable, but behavior is undetermined if any 
parse or unparse calls are active on any thread.

But this is just asking for trouble IMHO.

I think we started out with a stateful, non-thread capable API. The idea is 
that one thread would be invoking a data processor at a time. A data procesor 
was the state-block of an execution.

The need to share compiled processor reloads, because the compile schemas were 
expensive to create, tempted us to allow multiple parse/unparse calls on 
different threads.

Fact is, I think we should have said no to this, provided a 
DataProcessor.clone() to create instances that shared the reloaded compiled 
schema binary, but otherwise had separate state, and said that parse/unparse 
were synchronized methods on their DP instance.

Instead we're in a 1/2 way world where we don't have a thread-reasonable API 
due to mutable state in what turns out to be a cross-thread shared object.


From: Beckerle, Mike 
Sent: Wednesday, March 25, 2020 10:55 AM
To: dev@daffodil.apache.org 
Subject: Compiler.setExternalDFDLVariable(s) considered challenged

Why does the API for Daffodil have

Compiler.setExternalDFDLVariable(...)
and
Compiler.setExternalDFDLVariables(...)

on it.

I believe we should deprecate this.

Compilers are parameterized by some of the tunables I understand.

But the external DFDL variables? These cannot affect compilation. The schema 
compiler needs to know statically the information about variables found in the 
schema itself in the dfdl:defineVariable statement annotations.

But the compiler doesn't need external variable bindings. In fact if it did 
know and use them, it would be building assumptions into the compiled schema 
that it shouldn't be building in.

Setting external var bindings on the Compiler just causes problems when that 
compiler instance is reused in a context where those settings aren't 
appropriate. (JIRA DAFFODIL-2302 is one such problem)

I believe setExternalDFDLVariable(s) methods should be deprecated, and external 
variables bindings should be an optional argument to the parse/unparse methods.

The setters cause thread safety issues because the DP is stateful, even though 
we want multiple calls to parse/unparse to be executable on different threads.

Consider: if we allow ordinary setExternalDFDLVariables and add a 
resetExternalDFDLVariables to clear them, then imagine one wants to make two 
parse calls on separate threads with different external variables bindings:

so on main thread.
 dp.setExternalVariables(...bindings 1...)
 spawn the thread 1
on the thread 1
 dp.parse()
back on main thread
 dp.resetExternalVariables() // race condition. Did the parse call read the 
external variables before this reset or not?
 dp.setExternalVariables(...bindings 2)
.

However, if we make the external variable bindings an argument to parse, we 
avoid all of this.

Alternatively, since DataProcessor has setExternalDFDLVariable, we can prohibit 
multiple calls on the same DataProcessor object simultaneously. We can provide 
a clone() method that preserves the loaded/reloaded processor, but constructs 
another DataProcessor object, thereby allowing separate external variables 
state per DataProcessor instance.


Comments?





Mike Beckerle | Principal Engineer
[Owl Cyber Defense]
[cid:2a423dec-0558-414e-b369-14a24becc40f] is now a part of 
Owl
P +1-781-330-0412
W owlcyberdefense.com


Compiler.setExternalDFDLVariable(s) considered challenged

2020-03-25 Thread Beckerle, Mike
Why does the API for Daffodil have

Compiler.setExternalDFDLVariable(...)
and
Compiler.setExternalDFDLVariables(...)

on it.

I believe we should deprecate this.

Compilers are parameterized by some of the tunables I understand.

But the external DFDL variables? These cannot affect compilation. The schema 
compiler needs to know statically the information about variables found in the 
schema itself in the dfdl:defineVariable statement annotations.

But the compiler doesn't need external variable bindings. In fact if it did 
know and use them, it would be building assumptions into the compiled schema 
that it shouldn't be building in.

Setting external var bindings on the Compiler just causes problems when that 
compiler instance is reused in a context where those settings aren't 
appropriate. (JIRA DAFFODIL-2302 is one such problem)

I believe setExternalDFDLVariable(s) methods should be deprecated, and external 
variables bindings should be an optional argument to the parse/unparse methods.

The setters cause thread safety issues because the DP is stateful, even though 
we want multiple calls to parse/unparse to be executable on different threads.

Consider: if we allow ordinary setExternalDFDLVariables and add a 
resetExternalDFDLVariables to clear them, then imagine one wants to make two 
parse calls on separate threads with different external variables bindings:

so on main thread.
 dp.setExternalVariables(...bindings 1...)
 spawn the thread 1
on the thread 1
 dp.parse()
back on main thread
 dp.resetExternalVariables() // race condition. Did the parse call read the 
external variables before this reset or not?
 dp.setExternalVariables(...bindings 2)
.

However, if we make the external variable bindings an argument to parse, we 
avoid all of this.

Alternatively, since DataProcessor has setExternalDFDLVariable, we can prohibit 
multiple calls on the same DataProcessor object simultaneously. We can provide 
a clone() method that preserves the loaded/reloaded processor, but constructs 
another DataProcessor object, thereby allowing separate external variables 
state per DataProcessor instance.


Comments?





Mike Beckerle | Principal Engineer
[Owl Cyber Defense]
[cid:2a423dec-0558-414e-b369-14a24becc40f] is now a part of 
Owl
P +1-781-330-0412
W owlcyberdefense.com