As an alternative to additional boolean field, you could use one bit of
expectedCount/modCount int field(s):
- let initial value of expectedCount be 1 (odd value)
- instead of (expectedCount >= 0) ==> (expectedCount != 1)
- let initial value of modCount be 0 (even value)
- instead of modCount++ ==> modCount += 2;
Regards, Peter
On 09/17/2015 01:08 AM, Stuart Marks wrote:
On 9/16/15 8:43 AM, Xueming Shen wrote:
I'm talking about the check "immediately" prior to the call to
accept(). It
will not function after the modCount tips over to the negative int
value,
because the "expectedCount >=0" check.
Consider the use scenario that the Scanner is on top of an endless input
stream, you have a token stream on top of it. The check before the
"accept(token" will not be performed until the expectedCount/modCount
tips
back to positive value again from the negative, then off, then on...
During
the off period (it will take a while from negative back to positive),
the
stream will just work fine to feed the accept() the "next" token even if
there is another thread keeps "stealing" tokens from the same
scanner, if the
timing is right. Looks like not really a "fail-fast" in this scenario.
Right, after modCount wraps around to negative, the CME checking
becomes dysfunctional. It doesn't do any harm, but it ceases to
perform proper checking. This is kind of a corner case but ... well I
admit I did have to do a fair bit of puzzling to figure out what the
behavior would be and to prove to myself that it was benign.
This can be "easily" addressed, if you have a separate boolean field
such as
"initialized". The code can look like below in tryAdvance(...)
[edited per your subsequent message]
if (!initialized) {
expectedCount = modCount;
initialized = true;
}
if (expectedCount != modCount) {
throw new CME();
}
...
Well, if you think this is an unlikely use scenario and the intention
of the
check/guard here is mainly to prevent the wrong doing within the pipe
operation, then it might not worth the extra field, and I'm fine with
the
latest webrev.
The cost of the additional field is negligible. I haven't written out
the code but I suspect that an explicit "initialized" field will be
easier to reason about, certainly easier to understand than the
behavior that occurs if modCount wraps around to negative.
Note that this also applies to Matcher, although it's less likely
since Matcher's input is a CharSequence instead of an indefinite-sized
source such as a file or an input stream. In talking to Paul Sandoz
about this (author of the streams stuff for Matcher) he felt it was
important to keep the behaviors of Matcher and Scanner consistent.
But this has dragged out somewhat and I don't really want to add
Matcher changes to this changeset. How about I do the following:
1) I'll push the latest webrev as it stands.
2) I'll file a separate bug to clean up Scanner's and Matcher's
spliterators' modCount checking to avoid the overflow issue.
3) I'll fix all the spliterators at the same time.
How does that sound?
s'marks