As an alternative to additional boolean field, you could use one bit of expectedCount/modCount int field(s):

- let initial value of expectedCount be 1 (odd value)
- instead of (expectedCount >= 0) ==> (expectedCount != 1)
- let initial value of modCount be 0 (even value)
- instead of modCount++ ==> modCount += 2;

Regards, Peter

On 09/17/2015 01:08 AM, Stuart Marks wrote:


On 9/16/15 8:43 AM, Xueming Shen wrote:

I'm talking about the check "immediately" prior to the call to accept(). It will not function after the modCount tips over to the negative int value,
because the "expectedCount >=0" check.

Consider the use scenario that the Scanner is on top of an endless input
stream, you have a token stream on top of it. The check before the
"accept(token" will not be performed until the expectedCount/modCount tips back to positive value again from the negative, then off, then on... During the off period (it will take a while from negative back to positive), the
stream will just work fine to feed the accept() the "next" token even if
there is another thread keeps "stealing" tokens from the same scanner, if the
timing is right.  Looks like not really a "fail-fast" in this scenario.

Right, after modCount wraps around to negative, the CME checking becomes dysfunctional. It doesn't do any harm, but it ceases to perform proper checking. This is kind of a corner case but ... well I admit I did have to do a fair bit of puzzling to figure out what the behavior would be and to prove to myself that it was benign.

This can be "easily" addressed, if you have a separate boolean field such as
"initialized". The code can look like below in tryAdvance(...)

[edited per your subsequent message]


     if (!initialized) {
         expectedCount = modCount;
         initialized = true;
     }
     if (expectedCount != modCount) {
         throw new CME();
     }
     ...

Well, if you think this is an unlikely use scenario and the intention of the
check/guard here is mainly to prevent the wrong doing within the pipe
operation, then it might not worth the extra field, and I'm fine with the
latest webrev.

The cost of the additional field is negligible. I haven't written out the code but I suspect that an explicit "initialized" field will be easier to reason about, certainly easier to understand than the behavior that occurs if modCount wraps around to negative.

Note that this also applies to Matcher, although it's less likely since Matcher's input is a CharSequence instead of an indefinite-sized source such as a file or an input stream. In talking to Paul Sandoz about this (author of the streams stuff for Matcher) he felt it was important to keep the behaviors of Matcher and Scanner consistent.

But this has dragged out somewhat and I don't really want to add Matcher changes to this changeset. How about I do the following:

1) I'll push the latest webrev as it stands.

2) I'll file a separate bug to clean up Scanner's and Matcher's spliterators' modCount checking to avoid the overflow issue.

3) I'll fix all the spliterators at the same time.

How does that sound?

s'marks

Reply via email to