Hi Stuart,

Thanks for the detailed review.

Here is a possible way forward:

1) Add the methods to Matcher, as proposed in the initial webrev.
1.1) Change the specification of Matcher.results to reset the stream before 
matching, making it consistent with the replace* methods. 

2) Add convenience methods for all replace*() and matches() on Pattern that 
defer to those on Matcher.

We can do that in two stages, focusing on 1) in this review.

I was not too concerned about the static method and the stream returning method 
having the same name as the context is quite different. For stream returning 
methods there is a de-facto pattern of using a plural of the stream element 
kind, so i prefer that to findAll. What about the name "Pattern.matchResults"? 
which chains well with "Pattern.match(...).results()".

--

Regarding the disparity between MatchResult and Matcher. I think that would 
require a new sub-interface of MatchResult from which Matcher extends from and 
returns. If we think it important we should do that for 9 otherwise we will be 
stuck for the stream-based methods.

Paul.

On Feb 11, 2015, at 2:02 AM, Stuart Marks <stuart.ma...@oracle.com> wrote:

> Hi Paul,
> 
> I spent some time looking at this API. Overall it seems to me that things 
> work a bit more nicely when these methods are added to Pattern instead of 
> Matcher. Unfortunately there are some odd things with the existing API that 
> make this tradeoff not so obvious.
> 
> First, here's what a simple replacement operation looks like when 
> replaceAll() is added to Matcher:
> 
>        String input = "fooaaaabbfooabbbfoo";
>        Pattern p = Pattern.compile("a*b");
>        Matcher m = p.matcher(input);
>        String result = m.replaceAll(mr -> mr.group().toUpperCase());
> 
> But if replaceAll() is on Pattern, we can skip a step:
> 
>        String input = "fooaaaabbfooabbbfoo";
>        Pattern p = Pattern.compile("a*b");
>        String result = p.replaceAll(input, mr -> mr.group().toUpperCase());
> 
> Getting stream of match results is similar. So yes, I agree that it 
> simplifies things to have these be on Pattern instead of Matcher.
> 
> An advantage of having these on Pattern is that the matcher that gets created 
> is encapsulated, and its state isn't exposed to being mucked about by the 
> application. Thus you can avoid the additional concurrent modification checks 
> that you have to do if replaceAll et. al. are on Matcher.
> 
> Unfortunately, putting these on Pattern now creates some difficulties meshing 
> with the existing API.
> 
> One issue is that Matcher already has replaceAll(String) and 
> replaceFirst(String). It would be strange to have these here and to have 
> replaceAll(replacer) and replaceFirst(replacer) on Pattern.
> 
> Another issue is that Matcher supports matching on region (subrange) of its 
> input. For example, today you can do this:
> 
>    pattern.matcher(input).region(start, end)
> 
> The region will constrain the matching for certain operations, such as find() 
> (but not replaceAll or replaceFirst). If something like results() were added 
> to Matcher, I'd expect that it would respect the Matcher's region, but if 
> results() (or matches() as you called it) were on Pattern, the region 
> constraint would be lacking.
> 
> Also note that Pattern already has this:
> 
>    static boolean matches(regex, input)
> 
> so I don't think an overload of matches() that returns a Stream would be a 
> good idea. (Maybe findAll()?)
> 
> Another issue, not directly related to where the new lambda/streams methods 
> get added, is that MatchResult allows references only numbered capturing 
> groups. Matcher, which implements MatchResult, also supports named capturing 
> groups, with the new overloaded methods group(String), start(String), and 
> end(String). These were added in Java 7. Logically these also belong on 
> MatchResult, but they probably weren't added because of the compatibility 
> issue of adding methods to interfaces. Maybe we should add these as default 
> methods to MatchResult.
> 
> (But what would the supported implementation be? Just throw 
> UnsupportedOperationException?)
> 
> --
> 
> I'm not entirely sure where this leaves things. It certainly seems more 
> convenient to have the new methods on Pattern. But given the way the existing 
> API is set up, it seems like it's a better fit to add them to Matcher.
> 
> s'marks
> 
> 
> 
> On 2/9/15 6:18 AM, Paul Sandoz wrote:
>> Here is an alternative that pushes the methods on to Pattern instead:
>> 
>>   
>> http://cr.openjdk.java.net/~psandoz/jdk9/JDK-8071479--Matcher-stream-results/on-Pattern/webrev/
>> 
>> (Whe webrev reports some files as empty, please ingore those, i have this 
>> webrev stacked on the previous one.)
>> 
>> I have also included replaceFirst.
>> 
>> This simplifies things for streaming on the match results and also for 
>> replacing.
>> 
>> Note that the existing replace* methods on Matcher reset the matcher before 
>> matching and indicate that the matcher should be reset afterwards for reuse. 
>> Thus there is no loss in functionality moving such lambda accepting methods 
>> from Matcher to Pattern. It comes down to the performance of reusing a 
>> matcher, which does not seems so compelling to me.
>> 
>> Paul.
>> 
>> On Feb 5, 2015, at 11:59 AM, Paul Sandoz <paul.san...@oracle.com> wrote:
>> 
>>> Hi.
>>> 
>>> Please review these stream/lambda enhancements on Matcher:
>>> 
>>>  
>>> http://cr.openjdk.java.net/~psandoz/jdk9/JDK-8071479--Matcher-stream-results/webrev/
>>> 
>>> Two new methods are added to Matcher:
>>> 
>>> 1) replaceAll(Function<MatchResult, String> ) that is more flexible than 
>>> the existing replaceAll that accepts a single value.
>>> 
>>> 2) Stream<MatchResult> results() that returns a stream of MatchResult for 
>>> all matches.
>>> 
>>> The former does introduce a minor source incompatibility for a null 
>>> argument, but then so did the new append methods accepting StringBuilder 
>>> that were recently added (see JDK-8039124).
>>> 
>>> For the latter i opted to place the method on Matcher rather than Pattern 
>>> as i think that is a better fit with current usages of Matcher and 
>>> operating on a MatchResult. That marginally increases the complexity since 
>>> co-modification checking is required.
>>> 
>>> I update the test PatternStreamTest to derive the expected result.
>>> 
>>> 
>>> I suppose i could add another method replaceFirst(Function<MatchResult, 
>>> String> ) if anyone feels strongly about that. Consistency-wise it seems 
>>> the right thing to do.
>>> 
>>> Paul.
>> 

Reply via email to