attilakreiner commented on code in PR #10681:
URL: https://github.com/apache/iceberg/pull/10681#discussion_r1678355098
##########
.baseline/checkstyle/checkstyle.xml:
##########
@@ -281,16 +280,13 @@
</module>
<module name="InnerAssignment"/> <!-- Java Coding Guidelines: Inner
assignments: Not used -->
<module name="MemberName"> <!-- Java Style Guide: Non-constant field
names -->
- <property name="format" value="^[a-z][a-zA-Z0-9]+$"/>
- <message key="name.invalidPattern" value="Member name ''{0}'' must
match pattern ''{1}''."/>
+ <property name="format" value="^[a-z][a-zA-Z0-9]++$"/>
Review Comment:
The `*+` and `++` are _possessive_ quantifiers (as opposed to `*` and `+`
which are _greedy_; and `*?` and `+?` which are _lazy_).
I found the explanation on this page very useful:
https://www.rexegg.com/regex-quantifiers.php#possessive
My understanding is that in cases when we can use both _greedy_ and
_possessive_ quantifiers, _possessive_ are faster. Quote form the link above:
> This behavior is particularly useful when you know there is no valid
reason why the engine should ever backtrack into a section of matched text, as
you can save the engine a lot of needless work.
This seems a common regex feature, but honestly speaking I wasn't familiar
with in until @findepi pointed it out (thx for that) in the review of #10673.
I understand using _possessive_ quantifiers is good for performance, OTOH
it's possible that many people are unfamiliar with this syntax, so it may be a
tradeoff in terms of readability (and also a deviation from checkstyle
standards). WDYT?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]