theshoeshiner commented on code in PR #450:
URL: https://github.com/apache/commons-text/pull/450#discussion_r1377572434


##########
src/main/java/org/apache/commons/text/cases/CharacterDelimitedCase.java:
##########
@@ -0,0 +1,151 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.commons.text.cases;
+
+import java.util.ArrayList;
+import java.util.List;
+
+import org.apache.commons.lang3.CharUtils;
+
+/**
+ * DelimitedCase is a case in which the true alphabetic case of the characters is ignored by default
+ * and tokens themselves are determined by the presence of a delimiter between each token.

Review Comment:
   The way I look at the string "foo-Bar-baz" is that the case actually is 
relevant _at an API level_. It's simply ignored by one specific implementation. 
I'd compare this to how the various Number implementations drop the fractional 
part when converting between them - e.g. when creating an Integer from the 
value 1.1 the fraction is irrelevant, but Number still declares doubleValue(), 
because fractions are relevant to other implementations of Number. (<- Maybe 
this isn't the best of examples. I didn't have much time to come up with a 
better one, but the point is that a property that is relevant at an API level 
may not be relevant to a specific implementation.)
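   
   A minimal sketch of that comparison, using only JDK classes. The class and 
method names (NumberSketch, asInt, asDouble) are made up for illustration:

```java
// Illustration only: NumberSketch, asInt and asDouble are hypothetical names.
final class NumberSketch {

    // Every Number can be asked for an int; a Double drops its fraction here.
    static int asInt(Number n) {
        return n.intValue();
    }

    // Every Number can also be asked for a double, even one with no fraction.
    static double asDouble(Number n) {
        return n.doubleValue();
    }

    public static void main(String[] args) {
        System.out.println(asInt(Double.valueOf(1.1)));    // prints 1
        System.out.println(asDouble(Double.valueOf(1.1))); // prints 1.1
        System.out.println(asDouble(Integer.valueOf(1)));  // prints 1.0
    }
}
```

   The fraction is part of the Number contract even though an Integer never 
has one, in the same way that character case can be part of the case API even 
though a delimiter-based implementation ignores it.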
   
   But to dig deeper into your suggestion - I did initially consider 
StringTokenizer, as I noticed some of its features are being somewhat 
duplicated, but I thought it might require significant changes to make 
StringTokenizer fulfill the purpose of this API. After giving it more thought, 
these changes may not be _that_ significant, so I can take a stab at a POC and 
see if there are any other serious wrinkles with re-using that class.
   
   FWIW here were my initial thoughts...
   
   - StringTokenizer is designed to skip the matches found by the delimiter 
matcher, which wouldn't work for case matching, so we'd need to alter it so 
that it can keep the (now dynamic) delimiter match. Creating a CaseTokenizer 
subclass would probably not be necessary given that StringTokenizer is already 
very configurable, and it doesn't seem like a stretch to add a little more to 
it - but I will have to confirm the implications of this.
   - StringTokenizer has no functionality to format new strings from tokens, so 
that would have to be added, and I'm not sure how that would work if it were 
embedded directly into the StringTokenizer, given that the Tokenizer doesn't 
even know what the delimiter matcher is doing under the hood. So the formatting 
would probably be best delegated to a different class. Preferably a class that 
is named similarly? StringUntokenizer? TokenStringifier?
   - I think we'd still need some classes to group the two related functions. 
So something like a SnakeCase class that under the hood has a compatible 
StringTokenizer and TokenStringifier instance?
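   
   To make the three points above concrete, here is a rough sketch of the 
shape I have in mind. It deliberately does not use the real StringTokenizer; 
plain JDK functions stand in for the tokenizer and the "stringifier", and every 
name in it (CaseSketch, delimited, camel, parse, format) is hypothetical:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.function.Function;

// Hypothetical sketch: groups a tokenizer and a formatter for one case.
// None of these names exist in Commons Text.
final class CaseSketch {
    private final Function<String, List<String>> tokenizer;
    private final Function<List<String>, String> stringifier;

    private CaseSketch(Function<String, List<String>> tokenizer,
                       Function<List<String>, String> stringifier) {
        this.tokenizer = tokenizer;
        this.stringifier = stringifier;
    }

    List<String> parse(String s) { return tokenizer.apply(s); }

    String format(List<String> tokens) { return stringifier.apply(tokens); }

    // Delimiter-based case: the delimiter is dropped, character case is ignored.
    static CaseSketch delimited(char delimiter) {
        return new CaseSketch(
            s -> splitOn(s, delimiter),
            tokens -> String.join(String.valueOf(delimiter), tokens));
    }

    // Camel case: the upper-case character that starts a token must be kept.
    static CaseSketch camel() {
        return new CaseSketch(CaseSketch::splitCamel, CaseSketch::joinCamel);
    }

    private static List<String> splitOn(String s, char delimiter) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (c == delimiter) {
                tokens.add(current.toString()); // delimiter itself is skipped
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        tokens.add(current.toString());
        return tokens;
    }

    private static List<String> splitCamel(String s) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (Character.isUpperCase(c) && current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
            current.append(c); // the "delimiter" character stays in the token
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    private static String joinCamel(List<String> tokens) {
        StringBuilder sb = new StringBuilder();
        for (String token : tokens) {
            if (token.isEmpty()) {
                continue;
            }
            String lower = token.toLowerCase(Locale.ROOT);
            if (sb.length() == 0) {
                sb.append(lower); // first token stays lower case
            } else {
                sb.append(Character.toUpperCase(lower.charAt(0)))
                  .append(lower.substring(1));
            }
        }
        return sb.toString();
    }
}
```

   With this, CaseSketch.camel().format(CaseSketch.delimited('_').parse("foo_bar_baz")) 
gives "fooBarBaz", and CaseSketch.delimited('-').format(CaseSketch.camel().parse("fooBarBaz")) 
gives "foo-Bar-Baz". The second result shows the delimited side passing 
character case through untouched, while splitCamel shows why the tokenizer 
cannot simply drop whatever the delimiter matcher matched.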
   
   I'll take a stab at the POC. Let me know if you see any of my initial 
thoughts that are way off base.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
