Re: [PR] Cases API + 4 implementations (Pascal, Camel, Kebab, Snake) [commons-text]

via GitHub Fri, 20 Oct 2023 05:01:54 -0700


elharo commented on code in PR #450:
URL: https://github.com/apache/commons-text/pull/450#discussion_r1366862715



##########
src/main/java/org/apache/commons/text/cases/CamelCase.java:
##########
@@ -0,0 +1,39 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.commons.text.cases;
+
+/**
+ * Case implementation that parses and formats strings of the form 
'myCamelCase'
+ * <p>
+ * CamelCase is a case where tokens are delimited by upper case unicode 
characters. The very first

Review Comment:
   Unicode (capitalized)



##########
src/main/java/org/apache/commons/text/cases/Case.java:
##########
@@ -0,0 +1,52 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.commons.text.cases;
+
+import java.util.List;
+
+/**
+ * Handles formatting and parsing tokens to/from a String. For most 
implementations tokens returned
+ * by the parse method should abide by any restrictions present in the format 
method. i.e. calling

Review Comment:
   delete should
   i.e. --> That is,



##########
src/main/java/org/apache/commons/text/cases/Case.java:
##########
@@ -0,0 +1,52 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.commons.text.cases;
+
+import java.util.List;
+
+/**
+ * Handles formatting and parsing tokens to/from a String. For most 
implementations tokens returned

Review Comment:
   Handles formatting and parsing --> Formats and parses
   For most implementations --> In most implementations,



##########
src/main/java/org/apache/commons/text/cases/CharacterDelimitedCase.java:
##########
@@ -0,0 +1,144 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.commons.text.cases;
+
+import java.util.ArrayList;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Set;
+
+import org.apache.commons.lang3.CharUtils;
+
+/**
+ * DelimitedCase is a case in which the true alphabetic case of the characters 
is ignored by default
+ * and tokens themselves are determined by the presence of a delimiter between 
each token.
+ */
+public class CharacterDelimitedCase implements Case {
+
+    /** Delimiters to be used when parsing. */
+    private Set<Integer> parseDelimiters;
+
+    /** Delimiter to be used when formatting. */
+    private String formatDelimiter;
+
+    /**
+     * Constructs a new Delimited Case.
+     * @param delimiter the character to use as both the parse and format 
delimiter
+     */
+    protected CharacterDelimitedCase(char delimiter) {
+        this(new char[] { delimiter }, CharUtils.toString(delimiter));
+    }
+
+    /**
+     * Constructs a new delimited case.
+     * @param parseDelimiters the array of delimiters to use when parsing
+     * @param formatDelimiter the delimiter to use when formatting
+     */
+    protected CharacterDelimitedCase(char[] parseDelimiters, String 
formatDelimiter) {
+        super();
+        if (parseDelimiters == null || parseDelimiters.length == 0) {
+            throw new IllegalArgumentException("Parse Delimiters cannot be 
null or empty");
+        }
+        if (formatDelimiter == null || formatDelimiter.length() == 0) {

Review Comment:
   Alternatively, no delimiters could mean format/parse just return the input 
unchanged.



##########
src/main/java/org/apache/commons/text/cases/CharacterDelimitedCase.java:
##########
@@ -0,0 +1,144 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.commons.text.cases;
+
+import java.util.ArrayList;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Set;
+
+import org.apache.commons.lang3.CharUtils;
+
+/**
+ * DelimitedCase is a case in which the true alphabetic case of the characters 
is ignored by default
+ * and tokens themselves are determined by the presence of a delimiter between 
each token.
+ */
+public class CharacterDelimitedCase implements Case {
+
+    /** Delimiters to be used when parsing. */
+    private Set<Integer> parseDelimiters;
+
+    /** Delimiter to be used when formatting. */
+    private String formatDelimiter;
+
+    /**
+     * Constructs a new Delimited Case.
+     * @param delimiter the character to use as both the parse and format 
delimiter
+     */
+    protected CharacterDelimitedCase(char delimiter) {
+        this(new char[] { delimiter }, CharUtils.toString(delimiter));
+    }
+
+    /**
+     * Constructs a new delimited case.
+     * @param parseDelimiters the array of delimiters to use when parsing
+     * @param formatDelimiter the delimiter to use when formatting
+     */
+    protected CharacterDelimitedCase(char[] parseDelimiters, String 
formatDelimiter) {
+        super();
+        if (parseDelimiters == null || parseDelimiters.length == 0) {
+            throw new IllegalArgumentException("Parse Delimiters cannot be 
null or empty");
+        }
+        if (formatDelimiter == null || formatDelimiter.length() == 0) {
+            throw new IllegalArgumentException("Format Delimiters cannot be 
null or empty");
+        }
+        this.parseDelimiters = generateDelimiterSet(parseDelimiters);
+        this.formatDelimiter = formatDelimiter;
+    }
+
+    /**
+     * Formats tokens into Delimited Case.
+     * <p>
+     * Tokens are iterated on and appended to an output stream, with an 
instance of a
+     * delimiter character between them. This method validates that the 
delimiter character is not
+     * part of the token. If it is found within the token an exception is 
thrown.<br>
+     * No other restrictions are placed on the contents of the tokens.
+     * Note: This Case does support empty tokens.<br>
+     * </p>
+     * @param tokens the tokens to be formatted into a delimited string
+     * @return the delimited string
+     * @throws IllegalArgumentException if any tokens contain the delimiter 
character
+     */
+    @Override
+    public String format(Iterable<String> tokens) {
+        StringBuilder formattedString = new StringBuilder();
+        int i = 0;
+        for (String token : tokens) {
+            int delimiterFoundIndex = token.indexOf(formatDelimiter);
+            if (delimiterFoundIndex > -1) {
+                throw new IllegalArgumentException("Token " + i + " contains 
delimiter character '" + formatDelimiter + "' at index " + delimiterFoundIndex);
+            }
+            if (i > 0) {
+                formattedString.append(formatDelimiter);
+            }
+            i++;
+            formattedString.append(token);
+        }
+        return formattedString.toString();
+    }
+
+    /**
+     * Parses delimited string into tokens.
+     * <p>
+     * Input string is parsed one character at a time until a delimiter 
character is reached.
+     * When a delimiter character is reached a new token begins. The delimiter 
character is
+     * considered reserved, and is omitted from the returned parsed tokens.<br>
+     * No other restrictions are placed on the contents of the input string. 
<br>
+     * </p>
+     * @param string the delimited string to be parsed
+     * @return the list of tokens found in the string
+     */
+    @Override
+    public List<String> parse(String string) {
+        List<String> tokens = new ArrayList<>();
+        if (string.length() == 0) {

Review Comment:
   string.isEmpty()



##########
src/main/java/org/apache/commons/text/cases/PascalCase.java:
##########
@@ -0,0 +1,38 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.commons.text.cases;
+
+/**
+ * Case implementation which parses and formats strings of the form 
'MyPascalString'
+ * <p>
+ * PascalCase tokens are delimited by upper case unicode characters. Each 
parsed token

Review Comment:
   Unicode



##########
src/main/java/org/apache/commons/text/cases/KebabCase.java:
##########
@@ -0,0 +1,40 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.commons.text.cases;
+
+/**
+ * Case implementation which parses and formats strings of the form 
'my-kebab-string'
+ * <p>
+ * KebabCase is a delimited case where the delimiter is a hyphen character '-'.
+ * </p>
+ */
+public final class KebabCase extends CharacterDelimitedCase {

Review Comment:
   I wonder if it would be nice to make these all constant values in a class 
somewhere? E.g. CaseConverters.KEBAB, CaseConverters.CAMEL, etc. I'm not sure. 
Just a thought.



##########
src/main/java/org/apache/commons/text/cases/CamelCase.java:
##########
@@ -0,0 +1,39 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.commons.text.cases;
+
+/**
+ * Case implementation that parses and formats strings of the form 
'myCamelCase'
+ * <p>
+ * CamelCase is a case where tokens are delimited by upper case unicode 
characters. The very first
+ * token should begin with a lower case character, and any subsequent tokens 
begin with an
+ * upper case character. All remaining characters will be lower cased or non 
cased.

Review Comment:
   lower case



##########
src/main/java/org/apache/commons/text/cases/UpperCaseDelimitedCase.java:
##########
@@ -0,0 +1,163 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.commons.text.cases;
+
+import java.util.ArrayList;
+import java.util.List;
+
+/**
+ * Case implementation which parses and formats strings where tokens are 
delimited by upper case characters.
+ */
+public class UpperCaseDelimitedCase implements Case {
+
+    /** Flag to indicate whether the first character of the first token should 
be upper cased. */
+    private boolean lowerCaseFirstCharacter = false;
+
+    /**
+     * Constructs a new UpperCaseDelimitedCase instance.
+     */
+    protected UpperCaseDelimitedCase(boolean lowerCaseFirstCharacter) {
+        this.lowerCaseFirstCharacter = lowerCaseFirstCharacter;
+    }
+
+    /**
+     * Parses a string into tokens.
+     * <p>
+     * String characters are iterated over and when an upper case unicode 
character is
+     * encountered, that character is considered to be the start of a new 
token, with the character
+     * itself included in the token. This method will never return empty 
tokens.

Review Comment:
   will never return --> never returns (Tech writing lives in the eternal 
present)



##########
src/main/java/org/apache/commons/text/cases/CharacterDelimitedCase.java:
##########
@@ -0,0 +1,144 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.commons.text.cases;
+
+import java.util.ArrayList;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Set;
+
+import org.apache.commons.lang3.CharUtils;
+
+/**
+ * DelimitedCase is a case in which the true alphabetic case of the characters 
is ignored by default
+ * and tokens themselves are determined by the presence of a delimiter between 
each token.
+ */
+public class CharacterDelimitedCase implements Case {
+
+    /** Delimiters to be used when parsing. */
+    private Set<Integer> parseDelimiters;
+
+    /** Delimiter to be used when formatting. */
+    private String formatDelimiter;
+
+    /**
+     * Constructs a new Delimited Case.
+     * @param delimiter the character to use as both the parse and format 
delimiter
+     */
+    protected CharacterDelimitedCase(char delimiter) {
+        this(new char[] { delimiter }, CharUtils.toString(delimiter));
+    }
+
+    /**
+     * Constructs a new delimited case.
+     * @param parseDelimiters the array of delimiters to use when parsing
+     * @param formatDelimiter the delimiter to use when formatting
+     */
+    protected CharacterDelimitedCase(char[] parseDelimiters, String 
formatDelimiter) {
+        super();
+        if (parseDelimiters == null || parseDelimiters.length == 0) {
+            throw new IllegalArgumentException("Parse Delimiters cannot be 
null or empty");
+        }
+        if (formatDelimiter == null || formatDelimiter.length() == 0) {
+            throw new IllegalArgumentException("Format Delimiters cannot be 
null or empty");
+        }
+        this.parseDelimiters = generateDelimiterSet(parseDelimiters);
+        this.formatDelimiter = formatDelimiter;
+    }
+
+    /**
+     * Formats tokens into Delimited Case.
+     * <p>
+     * Tokens are iterated on and appended to an output stream, with an 
instance of a
+     * delimiter character between them. This method validates that the 
delimiter character is not
+     * part of the token. If it is found within the token an exception is 
thrown.<br>
+     * No other restrictions are placed on the contents of the tokens.
+     * Note: This Case does support empty tokens.<br>
+     * </p>
+     * @param tokens the tokens to be formatted into a delimited string
+     * @return the delimited string
+     * @throws IllegalArgumentException if any tokens contain the delimiter 
character
+     */
+    @Override
+    public String format(Iterable<String> tokens) {
+        StringBuilder formattedString = new StringBuilder();
+        int i = 0;
+        for (String token : tokens) {
+            int delimiterFoundIndex = token.indexOf(formatDelimiter);
+            if (delimiterFoundIndex > -1) {
+                throw new IllegalArgumentException("Token " + i + " contains 
delimiter character '" + formatDelimiter + "' at index " + delimiterFoundIndex);
+            }
+            if (i > 0) {

Review Comment:
   You really only need an is_first_token boolean instead of i, and maybe not 
even that. Simply initialize formattedString with the first token before the 
loop begins and then loop over the remaining tokens.
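
   A rough sketch of that shape, assuming a stand-alone helper rather than the PR's actual method. The names DelimitedFormatSketch and checked are hypothetical. A counter is still kept here, but only because the existing error message reports the token position.

```java
import java.util.Iterator;

// Hypothetical stand-alone helper for illustration; not code from the PR.
final class DelimitedFormatSketch {

    private DelimitedFormatSketch() {
    }

    // Joins tokens with formatDelimiter, seeding the builder with the first token.
    static String format(Iterable<String> tokens, String formatDelimiter) {
        Iterator<String> iterator = tokens.iterator();
        if (!iterator.hasNext()) {
            return "";
        }
        StringBuilder formattedString =
                new StringBuilder(checked(iterator.next(), 0, formatDelimiter));
        // The counter survives only because the error message names the token position.
        int tokenIndex = 1;
        while (iterator.hasNext()) {
            formattedString.append(formatDelimiter)
                    .append(checked(iterator.next(), tokenIndex++, formatDelimiter));
        }
        return formattedString.toString();
    }

    // Returns the token unchanged, or throws if it contains the delimiter.
    private static String checked(String token, int tokenIndex, String formatDelimiter) {
        int delimiterFoundIndex = token.indexOf(formatDelimiter);
        if (delimiterFoundIndex > -1) {
            throw new IllegalArgumentException("Token " + tokenIndex + " contains delimiter '"
                    + formatDelimiter + "' at index " + delimiterFoundIndex);
        }
        return token;
    }
}
```

   Empty tokens still pass through unchanged, which matches the Javadoc note that this Case supports them.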



##########
src/main/java/org/apache/commons/text/cases/CharacterDelimitedCase.java:
##########
@@ -0,0 +1,144 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.commons.text.cases;
+
+import java.util.ArrayList;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Set;
+
+import org.apache.commons.lang3.CharUtils;
+
+/**
+ * DelimitedCase is a case in which the true alphabetic case of the characters 
is ignored by default
+ * and tokens themselves are determined by the presence of a delimiter between 
each token.
+ */
+public class CharacterDelimitedCase implements Case {
+
+    /** Delimiters to be used when parsing. */
+    private Set<Integer> parseDelimiters;
+
+    /** Delimiter to be used when formatting. */
+    private String formatDelimiter;
+
+    /**
+     * Constructs a new Delimited Case.
+     * @param delimiter the character to use as both the parse and format 
delimiter
+     */
+    protected CharacterDelimitedCase(char delimiter) {
+        this(new char[] { delimiter }, CharUtils.toString(delimiter));
+    }
+
+    /**
+     * Constructs a new delimited case.
+     * @param parseDelimiters the array of delimiters to use when parsing
+     * @param formatDelimiter the delimiter to use when formatting
+     */
+    protected CharacterDelimitedCase(char[] parseDelimiters, String 
formatDelimiter) {
+        super();
+        if (parseDelimiters == null || parseDelimiters.length == 0) {

Review Comment:
   These are two separate cases that should have different error messages, one 
for null and one for empty.
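
   As a sketch only, the split could look like the following. DelimiterArgumentChecks and requireParseDelimiters are hypothetical names, and the message wording is an assumption. IllegalArgumentException is kept for both branches to match what the PR throws today.

```java
// Hypothetical helper for illustration; not code from the PR.
final class DelimiterArgumentChecks {

    private DelimiterArgumentChecks() {
    }

    // Validates the parse delimiters, reporting null and empty separately.
    static char[] requireParseDelimiters(char[] parseDelimiters) {
        if (parseDelimiters == null) {
            throw new IllegalArgumentException("Parse delimiters cannot be null");
        }
        if (parseDelimiters.length == 0) {
            throw new IllegalArgumentException("Parse delimiters cannot be empty");
        }
        return parseDelimiters;
    }
}
```

   The same split would apply to the formatDelimiter check just below it.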



##########
src/main/java/org/apache/commons/text/cases/CharacterDelimitedCase.java:
##########
@@ -0,0 +1,144 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.commons.text.cases;
+
+import java.util.ArrayList;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Set;
+
+import org.apache.commons.lang3.CharUtils;
+
+/**
+ * DelimitedCase is a case in which the true alphabetic case of the characters 
is ignored by default
+ * and tokens themselves are determined by the presence of a delimiter between 
each token.
+ */
+public class CharacterDelimitedCase implements Case {
+
+    /** Delimiters to be used when parsing. */
+    private Set<Integer> parseDelimiters;
+
+    /** Delimiter to be used when formatting. */
+    private String formatDelimiter;
+
+    /**
+     * Constructs a new Delimited Case.
+     * @param delimiter the character to use as both the parse and format 
delimiter
+     */
+    protected CharacterDelimitedCase(char delimiter) {
+        this(new char[] { delimiter }, CharUtils.toString(delimiter));
+    }
+
+    /**
+     * Constructs a new delimited case.
+     * @param parseDelimiters the array of delimiters to use when parsing
+     * @param formatDelimiter the delimiter to use when formatting
+     */
+    protected CharacterDelimitedCase(char[] parseDelimiters, String 
formatDelimiter) {
+        super();
+        if (parseDelimiters == null || parseDelimiters.length == 0) {
+            throw new IllegalArgumentException("Parse Delimiters cannot be 
null or empty");
+        }
+        if (formatDelimiter == null || formatDelimiter.length() == 0) {
+            throw new IllegalArgumentException("Format Delimiters cannot be 
null or empty");
+        }
+        this.parseDelimiters = generateDelimiterSet(parseDelimiters);
+        this.formatDelimiter = formatDelimiter;
+    }
+
+    /**
+     * Formats tokens into Delimited Case.
+     * <p>
+     * Tokens are iterated on and appended to an output stream, with an 
instance of a
+     * delimiter character between them. This method validates that the 
delimiter character is not
+     * part of the token. If it is found within the token an exception is 
thrown.<br>
+     * No other restrictions are placed on the contents of the tokens.
+     * Note: This Case does support empty tokens.<br>
+     * </p>
+     * @param tokens the tokens to be formatted into a delimited string
+     * @return the delimited string
+     * @throws IllegalArgumentException if any tokens contain the delimiter 
character
+     */
+    @Override
+    public String format(Iterable<String> tokens) {
+        StringBuilder formattedString = new StringBuilder();
+        int i = 0;
+        for (String token : tokens) {
+            int delimiterFoundIndex = token.indexOf(formatDelimiter);
+            if (delimiterFoundIndex > -1) {
+                throw new IllegalArgumentException("Token " + i + " contains 
delimiter character '" + formatDelimiter + "' at index " + delimiterFoundIndex);
+            }
+            if (i > 0) {
+                formattedString.append(formatDelimiter);
+            }
+            i++;
+            formattedString.append(token);
+        }
+        return formattedString.toString();
+    }
+
+    /**
+     * Parses delimited string into tokens.
+     * <p>
+     * Input string is parsed one character at a time until a delimiter 
character is reached.
+     * When a delimiter character is reached a new token begins. The delimiter 
character is
+     * considered reserved, and is omitted from the returned parsed tokens.<br>
+     * No other restrictions are placed on the contents of the input string. 
<br>
+     * </p>
+     * @param string the delimited string to be parsed
+     * @return the list of tokens found in the string
+     */
+    @Override
+    public List<String> parse(String string) {
+        List<String> tokens = new ArrayList<>();
+        if (string.length() == 0) {
+            return tokens;
+        }
+        int strLen = string.length();
+        int[] tokenCodePoints = new int[strLen];
+        int tokenCodePointsOffset = 0;
+        for (int i = 0; i < string.length();) {
+            final int codePoint = string.codePointAt(i);
+            if (parseDelimiters.contains(codePoint)) {
+                tokens.add(new String(tokenCodePoints, 0, 
tokenCodePointsOffset));
+                tokenCodePoints = new int[strLen];
+                tokenCodePointsOffset = 0;
+                i++;
+            } else {
+                tokenCodePoints[tokenCodePointsOffset++] = codePoint;
+                i += Character.charCount(codePoint);
+            }
+        }
+        tokens.add(new String(tokenCodePoints, 0, tokenCodePointsOffset));
+        return tokens;
+    }
+
+    /**
+     * Converts an array of delimiters to a hash set of code points. The 
generated hash set provides O(1) lookup time.
+     *
+     * @param delimiters set of characters to determine capitalization, null 
means whitespace
+     * @return the Set of delimiter characters in the input array
+     */
+    private static Set<Integer> generateDelimiterSet(final char[] delimiters) {

Review Comment:
   Premature optimization is the root of all evil in programming -- Knuth
   
   Given the likely small size of the set, linear search in an array/list will 
likely be much faster. All the time here will go to constant overhead.
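
   For illustration, a linear scan over the original char[] might look like this. DelimiterLookupSketch and isDelimiter are hypothetical names. Whether it actually beats the HashSet would need measuring, but it does avoid autoboxing each code point for the Set<Integer> lookup.

```java
// Hypothetical helper for illustration; not code from the PR.
final class DelimiterLookupSketch {

    private DelimiterLookupSketch() {
    }

    // Returns true if codePoint equals one of the delimiter chars.
    // Supplementary code points never match, because every delimiter is a single char.
    static boolean isDelimiter(char[] delimiters, int codePoint) {
        for (char delimiter : delimiters) {
            if (delimiter == codePoint) {
                return true;
            }
        }
        return false;
    }
}
```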



##########
src/main/java/org/apache/commons/text/cases/UpperCaseDelimitedCase.java:
##########
@@ -0,0 +1,163 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.commons.text.cases;
+
+import java.util.ArrayList;
+import java.util.List;
+
+/**
+ * Case implementation which parses and formats strings where tokens are 
delimited by upper case characters.
+ */
+public class UpperCaseDelimitedCase implements Case {
+
+    /** Flag to indicate whether the first character of the first token should 
be upper cased. */
+    private boolean lowerCaseFirstCharacter = false;
+
+    /**
+     * Constructs a new UpperCaseDelimitedCase instance.
+     */
+    protected UpperCaseDelimitedCase(boolean lowerCaseFirstCharacter) {
+        this.lowerCaseFirstCharacter = lowerCaseFirstCharacter;
+    }
+
+    /**
+     * Parses a string into tokens.
+     * <p>
+     * String characters are iterated over and when an upper case unicode 
character is
+     * encountered, that character is considered to be the start of a new 
token, with the character

Review Comment:
   is considered to be the start of --> starts



##########
src/main/java/org/apache/commons/text/cases/UpperCaseDelimitedCase.java:
##########
@@ -0,0 +1,163 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.commons.text.cases;
+
+import java.util.ArrayList;
+import java.util.List;
+
+/**
+ * Case implementation which parses and formats strings where tokens are 
delimited by upper case characters.
+ */
+public class UpperCaseDelimitedCase implements Case {
+
+    /** Flag to indicate whether the first character of the first token should 
be upper cased. */
+    private boolean lowerCaseFirstCharacter = false;
+
+    /**
+     * Constructs a new UpperCaseDelimitedCase instance.
+     */
+    protected UpperCaseDelimitedCase(boolean lowerCaseFirstCharacter) {
+        this.lowerCaseFirstCharacter = lowerCaseFirstCharacter;
+    }
+
+    /**
+     * Parses a string into tokens.
+     * <p>
+     * String characters are iterated over and when an upper case unicode 
character is
+     * encountered, that character is considered to be the start of a new 
token, with the character
+     * itself included in the token. This method will never return empty 
tokens.
+     * </p>
+     * @param string the string to parse
+     * @return the list of tokens found in the string
+     */
+    @Override
+    public List<String> parse(String string) {
+        List<String> tokens = new ArrayList<>();
+        if (string.length() == 0) {

Review Comment:
   It's not wrong to throw a NullPointerException here, but I do wonder if it 
would be a more convenient API if this returned an empty list on null. I'm not 
sure.
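
   A sketch of the null-tolerant variant, to make the trade-off concrete. UpperCaseParseSketch is a hypothetical name, and the loop is a simplified reading of the Javadoc above (an upper case code point starts a new token), not the PR's actual implementation, which may treat token case differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not code from the PR.
final class UpperCaseParseSketch {

    private UpperCaseParseSketch() {
    }

    // Splits before each upper case code point; null and "" both yield an empty list.
    static List<String> parse(String string) {
        List<String> tokens = new ArrayList<>();
        if (string == null || string.isEmpty()) {
            return tokens;
        }
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < string.length();) {
            int codePoint = string.codePointAt(i);
            // Close the running token when a new upper case code point begins the next one.
            if (Character.isUpperCase(codePoint) && current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
            current.appendCodePoint(codePoint);
            i += Character.charCount(codePoint);
        }
        tokens.add(current.toString());
        return tokens;
    }
}
```

   The cost is that a null passed by mistake is silently accepted, so this is a convenience choice rather than a clear win.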



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
