Title case is extremely confusing. There are two different notions of "title case" in common usage. One is essentially an uppercase form for ligatures (single characters that combine two letters, such as U+01C6) that uppercases the first "part" of the ligature but not the second. This is what the Unicode titlecase property refers to and what Character.toTitleCase does. It's not clear that this notion applies to strings at all. The other notion of title case applies to phrases and headings, where it means uppercasing the first letter of each word, possibly excluding non-initial articles and prepositions.
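To make the two notions concrete, here is a minimal Java sketch, not a proposed patch. It uses only JDK calls (Character.toTitleCase, Character.toUpperCase, String.toLowerCase). The class name TitleCaseDemo and the helper toTitleCase(String, Set<String>) are hypothetical; the signature mirrors the one suggested in the quoted message below, and splitting words on single spaces is a simplifying assumption.

```java
import java.util.Locale;
import java.util.Set;

// Hypothetical demo class; nothing here exists in Commons Lang.
final class TitleCaseDemo {

    private TitleCaseDemo() {
    }

    /**
     * Phrase-level title case: titlecases the first code point of each word
     * and lowercases the rest. Words found in {@code exclusions} (compared in
     * lowercase) stay lowercase unless they are the first word.
     * Assumption: words are separated by single spaces.
     */
    static String toTitleCase(String text, Set<String> exclusions) {
        String[] words = text.split(" ", -1);
        StringBuilder out = new StringBuilder(text.length());
        boolean seenWord = false;
        for (int i = 0; i < words.length; i++) {
            if (i > 0) {
                out.append(' ');
            }
            String word = words[i];
            if (word.isEmpty()) {
                continue;
            }
            // Locale.ROOT keeps the result independent of the default locale.
            String lower = word.toLowerCase(Locale.ROOT);
            if (seenWord && exclusions.contains(lower)) {
                out.append(lower);
            } else {
                int first = lower.codePointAt(0);
                // Character-level notion: titlecase, not uppercase.
                out.appendCodePoint(Character.toTitleCase(first));
                out.append(lower, Character.charCount(first), lower.length());
            }
            seenWord = true;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        char dz = '\u01C6'; // LATIN SMALL LETTER DZ WITH CARON
        // Uppercase form is U+01C4; titlecase form is U+01C5.
        System.out.println(Integer.toHexString(Character.toUpperCase(dz)));
        System.out.println(Integer.toHexString(Character.toTitleCase(dz)));
        System.out.println(toTitleCase("the lord of the rings", Set.of("of", "the")));
    }
}
```

For U+01C6, toUpperCase yields U+01C4 while toTitleCase yields U+01C5. For characters with no explicit titlecase mapping, such as ordinary Latin letters, Character.toTitleCase falls back to the uppercase mapping, so the two calls agree.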

On Mon, Aug 11, 2025 at 12:37 PM Miguel Muñoz <swingguy1...@gmail.com> wrote:
>
> When you implement the toTitleCase() method, I hope you will consider the
> multilingual capability of Unicode, and use the Character.toTitleCase() and
> Character.isTitleCase() methods to convert or test a word's first
> character. If the text is in English, this method returns the same result
> as Character.toUpperCase(), but there is at least one language where the
> toTitleCase() method returns a different result. (I think that language is
> Turkish.) You should also include the relevant characters in your unit
> tests.
>
> The reason is that certain characters are ligatures, which are two letters
> combined into a single character. Some ligatures may appear at the
> beginning of words. These have three forms, an upper case form, a lower
> case form, and a title case form. Calling the toTitleCase() method will
> convert all characters to the proper title-case form.
>
> Also, since English use of title case excludes articles, prepositions and
> conjunctions, you may want to add a method with a signature like this:
>
> public static String toTitleCase(String text, Set<String> exclusions)
>
> -- Miguel Muñoz
>
>
> On Sat, Aug 2, 2025 at 4:56 AM Kunal Bhangale <
> bhangalekunal2631...@gmail.com> wrote:
>
> > Hi Commons Lang Developers,
> >
> > I would like to propose the addition of some new utility methods to
> > `StringUtils` in Apache Commons Lang. These methods are commonly needed in
> > real-world projects but currently not available in the library.
> >
> > Here are some initial ideas:
> >
> >
> > 1. *findAllOccurrences(String str, String subStr)*
> > - Description: Returns a list of all indexes where a substring occurs in
> > the main string.
> > - Example: findAllOccurrences("abcabc", "a") → [0, 3]
> >
> > 2. *toTitleCase(String str)*
> > - Description: Converts each word's first character to uppercase and the
> > rest to lowercase.
> > - Example: toTitleCase("hello world") → "Hello World"
> >
> > 3. *smartTruncate(String str, int maxLength)*
> > - Description: Truncates the string to the nearest full word under the
> > limit and appends "..." if needed.
> > - Example: smartTruncate("This is a long sentence", 10) → "This is..."
> >
> > 4. *removeRepeatedCharacters(String str)*
> > - Description: Removes consecutive duplicate characters.
> > - Example: removeRepeatedCharacters("aaabbbcccaaa") → "abca"
> >
> > 5. *isTitleCase(String str)*
> > - Description: Checks if the input is in title case format.
> > - Example: isTitleCase("Hello World") → true
> >
> > 6. *countWords(String str)*
> > - Description: Returns the number of words in the input string.
> > - Example: countWords("Apache Commons Lang") → 3
> >
> >
> > I’d be happy to implement these methods and write appropriate JUnit tests.
> > If the community finds these valuable, I can create a JIRA issue and start
> > working on the patch.
> >
> > Looking forward to your feedback!
> >
> > Thanks & Regards,
> > Kunal Bhangale
> > bhangalekunal2631...@gmail.com
> >

--
Elliotte Rusty Harold
elh...@ibiblio.org