[ 
https://issues.apache.org/jira/browse/FREEMARKER-97?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16500684#comment-16500684
 ] 

T X commented on FREEMARKER-97:
-------------------------------

David Gouch's JavaScript implementation (To Title Case) translates into Java 
reasonably easily and essentially follows the Chicago Manual of Style, so it 
would be suitable as the base implementation:

{code:javascript}
/*
 * To Title Case 2.1 – http://individed.com/code/to-title-case/
 * Copyright © 2008–2013 David Gouch. Licensed under the MIT License.
 */

String.prototype.toTitleCase = function(){
  var smallWords = /^(a|an|and|as|at|but|by|en|for|if|in|nor|of|on|or|per|the|to|vs?\.?|via)$/i;

  return this.replace(/[A-Za-z0-9\u00C0-\u00FF]+[^\s-]*/g, function(match, index, title){
    if (index > 0 && index + match.length !== title.length &&
      match.search(smallWords) > -1 && title.charAt(index - 2) !== ":" &&
      (title.charAt(index + match.length) !== '-' || title.charAt(index - 1) === '-') &&
      title.charAt(index - 1).search(/[^\s-]/) < 0) {
      return match.toLowerCase();
    }

    if (match.substr(1).search(/[A-Z]|\../) > -1) {
      return match;
    }

    return match.charAt(0).toUpperCase() + match.substr(1);
  });
};
{code}
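
As a rough check of how the code above treats some of the sample titles from the issue, here is a minimal sketch that restates it as a plain function instead of a {{String.prototype}} extension. The free-function form and the parameter name {{whole}} are my own choices for illustration; the regular expressions and conditions are copied unchanged. The outputs in the comments come from tracing the code by hand.

```javascript
// Same logic as String.prototype.toTitleCase above, as a standalone function.
function toTitleCase(title) {
  var smallWords = /^(a|an|and|as|at|but|by|en|for|if|in|nor|of|on|or|per|the|to|vs?\.?|via)$/i;

  return title.replace(/[A-Za-z0-9\u00C0-\u00FF]+[^\s-]*/g, function (match, index, whole) {
    // Lowercase small words unless they are first, last, follow a colon,
    // or begin a hyphenated compound.
    if (index > 0 && index + match.length !== whole.length &&
        match.search(smallWords) > -1 && whole.charAt(index - 2) !== ":" &&
        (whole.charAt(index + match.length) !== "-" || whole.charAt(index - 1) === "-") &&
        whole.charAt(index - 1).search(/[^\s-]/) < 0) {
      return match.toLowerCase();
    }

    // Leave words with internal capitals or dots alone (iPhone, main.xhtml).
    if (match.substr(1).search(/[A-Z]|\../) > -1) {
      return match;
    }

    // Otherwise capitalize the first letter only.
    return match.charAt(0).toUpperCase() + match.substr(1);
  });
}

console.log(toTitleCase("on iPhone the transcript extends outside of screen frame"));
// On iPhone the Transcript Extends Outside of Screen Frame

console.log(toTitleCase("JWebUnit: Non-PEN Orders main.xhtml Meta Refresh Tag Issue"));
// JWebUnit: Non-PEN Orders main.xhtml Meta Refresh Tag Issue

console.log(toTitleCase("BCMailPlusFTPClient Sends Document without Document ID"));
// BCMailPlusFTPClient Sends Document Without Document ID
```

The last result capitalizes "without" because that word is not in the {{smallWords}} list, which matches the behaviour noted in the issue for the individed.com site.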

Alternatively, there is John Resig's [JavaScript 
port|https://johnresig.com/files/titleCaps.js] of John Gruber's Title Caps:

{code:javascript}
/*
 * Title Caps
 * 
 * Ported to JavaScript By John Resig - http://ejohn.org/ - 21 May 2008
 * Original by John Gruber - http://daringfireball.net/ - 10 May 2008
 * License: http://www.opensource.org/licenses/mit-license.php
 */

(function(){
        var small = "(a|an|and|as|at|but|by|en|for|if|in|of|on|or|the|to|v[.]?|via|vs[.]?)";
        var punct = "([!\"#$%&'()*+,./:;<=>?@[\\\\\\]^_`{|}~-]*)";
  
        this.titleCaps = function(title){
                var parts = [], split = /[:.;?!] |(?: |^)["“]/g, index = 0;
                
                while (true) {
                        var m = split.exec(title);

                        parts.push( title.substring(index, m ? m.index : title.length)
                                .replace(/\b([A-Za-z][a-z.'’]*)\b/g, function(all){
                                        return /[A-Za-z]\.[A-Za-z]/.test(all) ? all : upper(all);
                                })
                                .replace(RegExp("\\b" + small + "\\b", "ig"), lower)
                                .replace(RegExp("^" + punct + small + "\\b", "ig"), function(all, punct, word){
                                        return punct + upper(word);
                                })
                                .replace(RegExp("\\b" + small + punct + "$", "ig"), upper));
                        
                        index = split.lastIndex;
                        
                        if ( m ) parts.push( m[0] );
                        else break;
                }
                
                return parts.join("").replace(/ V(s?)\. /ig, " v$1. ")
                        .replace(/(['’])S\b/ig, "$1s")
                        .replace(/\b(AT&T|Q&A)\b/ig, function(all){
                                return all.toUpperCase();
                        });
        };
    
        function lower(word){
                return word.toLowerCase();
        }
    
        function upper(word){
          return word.substr(0,1).toUpperCase() + word.substr(1);
        }
})();
{code}

{quote}Then there's the slight problem that if we support, say, English, French 
and Spanish out of the box, then why not language X... people can read things 
into that.{quote}

Be open and honest about it. It takes time, skill, and multilingual fluency to 
develop software that operates on multiple human languages, and people are 
welcome to contribute. It's a good idea to make the implementation pluggable 
so that writing different converters is easy. (Also, {{title_case}} isn't 
applicable to every language: some writing systems have no concept of upper 
case and lower case.)

> Header capitalization using standard styles
> -------------------------------------------
>
>                 Key: FREEMARKER-97
>                 URL: https://issues.apache.org/jira/browse/FREEMARKER-97
>             Project: Apache Freemarker
>          Issue Type: Wish
>          Components: engine
>    Affects Versions: 2.3.28
>            Reporter: T X
>            Priority: Minor
>
> FreeMarker offers a couple of simple algorithms for changing the case of 
> titles:
>  * 
> [https://freemarker.apache.org/docs/ref_builtins_string.html#ref_builtin_cap_first|https://freemarker.apache.org/docs/ref_builtins_string.html#ref_builtin_cap_first]
>  * 
> [https://freemarker.apache.org/docs/ref_builtins_string.html#ref_builtin_capitalize|https://freemarker.apache.org/docs/ref_builtins_string.html#ref_builtin_capitalize]
> Neither of these capitalizes the text in ways that adhere to various standard 
> styles:
>  * [Chicago Manual of 
> Style|https://en.wikipedia.org/wiki/The_Chicago_Manual_of_Style]
>  * [Associated Press|https://en.wikipedia.org/wiki/AP_Stylebook]
>  * [MLA Style Manual|https://en.wikipedia.org/wiki/MLA_Style_Manual]
>  * [APA Style|https://en.wikipedia.org/wiki/APA_style]
> Consider the following texts:
>  * On iPhone the Transcript Extends Outside of Screen Frame
>  * PEAR And GNA Report Performance
>  * BCMailPlusFTPClient Sends Document without Document ID
>  * JWebUnit: Non-PEN Orders main.xhtml Meta Refresh Tag Issue
> These are correct as written and must not be altered by an algorithm that 
> changes a title's case. There are a couple of websites that produce the 
> expected titles (note that the second site capitalizes the word "without," 
> which implies the algorithm does not use Chicago conventions):
>  * [https://titlecaseconverter.com/]
>  * [http://individed.com/code/to-title-case/]
> There are a variety of implementations that perform such a feat:
> * 
> [JavaScript|https://github.com/gouch/to-title-case/blob/master/to-title-case.js]
> * [Perl|https://gist.github.com/gruber/9f9e8650d68b13ce4d78]
> * [PHP|https://gist.github.com/HipsterJazzbo/2532c93a18db7451b0cec529c95b53c4]
> These implementations do not require a whitelist. (So "iPhone" and 
> "ClassName" will remain as given.) Apache Commons' {{WordUtils}} class does 
> not implement Chicago Style, and I suspect it is also a fairly simple 
> algorithm.
> Since {{?capitalize}} and {{?cap_first}} are taken, I propose {{?title_case}} 
> with an optional parameter (default is Chicago):
>  * chicago
>  * ap
>  * apa
>  * mla
>  



