GitHub user BryanCutler commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20838#discussion_r204190696
  
    --- Diff: dev/create-release/releaseutils.py ---
    @@ -149,7 +152,11 @@ def get_commits(tag):
                 if not is_valid_author(author):
                     author = github_username
             # Guard against special characters
    -        author = unidecode.unidecode(unicode(author, "UTF-8")).strip()
    +        try:               # Python 2
    +            author = unicode(author, "UTF-8")
    +        except NameError:  # Python 3
    +            author = str(author)
    +        author = unidecode.unidecode(author).strip()
    --- End diff --
    
    My thought was that we are already casting `author` to unicode with 
`unicode(author)`, and it doesn't really matter whether it is "UTF-8" or not 
because we then immediately convert it to ASCII with `unidecode`, which can 
handle it even if it wasn't "UTF-8", so I believe the end result should be the 
same.  It was just to clean up a little, so not a big deal either way.  The 
way it is now replicates the old behavior, so it's probably safer. 
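
For anyone who wants to try the behavior discussed above, here is a minimal sketch of a Python 2/3-compatible conversion. The helper name `to_text` is hypothetical and is not part of `releaseutils.py`; the sketch assumes `author` arrives either as a byte string or as text. It keeps the explicit "UTF-8" argument, as the diff does, which should match the old behavior for non-ASCII names.

```python
def to_text(author, encoding="UTF-8"):
    """Return `author` as a text string on both Python 2 and Python 3.

    Hypothetical helper for illustration only; not part of releaseutils.py.
    """
    try:
        text_type = unicode  # Python 2: the text type is `unicode`
    except NameError:
        text_type = str      # Python 3: the text type is `str`
    # Byte strings need an explicit decode; text passes through unchanged.
    if isinstance(author, bytes) and not isinstance(author, text_type):
        return author.decode(encoding)
    return text_type(author)


# UTF-8 encoded bytes for a name with an accented character.
name = to_text(b"Jos\xc3\xa9")

# In the real script the result would then go through the third-party
# `unidecode` package, as in the diff:
#     author = unidecode.unidecode(to_text(author)).strip()
```

The `isinstance` check is used here instead of `try`/`except NameError` around the conversion itself so that byte strings are also decoded on Python 3, where `str(b"...")` would return the `b'...'` repr rather than the decoded text.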

