Github user BryanCutler commented on a diff in the pull request:
https://github.com/apache/spark/pull/20838#discussion_r204190696
--- Diff: dev/create-release/releaseutils.py ---
@@ -149,7 +152,11 @@ def get_commits(tag):
     if not is_valid_author(author):
         author = github_username
     # Guard against special characters
-    author = unidecode.unidecode(unicode(author, "UTF-8")).strip()
+    try:  # Python 2
+        author = unicode(author, "UTF-8")
+    except NameError:  # Python 3
+        author = str(author)
+    author = unidecode.unidecode(author).strip()
--- End diff --
My thought was that we are already casting `author` to unicode with
`unicode(author)`, and it doesn't really matter whether it is "UTF-8" or not
because we then immediately decode it into ASCII with `unidecode`, which can
handle it even if it wasn't "UTF-8", so the end result should be the same, I
believe. It was just to clean up a little, so not a big deal either way. The
way it is now replicates the old behavior, so it's probably safer.
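
For reference, a minimal sketch of why keeping the explicit encoding may be the safer choice. It assumes Python 3 and uses a hypothetical helper `to_text` that is not part of `releaseutils.py`; the sample byte string is made up for illustration. As far as I know, Python 2's `unicode(author)` without an encoding argument falls back to the default ASCII codec, which the `"ascii"` case below imitates.

```python
def to_text(author, encoding="UTF-8"):
    """Return `author` as text, decoding it first if it is a byte string.

    Hypothetical helper for illustration only.
    """
    if isinstance(author, bytes):
        return author.decode(encoding)
    return author


# A made-up author name, UTF-8 encoded, containing a non-ASCII character.
raw = b"Jos\xc3\xa9"

# Decoding with the explicit UTF-8 encoding succeeds.
decoded = to_text(raw)

# Decoding the same bytes as ASCII fails before any transliteration
# step would get a chance to run.
try:
    to_text(raw, encoding="ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False
```

If the author string only ever contains ASCII, both paths give the same result, which matches the reasoning above; the difference only shows up for non-ASCII input.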