potiuk commented on a change in pull request #17057:
URL: https://github.com/apache/airflow/pull/17057#discussion_r671847272
##########
File path: airflow/kubernetes/kubernetes_helper_functions.py
##########
@@ -37,7 +37,7 @@ def _strip_unsafe_kubernetes_special_chars(string: str) -> str:
:param string: The requested Pod name
:return: Pod name stripped of any unsafe characters
"""
- return ''.join(ch.lower() for ch in list(string) if ch.isalnum())
+ return ''.join(ch.lower() for ch in list(string) if ch.isalnum()).encode('ascii', 'ignore').decode()
Review comment:
Ah, I see the comment above. But I think it would be better to use
unidecode. Maybe that is a little extreme, but Polish, for example, has a few
words that consist only, or mostly, of accented characters. They are not used
often, but dropping those characters could lead to quite some ambiguity, since
such words would collapse to an empty or nearly empty string. And it might be
worse in other languages.
Example words in Polish with only accented characters:
żółć, łóż, łżą, żąć, żął
A few short ones:
łódź, łażącą, łożącą, łóżmyż, łóżże, łżącą, żąłeś, żęłaś, żółcą, żółcę
A few longer words with mostly accented characters:
niedołężność, współdźwięcznością, żółtoróżowością,
pięćdziesięciopięcioipółlatkąście
:D
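
To make the ambiguity concrete, here is a minimal sketch that uses only the standard library. `strip_ascii_ignore` mirrors the expression from the diff above. `strip_with_nfkd` is a hypothetical alternative, not part of the PR, that decomposes accents with `unicodedata` before dropping non-ASCII characters. Both names are assumptions made for illustration. Note that NFKD is not a full transliteration: `ł` has no Unicode decomposition, so it is still dropped, which is one reason a transliteration library such as unidecode may be the better fit.

```python
import unicodedata


def strip_ascii_ignore(string: str) -> str:
    """Mirror of the expression in the diff: keep alphanumerics,
    lowercase them, then silently drop every non-ASCII character."""
    kept = ''.join(ch.lower() for ch in string if ch.isalnum())
    return kept.encode('ascii', 'ignore').decode()


def strip_with_nfkd(string: str) -> str:
    """Hypothetical stdlib-only alternative: split accented letters into
    base letter + combining mark, then drop whatever is still non-ASCII."""
    decomposed = unicodedata.normalize('NFKD', string)
    ascii_only = decomposed.encode('ascii', 'ignore').decode()
    return ''.join(ch.lower() for ch in ascii_only if ch.isalnum())


words = ['żółć', 'łóż', 'łżą', 'żąć', 'żął']

# With plain 'ignore', every one of these words collapses to ''.
print([strip_ascii_ignore(w) for w in words])

# NFKD keeps the base letters, but 'ł' is still lost, so distinct
# words such as 'łżą' and 'żął' end up with the same result.
print([strip_with_nfkd(w) for w in words])
```

With plain `'ignore'`, all five words become the empty string. With NFKD they keep some letters, but different words can still collide.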