HyukjinKwon edited a comment on pull request #30232: URL: https://github.com/apache/spark/pull/30232#issuecomment-720965482
BTW, @dongjoon-hyun, it seems Ubuntu 20.04 changed its default Python version to Python 3 (?). I think that's why the error message looks a bit weird, like `b'......`. I have seen this symptom before. We deprecated Python 2 in Spark 3 and added Python 3 support to the dev scripts too (in SPARK-27889), but `branch-2.4` does not have the fix yet, IIRC.

Can you try removing this line https://github.com/apache/spark/pull/30232/files#diff-48c0ee97c53013d18d6bbae44648f7fab9af2e0bf5b0dc1ca761e18ec5c478f2R126:

```diff
 # Yarn has a Python specific test too, for example, YarnClusterSuite.
- if: contains(matrix.modules, 'yarn') || contains(matrix.modules, 'pyspark') || (contains(matrix.modules, 'sql') && !contains(matrix.modules, 'sql-'))
 with:
```

so that the builds always install Python 2? As far as I remember, the last installed Python becomes the default `python`.
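
For anyone unfamiliar with the `b'......` symptom mentioned in the comment, here is a minimal sketch of how it typically arises. It is not taken from the Spark dev scripts. It assumes a script that captures a child process's output and formats it into a message, and the names `raw`, `shown_raw`, and `shown_decoded` are made up for illustration. Under Python 3, `subprocess.check_output` returns `bytes`, so formatting it directly prints the `b'...'` representation; decoding first gives the plain text.

```python
import subprocess
import sys

# Stand-in for any external command a script might run (hypothetical example).
raw = subprocess.check_output([sys.executable, "-c", "print('hello')"])

# On Python 3, `raw` is bytes, so formatting it directly embeds the
# bytes repr (b'...') in the message instead of the text itself.
shown_raw = "output: %s" % raw

# Decoding the bytes first yields the text that Python 2 code expected.
shown_decoded = "output: %s" % raw.decode("utf-8").strip()

print(shown_raw)
print(shown_decoded)
```

Code written for Python 2 that is run by a Python 3 interpreter without such decoding would show messages of the first kind, which matches the symptom described in the comment.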
