My apologies, I didn't clarify this earlier. I don't want to remove Python 3 itself. What I mean is removing the 'require: python' dependency from the Spark spec and control files, so that installing Spark won't pull in a Python dependency. If users need PySpark, they can install the corresponding Python version themselves, for example with Conda.
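For example, something along these lines would work (the Python version below is only illustrative; the exact requirement depends on the PySpark release being used):

    # create an isolated environment with a Python version that the chosen PySpark release supports
    conda create -n pyspark-env python=3.9
    conda activate pyspark-env
    # install PySpark from PyPI into that environment
    pip install pyspark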
Additionally, there are many extra installation steps in the Bigtop code just for managing Python 2. As far as I know, all components now support Python 3, and Python 2 has been deprecated for a long time; Bigtop simply hasn't done the Python 3 upgrade work yet, mainly because it touches the Python version used in the Spark 3 packaging as well as the Python dependencies of GPDB, Ranger, and Phoenix. These issues can be resolved, though. Ambari used to depend strongly on Python 2, but Ambari has been dropped from Bigtop, and none of the other components have a strong dependency on Python 2. For Spark, PySpark can be managed separately by users, so pinning a Python 3 version in the packaging isn't a good choice. GPDB 6 officially supports Python 3. Ranger doesn't necessarily require Python for installation; it has some Python 2 scripts, but they are used relatively sparingly. So, one of the goals of this discussion is to remove Python as a dependency for installing Spark and to make the future Python 2 to Python 3 upgrade in Bigtop easier.

> On Dec 18, 2023, at 14:50, 李帅 <lishuaipeng...@gmail.com> wrote:
>
> Python 3 has a lot of compatibility issues; different Linux distros have
> different Python 3 versions.
>
> Jialiang Cai <jialiangca...@gmail.com> wrote on Mon, Dec 18, 2023 at 09:46:
>
>> Dear Community Members,
>>
>> I would like to initiate a discussion regarding the removal of Python from
>> the Spark3 installation package. Here are a few reasons for considering
>> this change:
>>
>> 1. Unlike Apache Ambari, which installs components individually, Spark3's
>> core functionality does not depend on Python3. Therefore, it may not be
>> appropriate to make Python3 a mandatory installation dependency for Spark.
>> Spark itself can run without Python3, and users who do not intend to use
>> PySpark should still be able to install and use Spark without any issues.
>>
>> 2. The Python3 version required by PySpark is often relatively high, and
>> many operating systems do not provide such high Python versions by default.
>> Including PySpark's Python3 dependency in the Bigtop codebase would
>> introduce significant complexity. It might be more suitable for users to
>> manually install the specific Python3 version required by PySpark, perhaps
>> using Conda or other methods.
>>
>> 3. Removing the Python3 dependency from Spark can also benefit the overall
>> transition of Bigtop from Python2 to Python3. Python2 has not been
>> maintained for a considerable period, and streamlining the codebase to work
>> with Python3 can be a step toward maintaining the project's relevance and
>> security.
>>
>> I encourage everyone to share their thoughts and opinions on this matter.
>> Your feedback is valuable as we consider the best course of action.
>>
>> Thank you for your participation and input.
>>
>> Best regards,
>> jiaLiang