One nice benefit of a CQLSH PIP package that was omitted from this discussion is that it is "Python-version-agnostic". What I mean is that we currently build the CQLSH RPM in a container that uses Python 3.6, so the resulting RPM will run, believe it or not, only on distros with Python 3.6. See (1) for more details.

To solve this problem without a PIP package, we would need to start building RPMs per supported Python version. I briefly looked into which Python versions are present in the most popular RPM distros, and the most prevalent are 3.6.x, 3.9.x and 3.11.x. I personally think that solving this problem by producing three RPMs instead of one is quite impractical, but it seems that we currently do not have any other option.

If we had an official PIP package, I can imagine that we would not ship CQLSH in the RPM at all (maybe not in the DEB either?), so we would decouple this. A PIP package is installable almost anywhere, as long as Python 3 is available. That is how I solved the problem in 18642: I just installed a PIP package because the RPM installation was broken.

On the other hand, a user should be able to just download what we ship, extract it, run the db and connect to it, all out of the box. Hence I think we should still ship the CQLSH sources within the Cassandra tarball, but they might be installable locally from the tarball like:

    pip install /where/my/cassandra/tarball/is/extracted/cqlshpackage

This would search for setup.py / pyproject.toml, then build the wheel and install it locally if one wishes to do so. I do not think that depending on PIP in 2023 is a lot to ask for. PIP was made an official package manager in Python years ago.

Another problem I see is how we say which CQLSH is compatible with which Cassandra release. If we shipped CQLSH as a PIP package as part of the tarball, we would guarantee that they play together. If it lives somewhere online, how can people be sure that what they install is compatible with the Cassandra they run?

I am sorry if this was already explained somewhere.
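
To make the local-install idea a bit more concrete, here is a minimal sketch of what such a step could look like if scripted in Python. The helper name `pip_install_command` is hypothetical and nothing like it ships with Cassandra today. It only assumes that the extracted directory contains a setup.py or pyproject.toml, which is what pip needs to build from a local path.

```python
import sys
from pathlib import Path

# Packaging metadata pip needs to find in a local source directory.
PACKAGING_FILES = ("pyproject.toml", "setup.py")


def pip_install_command(source_dir):
    """Build the pip command that installs a package from a local directory.

    Hypothetical helper for illustration only. Raises FileNotFoundError
    when the directory has neither pyproject.toml nor setup.py.
    """
    src = Path(source_dir)
    if not any((src / name).is_file() for name in PACKAGING_FILES):
        raise FileNotFoundError(f"no pyproject.toml or setup.py in {src}")
    # Invoke pip through the running interpreter so the package is built
    # and installed for whatever Python version the user actually has.
    return [sys.executable, "-m", "pip", "install", str(src)]
```

The returned list can be handed to `subprocess.run(..., check=True)` with the directory where the tarball was extracted. Going through `sys.executable` is the point of the sketch: the install follows the local interpreter rather than the Python version of a build container.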

(1) https://issues.apache.org/jira/browse/CASSANDRA-18642

Regards

________________________________________
From: Dinesh Joshi <djo...@apache.org>
Sent: Wednesday, August 9, 2023 21:31
To: dev@cassandra.apache.org
Subject: Re: [Discuss] CEP-35: Add PIP support for CQLSH

Brad,

Thanks for starting this discussion. My understanding is that we're simply adding pip support for cqlsh and that the Apache Cassandra project will officially publish a cqlsh pip package. This is a good goal, but other than having an official pip package, what is it that we're gaining? Please don't interpret this as pushback on your proposal, but I am unclear on what we're trying to solve by making this an official distribution. There are several distribution channels and it is untenable to officially support all of them. If we do adopt this, it will add non-zero overhead to the release process. This is fine, but we need volunteers to run this process. My understanding is that they ideally need to be PMC members, or at least Committers on the project, to go through all the steps to successfully release a new artifact for our users.

I would have liked this CEP to go a bit further than just packaging cqlsh in pip. IMHO we should have cqlsh as a separate sub-project. It doesn't need to live in the cassandra repo. Extracting cqlsh into its own repo would allow us to truly decouple cqlsh from the server. This is already true for the most part, as we rely on the Python driver, which is compatible with several Cassandra releases. As it stands today, it is not possible for us to update cqlsh without making a Cassandra release.

If you truly want to go a bit further, we should consider rewriting cqlsh in Java so we can easily share code from the server. We can then potentially use Java Native Image[1] to produce a truly platform independent binary like golang.
Python has its strengths, but it does get hairy as it expects certain runtime components on the target. With Java Native Image we can make things very simple from a user's perspective, very similar to how golang produces statically linked binaries.

This might be a very far out thought but it is worth exploring. I believe GraalVM's license might allow us to produce binaries that we can incorporate in our release, but IANAL, so maybe we can ask ASF legal for their opinion. Giving cqlsh its own identity as a sub-project might help us build a roadmap and evolve it along these lines.

I would like other folks to chime in with their opinions.

Dinesh

On 8/9/23 09:18, Brad wrote:
>
> As per the CEP process guidelines, I'm starting a formal DISCUSS thread
> to resume the conversation started here[1].
>
> The developers who maintain the Python CQLSH client on the official
> Python PYPI repository would like to integrate and donate their open
> source work to the Apache Cassandra project so it can be more tightly
> and seamlessly integrated.
>
> The Apache Cassandra project pre-dates the adoption in Python 3.4 of
> PyPI as the default package manager. As a result, an unofficial
> distribution has been provided by a group of developers who have
> maintained the repository there since October 2013.
>
> The installable version of CQLSH on PyPI.org allows end users to install
> a cqlsh client with PIP - no tarball or path setup required. I.e.,
>
> $ pip install cqlsh
>
> This popular package has 50K downloads per month and is today maintained
> by Jeff Widman and Brad Schoening. The PYPI package is updated upon
> every major release by simply repackaging the CQLSH that ships with
> every Cassandra release.
>
> CQLSH PyPI Repository: https://pypi.org/project/cqlsh/
>
> This CEP Proposal suggests incorporating PYPI as a regular part of the
> Cassandra release process and making the CQLSH project on PYPI an
> official distribution point.
>
> The full CEP can be reviewed at:
>
> Wiki: CEP-35: Add PIP support for CQLSH
> <https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=263425995>.
>
> Jira: CASSANDRA-18654
> <https://issues.apache.org/jira/browse/CASSANDRA-18654>
>
> But in brief, the proposal will:
>
> * Add PyPI.org as an official distribution point for CQLSH
> * Allow end users to install CQLSH with simply 'pip install cqlsh' on
>   MacOS, Windows and Linux platforms.
> * Donate the modest amount of existing configuration files by the
>   authors to Apache Cassandra
> * This only involves the Python CQLSH client, no changes to
>   distribution of Java server side code and tools are involved.
>
> We welcome further discussion and suggestions regarding this proposal on
> the mailing list here.
>
> Regards,
>
> Jeff Widman &
> Brad Schoening
>
> [1] https://lists.apache.org/thread/sy3p2b2tncg1bk6x3r0r60y10dm6l18d
> <https://lists.apache.org/thread.html/ra7caa1dd42ccaa04bcabfbc33233995c125c655f9a3cdb2c7bd8e9f7%40%3Cdev.cassandra.apache.org%3E>