I've experimentally added https://issues.apache.org/jira/browse/CASSANDRA-16984 to https://issues.apache.org/jira/browse/CASSANDRA-18306 (post 4.0 cleanup)
- - -- --- ----- -------- -------------
Jacek Lewandowski


On Fri, 10 Mar 2023 at 09:56, Berenguer Blasi <berenguerbl...@gmail.com> wrote:

> +1 deprecate + removal
>
> On 10/3/23 1:41, Jeremy Hanna wrote:
>
> It was mainly to integrate with Hadoop - I used it from 0.6 to 1.2 in production prior to starting at DataStax, and at that time I was stitching together Cloudera's distribution of Hadoop with Cassandra. Back then there were others that used it as well. As far as I know, usage dropped off when the Spark Cassandra Connector got pretty mature. It enabled people to take an off-the-shelf Hadoop distribution, run the Hadoop processes on the same nodes or external to the Cassandra cluster, and get topology information to do things like Hadoop splits through the Hadoop interfaces. I think the version lag is an indication that it hasn't been used recently. Also, like others have said, the Spark Cassandra Connector is really what people should be using at this point imo. That or, depending on the use case, Apple's bulk reader: https://github.com/jberragan/spark-cassandra-bulkreader that is mentioned on https://issues.apache.org/jira/browse/CASSANDRA-16222.
>
> On Mar 9, 2023, at 12:00 PM, Rahul Xavier Singh <rahul.xavier.si...@gmail.com> wrote:
>
> What is the hadoop code for? For interacting from Hadoop via CQL, or Thrift if it's that old, or directly looking at SSTables? Been using C* since 2 and have never used it.
>
> Agree to deprecate in next possible 4.1.x version and remove in 5.0
>
> Rahul Singh
> Chief Executive Officer | Business Platform Architect
> m: 202.905.2818 e: rahul.si...@anant.us li: http://linkedin.com/in/xingh ca: http://calendly.com/xingh
>
> *We create, support, and manage real-time global data & analytics platforms for the modern enterprise.*
>
> *Anant | https://anant.us*
> 3 Washington Circle, Suite 301
> Washington, D.C.
> 20037
>
> *http://Cassandra.Link* : The best resources for Apache Cassandra
>
>
> On Thu, Mar 9, 2023 at 12:53 PM Brandon Williams <dri...@gmail.com> wrote:
>
>> I think if we reach consensus here that decides it. I too vote to deprecate in 4.1.x. This means we would remove it in 5.0.
>>
>> Kind Regards,
>> Brandon
>>
>> On Thu, Mar 9, 2023 at 11:32 AM Ekaterina Dimitrova <e.dimitr...@gmail.com> wrote:
>> >
>> > Deprecation sounds good to me, but I am not completely sure in which version we can do it. If it is possible to add a deprecation warning in the 4.x series or at least 4.1.x - I vote for that.
>> >
>> > On Thu, 9 Mar 2023 at 12:14, Jacek Lewandowski <lewandowski.ja...@gmail.com> wrote:
>> >>
>> >> Is it possible to deprecate it in the 4.1.x patch release? :)
>> >>
>> >>
>> >> - - -- --- ----- -------- -------------
>> >> Jacek Lewandowski
>> >>
>> >>
>> >> On Thu, 9 Mar 2023 at 18:11, Brandon Williams <dri...@gmail.com> wrote:
>> >>>
>> >>> This is my feeling too, but I think we should accomplish this by deprecating it first. I don't expect anything will change after the deprecation period.
>> >>>
>> >>> Kind Regards,
>> >>> Brandon
>> >>>
>> >>> On Thu, Mar 9, 2023 at 11:09 AM Jacek Lewandowski <lewandowski.ja...@gmail.com> wrote:
>> >>> >
>> >>> > I vote for removing it entirely.
>> >>> >
>> >>> > thanks
>> >>> > - - -- --- ----- -------- -------------
>> >>> > Jacek Lewandowski
>> >>> >
>> >>> >
>> >>> > On Thu, 9 Mar 2023 at 18:07, Miklosovic, Stefan <stefan.mikloso...@netapp.com> wrote:
>> >>> >>
>> >>> >> Derek,
>> >>> >>
>> >>> >> I have a couple more points ... I do not think that extracting it to a separate repository is a "win". That code is on Hadoop 1.0.3. We would be spending a lot of work on extracting it just to extract 10-year-old code with occasional updates (in my humble opinion, just to make it compilable again if the code around it changes).
>> >>> >> What good is there in that? We would have one more place to take care of ... Now we at least have it all in one place.
>> >>> >>
>> >>> >> I believe we have four options:
>> >>> >>
>> >>> >> 1) leave it there, so it stays as it is for the coming years with questionable and diminishing usage
>> >>> >> 2) update it to Hadoop 3.3 (I wonder who is going to do that)
>> >>> >> 3) do 2) and extract it to a separate repository, but if we do 2) we can just leave it there
>> >>> >> 4) remove it
>> >>> >>
>> >>> >> ________________________________________
>> >>> >> From: Derek Chen-Becker <de...@chen-becker.org>
>> >>> >> Sent: Thursday, March 9, 2023 15:55
>> >>> >> To: dev@cassandra.apache.org
>> >>> >> Subject: Re: Role of Hadoop code in Cassandra 5.0
>> >>> >>
>> >>> >> I think the question isn't "Who ... is still using that?" but more "are we actually going to support it?" If we're on a version that old it would appear that we've basically abandoned it, although there do appear to have been refactoring (for other things) commits in the last couple of years. I would be in favor of removal from 5.0, but at the very least, could it be moved into a separate repo/package so that it's not pulling a relatively large dependency subtree from Hadoop into our main codebase?
>> >>> >>
>> >>> >> Cheers,
>> >>> >>
>> >>> >> Derek
>> >>> >>
>> >>> >> On Thu, Mar 9, 2023 at 6:44 AM Miklosovic, Stefan <stefan.mikloso...@netapp.com> wrote:
>> >>> >> Hi list,
>> >>> >>
>> >>> >> I stumbled upon the Hadoop package again. I think there was some discussion about the relevancy of Hadoop code some time ago but I would like to ask this again.
>> >>> >>
>> >>> >> Do you think Hadoop code (1) is still relevant in 5.0?
>> >>> >> Who in the industry is still using that?
>> >>> >>
>> >>> >> We might drop a lot of code and some Hadoop dependencies too (3) (even though their scope is "provided"). The version of Hadoop we build upon is 1.0.3, which was released 10 years ago. This code has neither tests nor documentation on the website.
>> >>> >>
>> >>> >> There seem to be issues like this (2), and it seems like the solution is, basically, to use the Spark Cassandra Connector instead, which I would say is quite reasonable.
>> >>> >>
>> >>> >> Regards
>> >>> >>
>> >>> >> (1) https://github.com/apache/cassandra/tree/trunk/src/java/org/apache/cassandra/hadoop
>> >>> >> (2) https://lists.apache.org/thread/jdy5hdc2l7l29h04dqol5ylroqos1y2p
>> >>> >> (3) https://github.com/apache/cassandra/blob/trunk/.build/parent-pom-template.xml#L507-L589
>> >>> >>
>> >>> >>
>> >>> >> --
>> >>> >> +---------------------------------------------------------------+
>> >>> >> | Derek Chen-Becker                                             |
>> >>> >> | GPG Key available at https://keybase.io/dchenbecker and       |
>> >>> >> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
>> >>> >> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7 7F42 AFC5 AFEE 96E4 6ACC   |
>> >>> >> +---------------------------------------------------------------+