Hi all,

Is there any update on Gora support for Hadoop 3.x versions?

Thanks & Regards,
Gajalakshmi.G
Assistant Consultant
Tata Consultancy Services
Mailto: [email protected]

________________________________
From: Shashanka Balakuntala <[email protected]>
Sent: Tuesday, June 23, 2020 6:11 PM
To: [email protected] <[email protected]>
Subject: Re: Nutch with Hadoop 3.x version

Hi,

There is a Gora issue [1] created for this exact change. I will look into it and, if it is a minor fix, will try to fix it as well. Do keep following the issue for updates.

[1] https://issues.apache.org/jira/browse/GORA-537

On Tue, 23 Jun 2020, 16:39 Gajalakshmi G, <[email protected]> wrote:

> Hi,
>
> Thanks for the suggestion. I have already changed the Hadoop dependencies
> to a 3.x version in my Nutch 2.3.1 ivy file. I am using gora-core.jar
> (version 0.9) to store the crawl data, and it is compatible only with
> Hadoop 2.x, so my crawl does not complete. Are there any suggestions for
> making Nutch 2.x with gora-core.jar work on Hadoop 3.x?
>
> Thanks & Regards,
> Gajalakshmi.G
>
> ________________________________
> From: Shashanka Balakuntala <[email protected]>
> Sent: Tuesday, June 23, 2020 2:02 PM
> To: [email protected] <[email protected]>
> Subject: Re: Nutch with Hadoop 3.x version
>
> Hi,
>
> I would like to point out that Nutch 2.x is not under active
> maintenance/development and has been retired. That said, you can follow
> the steps below to upgrade 2.x to run on Hadoop 3.x:
>
> 1. Navigate to the root directory of Nutch 2.x (your local cloned
>    directory). If you have a binary release, clone the repository using
>    https://github.com/apache/nutch/tree/2.x and navigate to the directory.
> 2. Open ivy/ivy.xml in the editor of your choice.
> 3. After line 49 you will find <!-- Hadoop Dependencies -->. Change the
>    Hadoop dependencies to the 3.x version you want. Note that the 1.x
>    branch has been updated to 3.1.3 and works well there. To see the
>    changes that were made to port 1.x to Hadoop 3.1.3, refer to this
>    pull request: <https://github.com/apache/nutch/pull/507/files>
> 4. Make sure to change all the Hadoop dependencies in the ivy.xml file.
> 5. Go back to the root directory and build the project using "ant
>    runtime".
> 6. You can find the runnable nutch script in ./runtime/local/bin/nutch.
>    More information on building the project from source is here:
>    <https://cwiki.apache.org/confluence/display/NUTCH/Nutch2Tutorial>
>
> These changes should let Nutch use Hadoop 3.x.
>
> Regards
> Shashanka Balakuntala Srinivasa
>
> On Tue, Jun 23, 2020 at 12:08 PM Gajalakshmi G
> <[email protected]> wrote:
>
> > Hi all,
> >
> > I am using Nutch 2.3.1 with gora-core.jar version 0.6 (I have tried
> > Gora jars up to version 0.9). With these versions I am not able to
> > crawl a site successfully on Hadoop 3.x. Does the latest Nutch 2.4
> > version run on Hadoop 3.x? Do we need to make any specific changes to
> > run Nutch 2.4 on Hadoop 3.x?
> >
> > Thanks & Regards,
> > Gajalakshmi.G
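
Steps 3 and 4 in the quoted reply (changing every Hadoop dependency in ivy/ivy.xml) could also be scripted rather than edited by hand. The following is only a rough sketch under stated assumptions: the function name bump_hadoop_rev and the sample ivy fragment are hypothetical, the artifact names and revisions in the sample are placeholders rather than the actual contents of the Nutch 2.x ivy.xml, and it assumes the Hadoop entries use org="org.apache.hadoop". It does nothing about the Gora compatibility problem tracked in GORA-537.

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened ivy.xml fragment used only for illustration.
# The real Nutch 2.x file has more dependencies and different revisions.
SAMPLE_IVY = """<ivy-module version="1.0">
  <dependencies>
    <!-- Hadoop Dependencies -->
    <dependency org="org.apache.hadoop" name="hadoop-common" rev="2.5.2" conf="*->default"/>
    <dependency org="org.apache.gora" name="gora-core" rev="0.6.1" conf="*->default"/>
  </dependencies>
</ivy-module>"""


def bump_hadoop_rev(ivy_xml: str, new_rev: str) -> str:
    """Return ivy_xml with every org.apache.hadoop dependency set to new_rev."""
    # insert_comments=True (Python 3.8+) keeps comments inside the root
    # element, such as <!-- Hadoop Dependencies -->, in the output.
    parser = ET.XMLParser(target=ET.TreeBuilder(insert_comments=True))
    root = ET.fromstring(ivy_xml, parser=parser)
    for dep in root.iter("dependency"):
        # Assumption: Hadoop artifacts are identified by their org attribute.
        if dep.get("org") == "org.apache.hadoop":
            dep.set("rev", new_rev)
    # Serialization may change escaping and whitespace, so review the diff.
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(bump_hadoop_rev(SAMPLE_IVY, "3.1.3"))
```

To apply it to a real checkout, one would read ivy/ivy.xml, pass its text through the function with the desired 3.x version, write the result back, and then run "ant runtime" as in step 5. Non-Hadoop dependencies such as gora-core are left untouched.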

