Sebastian:
I'm sorry. This is the first time I've used a mailing list; would you kindly tell me how to start a new thread? Below is all I know about the mailing list: send a mail to "[email protected]".

------------------ Original Message ------------------
From: "Sebastian Nagel" <[email protected]>
Date: 2017-03-03 (Fri) 1:33 PM
To: "user" <[email protected]>
Subject: Re: How to avoid repeatedly upload job jars

Hi,

please, start a new thread for a new topic or question.
That will help others find the right answer for their problem
when searching the mailing list archive.

Thanks,
Sebastian

On 03/02/2017 11:01 AM, katta surendra babu wrote:
> Hi Sebastian,
>
> I am looking to crawl the data of a JSON-related website
> using Nutch 2.3.1, HBase 0.98, and Solr 5.6.
>
> The problem is:
>
> for the 1st round I get the JSON data into HBase, but for the second round
> I am not getting the metadata and the HTML links in Nutch.
>
> So, please help me out if you can ... to crawl the JSON website completely.
>
> On Thu, Mar 2, 2017 at 3:21 PM, Sebastian Nagel <[email protected]>
> wrote:
>
>> Hi,
>>
>> maybe the Hadoop Distributed Cache is what you are looking for?
>>
>> Best,
>> Sebastian
>>
>> On 03/02/2017 01:35 AM, 391772322 wrote:
>>> The archived nutch job jar has a size of about 400 MB, and every step
>>> uploads this archive and distributes it to every worker node. Is there
>>> a way to upload only the nutch jar, but leave the dependent libs on
>>> every worker node?
>>
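For reference, below is a minimal sketch of the Distributed Cache idea Sebastian mentions, not Nutch's actual job setup. It assumes the heavy dependency jars have already been copied once to a shared HDFS directory (the path /apps/nutch/lib and the class name here are made up for illustration), so that each job submission ships only the thin application jar and adds the shared jars to the task classpath from HDFS instead of re-uploading them.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;

public class SharedLibJobSetup {

    // Hypothetical HDFS directory where the dependency jars were copied
    // once, e.g. with: hdfs dfs -put lib/*.jar /apps/nutch/lib/
    private static final String SHARED_LIB_DIR = "/apps/nutch/lib";

    public static Job newJob(Configuration conf, String name) throws Exception {
        Job job = Job.getInstance(conf, name);

        // Ship only the thin application jar with the job submission.
        job.setJarByClass(SharedLibJobSetup.class);

        // Pull the already-uploaded dependency jars onto each task's
        // classpath via the distributed cache, so they are not uploaded
        // again on every job.
        FileSystem fs = FileSystem.get(conf);
        for (FileStatus status : fs.listStatus(new Path(SHARED_LIB_DIR))) {
            if (status.getPath().getName().endsWith(".jar")) {
                job.addFileToClassPath(status.getPath());
            }
        }
        return job;
    }
}

A similar effect can be had from the command line by passing the extra jars with Hadoop's generic -libjars option (via ToolRunner) instead of bundling everything into one fat job jar; which option applies depends on how the Nutch job is actually launched in your setup.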

