Re: Nutch Hadoop question

2009-11-13 Thread Eran Zinman
Hi All, I don't want to bother you guys too much... I've tried searching for this topic and doing some testing myself, but so far I've been quite unsuccessful. Basically, I want to use some computers only for map-reduce processing and not for HDFS. Does anyone know how this can be done? Thanks, Eran

Re: Nutch Hadoop question

2009-11-13 Thread TuxRacer69
Hi Eran, MapReduce has to store its data on an HDFS filesystem. But if you want to separate the two groups of servers, you could build two separate HDFS filesystems. To separate the two setups, you will need to make sure there is no cross communication between the two parts. Cheers, Alex
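
A minimal sketch of the two-cluster setup Alex describes, assuming the Hadoop 0.19-era property names (the Hadoop version bundled with Nutch 1.0) and placeholder master hostnames master-a and master-b: each group of nodes gets its own conf/hadoop-site.xml pointing only at its own namenode and jobtracker, so the two clusters never talk to each other.

  <!-- conf/hadoop-site.xml on group A (hostnames are placeholders) -->
  <property>
    <name>fs.default.name</name>
    <value>hdfs://master-a:9000</value>
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <value>master-a:9001</value>
  </property>

  <!-- conf/hadoop-site.xml on group B -->
  <property>
    <name>fs.default.name</name>
    <value>hdfs://master-b:9000</value>
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <value>master-b:9001</value>
  </property>

Keeping each cluster's conf/slaves file limited to its own nodes is what prevents the cross communication.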

Re: Nutch Hadoop question

2009-11-13 Thread Andrzej Bialecki
TuxRacer69 wrote: Hi Eran, MapReduce has to store its data on an HDFS filesystem. More specifically, it needs read/write access to a shared filesystem. If you are brave enough you can use NFS, too, or any other type of filesystem that can be mounted locally on each node (e.g. a NetApp). But
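
A minimal sketch of the NFS variant, assuming the share is mounted at the same path on every node (/mnt/shared is a placeholder); with the file:// scheme no HDFS daemons are needed at all, only the jobtracker and tasktrackers:

  <property>
    <name>fs.default.name</name>
    <value>file:///mnt/shared/hadoop</value>
  </property>

On a compute-only node you would then start just the tasktracker (bin/hadoop-daemon.sh start tasktracker) and skip the datanode, which is also the usual answer to Eran's original question when a shared HDFS does exist elsewhere.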

Re: Synonym Filter with Nutch

2009-11-13 Thread Andrzej Bialecki
Dharan Althuru wrote: Hi, We are trying to incorporate a synonym filter during indexing using Nutch. As per my understanding, Nutch doesn't have a synonym indexing plug-in by default. Can we extend IndexingFilter in Nutch to incorporate the synonym filter plug-in available in Lucene using WordNet or
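
The extension point for this is Nutch's IndexingFilter interface. A rough sketch of the idea, assuming the Nutch 1.0 filter signature and a hypothetical lookupSynonyms helper (e.g. backed by Lucene's contrib WordNet SynonymMap); both should be checked against your source tree:

  import java.util.Collections;
  import java.util.List;

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.io.Text;
  import org.apache.nutch.crawl.CrawlDatum;
  import org.apache.nutch.crawl.Inlinks;
  import org.apache.nutch.indexer.IndexingException;
  import org.apache.nutch.indexer.IndexingFilter;
  import org.apache.nutch.indexer.NutchDocument;
  import org.apache.nutch.parse.Parse;

  public class SynonymIndexingFilter implements IndexingFilter {
    private Configuration conf;

    public NutchDocument filter(NutchDocument doc, Parse parse, Text url,
        CrawlDatum datum, Inlinks inlinks) throws IndexingException {
      // Add an expansion term to a dedicated field for each synonym
      // found in the parsed text.
      for (String syn : lookupSynonyms(parse.getText())) {
        doc.add("synonyms", syn);
      }
      return doc;
    }

    // Hypothetical helper: plug in a WordNet/SynonymMap lookup here.
    private List<String> lookupSynonyms(String text) {
      return Collections.emptyList();
    }

    // Present in the Nutch 1.0 interface; a no-op suffices here.
    public void addIndexBackendOptions(Configuration conf) { }

    public void setConf(Configuration conf) { this.conf = conf; }
    public Configuration getConf() { return conf; }
  }

The plugin would still need the usual plugin.xml declaring an extension of org.apache.nutch.indexer.IndexingFilter.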

How to configure Nutch to crawl in parallel

2009-11-13 Thread xiao yang
Hi All, I'm using Nutch-1.0 on a 12-node cluster, and configured conf/hadoop-site.xml as follows:

  ...
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>20</value>
  </property>
  <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>20</value>
  </property>
  ...

but
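
Worth noting: those two properties only cap how many map and reduce tasks each tasktracker runs concurrently; the number of map tasks a job actually gets is driven by its input splits. A sketch of the other knobs usually involved, with example values (mapred.map.tasks is only a hint to Hadoop, and fetcher.threads.fetch belongs in nutch-site.xml):

  <property>
    <name>mapred.map.tasks</name>
    <value>24</value>
  </property>
  <property>
    <name>fetcher.threads.fetch</name>
    <value>20</value>
  </property>

In Nutch 1.0 the generate step also takes a -numFetchers option that controls how many fetch lists (and hence fetch map tasks) a segment is split into; check bin/nutch generate for the exact usage.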

can't deploy nutch-1.0.war ???

2009-11-13 Thread MilleBii
I'm stuck and not able to deploy nutch-1.0.war. I get the following error in catalina.log: Exception when processing TLD indicated by the resource path /WEB-INF/taglibs-i18n.tld in the context /nutch-1.0. What could it be? The taglib is there, the *.properties files are there. ANY HELP where
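
One thing worth checking before digging into Tomcat itself: whether the TLD and its supporting jar actually made it into the war. A quick sketch (the file names below are inferred from the error message, not verified against the actual war contents):

  jar tf nutch-1.0.war | grep -i taglib
  # expect WEB-INF/taglibs-i18n.tld and a matching jar under WEB-INF/lib/

A common cause of this error is a malformed or truncated .tld file, or classes referenced by the TLD missing from WEB-INF/lib, so Tomcat fails while processing the TLD even though the .tld file itself is present.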

Re: How to configure Nutch to crawl in parallel

2009-11-13 Thread Otis Gospodnetic
I don't recall off the top of my head what that jobtracker.jsp shows, but judging by the name, it shows your job. Each job is composed of multiple map and reduce tasks. Drill into your job and you should see multiple tasks running. Otis -- Sematext is hiring --

Re: Nutch Hadoop question

2009-11-13 Thread Eran Zinman
Thanks for the help, guys. On Fri, Nov 13, 2009 at 5:20 PM, Andrzej Bialecki a...@getopt.org wrote: TuxRacer69 wrote: Hi Eran, MapReduce has to store its data on an HDFS filesystem. More specifically, it needs read/write access to a shared filesystem. If you are brave enough you can use