mapred branch

2006-04-10 Thread Anton Potehin
Where is the mapred branch of Nutch located now?



Re: mapred branch

2006-04-10 Thread Piotr Kosiorowski

Anton Potehin wrote:

Where is the mapred branch of Nutch located now?



It is developed in trunk now.
P.


(mapred branch) Job.xml as a directory instead of a file, other issues.

2005-08-16 Thread Jeremy Bensley
I have been attempting to get the mapred branch version of the crawler
working and have hit some snags.

First, I have observed the same behavior that another poster reported
yesterday: instead of specifying a file from which the URLs are read,
I must now specify a directory (full path) that contains a file
holding the URL list. From the response to that thread I gather that
specifying a directory instead of a file is not the desired behavior.
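
For illustration only, here is a minimal sketch of what "pass a directory, read every file in it" amounts to. It is not the mapred branch code; the class name SeedDirReader and the method readSeedUrls are hypothetical, and it assumes one URL per line in UTF-8 text files.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of Nutch: gathers URLs from every
// regular file found directly inside a seed directory.
final class SeedDirReader {

    static List<String> readSeedUrls(Path seedDir) {
        // The argument must be a directory, mirroring the behavior described above.
        if (!Files.isDirectory(seedDir)) {
            throw new IllegalArgumentException("expected a directory: " + seedDir);
        }
        List<String> urls = new ArrayList<>();
        try (Stream<Path> entries = Files.list(seedDir)) {
            // Sort so the result does not depend on directory listing order.
            List<Path> files = entries
                    .filter(Files::isRegularFile)
                    .sorted()
                    .collect(Collectors.toList());
            for (Path file : files) {
                for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
                    String url = line.trim();
                    if (!url.isEmpty()) {   // skip blank lines
                        urls.add(url);
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return urls;
    }
}
```

With a layout like this, a single URL list file simply goes into its own directory, and that directory's full path is what gets passed in.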

Second, and more importantly, I am having issues with task trackers. I
have three machines running task tracker, and a fourth running the job
tracker, and they seem to be talking well. Whenever I try to invoke
crawl using the job tracker, however, all of my task trackers
continually fail with this:

050816 134532 parsing /tmp/nutch/mapred/local/tracker/task_m_5o5uvx/job.xml
[Fatal Error] :-1:-1: Premature end of file.
050816 134532 SEVERE error parsing conf file:
org.xml.sax.SAXParseException: Premature end of file.
java.lang.RuntimeException: org.xml.sax.SAXParseException: Premature end of file.
at org.apache.nutch.util.NutchConf.loadResource(NutchConf.java:355)
at org.apache.nutch.util.NutchConf.getProps(NutchConf.java:290)
at org.apache.nutch.util.NutchConf.get(NutchConf.java:91)
at org.apache.nutch.mapred.JobConf.getJar(JobConf.java:80)
at org.apache.nutch.mapred.TaskTracker$TaskInProgress.localizeTask(TaskTracker.java:335)
at org.apache.nutch.mapred.TaskTracker$TaskInProgress.init(TaskTracker.java:319)
at org.apache.nutch.mapred.TaskTracker.offerService(TaskTracker.java:221)
at org.apache.nutch.mapred.TaskTracker.run(TaskTracker.java:269)
at org.apache.nutch.mapred.TaskTracker.main(TaskTracker.java:610)
Caused by: org.xml.sax.SAXParseException: Premature end of file.
at org.apache.xerces.parsers.DOMParser.parse(Unknown Source)
at org.apache.xerces.jaxp.DocumentBuilderImpl.parse(Unknown Source)
at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:172)
at org.apache.nutch.util.NutchConf.loadResource(NutchConf.java:315)
... 8 more

Whenever I look at the job.xml file specified by this location, it
turns out that it is a directory, not a file.

drwxrwxr-x  2 jeremy  users 4096 Aug 16 13:45 job.xml
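
A small check like the following can make this kind of problem easier to spot before the XML parser is ever involved. It is only a sketch using plain java.io.File; the class ConfFileCheck and the method describeProblem are hypothetical names, not anything from NutchConf or TaskTracker.

```java
import java.io.File;

// Hypothetical diagnostic helper, not part of Nutch: reports why a
// configuration path cannot be handed to an XML parser as a file.
final class ConfFileCheck {

    /** Returns a short problem description, or null if the path looks usable. */
    static String describeProblem(File path) {
        if (!path.exists()) {
            return "missing";
        }
        if (path.isDirectory()) {
            return "directory";          // the situation shown in the listing above
        }
        if (!path.isFile()) {
            return "not a regular file";
        }
        if (path.length() == 0) {
            return "empty";              // a zero-length file has no XML document in it
        }
        return null;
    }
}
```

Calling describeProblem(new File(".../job.xml")) on each task tracker's local copy would return "directory" in the case above, which may be a clearer signal than the parser's "Premature end of file" message.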


Any help with, or observations on, these issues would be most appreciated.

Thanks,

Jeremy


mapred branch Revision 226742

2005-08-01 Thread Yitao Duan
I saw this revision fixed something that has been puzzling me.
However, with the fix applied, NDFS can no longer handle 0-byte
files; it simply hangs. I haven't looked into the code yet. Maybe
this case needs to be handled specially?

Yitao