Hello Frank,

Yes, it's a memory issue; you need to increase the Java heap size.

Just follow these instructions (another thing to add to the wiki ;)

Eclipse -> Window -> Preferences -> Java -> Installed JREs -> Edit -> Default VM arguments

I've set mine to -Xms5m -Xmx150m because I have about 200 MB of RAM left after running all my apps.

-Xms (initial/minimum heap size the JVM starts with)
-Xmx (maximum heap size the JVM may grow to)
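
For example (just a sketch; the right numbers depend on how much free RAM you
have), a machine with 2 GB of RAM could comfortably use something like:

    -Xms64m -Xmx512m

Depending on your Eclipse version, you can also put the same flags in the
"VM arguments" box of the launch configuration you use to run the crawl,
which affects only that one launch instead of every program run on that JRE.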

It should help.

Thanks,
Bartosz

Frank McCown writes:
Hello Bartosz,

I'm running the default Nutch 1.0 version on Windows XP (2 GB RAM)
with Eclipse 3.3.0.  I followed the directions at

http://wiki.apache.org/nutch/RunNutchInEclipse0.9

exactly as stated.  I'm able to run the default Nutch 0.9 release
without any problems in Eclipse.  But when I run 1.0, I always get the
java.io.IOException as stated in my last email.  I had assumed it was
due to the plugin issue, but maybe not.  I'm just running a very small
crawl with two seed URLs.

Here's what hadoop.log says:

2009-04-13 13:41:03,010 INFO  crawl.Crawl - crawl started in: crawl
2009-04-13 13:41:03,025 INFO  crawl.Crawl - rootUrlDir = urls
2009-04-13 13:41:03,025 INFO  crawl.Crawl - threads = 10
2009-04-13 13:41:03,025 INFO  crawl.Crawl - depth = 3
2009-04-13 13:41:03,025 INFO  crawl.Crawl - topN = 5
2009-04-13 13:41:03,479 INFO  crawl.Injector - Injector: starting
2009-04-13 13:41:03,479 INFO  crawl.Injector - Injector: crawlDb: crawl/crawldb
2009-04-13 13:41:03,479 INFO  crawl.Injector - Injector: urlDir: urls
2009-04-13 13:41:03,479 INFO  crawl.Injector - Injector: Converting injected urls to crawl db entries.
2009-04-13 13:41:03,588 WARN  mapred.JobClient - Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
2009-04-13 13:41:06,105 WARN  mapred.LocalJobRunner - job_local_0001
java.lang.OutOfMemoryError: Java heap space
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:498)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:138)


I have not tried Sanjoy's advice yet... it looks like this is a memory issue.

Any advice would be much appreciated,
Frank


2009/4/10 Bartosz Gadzimski <bartek...@o2.pl>:
Hello Frank,

Please look into hadoop.log; maybe there is something more there.

About your error: you need to give us more details about your Nutch
configuration.

The default Nutch installation works with no problems (I've never changed
the src/plugin path).

Please tell us:
- the version of Nutch you're running
- any changes you've made
- any configuration that differs from the defaults (other than adding your
domain to crawl-urlfilter)

Thanks,
Bartosz

Frank McCown writes:
Adding cygwin to my PATH solved my problem with whoami.  But now I'm
getting an exception when running the crawler:

Injector: Converting injected urls to crawl db entries.
Exception in thread "main" java.io.IOException: Job failed!
       at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1232)
       at org.apache.nutch.crawl.Injector.inject(Injector.java:160)
       at org.apache.nutch.crawl.Crawl.main(Crawl.java:114)

I know from searching the mailing list that this is normally due to a
bad plugin.folders setting in nutch-default.xml, but I used the same
value as the tutorial (./src/plugin) to no avail.
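
For reference, the property in nutch-default.xml looks roughly like this
(the description text here is paraphrased from memory; also note that a
relative path like ./src/plugin is resolved against the working directory
of the Eclipse launch, which is a common reason the folder isn't found):

    <property>
      <name>plugin.folders</name>
      <value>./src/plugin</value>
      <description>Directories where nutch plugins are located.</description>
    </property>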

(As an aside, it seems like Hadoop should provide a better error message
if the plugin folder doesn't exist.)

Anyway, thanks, Bartosz, for your help.

Frank


2009/4/10 Bartosz Gadzimski <bartek...@o2.pl>:

Hello,

So now you have to install Cygwin and make sure it's added to your PATH.

The steps are in http://wiki.apache.org/nutch/RunNutchInEclipse0.9

After this you should be able to run the "bash" command from a command
prompt (Start > Run > cmd.exe).
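
For example, assuming Cygwin was installed to the default C:\cygwin
directory, you can add it to the PATH for the current cmd.exe session and
test it like this:

    set PATH=%PATH%;C:\cygwin\bin
    bash --version

For a permanent change, add C:\cygwin\bin to the PATH variable under
Control Panel > System > Advanced > Environment Variables.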

Then you're done; everything should work.

I must add this to the wiki; I forgot about the whoami problem.

Take care,
Bartosz

sanjoy.gh...@thomsonreuters.com writes:

Thanks for the suggestion, Bartosz.  I downloaded whoami, and it promptly
crashed on "bash".

09/04/10 12:02:28 WARN fs.FileSystem: uri=file:///
javax.security.auth.login.LoginException: Login failed: Cannot run program "bash": CreateProcess error=2, The system cannot find the file specified
      at org.apache.hadoop.security.UnixUserGroupInformation.login(UnixUserGroupInformation.java:250)
      at org.apache.hadoop.security.UnixUserGroupInformation.login(UnixUserGroupInformation.java:275)
      at org.apache.hadoop.security.UnixUserGroupInformation.login(UnixUserGroupInformation.java:257)
      at org.apache.hadoop.security.UserGroupInformation.login(UserGroupInformation.java:67)
      at org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:1438)
      at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1376)
      at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:215)
      at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:120)
      at org.apache.nutch.crawl.Crawl.main(Crawl.java:84)

Where am I going to find "bash" on Windows without running it from the
Cygwin command line?  Is there a way to turn off this security check in Hadoop?

Thanks,
Sanjoy

-----Original Message-----
From: Bartosz Gadzimski [mailto:bartek...@o2.pl]
Sent: Friday, April 10, 2009 5:06 AM
To: nutch-dev@lucene.apache.org
Subject: Re: login failed exception

Hello,

I'm not sure if it's the case, but you should try adding whoami to your
Windows box.

For example, for Windows XP SP2:
http://www.microsoft.com/downloads/details.aspx?FamilyId=49AE8576-9BB9-4126-9761-BA8011FABF38&displaylang=en
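
Once it's installed, you can check that Hadoop will be able to find it by
running it from a command prompt (the output below is just an illustration;
what matters is that the command is on the PATH):

    C:\> whoami
    YOURMACHINE\youruser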


Thanks,
Bartosz

Frank McCown writes:


I've been running 0.9 in Eclipse on Windows for some time, and I was
successful in running the NutchBean from version 1.0 in Eclipse, but
the crawler gave me the same exception as it gave this individual.
Maybe there's something else I'm overlooking, but I followed the
tutorial at

http://wiki.apache.org/nutch/RunNutchInEclipse0.9

to a T.  I'll keep working on it though.

Frank


2009/4/10 Bartosz Gadzimski <bartek...@o2.pl>:


fmccown writes:

You must run Nutch's crawler using cygwin on Windows since cygwin has
the whoami program.  If you run it from Eclipse on Windows, it can't use
cygwin's whoami program and will fail with the exceptions you saw.  This
is an unfortunate design decision in Hadoop which makes anything after
version 0.9 not work in Eclipse on Windows.


That's not true; please look at
http://wiki.apache.org/nutch/RunNutchInEclipse0.9

I am using Nutch 1.0 with Eclipse on Windows with no problems.

Thanks,
Bartosz





