Re: [Nutch-dev] no nutch script file under bin directory
Hi: sorry, here's the original discussion that led to the link I accidentally sent twice; I had meant to include it too.

http://www.mail-archive.com/[EMAIL PROTECTED]/msg08621.html

----- Original Message -----
From: Tsengtan A Shuy [EMAIL PROTECTED]
To: Tsengtan A Shuy [EMAIL PROTECTED]; [EMAIL PROTECTED]
Sent: Tuesday, July 17, 2007 12:32:49 PM
Subject: RE: no nutch script file under bin directory

BTW, I just found out there is only one web page reference in your last email, so I do not understand why you mentioned two discussions.

Adam Shuy, President
ePacific Web Design Hosting
Professional Web/Software developer
TEL: 408-272-6946
www.epacificweb.com

-----Original Message-----
From: Tsengtan A Shuy [mailto:[EMAIL PROTECTED]
Sent: Tuesday, July 17, 2007 12:23 PM
To: '[EMAIL PROTECTED]'
Subject: no nutch script file under bin directory

I followed msg06571.html to check out the trunk. Then I found there is no nutch script file under the bin directory. How do you crawl multiple websites without this nutch script file?

Adam Shuy

-----Original Message-----
From: Kai_testing Middleton [mailto:[EMAIL PROTECTED]
Sent: Monday, July 16, 2007 8:43 AM
To: [EMAIL PROTECTED]
Subject: Re: OOM error during parsing with nekohtml

You could try looking at these two discussions:

http://www.mail-archive.com/[EMAIL PROTECTED]/msg06571.html
http://www.mail-archive.com/[EMAIL PROTECTED]/msg06571.html

--Kai

___
Nutch-developers mailing list
Nutch-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nutch-developers
Re: [Nutch-dev] no nutch script file under bin directory
How do I apply Nutch to a website without using the Tomcat root directory or a remote search engine like Mozdex.com?

Adam Shuy
Re: [Nutch-dev] no nutch script file under bin directory
I'm not actually sure ... I think I downloaded and unzipped a nightly build in my /usr/local directory, thus creating this directory:

    /usr/local/nutch-2007-06-27_06-52-44

Then, from within that directory, I ran the svn command ... if I remember correctly. You can always try just making a 'nutch' directory or a 'nutch0.9' directory, running svn, and seeing whether it creates another subdirectory under that; then move things to where you want them.
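One way to see where a checkout or an unpacked nightly build actually put the launcher script is to search the target directory for bin/nutch. The following is only a minimal sketch, assuming a POSIX shell and a find that supports -path; the function name find_nutch_script is hypothetical and not part of Nutch.

```shell
#!/bin/sh
# Hypothetical helper (not part of Nutch): print every bin/nutch script
# found under a directory. Handy after an svn checkout or after unpacking
# a tarball, to see whether the files landed in an extra subdirectory.
find_nutch_script() {
    dir="${1:-.}"    # default to the current directory
    find "$dir" -type f -path '*/bin/nutch' 2>/dev/null
}
```

For example, `find_nutch_script /usr/local/nutch-2007-06-27_06-52-44` would list any bin/nutch below that directory. If it prints nothing, the script is missing from that tree rather than just sitting one level deeper.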
Re: [Nutch-dev] no nutch script file under bin directory
Where do you get the nightly build? I followed the web page you referred me to and used

    wget http://lucene.zones.apache.org:8080/hudson/job/Nutch-Nightly/lastStableBuild/artifact/trunk/build/nutch-2007-06-27_06-52-44.tar.gz

to get it, but I got a "file not found" error message.

Adam Shuy
Re: [Nutch-dev] no nutch script file under bin directory
The nightly builds are all cataloged here:

    http://lucene.zones.apache.org:8080/hudson/job/Nutch-Nightly/

The current nightly build is #153 from July 18. For instance, you could do:

    wget http://lucene.zones.apache.org:8080/hudson/job/Nutch-Nightly/153/artifact/trunk/build/nutch-2007-07-18_04-01-20.tar.gz

--Kai
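The download URL in the message above follows a visible pattern: job base, build number, then a tarball named after the build timestamp. A small sketch of that pattern is below, assuming a POSIX shell; the variable NIGHTLY_BASE and the function nightly_url are hypothetical names introduced here, and the layout is inferred only from the single URL quoted in this thread.

```shell
#!/bin/sh
# Base of the nightly job, copied from the message above.
NIGHTLY_BASE="http://lucene.zones.apache.org:8080/hudson/job/Nutch-Nightly"

# Hypothetical helper: build the artifact URL for one nightly build.
#   $1 = build number, e.g. 153
#   $2 = build timestamp as it appears in the file name
nightly_url() {
    printf '%s/%s/artifact/trunk/build/nutch-%s.tar.gz\n' \
        "$NIGHTLY_BASE" "$1" "$2"
}
```

With this, `wget "$(nightly_url 153 2007-07-18_04-01-20)"` would issue the same request as the command quoted above. The build number and the timestamp presumably have to belong to the same build; pairing a different build with an older file name may be what produced the "file not found" error earlier in the thread.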
Re: [Nutch-dev] no nutch script file under bin directory
BTW, I just found out there is only one web page reference in your last email, so I do not understand why you mentioned two discussions.

Adam Shuy
Re: [Nutch-dev] no nutch script file under bin directory
This may seem like a silly question, but I need to know anyway. When I check out the trunk, I should put it in the nutch directory, which should be the latest release directory, e.g. the nutch-0.9 release. Am I right?

Adam Shuy