Re: Cron Job Error

2008-12-25 Thread Tony Wang
You need to type the full path to your Nutch installation, I think. On Fri, Dec 26, 2008 at 12:39 AM, Neil Rosewarm neil_rosew...@yahoo.com wrote: Greetings! I am having trouble with the following. Could you please help me fix the issue? I have a /nutch directory in /root. I'm trying to run the script through

Re: search not working on website interface, but works on shell by Nutbeans

2008-12-26 Thread Tony Wang
. From: Tony Wang ivyt...@gmail.com To: nutch-user@lucene.apache.org Sent: Friday, December 26, 2008 2:34:55 AM Subject: search not working on website interface, but works on shell by Nutbeans Hi, The search doesn't work on the website interface here: http://208.64.71.46

Re: error when doing search queries

2008-12-26 Thread Tony Wang
/jira/browse/NUTCH-671) Regards Edwin On Fri, Dec 26, 2008 at 4:30 AM, Tony Wang ivyt...@gmail.com wrote: Finally, I got Nutch installed and working here http://208.64.71.46:8080 , but when I submit a search query, the error message is: org.apache.jasper.JasperException: /search.jsp

Re: error when doing search queries

2008-12-26 Thread Tony Wang
to nutch source release.
$ cd /path/to/nutch
$ patch < /path/to/patchfile
Replace /path/to/nutch and /path/to/patchfile with the paths of Nutch and the patch file respectively. Finally, run ant war to build the war file. Regards Edwin On Sat, Dec 27, 2008 at 1:39 PM, Tony Wang ivyt...@gmail.com
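For anyone scripting the patch step, here is a minimal Python sketch. `patch` reads the diff from standard input when it is not given a patch file option, so the sketch opens the patch file and passes it as stdin. The `-p0` strip level and the function names `patch_argv` and `apply_patch` are assumptions for illustration; the thread does not confirm which strip level the Nutch patch needs.

```python
import subprocess
from pathlib import Path


def patch_argv(strip_level=0, dry_run=False):
    """Build the argument list for patch(1). The default -p0 is an assumption."""
    argv = ["patch", f"-p{strip_level}"]
    if dry_run:
        # GNU patch option: report what would happen without changing files.
        argv.append("--dry-run")
    return argv


def apply_patch(nutch_dir, patch_file, strip_level=0, dry_run=False):
    """Run patch inside nutch_dir, feeding patch_file on standard input.

    patch_file is opened relative to the caller's working directory,
    not relative to nutch_dir, so an absolute path is the safer choice.
    """
    with Path(patch_file).open("rb") as diff:
        return subprocess.run(
            patch_argv(strip_level, dry_run),
            cwd=nutch_dir,
            stdin=diff,
            check=True,  # raise CalledProcessError if any hunk fails
        )
```

Calling `apply_patch("/path/to/nutch", "/path/to/patchfile", dry_run=True)` first is one way to see whether the hunks apply before touching the source tree.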

how to update Nutch search index

2008-12-28 Thread Tony Wang
I've run Nutch to index a second website, and I wonder how to update the search index so that I can find what I need in the newly indexed data? I have created a file named recrawl in the bin directory by following this article

Patching problem

2008-12-30 Thread Tony Wang
Hello everyone! I am trying to integrate Nutch with Solr by applying the NUTCH-442_v8.patch file, but I have not had much success with the patching process. See below: The text shown *in red* below is my input in the SSH client window: I've just downloaded

Nutch Patching problem

2009-01-08 Thread Tony Wang
I am struggling to make Nutch work with Solr right now. I tried to apply the NUTCH-442 patch to the latest Nutch release, but the patching gave me lots of errors. I wonder whether this patch is really the one I need to integrate Nutch with Solr? Would anyone mind answering my questions? Thanks!

Re: Release 1.0?

2009-02-02 Thread Tony Wang
that's an exciting piece of news! what's the added feature to Nutch 1.0? Thanks On Mon, Feb 2, 2009 at 9:36 AM, Andrzej Bialecki a...@getopt.org wrote: Marko Bauhardt wrote: Hi, is there anybody out there? ;) exists a plan when version 1.0 will be released? thanks marko On Jan 28,

Re: Release 1.0?

2009-02-02 Thread Tony Wang
I definitely like the Nutch/Solr integration the best! Thanks guys! Tony On Mon, Feb 2, 2009 at 10:03 AM, Andrzej Bialecki a...@getopt.org wrote: Tony Wang wrote: that's an exciting piece of news! what's the added feature to Nutch 1.0? Please see the CHANGES.txt in the nightly releases

Re: Nutch 1.0 - Setting up and running Nutch for crawling and Solr for indexing and querying.

2009-02-21 Thread Tony Wang
I don't see that Nutch 1.0 has been released. Where did you download it? nightly build? thanks On Fri, Feb 20, 2009 at 6:31 PM, Kham Vo k...@mac.com wrote: Hello Nutch 1.0 designers, I successfully installed and set up Nutch 1.0 (build # 722). Ran bin/nutch crawl urls -dir crawl -depth 3

Exception when crawling

2009-03-01 Thread Tony Wang
I just installed the nightly build (March 1, 2009) on my dedicated server and tried to crawl a single site, but it throws the exception below: Exception in thread "main" java.io.IOException: Job failed! at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1232) at

Re: How do you setup your svn for your nutch code?

2009-03-01 Thread Tony Wang
from my understanding, Nutch 1.0 is already in the latest nightly build. On Sun, Mar 1, 2009 at 5:22 PM, dealmaker vin...@gmail.com wrote: Hi, I am modifying Nutch 0.9 code for my project. Currently, I put all my 0.9 code in my local main trunk. But I know that 1.0 will be out soon, and

Re: Problem with crawling using the latest 1.0 trunk

2009-03-02 Thread Tony Wang
man, I have exactly the same problem with nutch 1.0 in the SVN trunk! I wonder when the nutch team will release the official 1.0. really cannot wait. On Mon, Mar 2, 2009 at 12:09 PM, ahammad ahmed.ham...@gmail.com wrote: I am aware that this is still a development version, but I need to test a

Re: Problem with crawling using the latest 1.0 trunk

2009-03-02 Thread Tony Wang
thanks Justin. the build #736 works flawlessly! On Mon, Mar 2, 2009 at 1:34 PM, Justin Yao jus...@snooth.com wrote: Same problem here if using build #740 (Mar 2, 2009 4:01:53 AM) I switched to build #736 (Feb 26, 2009 4:01:15 AM) and it worked then. Justin Tony Wang wrote: man, I have

Re: blank results page

2009-03-02 Thread Tony Wang
build #736 currently and have same problem as you in one of my testing servers but it works fine in all other servers. All my servers are CentOS 5.2 with almost same configuration. I didn't have time to debug that and just use the working servers to do my project now. Tony Wang wrote: I

how to crawl multiple websites in each run?

2009-03-02 Thread Tony Wang
Can someone on this list give me some instructions on how to crawl multiple websites in each run? Should I make a list of websites in the urls folder? And how should I set up crawl-urlfilter.txt? Thanks! -- Are you RCholic? www.RCholic.com 温 良 恭 俭 让 仁 义 礼 智 信

Re: how to crawl multiple websites in each run?

2009-03-02 Thread Tony Wang
this:
# accept hosts in MY.DOMAIN.NAME
+^http://([a-z0-9]*\.)*aaa.edu/
+^http://([a-z0-9]*\.)*bbb.edu/
..
good luck! yanky 2009/3/3 Tony Wang ivyt...@gmail.com Can someone on this list give me some instructions about how to crawl multiple websites in each run? Should I make a list
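To see how rules like these behave, here is a minimal Python sketch of a first-match-wins filter. It assumes, as I understand the regex URL filter to work, that rules are tried in order, that `+` accepts and `-` rejects, and that a URL matching no rule is rejected. The final `-.` catch-all rule and the function name `accepts` are assumptions added for illustration.

```python
import re

# The two accept rules come from the message above; "+" accepts, "-" rejects.
RULES = [
    r"+^http://([a-z0-9]*\.)*aaa.edu/",
    r"+^http://([a-z0-9]*\.)*bbb.edu/",
    r"-.",  # assumed catch-all: reject everything else
]


def accepts(url, rules=RULES):
    """Return True if the first rule whose pattern is found in url starts with '+'."""
    for rule in rules:
        sign, pattern = rule[0], rule[1:]
        if re.search(pattern, url):
            return sign == "+"
    return False  # no rule matched: treated as rejected in this sketch


if __name__ == "__main__":
    for candidate in ("http://www.aaa.edu/index.html", "http://www.example.com/"):
        print(candidate, accepts(candidate))
```

Note that the dot in `aaa.edu` is not escaped, so it matches any character; writing `aaa\.edu` would be stricter.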

exception for Nutch build #736

2009-03-02 Thread Tony Wang
I just encountered this exception when using Nutch build #736: Exception in thread "main" org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: file:/opt/tomcat6/webapps/nutch/data/segments/20090302235647/parse_data at

Re: blank results page

2009-03-02 Thread Tony Wang
as mentioned in Nutch Wiki:
nutch invertlinks mycrawldir/linkdb mycrawldir/segments/*
nutch index mycrawldir/indexes mycrawldir/crawldb mycrawldir/linkdb mycrawldir/segments/*
Justin Tony Wang wrote: maybe I need to update the index after each crawl? What's the command line you use to update
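The two commands quoted from the wiki can be assembled in a script; the following is a minimal Python sketch, not a tested recrawl tool. The `bin/nutch` launcher path and the function name `reindex_commands` are assumptions, and the segment list is expanded with `glob` in place of the shell's `*`.

```python
import glob
import os


def reindex_commands(crawl_dir, nutch="bin/nutch"):
    """Return the invertlinks and index argument lists for crawl_dir.

    The nutch launcher path is an assumption; adjust it to your install.
    """
    # Expand segments/* ourselves, since no shell is involved.
    segments = sorted(glob.glob(os.path.join(crawl_dir, "segments", "*")))
    if not segments:
        raise ValueError(f"no segments found under {crawl_dir}")
    linkdb = os.path.join(crawl_dir, "linkdb")
    crawldb = os.path.join(crawl_dir, "crawldb")
    indexes = os.path.join(crawl_dir, "indexes")
    return [
        [nutch, "invertlinks", linkdb, *segments],
        [nutch, "index", indexes, crawldb, linkdb, *segments],
    ]
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` in order, so a failure in the first step stops the second.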

Re: how to crawl multiple websites in each run?

2009-03-03 Thread Tony Wang
, outlinks leading from a page to external hosts will be ignored. This is an effective way to limit the crawl to include only initially injected hosts, without creating complex URLFilters. </description> </property> Tony Wang wrote: that helps a lot! thanks! 2009/3/2 yanky young yanky.yo
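The effect this property description outlines can be illustrated with a small Python sketch. It only models the idea of dropping outlinks that point to a different host than the page they came from; the function name `internal_outlinks` is hypothetical and the exact host comparison Nutch performs may differ.

```python
from urllib.parse import urlsplit


def internal_outlinks(page_url, outlinks):
    """Keep only outlinks whose host equals the host of the source page."""
    page_host = urlsplit(page_url).hostname
    return [link for link in outlinks if urlsplit(link).hostname == page_host]


if __name__ == "__main__":
    links = ["http://www.aaa.edu/b", "http://other.org/", "http://cs.aaa.edu/"]
    print(internal_outlinks("http://www.aaa.edu/a", links))
```

In this sketch a subdomain such as `cs.aaa.edu` counts as external to `www.aaa.edu`, because the comparison is on the full host name.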

error when bootstrap DMOZ databases

2009-03-03 Thread Tony Wang
Hi, Maybe this is not the appropriate email list to ask, but just want to know if anyone of you had this error before:
wget http://rdf.dmoz.org/rdf/content.rdf.u8.gz
gunzip content.rdf.u8.gz
the error for the gunzip is: *gzip: content.rdf.u8: No space left on device* - well, on
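One way to catch the "No space left on device" case earlier is to check free space on the target filesystem before decompressing. The sketch below is a minimal Python illustration; the function name `gunzip_with_space_check` and the `min_free_bytes` threshold are assumptions, since the uncompressed size of the DMOZ dump is not stated in the thread.

```python
import gzip
import os
import shutil


def gunzip_with_space_check(src, dest, min_free_bytes):
    """Decompress src to dest, refusing to start if free space is too low.

    min_free_bytes is a caller-supplied estimate; the real uncompressed
    size is not known in advance here.
    """
    dest_dir = os.path.dirname(os.path.abspath(dest))
    free = shutil.disk_usage(dest_dir).free
    if free < min_free_bytes:
        raise OSError(
            f"only {free} bytes free in {dest_dir}, need {min_free_bytes}"
        )
    # Stream in chunks so the whole file is never held in memory.
    with gzip.open(src, "rb") as compressed, open(dest, "wb") as out:
        shutil.copyfileobj(compressed, out)
    return os.path.getsize(dest)
```

Running `df -h` on the server shows the same free-space figure per filesystem, which helps when the download directory sits on a small partition.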

how to make Nutch work for Solr?

2009-03-06 Thread Tony Wang
Hi all, For those on the Solr user list who have already seen my question about Nutch/Solr integration, I want to apologize for the redundant messages and I don't mean to spam the two mailing lists. I have been desperately seeking information/documentation on making Nutch crawl for Solr indexing.

Re: how to make Nutch work for Solr?

2009-03-06 Thread Tony Wang
2009/3/6 Tony Wang ivyt...@gmail.com Hi all, For those on the Solr user list who have already seen my question about Nutch/Solr integration, I want to apologize for the redundant messages and I don't mean to spam the two mailing lists. I have been desperately seeking information

Re: how to make Nutch work for Solr?

2009-03-06 Thread Tony Wang
can! (well OK. maybe not my mum but apart from her...) Andy 2009/3/6 Tony Wang ivyt...@gmail.com Hi all, For those on the Solr user list who have already seen my question about Nutch/Solr integration, I want to apologize for the redundant messages and I don't mean to spam the two

Re: how to make Nutch work for Solr?

2009-03-06 Thread Tony Wang
SolrInjector Sorry, that's kind of vague. You'll have to work out what parameters the SolrInjector job wants for yourself, but if I can work it out then anyone can! (well OK. maybe not my mum but apart from her...) Andy 2009/3/6 Tony Wang ivyt...@gmail.com Hi all, For those on the Solr user

Re: The Future of Nutch

2009-03-16 Thread Tony Wang
I just wish there could be some clear documentation for Nutch/Solr integration publicly available. Or are some developers already working on this? - Tony On Mon, Mar 16, 2009 at 6:50 PM, Otis Gospodnetic ogjunk-nu...@yahoo.com wrote: Hello, Comments inlined. - Original Message

Re: [ANNOUNCE] Apache Nutch 1.0

2009-03-28 Thread Tony Wang
Hi Sami, Thank you so much for the good news. Is there going to be documentation for Solr integration? Sorry to Otis, I know you are going to ask me to try to find it out by myself ;) Thanks! - Tony On Sat, Mar 28, 2009 at 1:53 PM, Sami Siren ssi...@gmail.com wrote: I am pleased to announce