Implementing a custom SAX / DOM parser

2009-03-16 Thread MyD
Hi all, I'd like to know whether it is possible to implement one's own SAX parser for a plugin, and where this could be done, e.g. at which extension point. Thanks in advance. Cheers, MyD

Re: synchronized File Writer

2009-03-16 Thread yanky young
Hi: it seems you are writing to an XML file with multiple threads. I guess it can be done by using BlockingQueue from the Java 1.5 concurrency API. You just add any URL entry into the queue from multiple producer threads, and use a separate consumer thread to retrieve URL entries from the queue and write
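The producer/consumer arrangement suggested in this reply could look roughly like the sketch below. It is only an illustration under stated assumptions: the class name QueuedUrlWriter, the writeAll method, the DONE sentinel and the <url> element are all hypothetical and do not come from the thread or from Nutch, and XML escaping of the URLs is left out.

```java
import java.io.IOException;
import java.io.Writer;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical helper: many producer threads enqueue URLs, one consumer writes them.
class QueuedUrlWriter {
    // Hypothetical sentinel; a distinct String instance so it can be matched by identity.
    private static final String DONE = new String("DONE");

    /** Runs one producer thread per list and returns the number of entries written. */
    static int writeAll(List<List<String>> urlsPerProducer, final Writer out)
            throws InterruptedException {
        final BlockingQueue<String> queue = new LinkedBlockingQueue<String>();
        final int[] written = {0};

        // Single consumer: the only thread that ever touches the Writer.
        Thread consumer = new Thread(new Runnable() {
            public void run() {
                try {
                    while (true) {
                        String url = queue.take();   // blocks until an entry arrives
                        if (url == DONE) {
                            break;                   // all producers have finished
                        }
                        // Assumed element name; real code would also escape the URL.
                        out.write("<url>" + url + "</url>\n");
                        written[0]++;
                    }
                    out.flush();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } catch (IOException e) {
                    throw new RuntimeException(e);
                }
            }
        });
        consumer.start();

        // Producers only enqueue; they never write to the file themselves.
        List<Thread> producers = new ArrayList<Thread>();
        for (final List<String> urls : urlsPerProducer) {
            Thread producer = new Thread(new Runnable() {
                public void run() {
                    for (String url : urls) {
                        queue.add(url);              // unbounded queue, never blocks
                    }
                }
            });
            producers.add(producer);
            producer.start();
        }

        for (Thread producer : producers) {
            producer.join();
        }
        queue.put(DONE);                             // signal the consumer to stop
        consumer.join();                             // join makes written[0] visible here
        return written[0];
    }
}
```

Because only the consumer thread writes, the Writer itself needs no synchronization; the queue is the single point of coordination between threads.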

nutch 0.7

2009-03-16 Thread Mayank Kamthan
Hi! I need Nutch 0.7. Can someone please give me a pointer to where I can download it? When I try via the Apache site it leads me to Nutch 0.9. Please give a pointer for the 0.7 release. Regards, Mayank.

1.0 mp3 plugin test not pass

2009-03-16 Thread jackyu
Testcase: testId3v2 took 8.06 sec FAILED expected:<postgresql comment id3v2> but was:<null> junit.framework.ComparisonFailure: expected:<postgresql comment id3v2> but was:<null> at org.apache.nutch.parse.mp3.TestMP3Parser.testId3v2(TestMP3Parser.java:76) Testcase: testId3v1 took 2.045

Re: Too many open files Nutch 0.8

2009-03-16 Thread vishal vachhani
http://knol.google.com/k/fred-grott/open-file-limits-settings-on-ubuntu/166jfml0mowlh/3 Hope it will help. On Mon, Mar 16, 2009 at 4:03 PM, José Mestre jose.mes...@aduneo.com wrote: Hi, We are using NutchBean search on a website. When a search is done, NutchBean opens some files (about 12).

RE: Nutch 1.0 Status?

2009-03-16 Thread Lukas, Ray
-Original Message- From: Jim Van Sciver [mailto:jvansci...@gmail.com] Sent: Monday, March 16, 2009 3:42 PM To: nutch-user@lucene.apache.org Subject: Nutch 1.0 Status? I read in the developers email list that Nutch 1.0 has been packaged for release to Apache. Congratulations!! What

Task failed to report status when merging segments

2009-03-16 Thread Justin Yao
Hi, I encountered an error when I tried to merge segments using the latest nightly build of Nutch. I have 3 Hadoop nodes and all servers have CentOS 5.2 installed. Every time I tried to merge segments using the command: nutch mergesegs crawl/MERGEDsegments -dir crawl/segments, it would fail with

Indexing the local file system

2009-03-16 Thread Huang, Zijian(Victor)
From: Huang, Zijian(Victor) Sent: Monday, March 16, 2009 10:56 AM To: 'nutch-user@lucene.apache.org' Subject: Indexing the local file system Hi, all: I am new to Nutch. Can anyone please tell me what do I do to index

Re: The Future of Nutch

2009-03-16 Thread Otis Gospodnetic
Hello, Comments inlined. - Original Message From: Dennis Kubes ku...@apache.org To: nutch-user@lucene.apache.org Sent: Friday, March 13, 2009 8:19:37 PM With the release of Nutch 1.0 I think it is a good time to begin a discussion about the future of Nutch. Here are some

Re: The Future of Nutch

2009-03-16 Thread Tony Wang
I just wish there were some clear, publicly available documentation for Nutch/Solr integration. Or are some developers already working on this? - Tony On Mon, Mar 16, 2009 at 6:50 PM, Otis Gospodnetic ogjunk-nu...@yahoo.com wrote: Hello, Comments inlined. - Original Message

Re: Indexing the local file system

2009-03-16 Thread Gopikrishnan Kookkal
Check this out: http://www.folge2.de/tp/search/1/crawling-the-local-filesystem-with-nutch On Tue, Mar 17, 2009 at 3:55 AM, Huang, Zijian(Victor) zijian.hu...@etrade.com wrote: From: Huang, Zijian(Victor) Sent: Monday, March 16,