Re: [VOTE] Apache Nutch 1.1 Release Candidate #1

2010-04-06 Thread Mattmann, Chris A (388J)
Oh, per usual, forgot to throw in my +1. So, +1! Cheers, Chris On 4/7/10 1:14 AM, "Mattmann, Chris A (388J)" wrote: Hi Folks, I have posted a candidate for the Apache Nutch 1.1 release. The source code is at: http://people.apache.org/~mattmann/apache-nutch-1.1/rc1/ See the included CHANGES

[VOTE] Apache Nutch 1.1 Release Candidate #1

2010-04-06 Thread Mattmann, Chris A (388J)
Hi Folks, I have posted a candidate for the Apache Nutch 1.1 release. The source code is at: http://people.apache.org/~mattmann/apache-nutch-1.1/rc1/ See the included CHANGES.txt file for details on release contents and latest changes. The release was made using the Nutch release process, docume

how to parse (only text) web sites while crawling

2010-04-06 Thread cefurkan0 cefurkan0
I can successfully run the crawl command via Cygwin on Windows XP, and I can also run web searches using Tomcat. But I also want to save the parsed pages during crawling, so when I start a crawl like this: bin/nutch crawl urls -dir crawled -depth 3, I also want to save the parsed HTML files to text

Re: [VOTE] Nutch to become a top-level project (TLP)

2010-04-06 Thread Doğacan Güney
+1, wow I can't believe I missed this :) On Tue, Apr 6, 2010 at 16:09, Dennis Kubes wrote: > +1, sorry for the late response, been traveling lately. > > Andrzej Bialecki wrote: >> >> Hi all, >> >> According to an earlier [DISCUSS] thread on the nutch-dev list I'm >> calling for a vote on the prop

Re: [VOTE] Nutch to become a top-level project (TLP)

2010-04-06 Thread MilleBii
+1 connection with Lucene is minimal 2010/4/1, Mattmann, Chris A (388J) : > Hi Andrzej, > > +1 from me. > > Cheers, > Chris > > > > On 4/1/10 10:23 AM, "Andrzej Bialecki" wrote: > > Hi all, > > According to an earlier [DISCUSS] thread on the nutch-dev list I'm > calling for a vote on the proposal

Re: [VOTE] Nutch to become a top-level project (TLP)

2010-04-06 Thread Dennis Kubes
+1, sorry for the late response, been traveling lately. Andrzej Bialecki wrote: Hi all, According to an earlier [DISCUSS] thread on the nutch-dev list I'm calling for a vote on the proposal to make Nutch a top-level project. To quickly recap the reasons and consequences of such move: the ASF b

Re: description and keywords

2010-04-06 Thread Julien Nioche
See https://issues.apache.org/jira/browse/NUTCH-783 for a utility which allows you to check the fields generated by the IndexingFilters. Using Luke is also a good way of checking the fields/values in a Lucene index. Are you using the SVN version? If so, check that the indexing filter is listed i

Re: description and keywords

2010-04-06 Thread ramires
Hi, I put these lines in nutch-site.xml and then searched for a keyword, but there is no result page. I think the indexer is not indexing the metatags. metatags.names = description;keywords, query.basic.description.boost = 5.0, query.basic.keywords.boost = 5.0 -- View this message in context: http://n3.