nutch-dev
Messages by Date
2009/02/11
[Nutch Wiki] Update of "GettingNutchRunningWithWindows" by FrankMcCown
Apache Wiki
2009/02/11
[Nutch Wiki] Trivial Update of "RunNutchInEclipse0.9" by FrankMcCown
Apache Wiki
2009/02/11
[jira] Closed: (NUTCH-683) NUTCH-676 broke CrawlDbMerger
JIRA
2009/02/10
[Nutch Wiki] Update of "RunNutchInEclipse0.9" by FrankMcCown
Apache Wiki
2009/02/10
[jira] Updated: (NUTCH-563) Include custom fields in BasicQueryFilter
julien nioche (JIRA)
2009/02/09
[jira] Closed: (NUTCH-686) Russian Analysis Plugin
OpenTeam.ru (JIRA)
2009/02/09
[jira] Created: (NUTCH-686) Russian Analysis Plugin
OpenTeam.ru (JIRA)
2009/02/09
[jira] Updated: (NUTCH-686) Russian Analysis Plugin
OpenTeam.ru (JIRA)
2009/02/09
RPC timeout
程越强
2009/02/08
Re: writing plugin
Techie
2009/02/06
[jira] Commented: (NUTCH-643) ClassCastException in PdfParser on encrypted PDF with empty password
Hudson (JIRA)
2009/02/06
[jira] Commented: (NUTCH-636) Http client plug-in https doesn't work on IBM JRE
Hudson (JIRA)
2009/02/06
[jira] Commented: (NUTCH-74) French Analyzer Plugin
Andrzej Bialecki (JIRA)
2009/02/06
[jira] Closed: (NUTCH-74) French Analyzer Plugin
Andrzej Bialecki (JIRA)
2009/02/06
[jira] Updated: (NUTCH-479) Support for OR queries
Andrzej Bialecki (JIRA)
2009/02/06
[jira] Commented: (NUTCH-479) Support for OR queries
Andrzej Bialecki (JIRA)
2009/02/06
[jira] Commented: (NUTCH-262) Summary excerpts and highlights problems
Andrzej Bialecki (JIRA)
2009/02/06
[jira] Closed: (NUTCH-262) Summary excerpts and highlights problems
Andrzej Bialecki (JIRA)
2009/02/06
[jira] Updated: (NUTCH-455) dedup on tokenized fields is faulty
Andrzej Bialecki (JIRA)
2009/02/06
[jira] Commented: (NUTCH-455) dedup on tokenized fields is faulty
Andrzej Bialecki (JIRA)
2009/02/06
[jira] Commented: (NUTCH-357) crawling simulation
Andrzej Bialecki (JIRA)
2009/02/06
[jira] Closed: (NUTCH-357) crawling simulation
Andrzej Bialecki (JIRA)
2009/02/06
[jira] Commented: (NUTCH-261) Multi Language Support
Andrzej Bialecki (JIRA)
2009/02/06
[jira] Closed: (NUTCH-261) Multi Language Support
Andrzej Bialecki (JIRA)
2009/02/06
[jira] Commented: (NUTCH-673) Upgrade the Carrot2 plug-in to release 3.0
Andrzej Bialecki (JIRA)
2009/02/06
[jira] Updated: (NUTCH-673) Upgrade the Carrot2 plug-in to release 3.0
Andrzej Bialecki (JIRA)
2009/02/06
[jira] Commented: (NUTCH-683) NUTCH-676 broke CrawlDbMerger
Andrzej Bialecki (JIRA)
2009/02/06
[jira] Commented: (NUTCH-469) changes to geoPosition plugin to make it work on nutch 0.9
Andrzej Bialecki (JIRA)
2009/02/06
[jira] Commented: (NUTCH-563) Include custom fields in BasicQueryFilter
Andrzej Bialecki (JIRA)
2009/02/06
[jira] Commented: (NUTCH-251) Administration GUI
Andrzej Bialecki (JIRA)
2009/02/06
[jira] Updated: (NUTCH-251) Administration GUI
Andrzej Bialecki (JIRA)
2009/02/06
[jira] Commented: (NUTCH-636) Http client plug-in https doesn't work on IBM JRE
Andrzej Bialecki (JIRA)
2009/02/06
[jira] Closed: (NUTCH-636) Http client plug-in https doesn't work on IBM JRE
Andrzej Bialecki (JIRA)
2009/02/06
[jira] Commented: (NUTCH-643) ClassCastException in PdfParser on encrypted PDF with empty password
Andrzej Bialecki (JIRA)
2009/02/06
[jira] Closed: (NUTCH-643) ClassCastException in PdfParser on encrypted PDF with empty password
Andrzej Bialecki (JIRA)
2009/02/06
[jira] Created: (NUTCH-685) Content-level redirect status lost in ParseSegment
Andrzej Bialecki (JIRA)
2009/02/03
[jira] Commented: (NUTCH-671) JSP errors in Nutch searcher webapp running with Tomcat 6
Hudson (JIRA)
2009/02/03
[jira] Commented: (NUTCH-279) Additions for regex-normalize
Hudson (JIRA)
2009/02/03
[jira] Closed: (NUTCH-671) JSP errors in Nutch searcher webapp running with Tomcat 6
Andrzej Bialecki (JIRA)
2009/02/03
[jira] Commented: (NUTCH-671) JSP errors in Nutch searcher webapp running with Tomcat 6
Andrzej Bialecki (JIRA)
2009/02/03
[jira] Commented: (NUTCH-92) DistributedSearch incorrectly scores results
Andrzej Bialecki (JIRA)
2009/02/03
[jira] Updated: (NUTCH-92) DistributedSearch incorrectly scores results
Andrzej Bialecki (JIRA)
2009/02/03
[jira] Commented: (NUTCH-279) Additions for regex-normalize
Andrzej Bialecki (JIRA)
2009/02/03
[jira] Closed: (NUTCH-279) Additions for regex-normalize
Andrzej Bialecki (JIRA)
2009/02/03
[jira] Commented: (NUTCH-558) Need tool to retrieve domain statistics
Andrzej Bialecki (JIRA)
2009/02/03
[jira] Updated: (NUTCH-558) Need tool to retrieve domain statistics
Andrzej Bialecki (JIRA)
2009/02/03
[jira] Commented: (NUTCH-353) pages that serverside forwards will be refetched every time
Andrzej Bialecki (JIRA)
2009/02/03
[jira] Closed: (NUTCH-353) pages that serverside forwards will be refetched every time
Andrzej Bialecki (JIRA)
2009/02/03
[jira] Commented: (NUTCH-518) Fix OpicScoringFilter to respect scoring filter chaining
Andrzej Bialecki (JIRA)
2009/02/03
[jira] Closed: (NUTCH-518) Fix OpicScoringFilter to respect scoring filter chaining
Andrzej Bialecki (JIRA)
2009/02/03
[jira] Closed: (NUTCH-656) DeleteDuplicates based on crawlDB only
julien nioche (JIRA)
2009/02/03
Re: Release 1.0?
Marko Bauhardt
2009/02/02
[jira] Work started: (NUTCH-631) MoreIndexingFilter fails with NoSuchElementException
Chris A. Mattmann (JIRA)
2009/02/02
[jira] Assigned: (NUTCH-631) MoreIndexingFilter fails with NoSuchElementException
Chris A. Mattmann (JIRA)
2009/02/02
Re: Release 1.0?
Andrzej Bialecki
2009/02/02
Re: Release 1.0?
Marko Bauhardt
2009/02/01
Hadoop Get Together @ Berlin
Isabel Drost
2009/01/31
writing plugin
Raagu
2009/01/30
Re: [jira] Created: (NUTCH-633) ParseSegment no longer allow reparsing
Grease
2009/01/30
Re: [jira] Created: (NUTCH-683) NUTCH-676 broke CrawlDbMerger
Raghavendra Neelekani
2009/01/30
[jira] Updated: (NUTCH-684) Dedup support for Solr
JIRA
2009/01/30
[jira] Created: (NUTCH-684) Dedup support for Solr
JIRA
2009/01/29
[jira] Commented: (NUTCH-682) SOLR indexer does not set boost on the document
Hudson (JIRA)
2009/01/29
Re: [jira] Updated: (NUTCH-683) NUTCH-676 broke CrawlDbMerger
Raghavendra Neelekani
2009/01/29
[jira] Updated: (NUTCH-683) NUTCH-676 broke CrawlDbMerger
JIRA
2009/01/29
[jira] Created: (NUTCH-683) NUTCH-676 broke CrawlDbMerger
JIRA
2009/01/29
[jira] Closed: (NUTCH-682) SOLR indexer does not set boost on the document
JIRA
2009/01/29
[jira] Created: (NUTCH-682) SOLR indexer does not set boost on the document
julien nioche (JIRA)
2009/01/28
[jira] Commented: (NUTCH-571) parse-mp3 plugin doesn't always index album of mp3
Hudson (JIRA)
2009/01/28
[jira] Commented: (NUTCH-628) Host database to keep track of host-level information
JIRA
2009/01/28
[jira] Commented: (NUTCH-628) Host database to keep track of host-level information
Andrzej Bialecki (JIRA)
2009/01/28
[jira] Commented: (NUTCH-628) Host database to keep track of host-level information
JIRA
2009/01/28
[jira] Commented: (NUTCH-628) Host database to keep track of host-level information
Otis Gospodnetic (JIRA)
2009/01/28
[jira] Closed: (NUTCH-680) Update external jars to latest versions
JIRA
2009/01/28
[jira] Commented: (NUTCH-643) ClassCastException in PdfParser on encrypted PDF with empty password
JIRA
2009/01/28
[jira] Commented: (NUTCH-643) ClassCastException in PdfParser on encrypted PDF with empty password
Andrzej Bialecki (JIRA)
2009/01/28
[jira] Commented: (NUTCH-643) ClassCastException in PdfParser on encrypted PDF with empty password
Andrzej Bialecki (JIRA)
2009/01/28
[jira] Commented: (NUTCH-643) ClassCastException in PdfParser on encrypted PDF with empty password
Guillaume Smet (JIRA)
2009/01/28
[jira] Commented: (NUTCH-643) ClassCastException in PdfParser on encrypted PDF with empty password
JIRA
2009/01/28
[jira] Closed: (NUTCH-571) parse-mp3 plugin doesn't always index album of mp3
JIRA
2009/01/28
[jira] Updated: (NUTCH-626) fetcher2 breaks out the domain with db.ignore.external.links set at cross domain redirects
JIRA
2009/01/28
Release 1.0?
Marko Bauhardt
2009/01/27
[jira] Commented: (NUTCH-628) Host database to keep track of host-level information
Hudson (JIRA)
2009/01/27
[jira] Commented: (NUTCH-680) Update external jars to latest versions
Hudson (JIRA)
2009/01/27
[jira] Commented: (NUTCH-628) Host database to keep track of host-level information
JIRA
2009/01/27
[jira] Commented: (NUTCH-680) Update external jars to latest versions
JIRA
2009/01/26
[jira] Commented: (NUTCH-650) Hbase Integration
JIRA
2009/01/26
[Nutch Wiki] Update of "Mailing" by GrantIngersoll
Apache Wiki
2009/01/26
[Nutch Wiki] Update of "Mailing" by GrantIngersoll
Apache Wiki
2009/01/26
Re: Nutch ScoringFilter plugin problems
Doğacan Güney
2009/01/26
Re: Nutch ScoringFilter plugin problems
Pau
2009/01/25
[jira] Closed: (NUTCH-567) Proper (?) handling of URIs in TagSoup.
JIRA
2009/01/25
[jira] Closed: (NUTCH-574) Including inlink anchor text in index can create irrelevant search results.
JIRA
2009/01/25
[jira] Closed: (NUTCH-627) Minimize host address lookup
JIRA
2009/01/25
[jira] Closed: (NUTCH-660) Does anybody know how to let nutch crawl this kind of website?
JIRA
2009/01/25
[jira] Closed: (NUTCH-588) Help Need
JIRA
2009/01/25
[jira] Closed: (NUTCH-611) Upgrade Nutch to use Hadoop 0.16
JIRA
2009/01/25
[jira] Closed: (NUTCH-675) Reduce tasks do not report their status and are killed by jobtracker
JIRA
2009/01/24
[jira] Commented: (NUTCH-680) Update external jars to latest versions
Hudson (JIRA)
2009/01/24
[jira] Updated: (NUTCH-628) Host database to keep track of host-level information
JIRA
2009/01/24
[jira] Issue Comment Edited: (NUTCH-680) Update external jars to latest versions
JIRA
2009/01/24
[jira] Commented: (NUTCH-680) Update external jars to latest versions
JIRA
2009/01/23
[jira] Commented: (NUTCH-628) Host database to keep track of host-level information
Otis Gospodnetic (JIRA)
2009/01/23
[jira] Commented: (NUTCH-666) Analysis plugins for multiple language and new Language Identifier Tool
Otis Gospodnetic (JIRA)
2009/01/23
[jira] Commented: (NUTCH-386) Plugin to index categories by url rules
Stefano Tauriello (JIRA)
2009/01/23
[jira] Commented: (NUTCH-673) Upgrade the Carrot2 plug-in to release 3.0
JIRA
2009/01/23
[jira] Commented: (NUTCH-666) Analysis plugins for multiple language and new Language Identifier Tool
Dennis Kubes (JIRA)
2009/01/23
[jira] Updated: (NUTCH-666) Analysis plugins for multiple language and new Language Identifier Tool
Dennis Kubes (JIRA)
2009/01/23
[jira] Commented: (NUTCH-628) Host database to keep track of host-level information
JIRA
2009/01/23
[jira] Updated: (NUTCH-655) Injecting Crawl metadata
JIRA
2009/01/23
[jira] Commented: (NUTCH-666) Analysis plugins for multiple language and new Language Identifier Tool
JIRA
2009/01/23
Re: [jira] Created: (NUTCH-680) Update external jars to latest versions
Doğacan Güney
2009/01/22
[jira] Commented: (NUTCH-628) Host database to keep track of host-level information
Otis Gospodnetic (JIRA)
2009/01/22
[jira] Commented: (NUTCH-655) Injecting Crawl metadata
Otis Gospodnetic (JIRA)
2009/01/22
[jira] Commented: (NUTCH-386) Plugin to index categories by url rules
Stefano Tauriello (JIRA)
2009/01/22
[jira] Commented: (NUTCH-386) Plugin to index categories by url rules
Beaucarnea (JIRA)
2009/01/22
[jira] Commented: (NUTCH-386) Plugin to index categories by url rules
Stefano Tauriello (JIRA)
2009/01/21
Re: login failed exception
Vimal Varghese
2009/01/21
[jira] Commented: (NUTCH-579) Feed plugin only indexes one post per feed due to identical digest
Hudson (JIRA)
2009/01/21
[jira] Commented: (NUTCH-681) parse-mp3 compilation problem
Hudson (JIRA)
2009/01/21
[jira] Commented: (NUTCH-676) MapWritable is written inefficiently and confusingly
Hudson (JIRA)
2009/01/21
[jira] Commented: (NUTCH-681) parse-mp3 compilation problem
Wildan Maulana (JIRA)
2009/01/21
[jira] Closed: (NUTCH-579) Feed plugin only indexes one post per feed due to identical digest
JIRA
2009/01/21
[jira] Closed: (NUTCH-676) MapWritable is written inefficiently and confusingly
JIRA
2009/01/21
[jira] Commented: (NUTCH-676) MapWritable is written inefficiently and confusingly
JIRA
2009/01/21
[jira] Commented: (NUTCH-676) MapWritable is written inefficiently and confusingly
Todd Lipcon (JIRA)
2009/01/21
[jira] Updated: (NUTCH-628) Host database to keep track of host-level information
JIRA
2009/01/21
[jira] Commented: (NUTCH-644) RTF parser doesn't compile anymore
JIRA
2009/01/21
[jira] Commented: (NUTCH-676) MapWritable is written inefficiently and confusingly
Todd Lipcon (JIRA)
2009/01/21
[jira] Updated: (NUTCH-676) MapWritable is written inefficiently and confusingly
JIRA
2009/01/21
[jira] Updated: (NUTCH-650) Hbase Integration
JIRA
2009/01/21
[jira] Commented: (NUTCH-655) Injecting Crawl metadata
JIRA
2009/01/21
[jira] Updated: (NUTCH-664) Possibility to update already stored documents.
JIRA
2009/01/21
[jira] Updated: (NUTCH-677) Segment merge filtering based on segment content
JIRA
2009/01/21
[jira] Closed: (NUTCH-681) parse-mp3 compilation problem
JIRA
2009/01/21
[jira] Reopened: (NUTCH-681) parse-mp3 compilation problem
JIRA
2009/01/21
[jira] Commented: (NUTCH-679) Fetcher2 implementing Tool
julien nioche (JIRA)
2009/01/21
Re: Nutch ScoringFilter plugin problems
Pau
2009/01/21
Re: Nutch ScoringFilter plugin problems
Doğacan Güney
2009/01/21
[jira] Resolved: (NUTCH-681) parse-mp3 compilation problem
Wildan Maulana (JIRA)
2009/01/20
Re: [jira] Created: (NUTCH-680) Update external jars to latest versions
Piotr Kosiorowski
2009/01/20
Re: [jira] Created: (NUTCH-680) Update external jars to latest versions
Otis Gospodnetic
2009/01/20
Re: [jira] Created: (NUTCH-680) Update external jars to latest versions
Piotr Kosiorowski
2009/01/20
[jira] Updated: (NUTCH-676) MapWritable is written inefficiently and confusingly
JIRA
2009/01/20
[jira] Commented: (NUTCH-669) Consolidate code for Fetcher and Fetcher2
JIRA
2009/01/20
[jira] Closed: (NUTCH-661) errors when the uri contains space characters
JIRA
2009/01/20
Re: [jira] Created: (NUTCH-680) Update external jars to latest versions
Doğacan Güney
2009/01/20
Re: [jira] Created: (NUTCH-680) Update external jars to latest versions
Otis Gospodnetic
2009/01/20
Re: [jira] Created: (NUTCH-680) Update external jars to latest versions
Doğacan Güney
2009/01/20
Re: [jira] Created: (NUTCH-680) Update external jars to latest versions
Otis Gospodnetic
2009/01/20
[jira] Commented: (NUTCH-679) Fetcher2 implementing Tool
Otis Gospodnetic (JIRA)
2009/01/20
Nutch ScoringFilter plugin problems
Pau
2009/01/20
[jira] Closed: (NUTCH-572) Scoring and redirected Urls
JIRA
2009/01/20
Re: [jira] Created: (NUTCH-680) Update external jars to latest versions
Doğacan Güney
2009/01/20
Re: [jira] Created: (NUTCH-680) Update external jars to latest versions
Piotr Kosiorowski
2009/01/20
[jira] Updated: (NUTCH-681) parse-mp3 compilation problem
Wildan Maulana (JIRA)
2009/01/20
[jira] Created: (NUTCH-681) parse-mp3 compilation problem
Wildan Maulana (JIRA)
2009/01/19
[jira] Commented: (NUTCH-678) Hadoop 0.19 requires an update of jets3t
Hudson (JIRA)
2009/01/19
[jira] Closed: (NUTCH-678) Hadoop 0.19 requires an update of jets3t
JIRA
2009/01/19
[jira] Commented: (NUTCH-678) Hadoop 0.19 requires an update of jets3t
julien nioche (JIRA)
2009/01/19
[jira] Created: (NUTCH-680) Update external jars to latest versions
JIRA
2009/01/19
[jira] Commented: (NUTCH-679) Fetcher2 implementing Tool
JIRA
2009/01/19
[jira] Commented: (NUTCH-678) Hadoop 0.19 requires an update of jets3t
JIRA
2009/01/19
login failed exception
Vimal Varghese
2009/01/18
Re: How to set up Nutch in Eclipse IDE
Eric Christeson
2009/01/17
Hudson build is back to normal: Nutch-trunk #696
Apache Hudson Server
2009/01/16
Build failed in Hudson: Nutch-trunk #695
Apache Hudson Server
2009/01/16
Re: How to set up Nutch in Eclipse IDE
Pradeep Pujari
2009/01/15
Hudson build is back to normal: Nutch-trunk #694
Apache Hudson Server
2009/01/15
Re: How to set up Nutch in Eclipse IDE
Pradeep Pujari
2009/01/15
Re: How to set up Nutch in Eclipse IDE
Pradeep Pujari
2009/01/15
[jira] Updated: (NUTCH-679) Fetcher2 implementing Tool
julien nioche (JIRA)
2009/01/15
[jira] Created: (NUTCH-679) Fetcher2 implementing Tool
julien nioche (JIRA)
2009/01/14
Re: How to set up Nutch in Eclipse IDE
Edwin Chu
2009/01/14
How to set up Nutch in Eclipse IDE
Pradeep Pujari
2009/01/14
Build failed in Hudson: Nutch-trunk #693
Apache Hudson Server
2009/01/14
[jira] Created: (NUTCH-678) Hadoop 0.19 requires an update of jets3t
julien nioche (JIRA)
2009/01/13
[jira] Commented: (NUTCH-627) Minimize host address lookup
Hudson (JIRA)
2009/01/13
[jira] Resolved: (NUTCH-627) Minimize host address lookup
Otis Gospodnetic (JIRA)
2009/01/13
[Nutch Wiki] Update of "NewScoring" by OtisGospodnetic
Apache Wiki
2009/01/12
[jira] Commented: (NUTCH-442) Integrate Solr/Nutch
Hudson (JIRA)
2009/01/12
[jira] Commented: (NUTCH-594) Serve Nutch search results in multiple formats including XML and JSON
Hudson (JIRA)
2009/01/12
[jira] Commented: (NUTCH-668) Domain URL Filter
Hudson (JIRA)
2009/01/12
[jira] Commented: (NUTCH-652) AdaptiveFetchSchedule#setFetchSchedule doesn't calculate fetch interval correctly
Hudson (JIRA)
2009/01/12
Hudson build is back to normal: Nutch-trunk #691
Apache Hudson Server
2009/01/12
[Nutch Wiki] Update of "FrontPage" by DennisKubes
Apache Wiki
2009/01/12
[Nutch Wiki] Update of "NewPage" by DennisKubes
Apache Wiki
2009/01/12
[Nutch Wiki] Update of "NewScoring" by DennisKubes
Apache Wiki
2009/01/12
[Nutch Wiki] Update of "NewPage" by DennisKubes
Apache Wiki
2009/01/12
[jira] Commented: (NUTCH-442) Integrate Solr/Nutch
JIRA
2009/01/12
[jira] Commented: (NUTCH-442) Integrate Solr/Nutch
JIRA
2009/01/12
[jira] Updated: (NUTCH-579) Feed plugin only indexes one post per feed due to identical digest
JIRA
2009/01/12
[jira] Closed: (NUTCH-652) AdaptiveFetchSchedule#setFetchSchedule doesn't calculate fetch interval correctly
JIRA
2009/01/12
[jira] Commented: (NUTCH-670) feed plugin does not parse RSS2 enclosures
JIRA
2009/01/12
[jira] Commented: (NUTCH-673) Upgrade the Carrot2 plug-in to release 3.0
JIRA
2009/01/12
[jira] Resolved: (NUTCH-442) Integrate Solr/Nutch
JIRA
2009/01/12
[jira] Commented: (NUTCH-442) Integrate Solr/Nutch
JIRA
2009/01/11
Build failed in Hudson: Nutch-trunk #690
Apache Hudson Server
2009/01/10
Build failed in Hudson: Nutch-trunk #689
Apache Hudson Server
2009/01/10
3 Positions for Nutch Developers in Mumbai
TalentKonnect