nutch-dev
Messages by Thread
[jira] Updated: (NUTCH-707) Generation of multiple segments in multiple runs returns only 1 segment
Michael Chan (JIRA)
[jira] Updated: (NUTCH-707) Generation of multiple segments in multiple runs returns only 1 segment
Otis Gospodnetic (JIRA)
[jira] Closed: (NUTCH-707) Generation of multiple segments in multiple runs returns only 1 segment
Andrzej Bialecki (JIRA)
[jira] Commented: (NUTCH-707) Generation of multiple segments in multiple runs returns only 1 segment
Andrzej Bialecki (JIRA)
[jira] Commented: (NUTCH-707) Generation of multiple segments in multiple runs returns only 1 segment
Hudson (JIRA)
planning for nutch-1.0-rc1
Sami Siren
Re: planning for nutch-1.0-rc1
Andrzej Bialecki
Re: planning for nutch-1.0-rc1
Andrzej Bialecki
Re: planning for nutch-1.0-rc1
Sami Siren
Re: planning for nutch-1.0-rc1
Bartosz Gadzimski
Re: planning for nutch-1.0-rc1
Dennis Kubes
Re: planning for nutch-1.0-rc1
Andrzej Bialecki
Re: planning for nutch-1.0-rc1
Bartosz Gadzimski
Re: planning for nutch-1.0-rc1
Dennis Kubes
Re: planning for nutch-1.0-rc1
Bartosz Gadzimski
Re: planning for nutch-1.0-rc1
Bartosz Gadzimski
Re: planning for nutch-1.0-rc1
Dennis Kubes
Re: planning for nutch-1.0-rc1
Sami Siren
Re: planning for nutch-1.0-rc1
Dennis Kubes
[jira] Created: (NUTCH-706) Url regex normalizer
Meghna Kukreja (JIRA)
[jira] Commented: (NUTCH-706) Url regex normalizer
Meghna Kukreja (JIRA)
[jira] Commented: (NUTCH-706) Url regex normalizer
Dmitry Lihachev (JIRA)
[jira] Updated: (NUTCH-706) Url regex normalizer
Julien Nioche (JIRA)
[jira] Commented: (NUTCH-706) Url regex normalizer
Ken Krugler (JIRA)
Url regex normalizer
Meghna Kukreja
Re: Url regex normalizer
Andrzej Bialecki
Re: Url regex normalizer
Meghna Kukreja
Re: Url regex normalizer
Sami Siren
[jira] Created: (NUTCH-705) parse-rtf plugin
Dmitry Lihachev (JIRA)
[jira] Commented: (NUTCH-705) parse-rtf plugin
Dmitry Lihachev (JIRA)
[jira] Updated: (NUTCH-705) parse-rtf plugin
Dmitry Lihachev (JIRA)
[jira] Commented: (NUTCH-705) parse-rtf plugin
Sami Siren (JIRA)
[jira] Updated: (NUTCH-705) parse-rtf plugin
Andrzej Bialecki (JIRA)
[jira] Commented: (NUTCH-705) parse-rtf plugin
Dmitry Lihachev (JIRA)
[jira] Commented: (NUTCH-705) parse-rtf plugin
Sami Siren (JIRA)
[jira] Resolved: (NUTCH-705) parse-rtf plugin
Julien Nioche (JIRA)
[Nutch Wiki] Trivial Update of "FrontPage" by BartoszGadzimski
Apache Wiki
[Nutch Wiki] Update of "SimpleMapReduceTutorial" by BartoszGadzimski
Apache Wiki
[Nutch Wiki] Update of "DownloadingNutch" by BartoszGadzimski
Apache Wiki
[jira] Created: (NUTCH-704) ensure that more important pages are crawled first
kr (JIRA)
[jira] Closed: (NUTCH-704) ensure that more important pages are crawled first
Andrzej Bialecki (JIRA)
[jira] Created: (NUTCH-703) Upgrade to Hadoop 0.19.1
Andrzej Bialecki (JIRA)
[jira] Commented: (NUTCH-703) Upgrade to Hadoop 0.19.1
Sami Siren (JIRA)
Re: [jira] Commented: (NUTCH-703) Upgrade to Hadoop 0.19.1
Andrzej Bialecki
[jira] Closed: (NUTCH-703) Upgrade to Hadoop 0.19.1
Andrzej Bialecki (JIRA)
[jira] Commented: (NUTCH-703) Upgrade to Hadoop 0.19.1
Hudson (JIRA)
[jira] Created: (NUTCH-702) Lazy Instanciation of Metadata in CrawlDatum
julien nioche (JIRA)
[jira] Updated: (NUTCH-702) Lazy Instanciation of Metadata in CrawlDatum
julien nioche (JIRA)
[jira] Updated: (NUTCH-702) Lazy Instanciation of Metadata in CrawlDatum
julien nioche (JIRA)
[jira] Updated: (NUTCH-702) Lazy Instanciation of Metadata in CrawlDatum
julien nioche (JIRA)
[jira] Commented: (NUTCH-702) Lazy Instanciation of Metadata in CrawlDatum
Edwin Chu (JIRA)
[jira] Commented: (NUTCH-702) Lazy Instanciation of Metadata in CrawlDatum
Dmitry Lihachev (JIRA)
[jira] Updated: (NUTCH-702) Lazy Instanciation of Metadata in CrawlDatum
Julien Nioche (JIRA)
[jira] Updated: (NUTCH-702) Lazy Instanciation of Metadata in CrawlDatum
JIRA
[jira] Commented: (NUTCH-702) Lazy Instanciation of Metadata in CrawlDatum
Julien Nioche (JIRA)
[jira] Commented: (NUTCH-702) Lazy Instanciation of Metadata in CrawlDatum
Julien Nioche (JIRA)
[jira] Closed: (NUTCH-702) Lazy Instanciation of Metadata in CrawlDatum
JIRA
[jira] Commented: (NUTCH-702) Lazy Instanciation of Metadata in CrawlDatum
Hudson (JIRA)
[jira] Commented: (NUTCH-702) Lazy Instanciation of Metadata in CrawlDatum
zhangxihua (JIRA)
Is there the functions of "More Like This" and "Spell Checking"?
buddha1021
Re: Is there the functions of "More Like This" and "Spell Checking"?
dealmaker
Re: Is there the functions of "More Like This" and "Spell Checking"?
Otis Gospodnetic
Re: Is there the functions of "More Like This" and "Spell Checking"?
dealmaker
Re: Is there the functions of "More Like This" and "Spell Checking"?
Otis Gospodnetic
Re: Is there the functions of "More Like This" and "Spell Checking"?
dealmaker
[jira] Commented: (NUTCH-247) robot parser to restrict.
Hudson (JIRA)
NutchAnalysis.java STOP_WORDS not configurable?
Bartosz Gadzimski
Re: NutchAnalysis.java STOP_WORDS not configurable?
Otis Gospodnetic
[jira] Created: (NUTCH-701) replace Fetcher with Fetcher2
Sami Siren (JIRA)
[jira] Updated: (NUTCH-701) Replace Fetcher with Fetcher2
Sami Siren (JIRA)
[jira] Commented: (NUTCH-701) Replace Fetcher with Fetcher2
Andrzej Bialecki (JIRA)
[jira] Resolved: (NUTCH-701) Replace Fetcher with Fetcher2
Sami Siren (JIRA)
[jira] Resolved: (NUTCH-247) robot parser to restrict.
Sami Siren (JIRA)
[jira] Created: (NUTCH-700) Neko1.9.11 goes into a loop
julien nioche (JIRA)
[jira] Commented: (NUTCH-700) Neko1.9.11 goes into a loop
julien nioche (JIRA)
[jira] Updated: (NUTCH-700) Neko1.9.11 goes into a loop
Sami Siren (JIRA)
[jira] Resolved: (NUTCH-700) Neko1.9.11 goes into a loop
Sami Siren (JIRA)
[jira] Commented: (NUTCH-700) Neko1.9.11 goes into a loop
Hudson (JIRA)
[jira] Created: (NUTCH-699) Add an "official" solr schema for solr integration
JIRA
[jira] Commented: (NUTCH-699) Add an "official" solr schema for solr integration
JIRA
[jira] Commented: (NUTCH-699) Add an "official" solr schema for solr integration
Dmitry Lihachev (JIRA)
[jira] Commented: (NUTCH-699) Add an "official" solr schema for solr integration
Sami Siren (JIRA)
[jira] Commented: (NUTCH-699) Add an "official" solr schema for solr integration
Andrzej Bialecki (JIRA)
[jira] Resolved: (NUTCH-699) Add an "official" solr schema for solr integration
Sami Siren (JIRA)
[jira] Commented: (NUTCH-699) Add an "official" solr schema for solr integration
Hudson (JIRA)
[jira] Commented: (NUTCH-699) Add an "official" solr schema for solr integration
Dmitry Lihachev (JIRA)
[jira] Commented: (NUTCH-477) Extend URLFilters to support different filtering chains
Andrzej Bialecki (JIRA)
[jira] Commented: (NUTCH-477) Extend URLFilters to support different filtering chains
Sami Siren (JIRA)
[jira] Commented: (NUTCH-477) Extend URLFilters to support different filtering chains
Dennis Kubes (JIRA)
[jira] Commented: (NUTCH-477) Extend URLFilters to support different filtering chains
Julien Nioche (JIRA)
[jira] Updated: (NUTCH-247) robot parser to restrict.
Sami Siren (JIRA)
[jira] Updated: (NUTCH-477) Extend URLFilters to support different filtering chains
Sami Siren (JIRA)
[jira] Updated: (NUTCH-477) Extend URLFilters to support different filtering chains
Andrzej Bialecki (JIRA)
[jira] Updated: (NUTCH-477) Extend URLFilters to support different filtering chains
Chris A. Mattmann (JIRA)
[jira] Updated: (NUTCH-573) Multiple Domains - Query Search
Sami Siren (JIRA)
[jira] Updated: (NUTCH-573) Multiple Domains - Query Search
Chris A. Mattmann (JIRA)
[jira] Updated: (NUTCH-573) Multiple Domains - Query Search
Chris A. Mattmann (JIRA)
[Nutch Wiki] Update of "InstallingWeb2" by SamiSiren
Apache Wiki
Re: [Nutch Wiki] Update of "InstallingWeb2" by SamiSiren
Andrzej Bialecki
Re: [Nutch Wiki] Update of "InstallingWeb2" by SamiSiren
Sami Siren
[Nutch Wiki] Update of "RunningNutchAndSolr" by SamiSiren
Apache Wiki
[jira] Created: (NUTCH-698) CrawlDb is corrupted after a few crawl cycles
JIRA
[jira] Updated: (NUTCH-698) CrawlDb is corrupted after a few crawl cycles
JIRA
[jira] Updated: (NUTCH-698) CrawlDb is corrupted after a few crawl cycles
JIRA
[jira] Updated: (NUTCH-698) CrawlDb is corrupted after a few crawl cycles
Sami Siren (JIRA)
[jira] Resolved: (NUTCH-698) CrawlDb is corrupted after a few crawl cycles
Sami Siren (JIRA)
[jira] Commented: (NUTCH-698) CrawlDb is corrupted after a few crawl cycles
Hudson (JIRA)
[jira] Created: (NUTCH-697) Generate log output for solr indexer and dedup
Dmitry Lihachev (JIRA)
[jira] Updated: (NUTCH-697) Generate log output for solr indexer and dedup
Dmitry Lihachev (JIRA)
[jira] Created: (NUTCH-696) Timeout for Parser
julien nioche (JIRA)
[jira] Commented: (NUTCH-696) Timeout for Parser
JIRA
[jira] Commented: (NUTCH-696) Timeout for Parser
julien nioche (JIRA)
[jira] Commented: (NUTCH-696) Timeout for Parser
Julien Nioche (JIRA)
[jira] Closed: (NUTCH-696) Timeout for Parser
Julien Nioche (JIRA)
[jira] Created: (NUTCH-695) incorrect mime type detection by MoreIndexingFilter plugin
Dmitry Lihachev (JIRA)
[jira] Updated: (NUTCH-695) incorrect mime type detection by MoreIndexingFilter plugin
Dmitry Lihachev (JIRA)
[jira] Updated: (NUTCH-695) incorrect mime type detection by MoreIndexingFilter plugin
Dmitry Lihachev (JIRA)
[jira] Updated: (NUTCH-695) incorrect mime type detection by MoreIndexingFilter plugin
Dmitry Lihachev (JIRA)
[jira] Updated: (NUTCH-695) incorrect mime type detection by MoreIndexingFilter plugin
Dmitry Lihachev (JIRA)
[jira] Issue Comment Edited: (NUTCH-695) incorrect mime type detection by MoreIndexingFilter plugin
Dmitry Lihachev (JIRA)
[jira] Issue Comment Edited: (NUTCH-695) incorrect mime type detection by MoreIndexingFilter plugin
Dmitry Lihachev (JIRA)
[jira] Resolved: (NUTCH-695) incorrect mime type detection by MoreIndexingFilter plugin
Sami Siren (JIRA)
[jira] Commented: (NUTCH-695) incorrect mime type detection by MoreIndexingFilter plugin
Dmitry Lihachev (JIRA)
[jira] Commented: (NUTCH-695) incorrect mime type detection by MoreIndexingFilter plugin
Hudson (JIRA)
[jira] Created: (NUTCH-694) Distributed Search Server fails
Dr. Nadine Hochstotter (JIRA)
[jira] Updated: (NUTCH-694) Distributed Search Server fails
Sami Siren (JIRA)
[jira] Commented: (NUTCH-694) Distributed Search Server fails
Dr. Nadine Hochstotter (JIRA)
[jira] Commented: (NUTCH-694) Distributed Search Server fails
Sami Siren (JIRA)
[jira] Commented: (NUTCH-694) Distributed Search Server fails
Dr. Nadine Hochstotter (JIRA)
[jira] Commented: (NUTCH-694) Distributed Search Server fails
Andrzej Bialecki (JIRA)
[jira] Updated: (NUTCH-694) Distributed Search Server fails
Sami Siren (JIRA)
[jira] Updated: (NUTCH-694) Distributed Search Server fails
Sami Siren (JIRA)
[jira] Commented: (NUTCH-694) Distributed Search Server fails
Dr. Nadine Hochstotter (JIRA)
[jira] Resolved: (NUTCH-694) Distributed Search Server fails
Sami Siren (JIRA)
[jira] Commented: (NUTCH-694) Distributed Search Server fails
Hudson (JIRA)
[jira] Created: (NUTCH-693) Add configurable option for treating nofollow behaviour.
Andrew McCall (JIRA)
[jira] Updated: (NUTCH-693) Add configurable option for treating nofollow behaviour.
Andrew McCall (JIRA)
[jira] Assigned: (NUTCH-693) Add configurable option for treating nofollow behaviour.
Otis Gospodnetic (JIRA)
[jira] Commented: (NUTCH-693) Add configurable option for treating nofollow behaviour.
Otis Gospodnetic (JIRA)
[jira] Commented: (NUTCH-693) Add configurable option for treating nofollow behaviour.
Andrzej Bialecki (JIRA)
[jira] Commented: (NUTCH-693) Add configurable option for treating nofollow behaviour.
Andrew McCall (JIRA)
[jira] Commented: (NUTCH-693) Add configurable option for treating nofollow behaviour.
Andrzej Bialecki (JIRA)
[jira] Updated: (NUTCH-693) Add configurable option for treating nofollow behaviour.
Andrzej Bialecki (JIRA)
would someone help confirm a patch (fix incorrect encoding detection in cached.jsp)
Justin Yao
Re: would someone help confirm a patch (fix incorrect encoding detection in cached.jsp)
Sami Siren
dump Fetcher?
Sami Siren
[jira] Updated: (NUTCH-583) FeedParser empty links for items
Sami Siren (JIRA)
[jira] Updated: (NUTCH-583) FeedParser empty links for items
Chris A. Mattmann (JIRA)
[jira] Resolved: (NUTCH-563) Include custom fields in BasicQueryFilter
Sami Siren (JIRA)
[jira] Created: (NUTCH-692) AlreadyBeingCreatedException with Hadoop 0.19
julien nioche (JIRA)
[jira] Commented: (NUTCH-692) AlreadyBeingCreatedException with Hadoop 0.19
Sami Siren (JIRA)
[jira] Commented: (NUTCH-692) AlreadyBeingCreatedException with Hadoop 0.19
julien nioche (JIRA)
[jira] Commented: (NUTCH-692) AlreadyBeingCreatedException with Hadoop 0.19
julien nioche (JIRA)
[jira] Commented: (NUTCH-692) AlreadyBeingCreatedException with Hadoop 0.19
Cosmin Lehene (JIRA)
[jira] Commented: (NUTCH-692) AlreadyBeingCreatedException with Hadoop 0.19
Julien Nioche (JIRA)
[jira] Updated: (NUTCH-692) AlreadyBeingCreatedException with Hadoop 0.19
Cosmin Lehene (JIRA)
[jira] Commented: (NUTCH-692) AlreadyBeingCreatedException with Hadoop 0.19
JIRA
[jira] Commented: (NUTCH-692) AlreadyBeingCreatedException with Hadoop 0.19
Cosmin Lehene (JIRA)
[jira] Commented: (NUTCH-692) AlreadyBeingCreatedException with Hadoop 0.19
Julien Nioche (JIRA)
[jira] Commented: (NUTCH-692) AlreadyBeingCreatedException with Hadoop 0.19
Cosmin Lehene (JIRA)
[jira] Commented: (NUTCH-692) AlreadyBeingCreatedException with Hadoop 0.19
Julien Nioche (JIRA)
[jira] Commented: (NUTCH-692) AlreadyBeingCreatedException with Hadoop 0.19
Julien Nioche (JIRA)
[jira] Commented: (NUTCH-692) AlreadyBeingCreatedException with Hadoop 0.19
JIRA
[jira] Commented: (NUTCH-692) AlreadyBeingCreatedException with Hadoop 0.19
Julien Nioche (JIRA)
[jira] Commented: (NUTCH-692) AlreadyBeingCreatedException with Hadoop 0.19
Andrzej Bialecki (JIRA)
[jira] Commented: (NUTCH-692) AlreadyBeingCreatedException with Hadoop 0.19
Julien Nioche (JIRA)
[jira] Assigned: (NUTCH-692) AlreadyBeingCreatedException with Hadoop 0.19
Julien Nioche (JIRA)
[jira] Resolved: (NUTCH-692) AlreadyBeingCreatedException with Hadoop 0.19
Julien Nioche (JIRA)
[jira] Resolved: (NUTCH-591) StringIndexOutOfBoundsException when extracting text from a Word document.
Sami Siren (JIRA)
[jira] Commented: (NUTCH-591) StringIndexOutOfBoundsException when extracting text from a Word document.
Dmitry Lihachev (JIRA)
[jira] Created: (NUTCH-691) Update jakarta poi jars to the most relevant version
Dmitry Lihachev (JIRA)
[jira] Updated: (NUTCH-691) Update jakarta poi jars to the most relevant version
Dmitry Lihachev (JIRA)
[jira] Updated: (NUTCH-691) Update jakarta poi jars to the most relevant version
Dmitry Lihachev (JIRA)
[jira] Updated: (NUTCH-691) Update jakarta poi jars to the most relevant version
Dmitry Lihachev (JIRA)
[jira] Updated: (NUTCH-691) Update jakarta poi jars to the most relevant version
Dmitry Lihachev (JIRA)
[jira] Updated: (NUTCH-691) Update jakarta poi jars to the most relevant version
Dmitry Lihachev (JIRA)
[jira] Commented: (NUTCH-691) Update jakarta poi jars to the most relevant version
Dmitry Lihachev (JIRA)
[jira] Issue Comment Edited: (NUTCH-691) Update jakarta poi jars to the most relevant version
Dmitry Lihachev (JIRA)
[jira] Updated: (NUTCH-691) Update jakarta poi jars to the most relevant version
Dmitry Lihachev (JIRA)
[jira] Resolved: (NUTCH-691) Update jakarta poi jars to the most relevant version
Sami Siren (JIRA)
[jira] Commented: (NUTCH-691) Update jakarta poi jars to the most relevant version
Hudson (JIRA)
[jira] Created: (NUTCH-690) bug in DomContentUtils.shouldThrowAwayLink?
Peter Sparks (JIRA)
[jira] Created: (NUTCH-689) Swf parser doesn't seem to handle relative links
Peter Sparks (JIRA)
[jira] Updated: (NUTCH-689) Swf parser doesn't seem to handle relative links
Peter Sparks (JIRA)
[jira] Commented: (NUTCH-689) Swf parser doesn't seem to handle relative links
Sami Siren (JIRA)
[jira] Updated: (NUTCH-689) Swf parser doesn't seem to handle relative links
Peter Sparks (JIRA)
[jira] Updated: (NUTCH-689) Swf parser doesn't seem to handle relative links
Peter Sparks (JIRA)
[jira] Commented: (NUTCH-689) Swf parser doesn't seem to handle relative links
Sami Siren (JIRA)
[jira] Commented: (NUTCH-689) Swf parser doesn't seem to handle relative links
Peter Sparks (JIRA)
[jira] Updated: (NUTCH-310) Review Log Levels
Sami Siren (JIRA)
[jira] Updated: (NUTCH-310) Review Log Levels
Chris A. Mattmann (JIRA)
[jira] Updated: (NUTCH-249) black- white list url filtering
Sami Siren (JIRA)
[jira] Updated: (NUTCH-249) black- white list url filtering
Marko Bauhardt (JIRA)
[jira] Updated: (NUTCH-249) black- white list url filtering
Chris A. Mattmann (JIRA)
[jira] Updated: (NUTCH-309) Uses commons logging Code Guards
Sami Siren (JIRA)
[jira] Updated: (NUTCH-309) Uses commons logging Code Guards
Chris A. Mattmann (JIRA)
[jira] Updated: (NUTCH-469) changes to geoPosition plugin to make it work on nutch 0.9
Sami Siren (JIRA)
[jira] Updated: (NUTCH-469) changes to geoPosition plugin to make it work on nutch 0.9
Julien Nioche (JIRA)
[jira] Updated: (NUTCH-86) LanguageIdentifier API enhancements
Sami Siren (JIRA)
[jira] Resolved: (NUTCH-582) Add missing type parameters
Sami Siren (JIRA)