Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change notification.

The following page has been changed by DogacanGuney:
http://wiki.apache.org/nutch/FrontPage

------------------------------------------------------------------------------
  Please contribute your knowledge about Nutch here!
- == General Information ==
+ == General Information ==
- * [http://www.nutch.org Nutch Website ]
+ * [http://www.nutch.org Nutch Website ]
- * ["Features"]
+ * ["Features"]
- * PublicServers running Nutch
+ * PublicServers running Nutch
- * ["Presentations"] on Nutch
+ * ["Presentations"] on Nutch
- * Press ["Articles"]
+ * Press ["Articles"]
- * ["Evaluations"] of Search Quality
+ * ["Evaluations"] of Search Quality
- * ["Help Wanted"] organizations hiring Nutch expertise
+ * ["Help Wanted"] organizations hiring Nutch expertise
- * Commercial ["Support"] and developers for hire
+ * Commercial ["Support"] and developers for hire
- * ["Mailing"] Lists
+ * ["Mailing"] Lists
- * AcademicArticles that deal with Nutch
+ * AcademicArticles that deal with Nutch
- == Nutch Administration ==
+ == Nutch Administration ==
- * DownloadingNutch
+ * DownloadingNutch
- * HardwareRequirements
+ * HardwareRequirements
- * '''[http://peterpuwang.googlepages.com/NutchGuideForDummies.htm Tutorial] -- Latest step by Step Installation guide for dummies: Nutch 0.9.'''
+ * '''[http://peterpuwang.googlepages.com/NutchGuideForDummies.htm Tutorial] -- Latest step by Step Installation guide for dummies: Nutch 0.9.'''
- * [http://lucene.apache.org/nutch/tutorial.html Tutorial] -- A Step-by-Step guide to getting Nutch up and running.
+ * [http://lucene.apache.org/nutch/tutorial.html Tutorial] -- A Step-by-Step guide to getting Nutch up and running.
- * NutchTutorial ''on the wiki''
+ * NutchTutorial ''on the wiki''
- * ["Nutch - The Java Search Engine"] (Builds on the basic tutorials. Includes index maintenance scripts)
+ * ["Nutch - The Java Search Engine"] (Builds on the basic tutorials. Includes index maintenance scripts)
- * [:NutchHadoopTutorial:Nutch Hadoop Tutorial] - How to setup Nutch and Hadoop over a cluster of machines
+ * [:NutchHadoopTutorial:Nutch Hadoop Tutorial] - How to setup Nutch and Hadoop over a cluster of machines
- * [:Automating_Fetches_with_Python:Automating Fetches with Python] - How to automatic the Nutch fetching process using Python
+ * [:Automating_Fetches_with_Python:Automating Fetches with Python] - How to automatic the Nutch fetching process using Python
- * [:Upgrading_Hadoop:Upgrading Hadoop Version in Nutch] - Basic steps for upgrading Hadoop in Nutch.
+ * [:Upgrading_Hadoop:Upgrading Hadoop Version in Nutch] - Basic steps for upgrading Hadoop in Nutch.
- * ["FAQ"]
+ * ["FAQ"]
- * [:CommandLineOptions:Commandline] options for 0.7.x
+ * [:CommandLineOptions:Commandline] options for 0.7.x
- * [:08CommandLineOptions:Commandline] options for version 0.8
+ * [:08CommandLineOptions:Commandline] options for version 0.8
- * OverviewDeploymentConfigs
+ * OverviewDeploymentConfigs
- * NutchConfigurationFiles
+ * NutchConfigurationFiles
- * GettingNutchRunningWithUtf8 - For support of non-ASCII characters (Chinese, German, Japanese, Korean).
+ * GettingNutchRunningWithUtf8 - For support of non-ASCII characters (Chinese, German, Japanese, Korean).
- * GettingNutchRunningWithResin - Resin is a JSP/Servlet/EJB application server (alternative to tomcat).
+ * GettingNutchRunningWithResin - Resin is a JSP/Servlet/EJB application server (alternative to tomcat).
- * GettingNutchRunningWithJetty
+ * GettingNutchRunningWithJetty
- * GettingNutchRunningWithUbuntu
+ * GettingNutchRunningWithUbuntu
- * GettingNutchRunningWithWindows
+ * GettingNutchRunningWithWindows
- * GettingNutchRunningWithMacOsx
+ * GettingNutchRunningWithMacOsx
- * GettingNutchRunningWithRedHatApplicationServer
+ * GettingNutchRunningWithRedHatApplicationServer
- * GettingNutchRunningWithDebian
+ * GettingNutchRunningWithDebian
- * GettingNutchRunningWithSocksProxy
+ * GettingNutchRunningWithSocksProxy
- * ErrorMessages -- What they mean and suggestions for getting rid of them.
+ * ErrorMessages -- What they mean and suggestions for getting rid of them.
- * SimpleMapReduceTutorial
+ * SimpleMapReduceTutorial
- * SetupProxyForNutch - using Tinyproxy on Ubuntu
+ * SetupProxyForNutch - using Tinyproxy on Ubuntu
- * CreateNewFilter - for example to add a category metadata to your index and be able to search for it
+ * CreateNewFilter - for example to add a category metadata to your index and be able to search for it
- * UpgradeFrom07To08
+ * UpgradeFrom07To08
- * ["Upgrading_from_0.8.x_to_0.9"]
+ * ["Upgrading_from_0.8.x_to_0.9"]
- * RunNutchInEclipse for v0.8
+ * RunNutchInEclipse for v0.8
- * ["RunNutchInEclipse0.9"] for v0.9
+ * ["RunNutchInEclipse0.9"] for v0.9
- * ["Crawl"] - script to crawl (and possible recrawl too)
+ * ["Crawl"] - script to crawl (and possible recrawl too)
- * IntranetRecrawl - script to recrawl a crawl
+ * IntranetRecrawl - script to recrawl a crawl
- * MergeCrawl - script to merge 2 (or more) crawls
+ * MergeCrawl - script to merge 2 (or more) crawls
- * SearchOverMultipleIndexes - configuring nutch to enable searching over multiple indexes
+ * SearchOverMultipleIndexes - configuring nutch to enable searching over multiple indexes
- * CrossPlatformNutchScripts
+ * CrossPlatformNutchScripts
- * MonitoringNutchCrawls - techniques for keeping an eye on a nutch crawl's progress.
+ * MonitoringNutchCrawls - techniques for keeping an eye on a nutch crawl's progress.
- * ["Nutch 0.9 Crawl Script Tutorial"]
+ * ["Nutch 0.9 Crawl Script Tutorial"]
- * HttpAuthenticationSchemes - How to enable Nutch to authenticate itself using NTLM, Basic or Digest authentication schemes.
+ * HttpAuthenticationSchemes - How to enable Nutch to authenticate itself using NTLM, Basic or Digest authentication schemes.
- * NonDefaultIntranetCrawlingOptions - Desirable options to add to your intranet crawling configuration.
+ * NonDefaultIntranetCrawlingOptions - Desirable options to add to your intranet crawling configuration.
- * RunningNutchAndSolr - How to configure Nutch to crawl, but post to Solr for search/index
+ * RunningNutchAndSolr - How to configure Nutch to crawl, but post to Solr for search/index
- == Nutch Development ==
+ == Nutch Development ==
- * [:Becoming_A_Nutch_Developer:Becoming a Nutch Developer] - Start developing and contributing to Nutch.
+ * [:Becoming_A_Nutch_Developer:Becoming a Nutch Developer] - Start developing and contributing to Nutch.
- * PluginCentral -- How to write your own plugins and use other people's.
+ * PluginCentral -- How to write your own plugins and use other people's.
- * InternalDocumentation -- How Nutch works.
+ * InternalDocumentation -- How Nutch works.
- * [http://lucene.apache.org/nutch/apidocs/index.html JavaDocs] -- The !JavaDocs for Nutch.
+ * [http://lucene.apache.org/nutch/apidocs/index.html JavaDocs] -- The !JavaDocs for Nutch.
- * [http://lucene.apache.org/nutch/version_control.html Nutch Version Control]
+ * [http://lucene.apache.org/nutch/version_control.html Nutch Version Control]
- * MultiLingualSupport - ''In development''.
+ * MultiLingualSupport - ''In development''.
- * FixingOpicScoring - ''In planning''.
+ * FixingOpicScoring - ''In planning''.
- * HowToContribute
+ * HowToContribute
- * TaskList -- Tasks for Nutch developers.
+ * TaskList -- Tasks for Nutch developers.
- * ["Development"] -- More tasks for Nutch developers.
+ * ["Development"] -- More tasks for Nutch developers.
- * ["Committer's_Rules"] -- Committers should follow these guidelines when deciding, which branch to use for committing the patches and when to commit.
+ * ["Committer's_Rules"] -- Committers should follow these guidelines when deciding, which branch to use for committing the patches and when to commit.
- * ["Release_HOWTO"]
+ * ["Release_HOWTO"]
- * ["Website Update HOWTO"]
+ * ["Website Update HOWTO"]
- * ["Image Search Design"]
+ * ["Image Search Design"]
- * ["NutchOSGi"]
+ * ["NutchOSGi"]
- * ["StrategicGoals"]
+ * ["StrategicGoals"]
- * ["IndexStructure"]
+ * ["IndexStructure"]
- * ["Getting Started"]
+ * ["Getting Started"]
- * JavaDemoApplication - A simple demonstration of how to use the Nutch APIin a Java application
+ * JavaDemoApplication - A simple demonstration of how to use the Nutch APIin a Java application
- * InstallingWeb2
+ * InstallingWeb2
- == Nutch 2.0 ==
+ == Nutch 2.0 ==
- * ["Nutch2Architecture"] -- Discussions on the Nutch 2.0 architecture.
+ * ["Nutch2Architecture"] -- Discussions on the Nutch 2.0 architecture.
- == Other Resources ==
+ == Other Resources ==
- * [http://nutch.sourceforge.net/blog/cutting.html Doug's Weblog] -- He's the one who originally wrote Lucene and Nutch.
+ * [http://nutch.sourceforge.net/blog/cutting.html Doug's Weblog] -- He's the one who originally wrote Lucene and Nutch.
- * [http://wiki.media-style.com/display/nutchDocu/Home Stefan's Nutch Documentation]
+ * [http://wiki.media-style.com/display/nutchDocu/Home Stefan's Nutch Documentation]
- * [http://frutch.free.fr/wikini/ Frutch Wiki] -- French Nutch Wiki
+ * [http://frutch.free.fr/wikini/ Frutch Wiki] -- French Nutch Wiki
- * The [http://nutch.sourceforge.net/cgi-bin/twiki/view/Main/Nutch Old Wiki]
+ * The [http://nutch.sourceforge.net/cgi-bin/twiki/view/Main/Nutch Old Wiki]
- * ["Search_Theory"] Search Theory & White Papers
+ * ["Search_Theory"] Search Theory & White Papers
- * [http://wiki.apache.org/nutch-data/attachments/FrontPage/attachments/Hadoop-Nutch%200.8%20Tutorial%2022-07-06%20%3CNavoni%20Roberto%3E Tutorial Hadoop+Nutch 0.8 night build Roberto Navoni 24-07-06]
+ * [http://wiki.apache.org/nutch-data/attachments/FrontPage/attachments/Hadoop-Nutch%200.8%20Tutorial%2022-07-06%20%3CNavoni%20Roberto%3E Tutorial Hadoop+Nutch 0.8 night build Roberto Navoni 24-07-06]
- * [http://blog.foofactory.fi/ FooFactory] Nutch and Hadoop related posts
+ * [http://blog.foofactory.fi/ FooFactory] Nutch and Hadoop related posts
- * [http://spinn3r.com Spinn3r] [http://spinn3r.com/opensource.php Open Source components] (our contribution to the crawling OSS community with more to come).
+ * [http://spinn3r.com Spinn3r] [http://spinn3r.com/opensource.php Open Source components] (our contribution to the crawling OSS community with more to come).
- * [http://www.interadvertising.co.uk/blog/nutch_logos Larger / better quality Nutch logos] Re-created Nutch logos available in GIF, PNG & EPS in resolutions up to 1200 x 449
+ * [http://www.interadvertising.co.uk/blog/nutch_logos Larger / better quality Nutch logos] Re-created Nutch logos available in GIF, PNG & EPS in resolutions up to 1200 x 449