Hi Michael,

On Sat, Nov 19, 2016 at 8:09 AM, <[email protected]> wrote:

> From: Michael Coffey <[email protected]>
> To: "[email protected]" <[email protected]>
> Cc:
> Date: Fri, 18 Nov 2016 21:15:14 +0000 (UTC)
> Subject: indexing to Solr
>
> Where can I find up-to-date information on indexing to Solr?

http://wiki.apache.org/nutch/NutchTutorial
in particular
https://wiki.apache.org/nutch/NutchTutorial#Step-by-Step:_Indexing_into_Apache_Solr
If you find any issues with this tutorial then please let us know. Thank you.

> When I search the web, I find tutorials that use the deprecated solrindex
> command. I also find questions where people want to know why it doesn't
> work.

That is because the only official documentation resides at
http://wiki.apache.org/nutch/NutchTutorial

> I have a good nutch 1.12 installation on a working hadoop cluster and a
> Solr 6.3.0 installation which works for their gettingstarted example.

You should use the specified version of Solr for the Nutch release. This is
Solr 5.4.1 as defined in the indexer-solr plugin ivy.xml

> I have questions like
> Do I need to create a core and a collection in solr?

Yes I would. This is explained at
https://wiki.apache.org/nutch/NutchTutorial#Setup_Solr_for_search

> Do I need http or cloud type server?
> Do I need solr.zookeeper.url ?

This is not a Nutch question. This is your preferred Solr configuration. If
you are just starting out then I would say it is not a big deal... experiment
and go with what works best for your requirements and resources capacity.

> What else needs to be set in nutch-site.xml?

Not much. For reference though, here are the Solr configuration options.
https://github.com/apache/nutch/blob/master/conf/nutch-default.xml#L1750-L1826

> What about schema?

This is covered in
https://wiki.apache.org/nutch/NutchTutorial#Setup_Solr_for_search

> Thanks for all the help so far!

No problems. Any more issues, ping us here and we will help.
Ta

