Hi Pau,
I have not used the solrindex command, but from the "input path" error message,
it sounds like it wants an actual segment directory under segments/, not the
segments/ parent directory itself.
The Nutch crawl script uses the following commands:
* inject
* generate
* fetch
* parse
* updatedb
* invertlinks
* dedup
* index
* clean
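As a rough sketch of the segment-directory point above: segment names are timestamps, so the newest one can be picked by sorting the names. The helper name latest_segment is made up for illustration, and the crawl/crawldb and crawl/linkdb paths are assumptions about the local layout, not something confirmed in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the newest segment directory under a segments/ parent.
# Segment names are timestamps (e.g. 20170711204556), so the lexically
# last directory name is also the most recent one.
latest_segment() {
  local segments_dir="$1"
  local newest=""
  local d
  for d in "$segments_dir"/*/; do
    [ -d "$d" ] || continue   # skips the unexpanded glob when there are no segments
    newest="${d%/}"
  done
  [ -n "$newest" ] && printf '%s\n' "$newest"
}

# Assumed layout: run from the Nutch directory, crawl data under crawl/.
# The block is skipped entirely when bin/nutch is not present.
if [ -x bin/nutch ]; then
  segment="$(latest_segment crawl/segments)" &&
    bin/nutch solrindex http://127.0.0.1:8983/solr crawl/crawldb \
      -linkdb crawl/linkdb "$segment"
fi
```

The only change from passing crawl/segments directly is that the last argument is one concrete segment directory.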
Hi Rashmi,
I have followed your suggestions.
Now I'm seeing a different error.
bin/nutch solrindex http://127.0.0.1:8983/solr crawl/crawld -linkdb
crawl/linkdb crawl/segments
The input path at segments is not a segment... skipping
Indexer: starting at 2017-07-11 20:45:56
Indexer: deleting gone
Hi Pau,
Yes, it took me a while to get things working because the tutorial is not
complete or up to date.
In conf/nutch-site.xml, the value for plugin.includes uses indexer-elastic by
default. If you want to use SOLR, you'll have to change it to indexer-solr.
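A quick way to see which indexer plugin a config file currently names is a plain text search. This is only a sketch: the function name configured_indexers is made up, and it assumes the plugin names follow the indexer-<name> pattern mentioned above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list the indexer-* plugin names mentioned in a Nutch config file.
# Caveat: this is a plain text match, so names inside XML comments are listed too.
configured_indexers() {
  local conf_file="$1"
  [ -f "$conf_file" ] || return 1
  grep -o 'indexer-[a-z0-9]*' "$conf_file" | sort -u
}

# Run from the Nutch install directory; skipped if the file is not there.
if [ -f conf/nutch-site.xml ]; then
  configured_indexers conf/nutch-site.xml
fi
```

If the output is not indexer-solr, the plugin.includes value is the place to change it.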
I haven't tried SOLR 6.6, but
Hi Yossi and BlackIce,
many thanks for your tips. However, a tutorial needs to be self-contained,
or at least link to the documentation/tutorial on how to configure the
parts it uses.
On Tue, Jul 11, 2017 at 1:39 PM BlackIce wrote:
> I think by default the newer SOLR
I think by default the newer SOLR starts in "schemaless" mode. One needs to
create a config directory with ALL necessary configuration files, like the
schema and solrconfig.xml, BEFORE creating the collection, and then run a command
to create this collection using this conf directory. I don't have access to
I struggled with this as well. Eventually I moved to ElasticSearch, which is
much easier.
What I did manage to find out is that in newer versions of SOLR you need to
use ZooKeeper to update the conf file. See https://stackoverflow.com/a/43351358.
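For the standalone (non-ZooKeeper) case, the "config directory first, then create the collection" idea from this thread might look roughly like the sketch below. Everything specific here is an assumption: the helper name prepare_configset, the NUTCH_HOME variable, the core name nutch, the basic_configs starting point, and the step of removing managed-schema so that the copied schema.xml is used. Check it against the SOLR version in use before relying on it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build a config directory from an existing configset
# plus the schema.xml shipped with Nutch.
#   $1 = base configset directory (must contain conf/)
#   $2 = path to Nutch's schema.xml
#   $3 = target configset directory to create
prepare_configset() {
  local base="$1" nutch_schema="$2" target="$3"
  [ -d "$base/conf" ] && [ -f "$nutch_schema" ] || return 1
  [ ! -e "$target/conf" ] || return 1          # refuse to overwrite an existing conf/
  mkdir -p "$target"
  cp -r "$base/conf" "$target/"
  cp "$nutch_schema" "$target/conf/schema.xml"
  rm -f "$target/conf/managed-schema"          # assumption: let schema.xml take effect
}

# Assumed environment: APACHE_SOLR_HOME and NUTCH_HOME point at the two installs,
# and SOLR is already running. Skipped when either variable is unset.
if [ -n "${APACHE_SOLR_HOME:-}" ] && [ -n "${NUTCH_HOME:-}" ]; then
  prepare_configset \
    "$APACHE_SOLR_HOME/server/solr/configsets/basic_configs" \
    "$NUTCH_HOME/conf/schema.xml" \
    "$APACHE_SOLR_HOME/server/solr/configsets/nutch" &&
    "$APACHE_SOLR_HOME/bin/solr" create -c nutch \
      -d "$APACHE_SOLR_HOME/server/solr/configsets/nutch/conf"
fi
```

In SolrCloud mode the config directory is stored in ZooKeeper instead, which is the situation the Stack Overflow answer above addresses.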
-Original Message-
From: Pau Paches
Hi,
I am crawling just a single URL, so no whole-web crawling.
I went with option 2 and ran fetching and invertlinks successfully. This is just Nutch 1.x.
Then I moved on to indexing into Apache Solr, so I went to the section "Setup Solr for search".
First thing that does not work:
cd ${APACHE_SOLR_HOME}/example
java -jar start.jar
No
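One possible explanation, offered as a guess rather than a diagnosis: newer Solr releases are started through the bin/solr script instead of example/start.jar. The sketch below only prints which start command fits a given install; the function name solr_start_cmd is made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a start command that matches the layout of a Solr install.
#   $1 = Solr install directory
solr_start_cmd() {
  local home="$1"
  if [ -x "$home/bin/solr" ]; then
    # Releases that ship the bin/solr control script
    printf '%s\n' "$home/bin/solr start"
  elif [ -f "$home/example/start.jar" ]; then
    # Older layout, as described in the tutorial steps quoted above
    printf '%s\n' "cd $home/example && java -jar start.jar"
  else
    return 1
  fi
}

# Skipped when APACHE_SOLR_HOME is not set.
if [ -n "${APACHE_SOLR_HOME:-}" ]; then
  solr_start_cmd "$APACHE_SOLR_HOME"
fi
```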