When you run the inject and generate commands, do you see your site being
added in the console output?
Also, while fetching and parsing, you should be able to see the number of
successful fetches and parse actions in your console. Ideally this should
be equal to or greater than the number of sites you've put in the seed.txt
file.
If this is not the case, then there is an issue with either your seed.txt
file or the regex-urlfilter file.
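
One rough way to test the seed.txt / regex-urlfilter theory is to run each
seed URL through the filter rules by hand. The sketch below is only an
approximation: it assumes the usual rule format ("+" or "-" followed by a
regex, first matching rule wins, no match means the URL is dropped), and it
uses grep -E, whose syntax differs from the Java regexes Nutch itself uses.
The function names check_url and check_seeds are made up for this example.

```shell
#!/bin/sh
# Hypothetical helpers, not part of Nutch.
# Approximates regex-urlfilter behaviour with grep -E (Nutch uses Java regex,
# so unusual patterns may behave differently).

# check_url URL RULES_FILE -> prints ACCEPT or REJECT
check_url() {
  url="$1"
  rules="$2"
  while IFS= read -r line || [ -n "$line" ]; do
    case "$line" in
      '+'*) verdict=ACCEPT ;;
      '-'*) verdict=REJECT ;;
      *) continue ;;            # blank lines, comments, anything else
    esac
    pattern="${line#?}"         # strip the leading + or -
    # The first rule whose regex matches decides the outcome.
    if printf '%s\n' "$url" | grep -Eq -- "$pattern"; then
      echo "$verdict"
      return 0
    fi
  done < "$rules"
  echo "REJECT"                 # assumed default: no rule matched
}

# check_seeds SEED_FILE RULES_FILE -> one "VERDICT<TAB>URL" line per seed
check_seeds() {
  seeds="$1"
  seed_rules="$2"
  while IFS= read -r seed || [ -n "$seed" ]; do
    case "$seed" in
      ''|'#'*) continue ;;      # skip blank lines and comments
    esac
    printf '%s\t%s\n' "$(check_url "$seed" "$seed_rules")" "$seed"
  done < "$seeds"
}
```

From runtime/local/bin this could be called as
check_seeds ../urls/seed.txt ../conf/regex-urlfilter.txt (the conf path is
an assumption about your layout). Any seed reported as REJECT would never
be injected, which would explain an empty index.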

When you run the crawl command, you don't need to index to Solr
separately. The command will do it for you.
Run ./crawl to see usage instructions.

On Mon, 22 Feb 2016, 11:41 Tom Running <[email protected]> wrote:

> Yes, I did ran these before run ./nutch solrindex
> http://localhost:8983/solr/ -all and get nothing.
>
>
> From /home/nutch/runtime/local/bin/
>
> ./nutch inject ../urls/seed.txt
> ./nutch readdb
> ./nutch generate -topN 2500
> ./nutch fetch -all
> ./nutch parse -all
> ./nutch updatedb
>
> Did not run the crawl command.
>
> Would I just run ./crawl ??
> then run this again ./nutch solrindex http://localhost:8983/solr/ -all
>
> Thank you very much for response to my questions.
>
> Tom
>
>
> On Sun, Feb 21, 2016 at 11:25 PM, Binoy Dalal <[email protected]>
> wrote:
>
> > Just to be clear, you did run the preceding nutch commands to inject,
> > generate, fetch and parse the URLs right?
> >
> > Additionally try with the ./crawl command to directly crawl and index
> > everything to solr without having to manually run all the steps.
> >
> > On Mon, 22 Feb 2016, 07:24 Tom Running <[email protected]> wrote:
> >
> > > I am trying to get Nutch to run solrindex and having problem.  I am
> > > using the following instruction from
> > > this document http://wiki.apache.org/nutch/Nutch2Tutorial.  Everything
> > > are working except when I ran the following command.
> > >
> > >
> > > ./nutch solrindex http://localhost:8983/solr -all
> > >
> > >
> > >
> > > ****** it came back with the following info  *****
> > > ****** It seems to have problem with indexing ****
> > > IndexingJob: starting
> > > Active IndexWriters :
> > > SOLRIndexWriter
> > >         solr.server.url : URL of the SOLR instance (mandatory)
> > >         solr.commit.size : buffer size when sending to SOLR (default 1000)
> > >         solr.mapping.file : name of the mapping file for fields (default solrindex-mapping.xml)
> > >         solr.auth : use authentication (default false)
> > >         solr.auth.username : username for authentication
> > >         solr.auth.password : password for authentication
> > > IndexingJob: done.
> > >
> > >
> > > When I launch the SOLR Web UI interface can not query or find any
> > > things under the default collection1 or the
> > > gettingstarted_shard1_replica1 or gettingstarted_shard2_replica1
> > >
> > >
> > > I have also tried with this option (with the colletion1) and still not
> > > able to query anything.
> > > ./nutch solrindex http://localhost:8983/solr/collection1 -all
> > >
> > >
> > >
> > > After download SOLR 4.10.3 and start it as it with command
> > > /home/solr/bin/solr start -e cloud -noprompt
> > >
> > > I did not modify any configuration file not posting any file or
> > > directory from within SOLR. I am assuming this command
> > > ./nutch solrindex http://localhost:8983/solr/collection1
> > > will do all the posting and index for SOLR.
> > >
> > > Any ideas what am I missing here.  Any advice where to go from here
> > > would be greatly appreciate.
> > >
> > > I Did tried copy /nutch/runtime/local/conf/*.*   into SOLR and it did
> > > not make any different.
> > >
> > > Thank you.
> > >
> > > Tom
> > >
> > > --
> > Regards,
> > Binoy Dalal
> >
>
-- 
Regards,
Binoy Dalal
