No problem and sorry about that!
On Fri, Oct 5, 2018 at 11:50 AM Sebastian Nagel
wrote:
> Hi Timeka,
>
> > because Solr is missing the
> > files from its packet for it to work.
>
> There are many Solr versions available and it may easily happen that the
> description in the Wiki is outdated or
Hi Timeka,
> because Solr is missing the
> files from its packet for it to work.
There are many Solr versions available and it may easily happen that the
description in the Wiki is outdated or not applicable for your combination
of Nutch and Solr.
Please try to give as much information as
Thank you so very much
On Fri, Oct 5, 2018, 3:41 PM Yash Thenuan Thenuan
wrote:
> You can use elasticsearch.
>
> On Sat, 6 Oct 2018, 00:58 Timeka Cobb, wrote:
>
> > Hello folks! Does anyone know of a good alternative to Solr? I'm asking this
> > because I've been trying to connect the 2 and
You can use elasticsearch.
On Sat, 6 Oct 2018, 00:58 Timeka Cobb, wrote:
> Hello folks! Does anyone know of a good alternative to Solr? I'm asking this
> because I've been trying to connect the 2 and it's been so frustrating.
> The Nutch Wiki is extremely unreliable when it comes to Solr and
Hello folks! Does anyone know of a good alternative to Solr? I'm asking this
because I've been trying to connect the 2 and it's been so frustrating.
The Nutch Wiki is extremely unreliable when it comes to Solr and every site
I go to for info leads me nowhere. Does anyone know of something else I
Hi Amarnath,
the only possibility is that https://www.abc.com/ is skipped
- by another rule in regex-urlfilter.txt
- or another URL filter plugin
Please check your configuration carefully. You may also use the tool
bin/nutch filterchecker
to test the filters beforehand: every active filter
The information given is not sufficient to figure out the problem.
1. You need to add indexer-solr to the plugins list.
2. Check the "solr index properties" in nutch-default.xml (it has a lot of
properties).
See https://wiki.apache.org/nutch/NutchTutorial for a detailed
explanation.
On Fri, Oct 5, 2018
Also, check the last regex line.
# accept anything else
+.
If you have made it negative (-.) by mistake, everything will be discarded.
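
To illustrate why the sign on that last line matters, here is a minimal Python sketch of first-match filtering in the style of regex-urlfilter.txt. It is not Nutch code: the helper names (parse_rules, accepts) and the image-suffix rule are hypothetical, and it assumes that the first rule whose pattern is found in the URL decides, and that a URL matching no rule is ignored.

```python
import re

def parse_rules(text):
    """Parse '+regex' / '-regex' lines; skip blank lines and '#' comments."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        sign, pattern = line[0], line[1:]
        if sign not in "+-":
            raise ValueError("rule must start with '+' or '-': " + line)
        rules.append((sign == "+", re.compile(pattern)))
    return rules

def accepts(rules, url):
    """Return the verdict of the first matching rule (assumed semantics)."""
    for accept, pattern in rules:
        if pattern.search(url):
            return accept
    return False  # assumption: a URL that matches no rule is ignored

# Hypothetical rule file ending with the catch-all accept line.
GOOD = r"""
# skip image suffixes (hypothetical example rule)
-\.(gif|jpg|png)$
# accept anything else
+.
"""

# Same file, but with the last line made negative by mistake.
BAD = GOOD.replace("+.", "-.")

for name, text in (("GOOD", GOOD), ("BAD", BAD)):
    rules = parse_rules(text)
    print(name, accepts(rules, "https://www.abc.com/"))
```

With the catch-all line negated, every URL that reaches it is rejected, which is consistent with the whole site being skipped.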
Best,
Govind
On Fri, Oct 5, 2018 at 1:02 PM Sebastian Nagel
wrote:
> Hi Amarnath,
>
> the only possibility is that https://www.abc.com/ is skipped
> - by
Hi all,
I am using Nutch to crawl and index into Solr, and I am facing an encoding
issue when the data is stored in Solr.
One site has the title
title : ebm-papst Motoren & Ventilatoren GmbH - Axialventilatoren und
Radialventilatoren aus Linz, Österreich
but in Solr it is stored as below
I see that, but in the instructions they say to create resources, and the
command line the Nutch Wiki offers doesn't work because Solr is missing the
files from its packet for it to work. I will try again. Thank ya so much
for ya help
On Fri, Oct 5, 2018, 4:48 AM govind nitk wrote:
> Info given is
Hi Sebastian,
Thanks for the update. After a long time spent on it, here is the regex
pattern that blocks my use case.
-.*(modal[-_a-zA-Z0-9]*[\.]html|exit.html[\/]?\??.*|model[-_a-zA-Z0-9]*[\.]html|exitpage.*|exitPage.*)
There was another pattern that caused everything to be blocked; I have corrected it.
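
In case it helps anyone reading later, a pattern like this can be sanity-checked outside Nutch with a few lines of Python. This is only a sketch: the sample URLs are made up, the pattern is taken as the text between the leading "-" and the closing parenthesis, and it assumes the filter looks for the pattern anywhere in the URL (re.search here). Python and Java regex syntax agree for this particular pattern, but that is not true in general.

```python
import re

# Exclusion pattern from the message, without the leading "-" sign.
PATTERN = re.compile(
    r".*(modal[-_a-zA-Z0-9]*[\.]html|exit.html[\/]?\??.*"
    r"|model[-_a-zA-Z0-9]*[\.]html|exitpage.*|exitPage.*)"
)

def is_blocked(url):
    """True if the exclusion pattern is found in the URL (assumed semantics)."""
    return PATTERN.search(url) is not None

# Hypothetical sample URLs, not taken from the original thread.
SAMPLES = [
    "https://www.abc.com/",
    "https://www.abc.com/modal-login.html",
    "https://www.abc.com/exit.html?url=x",
    "https://www.abc.com/products/model_x.html",
    "https://www.abc.com/exitPage/foo",
]

for url in SAMPLES:
    print("blocked" if is_blocked(url) else "kept   ", url)
```

Note that the unescaped dot in exit.html matches any character, so a URL containing something like exit-html would be blocked as well.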
Thanks,