On Mon, Apr 5, 2010 at 3:32 PM, Anil Kumar a...@nexusemp.com wrote:
Hi
I'm using the Nutch crawler in my project.
I am scraping data from a site
whose pages contain multiple links that lead
to other web pages.
Nutch does not crawl all of the links.
Please help me resolve this.
Hi
I'm using the Nutch crawler in my project and have crawled more than 2GB of
data using the Nutch runbot script. Up to 2GB, the segment merge finished
within 24 hrs, but now it takes more than 48 hrs and is still running. I
have set depth to 16 and topN to 2500. I want to run the crawler every day
as per my
hi
The metatag parser works great. When I dumped with readseg I saw the
metatag.keywords field and its data. But I can't solve the query part. I put
this in nutch-site.xml and deployed it. When I query a keyword there is no
search result. Is there anything wrong with this setup?
property
Just a reminder, just over one week left open on the CFP. Some great talks
entered already. Keep it up!
On Mar 24, 2010, at 8:03 PM, Grant Ingersoll wrote:
Apache Lucene EuroCon Call For Participation - Prague, Czech Republic,
May 20-21, 2010
All submissions must be received by
On Mon, Apr 5, 2010 at 5:27 PM, ashokkumar.raveendi...@wipro.com wrote:
Hi
I'm using the Nutch crawler in my project and have crawled more than 2GB of
data using the Nutch runbot script. Up to 2GB, the segment merge finished
within 24 hrs, but now it takes more than 48 hrs and is still running. I
Hi,
Thank you for your suggestion. I have 500+ internet URLs
configured for crawling, and the crawl process is running in the Amazon
cloud. I have already reduced my depth to 8 and topN to 1000, increased
fetcher threads to 150, and limited each host to 50 URLs using
generate.max.per.host
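For readers following along, limits like these are normally set in nutch-site.xml. A minimal sketch, assuming the Nutch 1.x property name fetcher.threads.fetch for the thread count (the message itself names only generate.max.per.host):

{code:xml}
<!-- Assumption: fetcher.threads.fetch is the thread-count property meant by "fetcher threads" -->
<property>
  <name>fetcher.threads.fetch</name>
  <value>150</value>
</property>
<!-- Named in the message: cap on URLs per host in each generated fetch list -->
<property>
  <name>generate.max.per.host</name>
  <value>50</value>
</property>
{code}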
Hi
The query-basic plugin is used to include these fields in the search, e.g. in
nutch-site.xml:
{code:xml}
<property>
  <name>query.basic.description.boost</name>
  <value>2.0</value>
</property>
<property>
  <name>query.basic.keywords.boost</name>
  <value>2.0</value>
</property>
{code}
The query filter included in
On 2010-04-05 16:54, ashokkumar.raveendi...@wipro.com wrote:
Hi,
Thank you for your suggestion. I have 500+ internet URLs
configured for crawling, and the crawl process is running in the Amazon
cloud. I have already reduced my depth to 8 and topN to 1000, and also
increased fetcher threads
I would like to use a keepword filter, like the one in Solr, but I'm not using
Solr... what options do I have apart from rebuilding one from scratch in Nutch?
--
-MilleBii-
Hi,
-Original Message-
From: Susam Pal [mailto:susam@gmail.com]
Sent: Tuesday, 6 April 2010 12:18 AM
To: nutch-user@lucene.apache.org
Subject: Re: Nutch segment merge is very slow
On Mon, Apr 5, 2010 at 5:27 PM, ashokkumar.raveendi...@wipro.com
wrote:
Hi
I'm using
10 matches