Hi, I am using Nutch 8.1 to search for MP3 files on intranet HTTP sites. I've
revised nutch-default.xml as follows:
<property>
<name>plugin.includes</name>
<value>protocol-http|parse-mp3|urlfilter-regex|parse-(text|html|js)|index-basic|query-(basic|site|url)|summary-basic|scoring-opic</value>
</property>
Is there a way to have Nutch return some hit context (à la Google) to
better identify the hit?
For example, if I search for nutch, a link pointing to
http://lucene.apache.org/nutch/ would be followed by the following
context:
This is the first *Nutch* release as an Apache Lucene sub-project.
...
You need to enable the index-more and query-more plugins to support
queries based on type, date range, etc.
<property>
<name>plugin.includes</name>
<value>protocol-http|urlfilter-regex|parse-(text|html|js|mp3)|index-(basic|more)|query-(basic|more|site|url)|summary-basic|scoring-opic</value>
</property>
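Nutch reads conf/nutch-site.xml after nutch-default.xml, so the same property can be placed there as an override instead of editing the defaults file. The following is only a sketch: NUTCH_CONF_DIR is a hypothetical variable (it defaults to a temporary directory so nothing real is touched), and the <configuration> root element is an assumption that should be checked against the nutch-default.xml shipped with your release.

```shell
# Assumption: NUTCH_CONF_DIR is the Nutch conf directory. A temporary
# directory is used by default so this sketch never modifies a real install.
NUTCH_CONF_DIR=${NUTCH_CONF_DIR:-$(mktemp -d)}

# Write an override file containing only the property that changes.
# The root element name is assumed; copy it from your nutch-default.xml.
cat > "$NUTCH_CONF_DIR/nutch-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>plugin.includes</name>
    <value>protocol-http|urlfilter-regex|parse-(text|html|js|mp3)|index-(basic|more)|query-(basic|more|site|url)|summary-basic|scoring-opic</value>
  </property>
</configuration>
EOF

# Show the line that enables index-more, as a quick sanity check.
grep -F 'index-(basic|more)' "$NUTCH_CONF_DIR/nutch-site.xml"
```

Pages fetched before the change would presumably need to be re-indexed before the extra fields become searchable.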
Hello List,
does any of you have experience with the Linux command nice to set a
lower priority for a crawler or indexer process?
When I run an indexer, I have the problem that my load sometimes goes up
to 2.00. This slows down other processes and sometimes it's really disturbing.
Any ideas?
I run the index step with nice -13.
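For reference, `nice -13` is the older spelling of `nice -n 13`: it adds 13 to the niceness, which lowers the priority. A minimal sketch of the idea follows. The wrapper name run_low_priority is hypothetical, and the real indexer call (for example ./nutch index followed by the segment) is only assumed in the comment.

```shell
# Hypothetical helper: run any command with its niceness raised by 13,
# i.e. at lower CPU priority. The real call would look something like
#   run_low_priority ./nutch index "$s"
run_low_priority() {
    nice -n 13 "$@"
}

# `nice` without arguments prints the niceness of the calling process,
# which makes it easy to see what the child actually gets.
base=$(nice)
lowered=$(run_low_priority nice)
echo "shell niceness: $base, child niceness: $lowered"
```

Niceness only affects CPU scheduling. If the load comes mainly from disk I/O during indexing, the effect may be smaller than hoped.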
- -Original Message-
- From: NG-Marketing, M.Schneider [mailto:[EMAIL PROTECTED]
- Sent: 28 ??? 2006 ?. 19:20
- To: nutch-user@lucene.apache.org
- Subject: nice a indexer
On 28/10/06, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:
Can the parse-mp3 plugin parse the information in MP3 files, such as author,
song name, artist and so on?
The parse-mp3 plugin can obtain any information in the ID3 tags
contained in the file. If this information is not part of the file,
Hi,
Do you have any hints that would improve the speed of the following Nutch
commands?
./nutch generate db segments -topN 1000
s=`ls -d segments/2* | tail -1`
./nutch fetch $s
./nutch updatedb db $s
./nutch index $s
./nutch dedup segments tmpfile
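Before tuning, it may help to see which step dominates the run time. The sketch below wraps the same commands in a small timing helper. The names timed, crawl_cycle and the NUTCH variable are hypothetical, introduced only for illustration, and the commands themselves are copied unchanged from the list above.

```shell
# Run a command, then report on stderr how many whole seconds it took.
timed() {
    start=$(date +%s)
    status=0
    "$@" || status=$?
    end=$(date +%s)
    echo "$1 ${2:-} took $((end - start))s (exit $status)" >&2
    return "$status"
}

# Assumption: NUTCH points at the launcher script used in the commands above.
NUTCH=${NUTCH:-./nutch}

# One generate/fetch/update/index/dedup cycle, stopping at the first failure.
crawl_cycle() {
    timed "$NUTCH" generate db segments -topN 1000 || return
    s=$(ls -d segments/2* | tail -1)
    timed "$NUTCH" fetch "$s" || return
    timed "$NUTCH" updatedb db "$s" || return
    timed "$NUTCH" index "$s" || return
    timed "$NUTCH" dedup segments tmpfile
}
```

Calling crawl_cycle from the directory that holds the nutch script prints one timing line per step, which shows where tuning effort is likely to pay off.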
I mean, do you have some hints for the