hi all:
I ran into a problem when integrating Nutch-0.7.1 with an intelligent Chinese
lexical analysis system (ICTCLAS).
I followed this page:
http://www.nutchhacks.com/ftopic391.php?highlight=chinese
which was written by caoyuzhong.
When I ran ant on my modified Java files, javac reported that it could not
find the source needed to run ICTCLASCaller.
Notice: the code and the DLL's usage are restricted by the ICTCLAS copyright
(NOT MINE).
Details of usage are in the comments of ICTCLASCaller.java.
Good luck!
2006/3/27, kauu [EMAIL PROTECTED]:
hi all:
I get a problem when integrating Nutch-0.7.1
output for details.
Total time: 39 seconds
On 3/27/06, kauu [EMAIL PROTECTED] wrote:
I got it, thank goodness!
I'm so happy to tell everyone I got it working, and I will write it up for
anyone else!
On 3/27/06, kauu [EMAIL PROTECTED] wrote:
Thanks anyway.
On 3/27/06, Yong-gang
I think you should learn JavaCC and then study analysis.jj;
then the Thai issue will be resolved soon.
Just try it.
On 11/7/06, sanjeev [EMAIL PROTECTED] wrote:
Hello,
After playing around with Nutch for a few months, I was trying to implement
the Thai language analyzer for Nutch.
hi:
I have a problem now: I can't build Nutch on Linux with ant.
My ant version is
Apache Ant version 1.5.2-20 compiled on September 25 2003
The error is below.
Has anyone hit the same problem? I need your help.
Buildfile: build.xml
BUILD FAILED
Does anyone know the details of the process described in the topic "How to
start working with MapReduce"?
I've read something in the FAQ, but I don't understand it very well; my
version is 0.7.2, not 0.8.x.
--
www.babatu.com
Yes, I'm on your side.
On 11/23/06, Scott Green [EMAIL PROTECTED] wrote:
Hi
NUTCH-61 (http://issues.apache.org/jira/browse/NUTCH-61) is about an
adaptive re-fetch plugin, and Jerome Charron had commented: "Why not
make FetchSchedule a new ExtensionPoint, and then
DefaultFetchSchedule and
Thanks very much, I'll try it.
On 12/9/06, Sami Siren [EMAIL PROTECTED] wrote:
吴志敏 wrote:
I want to dump the stored segments to an XML file, but when I read
SegmentReader.java, I found that it's not a simple thing.
Dumping to a text file is a Hadoop job. I just want to dump the
segments'
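For what it's worth, here is a hedged sketch of the plain-text route, assuming a 0.8.x-era bin/nutch script that exposes the readseg command wrapping SegmentReader (the segment path below is illustrative):

```shell
# Illustrative: "readseg -dump" runs SegmentReader as a local Hadoop job
# and writes a plain-text dump of the chosen segment into segdump/.
bin/nutch readseg -dump crawl/segments/20070203121500 segdump
less segdump/dump
```

Getting XML instead of plain text would still mean modifying SegmentReader or post-processing the dump.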
I can't test my parse-rss plugin in nutch-0.8.1.
I just can't test the default rsstest.rss file.
2007-01-25 17:04:34,703 INFO conf.Configuration
(Configuration.java:getConfResourceAsInputStream(340)) - found resource
parse-plugins.xml at
Please give us the URL, thanks.
On 1/25/07, chee wu [EMAIL PROTECTED] wrote:
Just appended the portion for 0.8.1 to NUTCH-339.
- Original Message -
From: Armel T. Nene [EMAIL PROTECTED]
To: nutch-dev@lucene.apache.org
Sent: Thursday, January 25, 2007 8:06 AM
Subject: RE: Fetcher2
Chee,
I want to crawl RSS feeds, parse them, and index them, so that when
searching the content, each hit appears like an individual page.
I don't know whether I have explained it clearly.
<item>
  <title>Snowstorm strikes Europe late, causing flight delays and traffic chaos (photos)</title>
  <description>A snowstorm swept across Europe, causing many flight delays</description>
</item>
I wonder whether we can write a plugin to get this functionality.
Can anyone give me a hint?
On 1/26/07, Gal Nitzan [EMAIL PROTECTED] wrote:
Hi Kauu,
The functionality you require doesn't exist in the current parse-rss
plugin. I need the same functionality but it doesn't exist and I believe
it's
Who can tell me where and how a Nutch document is built in nutch-0.8.1?
For example, one HTML page is one document, but I want to split a document
into several.
On 1/27/07, kauu [EMAIL PROTECTED] wrote:
That's the right thing.
I think we should do something when Nutch fetches a page.
That's right, but in other words, I just need to index the exact
information in a page. In reality, real-world pages contain lots of
spam, so I just want to index the description.
On 1/27/07, sishen [EMAIL PROTECTED] wrote:
On 1/26/07, Gal Nitzan [EMAIL PROTECTED] wrote:
Hi Kauu
http://blog.idna-solutions.com
-Original Message-
From: kauu [mailto:[EMAIL PROTECTED]]
Sent: 27 January 2007 06:43
To: nutch-dev@lucene.apache.org; [EMAIL PROTECTED]
Subject: Re: parse-rss make them items as different pages
Who can tell me where and how to build a Nutch document in nutch-0.8.1?
Hi folks:
What I want to do is separate an RSS file into several pages,
just as has been discussed before. I want to fetch an RSS page and index
it as different documents in the index, so the searcher can find an
item's info as an individual hit.
In my opinion, we could create a
http://lucene.apache.org/nutch
category: news
author: kauu
So, can the parse-rss plugin satisfy what I need?
<item>
  <title>nutch--open source</title>
  <description>
    nutch nutch nutch nutch nutch
  </description>
  <link>http://lucene.apache.org/nutch</link>
  <category>news</category>
</item>
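To make the goal concrete: splitting a feed like the one above into one document per item is pre-processing before indexing. A minimal, self-contained sketch (the feed file and its contents are made up for illustration, not taken from Nutch):

```shell
# Write a tiny two-item feed, then count the <item> elements; each one
# would become its own document in the index under the proposed scheme.
cat > feed.xml <<'EOF'
<rss><channel>
<item><title>nutch--open source</title><link>http://lucene.apache.org/nutch</link></item>
<item><title>another story</title><link>http://example.org/2</link></item>
</channel></rss>
EOF
grep -c '<item>' feed.xml
```

In Nutch itself this split would have to live in the parse-rss plugin (or a custom parser), since the stock plugin emits a single document per feed.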
Why, when I changed nutch/conf/log4j.properties, did the change not take
effect? I only changed the first line,
log4j.rootLogger=INFO,DRFA, to log4j.rootLogger=DEBUG,DRFA.
Like this:
# RootLogger - DailyRollingFileAppender
#log4j.rootLogger=INFO,DRFA
Chris
On 1/30/07 7:30 PM, kauu [EMAIL PROTECTED] wrote:
Thanks for your reply.
Maybe I didn't explain clearly.
I want to index each item as an individual page; then when I search for
something, for example "nutch-open source", Nutch returns a hit containing:
title: nutch-open source
Sorry, I will be careful. Thanks anyway.
On 1/31/07, chee wu [EMAIL PROTECTED] wrote:
Setting the two Java arguments -Dhadoop.log.file and -Dhadoop.log.dir
should fix your problem.
By the way, don't put too many Chinese characters in your mail.
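As a hedged sketch, assuming the stock Nutch 0.8.x log4j.properties, which builds its appender path from these two system properties (the log directory and the classpath variable below are illustrative, not from the thread):

```shell
# Pass the two system properties so log4j can resolve
# ${hadoop.log.dir}/${hadoop.log.file} for the DRFA appender.
java -Dhadoop.log.dir=/opt/nutch/logs \
     -Dhadoop.log.file=hadoop.log \
     -cp "$NUTCH_CLASSPATH" org.apache.nutch.crawl.Crawl urls -dir crawl -depth 3
```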
- Original Message -
From: kauu [EMAIL PROTECTED]
, is to allow you to associate the metadata fields
category:, and author: with the item Outlink...
Cheers,
Chris
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:247)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:112)
On 2/3/07, Renaud Richardet [EMAIL PROTECTED] wrote:
Gal, Chris, Kauu,
So, if I understand correctly, you need a way to pass information along
the fetches, so that when Nutch fetches