Zaheed Haque wrote:
Hi
A lot of the patches/plugins in JIRA are not updated to reflect changes
in trunk. Probably the way to test it would be to build it against
that specific revision of Nutch.
I'm aware of that. I just put a note because I see that this patch is
for 0.9.
regards,
Uros
cheers
On 9/5/06, Uros Gruber (JIRA) <[EMAIL PROTECTED]> wrote:
[ http://issues.apache.org/jira/browse/NUTCH-249?page=comments#action_12432584 ]
Uros Gruber commented on NUTCH-249:
-----------------------------------
I'm trying to test this patch, but I'm having build problems:
compile-core:
[javac] Compiling 2 source files to /usr/home/uros/nutch-wb/build/classes
[javac] /usr/home/uros/nutch-wb/src/java/org/apache/nutch/crawl/bw/BWUpdateDb.java:261: createJob(org.apache.hadoop.conf.Configuration,org.apache.hadoop.fs.Path) in org.apache.nutch.crawl.CrawlDb cannot be applied to (org.apache.hadoop.conf.Configuration,java.io.File)
[javac] JobConf updateJob = CrawlDb.createJob(getConf(), crawlDb);
[javac] ^
[javac] /usr/home/uros/nutch-wb/src/java/org/apache/nutch/crawl/bw/BWUpdateDb.java:267: install(org.apache.hadoop.mapred.JobConf,org.apache.hadoop.fs.Path) in org.apache.nutch.crawl.CrawlDb cannot be applied to (org.apache.hadoop.mapred.JobConf,java.io.File)
[javac] CrawlDb.install(updateJob, crawlDb);
[javac] ^
[javac] Note: /usr/home/uros/nutch-wb/src/java/org/apache/nutch/crawl/bw/BWUpdateDb.java uses or overrides a deprecated API.
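
Both errors above have the same shape: BWUpdateDb passes a java.io.File where CrawlDb.createJob and CrawlDb.install expect an org.apache.hadoop.fs.Path. A minimal sketch of the kind of conversion that would presumably be needed follows. To keep it compilable without Hadoop on the classpath, Path and createJob here are stand-ins written for this example (assumptions, not the Nutch or Hadoop code); the real Path class does provide a Path(String) constructor.

```java
import java.io.File;

public class FileToPathSketch {
    // Stand-in for org.apache.hadoop.fs.Path, only so this sketch compiles
    // on its own. The real class also has a Path(String) constructor.
    static final class Path {
        private final String location;

        Path(String location) {
            this.location = location;
        }

        @Override
        public String toString() {
            return location;
        }
    }

    // Hypothetical stand-in for a method that accepts a Path rather than
    // a java.io.File, as the compiler output reports for CrawlDb.createJob.
    static String createJob(Path crawlDb) {
        return "job for " + crawlDb;
    }

    // Convert the File the patch still uses into the expected Path type.
    static Path toPath(File file) {
        return new Path(file.toString());
    }

    public static void main(String[] args) {
        File crawlDb = new File("crawldb");
        // createJob(crawlDb) would not compile, which mirrors the log above.
        System.out.println(createJob(toPath(crawlDb)));
    }
}
```

In the patch itself the equivalent change would be made at the two call sites the compiler reports (lines 261 and 267 of BWUpdateDb.java), or by changing the type of the crawlDb variable.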
> black- white list url filtering
> -------------------------------
>
> Key: NUTCH-249
> URL: http://issues.apache.org/jira/browse/NUTCH-249
> Project: Nutch
> Issue Type: Improvement
> Components: fetcher
> Affects Versions: 0.8
> Reporter: Stefan Groschupf
> Priority: Trivial
> Fix For: 0.9.0
>
> Attachments: blackWhiteListV2.patch, blackWhiteListV3.patch
>
>
> Existing url filter mechanisms need to process each url against
> each filter pattern. For very large filter sets this may not
> scale very well.
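
To illustrate the scaling concern in the description above: checking a URL against every pattern costs time proportional to the number of patterns, while a lookup keyed on the host costs roughly the same no matter how long the list is. The sketch below contrasts the two. It is only an illustration under that assumption, not the approach taken by the attached patches, and the class and method names are invented for the example.

```java
import java.net.URI;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;

public class FilterScalingSketch {
    // Pattern-by-pattern check: in the worst case every pattern is tried,
    // so the cost per URL grows with the size of the filter set.
    static boolean matchesAnyPattern(String url, List<Pattern> patterns) {
        for (Pattern p : patterns) {
            if (p.matcher(url).find()) {
                return true;
            }
        }
        return false;
    }

    // Host lookup in a hash set: expected constant time per URL,
    // independent of how many hosts are listed.
    static boolean hostListed(String url, Set<String> hosts) {
        String host = URI.create(url).getHost();
        return host != null && hosts.contains(host);
    }

    public static void main(String[] args) {
        List<Pattern> patterns = List.of(Pattern.compile("^http://example\\.org/"));
        Set<String> whiteList = new HashSet<>(List.of("example.org"));

        String url = "http://example.org/page";
        System.out.println(matchesAnyPattern(url, patterns));
        System.out.println(hostListed(url, whiteList));
    }
}
```

The trade-off is expressiveness: a host set cannot express path or query rules the way regular expressions can, so the two mechanisms are complementary rather than interchangeable.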