Hi,
I've downloaded apache-ant-1.7.0.
The idea is to compile the Nutch source code,
and I've placed it in my nutch directory.
Does this mean the installation of Ant is complete?
Or are there further steps to follow?
If so, kindly tell me the steps I have to follow to compile the
Why don't you compile Nutch in Eclipse if you are working in a Windows environment?
Then you need not download Ant.
If you can proceed with that, then I can explain the rest.
On Linux I have worked only up to deployment and have not done any testing or
running of the Nutch source code.
Thanks
I guess a Java program can be compiled once and then run anywhere,
so if the package compiled on Windows can then be used on Unix, please
explain the further steps.
So if it's possible to compile the code in Eclipse, then please tell me how to
do it.
I don't have any idea about
Adding optional terms to a query
Key: NUTCH-470
URL: https://issues.apache.org/jira/browse/NUTCH-470
Project: Nutch
Issue Type: Wish
Components: searcher
Affects Versions: 0.9.0
[
https://issues.apache.org/jira/browse/NUTCH-470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Trond Andersen updated NUTCH-470:
-
Attachment: optional.patch
A small patch making it possible to add optional terms to the Query
OK, thanks for all your input, guys! I'll discuss this with my co-worker.
Dennis, what more information do you need?
Thanks everyone!
Briggs wrote:
One more thing...
Are you using a distributed index? If this is so, you do not want to
do this; indexes should be local to the machine that
Fix synchronization in NutchBean creation
-
Key: NUTCH-471
URL: https://issues.apache.org/jira/browse/NUTCH-471
Project: Nutch
Issue Type: Bug
Components: searcher
Affects Versions:
[
https://issues.apache.org/jira/browse/NUTCH-471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Enis Soztutar updated NUTCH-471:
Attachment: NutchBeanCreationSync_v1.patch
this patch synchronizes NutchBean.get((ServletContext
Hey guys,
One more addition: we're not using DFS. We have a single XP box with NTFS (so
no distributed index).
Hope this helps, greetings..
JoostRuiter wrote:
OK, thanks for all your input, guys! I'll discuss this with my co-worker.
Dennis, what more information do you need?
Thanks
Hi all,
I have been working on Fetcher2 code lately and I came across this
particular code (in FetchItemQueue.getFetchItem) that I didn't quite
understand:
public FetchItem getFetchItem() {
...
long last = endTime.get() + (maxThreads > 1 ? crawlDelay : minCrawlDelay);
...
}
Now, the
I got some additional info from our developer:
I never
had much luck with the merge tools, but you might post this snippet from
your log to the board:
2007-04-23 20:01:56,656 INFO segment.SegmentMerger - Slice size: 5 URLs.
2007-04-23 20:01:56,656 INFO segment.SegmentMerger - Slice size:
NullPointerException in ZipTextExtractor if no MIME type for zipped file
Key: NUTCH-472
URL: https://issues.apache.org/jira/browse/NUTCH-472
Project: Nutch
Issue Type:
ExcepExtractor performance bad due to String concatenation
--
Key: NUTCH-473
URL: https://issues.apache.org/jira/browse/NUTCH-473
Project: Nutch
Issue Type: Improvement
[
https://issues.apache.org/jira/browse/NUTCH-473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Antony Bowesman updated NUTCH-473:
--
Summary: ExcelExtractor performance bad due to String concatenation (was:
ExcepExtractor
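To illustrate the kind of cost the issue title refers to, here is a minimal sketch contrasting repeated String concatenation with a StringBuilder. It is not the ExcelExtractor code; the class name, method names, and the idea of joining cell texts with spaces are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;

final class CellTextJoiner {
    // Repeated += builds a new String on every iteration, copying
    // everything accumulated so far, so total work grows quadratically.
    static String joinWithConcat(List<String> cells) {
        String text = "";
        for (String cell : cells) {
            text += cell + " ";
        }
        return text;
    }

    // StringBuilder appends into one growable buffer instead.
    static String joinWithBuilder(List<String> cells) {
        StringBuilder text = new StringBuilder();
        for (String cell : cells) {
            text.append(cell).append(' ');
        }
        return text.toString();
    }

    public static void main(String[] args) {
        List<String> cells = Arrays.asList("a", "b", "c"); // illustrative data
        System.out.println(joinWithConcat(cells).equals(joinWithBuilder(cells)));
    }
}
```

Both methods return the same text; only the amount of copying differs, which matters for sheets with many cells.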
I have discovered another bug in Fetcher2. Plugin lib-http checks
Protocol.CHECK_{BLOCKING,ROBOTS}(which resolve to strings
protocol.plugin.check.{blocking,robots}) to see if it should handle
blocking or not.
But fetcher2 sets http.plugin.check.{blocking,robots} (notice the
protocol/http
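A minimal sketch of why such a key mismatch goes unnoticed, using java.util.Properties as a stand-in for the Nutch configuration object (an assumption). The two key strings come from the message above; the class name, the reader method, and the default value of "true" are hypothetical.

```java
import java.util.Properties;

final class CheckKeyMismatch {
    // Key that lib-http reads, per the message above.
    static final String READ_KEY = "protocol.plugin.check.blocking";
    // Key that Fetcher2 sets, per the message above.
    static final String WRITTEN_KEY = "http.plugin.check.blocking";

    // Hypothetical reader: falls back to an assumed default when the key is absent.
    static boolean shouldCheckBlocking(Properties conf) {
        return Boolean.parseBoolean(conf.getProperty(READ_KEY, "true"));
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(WRITTEN_KEY, "false"); // stored under a name nobody reads
        System.out.println(shouldCheckBlocking(conf)); // still the default
        conf.setProperty(READ_KEY, "false");    // stored under the name that is read
        System.out.println(shouldCheckBlocking(conf)); // now the setting takes effect
    }
}
```

No error is raised in the first case; the reader silently falls back to its default, which is what makes this class of bug easy to miss.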
Doğacan Güney wrote:
Hi all,
I have been working on Fetcher2 code lately and I came across this
particular code (in FetchItemQueue.getFetchItem) that I didn't quite
understand:
public FetchItem getFetchItem() {
...
long last = endTime.get() + (maxThreads > 1 ? crawlDelay :
Doğacan Güney wrote:
I have discovered another bug in Fetcher2. Plugin lib-http checks
Protocol.CHECK_{BLOCKING,ROBOTS}(which resolve to strings
protocol.plugin.check.{blocking,robots}) to see if it should handle
blocking or not.
But fetcher2 sets http.plugin.check.{blocking,robots}
[
https://issues.apache.org/jira/browse/NUTCH-471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12491290
]
Andrzej Bialecki commented on NUTCH-471:
-
+1. Nice trick with the unsynchronized check. :)
Fix
On 4/24/07, Andrzej Bialecki [EMAIL PROTECTED] wrote:
Doğacan Güney wrote:
Hi all,
I have been working on Fetcher2 code lately and I came across this
particular code (in FetchItemQueue.getFetchItem) that I didn't quite
understand:
public FetchItem getFetchItem() {
...
long
Fetcher2 sets server-delay and blocking checks incorrectly
--
Key: NUTCH-474
URL: https://issues.apache.org/jira/browse/NUTCH-474
Project: Nutch
Issue Type: Bug
Components:
[
https://issues.apache.org/jira/browse/NUTCH-474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Doğacan Güney updated NUTCH-474:
Attachment: fetcher2.patch
Fetcher2 sets server-delay and blocking checks incorrectly
[
https://issues.apache.org/jira/browse/NUTCH-471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12491305
]
Sami Siren commented on NUTCH-471:
--
Isn't the DCL declared to be broken?
We could perhaps instead instantiate
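For context on the DCL (double-checked locking) concern, one well-known alternative for lazy, thread-safe creation of a single instance is the initialization-on-demand holder idiom. The sketch below shows that idiom only; it is not the patch under discussion and not necessarily what the truncated comment goes on to propose. The Bean class is a hypothetical stand-in, and the sketch assumes the instance needs no constructor argument.

```java
final class LazyBeanHolder {
    // Hypothetical stand-in for an object that is expensive to create.
    static final class Bean {
        static int created = 0;
        Bean() { created++; }
    }

    // The JVM initializes Holder exactly once, on first access,
    // and handles the required synchronization itself.
    private static final class Holder {
        static final Bean INSTANCE = new Bean();
    }

    // No explicit locking and no unsynchronized null check needed.
    static Bean get() {
        return Holder.INSTANCE;
    }

    public static void main(String[] args) {
        System.out.println(get() == get()); // same instance on every call
    }
}
```

The idiom relies on class initialization rules rather than on a hand-written check, so it avoids the visibility problems associated with DCL on older Java memory models.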
[
https://issues.apache.org/jira/browse/NUTCH-473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sami Siren resolved NUTCH-473.
--
Resolution: Duplicate
duplicate of NUTCH-456
ExcelExtractor performance bad due to String
Mike Schwartz wrote:
I have modified the geoPosition plugin
(http://wiki.apache.org/nutch/GeoPosition) code to work with nutch 0.9.
(The code was built originally using nutch 0.7.) I'd like to contribute
my changes to the nutch project. I already communicated with the code's
author
[
https://issues.apache.org/jira/browse/NUTCH-471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12491313
]
Enis Soztutar commented on NUTCH-471:
-
Nice trick with the unsynchronized check. :)
Wow, indeed I have used a
Doğacan Güney wrote:
I don't get it. The code seems to do exactly the opposite of what you
are saying. If maxThreads == 1 then maxThreads > 1 is false, thus the
expression evaluates to minCrawlDelay, not crawlDelay. Shouldn't the
expression be (maxThreads > 1 ? minCrawlDelay : crawlDelay)?
Yep,
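The point in the quoted message can be checked with a small sketch. It assumes the comparison in the expression is `maxThreads > 1` (the operator appears to have been lost in the archive); the class name, method names, and delay values are hypothetical.

```java
final class CrawlDelayChoice {
    // The expression as quoted from FetchItemQueue.getFetchItem.
    static long asQuoted(int maxThreads, long crawlDelay, long minCrawlDelay) {
        return maxThreads > 1 ? crawlDelay : minCrawlDelay;
    }

    // The variant proposed in the quoted message.
    static long asProposed(int maxThreads, long crawlDelay, long minCrawlDelay) {
        return maxThreads > 1 ? minCrawlDelay : crawlDelay;
    }

    public static void main(String[] args) {
        long crawlDelay = 5000L;    // illustrative values only
        long minCrawlDelay = 1000L;
        System.out.println(asQuoted(1, crawlDelay, minCrawlDelay));   // prints 1000
        System.out.println(asProposed(1, crawlDelay, minCrawlDelay)); // prints 5000
    }
}
```

With a single thread, the quoted form picks minCrawlDelay and the proposed form picks crawlDelay, which is the swap the message describes.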
On 4/24/07, Andrzej Bialecki [EMAIL PROTECTED] wrote:
Doğacan Güney wrote:
I don't get it. The code seems to do exactly the opposite of what you
are saying. If maxThreads == 1 then maxThreads > 1 is false, thus the
expression evaluates to minCrawlDelay not crawlDelay. Shouldn't the
[
https://issues.apache.org/jira/browse/NUTCH-469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mike Schwartz updated NUTCH-469:
Attachment: geoPosition0.6_cdiff.zip
I've attached the context diff from geoPosition 0.5 that I'm
ok, thanks - I've attached the zipped context diff to the Jira
ticket. Please let me know if you have any problems with this
- Mike
At 08:57 AM 4/24/2007, Sami Siren wrote:
Mike Schwartz wrote:
I have modified the geoPosition plugin
(http://wiki.apache.org/nutch/GeoPosition) code to work
Very briefly, with an HtmlParseFilter and a list of weighted words.
This filter examines the Parse text and adds a boost value if it finds
one of the words in the list.
This boost value is added to the ParseData MetaData.
Then, a ScoringPlugin reads this MetaData (passScoreAfterParsing) and
updates
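The word-matching step described above might look roughly like the sketch below. It is plain Java and does not use the Nutch plugin interfaces; the class name, the method name, the choice to sum the weights of all matching words, and the assumption that the listed words are lowercase are all hypothetical. In Nutch the resulting value would be stored in the ParseData metadata, which is not shown here.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

final class WeightedWordBoost {
    // Sums the weight of every listed word that occurs in the parse text.
    // Assumes the keys of 'weights' are already lowercase.
    static float boostFor(String parseText, Map<String, Float> weights) {
        String lower = parseText.toLowerCase(Locale.ROOT);
        float boost = 0f;
        for (Map.Entry<String, Float> entry : weights.entrySet()) {
            if (lower.contains(entry.getKey())) {
                boost += entry.getValue();
            }
        }
        return boost;
    }

    public static void main(String[] args) {
        Map<String, Float> weights = new HashMap<>();
        weights.put("nutch", 2.0f);   // illustrative words and weights
        weights.put("lucene", 1.5f);
        System.out.println(boostFor("Nutch builds on Lucene", weights));
    }
}
```

A substring match is used for brevity; a real filter would more likely tokenize the text so that a listed word does not match inside a longer word.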
[
https://issues.apache.org/jira/browse/NUTCH-474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Andrzej Bialecki closed NUTCH-474.
---
Resolution: Fixed
Assignee: Andrzej Bialecki
Fixed in rev. 532088. Thanks!
Fetcher2