[
https://issues.apache.org/jira/browse/NUTCH-518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12513819
]
Enis Soztutar commented on NUTCH-518:
-
Since there is no ordering among scoring filters, if we do something
[
https://issues.apache.org/jira/browse/NUTCH-518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12513821
]
Doğacan Güney commented on NUTCH-518:
-
This is another alternative. I am not suggesting that we use it but just
[
https://issues.apache.org/jira/browse/NUTCH-518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12513823
]
Doğacan Güney commented on NUTCH-518:
-
Btw, I think removing initial score arguments and merging scores in
[
https://issues.apache.org/jira/browse/NUTCH-518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12513826
]
Enis Soztutar commented on NUTCH-518:
-
I think removing initial score arguments and merging scores in
I tried running Hadoop on an NFS-mounted home directory on a single node.
As the following tip on the wiki warns, I am stuck with an issue:
Don't use DFS on an NFS mount. DFS uses locks, and NFS may be configured to
not allow them.
How can I figure out whether NFS uses locks?
Is there a workaround
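One rough way to probe this from Java is to try taking an exclusive lock on a scratch file inside the directory in question, using java.nio's FileChannel.tryLock. This is only a sketch and is not taken from Hadoop: the class name LockProbe and the probe file name are made up for illustration. A failed probe suggests the mount rejects lock requests. A successful probe only shows that this one lock call worked, not that DFS will behave on that mount.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

// Hypothetical helper for illustration; not part of Hadoop or Nutch.
final class LockProbe {

    // Returns true if an exclusive lock could be taken on a scratch file in dir.
    static boolean canLock(Path dir) {
        Path probe = dir.resolve(".lock-probe-" + System.nanoTime());
        try (FileChannel ch = FileChannel.open(probe,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            FileLock lock = ch.tryLock();   // null means another process holds it
            if (lock == null) {
                return false;
            }
            lock.release();
            return true;
        } catch (IOException e) {
            // Covers a missing directory as well as a file system that
            // refuses the lock request.
            return false;
        } finally {
            // Best-effort cleanup of the scratch file.
            try {
                Files.deleteIfExists(probe);
            } catch (IOException ignored) {
                // Nothing useful to do if cleanup fails.
            }
        }
    }

    public static void main(String[] args) {
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        System.out.println(dir.toAbsolutePath() + " lockable: " + canLock(dir));
    }
}
```

Running it with the NFS-mounted directory as the only argument prints whether the probe lock succeeded there.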
A common infrastructure for different index backends
Key: NUTCH-520
URL: https://issues.apache.org/jira/browse/NUTCH-520
Project: Nutch
Issue Type: Improvement
Components:
[
https://issues.apache.org/jira/browse/NUTCH-518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12513853
]
Andrzej Bialecki commented on NUTCH-518:
-
IMHO this change is not helpful. It takes away too much control
Modified injector to allow newly injected CrawlDatum to overwrite original
--
Key: NUTCH-521
URL: https://issues.apache.org/jira/browse/NUTCH-521
Project: Nutch
Issue
[
https://issues.apache.org/jira/browse/NUTCH-521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rob Young updated NUTCH-521:
Attachment: inject.patch
Modified injector to allow newly injected CrawlDatum to overwrite original
In org.apache.nutch.crawl.LinkDb on line 261 it creates a working
directory (newLinkDb) based on the current working directory. This
should be configurable rather than being based on where Tomcat was
started. I am planning on writing a patch to pull the hadoop.tmp.dir
setting if it is available,
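The idea described above could look roughly like the following. This is a sketch under assumptions, not the eventual patch: WorkDirSketch and resolveWorkDir are hypothetical names, a plain Map stands in for the Hadoop configuration object, and the fallback to the current working directory simply mirrors the behaviour the message complains about.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration; not code from Nutch or Hadoop.
final class WorkDirSketch {

    // Picks the parent of the temporary working directory:
    // hadoop.tmp.dir when it is set, otherwise the current working directory.
    static Path resolveWorkDir(Map<String, String> conf, String dirName) {
        String tmp = conf.get("hadoop.tmp.dir");
        Path base = (tmp == null || tmp.trim().isEmpty())
                ? Paths.get("").toAbsolutePath()   // old behaviour: wherever the JVM was started
                : Paths.get(tmp.trim());
        return base.resolve(dirName);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        // Assumed example value; use whatever your configuration defines.
        conf.put("hadoop.tmp.dir", "/tmp/hadoop-example");
        System.out.println(resolveWorkDir(conf, "linkdb-work"));
    }
}
```

With this shape the directory no longer depends on where Tomcat was started, as long as the setting is present.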
Robert Young wrote:
In org.apache.nutch.crawl.LinkDb on line 261 it creates a working
directory (newLinkDb) based on the current working directory. This
should be configurable rather than being based on where Tomcat was
started. I am planning on writing a patch to pull the hadoop.tmp.dir
[
https://issues.apache.org/jira/browse/NUTCH-521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12513870
]
Doğacan Güney commented on NUTCH-521:
-
AFAICS, you didn't give users a way to specify whether they want to
Use URLValidator in the Injector
Key: NUTCH-522
URL: https://issues.apache.org/jira/browse/NUTCH-522
Project: Nutch
Issue Type: Improvement
Components: injector
Reporter: Emmanuel Joke
[
https://issues.apache.org/jira/browse/NUTCH-522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Emmanuel Joke updated NUTCH-522:
Attachment: NUTCH-522.patch
Patch provided
Use URLValidator in the Injector
Tomcat only comes into it because we have to start Tomcat in the
searcher directory; I'm guessing it's the same however you choose to
use Nutch. It would still have to do a rename across physical volumes
if searcher.dir is set to something different, would it not?
How does this sound as a
I don't use the Nutch web application, but you don't have to
start Nutch in the searcher directory. You can set the location of
the searcher dir within the nutch-site.xml config file.
Add this node and set the location of your index:
<property>
<name>searcher.dir</name>
Hi,
After replacing it with the Throwable, it safely parsed that page, but got
the same OOM error during the parse of
http://lcweb2.loc.gov/ndlpcoop/nicmoas/livn-2/livn0181.sgm.
But this time it seems that the error occurred at line 78.
Here is the stack trace. (The same page we can't parse
[
https://issues.apache.org/jira/browse/NUTCH-522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12513895
]
Doğacan Güney commented on NUTCH-522:
-
I like the idea, but your patch seems to have a bug. Now injector only
Yes, I do this for the searcher directory, but in the LinkDb class it
makes a reference to a Path which is relative (just for a temporary
working directory). This is the problem, because if I start Tomcat in
a path where the Java user does not have permissions to create a
directory then LinkDb
Ahh, now I see what you are referring to. Thanks for the question.
Now I know why I was getting garbage in my directory a while back.
So, I guess you may need to edit that class. Are you using Hadoop in
local mode?
On 7/19/07, Robert Young wrote:
Yes, I do this for the
[
https://issues.apache.org/jira/browse/NUTCH-522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Emmanuel Joke updated NUTCH-522:
Attachment: NUTCH-522_v2.patch
Oops, my mistake. Please find an updated patch.
Actually I've a