[Nutch-dev] [jira] Commented: (NUTCH-395) Increase fetching speed

2006-11-13 Thread Andrzej Bialecki (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-395?page=comments#action_12449292 ] Andrzej Bialecki commented on NUTCH-395: +1 - this patch looks good to me - if you could just fix the whitespace issues prior to committing, so that it

[Nutch-dev] Last-modified http field

2006-11-13 Thread Javier Parapar Lopez
Hi, I am trying to implement an indexing and parsing plugin for a specific purpose. I need to get the Last-Modified HTTP header of the HTML documents to estimate their publishing date. If I try with

[Nutch-dev] Warning: message 1GjHKk-0001Xh-Io delayed 24 hours

2006-11-13 Thread Mail Delivery System
This message was created automatically by mail delivery software. A message that you sent has not yet been delivered to one or more of its recipients after more than 24 hours on the queue on externalmx-1.sourceforge.net. The message identifier is: 1GjHKk-0001Xh-Io The subject of the message

Re: [Nutch-dev] Last-modified http field

2006-11-13 Thread Javier P. L.
Hi, I am trying to implement an indexing and parsing plugin for a specific purpose. I need to get the Last-Modified HTTP header of the HTML documents to estimate their publishing date. If I try with

[Nutch-dev] [jira] Commented: (NUTCH-400) Update add missing license headers

2006-11-13 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-400?page=comments#action_12449440 ] Sami Siren commented on NUTCH-400: I updated headers and added missing headers to .java files in trunk. There are still plenty of (.xml, .jsp, .html, .properties)

[Nutch-dev] [jira] Created: (NUTCH-401) Hardcoded /tmp directory in SegmentReader

2006-11-13 Thread Rod Taylor (JIRA)
Hardcoded /tmp directory in SegmentReader. Key: NUTCH-401; URL: http://issues.apache.org/jira/browse/NUTCH-401; Project: Nutch; Issue Type: Bug; Affects Versions: 0.8.2, 0.9.0; Reporter:

[Nutch-dev] [jira] Resolved: (NUTCH-395) Increase fetching speed

2006-11-13 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-395?page=all ] Sami Siren resolved NUTCH-395. Fix Version/s: 0.9.0; Resolution: Fixed. Applied to trunk with some additional whitespace changes. Increase fetching speed

[Nutch-dev] Nutch requires now Java 1.5

2006-11-13 Thread Andrzej Bialecki
Hi all, As a consequence of recent commits, and following the decision made in Hadoop regarding newer releases, Nutch trunk/ as of now requires at least Java 1.5 to compile and run. This also means that in trunk/ we can use Java 5 language features in new files, or when refactoring existing

[Nutch-dev] [jira] Commented: (NUTCH-401) Hardcoded /tmp directory in SegmentReader

2006-11-13 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-401?page=comments#action_12449485 ] Sami Siren commented on NUTCH-401: Shouldn't this directory be configurable? I found it because of permission issues (/tmp isn't globally writable to catch stuff

Re: [Nutch-dev] [jira] Resolved: (NUTCH-395) Increase fetching speed

2006-11-13 Thread AJ Chen
Sami, Thanks for resolving this serious issue. I just updated my code from trunk and plan to test fetch speed. But there is a runtime error related to switching from UTF8 to Text. Since the error is from Hadoop, how do I fix it? java.lang.ClassCastException: org.apache.hadoop.io.UTF8 at