[jira] Commented: (NUTCH-810) Upgrade to Tika 0.7

2010-04-06 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12854332#action_12854332 ] Hudson commented on NUTCH-810: -- Integrated in Nutch-trunk #1116 (See

[jira] Commented: (NUTCH-779) Mechanism for passing metadata from parse to crawldb

2010-03-30 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12851719#action_12851719 ] Hudson commented on NUTCH-779: -- Integrated in Nutch-trunk #1112 (See

[jira] Commented: (NUTCH-784) CrawlDBScanner

2010-03-29 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12851238#action_12851238 ] Hudson commented on NUTCH-784: -- Integrated in Nutch-trunk # (See

[jira] Commented: (NUTCH-740) Configuration option to override default language for fetched pages.

2010-03-22 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12848537#action_12848537 ] Hudson commented on NUTCH-740: -- Integrated in Nutch-trunk #1104 (See

[jira] Commented: (NUTCH-762) Alternative Generator which can generate several segments in one parse of the crawlDB

2010-03-22 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12848536#action_12848536 ] Hudson commented on NUTCH-762: -- Integrated in Nutch-trunk #1104 (See

[jira] Commented: (NUTCH-787) Upgrade Lucene to 3.0.1.

2010-03-19 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12847709#action_12847709 ] Hudson commented on NUTCH-787: -- Integrated in Nutch-trunk #1101 (See

[jira] Commented: (NUTCH-803) Upgrade Hadoop to 0.20.2

2010-03-19 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12847710#action_12847710 ] Hudson commented on NUTCH-803: -- Integrated in Nutch-trunk #1101 (See

[jira] Commented: (NUTCH-796) Zero results problems difficult to troubleshoot due to lack of logging

2010-03-18 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12847219#action_12847219 ] Hudson commented on NUTCH-796: -- Integrated in Nutch-trunk #1100 (See

[jira] Commented: (NUTCH-801) Remove RTF and MP3 parse plugins

2010-03-11 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12844352#action_12844352 ] Hudson commented on NUTCH-801: -- Integrated in Nutch-trunk #1093 (See

[jira] Commented: (NUTCH-798) Upgrade to SOLR1.4

2010-03-11 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12844351#action_12844351 ] Hudson commented on NUTCH-798: -- Integrated in Nutch-trunk #1093 (See

[jira] Commented: (NUTCH-799) SOLRIndexer to commit once all reducers have finished

2010-03-05 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12842177#action_12842177 ] Hudson commented on NUTCH-799: -- Integrated in Nutch-trunk #1087 (See

[jira] Commented: (NUTCH-782) Ability to order htmlparsefilters

2010-03-01 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12840002#action_12840002 ] Hudson commented on NUTCH-782: -- Integrated in Nutch-trunk #1083 (See

[jira] Commented: (NUTCH-719) fetchQueues.totalSize incorrect in Fetcher2

2010-02-19 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12836125#action_12836125 ] Hudson commented on NUTCH-719: -- Integrated in Nutch-trunk #1074 (See

[jira] Commented: (NUTCH-793) search.jsp compile errors

2010-02-16 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12834659#action_12834659 ] Hudson commented on NUTCH-793: -- Integrated in Nutch-trunk #1071 (See

[jira] Commented: (NUTCH-794) Language Identification must use check the parse metadata for language values

2010-02-16 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12834657#action_12834657 ] Hudson commented on NUTCH-794: -- Integrated in Nutch-trunk #1071 (See

[jira] Commented: (NUTCH-766) Tika parser

2010-02-16 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12834658#action_12834658 ] Hudson commented on NUTCH-766: -- Integrated in Nutch-trunk #1071 (See

[jira] Commented: (NUTCH-792) Nutch version still contains 1.0

2010-02-14 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12833690#action_12833690 ] Hudson commented on NUTCH-792: -- Integrated in Nutch-trunk #1069 (See

[jira] Commented: (NUTCH-790) Some external javadoc links are broken

2010-02-14 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12833689#action_12833689 ] Hudson commented on NUTCH-790: -- Integrated in Nutch-trunk #1069 (See

[jira] Commented: (NUTCH-781) Update Tika to v0.6 for the MimeType detection

2010-02-02 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12828968#action_12828968 ] Hudson commented on NUTCH-781: -- Integrated in Nutch-trunk #1059 (See

[jira] Commented: (NUTCH-775) Enhance Searcher interface

2010-02-01 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12828459#action_12828459 ] Hudson commented on NUTCH-775: -- Integrated in Nutch-trunk #1058 (See

[jira] Commented: (NUTCH-781) Update Tika to v0.6 for the MimeType detection

2010-02-01 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12828458#action_12828458 ] Hudson commented on NUTCH-781: -- Integrated in Nutch-trunk #1058 (See

[jira] Commented: (NUTCH-767) Update Tika to v0.5 for the MimeType detection

2010-01-11 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12799072#action_12799072 ] Hudson commented on NUTCH-767: -- Integrated in Nutch-trunk #1037 (See

[jira] Commented: (NUTCH-269) CrawlDbReducer: OOME because no upper-bound on inlinks count

2010-01-08 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12798305#action_12798305 ] Hudson commented on NUTCH-269: -- Integrated in Nutch-trunk #1034 (See

[jira] Commented: (NUTCH-768) Upgrade Nutch 1.0 to use Hadoop 0.20

2009-12-18 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12792782#action_12792782 ] Hudson commented on NUTCH-768: -- Integrated in Nutch-trunk #1015 (See

[jira] Commented: (NUTCH-777) Upgrading to jetty6 broke unit tests

2009-12-18 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12792783#action_12792783 ] Hudson commented on NUTCH-777: -- Integrated in Nutch-trunk #1015 (See

[jira] Commented: (NUTCH-767) Update Tika to v0.5 for the MimeType detection

2009-12-04 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12786339#action_12786339 ] Hudson commented on NUTCH-767: -- Integrated in Nutch-trunk #1002 (See

[jira] Commented: (NUTCH-753) Prevent new Fetcher to retrieve the robots twice

2009-11-28 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12783235#action_12783235 ] Hudson commented on NUTCH-753: -- Integrated in Nutch-trunk #995 (See

[jira] Commented: (NUTCH-773) some minor bugs in AbstractFetchSchedule.java

2009-11-28 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12783238#action_12783238 ] Hudson commented on NUTCH-773: -- Integrated in Nutch-trunk #995 (See

[jira] Commented: (NUTCH-772) Upgrade Nutch to use Lucene 2.9.1

2009-11-28 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12783236#action_12783236 ] Hudson commented on NUTCH-772: -- Integrated in Nutch-trunk #995 (See

[jira] Commented: (NUTCH-760) Allow field mapping from nutch to solr index

2009-11-28 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12783237#action_12783237 ] Hudson commented on NUTCH-760: -- Integrated in Nutch-trunk #995 (See

[jira] Commented: (NUTCH-765) Allow Crawl class to call Either Solr or Lucene Indexer

2009-11-28 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12783234#action_12783234 ] Hudson commented on NUTCH-765: -- Integrated in Nutch-trunk #995 (See

[jira] Commented: (NUTCH-761) Avoid cloningCrawlDatum in CrawlDbReducer

2009-11-28 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12783239#action_12783239 ] Hudson commented on NUTCH-761: -- Integrated in Nutch-trunk #995 (See

[jira] Commented: (NUTCH-738) Close SegmentUpdater when FetchedSegments is closed

2009-11-28 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12783359#action_12783359 ] Hudson commented on NUTCH-738: -- Integrated in Nutch-trunk #996 (See

[jira] Commented: (NUTCH-741) Job file includes multiple copies of nutch config files.

2009-11-28 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12783357#action_12783357 ] Hudson commented on NUTCH-741: -- Integrated in Nutch-trunk #996 (See

[jira] Commented: (NUTCH-712) ParseOutputFormat should catch java.net.MalformedURLException coming from normalizers

2009-11-28 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12783360#action_12783360 ] Hudson commented on NUTCH-712: -- Integrated in Nutch-trunk #996 (See

[jira] Commented: (NUTCH-746) NutchBeanConstructor does not close NutchBean upon contextDestroyed, causing resource leak in the container.

2009-11-28 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12783356#action_12783356 ] Hudson commented on NUTCH-746: -- Integrated in Nutch-trunk #996 (See

[jira] Commented: (NUTCH-739) SolrDeleteDuplications too slow when using hadoop

2009-11-28 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12783358#action_12783358 ] Hudson commented on NUTCH-739: -- Integrated in Nutch-trunk #996 (See

[jira] Commented: (NUTCH-679) Fetcher2 implementing Tool

2009-10-09 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12764295#action_12764295 ] Hudson commented on NUTCH-679: -- Integrated in Nutch-trunk #959 (See

[jira] Commented: (NUTCH-754) Use GenericOptionsParser instead of FileSystem.parseArgs()

2009-10-09 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12764294#action_12764294 ] Hudson commented on NUTCH-754: -- Integrated in Nutch-trunk #959 (See

[jira] Commented: (NUTCH-707) Generation of multiple segments in multiple runs returns only 1 segment

2009-10-09 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12764293#action_12764293 ] Hudson commented on NUTCH-707: -- Integrated in Nutch-trunk #959 (See

[jira] Commented: (NUTCH-756) CrawlDatum.set() does not reset Metadata if it is null

2009-10-09 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12764297#action_12764297 ] Hudson commented on NUTCH-756: -- Integrated in Nutch-trunk #959 (See

[jira] Commented: (NUTCH-757) RequestUtils getBooleanParameter() always returns false

2009-10-09 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12764299#action_12764299 ] Hudson commented on NUTCH-757: -- Integrated in Nutch-trunk #959 (See

[jira] Commented: (NUTCH-758) Set subversion eol-style to native

2009-10-09 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12764300#action_12764300 ] Hudson commented on NUTCH-758: -- Integrated in Nutch-trunk #959 (See

[jira] Commented: (NUTCH-702) Lazy Instanciation of Metadata in CrawlDatum

2009-09-08 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12752913#action_12752913 ] Hudson commented on NUTCH-702: -- Integrated in Nutch-trunk #929 (See

[jira] Commented: (NUTCH-735) crawl-tool.xml must be read before nutch-site.xml when invoked using crawl command

2009-06-07 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12717137#action_12717137 ] Hudson commented on NUTCH-735: -- Integrated in Nutch-trunk #838 (See

[jira] Commented: (NUTCH-721) Fetcher2 Slow

2009-04-02 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12695233#action_12695233 ] Hudson commented on NUTCH-721: -- Integrated in Nutch-trunk #772 (See

[jira] Commented: (NUTCH-725) NOTICE.txt is lacking info that should be there

2009-03-19 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12683745#action_12683745 ] Hudson commented on NUTCH-725: -- Integrated in Nutch-trunk #758 (See

[jira] Commented: (NUTCH-727) Add KEYS file to release artifact

2009-03-19 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12683743#action_12683743 ] Hudson commented on NUTCH-727: -- Integrated in Nutch-trunk #758 (See

[jira] Commented: (NUTCH-723) LICENCE.txt is lacking info that should be there

2009-03-19 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12683742#action_12683742 ] Hudson commented on NUTCH-723: -- Integrated in Nutch-trunk #758 (See

[jira] Commented: (NUTCH-715) Subcollection plugin doesn't work with default subcollections.xml file

2009-03-10 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12680749#action_12680749 ] Hudson commented on NUTCH-715: -- Integrated in Nutch-trunk #749 (See

[jira] Commented: (NUTCH-684) Dedup support for Solr

2009-03-09 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12680374#action_12680374 ] Hudson commented on NUTCH-684: -- Integrated in Nutch-trunk #748 (See

[jira] Commented: (NUTCH-711) Indexer failing after upgrade to Hadoop 0.19.1

2009-03-04 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12679064#action_12679064 ] Hudson commented on NUTCH-711: -- Integrated in Nutch-trunk #743 (See

[jira] Commented: (NUTCH-669) Consolidate code for Fetcher and Fetcher2

2009-03-03 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12678573#action_12678573 ] Hudson commented on NUTCH-669: -- Integrated in Nutch-trunk #742 (See

[jira] Commented: (NUTCH-419) unavailable robots.txt kills fetch

2009-03-03 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12678574#action_12678574 ] Hudson commented on NUTCH-419: -- Integrated in Nutch-trunk #742 (See

[jira] Commented: (NUTCH-703) Upgrade to Hadoop 0.19.1

2009-02-27 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12677646#action_12677646 ] Hudson commented on NUTCH-703: -- Integrated in Nutch-trunk #738 (See

[jira] Commented: (NUTCH-699) Add an official solr schema for solr integration

2009-02-27 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12677647#action_12677647 ] Hudson commented on NUTCH-699: -- Integrated in Nutch-trunk #738 (See

[jira] Commented: (NUTCH-626) fetcher2 breaks out the domain with db.ignore.external.links set at cross domain redirects

2009-02-24 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12676496#action_12676496 ] Hudson commented on NUTCH-626: -- Integrated in Nutch-trunk #735 (See

[jira] Commented: (NUTCH-247) robot parser to restrict.

2009-02-24 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12676495#action_12676495 ] Hudson commented on NUTCH-247: -- Integrated in Nutch-trunk #735 (See

[jira] Commented: (NUTCH-698) CrawlDb is corrupted after a few crawl cycles

2009-02-24 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12676497#action_12676497 ] Hudson commented on NUTCH-698: -- Integrated in Nutch-trunk #735 (See

[jira] Commented: (NUTCH-694) Distributed Search Server fails

2009-02-23 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12676176#action_12676176 ] Hudson commented on NUTCH-694: -- Integrated in Nutch-trunk #734 (See

[jira] Commented: (NUTCH-695) incorrect mime type detection by MoreIndexingFilter plugin

2009-02-19 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12675235#action_12675235 ] Hudson commented on NUTCH-695: -- Integrated in Nutch-trunk #730 (See

[jira] Commented: (NUTCH-563) Include custom fields in BasicQueryFilter

2009-02-18 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12674885#action_12674885 ] Hudson commented on NUTCH-563: -- Integrated in Nutch-trunk #729 (See

[jira] Commented: (NUTCH-688) Fix missing/wrong headers in source files

2009-02-18 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12674888#action_12674888 ] Hudson commented on NUTCH-688: -- Integrated in Nutch-trunk #729 (See

[jira] Commented: (NUTCH-691) Update jakarta poi jars to the most relevant version

2009-02-18 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12674887#action_12674887 ] Hudson commented on NUTCH-691: -- Integrated in Nutch-trunk #729 (See

[jira] Commented: (NUTCH-683) NUTCH-676 broke CrawlDbMerger

2009-02-11 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12672871#action_12672871 ] Hudson commented on NUTCH-683: -- Integrated in Nutch-trunk #722 (See

[jira] Commented: (NUTCH-676) MapWritable is written inefficiently and confusingly

2009-02-11 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12672870#action_12672870 ] Hudson commented on NUTCH-676: -- Integrated in Nutch-trunk #722 (See

[jira] Commented: (NUTCH-636) Http client plug-in https doesn't work on IBM JRE

2009-02-06 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12671406#action_12671406 ] Hudson commented on NUTCH-636: -- Integrated in Nutch-trunk #717 (See

[jira] Commented: (NUTCH-643) ClassCastException in PdfParser on encrypted PDF with empty password

2009-02-06 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12671407#action_12671407 ] Hudson commented on NUTCH-643: -- Integrated in Nutch-trunk #717 (See

[jira] Commented: (NUTCH-279) Additions for regex-normalize

2009-02-03 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12670230#action_12670230 ] Hudson commented on NUTCH-279: -- Integrated in Nutch-trunk #714 (See

[jira] Commented: (NUTCH-671) JSP errors in Nutch searcher webapp running with Tomcat 6

2009-02-03 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12670231#action_12670231 ] Hudson commented on NUTCH-671: -- Integrated in Nutch-trunk #714 (See

[jira] Commented: (NUTCH-682) SOLR indexer does not set boost on the document

2009-01-29 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12668736#action_12668736 ] Hudson commented on NUTCH-682: -- Integrated in Nutch-trunk #709 (See

[jira] Commented: (NUTCH-571) parse-mp3 plugin doesn't always index album of mp3

2009-01-28 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12668324#action_12668324 ] Hudson commented on NUTCH-571: -- Integrated in Nutch-trunk #708 (See

[jira] Commented: (NUTCH-680) Update external jars to latest versions

2009-01-27 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12667930#action_12667930 ] Hudson commented on NUTCH-680: -- Integrated in Nutch-trunk #707 (See

[jira] Commented: (NUTCH-628) Host database to keep track of host-level information

2009-01-27 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12667929#action_12667929 ] Hudson commented on NUTCH-628: -- Integrated in Nutch-trunk #707 (See

[jira] Commented: (NUTCH-680) Update external jars to latest versions

2009-01-24 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12667047#action_12667047 ] Hudson commented on NUTCH-680: -- Integrated in Nutch-trunk #704 (See

[jira] Commented: (NUTCH-676) MapWritable is written inefficiently and confusingly

2009-01-21 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12666046#action_12666046 ] Hudson commented on NUTCH-676: -- Integrated in Nutch-trunk #701 (See

[jira] Commented: (NUTCH-681) parse-mp3 compilation problem

2009-01-21 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12666047#action_12666047 ] Hudson commented on NUTCH-681: -- Integrated in Nutch-trunk #701 (See

[jira] Commented: (NUTCH-579) Feed plugin only indexes one post per feed due to identical digest

2009-01-21 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12666045#action_12666045 ] Hudson commented on NUTCH-579: -- Integrated in Nutch-trunk #701 (See

[jira] Commented: (NUTCH-678) Hadoop 0.19 requires an update of jets3t

2009-01-19 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12665327#action_12665327 ] Hudson commented on NUTCH-678: -- Integrated in Nutch-trunk #699 (See

[jira] Commented: (NUTCH-627) Minimize host address lookup

2009-01-13 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12663619#action_12663619 ] Hudson commented on NUTCH-627: -- Integrated in Nutch-trunk #692 (See

[jira] Commented: (NUTCH-652) AdaptiveFetchSchedule#setFetchSchedule doesn't calculate fetch interval correctly

2009-01-12 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12663223#action_12663223 ] Hudson commented on NUTCH-652: -- Integrated in Nutch-trunk #691 (See

[jira] Commented: (NUTCH-668) Domain URL Filter

2009-01-12 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12663224#action_12663224 ] Hudson commented on NUTCH-668: -- Integrated in Nutch-trunk #691 (See

[jira] Commented: (NUTCH-594) Serve Nutch search results in multiple formats including XML and JSON

2009-01-12 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12663226#action_12663226 ] Hudson commented on NUTCH-594: -- Integrated in Nutch-trunk #691 (See

[jira] Commented: (NUTCH-442) Integrate Solr/Nutch

2009-01-12 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12663225#action_12663225 ] Hudson commented on NUTCH-442: -- Integrated in Nutch-trunk #691 (See

[jira] Commented: (NUTCH-667) Input Format for working with Content in Hadoop Streaming

2008-12-17 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12657684#action_12657684 ] Hudson commented on NUTCH-667: -- Integrated in Nutch-trunk #667 (See

[jira] Commented: (NUTCH-663) Upgrade Nutch to use Hadoop 0.19

2008-12-17 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12657682#action_12657682 ] Hudson commented on NUTCH-663: -- Integrated in Nutch-trunk #667 (See

[jira] Commented: (NUTCH-646) New Indexing Framework for Nutch

2008-12-17 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12657685#action_12657685 ] Hudson commented on NUTCH-646: -- Integrated in Nutch-trunk #667 (See

[jira] Commented: (NUTCH-665) Search Load Testing Tool

2008-12-17 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12657683#action_12657683 ] Hudson commented on NUTCH-665: -- Integrated in Nutch-trunk #667 (See

[jira] Commented: (NUTCH-635) LinkAnalysis Tool for Nutch

2008-12-17 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12657686#action_12657686 ] Hudson commented on NUTCH-635: -- Integrated in Nutch-trunk #667 (See

[jira] Commented: (NUTCH-662) Upgrade Nutch to use Lucene 2.4

2008-12-17 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12657688#action_12657688 ] Hudson commented on NUTCH-662: -- Integrated in Nutch-trunk #667 (See

[jira] Commented: (NUTCH-647) Resolve URLs tool

2008-12-17 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12657687#action_12657687 ] Hudson commented on NUTCH-647: -- Integrated in Nutch-trunk #667 (See

[jira] Commented: (NUTCH-621) Nutch needs to declare it's crypto usage

2008-09-30 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12635790#action_12635790 ] Hudson commented on NUTCH-621: -- Integrated in Nutch-trunk #585 (See

[jira] Commented: (NUTCH-653) Upgrade to hadoop 0.18

2008-09-24 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12634370#action_12634370 ] Hudson commented on NUTCH-653: -- Integrated in Nutch-trunk #582 (See

[jira] Commented: (NUTCH-651) Remove bin/{start|stop}-balancer.sh from svn tracking

2008-09-24 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12634371#action_12634371 ] Hudson commented on NUTCH-651: -- Integrated in Nutch-trunk #582 (See

[jira] Commented: (NUTCH-375) Link to 0.8.x apidocs broken on website

2008-09-22 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12633614#action_12633614 ] Hudson commented on NUTCH-375: -- Integrated in Nutch-trunk #580 (See

[jira] Commented: (NUTCH-651) Remove bin/{start|stop}-balancer.sh from svn tracking

2008-09-22 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12633616#action_12633616 ] Hudson commented on NUTCH-651: -- Integrated in Nutch-trunk #580 (See

[jira] Commented: (NUTCH-633) ParseSegment no longer allow reparsing

2008-09-22 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12633615#action_12633615 ] Hudson commented on NUTCH-633: -- Integrated in Nutch-trunk #580 (See

[jira] Commented: (NUTCH-639) Change LuceneDocumentWrapper visibility from private to protected

2008-09-20 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12633042#action_12633042 ] Hudson commented on NUTCH-639: -- Integrated in Nutch-trunk #578 (See

[jira] Commented: (NUTCH-642) Unit tests fail when run in non-local mode

2008-08-18 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12623553#action_12623553 ] Hudson commented on NUTCH-642: -- Integrated in Nutch-trunk #545 (See

[jira] Commented: (NUTCH-634) Patch - Nutch - Hadoop 0.17.1

2008-07-21 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12615503#action_12615503 ] Hudson commented on NUTCH-634: -- Integrated in Nutch-trunk #516 (See
