[jira] [Commented] (NUTCH-797) parse-tika is not properly constructing URLs when the target begins with a ?

2011-10-11 Thread Andrzej Bialecki (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13124737#comment-13124737 ] Andrzej Bialecki commented on NUTCH-797: - The fixup code in Tika is still a

[jira] [Commented] (NUTCH-1097) application/xhtml+xml should be enabled in plugin.xml of parse-html; allow multiple mimetypes for plugin.xml

2011-10-11 Thread Ferdy (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13124761#comment-13124761 ] Ferdy commented on NUTCH-1097: -- Hi, As far as I know, currently parse-tika is used as a

[jira] [Updated] (NUTCH-1053) Parsing of RSS feeds fails

2011-10-11 Thread Julien Nioche (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Julien Nioche updated NUTCH-1053: - Fix Version/s: 1.5 I'd happily give an example or fix it myself if only I could find it :-)

[jira] [Updated] (NUTCH-1053) Parsing of RSS feeds fails

2011-10-11 Thread Julien Nioche (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Julien Nioche updated NUTCH-1053: - Fix Version/s: (was: 1.4) Parsing of RSS feeds fails ---

[jira] [Created] (NUTCH-1156) building errors with gora-hbase as a backend; update ivy.xml to use correct dependancies

2011-10-11 Thread Ferdy (Created) (JIRA)
building errors with gora-hbase as a backend; update ivy.xml to use correct dependancies Key: NUTCH-1156 URL: https://issues.apache.org/jira/browse/NUTCH-1156

[jira] [Created] (NUTCH-1157) building errors with gora-hbase as a backend; update ivy.xml to use correct dependancies

2011-10-11 Thread Ferdy (Created) (JIRA)
building errors with gora-hbase as a backend; update ivy.xml to use correct dependancies Key: NUTCH-1157 URL: https://issues.apache.org/jira/browse/NUTCH-1157

[jira] [Closed] (NUTCH-1157) building errors with gora-hbase as a backend; update ivy.xml to use correct dependancies

2011-10-11 Thread Ferdy (Closed) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdy closed NUTCH-1157. Resolution: Duplicate My bad, I clicked twice. See NUTCH-1156. building errors with gora-hbase as

[jira] [Commented] (NUTCH-797) parse-tika is not properly constructing URLs when the target begins with a ?

2011-10-11 Thread Andrzej Bialecki (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13125016#comment-13125016 ] Andrzej Bialecki commented on NUTCH-797: - Uhh, sorry - I'll fix this in a moment.

[jira] [Commented] (NUTCH-797) parse-tika is not properly constructing URLs when the target begins with a ?

2011-10-11 Thread Andrzej Bialecki (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13125077#comment-13125077 ] Andrzej Bialecki commented on NUTCH-797: - I'm puzzled by the algorithm in

[jira] [Commented] (NUTCH-797) parse-tika is not properly constructing URLs when the target begins with a ?

2011-10-11 Thread Markus Jelsma (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13125093#comment-13125093 ] Markus Jelsma commented on NUTCH-797: - I would expect

[jira] [Commented] (NUTCH-797) parse-tika is not properly constructing URLs when the target begins with a ?

2011-10-11 Thread Markus Jelsma (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13125142#comment-13125142 ] Markus Jelsma commented on NUTCH-797: - Mmm, I think you are correct. It's a bit confusing

[jira] [Commented] (NUTCH-1135) Fix TestGoraStorage for Nutchgora

2011-10-11 Thread Ferdy (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13125145#comment-13125145 ] Ferdy commented on NUTCH-1135: -- It seems like Gora simply tries to connect to a non-existing

[jira] [Commented] (NUTCH-1081) ant tests fail

2011-10-11 Thread Ferdy (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13125146#comment-13125146 ] Ferdy commented on NUTCH-1081: -- It seems like your patch is fine, at least as a temporary

[jira] [Commented] (NUTCH-1135) Fix TestGoraStorage for Nutchgora

2011-10-11 Thread Lewis John McGibbney (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13125147#comment-13125147 ] Lewis John McGibbney commented on NUTCH-1135: - Hi Ferdy, firstly thanks for

[jira] [Commented] (NUTCH-1081) ant tests fail

2011-10-11 Thread Lewis John McGibbney (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13125151#comment-13125151 ] Lewis John McGibbney commented on NUTCH-1081: - Thanks Ferdy. It was also my

[jira] [Updated] (NUTCH-797) parse-tika is not properly constructing URLs when the target begins with a ?

2011-10-11 Thread Andrzej Bialecki (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrzej Bialecki updated NUTCH-797: Attachment: NUTCH-797.patch Tentative patch, which changes the meaning of fixEmbeddedParams

[jira] [Commented] (NUTCH-1135) Fix TestGoraStorage for Nutchgora

2011-10-11 Thread Ferdy (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13125306#comment-13125306 ] Ferdy commented on NUTCH-1135: -- No problem I'll work out a patch that fixes the test (at

[jira] [Resolved] (NUTCH-1132) Fix TestGenerator for Nutchgora

2011-10-11 Thread Lewis John McGibbney (Resolved) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney resolved NUTCH-1132. - Resolution: Fixed Committed @ revision 1182060 in nutchgora branch

[jira] [Resolved] (NUTCH-1133) Fix TestInjector for Nutchgora

2011-10-11 Thread Lewis John McGibbney (Resolved) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney resolved NUTCH-1133. - Resolution: Fixed Committed @ revision 1182060 in nutchgora branch

[jira] [Resolved] (NUTCH-1134) Fix TestFetcher for Nutchgora

2011-10-11 Thread Lewis John McGibbney (Resolved) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney resolved NUTCH-1134. - Resolution: Fixed Committed @ revision 1182060 in nutchgora branch

[jira] [Created] (NUTCH-1158) Write JUnit tests for all nutchgora plugins

2011-10-11 Thread Lewis John McGibbney (Created) (JIRA)
Write JUnit tests for all nutchgora plugins --- Key: NUTCH-1158 URL: https://issues.apache.org/jira/browse/NUTCH-1158 Project: Nutch Issue Type: Improvement Affects Versions: nutchgora

[jira] [Created] (NUTCH-1160) Write JUnit tests for index-basic

2011-10-11 Thread Lewis John McGibbney (Created) (JIRA)
Write JUnit tests for index-basic - Key: NUTCH-1160 URL: https://issues.apache.org/jira/browse/NUTCH-1160 Project: Nutch Issue Type: Sub-task Components: indexer Affects Versions: nutchgora

[jira] [Created] (NUTCH-1161) Write JUnit tests for microformats-reltag plugin

2011-10-11 Thread Lewis John McGibbney (Created) (JIRA)
Write JUnit tests for microformats-reltag plugin Key: NUTCH-1161 URL: https://issues.apache.org/jira/browse/NUTCH-1161 Project: Nutch Issue Type: Sub-task Affects Versions: nutchgora

[jira] [Created] (NUTCH-1162) Write JUnit tests for parse-js

2011-10-11 Thread Lewis John McGibbney (Created) (JIRA)
Write JUnit tests for parse-js -- Key: NUTCH-1162 URL: https://issues.apache.org/jira/browse/NUTCH-1162 Project: Nutch Issue Type: Sub-task Components: parser Affects Versions: nutchgora

[jira] [Created] (NUTCH-1163) Write JUnit tests for protocol-ftp

2011-10-11 Thread Lewis John McGibbney (Created) (JIRA)
Write JUnit tests for protocol-ftp -- Key: NUTCH-1163 URL: https://issues.apache.org/jira/browse/NUTCH-1163 Project: Nutch Issue Type: Sub-task Affects Versions: nutchgora Reporter: Lewis John

[jira] [Created] (NUTCH-1164) Write JUnit tests for protocol-http

2011-10-11 Thread Lewis John McGibbney (Created) (JIRA)
Write JUnit tests for protocol-http --- Key: NUTCH-1164 URL: https://issues.apache.org/jira/browse/NUTCH-1164 Project: Nutch Issue Type: Sub-task Affects Versions: nutchgora Reporter: Lewis

[jira] [Created] (NUTCH-1167) Write JUnit tests for scoring-opic

2011-10-11 Thread Lewis John McGibbney (Created) (JIRA)
Write JUnit tests for scoring-opic -- Key: NUTCH-1167 URL: https://issues.apache.org/jira/browse/NUTCH-1167 Project: Nutch Issue Type: Sub-task Affects Versions: nutchgora Reporter: Lewis John

[jira] [Created] (NUTCH-1166) Write JUnit tests for scoring-link

2011-10-11 Thread Lewis John McGibbney (Created) (JIRA)
Write JUnit tests for scoring-link -- Key: NUTCH-1166 URL: https://issues.apache.org/jira/browse/NUTCH-1166 Project: Nutch Issue Type: Sub-task Components: linkdb Affects Versions: nutchgora

[jira] [Created] (NUTCH-1168) Write JUnit tests for tld

2011-10-11 Thread Lewis John McGibbney (Created) (JIRA)
Write JUnit tests for tld - Key: NUTCH-1168 URL: https://issues.apache.org/jira/browse/NUTCH-1168 Project: Nutch Issue Type: Sub-task Affects Versions: nutchgora Reporter: Lewis John McGibbney

unsubscribe

2011-10-11 Thread Dr. Klaus Mapara
On 11.10.2011 at 22:11, Lewis John McGibbney (Resolved) (JIRA) wrote: [ https://issues.apache.org/jira/browse/NUTCH-1132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney resolved NUTCH-1132. -

[jira] [Created] (NUTCH-1169) Write JUnit tests for urlfilter-prefix

2011-10-11 Thread Lewis John McGibbney (Created) (JIRA)
Write JUnit tests for urlfilter-prefix -- Key: NUTCH-1169 URL: https://issues.apache.org/jira/browse/NUTCH-1169 Project: Nutch Issue Type: Sub-task Affects Versions: nutchgora Reporter:

[jira] [Created] (NUTCH-1170) Write JUnit tests for urlfilter-validator

2011-10-11 Thread Lewis John McGibbney (Created) (JIRA)
Write JUnit tests for urlfilter-validator - Key: NUTCH-1170 URL: https://issues.apache.org/jira/browse/NUTCH-1170 Project: Nutch Issue Type: Sub-task Affects Versions: nutchgora

[jira] [Reopened] (NUTCH-623) Change plugin source directory languageidentifier to language-identifier

2011-10-11 Thread Lewis John McGibbney (Reopened) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney reopened NUTCH-623: reopening and applying to nutchgora branch as this is a fairly trivial mapping

[jira] [Commented] (NUTCH-1097) application/xhtml+xml should be enabled in plugin.xml of parse-html; allow multiple mimetypes for plugin.xml

2011-10-11 Thread Lewis John McGibbney (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13125367#comment-13125367 ] Lewis John McGibbney commented on NUTCH-1097: - Does anyone else have input for

[jira] [Commented] (NUTCH-1097) application/xhtml+xml should be enabled in plugin.xml of parse-html; allow multiple mimetypes for plugin.xml

2011-10-11 Thread Andrzej Bialecki (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13125414#comment-13125414 ] Andrzej Bialecki commented on NUTCH-1097: -- +1 the idea makes sense. Patch looks

[jira] [Updated] (NUTCH-623) Change plugin source directory languageidentifier to language-identifier

2011-10-11 Thread Lewis John McGibbney (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-623: --- Attachment: NUTCH-623-nutchgora-20111011.patch patch attachment for nutchgora branch

[jira] [Closed] (NUTCH-623) Change plugin source directory languageidentifier to language-identifier

2011-10-11 Thread Lewis John McGibbney (Closed) (JIRA)
John McGibbney Priority: Trivial Fix For: 1.4, nutchgora Attachments: NUTCH-623-branch-1.4-20110810.patch, NUTCH-623-branch-1.4-20110810.patch, NUTCH-623-branch-1.4-20110910-v2.patch, NUTCH-623-nutchgora-20111011.patch, NUTCH-623-trunk-1.4-20110924.patch, NUTCH

[jira] [Resolved] (NUTCH-623) Change plugin source directory languageidentifier to language-identifier

2011-10-11 Thread Lewis John McGibbney (Resolved) (JIRA)
-branch-1.4-20110810.patch, NUTCH-623-branch-1.4-20110910-v2.patch, NUTCH-623-nutchgora-20111011.patch, NUTCH-623-trunk-1.4-20110924.patch, NUTCH-623-trunk-2.0-20110810.patch When trying to develop and debug Nutch in eclipse, following the instructions at http://wiki.apache.org/nutch

[jira] [Commented] (NUTCH-1098) better url-normalizer basic

2011-10-11 Thread Lewis John McGibbney (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13125429#comment-13125429 ] Lewis John McGibbney commented on NUTCH-1098: - Hi Radim are you happy with

[jira] [Commented] (NUTCH-1005) Index headings plugin

2011-10-11 Thread Lewis John McGibbney (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13125436#comment-13125436 ] Lewis John McGibbney commented on NUTCH-1005: - Hi Markus Julien, I really

[jira] [Resolved] (NUTCH-629) Detect slow and timeout servers and drop their URLs

2011-10-11 Thread Lewis John McGibbney (Resolved) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney resolved NUTCH-629. Resolution: Won't Fix As Otis is no longer with us, and as per Markus' comments I

[jira] [Commented] (NUTCH-628) Host database to keep track of host-level information

2011-10-11 Thread Lewis John McGibbney (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13125443#comment-13125443 ] Lewis John McGibbney commented on NUTCH-628: Hi Markus, can you confirm if this