[jira] [Commented] (NUTCH-1414) Date extraction parse filter

2016-07-18 Thread Cody Amen (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15383332#comment-15383332 ] Cody Amen commented on NUTCH-1414: -- Now after I installed the parsefilter-regex the field

[jira] [Commented] (NUTCH-1414) Date extraction parse filter

2016-07-18 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15383121#comment-15383121 ] Markus Jelsma commented on NUTCH-1414: -- Java provides proper PCRE compatible regular

[jira] [Commented] (NUTCH-1414) Date extraction parse filter

2016-07-18 Thread Cody Amen (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15383092#comment-15383092 ] Cody Amen commented on NUTCH-1414: -- Yeah, I will take a look. The content is in the form

[jira] [Commented] (NUTCH-1414) Date extraction parse filter

2016-07-18 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15382987#comment-15382987 ] Markus Jelsma commented on NUTCH-1414: -- The regex parse filter NUTCH-2227 can grab st

[jira] [Commented] (NUTCH-1414) Date extraction parse filter

2016-07-18 Thread Cody Amen (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15382974#comment-15382974 ] Cody Amen commented on NUTCH-1414: -- So I definitely have a finite set of domains I am ind

[jira] [Commented] (NUTCH-1414) Date extraction parse filter

2016-07-18 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15382957#comment-15382957 ] Markus Jelsma commented on NUTCH-1414: -- It operates on the parsed text or the extract

[jira] [Commented] (NUTCH-1414) Date extraction parse filter

2016-07-18 Thread Cody Amen (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15382653#comment-15382653 ] Cody Amen commented on NUTCH-1414: -- Ok, this filter parses the whole page though right?

[jira] [Commented] (NUTCH-1414) Date extraction parse filter

2016-07-18 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15382088#comment-15382088 ] Markus Jelsma commented on NUTCH-1414: -- Sure: {code} index.parse.md metatag.des

[jira] [Commented] (NUTCH-1414) Date extraction parse filter

2016-07-17 Thread Cody Amen (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15381773#comment-15381773 ] Cody Amen commented on NUTCH-1414: -- Any configuration to get the data into Solr? > Date

[jira] [Commented] (NUTCH-1414) Date extraction parse filter

2014-01-28 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884078#comment-13884078 ] Markus Jelsma commented on NUTCH-1414: -- Ah yes, you are right. That makes sense, the

[jira] [Commented] (NUTCH-1414) Date extraction parse filter

2014-01-28 Thread Luke (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884073#comment-13884073 ] Luke commented on NUTCH-1414: - I think I found the culprit. From the link you gave, Solr want

[jira] [Commented] (NUTCH-1414) Date extraction parse filter

2014-01-28 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13883997#comment-13883997 ] Markus Jelsma commented on NUTCH-1414: -- Hi - you don't have to change it, it is alrea

[jira] [Commented] (NUTCH-1414) Date extraction parse filter

2014-01-28 Thread Luke (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13883986#comment-13883986 ] Luke commented on NUTCH-1414: - Thanks for the response. If I can follow up a bit more on passi

[jira] [Commented] (NUTCH-1414) Date extraction parse filter

2014-01-27 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13882886#comment-13882886 ] Markus Jelsma commented on NUTCH-1414: -- Hi Luke, * We send it to Solr using protecte

[jira] [Commented] (NUTCH-1414) Date extraction parse filter

2014-01-26 Thread Luke (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13882282#comment-13882282 ] Luke commented on NUTCH-1414: - Hi Markus/Others, Firstly, let me say I like this functionalit

[jira] [Commented] (NUTCH-1414) Date extraction parse filter

2012-07-09 Thread Kevin Gao (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13409497#comment-13409497 ] Kevin Gao commented on NUTCH-1414: -- Yes, that is working. Thank you very much.

[jira] [Commented] (NUTCH-1414) Date extraction parse filter

2012-07-09 Thread Kevin Gao (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13409465#comment-13409465 ] Kevin Gao commented on NUTCH-1414: -- Hi Markus: I found that in your build.xml file, we ne

[jira] [Commented] (NUTCH-1414) Date extraction parse filter

2012-07-06 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13408077#comment-13408077 ] Markus Jelsma commented on NUTCH-1414: -- Ok, this is fine by me. But do you perhaps al

[jira] [Commented] (NUTCH-1414) Date extraction parse filter

2012-07-06 Thread Julien Nioche (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13408055#comment-13408055 ] Julien Nioche commented on NUTCH-1414: -- I'm concerned about the proliferation of micr