[jira] Commented: (NUTCH-21) parser plugin for MS PowerPoint slides

2005-09-01 Thread Stephan Strittmatter (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-21?page=comments#action_12320763 ] Stephan Strittmatter commented on NUTCH-21: I will verify the unit tests by next week!

[jira] Updated: (NUTCH-21) parser plugin for MS PowerPoint slides

2005-08-02 Thread Stephan Strittmatter (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-21?page=all ] Stephan Strittmatter updated NUTCH-21: Attachment: parse-mspowerpoint.zip. Updated the plugin sources to match the changed Nutch interface.

[jira] Updated: (NUTCH-20) Extract urls from plain texts

2005-08-02 Thread Stephan Strittmatter (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-20?page=all ] Stephan Strittmatter updated NUTCH-20: Description: Some parsers, e.g. the Word parser, return no Outlinks. This class is able to extract (absolute) hyperlinks from a plain String.
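
To illustrate the idea described in NUTCH-20, here is a minimal sketch of pulling absolute hyperlinks out of a plain String with a regular expression. It is not the class attached to the issue: the class name `PlainTextLinkExtractor`, the method `extractUrls`, the URL pattern, and the set of trailing punctuation characters that get trimmed are all assumptions made for this example, and a real parser plugin would wrap the results in Nutch's own Outlink type.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper, not the class from the NUTCH-20 attachment. */
public class PlainTextLinkExtractor {

    // Assumed pattern: an http, https, or ftp scheme followed by any run of
    // characters that are not whitespace, angle brackets, or quotes.
    private static final Pattern URL_PATTERN = Pattern.compile(
            "\\b(?:https?|ftp)://[^\\s<>\"']+", Pattern.CASE_INSENSITIVE);

    // Punctuation that usually belongs to the surrounding sentence, not the URL.
    private static final String TRAILING_PUNCTUATION = ".,;:!?)";

    /** Returns the absolute URLs found in the text, in order of appearance. */
    public static List<String> extractUrls(String text) {
        List<String> urls = new ArrayList<>();
        if (text == null) {
            return urls;
        }
        Matcher matcher = URL_PATTERN.matcher(text);
        while (matcher.find()) {
            String url = matcher.group();
            // Trim sentence punctuation that the greedy match picked up at the end.
            int end = url.length();
            while (end > 0 && TRAILING_PUNCTUATION.indexOf(url.charAt(end - 1)) >= 0) {
                end--;
            }
            urls.add(url.substring(0, end));
        }
        return urls;
    }
}
```

A caller such as a Word parser could pass its extracted document text to `PlainTextLinkExtractor.extractUrls(text)` and turn each returned string into an outlink. Relative links are ignored in this sketch because plain text carries no base URL to resolve them against.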