[jira] [Commented] (TIKA-1365) Incorrectly MimeType detection for Apache Lucene web site

2015-03-19 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14368606#comment-14368606 ] Hudson commented on TIKA-1365: SUCCESS: Integrated in tika-trunk-jdk1.7 #558 (See

[jira] [Commented] (TIKA-1365) Incorrectly MimeType detection for Apache Lucene web site

2015-03-19 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14368559#comment-14368559 ] ASF GitHub Bot commented on TIKA-1365: GitHub user asfgit closed the pull request at:

[jira] [Commented] (TIKA-1365) Incorrectly MimeType detection for Apache Lucene web site

2015-03-17 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14366176#comment-14366176 ] ASF GitHub Bot commented on TIKA-1365: GitHub user mkr opened a pull request:

[jira] [Commented] (TIKA-1365) Incorrectly MimeType detection for Apache Lucene web site

2015-03-17 Thread Matthias Krueger (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14366201#comment-14366201 ] Matthias Krueger commented on TIKA-1365: Quick wrapup: * HTML starting with comment

[jira] [Commented] (TIKA-1365) Incorrectly MimeType detection for Apache Lucene web site

2014-07-16 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14063292#comment-14063292 ] Nick Burch commented on TIKA-1365: There shouldn't normally be a difference between

[jira] [Commented] (TIKA-1365) Incorrectly MimeType detection for Apache Lucene web site

2014-07-16 Thread Tien Nguyen Manh (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14063301#comment-14063301 ] Tien Nguyen Manh commented on TIKA-1365: The reason that Tyler Palsulich test with

[jira] [Commented] (TIKA-1365) Incorrectly MimeType detection for Apache Lucene web site

2014-07-16 Thread Ken Krugler (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14063467#comment-14063467 ] Ken Krugler commented on TIKA-1365: Turning the XML parser into a fuzzy parser (like

[jira] [Commented] (TIKA-1365) Incorrectly MimeType detection for Apache Lucene web site

2014-07-16 Thread Tyler Palsulich (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14063566#comment-14063566 ] Tyler Palsulich commented on TIKA-1365: Ah, [~tiennm]. Good point. I didn't even

[jira] [Commented] (TIKA-1365) Incorrectly MimeType detection for Apache Lucene web site

2014-07-16 Thread Matthias Krueger (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14063617#comment-14063617 ] Matthias Krueger commented on TIKA-1365: {code} System.out.println(new

[jira] [Commented] (TIKA-1365) Incorrectly MimeType detection for Apache Lucene web site

2014-07-16 Thread Matthias Krueger (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14063699#comment-14063699 ] Matthias Krueger commented on TIKA-1365: Some more observations: * The lower

[jira] [Commented] (TIKA-1365) Incorrectly MimeType detection for Apache Lucene web site

2014-07-16 Thread Ken Krugler (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14063744#comment-14063744 ] Ken Krugler commented on TIKA-1365: Hi Tyler - the response from fetching

[jira] [Commented] (TIKA-1365) Incorrectly MimeType detection for Apache Lucene web site

2014-07-16 Thread Ken Krugler (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14063750#comment-14063750 ] Ken Krugler commented on TIKA-1365: Hi Matthias - I agree that HTML's priority should be

[jira] [Commented] (TIKA-1365) Incorrectly MimeType detection for Apache Lucene web site

2014-07-15 Thread Tyler Palsulich (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062180#comment-14062180 ] Tyler Palsulich commented on TIKA-1365: Thanks! It looks like the html is malformed,

[jira] [Commented] (TIKA-1365) Incorrectly MimeType detection for Apache Lucene web site

2014-07-15 Thread Tien Nguyen Manh (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062979#comment-14062979 ] Tien Nguyen Manh commented on TIKA-1365: I think XMLParser throws that exception is

[jira] [Commented] (TIKA-1365) Incorrectly MimeType detection for Apache Lucene web site

2014-07-14 Thread Tyler Palsulich (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14060655#comment-14060655 ] Tyler Palsulich commented on TIKA-1365: Hi [~tiennm]. Thanks for raising this issue.

[jira] [Commented] (TIKA-1365) Incorrectly MimeType detection for Apache Lucene web site

2014-07-14 Thread Tien Nguyen Manh (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14061531#comment-14061531 ] Tien Nguyen Manh commented on TIKA-1365: [~tpalsulich] Ah yes, I tried with url