[
https://issues.apache.org/jira/browse/TIKA-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14368606#comment-14368606
]
Hudson commented on TIKA-1365:
--
SUCCESS: Integrated in tika-trunk-jdk1.7 #558 (See
[
https://issues.apache.org/jira/browse/TIKA-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14368559#comment-14368559
]
ASF GitHub Bot commented on TIKA-1365:
--
GitHub user asfgit closed the pull request at:
[
https://issues.apache.org/jira/browse/TIKA-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366176#comment-14366176
]
ASF GitHub Bot commented on TIKA-1365:
--
GitHub user mkr opened a pull request:
[
https://issues.apache.org/jira/browse/TIKA-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366201#comment-14366201
]
Matthias Krueger commented on TIKA-1365:
Quick wrapup:
* HTML starting with comment
[
https://issues.apache.org/jira/browse/TIKA-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14063292#comment-14063292
]
Nick Burch commented on TIKA-1365:
--
There shouldn't normally be a difference between
[
https://issues.apache.org/jira/browse/TIKA-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14063301#comment-14063301
]
Tien Nguyen Manh commented on TIKA-1365:
The reason that Tyler Palsulich test with
[
https://issues.apache.org/jira/browse/TIKA-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14063467#comment-14063467
]
Ken Krugler commented on TIKA-1365:
---
Turning the XML parser into a fuzzy parser (like
[
https://issues.apache.org/jira/browse/TIKA-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14063566#comment-14063566
]
Tyler Palsulich commented on TIKA-1365:
---
Ah, [~tiennm]. Good point. I didn't even
[
https://issues.apache.org/jira/browse/TIKA-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14063617#comment-14063617
]
Matthias Krueger commented on TIKA-1365:
{code}
System.out.println(new
[
https://issues.apache.org/jira/browse/TIKA-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14063699#comment-14063699
]
Matthias Krueger commented on TIKA-1365:
Some more observations:
* The lower
[
https://issues.apache.org/jira/browse/TIKA-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14063744#comment-14063744
]
Ken Krugler commented on TIKA-1365:
---
Hi Tyler - the response from fetching
[
https://issues.apache.org/jira/browse/TIKA-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14063750#comment-14063750
]
Ken Krugler commented on TIKA-1365:
---
Hi Matthias - I agree that HTML's priority should be
[
https://issues.apache.org/jira/browse/TIKA-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14062180#comment-14062180
]
Tyler Palsulich commented on TIKA-1365:
---
Thanks! It looks like the HTML is malformed,
[
https://issues.apache.org/jira/browse/TIKA-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14062979#comment-14062979
]
Tien Nguyen Manh commented on TIKA-1365:
I think XMLParser throws that exception is
[
https://issues.apache.org/jira/browse/TIKA-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14060655#comment-14060655
]
Tyler Palsulich commented on TIKA-1365:
---
Hi [~tiennm]. Thanks for raising this issue.
[
https://issues.apache.org/jira/browse/TIKA-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14061531#comment-14061531
]
Tien Nguyen Manh commented on TIKA-1365:
[~tpalsulich] Ah yes,
I tried with url
16 matches