Hans Brende created TIKA-3151:
-
Summary: Update jaxb-runtime and remove activation dependencies &
exclusions
Key: TIKA-3151
URL: https://issues.apache.org/jira/browse/TIKA-3151
Project: Tika
[
https://issues.apache.org/jira/browse/TIKA-2819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16759255#comment-16759255
]
Hans Brende commented on TIKA-2819:
---
[~talli...@apache.org] You're welcome!
However, it looks like you
Hans Brende created TIKA-2819:
-
Summary: Update jaxb & activation
Key: TIKA-2819
URL: https://issues.apache.org/jira/browse/TIKA-2819
Project: Tika
Issue Type: Improvement
Components:
[
https://issues.apache.org/jira/browse/TIKA-2038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16722543#comment-16722543
]
Hans Brende commented on TIKA-2038:
---
[~faghani] Glad to hear that my hypothesis was correct, and that F8
[
https://issues.apache.org/jira/browse/TIKA-2038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16698539#comment-16698539
]
Hans Brende edited comment on TIKA-2038 at 11/26/18 2:38 PM:
-
The success of
[
https://issues.apache.org/jira/browse/TIKA-2038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16698513#comment-16698513
]
Hans Brende edited comment on TIKA-2038 at 11/26/18 7:10 AM:
-
Here's a more
[
https://issues.apache.org/jira/browse/TIKA-2038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16698539#comment-16698539
]
Hans Brende commented on TIKA-2038:
---
The success of this IUST implementation (even if based on
[
https://issues.apache.org/jira/browse/TIKA-2038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16698513#comment-16698513
]
Hans Brende commented on TIKA-2038:
---
Here's a more rigorous demonstration of my claim (by
[
https://issues.apache.org/jira/browse/TIKA-2038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16698497#comment-16698497
]
Hans Brende commented on TIKA-2038:
---
As sort of a sanity check on my part, I wanted to make sure that
[
https://issues.apache.org/jira/browse/TIKA-2038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16698305#comment-16698305
]
Hans Brende edited comment on TIKA-2038 at 11/26/18 4:54 AM:
-
[~faghani]
[
https://issues.apache.org/jira/browse/TIKA-2038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16698305#comment-16698305
]
Hans Brende commented on TIKA-2038:
---
[~faghani] Thanks for the response! If my understanding of the
[
https://issues.apache.org/jira/browse/TIKA-2038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16694940#comment-16694940
]
Hans Brende edited comment on TIKA-2038 at 11/21/18 5:05 PM:
-
Alternatively,
[
https://issues.apache.org/jira/browse/TIKA-2038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16694940#comment-16694940
]
Hans Brende edited comment on TIKA-2038 at 11/21/18 5:03 PM:
-
Alternatively,
[
https://issues.apache.org/jira/browse/TIKA-2038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16694940#comment-16694940
]
Hans Brende commented on TIKA-2038:
---
Alternatively, you could use guava's
[
https://issues.apache.org/jira/browse/TIKA-2038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16692478#comment-16692478
]
Hans Brende commented on TIKA-2038:
---
Oh, and one small detail I forgot to mention: jchardet also counted
[
https://issues.apache.org/jira/browse/TIKA-2038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16692323#comment-16692323
]
Hans Brende edited comment on TIKA-2038 at 11/19/18 10:28 PM:
--
[~faghani]
[
https://issues.apache.org/jira/browse/TIKA-2038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16692323#comment-16692323
]
Hans Brende commented on TIKA-2038:
---
[~faghani]
[~talli...@apache.org]
This issue inspired me to look
[
https://issues.apache.org/jira/browse/TIKA-2778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16686068#comment-16686068
]
Hans Brende commented on TIKA-2778:
---
[~ffang] No, that dependency is from
[
https://issues.apache.org/jira/browse/TIKA-2778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16686061#comment-16686061
]
Hans Brende commented on TIKA-2778:
---
[~talli...@apache.org]
+1, I think manually excluding
[
https://issues.apache.org/jira/browse/TIKA-2778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16685996#comment-16685996
]
Hans Brende commented on TIKA-2778:
---
[~talli...@apache.org] would you mind posting the full stack trace
[
https://issues.apache.org/jira/browse/TIKA-2778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16685969#comment-16685969
]
Hans Brende commented on TIKA-2778:
---
[~talli...@apache.org] Well, that's a bummer.
I'm not 100% sure
[
https://issues.apache.org/jira/browse/TIKA-2778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16685865#comment-16685865
]
Hans Brende commented on TIKA-2778:
---
Yay!
deleting two dependencies from pom == successful day
>
[
https://issues.apache.org/jira/browse/TIKA-2778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16685854#comment-16685854
]
Hans Brende commented on TIKA-2778:
---
[~talli...@apache.org] Did upgrading to jaxb-runtime 2.3.1 do the
[
https://issues.apache.org/jira/browse/TIKA-2778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16685764#comment-16685764
]
Hans Brende commented on TIKA-2778:
---
[~talli...@apache.org] Sorry, commented before seeing your comment.
[
https://issues.apache.org/jira/browse/TIKA-2778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16685763#comment-16685763
]
Hans Brende commented on TIKA-2778:
---
[~talli...@apache.org] no worries! As regards this issue, do you
[
https://issues.apache.org/jira/browse/TIKA-2778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Hans Brende updated TIKA-2778:
--
Description:
The latest version of org.glassfish.jaxb:jaxb-runtime is 2.3.1, which fixes a
few issues
[
https://issues.apache.org/jira/browse/TIKA-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16682066#comment-16682066
]
Hans Brende commented on TIKA-2743:
---
I created a new issue for that: TIKA-2778
> Replace
Hans Brende created TIKA-2778:
-
Summary: Upgrade jaxb-runtime and javax.activation
Key: TIKA-2778
URL: https://issues.apache.org/jira/browse/TIKA-2778
Project: Tika
Issue Type: Task
[
https://issues.apache.org/jira/browse/TIKA-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16681936#comment-16681936
]
Hans Brende commented on TIKA-2743:
---
[~talli...@apache.org] [~thetaphi]
Oh, also! It looks like there
[
https://issues.apache.org/jira/browse/TIKA-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16681902#comment-16681902
]
Hans Brende commented on TIKA-2743:
---
[~talli...@apache.org] Runtime scope should theoretically still
[
https://issues.apache.org/jira/browse/TIKA-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16681902#comment-16681902
]
Hans Brende edited comment on TIKA-2743 at 11/9/18 8:07 PM:
[
https://issues.apache.org/jira/browse/TIKA-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16680749#comment-16680749
]
Hans Brende commented on TIKA-2743:
---
[~talli...@apache.org] shouldn't jaxb-runtime have {{runtime}},
[
https://issues.apache.org/jira/browse/TIKA-2771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16680557#comment-16680557
]
Hans Brende commented on TIKA-2771:
---
[~talli...@apache.org] Great! I will definitely check that out.
[
https://issues.apache.org/jira/browse/TIKA-2771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16680378#comment-16680378
]
Hans Brende edited comment on TIKA-2771 at 11/8/18 9:26 PM:
[
https://issues.apache.org/jira/browse/TIKA-2771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16680378#comment-16680378
]
Hans Brende commented on TIKA-2771:
---
[~talli...@apache.org] Does Tika have a corpus of documents paired
[
https://issues.apache.org/jira/browse/TIKA-2771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677340#comment-16677340
]
Hans Brende commented on TIKA-2771:
---
[~talli...@apache.org] I've implemented my ideas for charset
[
https://issues.apache.org/jira/browse/TIKA-2771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16676887#comment-16676887
]
Hans Brende edited comment on TIKA-2771 at 11/6/18 9:00 PM:
Compare to the
[
https://issues.apache.org/jira/browse/TIKA-2771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16676926#comment-16676926
]
Hans Brende commented on TIKA-2771:
---
One thing I am sure of, however, is that if your chances of getting
[
https://issues.apache.org/jira/browse/TIKA-2771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16676887#comment-16676887
]
Hans Brende edited comment on TIKA-2771 at 11/6/18 3:33 PM:
Compare to the
[
https://issues.apache.org/jira/browse/TIKA-2771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16676887#comment-16676887
]
Hans Brende edited comment on TIKA-2771 at 11/6/18 3:31 PM:
Compare to the
[
https://issues.apache.org/jira/browse/TIKA-2771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16676887#comment-16676887
]
Hans Brende edited comment on TIKA-2771 at 11/6/18 3:23 PM:
Compare to the
[
https://issues.apache.org/jira/browse/TIKA-2771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16676887#comment-16676887
]
Hans Brende commented on TIKA-2771:
---
Compare to the following analogous test for ISO-8859-1 variants:
[
https://issues.apache.org/jira/browse/TIKA-2771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16676109#comment-16676109
]
Hans Brende commented on TIKA-2771:
---
[~talli...@apache.org] I did a little experimentation with each of
[
https://issues.apache.org/jira/browse/TIKA-2771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16675880#comment-16675880
]
Hans Brende commented on TIKA-2771:
---
[~wave] Yep, just ran the following
{code:java}
IntStream.range(0,
[
https://issues.apache.org/jira/browse/TIKA-2771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16675828#comment-16675828
]
Hans Brende edited comment on TIKA-2771 at 11/5/18 10:44 PM:
-
[
https://issues.apache.org/jira/browse/TIKA-2771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16675828#comment-16675828
]
Hans Brende commented on TIKA-2771:
---
[~talli...@apache.org] Ah, you're correct as regards the byteMap.
[
https://issues.apache.org/jira/browse/TIKA-2771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16675708#comment-16675708
]
Hans Brende commented on TIKA-2771:
---
[~talli...@apache.org] I'm not sure which all of the charsets are
[
https://issues.apache.org/jira/browse/TIKA-2771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Hans Brende updated TIKA-2771:
--
Description:
When I try to run the CharsetDetector on
[
https://issues.apache.org/jira/browse/TIKA-2771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16673481#comment-16673481
]
Hans Brende commented on TIKA-2771:
---
(Also relating to my last thought, on the subject of "waiting for
[
https://issues.apache.org/jira/browse/TIKA-2771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16673429#comment-16673429
]
Hans Brende edited comment on TIKA-2771 at 11/2/18 5:09 PM:
(For my last
[
https://issues.apache.org/jira/browse/TIKA-2771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16673429#comment-16673429
]
Hans Brende commented on TIKA-2771:
---
(For my last thought, I'd recommend taking a look at this:
[
https://issues.apache.org/jira/browse/TIKA-2771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16673400#comment-16673400
]
Hans Brende commented on TIKA-2771:
---
[~talli...@apache.org] I totally understand not wanting to modify
[
https://issues.apache.org/jira/browse/TIKA-2771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16673215#comment-16673215
]
Hans Brende commented on TIKA-2771:
---
[~talli...@apache.org] IBM500 (a.k.a. EBCDIC 500) is an EBCDIC
[
https://issues.apache.org/jira/browse/TIKA-2771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16672520#comment-16672520
]
Hans Brende edited comment on TIKA-2771 at 11/2/18 3:19 AM:
Just had another
[
https://issues.apache.org/jira/browse/TIKA-2771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16672520#comment-16672520
]
Hans Brende commented on TIKA-2771:
---
Just had another thought: when the input filter is enabled, it
[
https://issues.apache.org/jira/browse/TIKA-2771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16672134#comment-16672134
]
Hans Brende edited comment on TIKA-2771 at 11/1/18 9:47 PM:
I mean, because
[
https://issues.apache.org/jira/browse/TIKA-2771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16672203#comment-16672203
]
Hans Brende commented on TIKA-2771:
---
(Source:
[
https://issues.apache.org/jira/browse/TIKA-2771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16672196#comment-16672196
]
Hans Brende commented on TIKA-2771:
---
Oh... and probably the best hint of all that this is not IBM500 is
[
https://issues.apache.org/jira/browse/TIKA-2771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16672134#comment-16672134
]
Hans Brende edited comment on TIKA-2771 at 11/1/18 8:52 PM:
I mean, because
[
https://issues.apache.org/jira/browse/TIKA-2771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16672178#comment-16672178
]
Hans Brende commented on TIKA-2771:
---
One good hint that this is not IBM500 is that *all* of the
[
https://issues.apache.org/jira/browse/TIKA-2771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16672134#comment-16672134
]
Hans Brende commented on TIKA-2771:
---
I mean, because otherwise, if you're doing n-gram detection for
[
https://issues.apache.org/jira/browse/TIKA-2771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16672116#comment-16672116
]
Hans Brende edited comment on TIKA-2771 at 11/1/18 8:12 PM:
Not sure if this
[
https://issues.apache.org/jira/browse/TIKA-2771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16672116#comment-16672116
]
Hans Brende commented on TIKA-2771:
---
Not sure if this is a contributing factor, but peering into the
Hans Brende created TIKA-2771:
-
Summary: enableInputFilter() wrecks charset detection for some
short html documents
Key: TIKA-2771
URL: https://issues.apache.org/jira/browse/TIKA-2771
Project: Tika
[
https://issues.apache.org/jira/browse/TIKA-2690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Hans Brende updated TIKA-2690:
--
Description:
Exclude commons-logging and commons-logging-api from {{uimafit-core}}
dependencies.
As
Hans Brende created TIKA-2690:
-
Summary: Exclude commons-logging & commons-logging-api from
uimafit-core
Key: TIKA-2690
URL: https://issues.apache.org/jira/browse/TIKA-2690
Project: Tika
Issue