[
https://issues.apache.org/jira/browse/TIKA-2592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16384242#comment-16384242
]
Ken Krugler commented on TIKA-2592:
---
[~AndreasMeier] - I assume when you said:
{quote}I don't think we
[
https://issues.apache.org/jira/browse/TIKA-2592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ken Krugler updated TIKA-2592:
--
Attachment: IANA Charset names.txt
> HTML with charset unicode handled as utf-16 instead utf-8
>
[
https://issues.apache.org/jira/browse/TIKA-2592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ken Krugler updated TIKA-2592:
--
Priority: Minor (was: Major)
> HTML with charset unicode handled as utf-16 instead utf-8
>
[
https://issues.apache.org/jira/browse/TIKA-2592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ken Krugler updated TIKA-2592:
--
Issue Type: Improvement (was: Bug)
> HTML with charset unicode handled as utf-16 instead utf-8
>
On Fri, 2 Mar 2018, Luís Filipe Nassif wrote:
If I make no progress on TIKA-1466 until 3/9, you can start the release
process without it. But do you devs agree with the proposed change: allow
overriding of glob patterns in custom-mimetypes.xml?
What happens if you have two different custom
[
https://issues.apache.org/jira/browse/TIKA-2569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16384155#comment-16384155
]
Tim Allison edited comment on TIKA-2569 at 3/2/18 9:10 PM:
---
[~BAEApache], if all
[
https://issues.apache.org/jira/browse/TIKA-2569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16384155#comment-16384155
]
Tim Allison commented on TIKA-2569:
---
[~BAEApache], if all goes according to plan, we'll start the release
[
https://issues.apache.org/jira/browse/TIKA-2592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16384081#comment-16384081
]
Tim Allison commented on TIKA-2592:
---
bq. Do you have a testcorpus or are you crawling the web Tim
[
https://issues.apache.org/jira/browse/TIKA-2597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16384065#comment-16384065
]
Todd Dixon commented on TIKA-2597:
--
From what I read on the FILE_FLAG_POSIX_SEMANTICS flag that will only
[
https://issues.apache.org/jira/browse/TIKA-2597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16383880#comment-16383880
]
Nick Burch commented on TIKA-2597:
--
Trying to fully re-implement the Windows case-insensitivity rules
Todd Dixon created TIKA-2597:
Summary: Attachment Extraction Case Sensitivity
Key: TIKA-2597
URL: https://issues.apache.org/jira/browse/TIKA-2597
Project: Tika
Issue Type: Bug
> But do you devs agree with the proposed change: allow overriding of glob
> patterns in custom-mimetypes.xml?
+1 from me
From: Luís Filipe Nassif [mailto:lfcnas...@gmail.com]
Sent: Friday, March 2, 2018 8:21 AM
To: Allison, Timothy B.
Cc: dev@tika.apache.org
Subject: Re:
[
https://issues.apache.org/jira/browse/TIKA-2568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16383556#comment-16383556
]
Tim Allison commented on TIKA-2568:
---
Just added you to the PMC group on JIRA. Sorry for our delay!
>
[
https://issues.apache.org/jira/browse/TIKA-2568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison reassigned TIKA-2568:
-
Assignee: Luis Filipe Nassif
> Full encrypted 7Z file not detected as such
>
TIKA-2591 and TIKA-2568
+1
TIKA-1466 -- how long will it take, do you think? This seems potentially
non-trivial...
-----Original Message-----
From: Luís Filipe Nassif [mailto:lfcnas...@gmail.com]
Sent: Thursday, March 1, 2018 5:41 PM
To: dev@tika.apache.org
Subject: Re: Tika 1.18?
I think we
[
https://issues.apache.org/jira/browse/TIKA-2592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16383350#comment-16383350
]
Andreas Meier edited comment on TIKA-2592 at 3/2/18 10:56 AM:
--
{quote}Before
[
https://issues.apache.org/jira/browse/TIKA-2592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Andreas Meier updated TIKA-2592:
Attachment: TestHTMLCharsetCP1256.html
TestHTMLCharsetArabicCP1256.html
> HTML with
[
https://issues.apache.org/jira/browse/TIKA-2592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16383350#comment-16383350
]
Andreas Meier commented on TIKA-2592:
-
{quote}
Before making this kind of change (default "unicode" to