On Friday, December 8, 2017, 7:43:05 PM EST, Tim Allison
wrote:
A candidate for the Tika 1.17 release is available at:
https://dist.apache.org/repos/dist/dev/tika/
The release candidate is a zip archive of the sources in:
There was a data transfer glitch to nexus. Will respin #2.
On Friday, December 8, 2017, 2:51:14 PM EST, Tim Allison
wrote:
A candidate for the Tika 1.17 release is available at:
https://dist.apache.org/repos/dist/dev/tika/
The release candidate is a zip
RC #2….
On 12/8/17, 2:30 PM, "Allison, Timothy B." wrote:
Wait, no that's totally hosed, there's not even a source zip file in:
https://repository.apache.org/content/repositories/orgapachetika-1027
Wait, no that's totally hosed, there's not even a source zip file in:
https://repository.apache.org/content/repositories/orgapachetika-1027
https://repository.apache.org/content/repositories/orgapachetika-1027/org/apache/tika/tika/1.17/
Do I need to respin w rc2? Or is there a way to push to
Do we expect only the src to be in nexus, not the jar artifacts (with sigs and
digests) for app, server, eval?
-----Original Message-----
From: Chris Mattmann [mailto:mattm...@apache.org]
Sent: Friday, December 8, 2017 5:07 PM
To: dev@tika.apache.org
Subject: Re: 1.17 rc1 and two repos in
[
https://issues.apache.org/jira/browse/TIKA-2523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16284321#comment-16284321
]
Tim Allison commented on TIKA-2523:
---
https://bz.apache.org/bugzilla/show_bug.cgi?id=61881
Turns out this
Hey Tim, probably just upload errors on the first one and so it tried again. No
worries. Drop and close
the first, and just use the 2nd.
Cheers,
Chris
On 12/8/17, 12:05 PM, "Allison, Timothy B." wrote:
Not sure what happened, but two repos were created in Nexus:
[
https://issues.apache.org/jira/browse/TIKA-2523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison updated TIKA-2523:
--
Summary: Regression in ppt parsing -- "typeface can't be null or empty"
(was: Regression in ppt
[
https://issues.apache.org/jira/browse/TIKA-2523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison updated TIKA-2523:
--
Attachment: 802350.ppt
triggering file from govdocs1
> Regression in ppt parsing
>
Tim Allison created TIKA-2523:
-
Summary: Regression in ppt parsing
Key: TIKA-2523
URL: https://issues.apache.org/jira/browse/TIKA-2523
Project: Tika
Issue Type: Bug
Reporter: Tim
And do I remember correctly that the full distro should be in nexus, not just
the source code as we currently have in:
https://repository.apache.org/content/repositories/orgapachetika-1027/
Need to respin rc2 on Monday if that's the case.
-----Original Message-----
From: Allison, Timothy B.
Not sure what happened, but two repos were created in Nexus:
https://repository.apache.org/content/repositories/orgapachetika-1026/
https://repository.apache.org/content/repositories/orgapachetika-1027/
The first one (1026) failed with checksum problems, and I dropped it.
I closed the second one
A candidate for the Tika 1.17 release is available at:
https://dist.apache.org/repos/dist/dev/tika/
The release candidate is a zip archive of the sources in:
https://github.com/apache/tika/tree/1.17-rc1
The SHA1 checksum of the archive is 37f3cd19051160a8c488b1aa7ff25c3ae515c359.
In addition,
[
https://issues.apache.org/jira/browse/TIKA-2521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16284026#comment-16284026
]
Hudson commented on TIKA-2521:
--
SUCCESS: Integrated in Jenkins build Tika-trunk #1410 (See
[
https://issues.apache.org/jira/browse/TIKA-2522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283948#comment-16283948
]
Tim Allison commented on TIKA-2522:
---
I think fixing this very minor regression poses more risk than
[
https://issues.apache.org/jira/browse/TIKA-2522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison updated TIKA-2522:
--
Summary: Trivial regression in MSWord parser -- not extracting Encite Add
in text any more (was:
[
https://issues.apache.org/jira/browse/TIKA-2522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison updated TIKA-2522:
--
Attachment: 508650.doc
Example file. We should extract "pharmacology" from this...among other words
in
Tim Allison created TIKA-2522:
-
Summary: Regression in MSWord parser -- not extracting Encite Add
in text any more
Key: TIKA-2522
URL: https://issues.apache.org/jira/browse/TIKA-2522
Project: Tika
[
https://issues.apache.org/jira/browse/TIKA-2483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283922#comment-16283922
]
Hudson commented on TIKA-2483:
--
SUCCESS: Integrated in Jenkins build Tika-trunk #1409 (See
[
https://issues.apache.org/jira/browse/TIKA-2521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison resolved TIKA-2521.
---
Resolution: Fixed
Fix Version/s: 1.17
> SAX-based docx/pptx should start a new line before
Tim Allison created TIKA-2521:
-
Summary: SAX-based docx/pptx should start a new line before second
paragraph within a cell
Key: TIKA-2521
URL: https://issues.apache.org/jira/browse/TIKA-2521
Project:
[
https://issues.apache.org/jira/browse/TIKA-2519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283830#comment-16283830
]
Hudson commented on TIKA-2519:
--
SUCCESS: Integrated in Jenkins build Tika-trunk #1408 (See
[
https://issues.apache.org/jira/browse/TIKA-2483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison resolved TIKA-2483.
---
Resolution: Fixed
Fix Version/s: 1.17
> Using PackageParser in ForkParser causes NPE
>
[
https://issues.apache.org/jira/browse/TIKA-2483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283780#comment-16283780
]
Tim Allison edited comment on TIKA-2483 at 12/8/17 4:38 PM:
Regression tests in
Yes, Tim, I saw all these reporting artifacts, I agree they are good things.
2017-12-08 14:32 GMT-02:00 Allison, Timothy B. :
> Thank you, Luís. I’ve finally had a chance to take a look. As exceptions
> go, the PPT is the most eye-opening. I don’t know how I didn’t catch
>
Thank you, Luís. I’ve finally had a chance to take a look. As exceptions go,
the PPT is the most eye-opening. I don’t know how I didn’t catch those…ugh.
There are a bunch more zero-byte file exceptions in attachments,
but this is a good thing, because now we can figure out if
[
https://issues.apache.org/jira/browse/TIKA-2483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283780#comment-16283780
]
Tim Allison edited comment on TIKA-2483 at 12/8/17 4:26 PM:
Regression tests in
[
https://issues.apache.org/jira/browse/TIKA-2483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283780#comment-16283780
]
Tim Allison edited comment on TIKA-2483 at 12/8/17 4:15 PM:
Regression tests in
[
https://issues.apache.org/jira/browse/TIKA-2483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283780#comment-16283780
]
Tim Allison commented on TIKA-2483:
---
Regression tests in prep for 1.17 show that we need to add quite a
Vincent van Donselaar created TIKA-2520:
---
Summary: OptimaizeLangDetector#loadModels() should not be called
for every single langdetect HTTP request
Key: TIKA-2520
URL:
[
https://issues.apache.org/jira/browse/TIKA-2519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283679#comment-16283679
]
Tim Allison commented on TIKA-2519:
---
Thank you [~esaunders]!
> Issue parsing multiple CHM files
[
https://issues.apache.org/jira/browse/TIKA-2519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison resolved TIKA-2519.
---
Resolution: Fixed
Fix Version/s: 1.17
> Issue parsing multiple CHM files concurrently
>
[
https://issues.apache.org/jira/browse/TIKA-2519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16281225#comment-16281225
]
Tim Allison edited comment on TIKA-2519 at 12/8/17 2:51 PM:
Thank you for