[
https://issues.apache.org/jira/browse/TIKA-1675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14617457#comment-14617457
]
Michael McCandless commented on TIKA-1675:
--
bq. If the project is dead and not
[
https://issues.apache.org/jira/browse/TIKA-1628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless resolved TIKA-1628.
--
Resolution: Pending Closed
Thanks [~gagravarr] and [~thetaphi]
ExternalParser.check
[
https://issues.apache.org/jira/browse/TIKA-1544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14309956#comment-14309956
]
Michael McCandless commented on TIKA-1544:
--
bq. Michael McCandless, is the fix
[
https://issues.apache.org/jira/browse/TIKA-1544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14310014#comment-14310014
]
Michael McCandless commented on TIKA-1544:
--
bq. I have hesitation about changing
[
https://issues.apache.org/jira/browse/TIKA-1305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14013647#comment-14013647
]
Michael McCandless commented on TIKA-1305:
--
Net/net the RTF is corrupted right?
[
https://issues.apache.org/jira/browse/TIKA-1078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless resolved TIKA-1078.
--
Resolution: Fixed
Thanks Stefano, I made one small change (added generics:
[
https://issues.apache.org/jira/browse/TIKA-1078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13869205#comment-13869205
]
Michael McCandless commented on TIKA-1078:
--
Thanks Stefano!
Can you fix the
[
https://issues.apache.org/jira/browse/TIKA-1211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13850567#comment-13850567
]
Michael McCandless commented on TIKA-1211:
--
+1 to fix XHTMLContentHandler to allow
[
https://issues.apache.org/jira/browse/TIKA-1192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless resolved TIKA-1192.
--
Resolution: Fixed
Fix Version/s: 1.5
Thanks Dave, I just committed this.
[
https://issues.apache.org/jira/browse/TIKA-1192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless reassigned TIKA-1192:
Assignee: Michael McCandless
ArrayIndexOutOfBoundsException: 9 parsing RTF
[
https://issues.apache.org/jira/browse/TIKA-1192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817472#comment-13817472
]
Michael McCandless commented on TIKA-1192:
--
bq. Yes, when that fragment is part of
[
https://issues.apache.org/jira/browse/TIKA-1192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817512#comment-13817512
]
Michael McCandless commented on TIKA-1192:
--
Thanks Dave.
[
https://issues.apache.org/jira/browse/TIKA-1181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13788163#comment-13788163
]
Michael McCandless commented on TIKA-1181:
--
The RTFParser currently only carries
[
https://issues.apache.org/jira/browse/TIKA-1143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13698837#comment-13698837
]
Michael McCandless commented on TIKA-1143:
--
Are you able to extract text from the
[
https://issues.apache.org/jira/browse/TIKA-1128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless reassigned TIKA-1128:
Assignee: Michael McCandless
Replace line tabulation with line break
[
https://issues.apache.org/jira/browse/TIKA-1128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless updated TIKA-1128:
-
Fix Version/s: 1.5
Replace line tabulation with line break
[
https://issues.apache.org/jira/browse/TIKA-1128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13670252#comment-13670252
]
Michael McCandless commented on TIKA-1128:
--
Thanks Privezentsev.
Do you have an
[
https://issues.apache.org/jira/browse/TIKA-1128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless resolved TIKA-1128.
--
Resolution: Fixed
Fix Version/s: (was: 1.5)
1.4
Thanks
[
https://issues.apache.org/jira/browse/TIKA-1098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13615793#comment-13615793
]
Michael McCandless commented on TIKA-1098:
--
Hmm PDFBox is hitting that exception
[
https://issues.apache.org/jira/browse/TIKA-1074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13585082#comment-13585082
]
Michael McCandless commented on TIKA-1074:
--
bq. My app needs to extract text even
[
https://issues.apache.org/jira/browse/TIKA-1074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13584176#comment-13584176
]
Michael McCandless commented on TIKA-1074:
--
Thanks Jukka.
InterruptedException is
[
https://issues.apache.org/jira/browse/TIKA-1074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13584249#comment-13584249
]
Michael McCandless commented on TIKA-1074:
--
{quote}
bq. InterruptedException is
[
https://issues.apache.org/jira/browse/TIKA-1074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13584362#comment-13584362
]
Michael McCandless commented on TIKA-1074:
--
OK I'll remove the future proofing.
[
https://issues.apache.org/jira/browse/TIKA-1074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless resolved TIKA-1074.
--
Resolution: Fixed
Extraction should continue if an exception is hit visiting an
[
https://issues.apache.org/jira/browse/TIKA-1074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless resolved TIKA-1074.
--
Resolution: Fixed
Extraction should continue if an exception is hit visiting an
[
https://issues.apache.org/jira/browse/TIKA-1074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless updated TIKA-1074:
-
Attachment: TIKA-1074.patch
Patch, catching Exception not Throwable, and restoring the
[
https://issues.apache.org/jira/browse/TIKA-1074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless reopened TIKA-1074:
--
Extraction should continue if an exception is hit visiting an embedded
document
[
https://issues.apache.org/jira/browse/TIKA-1074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13582481#comment-13582481
]
Michael McCandless commented on TIKA-1074:
--
Thanks Uwe, I'll change to catching
[
https://issues.apache.org/jira/browse/TIKA-1074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless reassigned TIKA-1074:
Assignee: Michael McCandless
Extraction should continue if an exception is hit
[
https://issues.apache.org/jira/browse/TIKA-1074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless updated TIKA-1074:
-
Attachment: TIKA-1074.patch
Patch, just logging a warning and continuing, if we hit the
[
https://issues.apache.org/jira/browse/TIKA-369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13573492#comment-13573492
]
Michael McCandless commented on TIKA-369:
-
The language-detection lib is now in
[
https://issues.apache.org/jira/browse/TIKA-1053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless resolved TIKA-1053.
--
Resolution: Fixed
Fix Version/s: 1.4
Thanks Uwe.
Upgrade Tika
Michael McCandless created TIKA-1078:
Summary: TikaCLI: invalid characters in embedded document name
causes FNFE when trying to save
Key: TIKA-1078
URL: https://issues.apache.org/jira/browse/TIKA-1078
[
https://issues.apache.org/jira/browse/TIKA-1078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless updated TIKA-1078:
-
Attachment: T-DS_Excel2003-PPT2003_1.xls
TikaCLI: invalid characters in embedded
[
https://issues.apache.org/jira/browse/TIKA-1079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless updated TIKA-1079:
-
Attachment: guide_to_daips_(id_3152_ver_1.0.0).doc
Word document hits AIOOBE in
[
https://issues.apache.org/jira/browse/TIKA-1074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13571288#comment-13571288
]
Michael McCandless commented on TIKA-1074:
--
TIKA-1079 is another example where if
[
https://issues.apache.org/jira/browse/TIKA-1072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13570208#comment-13570208
]
Michael McCandless commented on TIKA-1072:
--
OK I did some digging on this. The
[
https://issues.apache.org/jira/browse/TIKA-1072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13570305#comment-13570305
]
Michael McCandless commented on TIKA-1072:
--
Thanks Nick, I'll try asking on
[
https://issues.apache.org/jira/browse/TIKA-1072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13570308#comment-13570308
]
Michael McCandless commented on TIKA-1072:
--
OK I opened TIKA-1074; this issue will
Michael McCandless created TIKA-1074:
Summary: Extraction should continue if an exception is hit
visiting an embedded document
Key: TIKA-1074
URL: https://issues.apache.org/jira/browse/TIKA-1074
[
https://issues.apache.org/jira/browse/TIKA-1072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless updated TIKA-1072:
-
Attachment: Ole10NativeEntry.bin
I'm attaching the 40 byte \U0001Ole10Native entry (40
Michael McCandless created TIKA-1072:
Summary: AIOOBE when handling embedded document in .doc file
Key: TIKA-1072
URL: https://issues.apache.org/jira/browse/TIKA-1072
Project: Tika
Issue
[
https://issues.apache.org/jira/browse/TIKA-1072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless updated TIKA-1072:
-
Attachment: 20-Force-on-a-current-S00.doc
AIOOBE when handling embedded document in
Michael McCandless created TIKA-1067:
Summary: Tika extracts non-existent asterisks (*) from .ppt files
Key: TIKA-1067
URL: https://issues.apache.org/jira/browse/TIKA-1067
Project: Tika
[
https://issues.apache.org/jira/browse/TIKA-1062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13562060#comment-13562060
]
Michael McCandless commented on TIKA-1062:
--
Hi Axel,
I don't actually know that
[
https://issues.apache.org/jira/browse/TIKA-1062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13560927#comment-13560927
]
Michael McCandless commented on TIKA-1062:
--
Should the ListDescriptor list =
[
https://issues.apache.org/jira/browse/TIKA-1048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless resolved TIKA-1048.
--
Resolution: Fixed
XMLParser should add whitespace between elements
Michael McCandless created TIKA-1048:
Summary: XMLParser should add whitespace between elements
Key: TIKA-1048
URL: https://issues.apache.org/jira/browse/TIKA-1048
Project: Tika
Issue
[
https://issues.apache.org/jira/browse/TIKA-1048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless updated TIKA-1048:
-
Attachment: TIKA-1048.patch
Patch w/ failing test ... I'm not sure where/how to best fix
[
https://issues.apache.org/jira/browse/TIKA-1031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless resolved TIKA-1031.
--
Resolution: Fixed
TikaCLI doesn't create sub-dirs when extracting Zip files
[
https://issues.apache.org/jira/browse/TIKA-1032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless resolved TIKA-1032.
--
Resolution: Fixed
Fix Version/s: 1.3
Powerpoint (.pptx) can have duplicate
[
https://issues.apache.org/jira/browse/TIKA-712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13508010#comment-13508010
]
Michael McCandless commented on TIKA-712:
-
I committed the patch; I'll leave this
[
https://issues.apache.org/jira/browse/TIKA-1035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless resolved TIKA-1035.
--
Resolution: Fixed
PDF bookmark text is not extracted
[
https://issues.apache.org/jira/browse/TIKA-1036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless resolved TIKA-1036.
--
Resolution: Fixed
Fix Version/s: 1.3
ZIP parsing doesn't leave placeholders
Michael McCandless created TIKA-1035:
Summary: PDF bookmark text is not extracted
Key: TIKA-1035
URL: https://issues.apache.org/jira/browse/TIKA-1035
Project: Tika
Issue Type: Bug
[
https://issues.apache.org/jira/browse/TIKA-1035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless updated TIKA-1035:
-
Attachment: TIKA-1035.patch
Patch w/ test ...
PDF bookmark text is not
Michael McCandless created TIKA-1036:
Summary: ZIP parsing doesn't leave placeholders for each package
entry
Key: TIKA-1036
URL: https://issues.apache.org/jira/browse/TIKA-1036
Project: Tika
[
https://issues.apache.org/jira/browse/TIKA-1036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless updated TIKA-1036:
-
Attachment: TIKA-1036.patch
Patch w/ test ...
ZIP parsing doesn't leave
[
https://issues.apache.org/jira/browse/TIKA-712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless updated TIKA-712:
Attachment: TIKA-712.patch
I think I found a committable workaround (patch) for including
Michael McCandless created TIKA-1033:
Summary: Tika doesn't parse embedded OLE Chart/Graph objects
Key: TIKA-1033
URL: https://issues.apache.org/jira/browse/TIKA-1033
Project: Tika
Issue
[
https://issues.apache.org/jira/browse/TIKA-1033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless updated TIKA-1033:
-
Attachment: emb.ppt
Tika doesn't parse embedded OLE Chart/Graph objects
[
https://issues.apache.org/jira/browse/TIKA-1033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13504563#comment-13504563
]
Michael McCandless commented on TIKA-1033:
--
Here's the full stack trace when I
[
https://issues.apache.org/jira/browse/TIKA-1033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13504668#comment-13504668
]
Michael McCandless commented on TIKA-1033:
--
I asked the person who created this
[
https://issues.apache.org/jira/browse/TIKA-1033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13504673#comment-13504673
]
Michael McCandless commented on TIKA-1033:
--
bq. The raw chart object looks to
[
https://issues.apache.org/jira/browse/TIKA-1033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13504703#comment-13504703
]
Michael McCandless commented on TIKA-1033:
--
Interesting: with PowerPoint 2007,
[
https://issues.apache.org/jira/browse/TIKA-1033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13504726#comment-13504726
]
Michael McCandless commented on TIKA-1033:
--
OK I opened
Michael McCandless created TIKA-1031:
Summary: TikaCLI doesn't create sub-dirs when extracting Zip files
Key: TIKA-1031
URL: https://issues.apache.org/jira/browse/TIKA-1031
Project: Tika
[
https://issues.apache.org/jira/browse/TIKA-1031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless updated TIKA-1031:
-
Attachment: TIKA-1031.patch
Patch w/ test fix.
TikaCLI doesn't create
[
https://issues.apache.org/jira/browse/TIKA-1024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless resolved TIKA-1024.
--
Resolution: Fixed
An MP3 with an UTF-16 ID3 tag containing only the BOM should
[
https://issues.apache.org/jira/browse/TIKA-1025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless resolved TIKA-1025.
--
Resolution: Fixed
Fix Version/s: 1.3
Powerpoint (.ppt) parser doesn't leave
[
https://issues.apache.org/jira/browse/TIKA-369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13499838#comment-13499838
]
Michael McCandless commented on TIKA-369:
-
+1 to cut over to
Michael McCandless created TIKA-1024:
Summary: An MP3 with an UTF-16 ID3 tag containing only the BOM
should produce empty string value for that tag
Key: TIKA-1024
URL:
[
https://issues.apache.org/jira/browse/TIKA-1024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless updated TIKA-1024:
-
Attachment: testNakedUTF16BOM.mp3
An MP3 with an UTF-16 ID3 tag containing only the
[
https://issues.apache.org/jira/browse/TIKA-1024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless updated TIKA-1024:
-
Attachment: TIKA-1024.patch
Patch w/ failing test and fix.
An MP3 with
Michael McCandless created TIKA-1025:
Summary: Powerpoint (.ppt) parser doesn't leave placeholder where
documents are embedded
Key: TIKA-1025
URL: https://issues.apache.org/jira/browse/TIKA-1025
[
https://issues.apache.org/jira/browse/TIKA-1025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless updated TIKA-1025:
-
Attachment: TIKA-1025.patch
Patch w/ test fix.
Powerpoint (.ppt)
[
https://issues.apache.org/jira/browse/TIKA-1019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless resolved TIKA-1019.
--
Resolution: Fixed
Document links in Word documents don't leave a placeholder
[
https://issues.apache.org/jira/browse/TIKA-1019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless resolved TIKA-1019.
--
Resolution: Fixed
Document links in Word documents don't leave a placeholder
[
https://issues.apache.org/jira/browse/TIKA-1019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless reopened TIKA-1019:
--
I reverted my commit for now ... the test file was way too large ...
[
https://issues.apache.org/jira/browse/TIKA-1019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless reassigned TIKA-1019:
Assignee: Michael McCandless
Document links in Word documents don't leave a
[
https://issues.apache.org/jira/browse/TIKA-1019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless updated TIKA-1019:
-
Attachment: testDocumentLink.doc
TIKA-1019.patch
Patch w/ test and fix.
[
https://issues.apache.org/jira/browse/TIKA-1015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless resolved TIKA-1015.
--
Resolution: Fixed
Word (.doc) embedded files don't set relationship ID in the
[
https://issues.apache.org/jira/browse/TIKA-953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless reopened TIKA-953:
-
I have another non-ustar tar file that's incorrectly detected as
application/octet-stream
[
https://issues.apache.org/jira/browse/TIKA-953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless updated TIKA-953:
Attachment: test2.tar
file reports this as a tar archive, but:
{noformat}
cat test2.tar |
Michael McCandless created TIKA-1015:
Summary: Word (.doc) embedded files don't set relationship ID in
the Metadata
Key: TIKA-1015
URL: https://issues.apache.org/jira/browse/TIKA-1015
Project:
[
https://issues.apache.org/jira/browse/TIKA-1015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless updated TIKA-1015:
-
Attachment: TIKA-1015.patch
Simple patch, but my only slight hesitation is I added an
[
https://issues.apache.org/jira/browse/TIKA-1011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless resolved TIKA-1011.
--
Resolution: Fixed
Exception (Null charset name) processing .mhtml file
Michael McCandless created TIKA-1011:
Summary: Exception (Null charset name) processing .mhtml file
Key: TIKA-1011
URL: https://issues.apache.org/jira/browse/TIKA-1011
Project: Tika
[
https://issues.apache.org/jira/browse/TIKA-1011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless updated TIKA-1011:
-
Attachment: TIKA-1011.patch
Exception (Null charset name) processing .mhtml file
Michael McCandless created TIKA-1010:
Summary: Embedded documents in RTF are not extracted
Key: TIKA-1010
URL: https://issues.apache.org/jira/browse/TIKA-1010
Project: Tika
Issue Type:
[
https://issues.apache.org/jira/browse/TIKA-1005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless updated TIKA-1005:
-
Attachment: TIKA-1005.patch
Patch w/ test ...
In Microsoft Office Word
[
https://issues.apache.org/jira/browse/TIKA-1006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless reassigned TIKA-1006:
Assignee: Michael McCandless
NPE in extractParagraph (styleClass) in
[
https://issues.apache.org/jira/browse/TIKA-1006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13474947#comment-13474947
]
Michael McCandless commented on TIKA-1006:
--
Thanks Sture, that patch looks good!
[
https://issues.apache.org/jira/browse/TIKA-1005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless reassigned TIKA-1005:
Assignee: Michael McCandless
In Microsoft Office Word 2010 documents, text
[
https://issues.apache.org/jira/browse/TIKA-1005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13474958#comment-13474958
]
Michael McCandless commented on TIKA-1005:
--
Thanks David, I'll dig!
[
https://issues.apache.org/jira/browse/TIKA-1006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless resolved TIKA-1006.
--
Resolution: Fixed
Fix Version/s: 1.3
Thanks Sture, I just committed the test
[
https://issues.apache.org/jira/browse/TIKA-1005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13474250#comment-13474250
]
Michael McCandless commented on TIKA-1005:
--
Could you attach an example showing
[
https://issues.apache.org/jira/browse/TIKA-997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless resolved TIKA-997.
-
Resolution: Fixed
Fix Version/s: 1.3
Leave a placeholder when documents are
[
https://issues.apache.org/jira/browse/TIKA-997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless updated TIKA-997:
Attachment: TIKA-997.patch
Patch.
It's not perfect, because the placeholder will appear at
Michael McCandless created TIKA-999:
---
Summary: RTF Parser doesn't extract page/word/character count
metadata
Key: TIKA-999
URL: https://issues.apache.org/jira/browse/TIKA-999
Project: Tika