Hi,
On Mon, Aug 22, 2011 at 11:06 AM, nirnaydewan nirnayde...@gmail.com wrote:
But for the XHTML output, I believe that is a one-time process performed while
extraction is being done. That means I would again have to store/index that XHTML
output text as well for later use. Is this correct, or am I missing
Incorrent mime-type for .pptm, .ppsm and .ppsx in OOXMLParser
-
Key: TIKA-693
URL: https://issues.apache.org/jira/browse/TIKA-693
Project: Tika
Issue Type: Bug
[
https://issues.apache.org/jira/browse/TIKA-693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Maxim Valyanskiy resolved TIKA-693.
---
Resolution: Fixed
Committed revision 1160216.
Incorrect mime-type for .pptm, .ppsm and
[
https://issues.apache.org/jira/browse/TIKA-693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Maxim Valyanskiy updated TIKA-693:
--
Summary: Incorrect mime-type for .pptm, .ppsm and .ppsx in OOXMLParser
(was: Incorrent
On Thu, 18 Aug 2011, Tom Grant wrote:
Is there a way to programmatically register new Mime Types?
I think the expectation was that people finding gaps would open a new JIRA
entry and list the details of these MIME types, and then everyone would
benefit from them!
There shouldn't be many
Hey,
and welcome to Tika.
If you are using Eclipse, you had better install an Eclipse plug-in:
http://m2eclipse.sonatype.org/sites/m2e
Once the plug-in is downloaded and installed, your next step could be importing
the Tika project like this: *File* -> *Import* -> *Existing Maven Project*
...
However, if
Here's the use case that I'm attempting to solve. I have a customer with
many legacy systems, some of which are completely custom. These systems
have data files that will never be seen outside of their environment. For
example, some are XML files with their own schemas. Some are similar to the
[
https://issues.apache.org/jira/browse/TIKA-683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless updated TIKA-683:
Attachment: testWORD_bold_character_runs2.docx
[
https://issues.apache.org/jira/browse/TIKA-683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089072#comment-13089072
]
Michael McCandless commented on TIKA-683:
-
Sorry, wrong issue -- that last patch was