[ https://issues.apache.org/jira/browse/TIKA-4368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

luman updated TIKA-4368:
------------------------
    Description: 
# Non-rich text content is not checked for the latest version, so when the 
content is TextExtendedAscii, it is still parsed repeatedly.
# Time parsing does not detect the version, so times may be extracted repeatedly.
# Dates are not parsed.
# Non-ASCII characters are not extracted correctly:
## Garbled text
## No parsing performed

The attachments include the original OneNote file, a screenshot of the OneNote 
app, and a screenshot of the TikaGUI app.

  was:
# Non-rich text content is not checked for the latest version, so when the 
content is TextExtendedAscii, it is still parsed repeatedly.
 # Time parsing does not detect the version and may extract repeatedly.
 # Dates are not parsed.
 # non-Ascii characters unable to correctly extract parsed.

 # 
 ## Garbled text
 ## No parsing performed

The attachments include the original OneNote file, a screenshot of OneNote app, 
and a screenshot of TikaGUI app. 


> Unable to correctly extract content in OneNote
> ----------------------------------------------
>
>                 Key: TIKA-4368
>                 URL: https://issues.apache.org/jira/browse/TIKA-4368
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 3.0.0, 4.0.0
>            Reporter: luman
>            Assignee: Tim Allison
>            Priority: Major
>         Attachments: Multilingual.one, Onenote-Screenshot.jpg, 
> Tika-gui-Screenshot.jpg
>
>
> # Non-rich text content is not checked for the latest version, so when the 
> content is TextExtendedAscii, it is still parsed repeatedly.
> # Time parsing does not detect the version, so times may be extracted repeatedly.
> # Dates are not parsed.
> # Non-ASCII characters are not extracted correctly:
> ## Garbled text
> ## No parsing performed
> The attachments include the original OneNote file, a screenshot of the 
> OneNote app, and a screenshot of the TikaGUI app.
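
For readers unfamiliar with how "garbled text" for non-ASCII content typically arises, here is a minimal sketch using only the Java standard library. It assumes, without confirmation from this report, that the garbling comes from decoding bytes with a different charset than the one used to write them; the issue itself does not state the root cause. The class name CharsetMismatchDemo, the method roundTrip, and the sample string are hypothetical and are not part of Tika.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetMismatchDemo {

    /**
     * Encodes text with one charset and decodes the resulting bytes with
     * another. Hypothetical helper for illustration only; not a Tika API.
     */
    public static String roundTrip(String text, Charset encodeWith, Charset decodeWith) {
        byte[] raw = text.getBytes(encodeWith);
        return new String(raw, decodeWith);
    }

    public static void main(String[] args) {
        // Sample multilingual text (Japanese plus an accented Latin word),
        // written with Unicode escapes so the source file stays ASCII-only.
        String original = "\u65e5\u672c\u8a9e caf\u00e9";

        // Same charset on both sides: the text survives unchanged.
        String correct = roundTrip(original,
                StandardCharsets.UTF_16LE, StandardCharsets.UTF_16LE);

        // Assumed failure mode: UTF-16LE bytes read as a single-byte charset.
        // Each 16-bit code unit becomes two unrelated characters.
        String garbled = roundTrip(original,
                StandardCharsets.UTF_16LE, StandardCharsets.ISO_8859_1);

        System.out.println("matches original (same charset): " + correct.equals(original));
        System.out.println("matches original (mismatched):   " + garbled.equals(original));
        System.out.println("lengths: " + original.length() + " vs " + garbled.length());
    }
}
```

If the attached Multilingual.one shows this pattern (roughly twice as many output characters as the source text, with interleaved control or Latin-1 characters), a charset mismatch would be one candidate explanation to check; the "No parsing performed" case would need a separate look.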



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
