[
https://issues.apache.org/jira/browse/TIKA-2597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16384065#comment-16384065
]
Todd Dixon commented on TIKA-2597:
----------------------------------
From what I read about the FILE_FLAG_POSIX_SEMANTICS flag, it only takes effect
within applications that actually set it. Therefore it wouldn't really be of
any use to me, since I'm using the Tika App. Would it be possible to add a
feature that allows configuring attachment prefixes, so that each extracted
attachment gets a unique name? E.g.
Att01_Filename
Att02_Filename
Configurable on the command line?
Thanks,
Todd
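For illustration, the prefix scheme requested above could look roughly like the sketch below. It is only a sketch: nothing in this thread says Tika App has such an option, and the class name, method name, and the "Att" prefix with two-digit numbering are hypothetical, chosen to match the Att01_Filename example.

```java
import java.util.Locale;

/** Hypothetical helper, not part of Tika: builds names such as Att01_Filename. */
public class AttachmentPrefixNamer {

    /** Prepends a zero-padded, 1-based attachment index to the original file name. */
    public static String prefixed(int index, String fileName) {
        // Locale.ROOT keeps the digits ASCII regardless of the default locale.
        return String.format(Locale.ROOT, "Att%02d_%s", index, fileName);
    }

    public static void main(String[] args) {
        System.out.println(prefixed(1, "Filename")); // Att01_Filename
        System.out.println(prefixed(2, "Filename")); // Att02_Filename
    }
}
```

Because the index differs for every attachment, two names that differ only in case can no longer collide on disk.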
> Attachment Extraction Case Sensitivity
> --------------------------------------
>
> Key: TIKA-2597
> URL: https://issues.apache.org/jira/browse/TIKA-2597
> Project: Tika
> Issue Type: Bug
> Components: app
> Affects Versions: 1.17
> Environment: windows
> Reporter: Todd Dixon
> Priority: Major
>
> Using the --extract option on a PDF with embedded files, I am seeing that not
> all of the attachments are extracted. Several of the embedded files share the
> same name. Names that are exactly the same are handled by adding a suffix of
> (1) etc. However, when two names differ only in case, the parser does not
> add the suffix and thus overwrites the file on disk. Example:
> FW Letter.msg
> FW letter.msg
> Will result in only one attachment extracted. Would it be possible to update
> the filename comparison to account for Windows file systems, which treat
> those two names as the same file?
> Thanks!
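A minimal sketch of the comparison asked for above, assuming names are tracked in lower-cased form so that names differing only in case count as duplicates. This is not Tika's actual extraction code. The class and method names are hypothetical, the exact placement of the (1) suffix is an assumption, and lower-casing with Locale.ROOT only approximates how Windows file systems fold case.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical helper, not part of Tika: hands out names that are unique ignoring case. */
public class CaseInsensitiveNamer {
    // Lower-cased form of every name handed out so far.
    private final Set<String> used = new HashSet<>();

    /** Returns the name unchanged if unused, otherwise the name with a (n) suffix. */
    public String uniqueName(String name) {
        String candidate = name;
        int counter = 0;
        // Set.add returns false when the lower-cased key is already present.
        while (!used.add(candidate.toLowerCase(Locale.ROOT))) {
            counter++;
            candidate = withSuffix(name, counter);
        }
        return candidate;
    }

    // Inserts "(n)" before the extension, or appends it when there is none.
    private static String withSuffix(String name, int n) {
        int dot = name.lastIndexOf('.');
        if (dot <= 0) {
            return name + "(" + n + ")";
        }
        return name.substring(0, dot) + "(" + n + ")" + name.substring(dot);
    }

    public static void main(String[] args) {
        CaseInsensitiveNamer namer = new CaseInsensitiveNamer();
        System.out.println(namer.uniqueName("FW Letter.msg")); // FW Letter.msg
        System.out.println(namer.uniqueName("FW letter.msg")); // FW letter(1).msg
    }
}
```

Comparing lower-cased keys on every platform is slightly stricter than a case-sensitive file system needs, but it avoids the silent overwrite on Windows without having to detect the file system.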