[
https://issues.apache.org/jira/browse/TIKA-4309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17883291#comment-17883291
]
ASF GitHub Bot commented on TIKA-4309:
--------------------------------------
alexey-pelykh commented on PR #1947:
URL: https://github.com/apache/tika/pull/1947#issuecomment-2363832999
> Yes, that makes sense. If we treat it as a container file, though, we could
> make up or find a mime type for fat Mach-O (application/x-fat-mach-o, I just made
> that up... is there an existing mime type?), and then the attachments/embedded
> files would have their own correct mime types.
I've only seen `application/x-mach-binary` and `application/x-mach-o`;
however, I'd be more than happy to come up with more specific ones, like the
ELF ones. A Mach-O file can also be a shared library, an executable, etc., so
technically we should be using `application/x-mach-o-executable`,
`application/x-mach-o-sharedlib`, and so on. For fat binaries I've seen
`application/x-mach-universal`, but following the proposed structure it would
be `application/x-mach-o-universal`.
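
To make the naming idea concrete, here is a minimal sketch of how the Mach-O `filetype` field could map onto the type names floated in this thread. The class and method names are hypothetical, and the MIME strings are only the proposals from this discussion, not registered types and not something Tika ships. The two constants are the `MH_EXECUTE` and `MH_DYLIB` values from `<mach-o/loader.h>`.

```java
// Sketch only: hypothetical helper, not part of Tika.
final class MachOMimeSketch {
    // filetype values from <mach-o/loader.h>
    static final int MH_EXECUTE = 0x2; // demand-paged executable
    static final int MH_DYLIB = 0x6;   // dynamically bound shared library

    // Maps a Mach-O filetype to one of the MIME names proposed in this thread.
    // Anything not listed falls back to the generic type.
    static String mimeFor(int fileType) {
        switch (fileType) {
            case MH_EXECUTE:
                return "application/x-mach-o-executable";
            case MH_DYLIB:
                return "application/x-mach-o-sharedlib";
            default:
                return "application/x-mach-o";
        }
    }
}
```

Other file types (object files, bundles, core dumps) could get their own suffixes in the same way if the scheme is adopted.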
> If at all possible, we should try to use magic to distinguish fat machO
> from the other cafebabes.
The structure of fat Mach-O is quite
[vague](https://book.hacktricks.xyz/macos-hardening/macos-security-and-privilege-escalation/macos-files-folders-and-binaries/universal-binaries-and-mach-o-format#fat-header)
(see also
[fat.h](https://opensource.apple.com/source/xnu/xnu-123.5/EXTERNAL_HEADERS/mach-o/fat.h)),
so only deep validation in code can identify it reliably. Ideally I'd try
ExecutableParser first and, if it fails, fall back to the other parsers whose
magic matches `cafebabe`.
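
As a cheap pre-check before any deep validation, one commonly used heuristic looks at the 32-bit word right after the magic. In a Java class file that word is the minor and major version, and major versions start at 45, so the word is at least 45. In a fat Mach-O it is the architecture count, which is normally tiny. The sketch below assumes that reasoning; the class name, the enum, and the cut-off of 44 are hypothetical choices, and a match is only a hint, not proof.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Sketch only: hypothetical pre-filter, not Tika's detection logic.
final class CafeBabeSniffer {
    static final int CAFEBABE = 0xCAFEBABE;
    // Assumed cut-off: Java class major versions start at 45, so any smaller
    // non-zero word is more plausibly an architecture count.
    static final long MAX_PLAUSIBLE_ARCHS = 44;

    enum Kind { FAT_MACH_O_CANDIDATE, OTHER_CAFEBABE, NOT_CAFEBABE }

    static Kind sniff(byte[] head) {
        if (head.length < 8) {
            return Kind.NOT_CAFEBABE;
        }
        // Both the fat header and the class file header are big-endian.
        ByteBuffer buf = ByteBuffer.wrap(head).order(ByteOrder.BIG_ENDIAN);
        if (buf.getInt(0) != CAFEBABE) {
            return Kind.NOT_CAFEBABE;
        }
        long word = buf.getInt(4) & 0xFFFFFFFFL; // read as unsigned 32-bit
        return (word >= 1 && word <= MAX_PLAUSIBLE_ARCHS)
                ? Kind.FAT_MACH_O_CANDIDATE
                : Kind.OTHER_CAFEBABE;
    }
}
```

A candidate would still need to pass full parsing, which is why trying the executable parser first and falling back on failure remains the safer order.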
> It is tricky if you're new to Tika.
:) That I've noticed :)
> I can try to help if you can create the skeleton for this file type
I'd gladly do so; the container is quite simple. It's 0xCAFEBABE followed by a
uint32_t count of headers, and every header just contains the CPU type and
subtype plus the offset, size, and alignment of the corresponding slice.
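
A rough sketch of what reading that container could look like, following the `fat_header` / `fat_arch` layout in `fat.h` (big-endian: magic, count, then 20 bytes per entry holding cputype, cpusubtype, offset, size, align). The class, the `Arch` holder, and the `fileLength` parameter are hypothetical names for illustration; this is not the code in the PR.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.List;

// Sketch only: hypothetical reader for the fat Mach-O header.
final class FatHeaderSketch {
    static final int FAT_MAGIC = 0xCAFEBABE;
    static final int FAT_ARCH_SIZE = 20; // five 32-bit fields per entry

    static final class Arch {
        final int cpuType;
        final int cpuSubtype;
        final long offset; // start of the embedded Mach-O within the file
        final long size;   // length of the embedded Mach-O
        final int align;   // alignment as a power of two

        Arch(int cpuType, int cpuSubtype, long offset, long size, int align) {
            this.cpuType = cpuType;
            this.cpuSubtype = cpuSubtype;
            this.offset = offset;
            this.size = size;
            this.align = align;
        }
    }

    // header: the leading bytes of the file; fileLength: total file size,
    // used to reject entries that point outside the file.
    static List<Arch> parse(byte[] header, long fileLength) {
        ByteBuffer buf = ByteBuffer.wrap(header).order(ByteOrder.BIG_ENDIAN);
        if (buf.remaining() < 8 || buf.getInt() != FAT_MAGIC) {
            throw new IllegalArgumentException("not a fat Mach-O header");
        }
        long count = buf.getInt() & 0xFFFFFFFFL;
        if (count * FAT_ARCH_SIZE > buf.remaining()) {
            throw new IllegalArgumentException("truncated or implausible arch table");
        }
        List<Arch> archs = new ArrayList<>();
        for (long i = 0; i < count; i++) {
            int cpuType = buf.getInt();
            int cpuSubtype = buf.getInt();
            long offset = buf.getInt() & 0xFFFFFFFFL;
            long size = buf.getInt() & 0xFFFFFFFFL;
            int align = buf.getInt();
            if (offset + size > fileLength) {
                throw new IllegalArgumentException("slice " + i + " lies outside the file");
            }
            archs.add(new Arch(cpuType, cpuSubtype, offset, size, align));
        }
        return archs;
    }
}
```

Each returned entry identifies one embedded Mach-O, which is what would be handed to the embedded-document handling if the fat file is treated as a container.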
> ExecutableParser: support MachO
> -------------------------------
>
> Key: TIKA-4309
> URL: https://issues.apache.org/jira/browse/TIKA-4309
> Project: Tika
> Issue Type: New Feature
> Reporter: Alexey Pelykh
> Priority: Major
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)