[
https://issues.apache.org/jira/browse/NIFI-11084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17768114#comment-17768114
]
Nissim Shiman edited comment on NIFI-11084 at 10/3/23 7:10 PM:
---------------------------------------------------------------
Once NIFI-11463 is complete, a custom rule can be made for this [1], which will
match before the Tika rules are reached. This would elevate CSV rule processing
above image/x-portable-graymap (thank you [~dstieg1] for pointing this out to
me).
[1] Config Strategy would be set to "Add"
and Config Body would be:
{code:xml}
<?xml version="1.0" encoding="UTF-8"?>
<mime-info>
<mime-type type="text/csv">
<glob pattern="*.csv"/>
<sub-class-of type="text/plain"/>
</mime-type>
</mime-info>
{code}
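To illustrate why rule order matters here, below is a minimal sketch. It is not Tika's or NiFi's actual detection code: the class and method names (DetectionOrderSketch, looksLikePlainPgm, matchesCsvGlob, detect) are hypothetical, and the magic check is reduced to "content starts with P2", which is the condition described in this issue. It only shows that a filename rule consulted first yields text/csv, while a content rule consulted first yields image/x-portable-graymap.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;

public class DetectionOrderSketch {

    // Simplified stand-in for a magic-byte rule: plain PGM files start with "P2".
    static boolean looksLikePlainPgm(byte[] content) {
        return content.length >= 2 && content[0] == 'P' && content[1] == '2';
    }

    // Simplified stand-in for the glob rule <glob pattern="*.csv"/>.
    static boolean matchesCsvGlob(String filename) {
        return filename != null
                && filename.toLowerCase(Locale.ROOT).endsWith(".csv");
    }

    // customRuleFirst models a custom rule set that is consulted before
    // the built-in content-based rules (an assumption of this sketch).
    static String detect(byte[] content, String filename, boolean customRuleFirst) {
        if (customRuleFirst && matchesCsvGlob(filename)) {
            return "text/csv";
        }
        if (looksLikePlainPgm(content)) {
            return "image/x-portable-graymap";
        }
        if (matchesCsvGlob(filename)) {
            return "text/csv";
        }
        return "application/octet-stream";
    }

    public static void main(String[] args) {
        // CSV text whose first two characters happen to be "P2".
        byte[] content = "P2,value\n1,2\n".getBytes(StandardCharsets.UTF_8);
        String name = "mime_type_mis-id_file.csv";

        System.out.println(detect(content, name, false)); // image/x-portable-graymap
        System.out.println(detect(content, name, true));  // text/csv
    }
}
```

The real detector weighs content and filename hints in a more nuanced way, so this should be read only as a picture of the precedence problem, not as a description of the library's behavior.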
Update (10/3/23): During code review for NIFI-11463, an alternative
implementation was discussed and chosen, under which the above workaround will
no longer be possible.
Apologies for raising premature hopes about this.
> Character/text data "mis-identified" by IdentifyMimeType processor
> ------------------------------------------------------------------
>
> Key: NIFI-11084
> URL: https://issues.apache.org/jira/browse/NIFI-11084
> Project: Apache NiFi
> Issue Type: Bug
> Components: Core Framework
> Affects Versions: 1.15.2, 1.19.1
> Environment: Windows Server 2019, Java 11.0.17
> Reporter: Mark Ward
> Priority: Minor
> Attachments: mime_type_mis-id_file.csv
>
>
> When *IdentifyMimeType* is presented with a text file with a `.csv` extension
> and the first two characters of the content as `P2`, the processor
> mis-identifies the mime.extension as `pgm` and mime.type as
> `image/x-portable-graymap`.
> The processor's *Use Filename In Detection* property is set to `true`.
> An example file is attached, and the following flow can be used to reproduce:
> *GetFile* > *IdentifyMimeType*, where the output flowfile's attributes can
> be inspected.
> This has been tested on NiFi versions `1.15.2` and `1.19.1`, both running on a
> Windows Server 2019 instance.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)