[
https://issues.apache.org/jira/browse/IMAGING-281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Garret Wilson resolved IMAGING-281.
-----------------------------------
Resolution: Not A Problem
Good news: I found the source of the problem.
It seems that Microsoft stores the XP tags, including {{XPTitle}}, not as UTF-8
but as UCS-2. (See [ExifTool FAQ on
charsets|https://exiftool.org/faq.html#Q10].) Thus in Apache Commons Imaging I
should have been using {{TagInfoXpString}} instead of {{TagInfoAscii}}, like
this:
{code:java}
TagInfoXpString EXIF_TAG_XP_TITLE = new TagInfoXpString("XPTitle", 0x9C9B,
TiffDirectoryType.EXIF_DIRECTORY_IFD0);
{code}
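To make the encoding difference concrete, here is a minimal standalone sketch of the byte layout an XP tag value is expected to have. It uses only the JDK; {{encodeXpString}} is a hypothetical helper written for illustration, not part of Commons Imaging, and the trailing two-byte NUL terminator is an assumption about the conventional layout.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class XpStringBytes {

    /**
     * Hypothetical helper: encodes text as UTF-16LE and appends a two-byte
     * NUL terminator (assumed layout for XP tags such as XPTitle).
     */
    static byte[] encodeXpString(String text) {
        byte[] encoded = text.getBytes(StandardCharsets.UTF_16LE);
        // Arrays.copyOf pads the extra two bytes with zeros.
        return Arrays.copyOf(encoded, encoded.length + 2);
    }

    public static void main(String[] args) {
        byte[] xpBytes = encodeXpString("Gate and Turret");
        byte[] asciiBytes = "Gate and Turret".getBytes(StandardCharsets.US_ASCII);
        // 15 characters * 2 bytes + 2 terminator bytes, versus 15 single bytes.
        System.out.println("XP layout: " + xpBytes.length + " bytes");
        System.out.println("ASCII layout (no terminator): " + asciiBytes.length + " bytes");
    }
}
```

Each ASCII character becomes a pair of bytes (the character followed by {{0x00}}), which is why a reader expecting this layout cannot make sense of single-byte text.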
The other libraries, which knew about the XP tag encoding, had been trying
to read the UTF-8 value as a two-byte Unicode encoding, which is what resulted
in the corrupted string. (I'm not sure why IrfanView managed to read it. Maybe
it went out of its way to detect UTF-8 before trying UCS-2 or UTF-16LE, or
maybe it is just written incorrectly and was making the same mistaken
assumption I made.)
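The corruption can be reproduced without any imaging library. The sketch below assumes the ASCII tag value was written as the single-byte text plus one NUL terminator, and then decodes those bytes as UTF-16LE the way an XP-aware reader would. The class and method names are made up for this illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class XpTitleMisread {

    /**
     * Simulates a reader that expects UTF-16LE but is given single-byte text.
     * Assumption: the stored value is the UTF-8 bytes plus one NUL terminator.
     */
    static String misreadAsUtf16Le(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        byte[] stored = Arrays.copyOf(utf8, utf8.length + 1); // append 0x00
        // Every pair of bytes is interpreted as one little-endian code unit.
        return new String(stored, StandardCharsets.UTF_16LE);
    }

    public static void main(String[] args) {
        String garbled = misreadAsUtf16Le("Gate and Turret");
        // Print code units so the result is visible regardless of console encoding.
        for (char c : garbled.toCharArray()) {
            System.out.printf("U+%04X ", (int) c);
        }
        System.out.println();
    }
}
```

For "Gate and Turret" the pairs {{G a}}, {{t e}}, and so on become the code units U+6147, U+6574, U+6120, U+646E, U+5420, U+7275, U+6572, and the final {{t}} paired with the terminator becomes U+0074, which matches the corrupted value reported by ExifTool.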
I'll close this ticket as not a problem.
> Simple Exif XPTitle corrupted.
> ------------------------------
>
> Key: IMAGING-281
> URL: https://issues.apache.org/jira/browse/IMAGING-281
> Project: Commons Imaging
> Issue Type: Bug
> Affects Versions: 1.0-alpha2
> Reporter: Garret Wilson
> Priority: Blocker
> Attachments: gate-turret-exif-bad-title.jpg
>
>
> I have a small input JPEG image containing _no metadata sections whatsoever_.
> I use Apache Commons Imaging 1.0-alpha2 to add two simple Exif {{IFD0}}
> properties using
> [{{ExifRewriter().updateExifMetadataLossy()}}|https://commons.apache.org/proper/commons-imaging/apidocs/org/apache/commons/imaging/formats/jpeg/exif/ExifRewriter.html#updateExifMetadataLossy-org.apache.commons.imaging.common.bytesource.ByteSource-java.io.OutputStream-org.apache.commons.imaging.formats.tiff.write.TiffOutputSet-].
> * {{XPTitle}} ({{0x9C9B}}): "Gate and Turret"
> * {{Copyright}} ({{33432}}, {{0x8298}}): "Copyright © 2009 Garret Wilson"
> Here is a simplified excerpt of the code:
> {code:java}
> TagInfoAscii EXIF_XP_TITLE_TAG_INFO = new TagInfoAscii("XPTitle", 0x9C9B, -1,
> TiffDirectoryType.EXIF_DIRECTORY_IFD0); //XPTitle (0x9C9B)
> TagInfoAscii EXIF_COPYRIGHT_TAG_INFO = new TagInfoAscii("Copyright", 0x8298,
> -1, TiffDirectoryType.EXIF_DIRECTORY_IFD0); //Copyright (33432, 0x8298)
> …
> TiffOutputSet tiffOutputSet = new TiffOutputSet();
> TiffOutputDirectory exifDirectory = tiffOutputSet.getOrCreateRootDirectory();
> exifDirectory.add(EXIF_XP_TITLE_TAG_INFO, "Gate and Turret");
> exifDirectory.add(EXIF_COPYRIGHT_TAG_INFO, "Copyright © 2009 Garret Wilson");
> …
> new ExifRewriter().updateExifMetadataLossy(byteSource, outputStream,
> tiffOutputSet);
> {code}
> Using [ExifTool|https://exiftool.org/] 12.16 (via
> [ExifToolGUI|https://exiftool.org/gui/] 5.16.0.0), the {{Copyright}} value is
> stored correctly but the {{XPTitle}} is stored as "慇整愠摮吠牵敲t".
> [Metadata++|https://www.logipole.com/metadata++-en.htm] 1.22.14 also shows
> the same corrupted value.
> This is disheartening, as this is nearly the simplest test case possible.
> (Note that [IrfanView|https://www.irfanview.com/] 4.54 can read the
> {{XPTitle}} just fine! Nevertheless ExifTool is the gold standard for image
> metadata reading, and is confirmed by Metadata++. Having an image the
> metadata of which cannot be read in ExifTool is a show-stopper.)
> I will attach the test case image to this ticket.