Following a 2012 post in the user-commons-apache mailing list archive, I ended up with the following code, which seems to work. It writes proper Unicode, which I can read back successfully using ExifTool. I also see the comment correctly in Windows Explorer, under File > Properties. Note that I changed the field type from ASCII to FIELD_TYPE_UNDEFINED; with ASCII it did not work, or at least Windows couldn't make sense of the EXIF data.

// http://osdir.com/ml/user-commons-apache/2012-03/msg00046.html
// EXIF character-code prefix: the ASCII bytes "UNICODE" followed by a NUL
byte[] unicodeMarker = new byte[]{ 0x55, 0x4E, 0x49, 0x43, 0x4F, 0x44, 0x45, 0x00 };
byte[] comment = textToSet.getBytes(ENCODING_UTF16); // OR UTF-16BE if the file is big-endian!
byte[] bytesComment = new byte[unicodeMarker.length + comment.length];
System.arraycopy(unicodeMarker, 0, bytesComment, 0, unicodeMarker.length);
System.arraycopy(comment, 0, bytesComment, unicodeMarker.length, comment.length);
TiffOutputField exif_comment = new TiffOutputField(
        TiffConstants.EXIF_TAG_USER_COMMENT,
        TiffFieldTypeConstants.FIELD_TYPE_UNDEFINED,
        bytesComment.length, bytesComment);

I can now write UserComment: "æøå" without problems :)

- Joakim

On 31 May 2016 at 17:39, Benedikt Ritter <brit...@apache.org> wrote:

> Hello Joakim,
>
> Joakim Knudsen <joakim.gr...@gmail.com> wrote on Sat, 28 May 2016 at 21:10:
>
> > Hi Benedikt, and thanks for replying!
> >
> > So, if FieldType is unused, maybe the alternative, simpler constructor is
> > more appropriate/correct to use?
> >
> > // try using the approach given in the example (modified from the GPS tag):
> > TiffOutputField exif_comment = TiffOutputField.create(
> >         TiffConstants.EXIF_TAG_USER_COMMENT,
> >         outputSet.byteOrder, textToSet);
> >
> > However, now Sanselan throws an ImageWriteException:
> > org.apache.sanselan.ImageWriteException: Tag has unexpected data type.
> >
> > So are you 100% sure field type should not be set (to ASCII)?
> >
>
> No, I'm just saying that it uses a hard-coded encoding anyway :-)
>
> >
> > Next, you're saying the string to set (textToSet) is converted internally
> > to a byte array, using US-ASCII encoding.
> > If I try writing "æøåæøå" to a file, I get "쎦쎸쎥쎦쎸쎥" when I copy the JPEG
> > out and check Properties in Windows Explorer.
> > If I write only ASCII characters, e.g. "Test", then that comes through just
> > fine.
> >
> > In summary, here is the code that works for me (except non-ASCII
> > characters):
> >
> > // http://mail-archives.apache.org/mod_mbox/commons-user/201203.mbox/%3CCAJm2B-mYCXYKuyu=Hs8UAZCpw-B=kwuz4gszfuobvzwun0l...@mail.gmail.com%3E
> > byte b[] = ExifTagConstants.EXIF_TAG_USER_COMMENT.encodeValue(
> >         TiffFieldTypeConstants.FIELD_TYPE_ASCII,
> >         textToSet, outputSet.byteOrder);
> >
> > // constructor arguments: tag, taginfo, fieldtype, count, bytes
> > TiffOutputField exif_comment = new TiffOutputField(
> >         TiffConstants.EXIF_TAG_USER_COMMENT.tag,
> >         TiffConstants.EXIF_TAG_USER_COMMENT,
> >         TiffFieldTypeConstants.FIELD_TYPE_UNDEFINED,
> >         b.length, b);
> >
>
> The provided links indicate to me that it is possible to write non-ASCII
> characters. Are you sure your code looks like what Damjan suggested?
>
> Benedikt
>
> >
> > Joakim
> >
> > On 22 May 2016 at 15:29, Benedikt Ritter <brit...@apache.org> wrote:
> >
> > > Hello Joakim
> > >
> > > Joakim Knudsen <joakim.gr...@gmail.com> wrote on Sat, 21 May 2016 at 19:29:
> > >
> > > > Hi List!
> > > >
> > > > I'm working on an Android app, where I want to read and write "EXIF tags"
> > > > to JPEG files on the device. Sanselan 0.97 seems to work perfectly,
> > > > although it's a bit complicated to work with EXIF tags/directories.
> > > >
> > > > The specific tags I'm interested in are EXIF_TAG_USER_COMMENT and
> > > > EXIF_TAG_IMAGE_DESCRIPTION.
> > > > According to the documentation I could find, UserComment is of field type
> > > > "undefined", whereas ImageDescription is of field type ASCII.
> > > >
> > > > http://www.awaresystems.be/imaging/tiff/tifftags/privateifd/exif/usercomment.html
> > > > http://www.awaresystems.be/imaging/tiff/tifftags/imagedescription.html
> > > >
> > > > What's the proper way of creating those tags, wrt. charset etc? I want as
> > > > wide as possible character support (æøå etc).
> > > >
> > > > I find different discussions online, with different advice. Seems two
> > > > constructors are going around, where the simpler one does not deal with
> > > > charset/encoding at all. This one uses the .create method:
> > > >
> > > > String textToSet = "Some Text æøå";
> > > >
> > > > TiffOutputField exif_comment = TiffOutputField.create(
> > > >         TiffConstants.EXIF_TAG_USER_COMMENT,
> > > >         outputSet.byteOrder, textToSet);
> > > >
> > > > while this one uses the standard constructor:
> > > >
> > > > byte b[] = ExifTagConstants.EXIF_TAG_USER_COMMENT.encodeValue(
> > > >         TiffFieldTypeConstants.FIELD_TYPE_ASCII,
> > > >         textToSet, outputSet.byteOrder);
> > > >
> > > > // constructor arguments: tag, taginfo, fieldtype, count, bytes
> > > > TiffOutputField exif_comment2 = new TiffOutputField(
> > > >         TiffConstants.EXIF_TAG_USER_COMMENT.tag,
> > > >         TiffConstants.EXIF_TAG_USER_COMMENT,
> > > >         TiffFieldTypeConstants.FIELD_TYPE_UNDEFINED,
> > > >         b.length, b);
> > > >
> > > > In this last one, the string to set has been converted to a byte array
> > > > first. But can/should I set the encoding anywhere?
> > > >
> > > > Is the field type even ASCII? This information seems to indicate it's
> > > > not ASCII...
> > > >
> > > > http://www.awaresystems.be/imaging/tiff/tifftags/privateifd/exif/usercomment.html
> > > >
> > > > Need some help here, as you can see, to get this right. The second
> > > > approach above does seem to work in my app, but I'd like to be sure
> > > > I'm not somehow messing up the JPEGs on the device.
> > > >
> > >
> > > I've looked at the code of
> > > org.apache.commons.imaging.formats.tiff.taginfos.TagInfoGpsText
> > > (ExifTagConstants.EXIF_TAG_USER_COMMENT is an instance of TagInfoGpsText).
> > > Here are my observations:
> > >
> > > - The FieldType parameter, which you have set to
> > >   TiffFieldTypeConstants.FIELD_TYPE_ASCII, is never used in the
> > >   implementation of encodeValue(FieldType, Object, ByteOrder)
> > > - When converting the input String to a byte array,
> > >   String.getBytes(String charsetName) is used
> > > - For charsetName, "US-ASCII" is always used (it cannot be configured by
> > >   the user)
> > >
> > > So my guess is that the code will not handle characters outside the
> > > US-ASCII charset correctly.
> > >
> > > Benedikt
> > >
> > > >
> > > > Joakim
> > > >
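
PS: For anyone finding this thread later, here is a minimal, library-free sketch of the byte layout used in the code at the top of this message: the 8-byte "UNICODE\0" marker followed by the UTF-16 text. The class and method names (UserCommentBytes, encode, decode) are my own and not part of Sanselan or Commons Imaging. Choosing the UTF-16 flavour from the TIFF byte order is an assumption based on the "UTF-16BE if the file is big-endian" remark above, so treat it as a starting point and check the result with ExifTool.

```java
import java.nio.ByteOrder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of Sanselan / Commons Imaging.
final class UserCommentBytes {
    // EXIF character-code prefix: ASCII "UNICODE" plus one NUL byte (8 bytes).
    private static final byte[] UNICODE_MARKER =
            { 0x55, 0x4E, 0x49, 0x43, 0x4F, 0x44, 0x45, 0x00 };

    // Assumption: the UTF-16 byte order should follow the TIFF byte order.
    private static Charset charsetFor(ByteOrder order) {
        return order == ByteOrder.BIG_ENDIAN
                ? StandardCharsets.UTF_16BE
                : StandardCharsets.UTF_16LE;
    }

    // Builds the UserComment payload: marker followed by the encoded text.
    static byte[] encode(String text, ByteOrder order) {
        byte[] body = text.getBytes(charsetFor(order));
        byte[] out = Arrays.copyOf(UNICODE_MARKER, UNICODE_MARKER.length + body.length);
        System.arraycopy(body, 0, out, UNICODE_MARKER.length, body.length);
        return out;
    }

    // Reverses encode(); rejects payloads that lack the "UNICODE\0" marker.
    static String decode(byte[] payload, ByteOrder order) {
        int n = UNICODE_MARKER.length;
        if (payload.length < n || !Arrays.equals(Arrays.copyOf(payload, n), UNICODE_MARKER)) {
            throw new IllegalArgumentException("missing UNICODE marker");
        }
        return new String(payload, n, payload.length - n, charsetFor(order));
    }

    public static void main(String[] args) {
        // The thread's test string (ae, o-slash, a-ring), written as escapes.
        String text = "\u00e6\u00f8\u00e5";
        byte[] payload = encode(text, ByteOrder.LITTLE_ENDIAN);
        System.out.println(payload.length + " bytes, round trip ok: "
                + decode(payload, ByteOrder.LITTLE_ENDIAN).equals(text));
    }
}
```

The resulting array would then be handed to the TiffOutputField constructor with FIELD_TYPE_UNDEFINED, as in the code at the top. I used the explicit UTF_16LE / UTF_16BE charsets because Java's plain "UTF-16" charset prepends a byte-order mark when encoding, which would end up inside the comment bytes.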