Hello Joakim,

Joakim Knudsen <joakim.gr...@gmail.com> wrote on Wed, 1 June 2016 at 15:10:

> Sure! That would also give even more scrutiny to the code. I'm not 100%
> sure this is totally correct, but I got wonderful help from Phil Harvey
> (ExifTool) to get the charset/encoding correct.
> So I'm pretty confident. How do I contribute?
>

Looking at the Commons Imaging website [1], I realised that we currently do
not have a user guide :o) So the best idea would probably be to add it to the
Sample Usage page [2]. The website is built from source in SVN [3]. You would
have to check that out, modify the documentation and then create an SVN patch
file using

svn diff > mypatch.diff

The mypatch.diff would then have to be attached to a Jira issue [4]. More
information can be found in [5].

> Btw, you wouldn't happen to know anything about IPTC and XMP, would you? It
> seems the EXIF tags I'm writing (UserComment and ImageDescription) are not
> enough for the comment to appear as a caption in image viewer software
> (like Picasa etc). I was wondering (hoping) Sanselan could write the
> following tags:
>
> IPTC:Caption-Abstract
> and
> XMP:Description
>

To be honest, I don't know much about how Sanselan/Imaging works. I have
worked on the code for a while, but I don't use it in my current projects. So
the only thing I can do is look through the code for you and try to find an
answer to your questions :-)

Benedikt

[1] http://commons.apache.org/proper/commons-imaging/index.html
[2] http://commons.apache.org/proper/commons-imaging/sampleusage.html
[3] http://svn.apache.org/repos/asf/commons/proper/imaging/trunk
[4] http://issues.apache.org/jira/browse/IMAGING
[5] http://commons.apache.org/patches.html

>
> Joakim
>
> On 1 June 2016 at 14:55, Benedikt Ritter <brit...@apache.org> wrote:
>
> > Hello Joakim,
> >
> > glad you found out what to do. This would make for a good addition to the
> > user guide. Would you like to contribute your findings?
> >
> > Benedikt
> >
> > Joakim Knudsen <joakim.gr...@gmail.com> wrote on Tue, 31 May 2016 at 19:21:
> >
> > > Btw, ENCODING_UTF16 is just a String = "UTF-16LE" (Little Endian)
> > >
> > > On 31 May 2016 at 19:20, Joakim Knudsen <joakim.gr...@gmail.com> wrote:
> > >
> > > > Following a post on the User-Commons-Apache log (from 2012), I ended up
> > > > with the following code, which seems to work.
> > > > It writes proper Unicode, which I can read back successfully using
> > > > ExifTool. I also see the comment nicely in Windows Explorer, and under
> > > > File > Properties.
> > > > Note I changed the field type from ASCII to FIELD_TYPE_UNDEFINED;
> > > > otherwise (with ASCII) it did not work. At least Windows couldn't make
> > > > sense of the EXIF data.
> > > >
> > > > // http://osdir.com/ml/user-commons-apache/2012-03/msg00046.html
> > > > byte[] unicodeMarker = new byte[]{ 0x55, 0x4E, 0x49, 0x43, 0x4F, 0x44, 0x45, 0x00 };
> > > > byte[] comment = textToSet.getBytes(ENCODING_UTF16); // OR UTF-16BE if the file is big-endian!
> > > > byte[] bytesComment = new byte[unicodeMarker.length + comment.length];
> > > > System.arraycopy(unicodeMarker, 0, bytesComment, 0, unicodeMarker.length);
> > > > System.arraycopy(comment, 0, bytesComment, unicodeMarker.length, comment.length);
> > > >
> > > > TiffOutputField exif_comment = new TiffOutputField(TiffConstants.EXIF_TAG_USER_COMMENT,
> > > >         TiffFieldTypeConstants.FIELD_TYPE_UNDEFINED, bytesComment.length, bytesComment);
> > > >
> > > > I can now write UserComment: "æøå" without problems :)
> > > >
> > > > - Joakim
> > > >
> > > > On 31 May 2016 at 17:39, Benedikt Ritter <brit...@apache.org> wrote:
> > > >
> > > >> Hello Joakim,
> > > >>
> > > >> Joakim Knudsen <joakim.gr...@gmail.com> wrote on Sat, 28 May 2016 at 21:10:
> > > >>
> > > >> > Hi Benedikt, and thanks for replying!
> > > >> >
> > > >> > So, if FieldType is unused, maybe the alternative, simpler constructor
> > > >> > is more appropriate/correct to use?
> > > >> >
> > > >> > // try using the approach given in the example (modified from the GPS tag):
> > > >> > TiffOutputField exif_comment = TiffOutputField.create(
> > > >> >         TiffConstants.EXIF_TAG_USER_COMMENT,
> > > >> >         outputSet.byteOrder, textToSet);
> > > >> >
> > > >> > However, now Sanselan throws an ImageWriteException:
> > > >> > org.apache.sanselan.ImageWriteException: Tag has unexpected data type.
> > > >> >
> > > >> > So are you 100% sure field type should not be set (to ASCII)?
> > > >> >
> > > >>
> > > >> No, I'm just saying that it uses a hard-coded encoding anyway :-)
> > > >>
> > > >> >
> > > >> > Next, you're saying the string to set (textToSet) is converted
> > > >> > internally to a byte array, using US-ASCII encoding.
> > > >> > If I try writing "æøåæøå" to a file, I get "쎦쎸쎥쎦쎸쎥" when I copy the
> > > >> > JPEG out and check Properties in Windows Explorer.
> > > >> > If I write only ASCII characters, e.g. "Test", then that comes through
> > > >> > just fine.
> > > >> >
> > > >> > In summary, here is the code that works for me (except non-ASCII
> > > >> > characters):
> > > >> >
> > > >> > // http://mail-archives.apache.org/mod_mbox/commons-user/201203.mbox/%3CCAJm2B-mYCXYKuyu=Hs8UAZCpw-B=kwuz4gszfuobvzwun0l...@mail.gmail.com%3E
> > > >> > byte b[] = ExifTagConstants.EXIF_TAG_USER_COMMENT.encodeValue(
> > > >> >         TiffFieldTypeConstants.FIELD_TYPE_ASCII,
> > > >> >         textToSet, outputSet.byteOrder);
> > > >> >
> > > >> > // constructor arguments: tag, taginfo, fieldtype, count, bytes
> > > >> > TiffOutputField exif_comment = new TiffOutputField(
> > > >> >         TiffConstants.EXIF_TAG_USER_COMMENT.tag,
> > > >> >         TiffConstants.EXIF_TAG_USER_COMMENT,
> > > >> >         TiffFieldTypeConstants.FIELD_TYPE_UNDEFINED,
> > > >> >         b.length, b);
> > > >> >
> > > >>
> > > >> The provided links indicate to me that it is possible to write non-ASCII
> > > >> characters. Are you sure your code looks like what Damjan suggested?
> > > >>
> > > >> Benedikt
> > > >>
> > > >> >
> > > >> > Joakim
> > > >> >
> > > >> > On 22 May 2016 at 15:29, Benedikt Ritter <brit...@apache.org> wrote:
> > > >> >
> > > >> > > Hello Joakim
> > > >> > >
> > > >> > > Joakim Knudsen <joakim.gr...@gmail.com> wrote on Sat, 21 May 2016 at 19:29:
> > > >> > >
> > > >> > > > Hi List!
> > > >> > > >
> > > >> > > > I'm working on an Android app, where I want to read and write "EXIF
> > > >> > > > tags" to JPEG files on the device. Sanselan 0.97 seems to work
> > > >> > > > perfectly, although it's a bit complicated to work with EXIF
> > > >> > > > tags/directories.
> > > >> > > >
> > > >> > > > The specific tags I'm interested in are EXIF_TAG_USER_COMMENT and
> > > >> > > > EXIF_TAG_IMAGE_DESCRIPTION.
> > > >> > > > According to the documentation I could find, UserComment is of field
> > > >> > > > type "undefined", whereas ImageDescription is of field type ASCII.
> > > >> > > >
> > > >> > > > http://www.awaresystems.be/imaging/tiff/tifftags/privateifd/exif/usercomment.html
> > > >> > > > http://www.awaresystems.be/imaging/tiff/tifftags/imagedescription.html
> > > >> > > >
> > > >> > > > What's the proper way of creating those tags, wrt. charset etc?
> > > >> > > > I want as wide as possible character support (æøå etc).
> > > >> > > >
> > > >> > > > I find different discussions online, with different advice. It seems
> > > >> > > > two constructors are going around, where the simpler one does not
> > > >> > > > deal with charset/encoding at all. This one uses the .create method:
> > > >> > > >
> > > >> > > > String textToSet = "Some Text æøå";
> > > >> > > >
> > > >> > > > TiffOutputField exif_comment = TiffOutputField.create(
> > > >> > > >         TiffConstants.EXIF_TAG_USER_COMMENT,
> > > >> > > >         outputSet.byteOrder, textToSet);
> > > >> > > >
> > > >> > > > while this one uses the standard constructor:
> > > >> > > >
> > > >> > > > byte b[] = ExifTagConstants.EXIF_TAG_USER_COMMENT.encodeValue(
> > > >> > > >         TiffFieldTypeConstants.FIELD_TYPE_ASCII,
> > > >> > > >         textToSet, outputSet.byteOrder);
> > > >> > > >
> > > >> > > > // constructor arguments: tag, taginfo, fieldtype, count, bytes
> > > >> > > > TiffOutputField exif_comment2 = new TiffOutputField(
> > > >> > > >         TiffConstants.EXIF_TAG_USER_COMMENT.tag,
> > > >> > > >         TiffConstants.EXIF_TAG_USER_COMMENT,
> > > >> > > >         TiffFieldTypeConstants.FIELD_TYPE_UNDEFINED,
> > > >> > > >         b.length, b);
> > > >> > > >
> > > >> > > > In this last one, the string to set has been converted to a byte
> > > >> > > > array first. But can/should I set the encoding anywhere?
> > > >> > > >
> > > >> > > > Is the field type even ASCII? This information seems to indicate
> > > >> > > > it's not ASCII...
> > > >> > > >
> > > >> > > > http://www.awaresystems.be/imaging/tiff/tifftags/privateifd/exif/usercomment.html
> > > >> > > >
> > > >> > > > Need some help here, as you can see, to get this right. The second
> > > >> > > > approach above does seem to work in my app, but I'd like to be sure
> > > >> > > > I'm not somehow messing up the JPEGs on the device.
> > > >> > > >
> > > >> > >
> > > >> > > I've looked at the code of
> > > >> > > org.apache.commons.imaging.formats.tiff.taginfos.TagInfoGpsText
> > > >> > > (ExifTagConstants.EXIF_TAG_USER_COMMENT is an instance of
> > > >> > > TagInfoGpsText). Here are my observations:
> > > >> > >
> > > >> > > - The FieldType parameter, which you have set to
> > > >> > >   TiffFieldTypeConstants.FIELD_TYPE_ASCII, is never used in the
> > > >> > >   implementation of encodeValue(FieldType, Object, ByteOrder)
> > > >> > > - When converting the input String to a byte array,
> > > >> > >   String.getBytes(String charsetName) is used
> > > >> > > - For charsetName, "US-ASCII" is always used (it cannot be configured
> > > >> > >   by the user)
> > > >> > >
> > > >> > > So my guess is that the code will not handle characters outside the
> > > >> > > US-ASCII charset correctly.
> > > >> > >
> > > >> > > Benedikt
> > > >> > >
> > > >> > > >
> > > >> > > > Joakim
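
For anyone finding this thread later, here is a minimal, self-contained
sketch of the byte layout discussed above: the 8-byte "UNICODE\0" marker
followed by the comment text in UTF-16, with the endianness chosen to match
the file. It deliberately does not call Sanselan/Commons Imaging at all. The
class name UserCommentEncoder and its encode/decode methods are hypothetical,
and java.nio.ByteOrder is used only for illustration (it is not the byteOrder
field of the output set). The resulting array is what would be passed as the
bytes argument of the TiffOutputField constructor shown in the thread.

```java
import java.nio.ByteOrder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of Sanselan or Commons Imaging.
public class UserCommentEncoder {

    // The bytes of "UNICODE" plus a trailing NUL, as used in the thread.
    private static final byte[] UNICODE_MARKER =
            { 0x55, 0x4E, 0x49, 0x43, 0x4F, 0x44, 0x45, 0x00 };

    // Pick the UTF-16 variant that matches the byte order of the file.
    private static Charset charsetFor(ByteOrder order) {
        return order == ByteOrder.BIG_ENDIAN
                ? StandardCharsets.UTF_16BE
                : StandardCharsets.UTF_16LE;
    }

    // Marker first, then the text encoded as UTF-16 without a BOM.
    public static byte[] encode(String text, ByteOrder order) {
        byte[] comment = text.getBytes(charsetFor(order));
        byte[] payload = Arrays.copyOf(UNICODE_MARKER,
                UNICODE_MARKER.length + comment.length);
        System.arraycopy(comment, 0, payload, UNICODE_MARKER.length, comment.length);
        return payload;
    }

    // Inverse of encode(); rejects payloads that lack the marker.
    public static String decode(byte[] payload, ByteOrder order) {
        int n = UNICODE_MARKER.length;
        if (payload.length < n
                || !Arrays.equals(Arrays.copyOf(payload, n), UNICODE_MARKER)) {
            throw new IllegalArgumentException("payload does not start with the UNICODE marker");
        }
        return new String(payload, n, payload.length - n, charsetFor(order));
    }

    public static void main(String[] args) {
        String text = "\u00e6\u00f8\u00e5"; // "æøå", escaped to avoid source-encoding issues

        byte[] payload = encode(text, ByteOrder.LITTLE_ENDIAN);
        System.out.println(payload.length); // 8 marker bytes + 3 chars * 2 bytes = 14
        System.out.println(text.equals(decode(payload, ByteOrder.LITTLE_ENDIAN))); // true

        // For comparison: US-ASCII cannot represent these characters, so
        // String.getBytes substitutes the replacement byte '?' for each one.
        byte[] ascii = text.getBytes(StandardCharsets.US_ASCII);
        System.out.println(new String(ascii, StandardCharsets.US_ASCII)); // ???
    }
}
```

The last three lines of main only illustrate Benedikt's observation that an
encoder hard-wired to US-ASCII cannot carry "æøå"; they say nothing about
which exact bytes Sanselan 0.97 ends up writing in that case.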