Re: [guardian-dev] sanitizing PNGs
Dunno if you know about the JPEG redaction lib from ObscuraCam. It's in
C++, unfortunately, but it might have useful approaches in it:

https://github.com/guardianproject/ObscuraCam/tree/master/app/src/main/jni

.hc
Re: [guardian-dev] sanitizing PNGs
Sure, loading the whole thing in memory sounds better than dealing with
the filesystem. The only problem I can see there is if it's used to
process large images on small mobile devices.

.hc
Re: [guardian-dev] sanitizing PNGs
That doesn't look easy, unfortunately. The class seems to be designed to
work in three stages:

1. Load the EXIF data from a file or input stream
2. Modify the EXIF data
3. Write the modified image to an output stream by reading the input a
   second time and replacing the EXIF segment

We can't skip to stage 3 because it depends on state that was
initialised during stage 1. Even if we don't care about the original
EXIF data, some of the state seems like it would be vital, such as the
byte order and colour space.

Maybe we could use a giant BufferedInputStream big enough to hold the
whole image, allowing us to read the stream twice?

Cheers,
Michael
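A minimal sketch of the "read the stream twice" idea, using the standard mark/reset support in BufferedInputStream. The class name, the helper names and the MAX_IMAGE_BYTES cap are hypothetical, chosen only for illustration. The two drain() calls stand in for the load and write stages described above; this is not code from ExifInterface.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

class TwoPassRead {

    // Hypothetical cap on image size. The read limit passed to mark() must
    // be larger than the number of bytes read before reset() is called.
    static final int MAX_IMAGE_BYTES = 16 * 1024 * 1024;

    // Reads the stream to its end and returns everything that was read.
    static byte[] drain(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        int read;
        while ((read = in.read(buf)) != -1) out.write(buf, 0, read);
        return out.toByteArray();
    }

    // Reads the same stream twice by marking the start and rewinding.
    static byte[][] readTwice(InputStream source) throws IOException {
        BufferedInputStream in = new BufferedInputStream(source);
        in.mark(MAX_IMAGE_BYTES + 1);
        byte[] firstPass = drain(in);   // stands in for loading the EXIF data
        in.reset();                     // throws IOException if the mark was lost
        byte[] secondPass = drain(in);  // stands in for rewriting the image
        return new byte[][] {firstPass, secondPass};
    }

    public static void main(String[] args) throws IOException {
        byte[] image = new byte[20000]; // placeholder bytes, not a real image
        byte[][] passes = readTwice(new ByteArrayInputStream(image));
        System.out.println(passes[0].length + " " + passes[1].length);
    }
}
```

The internal buffer grows as data is read rather than being allocated at the full limit up front, but by the end of the first pass it still holds the whole image in memory.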
Re: [guardian-dev] sanitizing PNGs
Ah cool! It would be awesome to have the EXIF stripping work on a
stream, rather than a file.

.hc

--
PGP fingerprint: EE66 20C7 136B 0D2C 456C 0A4D E9E2 8DEA 00AA 5556
https://pgp.mit.edu/pks/lookup?op=vindex=0xE9E28DEA00AA5556
___
List info: https://lists.mayfirst.org/mailman/listinfo/guardian-dev
To unsubscribe, email: guardian-dev-unsubscr...@lists.mayfirst.org
Re: [guardian-dev] sanitizing PNGs
Fantastic!

The code is just a single file with minimal Android dependencies, so I
made a quick (untested) Java port:

https://code.briarproject.org/akwizgran/metadata

Cheers,
Michael
Re: [guardian-dev] sanitizing PNGs
Turns out Google released an Android Support library that makes it
trivial to strip EXIF from JPEGs and some RAW formats:
https://android-developers.googleblog.com/2016/12/introducing-the-exifinterface-support-library.html

I found it via this app in F-Droid:
https://gitlab.com/juanitobananas/scrambled-exif

This is all it does:

    ExifInterface exifInterface = new ExifInterface(imagePath);
    for (String attribute : getExifAttributes()) {
        if (exifInterface.getAttribute(attribute) != null) {
            exifInterface.setAttribute(attribute, null);
        }
    }
    exifInterface.saveAttributes();

.hc
Re: [guardian-dev] sanitizing PNGs
Please feel free to use it, I place it in the public domain. I'll have a
look at JPEGs next time I'm procrastinating. ;-)

(By the way, after sending I noticed a bug: if the file ends with a
truncated ancillary chunk, I think the cleaner will loop forever trying
to skip to the end of the chunk. Should be easy to fix though.)

Cheers,
Michael
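One possible shape for that fix, as a sketch rather than the author's actual patch: InputStream.skip() is allowed to return 0, so a loop that only adds up its return value can spin forever once the stream has ended. The class and method names here are hypothetical. When skip() makes no progress, the helper reads a single byte to tell "nothing skipped this time" apart from "end of stream".

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

class SkipFully {

    // Skips exactly length bytes, or throws EOFException if the stream ends first.
    static void skipFully(InputStream in, long length) throws IOException {
        long remaining = length;
        while (remaining > 0) {
            long skipped = in.skip(remaining);
            if (skipped > 0) {
                remaining -= skipped;
            } else if (in.read() == -1) {
                // No progress and no more data: the chunk is truncated
                throw new EOFException(remaining + " bytes left to skip");
            } else {
                // read() consumed one byte on our behalf
                remaining--;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream(new byte[] {0, 1, 2, 3, 4});
        skipFully(in, 3);
        System.out.println("next byte: " + in.read());
        try {
            skipFully(in, 10); // only one byte is left, so this must fail
        } catch (EOFException e) {
            System.out.println("truncated: " + e.getMessage());
        }
    }
}
```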
Re: [guardian-dev] sanitizing PNGs
That's awesome! Feeling inspired to also strip JPEGs? :-) I think
they're easier. There is jhead, exiftool, and ObscuraCam's JNI code for
examples. Can we use this under the GPLv3?

.hc
Re: [guardian-dev] sanitizing PNGs
Hi Hans-Christoph,

I hacked this together based on the PNG specification, which
distinguishes between ancillary chunks that can be removed without
affecting the image data, and critical chunks that can't. It's been
tested on exactly two PNGs so far. :-)

http://www.libpng.org/pub/png/spec/1.2/PNG-Structure.html

Cheers,
Michael

On 12/12/17 16:25, Hans-Christoph Steiner wrote:
>
> pyexiftool is just a wrapper for exiftool. exiftool looks great, but
> for my use case, I only need to strip all metadata. It would be much
> easier if that was in pure Python and pure Java. perl is a no go on
> Android.
>
> It was dead simple to strip EXIF from JPEG in Python:
>
> from PIL import Image
> with open(inpath, 'rb') as fp:
>     in_image = Image.open(fp)
>     data = list(in_image.getdata())
>     out_image = Image.new(in_image.mode, in_image.size)
>     out_image.putdata(data)
>     out_image.save(outpath)
>
> But that broke some PNGs, and the rest were larger in size.
>
> .hc
>
> Rick Valenzuela:
>> oh, you may already know this, but the previous code keeps a copy of the
>> file and metadata. if you want it gone with no copies, you have to add a
>> switch to overwrite, e.g.:
>>
>> ```
>> with exiftool.ExifTool() as et:
>>     et.execute(b'-all=', b'-overwrite_original', b'some.png')
>> ```
>>
>> On 12/12/2017 23:45, Rick Valenzuela wrote:
>>> heh, nice -- I just found this:
>>>
>>> https://github.com/smarnach/pyexiftool
>>>
>>> Tried it out and it worked great:
>>> ```
>>> with exiftool.ExifTool() as et:
>>>     et.execute(b'-all=', b'some.png')
>>> ```
>>>
>>> On 12/12/2017 19:53, Hans-Christoph Steiner wrote:
>>>>
>>>> Ah, cool, I thought exiftool only worked with JPEGs. It seems to work
>>>> with just about every image format. Now the open question is how to
>>>> strip all PNG metadata with Python and Java.
>>>>
>>>> .hc
>>>>
>>>> Rick Valenzuela:
>>>>> does exiftool do what you need?
>>>>>
>>>>> `exiftool -all= `
>>>>>
>>>>> On 11/12/2017 17:57, Hans-Christoph Steiner wrote:
>>>>>>
>>>>>> Anyone know any tools for sanitizing PNGs without touching the
>>>>>> compressed image data? With JPEG it is easy to strip out EXIF with
>>>>>> python-pil or many other tools. I haven't found a simple, clean approach
>>>>>> in Python for PNGs.
>>>>>>
>>>>>> .hc

import java.io.EOFException;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.Arrays;

public class PngCleaner {

    public static final byte[] PNG_FILE_SIGNATURE =
            {(byte) 137, 80, 78, 71, 13, 10, 26, 10};

    public static void clean(InputStream in, OutputStream out)
            throws IOException {
        byte[] buf = new byte[8];
        if (!readFully(in, buf)) throw new EOFException();
        if (!Arrays.equals(PNG_FILE_SIGNATURE, buf))
            throw new IOException("Not a valid PNG file");
        out.write(buf);
        // Each chunk starts with an 8-byte header: 4-byte length, 4-byte type
        while (readFully(in, buf)) {
            if (isAncillaryChunk(buf)) skipChunk(in, buf);
            else copyChunk(in, out, buf);
        }
    }

    private static boolean readFully(InputStream in, byte[] b)
            throws IOException {
        int off = 0;
        while (off < b.length) {
            int read = in.read(b, off, b.length - off);
            if (read == -1) return false;
            off += read;
        }
        return true;
    }

    private static boolean isAncillaryChunk(byte[] header) {
        // Ancillary bit is bit 5 of first byte of chunk type
        return (header[4] & 0x20) != 0;
    }

    private static void skipChunk(InputStream in, byte[] header)
            throws IOException {
        // Chunk length is first four bytes of chunk header, excluding 4-byte CRC
        long length = readUint32(header) + 4;
        long skipped = 0;
        while (skipped < length) {
            long n = in.skip(length - skipped);
            if (n <= 0) {
                // skip() may return 0 at end of stream; read one byte so a
                // truncated chunk throws instead of looping forever
                if (in.read() == -1) throw new EOFException();
                n = 1;
            }
            skipped += n;
        }
    }

    private static long readUint32(byte[] src) {
        return ((src[0] & 0xFFL) << 24) | ((src[1] & 0xFFL) << 16)
                | ((src[2] & 0xFFL) << 8) | (src[3] & 0xFFL);
    }

    private static void copyChunk(InputStream in, OutputStream out,
            byte[] header) throws IOException {
        out.write(header);
        // Chunk length is first four bytes of chunk header, excluding 4-byte CRC
        long length = readUint32(header) + 4;
        long copied = 0;
        byte[] buf = new byte[1024];
        while (copied < length) {
            // Take the minimum as a long first so a huge length can't overflow int
            int read = in.read(buf, 0,
                    (int) Math.min(length - copied, buf.length));
            if (read == -1) throw new EOFException();
            out.write(buf, 0, read);
            copied += read;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("Usage: PngCleaner ");
            System.exit(1);
        }
        // try-with-resources closes both streams even if clean() throws
        try (InputStream in = new FileInputStream(args[0]);
             OutputStream out = new FileOutputStream(args[1])) {
            clean(in, out);
        }
    }
}
Re: [guardian-dev] sanitizing PNGs
oh, you may already know this, but the previous code keeps a copy of the
file and metadata. if you want it gone with no copies, you have to add a
switch to overwrite, e.g.:

```
with exiftool.ExifTool() as et:
    et.execute(b'-all=', b'-overwrite_original', b'some.png')
```

On 12/12/2017 23:45, Rick Valenzuela wrote:
> heh, nice -- I just found this:
>
> https://github.com/smarnach/pyexiftool
>
> Tried it out and it worked great:
> ```
> with exiftool.ExifTool() as et:
>     et.execute(b'-all=', b'some.png')
> ```
>
> On 12/12/2017 19:53, Hans-Christoph Steiner wrote:
>>
>> Ah, cool, I thought exiftool only worked with JPEGs. It seems to work
>> with just about every image format. Now the open question is how to
>> strip all PNG metadata with Python and Java.
>>
>> .hc
>>
>> Rick Valenzuela:
>>> does exiftool do what you need?
>>>
>>> `exiftool -all= `
>>>
>>> On 11/12/2017 17:57, Hans-Christoph Steiner wrote:
>>>>
>>>> Anyone know any tools for sanitizing PNGs without touching the
>>>> compressed image data? With JPEG it is easy to strip out EXIF with
>>>> python-pil or many other tools. I haven't found a simple, clean approach
>>>> in Python for PNGs.
>>>>
>>>> .hc

--
Rick Valenzuela
Videojournalist
Shanghai, China
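
The same two exiftool switches can also be passed without the pyexiftool wrapper. A minimal sketch, assuming the `exiftool` binary is installed and on PATH; the helper names `exiftool_strip_command` and `strip_with_exiftool` are invented for this example and are not part of any library:

```python
import subprocess


def exiftool_strip_command(path, overwrite=True):
    """Build the exiftool argument list that removes all metadata from `path`."""
    cmd = ["exiftool", "-all="]
    if overwrite:
        # Without this switch exiftool keeps the untouched file as `<name>_original`
        cmd.append("-overwrite_original")
    cmd.append(path)
    return cmd


def strip_with_exiftool(path, overwrite=True):
    """Run exiftool on `path`; raises CalledProcessError if exiftool fails."""
    # Assumption: the `exiftool` executable can be found on PATH
    subprocess.run(exiftool_strip_command(path, overwrite), check=True)
```

Building the argument list separately from running it makes it easy to check what will be executed; `strip_with_exiftool('some.png')` would then rewrite the file in place.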
Re: [guardian-dev] sanitizing PNGs
heh, nice -- I just found this:

https://github.com/smarnach/pyexiftool

Tried it out and it worked great:
```
with exiftool.ExifTool() as et:
    et.execute(b'-all=', b'some.png')
```

On 12/12/2017 19:53, Hans-Christoph Steiner wrote:
>
> Ah, cool, I thought exiftool only worked with JPEGs. It seems to work
> with just about every image format. Now the open question is how to
> strip all PNG metadata with Python and Java.
>
> .hc
>
> Rick Valenzuela:
>> does exiftool do what you need?
>>
>> `exiftool -all= `
>>
>> On 11/12/2017 17:57, Hans-Christoph Steiner wrote:
>>>
>>> Anyone know any tools for sanitizing PNGs without touching the
>>> compressed image data? With JPEG it is easy to strip out EXIF with
>>> python-pil or many other tools. I haven't found a simple, clean approach
>>> in Python for PNGs.
>>>
>>> .hc

--
Rick Valenzuela
Videojournalist
Shanghai, China
Re: [guardian-dev] sanitizing PNGs
Ah, cool, I thought exiftool only worked with JPEGs. It seems to work
with just about every image format. Now the open question is how to
strip all PNG metadata with Python and Java.

.hc

Rick Valenzuela:
> does exiftool do what you need?
>
> `exiftool -all= `
>
> On 11/12/2017 17:57, Hans-Christoph Steiner wrote:
>>
>> Anyone know any tools for sanitizing PNGs without touching the
>> compressed image data? With JPEG it is easy to strip out EXIF with
>> python-pil or many other tools. I haven't found a simple, clean approach
>> in Python for PNGs.
>>
>> .hc

--
PGP fingerprint: EE66 20C7 136B 0D2C 456C 0A4D E9E2 8DEA 00AA 5556
https://pgp.mit.edu/pks/lookup?op=vindex=0xE9E28DEA00AA5556
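
Whichever tool ends up doing the stripping, it helps to see which chunks a PNG actually carries before and after. A rough Python sketch for that, reading only the chunk headers; the name `list_chunks` is hypothetical and it assumes the whole file fits in memory:

```python
import struct


def list_chunks(data: bytes):
    """Return (type, length, is_ancillary) for each chunk in a PNG byte string."""
    if data[:8] != b"\x89PNG\r\n\x1a\n":
        raise ValueError("Not a valid PNG file")
    chunks = []
    pos = 8
    while pos + 8 <= len(data):
        # Chunk header: 4-byte big-endian data length, then 4-byte type
        length, chunk_type = struct.unpack(">I4s", data[pos:pos + 8])
        # A lowercase first letter (bit 0x20 set) marks an ancillary chunk
        is_ancillary = bool(chunk_type[0] & 0x20)
        chunks.append((chunk_type.decode("latin-1"), length, is_ancillary))
        pos += 8 + length + 4  # header + data + CRC
    return chunks
```

Running it on a file before and after sanitizing shows whether chunks such as tEXt, iTXt or tIME are still present.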
Re: [guardian-dev] sanitizing PNGs
does exiftool do what you need?

`exiftool -all= `

On 11/12/2017 17:57, Hans-Christoph Steiner wrote:
>
> Anyone know any tools for sanitizing PNGs without touching the
> compressed image data? With JPEG it is easy to strip out EXIF with
> python-pil or many other tools. I haven't found a simple, clean approach
> in Python for PNGs.
>
> .hc

--
Rick Valenzuela
Shanghai, China
+86 185 0177 0138
r...@rickv.com
https://www.linkedin.com/in/rickvalenzuela
GPG: 0x054124ADD5644029