Re: [funsec] Image forensics

Dan Kaminsky Mon, 28 Dec 2009 09:51:45 -0800

I don't necessarily disagree with your assertions, Neal -- or, I at  
least think you're well within your rights as an author to take your  
particular position.


However, as an independent reviewer, I see a really small sample size
for your findings, and no ground truth analysis. In other words, if I
hand you 100 photos, approximately 50 of which are photoshopped and
approximately 50 of which aren't, how much better than chance will your
tools be at picking out the altered photos, and at determining the
alterations?

As you yourself admit, natural features can trigger your tool.  How  
often *do* they?  As you intriguingly point out, not always. This is  
good.

However.

Forensics aren't a game. People live and die over the determinations  
we make. There have...been issues, with bite mark analysis, and with  
arson determination, that have thoroughly destroyed lives, up to and  
including the death penalty.  This stuff is really important, way more  
than anything on this list.

What I would like to do is actually give you the hundred images as  
described, and receive:

A) The raw output from your tool (identical settings for all files --  
if you need multiple settings, multiply them out across all files).
B) Your interpretation of the output

I will then unmask the originals, and changes, and we can calculate  
the relative effectiveness of your various approaches.
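
To make "relative effectiveness" concrete, here is a minimal sketch of the
scoring, assuming each image gets a single altered/unaltered verdict. The
function names and the 40-of-50 / 10-of-50 numbers are made up for
illustration; they are not results from Neal's tools. It reports the
detection rate, the false positive rate (how often natural features trigger
the tool), and a one-sided binomial tail probability of doing at least this
well by coin-flip guessing.

```python
from math import comb


def confusion_counts(truth, predicted):
    """Count (tp, fp, tn, fn); True means 'altered' in both lists."""
    pairs = list(zip(truth, predicted))
    tp = sum(t and p for t, p in pairs)
    fp = sum((not t) and p for t, p in pairs)
    tn = sum((not t) and (not p) for t, p in pairs)
    fn = sum(t and (not p) for t, p in pairs)
    return tp, fp, tn, fn


def p_value_vs_chance(correct, total, p=0.5):
    """P(X >= correct) for X ~ Binomial(total, p), i.e. pure guessing.

    p=0.5 assumes a roughly even altered/unaltered split (an assumption).
    """
    return sum(
        comb(total, k) * p**k * (1 - p) ** (total - k)
        for k in range(correct, total + 1)
    )


def score(truth, predicted):
    """Summarize one reviewer/tool run against the unmasked ground truth."""
    tp, fp, tn, fn = confusion_counts(truth, predicted)
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "detection_rate": tp / (tp + fn) if tp + fn else 0.0,
        "false_positive_rate": fp / (fp + tn) if fp + tn else 0.0,
        "p_value_vs_chance": p_value_vs_chance(tp + tn, total),
    }


# Hypothetical run: 50 altered photos followed by 50 untouched ones.
truth = [True] * 50 + [False] * 50
# Made-up verdicts: 40 of the altered flagged, 10 of the untouched flagged.
predicted = [True] * 40 + [False] * 10 + [True] * 10 + [False] * 40

result = score(truth, predicted)
for name, value in result.items():
    print(f"{name}: {value:.3g}")
```

With that made-up split the accuracy is 0.8 and the false positive rate is
0.2. The same scoring can be run once per tool setting, and separately for
the raw output (A) and the interpreted output (B).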

I've always liked your work, Neal. I mean that, I was a graphics geek  
before I was a security geek, and you've done amazing work at the  
intersection.  I just think some numbers would make it infinitely  
stronger.

What do you think?


On Dec 28, 2009, at 6:13 PM, "Dr. Neal Krawetz" <[email protected]>  
wrote:

> On 27 Dec 2009, Rob, grandpa of Ryan, Trevor, Devon & Hannah wrote:
>> An interesting analysis of a graphic recently used by Victoria's
>> Secret in their advertising.  This gives chapter and verse of the
>> techniques used, and results obtained, demonstrating the ability to
>> determine if an image has been altered, and even which parts of an
>> image have been modified, and how.
>>
>> http://www.hackerfactor.com/blog/index.php?/archives/322-Body-By-Victoria.html
>
> [snip]
>
> Thanks for the compliments.
> (I'm just catching up on my emails...)
>
>
> Re: Dan Kaminsky
>> Neal's code is neat and pretty, but chapter and verse is no substitute
>> for open code and side by side checks. A LOT of his output bears a
>> strong resemblance to edge detection (really, look for high frequency
>> signal, it'll show up in every test).
>
> Edges can show up for many reasons.
>  - The edge may be a high frequency region (as you stated) that appears.
>  - With algorithms like ELA and LG, high contrast edges (like stripes on
>    a zebra) can be at a higher error level or strong gradient than the
>    rest of the image. However, it will not be significantly stronger.
>    (If ELA has a black background, then the high contrast edge may be
>    grayish, but not white.)
>  - Artists usually make changes at edges to reduce visual detection.
>    Think about it: if you are going to cut out or mask something, you are
>    going to do it along the edge.  In the VS example, her outline is
>    visible, but inside edges are not.  If the algorithms were only
>    picking up edges, then all edges (inside, outside, and outline) should
>    be at the same level.  They are not.
>
> As a counter example to your edge theory, consider:
> http://www.hackerfactor.com/blog/index.php?/archives/338-Id-Rather-Wear-Photoshop.html
> (If you get a 503 server error, just reload.  GoDaddy's server is having
> trouble with the concurrent connection load right now.  This will be
> fixed in January.)
> In the Error Level Analysis, the halo totally disappears, even though it
> is a high contrast and high frequency element (white on dark).
> If the algorithm was measuring edges, then the halo should still be visible
> at least to some degree.
>
> Second, with regards to "open code", I strongly disagree with your
> assumption.  You seem to assume that releasing the code will allow people
> to validate the methods.
>
> - If I release my own tool, then they will just use it and look at the
>   results.  This does not validate the code nor the methods.
>
> - If I don't release my own tools, but describe the algorithms, then
>   people will create their own and perform a more scientific comparison.
>
> If you create your own tool that implements a variation of the algorithm(s)
> and you cannot generate the same kind of results, then there is either
> something wrong with your code or with mine.  Now we can do a proper
> comparison.  We have a hypothesis and multiple tools to test it.
>
> As an example, I have implemented my own PCA, DCT, and wavelet libraries.
> (I couldn't use any of the public ones due to GPL issues.)  To validate
> my libraries, I compared the results with GSL and other public libraries.
> Since GSL and the other public libraries generate the same output as
> my own library, it validates the implementation and method.
>
> Thus, to validate the algorithms I use, someone else needs to implement
> something based on the description of the algorithm.  Already, someone
> implemented ELA based on the description in my Black Hat presentation:
>  http://www.tinyappz.com/wiki/Error_Level_Analyser
> His tool creates different coloring (he decided to use a temperature map),
> but it generates results that are similar enough to validate the algorithm
> and implementation.
>
> There is another group that is working on their own variation of Luminance
> Gradient, but they have not yet released their code. (And I don't know if
> they plan to.)  Then again, my LG implementation is not unique.  There are
> dozens of published papers that implement variations of the algorithm.
> The algorithm I use is one of the most trivial methods (but it is fast
> and effective).
>
> Finally, I have no intention of releasing my code to the open source
> community.  My code is designed to assist forensic investigators with a
> serious problem: distinguishing real photos from computer graphics, and
> identifying manipulation.  (This is the "real vs virtual" child porn
> problem.)  A full, public release only helps the bad guys.
> (Yes: this is the Security by Obscurity vs Full Disclosure debate.  I've
> chosen my side.)
>
>
> Re: Imri Goldberg
> John Graham-Cumming's copy-move code is really pretty cool.
> I wrote my own variation (based on the same paper that he cites); mine is
> heavily optimized.  I described some of my optimization at:
> http://www.hackerfactor.com/blog/index.php?/archives/308-Send-In-The-Clones.html
> There is even a group working on their own variation:
>  http://www.tinyappz.com/wiki/Copymove
> (If John's code, my code, and Tinyappz all generate similar results, then
> the algorithm must work and the methodology must be sound!)
>
>
> Re: Martin Tomasek
>> I like wavelet-based algorithms the most.
>
> To each their own. :-)
> Wavelets definitely have some strong points.
> But for signal analysis, I'm actually growing very fond of Gaussian
> Pyramid Decomposition.
>
>                    -Neal
> --
> Neal Krawetz, Ph.D.
> Hacker Factor Solutions
> http://www.hackerfactor.com/
> Author of "Introduction to Network Security" (Charles River Media, 2006)
> and "Hacking Ubuntu" (Wiley, 2007)
>
> _______________________________________________
> Fun and Misc security discussion for OT posts.
> https://linuxbox.org/cgi-bin/mailman/listinfo/funsec
> Note: funsec is a public and open mailing list.