On Wed, Nov 26, 2014 at 02:39:25PM +0530, Sashank Dara wrote:
> But what are the theoretical roots ?
> Can we model the variations in the code that exhibit the same behavior ?
>
> (Am not able to articulate it more formally , let me give a try)
> Say how to model two different strings of same language exhibiting same
> behavior ?

People researching "ROP gadgets" and how to construct programs out of them are doing significant research into modelling the behavior of weird machines made out of somewhat random bits of code. I believe there are researchers with automated systems for creating exploits. I heard such systems existed back in 2008 or so, and I assume they are much more mature now. They work by symbolic analysis.

> Can we model run time behavior of a program in Computation theory at all ?

I suppose the question is: what behavior, and what kind of polymorphism, are you interested in?

Replacing "add one" with "subtract negative one" is pretty easy to detect. That's a clearly equivalent machine. Automated trivial polymorphism here:

http://www.crazyboy.com/hydan/

You might make progress with that kind of analysis with this:

http://bitblaze.cs.berkeley.edu/

But what if I add an extra system call to sleep? Most of what a malware payload is interested in is side effects, like snooping on the keyboard and sending the keystrokes out over the network. It's not the kind of computation that academics typically talk about.

If the polymorphism you're trying to detect involves changes to system calls, you'll need some kind of model of their semantics to detect that the code used to send a buffer in one syscall but now sends it in two.

You might be able to do something interesting with static analysis to detect "bad" things and exfiltration of data. However, things like games really do scan the raw keyboard, and clever malware does its keyboard snooping in exactly the same way to avoid detection.

On top of that, most malware is probably going to be using some kind of "packer", so you need to emulate the unpacking long enough to get the actual instructions it will execute. That behavior might be detectable. Maybe that's what you're referring to. Sophisticated malware detects this emulation and either does not unpack at all or waits a long time before unpacking itself.

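To make the syscall point above a bit more concrete, here is a rough sketch of what a "model of their semantics" could look like in its most trivial form. Everything in it is an assumption for illustration: the trace format (name, fd, payload), the set of timing-only calls, and the rule that back-to-back sends on the same descriptor can be merged (which only holds for stream-like sockets, not datagrams). It is not how any real detector works, just the shape of the idea.

```python
# Hypothetical trace format: (syscall_name, fd, payload_bytes).
# The names below and the merge rule are assumptions, not a real tool's API.

# Calls assumed to have no effect other than delaying execution.
TIMING_ONLY = {"sleep", "nanosleep"}


def normalize(trace):
    """Reduce a syscall trace to a canonical form.

    - drops timing-only calls
    - merges consecutive sends on the same fd into one send
      (assumes stream semantics, where message boundaries don't matter)
    """
    out = []
    for name, fd, data in trace:
        if name in TIMING_ONLY:
            continue
        if out and name == "send" and out[-1][0] == "send" and out[-1][1] == fd:
            # Same descriptor, back to back: treat as one larger send.
            out[-1] = ("send", fd, out[-1][2] + data)
        else:
            out.append((name, fd, data))
    return out


def same_behavior(trace_a, trace_b):
    """Two traces count as equivalent if their canonical forms match."""
    return normalize(trace_a) == normalize(trace_b)


# One send versus two sends with a sleep in between.
original = [("read", 0, b""), ("send", 3, b"secret-keys")]
variant = [
    ("read", 0, b""),
    ("send", 3, b"secret-"),
    ("nanosleep", -1, b""),
    ("send", 3, b"keys"),
]
# Same payload, but sent to a different descriptor.
different = [("read", 0, b""), ("send", 4, b"secret-keys")]
```

With these definitions, `same_behavior(original, variant)` holds while `same_behavior(original, different)` does not. Every rewrite rule like this has to be justified against the real semantics of the call, which is where the actual work is.
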
However, beware that you can do a LOT of anti-RE stuff:

http://www.recon.cx/en/f/vskype-part1.pdf
http://www.recon.cx/en/f/vskype-part2.pdf

While it may not be possible to analyze such software, it may be possible to separate software trying to do tricky things like self-modifying code from software being open and honest. But there will likely be very common classes of "bad" behavior that are widely used, even by legitimate software, like patching the GOT (global offset table) and patching up DLL call pointers on the first call. You'd have to write detectors for those and avoid blacklisting software for that alone.

-- 
http://www.subspacefield.org/~travis/
Split a packed field and I am there; parse a line of text and you will find me.
_______________________________________________
langsec-discuss mailing list
langsec-discuss@mail.langsec.org
https://mail.langsec.org/cgi-bin/mailman/listinfo/langsec-discuss