Obfuscation technique in Dotnet

shijumon-codebrain Thu, 24 Jun 2004 23:22:20 -0700


Hi All,
        As you may know, it is possible to regenerate source code from your 
.NET .dll or .exe files. If you want to protect your source code, apply 
obfuscation techniques.

by
codebrain
-------------------------------------------------------------------------------------------------------------------


This article was published in MSDN Magazine

SUMMARY
One of the advantages of the .NET architecture is that assemblies built with 
it contain lots of useful information that can be recovered using ILDASM, 
the intermediate language disassembler. A side effect, though, is that 
someone with access to your binaries can recover a good approximation of the 
original source code. Here the authors present program obfuscation as a way 
to deter reverse engineering. In addition, they discuss the different types 
of obfuscation technologies available and demonstrate the new obfuscation 
tool that is included in Visual Studio .NET 2003.

By now you are probably familiar with all of the benefits that the 
metadata-rich Microsoft .NET Framework architecture brings to the table, 
from easing the burdens of deployment and versioning to the rich IDE 
functionality enabled by self-describing binaries. You may not know that the 
easy availability of all this metadata has introduced a problem that until 
now was not a concern for most developers. Programs written for the common 
language runtime (CLR) are easier to reverse engineer. This is not in any 
way a fault in the design of the .NET Framework; it is simply a reality of 
modern, intermediate-compiled languages (Java-language applications display 
the same characteristics). Both Java and the .NET Framework use rich 
metadata embedded inside the executable code: bytecode in the case of Java, 
Microsoft Intermediate Language (MSIL) in .NET. Being much higher level than 
binary machine code, the executable files are laden with information that 
can be easily deciphered.
  With the help of tools like ILDASM (the MSIL disassembler that ships with 
the .NET Framework SDK) or decompilers such as Anakrino and Reflector for 
.NET, anyone can easily look at your assemblies and reverse engineer them 
back into readable source code. Hackers can search for security flaws to 
exploit, steal unique ideas, and crack programs. This should be enough to 
give you pause.
  Don't worry, though. There's a solution, obfuscation, that will help you 
thwart reverse engineering. Obfuscation is a technique that provides for 
seamless renaming of symbols in assemblies as well as other tricks to foil 
decompilers. When it is properly applied, obfuscation can increase the 
protection against decompilation by many orders of magnitude, while leaving 
the application intact. Obfuscation is commonly used in Java environments 
and for years has been helping companies protect the intellectual property 
in their Java-based products.
  Several third parties have answered the call by creating obfuscators for 
.NET code. Microsoft includes the Dotfuscator Community Edition with Visual 
Studio .NET 2003 in partnership with our company, PreEmptive Solutions, 
which ships a number of obfuscator packages.
  Using the Dotfuscator Community Edition, this article will teach you all 
about obfuscation (and a little about decompilation), the types of 
obfuscation commonly available, and some of the issues you will need to 
address when working with an obfuscator.
  To demonstrate decompilation and obfuscation, we are going to use an 
open-source implementation of the classic Vexed game. Vexed.NET was written 
by Roey Ben-amotz and is available at http://vexeddotnet.benamotz.com. It's 
a puzzle game in which your goal is to move similar blocks together, which 
causes them to disappear. Below is a simple method from the source code of 
Vexed.NET:
public void undo() {
  if (numOfMoves>0) {
    numOfMoves--;
    if (_UserMoves.Length>=2)
        _UserMoves = _UserMoves.Substring(0, _UserMoves.Length-2);
    this.loadBoard(this.moveHistory[numOfMoves -
                      (numOfMoves/50) * 50]);
    this.drawBoard(this.gr);
  }
}


Disassembly
  The .NET Framework SDK ships with a disassembler utility called ILDASM, 
which allows you to disassemble .NET Framework assemblies into IL Assembly 
Language statements. In order to start ILDASM, you must make sure that the 
.NET Framework SDK is installed and type ILDASM on the command line followed 
by the name of the program that you want to disassemble. In our case, we 
will type "ILDASM vexed.net.exe". This will launch the ILDASM UI, which can 
be used to browse the structure of any .NET Framework-based application. 
Figure 1 shows the undo method disassembled.

Decompilation
  If you're now thinking that only a small circle of folks who actually know 
IL Assembly Language will see and understand your source code, remember that 
the decompilation doesn't stop there. We can recreate the actual source code 
by using a decompiler. These utilities can decompile a .NET assembly 
directly back to a high-level language like C#, Visual Basic .NET, or C++. 
Let's look at the undo method generated by the Anakrino decompiler:
public void undo() {
  if (this.numOfMoves > 0) {
    this.numOfMoves =
      this.numOfMoves - 1;
    if (this._UserMoves.Length >= 2)
      this._UserMoves =
           this._UserMoves.Substring(0, this._UserMoves.Length - 2);
    this.loadBoard(
         this.moveHistory[this.numOfMoves -
             this.numOfMoves / 50 * 50]);
    this.drawBoard(this.gr);
  }
}
As you can see, the results are almost identical to the original code. 
Later, we will revisit this to see the results after obfuscation.

Obfuscation in Depth
  Obfuscation is accomplished using a set of related technologies. Its goal 
is to hide the intent of a program without changing its runtime behavior. 
It's not encryption, but in the context of .NET code, it might be better. 
You could encrypt .NET assemblies to make them completely unreadable. 
However, this methodology suffers from a classic dilemma: since the runtime 
must execute unencrypted code, the decryption key must be kept with the 
encrypted program. Therefore, an automated utility could be created to 
recover the key, decrypt the code, and then write out the IL to disk in its 
original form. Once that happens, the program is fully exposed to 
decompilation.
  To give an analogy, encryption is like locking a six-course meal into a 
lockbox. Only the intended diner (in this case, the CLR) has the key and we 
don't want anyone else to know what he or she is going to eat. 
Unfortunately, at mealtime the food will be in plain view to all observers. 
Obfuscation works more like putting the six-course meal into a blender and 
sending it to the diner in a plastic bag. Sure, everyone can see the food in 
transit, but besides a lucky pea or some beef-colored goop, they don't know 
what the original meal is. The diner still gets the intended delivery and 
the meal still provides the same nutritional value as it did before 
(luckily, the CLR isn't picky about taste). The trick of an obfuscator is to 
confuse observers, while still delivering the same product to the CLR.
  Of course, obfuscation (or encryption) is not a hundred percent foolproof. 
Even compiled C++ can be disassembled. If a hacker is persistent enough, she 
can reproduce your code.


Figure 2 Obfuscation Process

  Obfuscation is a process that is applied to compiled .NET assemblies, not 
source code. An obfuscator never reads or alters your source code. Figure 2 
shows the flow of the obfuscation process. The output of the obfuscator is 
another set of assemblies, functionally equivalent to the input assemblies, 
yet transformed in ways that hinder reverse engineering. We will now 
consider two essential techniques that Dotfuscator Community Edition uses to 
achieve that goal: renaming and removing nonessential metadata.

Renaming Metadata
  The first line of defense in obfuscation is to rename meaningful names 
with non-meaningful ones. As you know, there is a lot of value in 
well-chosen names. They help make your code self-documenting and serve as 
valuable clues that reveal the purpose of the item they represent. The CLR 
doesn't care how descriptive a name is, so obfuscators are free to change 
them, typically to one-character names like "a".
  Obviously there are constraints on the amount of renaming an obfuscator 
will be able to perform on a particular application. Generally speaking, 
there are three common renaming scenarios.
  If your application consists of one or more assemblies that are standalone 
(that is, no unobfuscated code depends on any of the assemblies), then the 
obfuscator is free to rename any type or member regardless of the name's 
visibility, so long as the names and references to them are consistent 
across the set of assemblies. A Windows Forms application is a good example 
of this. At the opposite extreme, if your application is designed to be used 
by unobfuscated code, the obfuscator cannot change the names of types or 
members visible to those clients. Examples of this type of application are 
shared class libraries, reusable components, and the like. Somewhere in 
between are applications that are meant to plug into existing unobfuscated 
frameworks. In this case, the obfuscator can rename anything not accessed by 
the environment in which it is running, regardless of visibility. ASP.NET 
applications are good examples of this type of application.
  Dotfuscator Community Edition uses a patented renaming technique called 
overload induction that adds a twist to renaming. Method identifiers are 
maximally overloaded after an exhaustive scope analysis. Instead of 
substituting one new name for each old name, the overload induction 
technique renames as many methods as possible to the same name, confusing 
anyone trying to understand the decompiled code.
  In addition, as a nice side effect, the size of the application decreases 
due to the smaller size of the string heap contained in the assembly. For 
example, if you have a name that is 20 characters long, renaming it to "a" 
saves 19 characters. In addition, continually reusing names saves space by 
conserving string heap entries. Renaming everything to "a" means that "a" is 
stored only once, and each method or field renamed to "a" can point to it. 
Overload induction enhances this effect because the shortest identifiers are 
continually reused. Typically, an overload-induced project will have up to 
35 percent of the methods renamed to "a".
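
  To make the idea of overload induction concrete, here is a hand-written C# 
sketch. It is not Dotfuscator output, and the Board class and its members are 
hypothetical names invented for illustration. It shows three unrelated 
methods that can all legally share the name "a" because their parameter 
lists differ:

```csharp
using System;

// Hypothetical class as it might look BEFORE renaming.
public class Board
{
    public int MoveCount;
    public void Load(string layout) { MoveCount = 0; }
    public void Draw(int scale) { }
    public bool IsSolved() { return MoveCount > 0; }
}

// The same class after a hand-applied, overload-induction-style renaming.
// All three methods are now called "a"; only the signatures tell them apart.
public class h
{
    public int p;
    public void a(string A_0) { p = 0; }
    public void a(int A_0) { }
    public bool a() { return p > 0; }
}

public static class OverloadInductionDemo
{
    public static void Main()
    {
        var before = new Board();
        before.Load("level1");
        before.Draw(2);
        Console.WriteLine(before.IsSolved());

        // Identical behavior, but the call sites no longer reveal intent.
        var after = new h();
        after.a("level1");
        after.a(2);
        Console.WriteLine(after.a());
    }
}
```

Both halves of Main print the same value, since renaming leaves the logic 
untouched. A reader of the second half has to resolve every call to "a" by 
its argument types before learning anything about what it does.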
  To see the impact of renaming on decompiled code, take a look at the undo 
method after the renaming process:
public void c() {
    if (this.p > 0) {
        this.p = this.p - 1;
        if (this.r.Length >= 2)
            this.r = this.r.Substring(0, this.r.Length - 2);
        this.a(this.q[this.p - this.p / 50 * 50]);
        this.a(this.e);
    }
}
You can see that without any other kinds of obfuscation, this method is 
already much more difficult to understand.

Removing Nonessential Metadata
  Not all of the metadata in a compiled .NET-based application is used by 
the runtime. Some of it is there to be consumed by other tools such as 
designers, IDEs, and debuggers. For example, if you define a property called 
"Size" on a type in C#, the compiler will emit metadata for the property 
name "Size" and associate that name with the methods that implement the get 
and set operations (which it names "get_Size" and "set_Size", respectively). 
When you write code that sets the Size property, the compiler will always 
generate a call to the method "set_Size" itself and will never reference the 
property by its name. In fact, the name of the property is there for the IDE 
and developers who are using your code; it is never accessed by the CLR.
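
  The relationship between a property name and its accessor methods can be 
observed with reflection. The following is a minimal sketch; the Widget class 
is a hypothetical example type, while GetProperty, GetGetMethod, and 
GetSetMethod are standard System.Reflection calls:

```csharp
using System;
using System.Reflection;

// Hypothetical type with a single property named Size.
public class Widget
{
    private int size;
    public int Size
    {
        get { return size; }
        set { size = value; }
    }
}

public static class PropertyMetadataDemo
{
    // Returns the names of the methods that implement Widget.Size.
    public static string[] AccessorNames()
    {
        PropertyInfo prop = typeof(Widget).GetProperty("Size");
        return new string[]
        {
            prop.GetGetMethod().Name,
            prop.GetSetMethod().Name
        };
    }

    public static void Main()
    {
        // Prints get_Size, then set_Size.
        foreach (string name in AccessorNames())
            Console.WriteLine(name);
    }
}
```

Note that this sketch itself looks the property up by the name "Size", which 
is exactly the kind of tool-side use that depends on the property metadata 
being present.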
  If your application is meant to be used by just the runtime and not by 
other tools, it's safe for an obfuscator to remove this type of metadata. In 
addition to property names, event names and method parameter names fall into 
this category. Dotfuscator Community Edition removes all these types of 
metadata when it deems that it is safe to do so.

Additional Techniques
  Dotfuscator Community Edition provides good obfuscation using the 
techniques we've just described, but you should be aware of additional 
obfuscation techniques that provide even stronger protection and may foil 
reverse engineering altogether. Dotfuscator Professional Edition implements 
many additional techniques, including control flow obfuscation, string 
encryption, incremental obfuscation, and size reduction.
  Control flow obfuscation is a powerful technique, the goal of which is to 
hide the intent of a sequence of instructions without changing the logic. 
More importantly, it is used to remove the clues that decompilers look for 
in order to faithfully reproduce high-level source code statements, such as 
if-then-else statements and loops. In fact, this technique tends to break 
decompilers.
  To see this effect in action, look at the decompiled undo method again, 
after applying renaming and control flow obfuscation (see Figure 3). You can 
see that instead of the original nested if statements, the decompiler has 
produced an if statement, two nested while loops, and some gotos to tie it 
all together. The label i1 is referenced but it is not generated by the 
decompiler (this is a decompiler bug, we presume).
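
  As a rough source-level analogy, the sketch below shows the same loop 
written twice in C#: once in structured form and once scrambled by hand with 
gotos. This is an assumption-laden illustration, not what Dotfuscator emits. 
Real control flow obfuscation operates on the IL, where it can produce branch 
patterns that have no direct C# equivalent. SumTo and SumToScrambled are 
hypothetical names.

```csharp
using System;

public static class ControlFlowDemo
{
    // Structured version: the loop is obvious at a glance.
    public static int SumTo(int n)
    {
        int total = 0;
        for (int i = 1; i <= n; i++)
            total += i;
        return total;
    }

    // Same logic with the loop structure hidden behind jumps.
    public static int SumToScrambled(int n)
    {
        int total = 0;
        int i = 1;
        goto check;
    body:
        total += i;
        i++;
    check:
        if (i <= n) goto body;
        return total;
    }

    public static void Main()
    {
        Console.WriteLine(SumTo(10));          // 55
        Console.WriteLine(SumToScrambled(10)); // 55
    }
}
```

Both methods compute the same result; only the second hides the clue (a 
recognizable loop shape) that a decompiler relies on to emit a for statement.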
String encryption is a technique that applies a simple encryption algorithm 
to string literals embedded in your application. As mentioned before, any 
encryption (or specifically, decryption) that's performed at run time is 
inherently insecure. That is, a smart hacker can eventually break it, but 
for strings in application code, it is worthwhile. Let's face it, if hackers 
want to get into your code, they don't blindly start searching renamed 
types. They probably do searches for "Invalid License Key" which point them 
right to the code where license handling is performed. Searching on strings 
is incredibly easy; string encryption raises the bar because only the 
encrypted version is present in the compiled code.
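
  The general idea can be sketched in a few lines of C#. The single-byte XOR 
used here is an assumption chosen for brevity; the article does not say which 
algorithm Dotfuscator Professional applies, and the names Encode, Decode, and 
Key are hypothetical.

```csharp
using System;
using System.Text;

public static class StringHidingDemo
{
    // Hypothetical key. It ships with the program, which is why this
    // only raises the bar rather than providing real secrecy.
    private const byte Key = 0x5A;

    // What a build-time tool would do to each string literal.
    public static byte[] Encode(string plain)
    {
        byte[] data = Encoding.UTF8.GetBytes(plain);
        for (int i = 0; i < data.Length; i++)
            data[i] ^= Key;
        return data;
    }

    // What the shipped program does just before using the string.
    public static string Decode(byte[] hidden)
    {
        byte[] data = (byte[])hidden.Clone();
        for (int i = 0; i < data.Length; i++)
            data[i] ^= Key;
        return Encoding.UTF8.GetString(data);
    }

    public static void Main()
    {
        byte[] hidden = Encode("Invalid License Key");
        // The stored form no longer contains the searchable text.
        Console.WriteLine(BitConverter.ToString(hidden));
        Console.WriteLine(Decode(hidden));
    }
}
```

A text search of the compiled binary for "Invalid License Key" would find 
nothing in this scheme, although anyone who locates Decode can reverse it.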
  Incremental obfuscation helps with the challenge of issuing a patch to fix a 
customer's problems in the face of obfuscation. Fixing bugs in code often 
creates or deletes classes, methods, or fields. Changing code (for example, 
adding or deleting a method) may cause subsequent obfuscation runs to rename 
things slightly differently. What was previously called "a" might now be 
called "b". Unfortunately, how and what was renamed differently is a 
mystery.
  Incremental obfuscation can combat this problem. Dotfuscator creates a map 
file to tell you how it performed the renaming. That same map file, however, 
can be used as input to Dotfuscator on subsequent runs to dictate that 
renames used previously should be used again wherever possible. If you 
release your product and then patch a few classes, Dotfuscator can be run in 
such a way as to mimic its previous renaming scheme. That way, you can issue 
just the patched classes to your customers.
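
  The effect of feeding a previous map back into the renamer can be shown 
with a toy model. This is only a sketch under simplifying assumptions: each 
symbol gets a unique name (no overload induction), the map is an in-memory 
dictionary rather than Dotfuscator's map file, and Rename and NameFor are 
hypothetical helpers.

```csharp
using System;
using System.Collections.Generic;

public static class IncrementalRenameDemo
{
    // Generates a, b, ..., z, aa, ab, ... for index 0, 1, 2, ...
    private static string NameFor(int index)
    {
        string name = "";
        do
        {
            name = (char)('a' + index % 26) + name;
            index = index / 26 - 1;
        } while (index >= 0);
        return name;
    }

    // Assigns short names, reusing entries from previousMap where possible.
    public static Dictionary<string, string> Rename(
        IList<string> symbols, IDictionary<string, string> previousMap)
    {
        var map = new Dictionary<string, string>();
        var used = new HashSet<string>();

        // First pass: keep every name that was assigned last time.
        foreach (string symbol in symbols)
        {
            string oldName;
            if (previousMap.TryGetValue(symbol, out oldName))
            {
                map[symbol] = oldName;
                used.Add(oldName);
            }
        }

        // Second pass: new symbols take the next unused short name.
        int next = 0;
        foreach (string symbol in symbols)
        {
            if (map.ContainsKey(symbol)) continue;
            string candidate;
            do { candidate = NameFor(next++); } while (!used.Add(candidate));
            map[symbol] = candidate;
        }
        return map;
    }

    public static void Main()
    {
        var empty = new Dictionary<string, string>();
        var release1 = Rename(new[] { "drawBoard", "loadBoard", "undo" }, empty);

        // The patch adds checkWin ahead of the existing methods.
        var patched = new[] { "checkWin", "drawBoard", "loadBoard", "undo" };
        var fresh = Rename(patched, empty);          // existing names shift
        var incremental = Rename(patched, release1); // existing names kept

        foreach (string s in patched)
            Console.WriteLine(s + ": fresh=" + fresh[s]
                + " incremental=" + incremental[s]);
    }
}
```

In the fresh run every existing method receives a different name than in the 
first release, so unpatched assemblies would no longer link against the 
patched ones. In the incremental run only the new method gets a new name.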
Size reduction does not strictly impede reverse engineering, but it is worth 
mentioning because obfuscators almost always have to perform a dependency 
analysis on the set of input assemblies. Thus the obfuscator is in a good 
position to do more than obfuscate, and some of the better ones will use 
their knowledge of your application to remove code that your program is not 
using. It seems odd that unused code removal can actually do anything; who 
writes code they don't use? Well, the answer is all of us. What's more, we 
all use libraries and types written by other people that were written to be 
reusable.
  Reusable code implies there is contingent code that handles many cases; 
however, in any given application, you typically only use one or two of 
those many cases. An advanced obfuscator can determine this and strip out 
all the unused code (again, from the compiled assembly, not the source). The 
result is that the output contains precisely the types and methods your 
application needs and nothing more. A smaller application has the benefits of 
conserving computing resources and reducing load times. This can be 
especially important for apps running on the .NET Compact Framework or 
distributed applications.
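
  The dependency analysis behind unused code removal is essentially a 
reachability walk over the call graph. The sketch below assumes the graph has 
already been extracted into a dictionary; the method names and the Reachable 
helper are hypothetical, and a real tool would also have to account for 
reflection, virtual dispatch, and externally visible entry points.

```csharp
using System;
using System.Collections.Generic;

public static class PruningDemo
{
    // Returns every method reachable from entryPoint via the call graph.
    public static HashSet<string> Reachable(
        IDictionary<string, string[]> calls, string entryPoint)
    {
        var seen = new HashSet<string>();
        var pending = new Stack<string>();
        pending.Push(entryPoint);
        while (pending.Count > 0)
        {
            string method = pending.Pop();
            if (!seen.Add(method)) continue;   // already visited
            string[] callees;
            if (calls.TryGetValue(method, out callees))
                foreach (string callee in callees)
                    pending.Push(callee);
        }
        return seen;
    }

    public static void Main()
    {
        // Hypothetical call graph: method -> methods it calls.
        var calls = new Dictionary<string, string[]>
        {
            { "Main", new[] { "LoadBoard", "DrawBoard" } },
            { "LoadBoard", new[] { "ParseLevel" } },
            { "DrawBoard", new string[0] },
            { "ParseLevel", new string[0] },
            { "ExportToXml", new[] { "ParseLevel" } }, // nobody calls this
        };

        HashSet<string> keep = Reachable(calls, "Main");
        foreach (string method in calls.Keys)
            Console.WriteLine(method + ": "
                + (keep.Contains(method) ? "keep" : "remove"));
    }
}
```

Here ExportToXml is the only method marked for removal: it calls into code 
that is used, but nothing reachable from Main ever calls it.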

Using Dotfuscator Community Edition
  Now let's use Dotfuscator Community Edition to obfuscate the Vexed 
application. Dotfuscator Community Edition uses a configuration file that 
specifies the obfuscation settings for a particular application. It has a 
GUI to help you easily create and maintain the configuration file as well as 
run the obfuscator and examine the output. In addition, the Dotfuscator 
Community Edition's command-line interface allows you to easily integrate 
obfuscation into your automated build process. You can launch the GUI right 
from the Tools menu of Visual Studio .NET 2003.
  To configure Vexed for obfuscation, you need to specify three items in the 
Dotfuscator Community Edition GUI: the input assembly, the map file 
location, and the output directory. The input assemblies (Dotfuscator calls 
these "trigger assemblies") are specified on the Trigger tab. You can add as 
many here as you want, but you only need one for the Vexed application.
  You specify the map file location on the Rename | Options tab (see Figure 
4). The map file is an essential piece of information that contains the 
unambiguous name mappings between the original and obfuscated names. It is 
very important to keep this file after you obfuscate your application; 
without it, you will not be able to easily troubleshoot the obfuscated app. 
Due to its importance, Dotfuscator will not overwrite an existing map file 
by default unless you explicitly check the "Overwrite Map file" box.
  Finally, the Build tab allows you to specify the directory where the 
obfuscated application will be placed. Once you have done that, you are 
ready to obfuscate the application. You can save your configuration file for 
later use, then either press the "Build" button on the Build tab or use the 
"Play" button on the toolbar. While building, Dotfuscator displays progress 
information in the GUI's output pane. You can control the amount of 
information that is displayed here by choosing Quiet or Verbose on the 
Options tab.
  Once the build is complete, you can visually explore the results on the 
Output tab, shown in Figure 5. As you can see, Dotfuscator displays a 
graphical view of the application similar to an object browser. The new 
names are immediately below the original names in the view. In the figure, 
you can see that the class named "board" was renamed to "h", and two methods 
with different signatures (init and ToImage) were both renamed "a".

Examining the Map File
  The map file that Dotfuscator produces is an XML-formatted file, and in 
addition to the already mentioned name mappings, it contains some statistics 
that give a sense of how effective the renaming process was. Figure 6 
summarizes the statistics for types and methods after obfuscating the Vexed 
application.
  Map files are also used to perform incremental obfuscation. This process 
allows you to import names from a previous run, which tells the obfuscator 
to perform renaming in the same way as it was performed previously. If you 
are releasing a patch (or a new plug-in) for an already obfuscated 
application, you can obfuscate the updates using the same name set as the 
original version. This is of particular interest to enterprise development 
teams maintaining multiple interdependent applications.

Obfuscator Pitfalls
  Obfuscation, especially renaming, can be tricky on complex applications and 
is highly sensitive to correct configuration. If you aren't careful, the 
obfuscator can break your application. In this section, we'll discuss some 
of the more common issues that can arise when using an obfuscator.
  First, you need to do a little more work when your application includes a 
strongly named assembly. Strongly named assemblies are digitally signed, 
allowing the runtime to determine if an assembly has been altered after 
signing. The signature is an SHA1 hash signed with the private key of an RSA 
public/private key pair. Both the signature and the public key are embedded 
in the assembly's metadata. Since an obfuscator modifies the assembly, it is 
essential that signing occur after obfuscation. You should delay-sign the 
assembly during development and before obfuscation, then complete the 
signing process afterward. See the .NET Framework documentation for more 
details about delay-signed assemblies, and remember to turn off strong name 
validation while testing your delay-signed assemblies.
  The use of the Reflection API and dynamic class loading will also 
complicate the obfuscation process. Since these facilities are dynamic, they 
tend to defeat the static analysis techniques used by most obfuscators. 
Consider the following C# code snippet that gets a type by name and 
dynamically instantiates it, returning the new instance cast to an interface:
public MyInterface GetNewType() {
    Type type = Type.GetType( GetUserInputString(), true );
    object newInstance = Activator.CreateInstance( type );
    return newInstance as MyInterface;
}

   The name of the type is coming from another method. GetUserInputString 
may be asking the user to enter a string, or perhaps it retrieves it from a 
database. Either way, the type name is not present in the code for a static 
analysis to recover, so there is no way of knowing which types in the input 
assemblies may be instantiated in this manner. The solution in this case is 
to prevent renaming of all potentially loadable types that implement 
MyInterface (note that method and field renaming can still be performed). 
This is where manual configuration and some knowledge of the application 
being obfuscated play an important role. Dotfuscator Community Edition 
gives you the tools to prevent the renaming of select types, methods, or 
fields. You can pick and choose individual names; alternatively, you can 
write exclusion rules using regular expressions and other criteria, such as 
visibility or scope. For example, you could exclude all public methods from 
renaming.
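
  The failure mode is easy to reproduce without an obfuscator. In the sketch 
below, IShape, Circle, and Create are hypothetical names, and the lookup of 
"RenamedAway" stands in for a type name that was stored as a string before 
obfuscation and no longer matches any type afterward:

```csharp
using System;

public interface IShape { string Describe(); }

public class Circle : IShape
{
    public string Describe() { return "circle"; }
}

public static class ReflectionPitfallDemo
{
    // Same pattern as GetNewType, with the type name passed in.
    // Returns null instead of throwing when the name cannot be resolved.
    public static IShape Create(string typeName)
    {
        Type type = Type.GetType(typeName, false);
        if (type == null) return null;
        return Activator.CreateInstance(type) as IShape;
    }

    public static void Main()
    {
        // The name still matches a type in this assembly: lookup succeeds.
        Console.WriteLine(Create("Circle") != null);

        // Simulates a stale name after renaming: lookup fails at run time,
        // even though the program compiled and obfuscated without errors.
        Console.WriteLine(Create("RenamedAway") != null);
    }
}
```

Because the string is only data to the obfuscator, nothing links it to the 
Circle type. Excluding Circle (or every IShape implementation) from renaming 
keeps the two in step.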
  Another issue with using an obfuscator occurs after you have deployed an 
obfuscated application and you are trying to support it. Say your 
application is throwing an exception (which happens even to the best of us) 
and a customer sends you a stack dump that looks something like this:
System.Exception: A serious error has occurred
   at cv.a()
   at cv..ctor(Hashtable A_0)
   at ar.a(di A_0)
   at ae.a(String[] A_0)
Obviously this is a lot less informative than a stack dump from the 
unobfuscated program. The good news is that you can use the map file 
generated during obfuscation to decode the stack trace back to the original. 
The bad news is that there is sometimes not enough information in the stack 
trace to unambiguously retrieve the original symbols from the map file. For 
example, notice in the dump that the method return types are omitted. In 
applications obfuscated with an enhanced overload induction renaming 
algorithm, methods that differ only by return type may be renamed to the 
same name. So the stack trace can be ambiguous. Most of the time, you can 
narrow the possibilities enough to find the original names to a high degree 
of certainty. To help, Dotfuscator Professional provides a tool to 
automatically translate the stack trace back to the original offending 
method.
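
  The decoding step, including the ambiguity just described, can be sketched 
as a simple lookup. Everything here is illustrative: the dictionary layout is 
not Dotfuscator's map file format, the original names are invented, and 
Candidates is a hypothetical helper.

```csharp
using System;
using System.Collections.Generic;

public static class StackTraceDecodeDemo
{
    // Hypothetical map: obfuscated "type.method" -> possible original names.
    // Two originals share "cv.a" because they differ only by return type,
    // which the stack trace does not show.
    private static readonly Dictionary<string, string[]> Map =
        new Dictionary<string, string[]>
        {
            { "cv.a", new[] { "Parser.ReadHeader", "Parser.ReadBody" } },
            { "ar.a", new[] { "Loader.Open" } },
        };

    // Takes one frame such as "   at cv.a()" and returns the candidates.
    public static string[] Candidates(string frame)
    {
        string text = frame.Trim();
        if (text.StartsWith("at ", StringComparison.Ordinal))
            text = text.Substring(3);
        int paren = text.IndexOf('(');
        string key = paren >= 0 ? text.Substring(0, paren) : text;

        string[] names;
        return Map.TryGetValue(key, out names) ? names : new string[0];
    }

    public static void Main()
    {
        string[] dump = { "   at cv.a()", "   at ar.a(di A_0)" };
        foreach (string frame in dump)
            Console.WriteLine(frame.Trim() + " -> "
                + string.Join(" | ", Candidates(frame)));
    }
}
```

The second frame decodes to a single method, while the first yields two 
candidates that have to be narrowed down from context, such as which of them 
could plausibly be called from the frame below it.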

Conclusion
  You don't need to let hackers use the handy ILDASM utility on your app for 
questionable purposes. You can protect your code with a good obfuscator. 
Obfuscation raises the reverse engineering bar. In the Visual Studio .NET 
2003 box, Dotfuscator Community Edition makes good obfuscation just a few 
clicks away.
