-----------------------------------------------------------
New Message on cochindotnet
-----------------------------------------------------------
From: shijumon-codebrain
Message 1 in Discussion
Hi All
It is possible to regenerate source code from your .NET .dll or .exe
files, so if you want to protect your source code, please apply
obfuscation techniques.
by
codebrain
-------------------------------------------------------------------------------------------------------------------
This article was published in MSDN Magazine
SUMMARY
One of the advantages of the .NET architecture is that assemblies built with
it contain lots of useful information that can be recovered using ILDASM,
the intermediate language disassembler. A side effect, though, is that
someone with access to your binaries can recover a good approximation of the
original source code. Here the authors present program obfuscation as a way
to deter reverse engineering. In addition, they discuss the different types
of obfuscation technologies available and demonstrate the new obfuscation
tool that is included in Visual Studio .NET 2003.
By now you are probably familiar with all of the benefits that the
metadata-rich Microsoft .NET Framework architecture brings to the table,
from easing the burdens of deployment and versioning to the rich IDE
functionality enabled by self-describing binaries. You may not know that the
easy availability of all this metadata has introduced a problem that until
now was not a concern for most developers. Programs written for the common
language runtime (CLR) are easier to reverse engineer. This is not in any
way a fault in the design of the .NET Framework; it is simply a reality of
modern, intermediate-compiled languages (Java-language applications display
the same characteristics). Both Java and the .NET Framework use rich
metadata embedded inside the executable code: bytecode in the case of Java,
Microsoft Intermediate Language (MSIL) in .NET. Being much higher level than
binary machine code, the executable files are laden with information that
can be easily deciphered.
With the help of tools like ILDASM (the MSIL disassembler that ships with
the .NET Framework SDK) or decompilers such as Anakrino and Reflector for
.NET, anyone can easily look at your assemblies and reverse engineer them
back into readable source code. Hackers can search for security flaws to
exploit, steal unique ideas, and crack programs. This should be enough to
give you pause.
Don't worry, though. There's a solution: obfuscation, which will help you
thwart reverse engineering. Obfuscation is a technique that provides for
seamless renaming of symbols in assemblies as well as other tricks to foil
decompilers. When it is properly applied, obfuscation can increase the
protection against decompilation by many orders of magnitude, while leaving
the application intact. Obfuscation is commonly used in Java environments
and for years has been helping companies protect the intellectual property
in their Java-based products.
Several third parties have answered the call by creating obfuscators for
.NET code. Microsoft includes the Dotfuscator Community Edition with Visual
Studio .NET 2003 in partnership with our company, PreEmptive Solutions,
which ships a number of obfuscator packages.
Using the Dotfuscator Community Edition, this article will teach you all
about obfuscation (and a little about decompilation), the types of
obfuscation commonly available, and some of the issues you will need to
address when working with an obfuscator.
To demonstrate decompilation and obfuscation, we are going to use an
open-source implementation of the classic Vexed game. Vexed.NET was written
by Roey Ben-amotz and is available at http://vexeddotnet.benamotz.com. It's
a puzzle game in which your goal is to move similar blocks together, which
causes them to disappear. Below is a simple method from the source code of
Vexed.NET:
public void undo() {
    if (numOfMoves > 0) {
        numOfMoves--;
        if (_UserMoves.Length >= 2)
            _UserMoves = _UserMoves.Substring(0, _UserMoves.Length - 2);
        this.loadBoard(this.moveHistory[numOfMoves -
            (numOfMoves / 50) * 50]);
        this.drawBoard(this.gr);
    }
}
Disassembly
The .NET Framework SDK ships with a disassembler utility called ILDASM,
which allows you to disassemble .NET Framework assemblies into IL Assembly
Language statements. In order to start ILDASM, you must make sure that the
.NET Framework SDK is installed and type ILDASM on the command line followed
by the name of the program that you want to disassemble. In our case, we
will type "ILDASM vexed.net.exe". This will launch the ILDASM UI, which can
be used to browse the structure of any .NET Framework-based application.
Figure 1 shows the undo method disassembled.
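ILDASM reads the same metadata that the reflection API exposes, so a few
lines of code are enough to list what an assembly reveals about itself. The
following is a minimal sketch written for a current C# compiler, not part of
the original article; SampleBoard and MetadataDump are hypothetical names
introduced only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical stand-in for a type inside an assembly you want to inspect.
public class SampleBoard
{
    private int numOfMoves;
    public void undo() { if (numOfMoves > 0) numOfMoves--; }
}

public static class MetadataDump
{
    // Returns "<kind> <name>" for every member declared directly on the
    // given type, including private ones. No source code is needed:
    // everything comes from the metadata stored in the assembly.
    public static List<string> DeclaredMemberNames(Type type)
    {
        const BindingFlags flags =
            BindingFlags.Public | BindingFlags.NonPublic |
            BindingFlags.Instance | BindingFlags.Static |
            BindingFlags.DeclaredOnly;

        var names = new List<string>();
        foreach (MemberInfo member in type.GetMembers(flags))
            names.Add(member.MemberType + " " + member.Name);
        return names;
    }
}
```

To point the same helper at a compiled program, you could loop over
Assembly.LoadFrom(path).GetTypes() and call DeclaredMemberNames on each
type. Private field and method names show up alongside the public ones.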
Decompilation
If you're now thinking that only a small circle of folks who actually know
IL Assembly Language will see and understand your source code, remember that
the decompilation doesn't stop there. We can recreate the actual source code
by using a decompiler. These utilities can decompile a .NET assembly
directly back to a high-level language like C#, Visual Basic .NET, or C++.
Let's look at the undo method generated by the Anakrino decompiler:
public void undo() {
    if (this.numOfMoves > 0) {
        this.numOfMoves =
            this.numOfMoves - 1;
        if (this._UserMoves.Length >= 2)
            this._UserMoves =
                this._UserMoves.Substring(0, this._UserMoves.Length - 2);
        this.loadBoard(
            this.moveHistory[this.numOfMoves -
                this.numOfMoves / 50 * 50]);
        this.drawBoard(this.gr);
    }
}
As you can see, the results are almost identical to the original code.
Later, we will revisit this to see the results after obfuscation.
Obfuscation in Depth
Obfuscation is accomplished using a set of related technologies. Its goal
is to hide the intent of a program without changing its runtime behavior.
It's not encryption, but in the context of .NET code, it might be better.
You could encrypt .NET assemblies to make them completely unreadable.
However, this methodology suffers from a classic dilemma: since the runtime
must execute unencrypted code, the decryption key must be kept with the
encrypted program. Therefore, an automated utility could be created to
recover the key, decrypt the code, and then write out the IL to disk in its
original form. Once that happens, the program is fully exposed to
decompilation.
To give an analogy, encryption is like locking a six-course meal into a
lockbox. Only the intended diner (in this case, the CLR) has the key and we
don't want anyone else to know what he or she is going to eat.
Unfortunately, at mealtime the food will be in plain view to all observers.
Obfuscation works more like putting the six-course meal into a blender and
sending it to the diner in a plastic bag. Sure, everyone can see the food in
transit, but besides a lucky pea or some beef-colored goop, they don't know
what the original meal is. The diner still gets the intended delivery and
the meal still provides the same nutritional value as it did before
(luckily, the CLR isn't picky about taste). The trick of an obfuscator is to
confuse observers, while still delivering the same product to the CLR.
Of course, obfuscation (or encryption) is not a hundred percent foolproof.
Even compiled C++ can be disassembled. If a hacker is persistent enough, she
can reproduce your code.
Figure 2 Obfuscation Process
Obfuscation is a process that is applied to compiled .NET assemblies, not
source code. An obfuscator never reads or alters your source code. Figure 2
shows the flow of the obfuscation process. The output of the obfuscator is
another set of assemblies, functionally equivalent to the input assemblies,
yet transformed in ways that hinder reverse engineering. We will now
consider two essential techniques that Dotfuscator Community Edition uses to
achieve that goal: renaming and removing nonessential metadata.
Renaming Metadata
The first line of defense in obfuscation is to rename meaningful names
with non-meaningful ones. As you know, there is a lot of value in
well-chosen names. They help make your code self-documenting and serve as
valuable clues that reveal the purpose of the item they represent. The CLR
doesn't care how descriptive a name is, so obfuscators are free to change
them, typically to one-character names like "a".
Obviously there are constraints on the amount of renaming an obfuscator
will be able to perform on a particular application. Generally speaking,
there are three common renaming scenarios.
If your application consists of one or more assemblies that are standalone
(that is, no unobfuscated code depends on any of the assemblies), then the
obfuscator is free to rename any name in them regardless of that name's
visibility, so long as the names and references to them are consistent
across the set of assemblies. A Windows Forms application is a good example
of this. At the opposite extreme, if your application is designed to be used
by unobfuscated code, the obfuscator cannot change the names of types or
members visible to those clients. Examples of this type of application are
shared class libraries, reusable components, and the like. Somewhere in
between are applications that are meant to plug into existing unobfuscated
frameworks. In this case, the obfuscator can rename anything not accessed by
the environment in which it is running, regardless of visibility. ASP.NET
applications are good examples of this type of application.
Dotfuscator Community Edition uses a patented renaming technique called
overload induction that adds a twist to renaming. Method identifiers are
maximally overloaded after an exhaustive scope analysis. Instead of
substituting one new name for each old name, the overload induction
technique renames as many methods as possible to the same name, confusing
anyone trying to understand the decompiled code.
In addition, as a nice side effect, the size of the application decreases
due to the smaller size of the string heap contained in the assembly. For
example, if you have a name that is 20 characters long, renaming it to "a"
saves 19 characters. In addition, continually reusing names saves space by
conserving string heap entries. Renaming everything to "a" means that "a" is
stored only once, and each method or field renamed to "a" can point to it.
Overload induction enhances this effect because the shortest identifiers are
continually reused. Typically, an overload-induced project will have up to
35 percent of the methods renamed to "a".
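The effect of giving many methods the same name can be seen in a small
hand-written sketch. This is not Dotfuscator output and does not reproduce
its patented algorithm; BoardOriginal and its methods are hypothetical, and
the class name "h" simply echoes the style of short names described above.
The point is only that methods with different parameter lists can legally
share one identifier.

```csharp
using System;

// Before renaming: three unrelated methods with descriptive names.
public class BoardOriginal
{
    public string LoadBoard(string layout) { return "loaded " + layout; }
    public string DrawBoard(int scale) { return "drawn at " + scale; }
    public string Reset() { return "reset"; }
}

// After a hand-applied renaming in the spirit of overload induction:
// the three methods differ in their parameter lists, so all of them
// can be called "a" and the compiler and runtime still tell them apart.
public class h
{
    public string a(string A_0) { return "loaded " + A_0; }
    public string a(int A_0) { return "drawn at " + A_0; }
    public string a() { return "reset"; }
}
```

Behavior is unchanged, but a reader of the second class has to work out
from the argument types alone which "a" is being called at each call site.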
To see the impact of renaming on decompiled code, take a look at the undo
method after the renaming process:
public void c() {
    if (this.p > 0) {
        this.p = this.p - 1;
        if (this.r.Length >= 2)
            this.r = this.r.Substring(0, this.r.Length - 2);
        this.a(this.q[this.p - this.p / 50 * 50]);
        this.a(this.e);
    }
}
You can see that without any other kinds of obfuscation, this method is
already much more difficult to understand.
Removing Nonessential Metadata
Not all of the metadata in a compiled .NET-based application is used by
the runtime. Some of it is there to be consumed by other tools such as
designers, IDEs, and debuggers. For example, if you define a property called
"Size" on a type in C#, the compiler will emit metadata for the property
name "Size" and associate that name with the methods that implement the get
and set operations (which it names "get_Size" and "set_Size", respectively).
When you write code that sets the Size property, the compiler will always
generate a call to the method "set_Size" itself and will never reference the
property by its name. In fact, the name of the property is there for the IDE
and developers who are using your code; it is never accessed by the CLR.
If your application is meant to be used by just the runtime and not by
other tools, it's safe for an obfuscator to remove this type of metadata. In
addition to property names, event names and method parameter names fall into
this category. Dotfuscator Community Edition removes all these types of
metadata when it deems that it is safe to do so.
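The relationship between a property name and its accessor methods can be
checked with reflection. The sketch below is an illustration only; Block and
PropertyMetadataDemo are hypothetical names. It shows that the work is done
by the methods get_Size and set_Size, which is why the property name itself
is the part that tools, rather than ordinary compiled calls, depend on.

```csharp
using System;
using System.Reflection;

// Hypothetical type with a single property.
public class Block
{
    public int Size { get; set; }
}

public static class PropertyMetadataDemo
{
    // Returns the names of the accessor methods the compiler generated
    // for the given property: { getter name, setter name }.
    public static string[] AccessorNames(Type type, string propertyName)
    {
        PropertyInfo property = type.GetProperty(propertyName);
        return new[]
        {
            property.GetGetMethod().Name,
            property.GetSetMethod().Name
        };
    }

    // Sets the value by invoking the accessor method directly, which is
    // what compiled code that assigns to the property ends up calling.
    public static int SetSizeViaAccessor(Block block, int value)
    {
        MethodInfo setter = typeof(Block).GetMethod("set_Size");
        setter.Invoke(block, new object[] { value });
        return block.Size;
    }
}
```

Note that code which looks a property up by name at run time, as
AccessorNames does here, is exactly the kind of tool-style access that
stops working once the property name has been removed.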
Additional Techniques
Dotfuscator Community Edition provides good obfuscation using the
techniques we've just described, but you should be aware of additional
obfuscation techniques that provide even stronger protection and may foil
reverse engineering altogether. Dotfuscator Professional Edition implements
many additional techniques, including control flow obfuscation, string
encryption, incremental obfuscation, and size reduction.
Control flow obfuscation is a powerful technique, the goal of which is to
hide the intent of a sequence of instructions without changing the logic.
More importantly, it is used to remove the clues that decompilers look for
in order to faithfully reproduce high-level source code statements, such as
if-then-else statements and loops. In fact, this technique tends to break
decompilers.
To see this effect in action, look at the decompiled undo method again,
after applying renaming and control flow obfuscation (see Figure 3). You can
see that instead of the original nested if statements, the decompiler has
produced an if statement, two nested while loops, and some gotos to tie it
all together. The label i1 is referenced but it is not generated by the
decompiler (this is a decompiler bug, we presume).
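For readers without access to the figure, the general idea can be shown with
a hand-written example. This is not Dotfuscator output; real control flow
obfuscation works on IL rather than C# source, and ControlFlowDemo and its
methods are hypothetical. The sketch only illustrates that the same logic
can be expressed without the loop and if shapes a decompiler looks for.

```csharp
public static class ControlFlowDemo
{
    // Straightforward version: sum of the even numbers below limit.
    public static int SumEvensClear(int limit)
    {
        int sum = 0;
        for (int i = 0; i < limit; i++)
        {
            if (i % 2 == 0)
                sum += i;
        }
        return sum;
    }

    // Same logic, rewritten by hand with labels and jumps so that the
    // loop and the condition are no longer visible as structured
    // statements.
    public static int SumEvensTangled(int limit)
    {
        int sum = 0;
        int i = 0;
        goto check;
    body:
        if (i % 2 != 0) goto next;
        sum += i;
    next:
        i++;
    check:
        if (i < limit) goto body;
        return sum;
    }
}
```

Both methods return the same value for every input; only the readability
differs.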
String encryption is a technique that applies a simple encryption algorithm
to string literals embedded in your application. As mentioned before, any
encryption (or specifically, decryption) that's performed at run time is
inherently insecure. That is, a smart hacker can eventually break it, but
for strings in application code, it is worthwhile. Let's face it, if hackers
want to get into your code, they don't blindly start searching renamed
types. They probably do searches for "Invalid License Key" which point them
right to the code where license handling is performed. Searching on strings
is incredibly easy; string encryption raises the bar because only the
encrypted version is present in the compiled code.
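A deliberately simple sketch of the idea follows. It is an assumption-laden
illustration, not the algorithm any particular obfuscator uses: the XOR
scheme, the key value, and the StringHider class are all hypothetical. The
encoded form would be embedded in the assembly in place of the literal, and
the program would call Decode just before using the string.

```csharp
using System;
using System.Text;

public static class StringHider
{
    // Hypothetical single-byte key. It ships with the program, which is
    // why this kind of scheme only raises the bar and is not secure.
    private const byte Key = 0x5A;

    // Build-time step: turn a literal into the text that is embedded.
    public static string Encode(string plain)
    {
        byte[] bytes = Encoding.UTF8.GetBytes(plain);
        for (int i = 0; i < bytes.Length; i++)
            bytes[i] ^= Key;
        return Convert.ToBase64String(bytes);
    }

    // Run-time step: recover the original string just before it is used.
    public static string Decode(string hidden)
    {
        byte[] bytes = Convert.FromBase64String(hidden);
        for (int i = 0; i < bytes.Length; i++)
            bytes[i] ^= Key;
        return Encoding.UTF8.GetString(bytes);
    }
}
```

A search of the binary for the plain text of an error message would no
longer find it, although anyone who locates Decode and the key can reverse
the process.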
Incremental obfuscation helps with the challenge of issuing a patch to fix a
customer's problems in the face of obfuscation. Fixing bugs in code often
creates or deletes classes, methods, or fields. Changing code (for example,
adding or deleting a method) may cause subsequent obfuscation runs to rename
things slightly differently. What was previously called "a" might now be
called "b". Unfortunately, how and what was renamed differently is a
mystery.
Incremental obfuscation can combat this problem. Dotfuscator creates a map
file to tell you how it performed the renaming. That same map file, however,
can be used as input to Dotfuscator on subsequent runs to dictate that
renames used previously should be used again wherever possible. If you
release your product and then patch a few classes, Dotfuscator can be run in
such a way as to mimic its previous renaming scheme. That way, you can issue
just the patched classes to your customers.
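The role of the map file as an input can be sketched with a toy renamer.
This is a simplified model under stated assumptions: it gives every symbol a
unique short name (so it ignores overload induction), it represents the map
as an in-memory dictionary rather than Dotfuscator's actual file format, and
IncrementalRenamer is a hypothetical name.

```csharp
using System.Collections.Generic;

public static class IncrementalRenamer
{
    // Assigns a short name to every symbol. Symbols found in previousMap
    // keep the name they had before; new symbols get the next name that
    // is not already taken.
    public static Dictionary<string, string> Rename(
        IEnumerable<string> symbols,
        IDictionary<string, string> previousMap)
    {
        var result = new Dictionary<string, string>();
        var used = new HashSet<string>(previousMap.Values);
        int next = 0;

        foreach (string symbol in symbols)
        {
            string name;
            if (!previousMap.TryGetValue(symbol, out name))
            {
                do { name = ShortName(next++); } while (used.Contains(name));
                used.Add(name);
            }
            result[symbol] = name;
        }
        return result;
    }

    // 0 -> "a", 1 -> "b", ..., 25 -> "z", 26 -> "aa", 27 -> "ab", ...
    private static string ShortName(int index)
    {
        string name = "";
        do
        {
            name = (char)('a' + index % 26) + name;
            index = index / 26 - 1;
        } while (index >= 0);
        return name;
    }
}
```

Running Rename on the patched symbol list with an empty map would shift the
names of everything after an inserted symbol. Passing the map from the
original release keeps the existing names stable, so only the new symbol
receives a new name.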
Size reduction does not strictly impede reverse engineering, but it is worth
mentioning because obfuscators almost always have to perform a dependency
analysis on the set of input assemblies. Thus the obfuscator is in a good
position to do more than obfuscate, and some of the better ones will use
their knowledge of your application to remove code that your program is not
using. It may seem odd that unused code removal can actually do anything; who
writes code they don't use? Well, the answer is all of us. What's more, we
all use libraries and types written by other people that were written to be
reusable.
Reusable code implies there is contingent code that handles many cases;
however, in any given application, you typically only use one or two of
those many cases. An advanced obfuscator can determine this and strip out
all the unused code (again, from the compiled assembly, not the source). The
result is that the output contains precisely the types and methods your
application needs and nothing more. A smaller application has the benefits of
conserving computing resources and reducing load times. This can be
especially important for apps running on the .NET Compact Framework or
distributed applications.
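The dependency analysis behind size reduction amounts to a reachability
walk over the call graph. The sketch below is a simplified model, not how
any specific product works: the call graph is supplied as a plain
dictionary, the method names in the usage note are hypothetical, and a real
tool must also account for reflection, virtual dispatch, and entry points
other than Main.

```csharp
using System.Collections.Generic;

public static class UnusedCodeFinder
{
    // calls maps each method name to the methods it calls.
    // Returns every method that can be reached from entryPoint;
    // anything not in the result is a candidate for removal.
    public static HashSet<string> Reachable(
        IDictionary<string, string[]> calls,
        string entryPoint)
    {
        var seen = new HashSet<string>();
        var pending = new Stack<string>();
        pending.Push(entryPoint);

        while (pending.Count > 0)
        {
            string method = pending.Pop();
            if (!seen.Add(method))
                continue; // already visited

            string[] callees;
            if (calls.TryGetValue(method, out callees))
            {
                foreach (string callee in callees)
                    pending.Push(callee);
            }
        }
        return seen;
    }
}
```

For example, if Main calls LoadBoard and DrawBoard, LoadBoard calls Parse,
and a library method ExportPdf also calls Parse, then ExportPdf is never
reached from Main and could be stripped, while Parse has to stay.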
Using Dotfuscator Community Edition
Now let's use Dotfuscator Community Edition to obfuscate the Vexed
application. Dotfuscator Community Edition uses a configuration file that
specifies the obfuscation settings for a particular application. It has a
GUI to help you easily create and maintain the configuration file as well as
run the obfuscator and examine the output. In addition, the Dotfuscator
Community Edition's command-line interface allows you to easily integrate
obfuscation into your automated build process. You can launch the GUI right
from the tools menu of Visual Studio .NET 2003.
To configure Vexed for obfuscation, you need to specify three items in the
Dotfuscator Community Edition GUI: the input assembly, the map file
location, and the output directory. The input assemblies (Dotfuscator calls
these "trigger assemblies") are specified on the Trigger tab. You can add as
many here as you want, but you only need one for the Vexed application.
You specify the map file location on the Rename | Options tab (see Figure
4). The map file is an essential piece of information that contains the
unambiguous name mappings between the original and obfuscated names. It is
very important to keep this file after you obfuscate your application;
without it, you will not be able to easily troubleshoot the obfuscated app.
Due to its importance, Dotfuscator will not overwrite an existing map file
by default unless you explicitly check the "Overwrite Map file" box.
Finally, the Build tab allows you to specify the directory where the
obfuscated application will be placed. Once you have done that, you are
ready to obfuscate the application. You can save your configuration file for
later use, then either press the "Build" button on the Build tab or use the
"Play" button on the toolbar. While building, Dotfuscator displays progress
information in the GUI's output pane. You can control the amount of
information that is displayed here by choosing Quiet or Verbose on the
Options tab.
Once the build is complete, you can visually explore the results on the
Output tab, shown in Figure 5. As you can see, Dotfuscator displays a
graphical view of the application similar to an object browser. The new
names are immediately below the original names in the view. In the figure,
you can see that the class named "board" was renamed to "h", and two methods
with different signatures (init and ToImage) were both renamed "a".
Examining the Map File
The map file that Dotfuscator produces is an XML-formatted file, and in
addition to the already mentioned name mappings, it contains some statistics
that give a sense of how effective the renaming process was. Figure 6
summarizes the statistics for types and methods after obfuscating the Vexed
application.
Map files are also used to perform incremental obfuscation. This process
allows you to import names from a previous run, which tells the obfuscator
to perform renaming in the same way as it was performed previously. If you
are releasing a patch (or a new plug-in) for an already obfuscated
application, you can obfuscate the updates using the same name set as the
original version. This is of particular interest to enterprise development
teams maintaining multiple interdependent applications.
Obfuscator Pitfalls
Obfuscation, especially renaming, can be tricky on complex applications and
is highly sensitive to correct configuration. If you aren't careful, the
obfuscator can break your application. In this section, we'll discuss some
of the more common issues that can arise when using an obfuscator.
First, you need to do a little more work when your application includes a
strongly named assembly. Strongly named assemblies are digitally signed,
allowing the runtime to determine if an assembly has been altered after
signing. The signature is an SHA1 hash signed with the private key of an RSA
public/private key pair. Both the signature and the public key are embedded
in the assembly's metadata. Since an obfuscator modifies the assembly, it is
essential that signing occur after obfuscation. You should delay-sign the
assembly during development and before obfuscation, then complete the
signing process afterward. See the .NET Framework documentation for more
details about delay-signed assemblies, and remember to turn off strong name
validation while testing your delay-signed assemblies.
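As a rough sketch of that workflow, delay signing in the .NET Framework of
this era is requested with assembly-level attributes, and the Strong Name
tool (sn.exe) from the SDK handles the rest. The file names below are
hypothetical, and newer compilers prefer equivalent project or command-line
settings over these attributes, so treat this as an outline rather than a
recipe.

```csharp
using System.Reflection;

// Hypothetical file name: a key file holding only the PUBLIC key,
// extracted beforehand with "sn -p MyKeyPair.snk MyPublicKey.snk".
[assembly: AssemblyKeyFile("MyPublicKey.snk")]

// Reserve space for the signature but do not sign at compile time.
[assembly: AssemblyDelaySign(true)]

// Typical command sequence around obfuscation (file names hypothetical):
//   sn -Vr MyApp.exe               skip strong name verification while testing
//   ...run the obfuscator on MyApp.exe...
//   sn -R MyApp.exe MyKeyPair.snk  re-sign the obfuscated output with the key pair
//   sn -Vu MyApp.exe               re-enable verification when testing is done
```

The important ordering is that the final signing step runs on the
obfuscated output, not on the assembly the compiler produced.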
The use of the Reflection API and dynamic class loading will also
complicate the obfuscation process. Since these facilities are dynamic, they
tend to defeat the static analysis techniques used by most obfuscators.
Consider the following C# code snippet that gets a type by name and
dynamically instantiates it, returning the type cast to an interface:
public MyInterface GetNewType() {
    Type type = Type.GetType( GetUserInputString(), true );
    object newInstance = Activator.CreateInstance( type );
    return newInstance as MyInterface;
}
The name of the type is coming from another method. GetUserInputString
may be asking the user to enter a string, or perhaps it retrieves it from a
database. Either way, the type name is not present in the code for a static
analysis to recover, so there is no way of knowing which types in the input
assemblies may be instantiated in this manner. The solution in this case is
to prevent renaming of all potentially loadable types that implement
MyInterface (note that method and field renaming can still be performed).
This is where manual configuration and some knowledge of the application
being obfuscated play an important role. Dotfuscator Community Edition
gives you the tools to prevent the renaming of select types, methods, or
fields. You can pick and choose individual names; alternatively, you can
write exclusion rules using regular expressions and other criteria, such as
visibility or scope. For example, you could exclude all public methods from
renaming.
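The failure mode is easy to reproduce by hand. In the sketch below, which
uses hypothetical names throughout (VexedDemo, IPiece, PieceFactory), the
class "a" stands for a type that was originally called VexedDemo.Block and
was then renamed. A lookup by the old name, as it might arrive from a
configuration file or database, no longer finds anything.

```csharp
using System;

namespace VexedDemo
{
    public interface IPiece { string Describe(); }

    // Imagine the obfuscator renamed "VexedDemo.Block" to "VexedDemo.a".
    public class a : IPiece
    {
        public string Describe() { return "block"; }
    }

    public static class PieceFactory
    {
        // Looks the type up by name in this assembly and instantiates it.
        // Returns null when no type with that name exists any more.
        public static IPiece Create(string typeName)
        {
            Type type = typeof(PieceFactory).Assembly.GetType(typeName, false);
            if (type == null)
                return null;
            return Activator.CreateInstance(type) as IPiece;
        }
    }
}
```

Create("VexedDemo.Block") returns null here, while Create("VexedDemo.a")
succeeds. Since callers will keep passing the original name, the practical
fix is to exclude such types from renaming.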
Another issue with using an obfuscator occurs after you have deployed an
obfuscated application and you are trying to support it. Say your
application is throwing an exception (which happens even to the best of us)
and a customer sends you a stack dump that looks something like this:
System.Exception: A serious error has occurred
   at cv.a()
   at cv..ctor(Hashtable A_0)
   at ar.a(di A_0)
   at ae.a(String[] A_0)
Obviously this is a lot less informative than a stack dump from the
unobfuscated program. The good news is that you can use the map file
generated during obfuscation to decode the stack trace back to the original.
The bad news is that there is sometimes not enough information in the stack
trace to unambiguously retrieve the original symbols from the map file. For
example, notice in the dump that the method return types are omitted. In
applications obfuscated with an enhanced overload induction renaming
algorithm, methods that differ only by return type may be renamed to the
same name. So the stack trace can be ambiguous. Most of the time, you can
narrow the possibilities enough to find the original names to a high degree
of certainty. To help, Dotfuscator Professional provides a tool to
automatically translate the stack trace back to the original offending
method.
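A manual decoding pass can be sketched as a lookup from each obfuscated
frame to the original names that share it. This is an illustrative model
only: the map is an in-memory dictionary rather than Dotfuscator's map file
format, StackFrameDecoder is a hypothetical name, and the original method
names used in the usage note are invented for the example.

```csharp
using System.Collections.Generic;

public static class StackFrameDecoder
{
    // map: obfuscated "type.method" -> every original method that was
    // given that name. More than one entry means the frame is ambiguous.
    public static IList<string> Candidates(
        IDictionary<string, List<string>> map,
        string frameLine)
    {
        // Reduce "   at cv.a(di A_0)" to "cv.a".
        string frame = frameLine.Trim();
        if (frame.StartsWith("at "))
            frame = frame.Substring(3);
        int paren = frame.IndexOf('(');
        if (paren >= 0)
            frame = frame.Substring(0, paren);

        List<string> originals;
        return map.TryGetValue(frame, out originals)
            ? originals
            : new List<string>();
    }
}
```

If the map lists both Board.Init and Board.ToImage under "cv.a", the frame
"at cv.a()" yields two candidates, and the parameter list or the
surrounding frames have to be used to choose between them.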
Conclusion
You don't need to let hackers use the handy ILDASM utility on your app for
questionable purposes. You can protect your code with a good obfuscator.
Obfuscation raises the reverse engineering bar. In the Visual Studio .NET
2003 box, Dotfuscator Community Edition makes good obfuscation just a few
clicks away.