------------------------------------------------------------------------------------------------------------
How to create secure software and why everyone should create secure software.
------------------------------------------------------------------------------------------------------------
Author: Amit Choudhary
Email: amitchoudhary0523 AT gmail DOT com
These days, a lot of software gets hacked, and software hacking has become
quite common, so everyone should create secure software so that their software
doesn't get hacked.
"Insecure software" gets hacked, and this leads to the loss of millions of
dollars and also exposes users' private data, such as SSNs, mobile numbers, etc.
So, it is "necessary" to create secure software these days.
Below, I have given the main points of creating secure software.
********************
In general, when we talk about creating secure software, we often don't know
where to start or what to do.
This article brings clarity to this issue. It explains where to start and what
to do to create secure software.
We will now analyze what software is made of. If we look closely at any piece
of software, we will realize that it is made of functions and functions only.
So, if we write secure functions, then the software will become secure
automatically.
We need to focus only on functions to make the whole software secure. We need
not think separately about how the software as a whole is going to fend off
hacking attacks. If all the functions are secure, then the whole software is
also secure.
Below, I have given the main points of writing secure functions.
********************
These points apply to almost all programming languages.
1. The first point is that every argument of your function should always be
checked to confirm that it is in the allowed range, even if your function is
called from other trusted functions and even if your function is
private/protected or static. The value of a function argument should not be
unbounded; it should be within a bounded range. Every function argument should
have a MIN value and a MAX value. You should try to choose MIN and MAX values
that satisfy the majority of users. It is not possible to satisfy all the
users. If some users don't like your limits, then they can implement their own
versions of the function. The main point here is: don't satisfy the minority
at the expense of the majority.
Now, a counterpoint can be that checking all the function arguments will
degrade the performance of the function/software "significantly", but, as far
as I know, no research/study/tests have been done on this.
I did some experiments with glibc's qsort() function, and I found that the
performance degradation was only 0.6% - 0.8% per function argument. So, if a
function has 5 arguments, then the maximum performance degradation will be
only about 4% (5 x 0.8%). If these measurements carry over to other functions,
and all functions in a program have 5 arguments, then argument validation of
all functions will make the whole program slower by only about 4%.
To find the MIN and the MAX values of your arguments, you can check how much
your function/software can support and also how much the underlying hardware
can support. Another factor to consider is how much the majority of users
would "practically/realistically" need. For example, in real life, I haven't
heard of someone needing to sort an integer array having 1 billion elements
(in RAM, not on disk). So, there is no point in supporting that many elements.
Now, let's say that you have a sorting function that sorts an integer array,
and it takes the number of array elements as an input. In this case, you can
limit the maximum number of elements to around 10% of the RAM size. On Linux,
you can get the RAM size through /proc/meminfo. Maybe you can implement a
function called get_ram_size() so that your code can run on all systems
having different RAM sizes. For example, if your system has 2 GB of RAM, then
the maximum limit would be 214,748,364 elements (around 214 million
elements). Note that with 4-byte integers, an array of that many elements
occupies about 40% of the RAM; if you want the array itself to stay within
10% of the RAM, then divide by the element size, which gives 53,687,091
elements. Either limit should be enough to satisfy almost all users, but if
some people are not happy with this limit, then they can implement their own
versions of the function. Again, please don't satisfy the minority at the
expense of the majority.
Similarly, for strings, you can set the maximum length to 1 KB, or 100 KB, or
1% of the RAM size, and then you can validate the length of the input by
using the strnlen() function (strnlen() is specified by POSIX, not by the ISO
C standard, but it is available in glibc and most other C libraries). If you
are using a different language, then you have to find an equivalent function
in that language or implement your own strnlen() function in that language.
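The checks described in point 1 can be sketched in C as follows. This is only
a minimal sketch under my own assumptions, not a definitive implementation:
the names max_elements_for_ram(), sort_ints_checked(), check_string_arg() and
the limit MAX_STR_LEN are hypothetical, the RAM size is passed in as a
parameter instead of being read from /proc/meminfo, and the 10% is treated as
a budget in bytes (so the element count is that budget divided by the element
size).

```c
#define _POSIX_C_SOURCE 200809L  /* strnlen() comes from POSIX, not ISO C */
#include <stdlib.h>
#include <string.h>

#define MIN_ELEMENTS 1LL
#define MAX_STR_LEN  1024LL      /* assumed limit: 1 KB */

/* Largest element count whose array fits in 10% of ram_bytes.
 * Returns 0 if either argument is not positive. */
long long max_elements_for_ram(long long ram_bytes, long long elem_size)
{
    if (ram_bytes <= 0 || elem_size <= 0)
        return 0;
    return (ram_bytes / 10) / elem_size;
}

/* Three-way comparison for qsort(); avoids overflow of "x - y". */
static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Sorts arr only if every argument is inside its allowed range.
 * Returns 0 on success and -1 if any argument is rejected. */
int sort_ints_checked(int *arr, long long count, long long max_count)
{
    if (arr == NULL || count < MIN_ELEMENTS || count > max_count)
        return -1;
    qsort(arr, (size_t)count, sizeof arr[0], cmp_int);
    return 0;
}

/* Returns 0 if s is non-NULL and is terminated within MAX_STR_LEN bytes,
 * and -1 otherwise. strnlen() never reads more than MAX_STR_LEN + 1 bytes. */
int check_string_arg(const char *s)
{
    if (s == NULL)
        return -1;
    if ((long long)strnlen(s, (size_t)(MAX_STR_LEN + 1)) > MAX_STR_LEN)
        return -1;
    return 0;
}
```

A caller would compute the limit once, for example
max_elements_for_ram(ram_size, sizeof(int)), and pass it as max_count on
every call to sort_ints_checked().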
2. The function body should also be secure. After writing code, you should
review your code for security issues and also get it peer reviewed for
security issues. In general, you should always get your code peer reviewed
for security issues, bugs, company coding guidelines, etc. Some common
security issues are: infinite loops that were not intended to be infinite
loops, NULL pointer dereference, accessing memory that is not part of your
program, copying more data into a buffer than the actual size of the buffer
(aka buffer overflow), stack overflow, out-of-bounds access (for arrays),
using memory/pointer after freeing it, etc. If you are programming in C, then
you shouldn't use unbounded C functions like strcpy(), strlen(), strcat(),
gets(), etc. Instead, you should use bounded C functions like strncpy(),
strnlen(), strncat(), fgets(), etc. Note that strncpy() does not
null-terminate the destination when the source is as long as or longer than
the given size, so you have to terminate the destination yourself.
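As an illustration of point 2, here is a minimal sketch of a bounded string
copy in C. The function name copy_string_bounded() is hypothetical, and this
is just one possible way to do it. It uses strnlen() (POSIX) and memcpy()
instead of strncpy(), because strncpy() leaves the destination unterminated
when the source does not fit.

```c
#define _POSIX_C_SOURCE 200809L  /* for strnlen() */
#include <string.h>

/* Copies src into dst, where dst can hold dst_size bytes including the
 * terminating '\0'. Returns 0 on success. Returns -1 if an argument is
 * invalid or if src does not fit; in the "does not fit" case dst is set
 * to the empty string so that it is always terminated. */
int copy_string_bounded(char *dst, long dst_size, const char *src)
{
    long len;

    if (dst == NULL || src == NULL || dst_size <= 0)
        return -1;

    /* strnlen() reads at most dst_size bytes of src. */
    len = (long)strnlen(src, (size_t)dst_size);
    if (len >= dst_size) {
        dst[0] = '\0';
        return -1;
    }

    memcpy(dst, src, (size_t)len);
    dst[len] = '\0';
    return 0;
}
```

Rejecting an input that does not fit, instead of silently truncating it, is a
design choice of this sketch; truncation may be acceptable in other programs.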
3. Whenever the code is tested (unit testing, integration testing, system
testing, etc.), please make sure that you and your testing team also do
security testing. For example, if a function takes a string as an input, then
you/your testing team can test the following: give NULL as input, give an
unterminated string as input, give an input having a length of 10 MB or
100 MB, etc.
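The security tests listed in point 3 can be sketched as a small C test
routine. Everything here is hypothetical and only for illustration: the
function under test, name_is_valid(), the limit USER_NAME_MAX_LEN, and the
test routine run_security_tests(). The unterminated input is a buffer of
exactly USER_NAME_MAX_LEN + 1 bytes with no '\0', so the bounded strnlen()
call never reads past it.

```c
#define _POSIX_C_SOURCE 200809L  /* for strnlen() */
#include <stdlib.h>
#include <string.h>

#define USER_NAME_MAX_LEN 64L    /* assumed limit */

/* Function under test: returns 1 if name is non-NULL and its length is
 * between 1 and USER_NAME_MAX_LEN, and 0 otherwise. */
int name_is_valid(const char *name)
{
    long len;

    if (name == NULL)
        return 0;
    len = (long)strnlen(name, (size_t)(USER_NAME_MAX_LEN + 1));
    return len >= 1 && len <= USER_NAME_MAX_LEN;
}

/* Returns the number of failed checks, or -1 if the test itself could not
 * allocate memory. */
int run_security_tests(void)
{
    int failures = 0;
    long huge_len = 10L * 1024 * 1024;            /* 10 MB input */
    char unterminated[USER_NAME_MAX_LEN + 1];
    char *huge;

    if (name_is_valid(NULL))                      /* NULL input */
        failures++;
    if (name_is_valid(""))                        /* empty input */
        failures++;

    memset(unterminated, 'A', sizeof unterminated);
    if (name_is_valid(unterminated))              /* no '\0' within limit */
        failures++;

    huge = malloc((size_t)huge_len + 1);
    if (huge == NULL)
        return -1;
    memset(huge, 'B', (size_t)huge_len);
    huge[huge_len] = '\0';
    if (name_is_valid(huge))                      /* very long input */
        failures++;
    free(huge);

    if (!name_is_valid("alice"))                  /* a normal input */
        failures++;

    return failures;
}
```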
4. Always check the return values of the functions that you are calling before
proceeding. Don't assume that all functions will always succeed. It is
quite possible that a function always succeeds in internal testing but fails
when customers start using your software. If the function that you called
returned an error and you didn't check it and proceeded anyway, then wrong
things can happen, and these wrong things can open a security hole in your
software, and your software may get hacked.
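To illustrate point 4, here is a minimal sketch in C where the return value of
every library call is checked before the code proceeds. The function name
read_first_line(), the error codes, and the limit MAX_LINE_BUF are
hypothetical.

```c
#include <stdio.h>
#include <string.h>

#define MAX_LINE_BUF 4096        /* assumed upper limit for buf_size */

enum {
    RFL_OK           =  0,
    RFL_BAD_ARG      = -1,
    RFL_OPEN_FAILED  = -2,
    RFL_READ_FAILED  = -3,
    RFL_CLOSE_FAILED = -4
};

/* Reads the first line of the file at path into buf (without the newline).
 * Every call that can fail is checked; nothing is assumed to succeed. */
int read_first_line(const char *path, char *buf, int buf_size)
{
    FILE *fp;

    if (path == NULL || buf == NULL || buf_size < 2 || buf_size > MAX_LINE_BUF)
        return RFL_BAD_ARG;

    fp = fopen(path, "r");
    if (fp == NULL)
        return RFL_OPEN_FAILED;

    if (fgets(buf, buf_size, fp) == NULL) {       /* empty file or error */
        fclose(fp);
        buf[0] = '\0';
        return RFL_READ_FAILED;
    }

    if (fclose(fp) != 0)
        return RFL_CLOSE_FAILED;

    buf[strcspn(buf, "\n")] = '\0';               /* strip newline, if any */
    return RFL_OK;
}
```

The caller, in turn, has to check the value returned by read_first_line()
before it uses buf.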
5. Don't use unsigned data types unless you are dealing with binary data (like
raw bits, raw bytes, network data, data from hardware devices, etc.).
An unsigned data type in itself doesn't present any security issue, but if
the unsigned data type is used wrongly, then it can lead to security issues.
For example, if you use an unsigned integer variable in a loop and the loop
is supposed to exit when the variable becomes negative, then this loop will
never exit because an unsigned integer variable never becomes negative (it
wraps around to a large positive value instead), and this infinite loop can
open some security holes in your software. Another issue with unsigned data
types is that when you convert an unsigned data type to a signed data type of
the same size, the bit values typically don't change, but the interpretation
of the bit values changes because the data types are different. So, let's say
that you have an unsigned integer variable 'u' and in 'u' the most
significant bit is set. 'u' is still positive because it is of unsigned type.
Now, if you typecast 'u' to a signed integer variable 's', then 's' will be
interpreted as being negative because the most significant bit is set. So,
"u > 0" will be true, but "s > 0" will be false. So, now, if you use 's' in
some comparison, then the results may not be as expected, and this may open a
security hole in your software.
Some C/C++ functions (malloc(), memset(), operator new, etc.) have an argument
of type 'size_t'. 'size_t' is an unsigned data type, and on 64-bit Linux
systems, it is usually defined as 'unsigned long', and its size is 64 bits.
Now, in your program, if you are not supporting values of more than
"2^63 - 1", then you don't need to use 'size_t'. You can use a 'long' (where
'long' is 64 bits, as on 64-bit Linux), and while calling malloc(), memset(),
operator new, etc., you can typecast 'long' to 'size_t', and it will work as
long as the value is not negative. In case you are not supporting values of
more than "2^31 - 1", then you can use an 'int', and while calling malloc(),
memset(), operator new, etc., you can typecast 'int' to 'size_t', and it will
work. So, you don't have to use an unsigned data type if you don't have any
use for it.
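Point 5 can be sketched in C as follows. This is a minimal sketch with
hypothetical names (last_index_of(), alloc_zeroed_checked(), MAX_COUNT,
MAX_ALLOC_BYTES). It shows a countdown loop that terminates because the index
is signed, and a signed size that is range-checked before it is typecast to
'size_t'.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_COUNT       1000000             /* assumed limit on elements */
#define MAX_ALLOC_BYTES (64L * 1024 * 1024) /* assumed limit: 64 MB */

/* Returns the index of the last occurrence of value in arr, or -1 if it is
 * not found or if an argument is invalid. The index i is a signed int, so
 * "i >= 0" becomes false after i reaches -1 and the loop exits. With an
 * unsigned index, "i >= 0" would always be true. */
int last_index_of(const int *arr, int count, int value)
{
    int i;

    if (arr == NULL || count < 1 || count > MAX_COUNT)
        return -1;
    for (i = count - 1; i >= 0; i--) {
        if (arr[i] == value)
            return i;
    }
    return -1;
}

/* Allocates num_bytes bytes set to zero. The size is kept in a signed
 * 'long', validated, and only then typecast to 'size_t', so a negative
 * value can never turn into a huge allocation request. */
void *alloc_zeroed_checked(long num_bytes)
{
    void *p;

    if (num_bytes < 1 || num_bytes > MAX_ALLOC_BYTES)
        return NULL;
    p = malloc((size_t)num_bytes);
    if (p == NULL)
        return NULL;
    memset(p, 0, (size_t)num_bytes);
    return p;
}
```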
6. Don't typecast between data types of different lengths. This is because the
values may change if you typecast from a longer data type to a shorter data
type. For example, let's say that you have a 64-bit 'long' variable 'j' that
has a value of 8589934591 (0x1FFFFFFFF), and now if you typecast it to a
32-bit 'int' variable 'i', then on typical systems the value of 'i' will
become -1 (0xFFFFFFFF). Now, when you use 'i' in your program, unexpected
things may happen, and these may lead to security issues in your software.
When you typecast from a shorter data type to a longer data type, and if the
shorter data type is signed and its value is negative, then sign extension
will happen in the longer data type, and you may not be expecting that, and
this may open a security hole in your software.
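When a conversion between different lengths cannot be avoided, the value can
be range-checked first. The following is a minimal sketch of that idea for
point 6; the function names are hypothetical, and 'long long' is used instead
of 'long' only so that the example behaves the same on systems where 'long'
is 32 bits.

```c
#include <limits.h>
#include <stddef.h>

/* Stores v in *out and returns 0 only if v fits in an 'int'.
 * Returns -1 (and leaves *out unchanged) otherwise. */
int long_long_to_int_checked(long long v, int *out)
{
    if (out == NULL || v < INT_MIN || v > INT_MAX)
        return -1;
    *out = (int)v;
    return 0;
}

/* Shows sign extension: widening a negative 'int' keeps its value, which
 * means that all the upper bits of the wider type become 1. For i == -1
 * the result is ULLONG_MAX (all bits set). */
unsigned long long widened_bits(int i)
{
    long long wide = i;                /* sign extension happens here */
    return (unsigned long long)wide;
}
```

With this check, the value 8589934591 from the example above is rejected
instead of silently turning into -1.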
7. Don't hard-code passwords or any other credentials in your program/software.
Hard-coding these is a big security issue. Also, although not related to
coding, you should not keep the password of your device as "root" or as your
username.
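One common alternative to hard-coding a password (point 7) is to read it from
the environment at run time. The following is a minimal sketch under that
assumption; the variable name APP_DB_PASSWORD, the function name
get_db_password(), and the limit MAX_PASSWORD_LEN are hypothetical, and other
mechanisms (a protected configuration file, a secrets manager, etc.) may suit
your software better.

```c
#define _POSIX_C_SOURCE 200809L  /* for strnlen() */
#include <stdlib.h>
#include <string.h>

#define MAX_PASSWORD_LEN 128L    /* assumed limit */

/* Returns the password taken from the environment, or NULL if it is not
 * set, is empty, or is longer than MAX_PASSWORD_LEN. The returned pointer
 * belongs to the environment and must not be freed by the caller. */
const char *get_db_password(void)
{
    const char *pw = getenv("APP_DB_PASSWORD");   /* hypothetical name */
    long len;

    if (pw == NULL)
        return NULL;
    len = (long)strnlen(pw, (size_t)(MAX_PASSWORD_LEN + 1));
    if (len < 1 || len > MAX_PASSWORD_LEN)
        return NULL;
    return pw;
}
```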
8. Don't use recursive functions, unless absolutely necessary, because recursive
functions can lead to stack overflow, and stack overflow is a security issue.
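As a small illustration of point 8, a function that is often written
recursively can usually be written as a loop, so that the stack usage stays
constant. This is a minimal sketch; the function name factorial_iterative()
is hypothetical, and the upper limit of 20 is chosen because 20! is the
largest factorial that fits in a signed 64-bit integer.

```c
#define MAX_FACTORIAL_ARG 20     /* 21! does not fit in signed 64 bits */

/* Returns n! for 0 <= n <= MAX_FACTORIAL_ARG, or -1 if n is out of range.
 * The loop uses a fixed amount of stack, no matter how large n is. */
long long factorial_iterative(int n)
{
    long long result = 1;
    int i;

    if (n < 0 || n > MAX_FACTORIAL_ARG)
        return -1;
    for (i = 2; i <= n; i++)
        result *= i;
    return result;
}
```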
9. Avoid using global variables. In object-oriented languages, avoid using
public variables as much as possible. In C also, you should avoid global
variables, but in case you need to use some, then make them 'static' so that
they are not visible outside of the file.
10. Initialize all variables (global/local/static/private/protected/public) to
proper values before using them (in C/C++, uninitialized global and static
variables are automatically initialized to 0).
11. Don't expose all functions to the user. Expose only those functions that the
user will actually need; the rest of the functions should be
private/protected in object-oriented languages and static in C language.
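Points 9, 10, and 11 can be sketched together in one small C file. This is
only a minimal sketch with hypothetical names (next_id, id_in_range(),
allocate_id(), MAX_ID): the variable and the helper function are 'static', so
they are not visible outside of the file, the variable is explicitly
initialized, and only one function is exposed to the user.

```c
#define MAX_ID 1000000           /* assumed limit */

/* Visible only inside this file, and explicitly initialized. */
static int next_id = 1;

/* Helper function; 'static' keeps it out of the user's reach. */
static int id_in_range(int id)
{
    return id >= 1 && id <= MAX_ID;
}

/* The only function exposed to the user. Returns a new ID between 1 and
 * MAX_ID, or -1 when no more IDs are available. */
int allocate_id(void)
{
    int id = next_id;

    if (!id_in_range(id))
        return -1;
    next_id++;
    return id;
}
```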
In my opinion, if you follow these points, then your functions/software will be
secure. I follow these points, and my functions/software are fully secure; they
can't be hacked. You can challenge me on this.
Q: Why should everyone create secure software?
A: Everyone should create secure software because "insecure software" gets
hacked, and this leads to the loss of millions of dollars and also exposes
users' private data, such as SSNs, mobile numbers, etc. So, it is "necessary"
to create secure software these days. Secure but slightly slower software
(around 4% slower according to my experiments with glibc's qsort() function)
will save millions of dollars and also save lots of people/governments from
lots of trouble, etc. Let's say that some hospital software got hacked
because the software provider didn't validate the inputs of the functions
because the software provider didn't want 4% performance degradation. Now,
what will happen to the patients? What will happen to those patients who were
using medical machines, and those machines got hacked because 4% performance
degradation was not acceptable, even if that resulted in insecure software
(that ultimately got hacked)?