Re: [Libreoffice] RFC: Idea for fuzz-testing filters

2011-10-10 Thread Marc-André Laverdière
Hello everyone,

I have been struggling along the way and eventually put together this
small starting point... What do you think about it?

Marc-André Laverdière
Software Security Scientist
Innovation Labs, Tata Consultancy Services
Hyderabad, India

On 10/07/2011 01:38 PM, Michael Meeks wrote:
>
> On Fri, 2011-10-07 at 10:53 +0530, Marc-André Laverdière wrote:
>> I'm not thrilled with the idea of so much process creation and overhead
>> (think Valgrind) for running a somewhat short test over and over again.
>
> Certainly; the linking / bootstrapping overhead is rather substantial
> in comparison with loading a reasonably small document - at least from
> valgrind's perspective. Clearly passing a dozen documents to each
> command-line can help with that though.
>
> HTH,
>
> Michael.

/*
 * Version: MPL 1.1 / GPLv3+ / LGPLv3+
 *
 * The contents of this file are subject to the Mozilla Public License Version
 * 1.1 (the License); you may not use this file except in compliance with
 * the License or as specified alternatively below. You may obtain a copy of
 * the License at http://www.mozilla.org/MPL/
 *
 * Software distributed under the License is distributed on an AS IS basis,
 * WITHOUT WARRANTY OF ANY KIND, either express or implied. See the License
 * for the specific language governing rights and limitations under the
 * License.
 *
 * Major Contributor(s):
 * [ Copyright (C) 2011 Tata Consultancy Services, Ltd. Marc-André Laverdière
 *   marc-an...@atc.tcs.com (initial developer) ]
 *
 * All Rights Reserved.
 *
 * For minor contributions see the git repository.
 *
 * Alternatively, the contents of this file may be used under the terms of
 * either the GNU General Public License Version 3 or later (the GPLv3+), or
 * the GNU Lesser General Public License Version 3 or later (the LGPLv3+),
 * in which case the provisions of the GPLv3+ or the LGPLv3+ are applicable
 * instead of those above.
 */

#include <unistd.h>
#include <stdio.h>
#include <string>
#include <iomanip>
#include <iostream>
#include <sstream>
#include <boost/format.hpp>
#include <boost/program_options.hpp>
#include <boost/filesystem.hpp>
#include <boost/filesystem/fstream.hpp>

using namespace std;
using namespace boost::program_options;
using namespace boost::filesystem;
using namespace boost::filesystem3;
using namespace boost;

//constants
const string SEEDS_SUBFOLDER = "data";
const string RESULTS_SUBFOLDER = "results";
const string VALGRIND_COMMAND = "valgrind --tool=memcheck";

const string OPTION_HELP = "help";
const string OPTION_PROGRAM = "program";
const string OPTION_DIR = "dir";

const string TERMINATOR = "***^^^FILE_OVER@@@";

const unsigned char PATTERNS[] = {0xFF, 0xEF};

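FILE * startProcess(string command, string directory)
{
    stringstream concatenator;
    // Separate the pieces with spaces, and fold stderr into stdout:
    // memcheck reports go to stderr, which popen() does not capture.
    concatenator << VALGRIND_COMMAND << " " << command << " " << directory
                 << " 2>&1";
    return popen(concatenator.str().c_str(), "r");
}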

string readProgramOutput(FILE * pipe, string currentbuffer)
{
    char buffer[128];
    string result = "";
    // Read until the child prints the terminator or the pipe closes.
    while (fgets(buffer, sizeof(buffer), pipe) != NULL)
    {
        string sBuffer = string(buffer);
        // find() looks for the whole terminator string; find_last_of()
        // would match any single character taken from it.
        size_t terminatorLocation = sBuffer.find(TERMINATOR);
        if (terminatorLocation != sBuffer.npos)
        { //we have a match
            result += sBuffer.substr(0, terminatorLocation);
            break;
        }
        else
            result += sBuffer;
    }

    return result;
}

bool ensureDirectoryExists(const string & name)
{
    // Check the directory that was asked for, not a hard-coded one.
    path folder(name);
    if (!exists(folder))
    {
        create_directory(folder);
    }
    else if (!is_directory(folder))
    {
        cerr << "Error: " << name << " is not a directory" << endl;
        return false;
    }

    return true;
}

string createBuggerFileName(string & base, uintmax_t offset, unsigned char pattern)
{
    stringstream concatenator;
    concatenator << base << "-" << dec << offset << "-pattern-"
                 << hex << setiosflags(ios_base::showbase | ios_base::uppercase)
                 << static_cast<unsigned int>(pattern);
    return concatenator.str();
}

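string getRandomFileName()
{
    // /dev/urandom does not block when the entropy pool runs low.
    boost::filesystem3::ifstream in("/dev/urandom");
    stringstream concatenator;
    concatenator << hex;
    for (int i = 0; i < 10; i++)
        concatenator << in.get();
    return concatenator.str();
}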

int main(int argc, const char * argv[])
{
    options_description desc("Expected arguments:");
    desc.add_options()
        (OPTION_HELP.c_str(), "Shows the help message")
        (OPTION_PROGRAM.c_str(), value<vector<string> >(),
         "Program to run under Valgrind")
        (OPTION_DIR.c_str(), value<vector<string> >(),
         "Directory to contain fuzzed files to test");
    //Read the arguments
    variables_map arguments;
    store(parse_command_line(argc, argv, desc), arguments);


    if (arguments.count(OPTION_PROGRAM) && arguments.count(OPTION_DIR))
    {
        string program = arguments[OPTION_PROGRAM].as<vector<string> >()[0];
        string dir = arguments[OPTION_DIR].as<vector<string> >()[0];

Re: [Libreoffice] RFC: Idea for fuzz-testing filters

2011-10-10 Thread Michael Meeks
Hi Marc,

On Mon, 2011-10-10 at 15:08 +0530, Marc-André Laverdière wrote:
> I have been struggling along the way and eventually put together this
> small starting point... What do you think about it?

Looks fun :-) I guess it is necessary to have a separate remote-control
process if we want to catch the memcheck output, that's a shame - but
unavoidable I guess.
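
A minimal sketch of that arrangement, assuming a POSIX system: the controlling process starts the child through popen() and folds stderr into stdout, because Valgrind writes its report to stderr by default while popen() only connects stdout. The helper name captureOutput is hypothetical and does not come from the posted code.

```cpp
#include <stdio.h>
#include <stdexcept>
#include <string>

// Runs `command` through the shell and returns everything it wrote to
// stdout and stderr. (Hypothetical helper for illustration only.)
std::string captureOutput(const std::string& command)
{
    // popen() only connects the child's stdout, so redirect stderr into it.
    const std::string full = "{ " + command + "; } 2>&1";
    FILE* pipe = popen(full.c_str(), "r");
    if (pipe == NULL)
        throw std::runtime_error("popen failed");

    std::string output;
    char buffer[256];
    size_t n;
    while ((n = fread(buffer, 1, sizeof(buffer), pipe)) > 0)
        output.append(buffer, n);

    pclose(pipe);
    return output;
}
```

Called as captureOutput("valgrind --tool=memcheck ./test-program file"), this would return the program's own output interleaved with the memcheck report; the program name here is a placeholder.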

Regards,

Michael.

-- 
michael.me...@suse.com  , Pseudo Engineer, itinerant idiot

___
LibreOffice mailing list
LibreOffice@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/libreoffice


Re: [Libreoffice] RFC: Idea for fuzz-testing filters

2011-10-07 Thread Michael Meeks

On Fri, 2011-10-07 at 10:53 +0530, Marc-André Laverdière wrote:
> I'm not thrilled with the idea of so much process creation and overhead
> (think Valgrind) for running a somewhat short test over and over again.

Certainly; the linking / bootstrapping overhead is rather substantial
in comparison with loading a reasonably small document - at least from
valgrind's perspective. Clearly passing a dozen documents to each
command-line can help with that though.
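
As a rough illustration of the batching idea, the sketch below groups a list of documents into command lines of at most batchSize files each, so the Valgrind start-up cost is paid once per batch rather than once per file. The function name and the assumption that the test program accepts several files as arguments are hypothetical; file names are not shell-quoted here.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Splits `files` into groups of at most `batchSize` and builds one command
// line per group.
std::vector<std::string> batchCommands(const std::string& program,
                                       const std::vector<std::string>& files,
                                       std::size_t batchSize)
{
    assert(batchSize > 0);
    std::vector<std::string> commands;
    for (std::size_t i = 0; i < files.size(); i += batchSize)
    {
        std::string command = "valgrind --tool=memcheck " + program;
        for (std::size_t j = i; j < files.size() && j < i + batchSize; ++j)
            command += " " + files[j];
        commands.push_back(command);
    }
    return commands;
}
```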

HTH,

Michael.

-- 
michael.me...@suse.com  , Pseudo Engineer, itinerant idiot



Re: [Libreoffice] RFC: Idea for fuzz-testing filters

2011-10-06 Thread Huzaifa Sidhpurwala

On 10/05/2011 06:41 PM, Caolán McNamara wrote:

> caolanm->huzaifas: any advice ?


Nice to see the work you have been doing here!

To share some thoughts on the work that led me to the discovery
of CVE-2011-2713:


1. There is no right or wrong approach here. A good approach is one
that covers all possible code paths, or as many of them as is
feasible in this case.


2. Ideally Peachfuzz or any other intelligent fuzzer (freely available
or custom) would be the best way to find flaws. But I will
have to agree with you, the specs are too big in this case and the time
taken to translate them into a fuzzer format is formidable.


3. I was pointed at [1] by Caolan. How do you run these files through
libreoffice after generating these test cases? zzuf could actually
create the test cases, run libreoffice, destroy them and cycle through
this process as many times as you want. Saving on hard disk space? :)
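
The create, run, destroy cycle can also be sketched without zzuf. The hypothetical class below writes one test case to disk and removes it again when it goes out of scope, unless keep() was called because the case turned out to be interesting. This is only an illustration of the disk-saving idea, not how zzuf itself works.

```cpp
#include <cstdio>
#include <fstream>
#include <string>

// Writes a fuzzed test case to disk and deletes it on destruction,
// unless keep() was called.
class ScopedTestCase
{
public:
    ScopedTestCase(const std::string& path, const std::string& bytes)
        : path_(path), keep_(false)
    {
        std::ofstream out(path_.c_str(), std::ios::binary);
        out.write(bytes.data(), static_cast<std::streamsize>(bytes.size()));
    }

    ~ScopedTestCase()
    {
        if (!keep_)
            std::remove(path_.c_str());
    }

    void keep() { keep_ = true; }
    const std::string& path() const { return path_; }

private:
    ScopedTestCase(const ScopedTestCase&);            // non-copyable
    ScopedTestCase& operator=(const ScopedTestCase&);

    std::string path_;
    bool keep_;
};

// Returns true when a readable file exists at `path`.
bool fileExists(const std::string& path)
{
    return std::ifstream(path.c_str()).good();
}
```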



--
Huzaifa Sidhpurwala / Red Hat Security Response Team


Re: [Libreoffice] RFC: Idea for fuzz-testing filters

2011-10-06 Thread Marc-André Laverdière
Thanks for your feedback.

I would really really like #1, but I'm not knowledgeable enough right
now to do that. If (God willing), I'm starting that PhD soon, I might
just be able to do that blindfolded a year from now :)

For #2, I would like to have a tool generate the format handled by
reading our source code. That would give something like 90% of the spec
generated for us. Then we can fill in the blanks. I'm guessing I could
do that in 2-3 more years of PhD than #1 ;)

As for #3, I haven't had good experience with zzuf. I used the
integrated option from CERT, only to have a few results in the log file
and no fuzzed file to repeat the test with.

I also used zzuf to generate test cases for the wmf filters, put the
results in the indeterminate directory, and then exported the VALGRIND
variable and ran make -sr. Once that was done, I just had to read the
report file for valgrind errors. Not automated, but 'good enough'.
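
Reading the report could be automated along these lines. The sketch assumes the usual memcheck summary line of the form "==PID== ERROR SUMMARY: N errors from M contexts ..." and only extracts N; the function name is hypothetical.

```cpp
#include <cstdlib>
#include <string>

// Returns the error count from the last "ERROR SUMMARY:" line in a memcheck
// report, or -1 if the report has no such line (for example after a crash).
long parseErrorSummary(const std::string& report)
{
    const std::string marker = "ERROR SUMMARY: ";
    const std::string::size_type pos = report.rfind(marker);
    if (pos == std::string::npos)
        return -1;
    // strtol stops at the first non-digit, i.e. at " errors from ...".
    return std::strtol(report.c_str() + pos + marker.size(), NULL, 10);
}
```

A test case would then be kept only when the result is nonzero or -1.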

Problem is that it would saturate the disks pretty fast... so the
end-result is that I was spending more time babysitting The Monster (the
computer which was doing all those computations) than improving stuff.
Not ideal. And I'm talking about 30 GB of disk space filled up! And my
personal beef with zzuf is that the bytes are fuzzed randomly, which
means that you never know if that length field at offset 0xABCDEF was
even touched.

Running the whole of LO on a test case is going to take an enormous
amount of time. There is also the risk of having false positives for
things that would be slow to open (you have to set a time out, after
which it will kill the process).
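
A minimal POSIX sketch of such a time-out, with hypothetical names: the child is started with fork()/execvp(), polled with waitpid(WNOHANG), and killed once the deadline passes. The caller can then tell a slow load (TimedOut) apart from a crash.

```cpp
#include <cstddef>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

enum RunResult { Exited, Crashed, TimedOut, SpawnFailed };

// Runs argv[0] with the given NULL-terminated argument list and waits at
// most `timeoutSeconds` for it to finish.
RunResult runWithTimeout(char* const argv[], int timeoutSeconds)
{
    const pid_t child = fork();
    if (child < 0)
        return SpawnFailed;
    if (child == 0)
    {
        execvp(argv[0], argv);
        _exit(127); // only reached if exec failed
    }

    // Poll every 100 ms instead of blocking, so the deadline can be enforced.
    for (int waitedMs = 0; waitedMs < timeoutSeconds * 1000; waitedMs += 100)
    {
        int status = 0;
        if (waitpid(child, &status, WNOHANG) == child)
            return WIFSIGNALED(status) ? Crashed : Exited;
        usleep(100 * 1000);
    }

    // Too slow: kill the child and reap it so no zombie is left behind.
    kill(child, SIGKILL);
    waitpid(child, NULL, 0);
    return TimedOut;
}
```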

I think we could have some zzuf going, because the probabilistic thing
could help find the kind of bugs that would be otherwise too expensive
to find deterministically (100 KB file means a lot of combinations).
Maybe my suggestion could be used to generate a lot of seeds for zzuf.
It would guarantee that the specific byte was tampered with...
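
To put rough numbers on "a lot of combinations": the sketch below counts test cases for overwriting one byte, and for overwriting two distinct bytes, per case. It assumes 100 KB means 102,400 bytes and two patterns, as in the posted PATTERNS array; the function names are hypothetical.

```cpp
#include <cstdint>

// Number of test cases when exactly one byte is overwritten per case.
std::uint64_t singleByteCases(std::uint64_t fileSize, std::uint64_t patterns)
{
    return fileSize * patterns;
}

// Number of test cases when two distinct offsets are overwritten per case:
// C(fileSize, 2) offset pairs, each with patterns * patterns fillings.
std::uint64_t twoByteCases(std::uint64_t fileSize, std::uint64_t patterns)
{
    if (fileSize < 2)
        return 0;
    return fileSize * (fileSize - 1) / 2 * patterns * patterns;
}
```

With these assumptions a single-byte sweep is 204,800 cases, while the two-byte sweep is already about 21 billion, which is where a probabilistic tool becomes the cheaper option.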

I'm not thrilled with the idea of so much process creation and overhead
(think Valgrind) for running a somewhat short test over and over again.

Marc-André Laverdière
Software Security Scientist
Innovation Labs, Tata Consultancy Services
Hyderabad, India

On 10/06/2011 12:01 PM, Huzaifa Sidhpurwala wrote:
> On 10/05/2011 06:41 PM, Caolán McNamara wrote:
>> caolanm->huzaifas: any advice ?
>
> Nice to see the work you have been doing here!
>
> To share some thoughts on the work that led me to the discovery
> of CVE-2011-2713:
>
> 1. There is no right or wrong approach here. A good approach is one
> that covers all possible code paths, or as many of them as is
> feasible in this case.
>
> 2. Ideally Peachfuzz or any other intelligent fuzzer (freely available
> or custom) would be the best way to find flaws. But I will
> have to agree with you, the specs are too big in this case and the time
> taken to translate them into a fuzzer format is formidable.
>
> 3. I was pointed at [1] by Caolan. How do you run these files through
> libreoffice after generating these test cases? zzuf could actually
> create the test cases, run libreoffice, destroy them and cycle through
> this process as many times as you want. Saving on hard disk space? :)



[Libreoffice] RFC: Idea for fuzz-testing filters

2011-10-05 Thread Marc-André Laverdière
Hi everyone,

Before I start writing code, I wanted to get the input of more
experienced developers.

Why bother about this? Why not use what's available out there? Well...
 - Fuzzgrind isn't well documented and won't work out of the box.
 - zzuf has too many bells and whistles, and won't guarantee that every
   byte has been messed with. I used it to generate a lot of cases, and
   it fills a disk quickly enough.
 - Peachfuzz and others rely on a specification: well, we have file
   formats whose specifications run to hundreds of pages.

Here is the idea:
One process is the fuzzer process; it does the following (pseudocode):

  spawn valgrind test-program
  for (i = 0; i < file.length; i++)
    fuzzed = memcpy(file)
    fuzzed[i] = 0xFF (or whatever)
    write(temp-dir/random-name, fuzzed)
    read output from the spawned process until the marker is read
    if valgrind output is more than the expected valgrind start/end markers
      then copy valgrind output to results directory
      then copy fuzzed to results directory
    if spawned program crashed then restart it
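
The mutation step of that loop can be sketched in C++ as follows. The names are hypothetical, and for brevity the sketch returns all cases at once, whereas the loop above would write and test one case at a time. Mutations that would leave the seed unchanged are skipped.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Returns a copy of `seed` with the byte at `offset` replaced by `pattern`.
std::string mutateAt(const std::string& seed, std::size_t offset,
                     unsigned char pattern)
{
    assert(offset < seed.size());
    std::string fuzzed(seed);
    fuzzed[offset] = static_cast<char>(pattern);
    return fuzzed;
}

// Generates every single-byte mutation, skipping ones identical to the seed.
std::vector<std::string> allMutations(const std::string& seed,
                                      const std::vector<unsigned char>& patterns)
{
    std::vector<std::string> cases;
    for (std::size_t i = 0; i < seed.size(); ++i)
        for (std::size_t p = 0; p < patterns.size(); ++p)
            if (static_cast<unsigned char>(seed[i]) != patterns[p])
                cases.push_back(mutateAt(seed, i, patterns[p]));
    return cases;
}
```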

The other process would do as follows:
  while(forever)
    check if a new file is in temp-dir
    if the file name is terminate-yourself, then exit
    try to load the file with the filter
    output a marker like  Done trying to load -

With this design, we avoid a lot of process creation overhead.
We can probably generalize it enough that we can put pretty much any
filter in there.
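
A sketch of the long-lived loader process under one simplifying assumption: instead of watching temp-dir for new files, it reads one path per line from an input stream. The marker text is borrowed from the TERMINATOR constant in the posted code; LoadFunction stands in for whatever filter is under test, and all names are hypothetical.

```cpp
#include <iostream>
#include <sstream>
#include <string>

const std::string TERMINATE_NAME = "terminate-yourself";
const std::string DONE_MARKER = "***^^^FILE_OVER@@@";

// Stand-in for the real import filter; replace with the code under test.
typedef bool (*LoadFunction)(const std::string& path);

// Reads one path per line from `requests`, runs `load` on each, and prints
// the marker after every attempt. Returns the number of files processed.
int serveRequests(std::istream& requests, std::ostream& out, LoadFunction load)
{
    int processed = 0;
    std::string path;
    while (std::getline(requests, path))
    {
        if (path == TERMINATE_NAME)
            break;
        load(path);
        ++processed;
        // std::endl flushes, so the fuzzer sees the marker immediately.
        out << DONE_MARKER << std::endl;
    }
    return processed;
}
```

Because the filter is linked into this one process, Valgrind's start-up cost is paid once rather than once per test case.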

What do you think of this idea? What improvements can we add?

Regards,

-- 
Marc-André Laverdière
Software Security Scientist
Innovation Labs, Tata Consultancy Services
Hyderabad, India


Re: [Libreoffice] RFC: Idea for fuzz-testing filters

2011-10-05 Thread Caolán McNamara
On Wed, 2011-10-05 at 18:25 +0530, Marc-André Laverdière wrote:
> Hi everyone,
>
> Before I start writing code, I wanted to get the input of more
> experienced developers.
>
> Why bother about this? Why not use what's available out there? Well...
>  - Fuzzgrind isn't well documented and won't work out of the box.
>  - zzuf has too many bells and whistles, and won't guarantee that every
>    byte has been messed with. I used it to generate a lot of cases, and
>    it fills a disk quickly enough.
>  - Peachfuzz and others rely on a specification: well, we have file
>    formats whose specifications run to hundreds of pages.
>
> Here is the idea:
> One process is the fuzzer process; it does the following (pseudocode):
>
>   spawn valgrind test-program
>   for (i = 0; i < file.length; i++)
>     fuzzed = memcpy(file)
>     fuzzed[i] = 0xFF (or whatever)
>     write(temp-dir/random-name, fuzzed)
>     read output from the spawned process until the marker is read
>     if valgrind output is more than the expected valgrind start/end markers
>       then copy valgrind output to results directory
>       then copy fuzzed to results directory
>     if spawned program crashed then restart it
>
> The other process would do as follows:
>   while(forever)
>     check if a new file is in temp-dir
>     if the file name is terminate-yourself, then exit
>     try to load the file with the filter
>     output a marker like  Done trying to load -
>
> With this design, we avoid a lot of process creation overhead.
> We can probably generalize it enough that we can put pretty much any
> filter in there.
>
> What do you think of this idea? What improvements can we add?

caolanm->huzaifas: any advice ?

C.
