Re: [xz-devel] Adding fuzz testing support to liblzma

2018-11-02 Thread Bhargava Shastry
Dear Lasse,

On 11/2/18 9:20 PM, Lasse Collin wrote:
> On 2018-10-31 Bhargava Shastry wrote:
>> On 10/30/18 6:26 PM, Lasse Collin wrote:
>>> On 2018-10-30 Bhargava Shastry wrote:  
>>>> - oss-fuzz requires a Google linked email address of the
>>>> maintainer. Could you please provide me one?
>>>
>>> No, I'm sorry. This is the email address to use to contact me, and I
>>> don't plan to link this address to a Google account.  
>>
>> No need to apologize :)
>>
>> I didn't mean to be presumptuous.
> 
> I didn't mean to imply anything like that. Sorry if there was a
> misunderstanding. I understand OSS-Fuzz is a Google project so it makes
> sense to use Google accounts for logins. (So far I only have a Google
> account to access Play Store on Android; I don't use it otherwise.)

No worries, all is good :)

>> The thing is that oss-fuzz creates
>> bug reports on Google infrastructure and hence the requirement.
>>
>> I will ask oss-fuzz folks if there is an alternative to Google-linked
>> account for viewing bug reports.
> 
> Thanks. I saw your discussion here:
> 
> https://github.com/google/oss-fuzz/issues/1915
> 
> Just seeing bug reports (or getting some kind of notice that something
> has been found) goes a long way. :-)

Okay :)

>> After running version 2 overnight (with the corpus generated from
>> version 1), I see that v2 covers 1007 CFG edges (1% better coverage).
>>
>> I agree that version 1 is better :)
> 
> Hmm OK, thanks. I was thinking if v2 with bigger buffers is worth
> considering still but I don't want to think more, so let's go with v1.

Thank you. Moving forward, we could create more fuzz targets for
different buffer sizes, or have a single target but make the buffer size
conditional on, say, the first byte of fuzzed input.

We could pick up on this thread once the initial integration with
oss-fuzz has been accepted.
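
To make the single-target idea concrete, here is a minimal sketch of how the first byte of the fuzzed input could select the buffer size. It is only an illustration: struct fuzz_plan and plan_from_input() are hypothetical names, not part of xz or the committed fuzz.c, and the mapping of the byte to the range 1..256 is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of xz: the result of splitting a fuzz
 * input into a one-byte "selector" and the payload to decompress. */
struct fuzz_plan {
    size_t chunk_size;        /* bytes handed to the decoder per call */
    const uint8_t *payload;   /* fuzzed data after the selector byte */
    size_t payload_size;
};

/* Map the first input byte to a chunk size in the range 1..256
 * (an assumed mapping). An empty input yields an empty payload and
 * a chunk size of 1. */
static struct fuzz_plan plan_from_input(const uint8_t *data, size_t size)
{
    struct fuzz_plan plan = { 1, data, 0 };

    assert(data != NULL || size == 0);
    if (size == 0)
        return plan;

    plan.chunk_size = (size_t)data[0] + 1;
    plan.payload = data + 1;
    plan.payload_size = size - 1;
    return plan;
}
```

One consequence of this design is that the existing .xz seed files would need a selector byte prepended. Otherwise the first byte of the .xz magic (0xFD) would be consumed as the selector and the remaining payload would no longer be a valid stream.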

> I committed these four files:
> 
> tests/ossfuzz/Makefile
> tests/ossfuzz/config/fuzz.dict
> tests/ossfuzz/config/fuzz.options
> tests/ossfuzz/fuzz.c
> 
> I hope they are OK.
> 
> Is this all that I have to do for now? Other people will take care of
> the rest (Dockerfile and such that were in pdknsk's commit), right?

Right, I have sent a PR to this effect.

https://github.com/google/oss-fuzz/pull/1919

Once this is merged, xz will be continuously fuzzed.

Thank you once again for your feedback and help on this front :)
I wish more software creators/maintainers showed similar interest in fuzzing!

Regards,
Bhargava

-- 
Bhargava Shastry 
Security in Telecommunications
TU Berlin / Telekom Innovation Laboratories
Ernst-Reuter-Platz 7, Sekr TEL 17 / D - 10587 Berlin, Germany
phone: +49 30 8353 58235
Keybase: https://keybase.io/bshastry



Re: [xz-devel] Adding fuzz testing support to liblzma

2018-10-30 Thread Lasse Collin
On 2018-10-30 Bhargava Shastry wrote:
> - oss-fuzz requires a Google linked email address of the maintainer.
> Could you please provide me one?

No, I'm sorry. This is the email address to use to contact me, and I
don't plan to link this address to a Google account.

> - It is better that the test harness and related config (dictionary,
> other fuzzer options) reside in the xz source repo. Are you okay
> maintaining these in the long run?

Including the files in the xz repo is fine. I can maintain them in the
sense that fuzz.c compiles and I can merge fuzzing-related patches that
get sent to me. I hope this is enough.

> As starting point, I used all files with the "xz" extension that I
> could find in the source repo (total of 63 files).

I guess it's a good starting point.

Most of them are under a hundred bytes and only one is over a thousand
bytes (good-1-delta-lzma2.tiff.xz is 51,316 bytes). The bad files are
based on certain good files but each bad file has something broken in
it, so perhaps the bad files aren't so great for fuzzing (if the damage
is at the beginning, the decoder might stop there and fuzzing bits past
that point is pointless).

> I also did the following experiment
> 
> - I ran version 1 overnight (over 16 hours in total)
> - The coverage saturated at about 996 CFG edges
> 
> Then, I took the corpus that was generated for v1 fuzzing and fed it
> to v2. My hope is that this will quickly tell me how much better
> (coverage-wise) v2 would be if it were run for as long as v1:
> 
> - I found v2 covers 1004 CFG edges i.e., only 8 CFG edges more than v1
> 
> However, to be sure I need to keep v2 running for as long as v1, but
> my guess is that this saturation will prevail.

The test method sounds good. :-) Only eight more edges sounds low since
there are more than eight places where the code can run out of input or
output and has to stop. Perhaps it needs better input files to hit more
of such situations. Or, like I said in the previous email, maybe the
small input/output buffers aren't as valuable for fuzzing as I thought
and we should just use the simple fast version.

-- 
Lasse Collin  |  IRC: Larhzu @ IRCnet & Freenode



Re: [xz-devel] Adding fuzz testing support to liblzma

2018-10-30 Thread Bhargava Shastry
Dear Lasse,

Thanks for your feedback. My reply is inline. Also, this is a good time
to discuss oss-fuzz integration as we put the finishing touches on the
test harness :)

I have a few questions for you:

- oss-fuzz requires a Google linked email address of the maintainer.
Could you please provide me one?

- It is better that the test harness and related config (dictionary,
other fuzzer options) reside in the xz source repo. Are you okay
maintaining these in the long run?

Thank you :)

On 10/29/18 9:27 PM, Lasse Collin wrote:
> On 2018-10-29 Bhargava Shastry wrote:
>> Thanks for providing two versions for me to test. Here are the
>> results:
>>
>> - version 1 decompresses the whole of fuzzed (compressed) data
>> - version 2 decompresses in chunks of size (input=13 bytes)
>>
>> ### Executions per second
>>
>> I ran both versions a total of 96 times (I have 16 cores :-))
>>
>> - version 1 averaged 1757.20 executions per second
>> - version 2 averaged 429.10 executions per second
>>
>> So, clearly version 1 is faster
> 
> Yes, and the difference is bigger than I hoped.
> 
>> Regarding coverage
>>
>> - version 1 covered 950.26 CFG edges on average
>> - version 2 covered 941.11 CFG edges on average
> 
> I assume you had the latest xz.git that supports
> FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION.

That's correct.

> Did you run the same number of fuzzing rounds on both (so the second
> version took over four times longer) or did you run them for the same
> time (so the second version ran only 1/4 of rounds)?

It's the latter: I ran for a fixed time duration of 2 minutes. In this time,

- version 1 was fuzzed 212677 times on average i.e., the test was fuzzed
with that many distinct inputs
- version 2 was fuzzed 51986 times on average

So, like you say, v2 ran roughly 1/4 as many rounds as v1.
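
As a sanity check on these figures, the arithmetic can be sketched as below. The helper names are hypothetical; the inputs are the averages quoted above and the 2-minute (120-second) run time.

```c
#include <assert.h>

/* Average executions per second for a run lasting `seconds` seconds. */
static double execs_per_second(double executions, double seconds)
{
    assert(seconds > 0.0);
    return executions / seconds;
}

/* How many times more inputs run A processed than run B. */
static double speed_ratio(double execs_a, double execs_b)
{
    assert(execs_b > 0.0);
    return execs_a / execs_b;
}
```

With 212677 and 51986 executions in 120 seconds this gives about 1772 and 433 executions per second, close to the 1757.20 and 429.10 reported earlier, and a ratio of about 4.09 in favor of version 1.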

> If both versions saw the same number of rounds, I would expect the
> second version to have the same or better coverage. But if the
> comparison was based on time, then it's no surprise if the first
> version has better apparent coverage even if it is impossible for it to
> hit certain code paths that are possible with the second version. It
> might also depend on which input file is used as a starting point for
> the fuzzer.

As starting point, I used all files with the "xz" extension that I could
find in the source repo (total of 63 files).

I also did the following experiment

- I ran version 1 overnight (over 16 hours in total)
- The coverage saturated at about 996 CFG edges

Then, I took the corpus that was generated for v1 fuzzing and fed it to
v2. My hope is that this will quickly tell me how much better
(coverage-wise) v2 would be if it were run for as long as v1:

- I found v2 covers 1004 CFG edges i.e., only 8 CFG edges more than v1

However, to be sure I need to keep v2 running for as long as v1, but my
guess is that this saturation will prevail.

>> Overall, version 1 is superior imho.
> 
> I don't know yet. Increasing the input and output chunk sizes is
> probably needed to make the second version faster. You could try
> some odd values between 100 and 250, or maybe even up to 500.

Okay, I can try this out once the current experiment completes.

Regards,
Bhargava



Re: [xz-devel] Adding fuzz testing support to liblzma

2018-10-29 Thread Lasse Collin
On 2018-10-29 Bhargava Shastry wrote:
> Thanks for providing two versions for me to test. Here are the
> results:
> 
> - version 1 decompresses the whole of fuzzed (compressed) data
> - version 2 decompresses in chunks of size (input=13 bytes)
> 
> ### Executions per second
> 
> I ran both versions a total of 96 times (I have 16 cores :-))
> 
> - version 1 averaged 1757.20 executions per second
> - version 2 averaged 429.10 executions per second
> 
> So, clearly version 1 is faster

Yes, and the difference is bigger than I hoped.

> Regarding coverage
> 
> - version 1 covered 950.26 CFG edges on average
> - version 2 covered 941.11 CFG edges on average

I assume you had the latest xz.git that supports
FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION.

Did you run the same number of fuzzing rounds on both (so the second
version took over four times longer) or did you run them for the same
time (so the second version ran only 1/4 of rounds)?

If both versions saw the same number of rounds, I would expect the
second version to have the same or better coverage. But if the
comparison was based on time, then it's no surprise if the first
version has better apparent coverage even if it is impossible for it to
hit certain code paths that are possible with the second version. It
might also depend on which input file is used as a starting point for
the fuzzer.

> Overall, version 1 is superior imho.

I don't know yet. Increasing the input and output chunk sizes is
probably needed to make the second version faster. You could try
some odd values between 100 and 250, or maybe even up to 500.

On the other hand, it's possible that I'm putting too much weight on the
importance of fuzzing the stop & continue code paths.
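
For reference, the stop & continue pattern discussed here can be sketched without any liblzma dependency as a loop that hands the input to a consumer in fixed-size pieces. This is a sketch only: chunk_fn, feed_in_chunks() and count_bytes() are hypothetical names, and in a real harness the callback would presumably wrap a call to lzma_code() rather than just count bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback type: consumes one chunk and returns 0 to
 * continue or nonzero to stop (for example, on a decoder error). */
typedef int (*chunk_fn)(const uint8_t *chunk, size_t len, void *ctx);

/* Feed `size` bytes to `consume` in pieces of at most `chunk_size`
 * bytes. Returns the number of calls made to `consume`. */
static size_t feed_in_chunks(const uint8_t *data, size_t size,
                             size_t chunk_size, chunk_fn consume, void *ctx)
{
    size_t calls = 0;
    size_t pos = 0;

    assert(chunk_size > 0);
    while (pos < size) {
        size_t len = size - pos;
        if (len > chunk_size)
            len = chunk_size;

        ++calls;
        if (consume(data + pos, len, ctx) != 0)
            break;          /* consumer asked to stop early */
        pos += len;
    }
    return calls;
}

/* Example consumer that only adds up the number of bytes it has seen. */
static int count_bytes(const uint8_t *chunk, size_t len, void *ctx)
{
    (void)chunk;
    *(size_t *)ctx += len;
    return 0;
}
```

With 13-byte chunks a 100-byte input takes 8 calls instead of 1, which gives a rough feel for why small chunks cost throughput and why larger odd sizes are worth trying.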

Thanks again!

-- 
Lasse Collin  |  IRC: Larhzu @ IRCnet & Freenode