Re: [PATCH v4 1/7] fuzz: accelerate non-crash detection

2021-01-10 Thread Qiuhao Li
On Sun, 2021-01-10 at 11:00 -0500, Alexander Bulekov wrote:
> On 210110 2110, Qiuhao Li wrote:
> > On Wed, 2021-01-06 at 23:18 -0500, Alexander Bulekov wrote:
> > > On 201229 1240, Qiuhao Li wrote:
> > > > We spend much time waiting for the timeout program during the
> > > > minimization
> > > > process until it passes a time limit. This patch hacks the
> > > > CLOSED
> > > > (indicates
> > > > the redirection file closed) notification in QTest's output if
> > > > it
> > > > doesn't
> > > > crash.
> > > > 
> > > > Test with quadrupled trace input at:
> > > >   https://bugs.launchpad.net/qemu/+bug/1890333/comments/1
> > > > 
> > > > Original version:
> > > >   real  1m37.246s
> > > >   user  0m13.069s
> > > >   sys   0m8.399s
> > > > 
> > > > Refined version:
> > > >   real  0m45.904s
> > > >   user  0m16.874s
> > > >   sys   0m10.042s
> > > > 
> > > > Signed-off-by: Qiuhao Li 
> > > > ---
> > > >  scripts/oss-fuzz/minimize_qtest_trace.py | 41
> > > > --
> > > > --
> > > >  1 file changed, 28 insertions(+), 13 deletions(-)
> > > > 
> > > > diff --git a/scripts/oss-fuzz/minimize_qtest_trace.py
> > > > b/scripts/oss-fuzz/minimize_qtest_trace.py
> > > > index 5e405a0d5f..aa69c7963e 100755
> > > > --- a/scripts/oss-fuzz/minimize_qtest_trace.py
> > > > +++ b/scripts/oss-fuzz/minimize_qtest_trace.py
> > > > @@ -29,30 +29,46 @@ whether the crash occred. Optionally,
> > > > manually
> > > > set a string that idenitifes the
> > > >  crash by setting CRASH_TOKEN=
> > > >  """.format((sys.argv[0])))
> > > >  
> > > > +deduplication_note = """\n\
> > > > +Note: While trimming the input, sometimes the mutated trace
> > > > triggers a different
> > > > +crash output but indicates the same bug. Under this situation,
> > > > our
> > > > minimizer is
> > > > +incapable of recognizing and stopped from removing it. In the
> > > > future, we may
> > > > +use a more sophisticated crash case deduplication method.
> > > > +\n"""
> > > > +
> > > >  def check_if_trace_crashes(trace, path):
> > > > -global CRASH_TOKEN
> > > >  with open(path, "w") as tracefile:
> > > >  tracefile.write("".join(trace))
> > > >  
> > > > -rc = subprocess.Popen("timeout -s 9 {timeout}s {qemu_path}
> > > > {qemu_args} 2>&1\
> > > > +proc = subprocess.Popen("timeout {timeout}s {qemu_path}
> > > > {qemu_args} 2>&1\
> > > 
> > > Why remove the -s 9 here? I ran into a case where the minimizer
> > > got
> > > stuck on one iteration. Adding back "sigkill" to the timeout can
> > > be a
> > > safety net to catch those bad cases.
> > > -Alex
> > 
> > Hi Alex,
> > 
> > After reviewed this patch again, I think this get-stuck bug may be
> > caused by code:
> > 
> > -return CRASH_TOKEN in output
> 
> Hi,
> Thanks for fixing this. Strangely, I was able to fix it by swapping
> the b'' for a ' ' when I was stuck on a testcase a few days ago.
> vvv 
> > +for line in iter(rc.stdout.readline, b''):
> > +if "CLOSED" in line:
> > +return False
> > +if CRASH_TOKEN in line:
> > +return True
> > 
> 
> I think your proposed change essentially does the same?
> -Alex

Hi Alex,

It looks like I misused the bytes type. Instead of b'', '' (the str
type) should be used here:

-for line in iter(rc.stdout.readline, b''):
+for line in iter(rc.stdout.readline, ''):
And you are right, if we use iter() with sentinel parameter '', it's
does the same as:

+if line == "":
+return False

But if we just fix the get-stuck bug here, we may fail
the assert(check_if_trace_crashes(newtrace, outpath)) check after
remove_lines() or clear_bits() since the same trace input may trigger a
different output between runs.

My solution is instead of using the whole second-to-last line as token,
we only use the the first three words which indicate the type of crash:

-CRASH_TOKEN = output.splitlines()[-2]
+CRASH_TOKEN = " ".join(outs.splitlines()[-2].split()[0:3])

Example: "SUMMARY: AddressSanitizer: stack-overflow"

And thus, we may a get a more trimmed input trace.

> 
> > I assumed there are only two end cases in lines of stdout, but
> > while we
> > are trimming the trace input, the crash output (second-to-last
> > line)
> > may changes, in which case we will go through the output and fail
> > to
> > find "CLOSED" and CRASH_TOKEN, thus get stuck in the loop above.
> > 
> > To fix this bug and get a more trimmed input trace, we can:
> > 
> > Use the first three words of the second-to-last line instead of the
> > whole string, which indicate the type of crash as the token.
> > 
> > -CRASH_TOKEN = output.splitlines()[-2]
> > +CRASH_TOKEN = " ".join(outs.splitlines()[-2].split()[0:3])
> > 
> > If we reach the end of a subprocess' output, return False.
> > 
> > +if line == "":
> > +return False
> > 
> > I fix it in [PATCH v7 1/7] and give an example. Could you review
> > again?
> > Thanks :-)
> > 

Re: [PATCH v4 1/7] fuzz: accelerate non-crash detection

2021-01-10 Thread Alexander Bulekov
On 210110 2110, Qiuhao Li wrote:
> On Wed, 2021-01-06 at 23:18 -0500, Alexander Bulekov wrote:
> > On 201229 1240, Qiuhao Li wrote:
> > > We spend much time waiting for the timeout program during the
> > > minimization
> > > process until it passes a time limit. This patch hacks the CLOSED
> > > (indicates
> > > the redirection file closed) notification in QTest's output if it
> > > doesn't
> > > crash.
> > > 
> > > Test with quadrupled trace input at:
> > >   https://bugs.launchpad.net/qemu/+bug/1890333/comments/1
> > > 
> > > Original version:
> > >   real1m37.246s
> > >   user0m13.069s
> > >   sys 0m8.399s
> > > 
> > > Refined version:
> > >   real0m45.904s
> > >   user0m16.874s
> > >   sys 0m10.042s
> > > 
> > > Signed-off-by: Qiuhao Li 
> > > ---
> > >  scripts/oss-fuzz/minimize_qtest_trace.py | 41 --
> > > --
> > >  1 file changed, 28 insertions(+), 13 deletions(-)
> > > 
> > > diff --git a/scripts/oss-fuzz/minimize_qtest_trace.py
> > > b/scripts/oss-fuzz/minimize_qtest_trace.py
> > > index 5e405a0d5f..aa69c7963e 100755
> > > --- a/scripts/oss-fuzz/minimize_qtest_trace.py
> > > +++ b/scripts/oss-fuzz/minimize_qtest_trace.py
> > > @@ -29,30 +29,46 @@ whether the crash occred. Optionally, manually
> > > set a string that idenitifes the
> > >  crash by setting CRASH_TOKEN=
> > >  """.format((sys.argv[0])))
> > >  
> > > +deduplication_note = """\n\
> > > +Note: While trimming the input, sometimes the mutated trace
> > > triggers a different
> > > +crash output but indicates the same bug. Under this situation, our
> > > minimizer is
> > > +incapable of recognizing and stopped from removing it. In the
> > > future, we may
> > > +use a more sophisticated crash case deduplication method.
> > > +\n"""
> > > +
> > >  def check_if_trace_crashes(trace, path):
> > > -global CRASH_TOKEN
> > >  with open(path, "w") as tracefile:
> > >  tracefile.write("".join(trace))
> > >  
> > > -rc = subprocess.Popen("timeout -s 9 {timeout}s {qemu_path}
> > > {qemu_args} 2>&1\
> > > +proc = subprocess.Popen("timeout {timeout}s {qemu_path}
> > > {qemu_args} 2>&1\
> > 
> > Why remove the -s 9 here? I ran into a case where the minimizer got
> > stuck on one iteration. Adding back "sigkill" to the timeout can be a
> > safety net to catch those bad cases.
> > -Alex
> 
> Hi Alex,
> 
> After reviewed this patch again, I think this get-stuck bug may be
> caused by code:
> 
> -return CRASH_TOKEN in output

Hi,
Thanks for fixing this. Strangely, I was able to fix it by swapping
the b'' for a ' ' when I was stuck on a testcase a few days ago.
vvv 
> +for line in iter(rc.stdout.readline, b''):
> +if "CLOSED" in line:
> +return False
> +if CRASH_TOKEN in line:
> +return True
> 

I think your proposed change essentially does the same?
-Alex

> I assumed there are only two end cases in lines of stdout, but while we
> are trimming the trace input, the crash output (second-to-last line)
> may changes, in which case we will go through the output and fail to
> find "CLOSED" and CRASH_TOKEN, thus get stuck in the loop above.
> 
> To fix this bug and get a more trimmed input trace, we can:
> 
> Use the first three words of the second-to-last line instead of the
> whole string, which indicate the type of crash as the token.
> 
> -CRASH_TOKEN = output.splitlines()[-2]
> +CRASH_TOKEN = " ".join(outs.splitlines()[-2].split()[0:3])
> 
> If we reach the end of a subprocess' output, return False.
> 
> +if line == "":
> +return False
> 
> I fix it in [PATCH v7 1/7] and give an example. Could you review again?
> Thanks :-)
> 
> FYI, I mentioned this situation firstly in [PATCH 1/4], where I gave a
> more detailed example:
> 
> https://lists.gnu.org/archive/html/qemu-devel/2020-12/msg05888.html
> 
> > 
> > >  < {trace_path}".format(timeout=TIMEOUT,
> > > qemu_path=QEMU_PATH,
> > > qemu_args=QEMU_ARGS,
> > > trace_path=path),
> > >shell=True,
> > >stdin=subprocess.PIPE,
> > > -  stdout=subprocess.PIPE)
> > > -stdo = rc.communicate()[0]
> > > -output = stdo.decode('unicode_escape')
> > > -if rc.returncode == 137:# Timed Out
> > > -return False
> > > -if len(output.splitlines()) < 2:
> > > -return False
> > > -
> > > +  stdout=subprocess.PIPE,
> > > +  encoding="utf-8")
> > > +global CRASH_TOKEN
> > >  if CRASH_TOKEN is None:
> > > -CRASH_TOKEN = output.splitlines()[-2]
> > > +try:
> > > +outs, _ = proc.communicate(timeout=5)
> > > +CRASH_TOKEN = outs.splitlines()[-2]
> > > +except subprocess.TimeoutExpired:
> > > +print("subprocess.TimeoutExpired")
> > >

Re: [PATCH v4 1/7] fuzz: accelerate non-crash detection

2021-01-10 Thread Qiuhao Li
On Wed, 2021-01-06 at 23:18 -0500, Alexander Bulekov wrote:
> On 201229 1240, Qiuhao Li wrote:
> > We spend much time waiting for the timeout program during the
> > minimization
> > process until it passes a time limit. This patch hacks the CLOSED
> > (indicates
> > the redirection file closed) notification in QTest's output if it
> > doesn't
> > crash.
> > 
> > Test with quadrupled trace input at:
> >   https://bugs.launchpad.net/qemu/+bug/1890333/comments/1
> > 
> > Original version:
> >   real  1m37.246s
> >   user  0m13.069s
> >   sys   0m8.399s
> > 
> > Refined version:
> >   real  0m45.904s
> >   user  0m16.874s
> >   sys   0m10.042s
> > 
> > Signed-off-by: Qiuhao Li 
> > ---
> >  scripts/oss-fuzz/minimize_qtest_trace.py | 41 --
> > --
> >  1 file changed, 28 insertions(+), 13 deletions(-)
> > 
> > diff --git a/scripts/oss-fuzz/minimize_qtest_trace.py
> > b/scripts/oss-fuzz/minimize_qtest_trace.py
> > index 5e405a0d5f..aa69c7963e 100755
> > --- a/scripts/oss-fuzz/minimize_qtest_trace.py
> > +++ b/scripts/oss-fuzz/minimize_qtest_trace.py
> > @@ -29,30 +29,46 @@ whether the crash occred. Optionally, manually
> > set a string that idenitifes the
> >  crash by setting CRASH_TOKEN=
> >  """.format((sys.argv[0])))
> >  
> > +deduplication_note = """\n\
> > +Note: While trimming the input, sometimes the mutated trace
> > triggers a different
> > +crash output but indicates the same bug. Under this situation, our
> > minimizer is
> > +incapable of recognizing and stopped from removing it. In the
> > future, we may
> > +use a more sophisticated crash case deduplication method.
> > +\n"""
> > +
> >  def check_if_trace_crashes(trace, path):
> > -global CRASH_TOKEN
> >  with open(path, "w") as tracefile:
> >  tracefile.write("".join(trace))
> >  
> > -rc = subprocess.Popen("timeout -s 9 {timeout}s {qemu_path}
> > {qemu_args} 2>&1\
> > +proc = subprocess.Popen("timeout {timeout}s {qemu_path}
> > {qemu_args} 2>&1\
> 
> Why remove the -s 9 here? I ran into a case where the minimizer got
> stuck on one iteration. Adding back "sigkill" to the timeout can be a
> safety net to catch those bad cases.
> -Alex

Hi Alex,

After reviewed this patch again, I think this get-stuck bug may be
caused by code:

-return CRASH_TOKEN in output
+for line in iter(rc.stdout.readline, b''):
+if "CLOSED" in line:
+return False
+if CRASH_TOKEN in line:
+return True

I assumed there are only two end cases in lines of stdout, but while we
are trimming the trace input, the crash output (second-to-last line)
may changes, in which case we will go through the output and fail to
find "CLOSED" and CRASH_TOKEN, thus get stuck in the loop above.

To fix this bug and get a more trimmed input trace, we can:

Use the first three words of the second-to-last line instead of the
whole string, which indicate the type of crash as the token.

-CRASH_TOKEN = output.splitlines()[-2]
+CRASH_TOKEN = " ".join(outs.splitlines()[-2].split()[0:3])

If we reach the end of a subprocess' output, return False.

+if line == "":
+return False

I fix it in [PATCH v7 1/7] and give an example. Could you review again?
Thanks :-)

FYI, I mentioned this situation firstly in [PATCH 1/4], where I gave a
more detailed example:

https://lists.gnu.org/archive/html/qemu-devel/2020-12/msg05888.html

> 
> >  < {trace_path}".format(timeout=TIMEOUT,
> > qemu_path=QEMU_PATH,
> > qemu_args=QEMU_ARGS,
> > trace_path=path),
> >shell=True,
> >stdin=subprocess.PIPE,
> > -  stdout=subprocess.PIPE)
> > -stdo = rc.communicate()[0]
> > -output = stdo.decode('unicode_escape')
> > -if rc.returncode == 137:# Timed Out
> > -return False
> > -if len(output.splitlines()) < 2:
> > -return False
> > -
> > +  stdout=subprocess.PIPE,
> > +  encoding="utf-8")
> > +global CRASH_TOKEN
> >  if CRASH_TOKEN is None:
> > -CRASH_TOKEN = output.splitlines()[-2]
> > +try:
> > +outs, _ = proc.communicate(timeout=5)
> > +CRASH_TOKEN = outs.splitlines()[-2]
> > +except subprocess.TimeoutExpired:
> > +print("subprocess.TimeoutExpired")
> > +return False
> > +print("Identifying Crashes by this string:
> > {}".format(CRASH_TOKEN))
> > +global deduplication_note
> > +print(deduplication_note)
> > +return True
> >  
> > -return CRASH_TOKEN in output
> > +for line in iter(proc.stdout.readline, b''):
> > +if "CLOSED" in line:
> > +return False
> > +if CRASH_TOKEN in line:
> > +return True
> > +
> > +return False
> >  
> >  
> >  def minimize_trace(inpath, outpath):
> >

Re: [PATCH v4 1/7] fuzz: accelerate non-crash detection

2021-01-07 Thread Qiuhao Li
On Wed, 2021-01-06 at 23:18 -0500, Alexander Bulekov wrote:
> On 201229 1240, Qiuhao Li wrote:
> > We spend much time waiting for the timeout program during the
> > minimization
> > process until it passes a time limit. This patch hacks the CLOSED
> > (indicates
> > the redirection file closed) notification in QTest's output if it
> > doesn't
> > crash.
> > 
> > Test with quadrupled trace input at:
> >   https://bugs.launchpad.net/qemu/+bug/1890333/comments/1
> > 
> > Original version:
> >   real  1m37.246s
> >   user  0m13.069s
> >   sys   0m8.399s
> > 
> > Refined version:
> >   real  0m45.904s
> >   user  0m16.874s
> >   sys   0m10.042s
> > 
> > Signed-off-by: Qiuhao Li 
> > ---
> >  scripts/oss-fuzz/minimize_qtest_trace.py | 41 --
> > --
> >  1 file changed, 28 insertions(+), 13 deletions(-)
> > 
> > diff --git a/scripts/oss-fuzz/minimize_qtest_trace.py
> > b/scripts/oss-fuzz/minimize_qtest_trace.py
> > index 5e405a0d5f..aa69c7963e 100755
> > --- a/scripts/oss-fuzz/minimize_qtest_trace.py
> > +++ b/scripts/oss-fuzz/minimize_qtest_trace.py
> > @@ -29,30 +29,46 @@ whether the crash occred. Optionally, manually
> > set a string that idenitifes the
> >  crash by setting CRASH_TOKEN=
> >  """.format((sys.argv[0])))
> >  
> > +deduplication_note = """\n\
> > +Note: While trimming the input, sometimes the mutated trace
> > triggers a different
> > +crash output but indicates the same bug. Under this situation, our
> > minimizer is
> > +incapable of recognizing and stopped from removing it. In the
> > future, we may
> > +use a more sophisticated crash case deduplication method.
> > +\n"""
> > +
> >  def check_if_trace_crashes(trace, path):
> > -global CRASH_TOKEN
> >  with open(path, "w") as tracefile:
> >  tracefile.write("".join(trace))
> >  
> > -rc = subprocess.Popen("timeout -s 9 {timeout}s {qemu_path}
> > {qemu_args} 2>&1\
> > +proc = subprocess.Popen("timeout {timeout}s {qemu_path}
> > {qemu_args} 2>&1\
> 
> Why remove the -s 9 here? I ran into a case where the minimizer got
> stuck on one iteration. Adding back "sigkill" to the timeout can be a
> safety net to catch those bad cases.
> -Alex

Oops, I thought SIGKILL is the default signal timeout will send.
Fixed in version 5, thanks.

> 
> >  < {trace_path}".format(timeout=TIMEOUT,
> > qemu_path=QEMU_PATH,
> > qemu_args=QEMU_ARGS,
> > trace_path=path),
> >shell=True,
> >stdin=subprocess.PIPE,
> > -  stdout=subprocess.PIPE)
> > -stdo = rc.communicate()[0]
> > -output = stdo.decode('unicode_escape')
> > -if rc.returncode == 137:# Timed Out
> > -return False
> > -if len(output.splitlines()) < 2:
> > -return False
> > -
> > +  stdout=subprocess.PIPE,
> > +  encoding="utf-8")
> > +global CRASH_TOKEN
> >  if CRASH_TOKEN is None:
> > -CRASH_TOKEN = output.splitlines()[-2]
> > +try:
> > +outs, _ = proc.communicate(timeout=5)
> > +CRASH_TOKEN = outs.splitlines()[-2]
> > +except subprocess.TimeoutExpired:
> > +print("subprocess.TimeoutExpired")
> > +return False
> > +print("Identifying Crashes by this string:
> > {}".format(CRASH_TOKEN))
> > +global deduplication_note
> > +print(deduplication_note)
> > +return True
> >  
> > -return CRASH_TOKEN in output
> > +for line in iter(proc.stdout.readline, b''):
> > +if "CLOSED" in line:
> > +return False
> > +if CRASH_TOKEN in line:
> > +return True
> > +
> > +return False
> >  
> >  
> >  def minimize_trace(inpath, outpath):
> > @@ -66,7 +82,6 @@ def minimize_trace(inpath, outpath):
> >  print("Crashed in {} seconds".format(end-start))
> >  TIMEOUT = (end-start)*5
> >  print("Setting the timeout for {} seconds".format(TIMEOUT))
> > -print("Identifying Crashes by this string:
> > {}".format(CRASH_TOKEN))
> >  
> >  i = 0
> >  newtrace = trace[:]
> > -- 
> > 2.25.1
> > 




Re: [PATCH v4 1/7] fuzz: accelerate non-crash detection

2021-01-06 Thread Alexander Bulekov
On 201229 1240, Qiuhao Li wrote:
> We spend much time waiting for the timeout program during the minimization
> process until it passes a time limit. This patch hacks the CLOSED (indicates
> the redirection file closed) notification in QTest's output if it doesn't
> crash.
> 
> Test with quadrupled trace input at:
>   https://bugs.launchpad.net/qemu/+bug/1890333/comments/1
> 
> Original version:
>   real1m37.246s
>   user0m13.069s
>   sys 0m8.399s
> 
> Refined version:
>   real0m45.904s
>   user0m16.874s
>   sys 0m10.042s
> 
> Signed-off-by: Qiuhao Li 
> ---
>  scripts/oss-fuzz/minimize_qtest_trace.py | 41 
>  1 file changed, 28 insertions(+), 13 deletions(-)
> 
> diff --git a/scripts/oss-fuzz/minimize_qtest_trace.py 
> b/scripts/oss-fuzz/minimize_qtest_trace.py
> index 5e405a0d5f..aa69c7963e 100755
> --- a/scripts/oss-fuzz/minimize_qtest_trace.py
> +++ b/scripts/oss-fuzz/minimize_qtest_trace.py
> @@ -29,30 +29,46 @@ whether the crash occred. Optionally, manually set a 
> string that idenitifes the
>  crash by setting CRASH_TOKEN=
>  """.format((sys.argv[0])))
>  
> +deduplication_note = """\n\
> +Note: While trimming the input, sometimes the mutated trace triggers a 
> different
> +crash output but indicates the same bug. Under this situation, our minimizer 
> is
> +incapable of recognizing and stopped from removing it. In the future, we may
> +use a more sophisticated crash case deduplication method.
> +\n"""
> +
>  def check_if_trace_crashes(trace, path):
> -global CRASH_TOKEN
>  with open(path, "w") as tracefile:
>  tracefile.write("".join(trace))
>  
> -rc = subprocess.Popen("timeout -s 9 {timeout}s {qemu_path} {qemu_args} 
> 2>&1\
> +proc = subprocess.Popen("timeout {timeout}s {qemu_path} {qemu_args} 2>&1\

Why remove the -s 9 here? I ran into a case where the minimizer got
stuck on one iteration. Adding back "sigkill" to the timeout can be a
safety net to catch those bad cases.
-Alex

>  < {trace_path}".format(timeout=TIMEOUT,
> qemu_path=QEMU_PATH,
> qemu_args=QEMU_ARGS,
> trace_path=path),
>shell=True,
>stdin=subprocess.PIPE,
> -  stdout=subprocess.PIPE)
> -stdo = rc.communicate()[0]
> -output = stdo.decode('unicode_escape')
> -if rc.returncode == 137:# Timed Out
> -return False
> -if len(output.splitlines()) < 2:
> -return False
> -
> +  stdout=subprocess.PIPE,
> +  encoding="utf-8")
> +global CRASH_TOKEN
>  if CRASH_TOKEN is None:
> -CRASH_TOKEN = output.splitlines()[-2]
> +try:
> +outs, _ = proc.communicate(timeout=5)
> +CRASH_TOKEN = outs.splitlines()[-2]
> +except subprocess.TimeoutExpired:
> +print("subprocess.TimeoutExpired")
> +return False
> +print("Identifying Crashes by this string: {}".format(CRASH_TOKEN))
> +global deduplication_note
> +print(deduplication_note)
> +return True
>  
> -return CRASH_TOKEN in output
> +for line in iter(proc.stdout.readline, b''):
> +if "CLOSED" in line:
> +return False
> +if CRASH_TOKEN in line:
> +return True
> +
> +return False
>  
>  
>  def minimize_trace(inpath, outpath):
> @@ -66,7 +82,6 @@ def minimize_trace(inpath, outpath):
>  print("Crashed in {} seconds".format(end-start))
>  TIMEOUT = (end-start)*5
>  print("Setting the timeout for {} seconds".format(TIMEOUT))
> -print("Identifying Crashes by this string: {}".format(CRASH_TOKEN))
>  
>  i = 0
>  newtrace = trace[:]
> -- 
> 2.25.1
> 



Re: [PATCH v4 1/7] fuzz: accelerate non-crash detection

2021-01-06 Thread Alexander Bulekov
On 201229 1240, Qiuhao Li wrote:
> We spend much time waiting for the timeout program during the minimization
> process until it passes a time limit. This patch hacks the CLOSED (indicates
> the redirection file closed) notification in QTest's output if it doesn't
> crash.
> 
> Test with quadrupled trace input at:
>   https://bugs.launchpad.net/qemu/+bug/1890333/comments/1
> 
> Original version:
>   real1m37.246s
>   user0m13.069s
>   sys 0m8.399s
> 
> Refined version:
>   real0m45.904s
>   user0m16.874s
>   sys 0m10.042s
> 
> Signed-off-by: Qiuhao Li 

This makes a huge difference, thanks! Maybe there is some edge-case
where the crash happens after "CLOSED" (e.g. a timer fires after the
last command), but I haven't found such an example among existing
reproducers.

Reviewed-by: Alexander Bulekov 

> ---
>  scripts/oss-fuzz/minimize_qtest_trace.py | 41 
>  1 file changed, 28 insertions(+), 13 deletions(-)
> 
> diff --git a/scripts/oss-fuzz/minimize_qtest_trace.py 
> b/scripts/oss-fuzz/minimize_qtest_trace.py
> index 5e405a0d5f..aa69c7963e 100755
> --- a/scripts/oss-fuzz/minimize_qtest_trace.py
> +++ b/scripts/oss-fuzz/minimize_qtest_trace.py
> @@ -29,30 +29,46 @@ whether the crash occred. Optionally, manually set a 
> string that idenitifes the
>  crash by setting CRASH_TOKEN=
>  """.format((sys.argv[0])))
>  
> +deduplication_note = """\n\
> +Note: While trimming the input, sometimes the mutated trace triggers a 
> different
> +crash output but indicates the same bug. Under this situation, our minimizer 
> is
> +incapable of recognizing and stopped from removing it. In the future, we may
> +use a more sophisticated crash case deduplication method.
> +\n"""
> +
>  def check_if_trace_crashes(trace, path):
> -global CRASH_TOKEN
>  with open(path, "w") as tracefile:
>  tracefile.write("".join(trace))
>  
> -rc = subprocess.Popen("timeout -s 9 {timeout}s {qemu_path} {qemu_args} 
> 2>&1\
> +proc = subprocess.Popen("timeout {timeout}s {qemu_path} {qemu_args} 2>&1\
>  < {trace_path}".format(timeout=TIMEOUT,
> qemu_path=QEMU_PATH,
> qemu_args=QEMU_ARGS,
> trace_path=path),
>shell=True,
>stdin=subprocess.PIPE,
> -  stdout=subprocess.PIPE)
> -stdo = rc.communicate()[0]
> -output = stdo.decode('unicode_escape')
> -if rc.returncode == 137:# Timed Out
> -return False
> -if len(output.splitlines()) < 2:
> -return False
> -
> +  stdout=subprocess.PIPE,
> +  encoding="utf-8")
> +global CRASH_TOKEN
>  if CRASH_TOKEN is None:
> -CRASH_TOKEN = output.splitlines()[-2]
> +try:
> +outs, _ = proc.communicate(timeout=5)
> +CRASH_TOKEN = outs.splitlines()[-2]
> +except subprocess.TimeoutExpired:
> +print("subprocess.TimeoutExpired")
> +return False
> +print("Identifying Crashes by this string: {}".format(CRASH_TOKEN))
> +global deduplication_note
> +print(deduplication_note)
> +return True
>  
> -return CRASH_TOKEN in output
> +for line in iter(proc.stdout.readline, b''):
> +if "CLOSED" in line:
> +return False
> +if CRASH_TOKEN in line:
> +return True
> +
> +return False
>  
>  
>  def minimize_trace(inpath, outpath):
> @@ -66,7 +82,6 @@ def minimize_trace(inpath, outpath):
>  print("Crashed in {} seconds".format(end-start))
>  TIMEOUT = (end-start)*5
>  print("Setting the timeout for {} seconds".format(TIMEOUT))
> -print("Identifying Crashes by this string: {}".format(CRASH_TOKEN))
>  
>  i = 0
>  newtrace = trace[:]
> -- 
> 2.25.1
> 



[PATCH v4 1/7] fuzz: accelerate non-crash detection

2020-12-28 Thread Qiuhao Li
We spend much time waiting for the timeout program during the minimization
process until it passes a time limit. This patch hacks the CLOSED (indicates
the redirection file closed) notification in QTest's output if it doesn't
crash.

Test with quadrupled trace input at:
  https://bugs.launchpad.net/qemu/+bug/1890333/comments/1

Original version:
  real  1m37.246s
  user  0m13.069s
  sys   0m8.399s

Refined version:
  real  0m45.904s
  user  0m16.874s
  sys   0m10.042s

Signed-off-by: Qiuhao Li 
---
 scripts/oss-fuzz/minimize_qtest_trace.py | 41 
 1 file changed, 28 insertions(+), 13 deletions(-)

diff --git a/scripts/oss-fuzz/minimize_qtest_trace.py 
b/scripts/oss-fuzz/minimize_qtest_trace.py
index 5e405a0d5f..aa69c7963e 100755
--- a/scripts/oss-fuzz/minimize_qtest_trace.py
+++ b/scripts/oss-fuzz/minimize_qtest_trace.py
@@ -29,30 +29,46 @@ whether the crash occred. Optionally, manually set a string 
that idenitifes the
 crash by setting CRASH_TOKEN=
 """.format((sys.argv[0])))
 
+deduplication_note = """\n\
+Note: While trimming the input, sometimes the mutated trace triggers a 
different
+crash output but indicates the same bug. Under this situation, our minimizer is
+incapable of recognizing and stopped from removing it. In the future, we may
+use a more sophisticated crash case deduplication method.
+\n"""
+
 def check_if_trace_crashes(trace, path):
-global CRASH_TOKEN
 with open(path, "w") as tracefile:
 tracefile.write("".join(trace))
 
-rc = subprocess.Popen("timeout -s 9 {timeout}s {qemu_path} {qemu_args} 
2>&1\
+proc = subprocess.Popen("timeout {timeout}s {qemu_path} {qemu_args} 2>&1\
 < {trace_path}".format(timeout=TIMEOUT,
qemu_path=QEMU_PATH,
qemu_args=QEMU_ARGS,
trace_path=path),
   shell=True,
   stdin=subprocess.PIPE,
-  stdout=subprocess.PIPE)
-stdo = rc.communicate()[0]
-output = stdo.decode('unicode_escape')
-if rc.returncode == 137:# Timed Out
-return False
-if len(output.splitlines()) < 2:
-return False
-
+  stdout=subprocess.PIPE,
+  encoding="utf-8")
+global CRASH_TOKEN
 if CRASH_TOKEN is None:
-CRASH_TOKEN = output.splitlines()[-2]
+try:
+outs, _ = proc.communicate(timeout=5)
+CRASH_TOKEN = outs.splitlines()[-2]
+except subprocess.TimeoutExpired:
+print("subprocess.TimeoutExpired")
+return False
+print("Identifying Crashes by this string: {}".format(CRASH_TOKEN))
+global deduplication_note
+print(deduplication_note)
+return True
 
-return CRASH_TOKEN in output
+for line in iter(proc.stdout.readline, b''):
+if "CLOSED" in line:
+return False
+if CRASH_TOKEN in line:
+return True
+
+return False
 
 
 def minimize_trace(inpath, outpath):
@@ -66,7 +82,6 @@ def minimize_trace(inpath, outpath):
 print("Crashed in {} seconds".format(end-start))
 TIMEOUT = (end-start)*5
 print("Setting the timeout for {} seconds".format(TIMEOUT))
-print("Identifying Crashes by this string: {}".format(CRASH_TOKEN))
 
 i = 0
 newtrace = trace[:]
-- 
2.25.1