Github user jdanekrh commented on the issue:

    https://github.com/apache/qpid-proton/pull/95
  
    I am not a fan of storing binary data either, but I think it is not 
unreasonable given the circumstances.
    
    The corpus was generated automatically by the fuzzer. There is a greedy 
algorithm for this: every time a newly discovered input extends the total 
fuzzing coverage, it is added to the corpus.
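    
    A minimal sketch of that greedy rule, for illustration only. The 
`coverage_of` function is a toy stand-in I made up for real coverage 
instrumentation, and `mutate` and `grow_corpus` are hypothetical names, not 
the fuzzer's actual code. It assumes at least one seed is supplied.

```python
import random


def coverage_of(data: bytes) -> frozenset:
    """Toy stand-in for instrumentation: the 'branches' an input reaches."""
    branches = set()
    if data[:4] == b"AMQP":
        branches.add("amqp-header")
    if len(data) >= 8:
        branches.add("full-header")
    if any(b > 0x7F for b in data):
        branches.add("high-byte")
    return frozenset(branches)


def mutate(data: bytes, rng: random.Random) -> bytes:
    """Flip one bit, insert one byte, or delete one byte at random."""
    buf = bytearray(data)
    op = rng.choice(("flip", "insert", "delete"))
    if op == "flip" and buf:
        buf[rng.randrange(len(buf))] ^= 1 << rng.randrange(8)
    elif op == "insert":
        buf.insert(rng.randrange(len(buf) + 1), rng.randrange(256))
    elif op == "delete" and buf:
        del buf[rng.randrange(len(buf))]
    return bytes(buf)


def grow_corpus(seeds, iterations, rng):
    """Keep a candidate only if it reaches something not yet covered."""
    corpus = list(seeds)
    covered = set()
    for seed in corpus:
        covered |= coverage_of(seed)
    for _ in range(iterations):
        candidate = mutate(rng.choice(corpus), rng)
        new = coverage_of(candidate) - covered
        if new:  # the greedy step: any new coverage earns a place
            corpus.append(candidate)
            covered |= new
    return corpus, covered
```

    Because every kept file adds at least one new branch, the corpus can never 
grow by more files than there are branches left to discover.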
    
    * There are two seed files, which I captured with Wireshark; the rest of 
the corpus is autogenerated. Something would have to convert the generated 
files into a text format.
    
    * Some of the files in the corpus are not valid AMQP, so the text format 
would have to be more flexible than what you linked: it must capture arbitrary 
binary data, not only valid AMQP frames.
    
    * The only advantages of a text format I can see are ease of manual 
inspection and more meaningful git versioning. IMO that is not how the corpus 
will be used. The operation most likely to be performed on the corpus in the 
future is occasional corpus minimization (reducing the number of files needed 
to reach the current coverage).
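    
    Corpus minimization can be pictured as a greedy set cover. The sketch 
below is only an illustration under that assumption; `minimize_corpus` and the 
file names in the usage example are hypothetical, and a real fuzzer may use a 
different strategy.

```python
def minimize_corpus(coverage_by_file):
    """Pick a small subset of files that keeps the total coverage.

    coverage_by_file maps a file name to the set of features it covers.
    """
    remaining = set().union(*coverage_by_file.values())
    kept = []
    while remaining:
        # Take the file that covers the most still-uncovered features;
        # sorting the names makes ties deterministic.
        best = max(
            sorted(coverage_by_file),
            key=lambda name: len(coverage_by_file[name] & remaining),
        )
        kept.append(best)
        remaining -= coverage_by_file[best]
    return kept


example = {
    "seed-1": {1, 2},
    "seed-2": {2, 3},
    "generated-1": {1, 2, 3},
    "generated-2": {4},
}
print(minimize_corpus(example))
```

    Greedy set cover does not guarantee the smallest possible subset, but it 
never loses coverage, which is the property that matters here.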
    
    A regression might still be discovered with a file in the corpus. That is 
unlikely, because developers can easily run all corpus files through the 
fuzzer before they push changes (it takes under a second), and fixing issues 
in a fresh change tends to be easier. If it does happen, you would use the 
fuzzer to produce a minimal input that still triggers the failure, and you 
would analyze that minimal input, not the original one.
    
    In summary, I think that the corpus should remain in binary form, but 
there should be an AMQP parser to help analyze files that trigger failures.
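    
    As a rough idea of what such a helper could look like, here is a sketch 
that walks a file and prints only the AMQP 1.0 protocol headers and 8-byte 
frame headers, falling back to a hex dump as soon as the data stops making 
sense. The function name `describe` and the output wording are my own 
invention; it does not decode frame bodies and is not part of proton.

```python
import struct

PROTOCOL_IDS = {0: "AMQP", 2: "TLS", 3: "SASL"}
FRAME_TYPES = {0: "AMQP", 1: "SASL"}


def describe(data: bytes) -> list:
    """Return one human-readable line per header found in data."""
    lines = []
    pos = 0
    while len(data) - pos >= 8:
        if data[pos:pos + 4] == b"AMQP":
            # Protocol header: "AMQP", protocol id, major, minor, revision.
            proto, major, minor, rev = data[pos + 4:pos + 8]
            name = PROTOCOL_IDS.get(proto, proto)
            lines.append(
                f"offset {pos}: protocol header id={name} "
                f"version={major}.{minor}.{rev}"
            )
            pos += 8
            continue
        # Frame header: 4-byte size, 1-byte data offset (in 4-byte words),
        # 1-byte type, 2 type-specific bytes (the channel for AMQP frames).
        size, doff, ftype, channel = struct.unpack_from(">IBBH", data, pos)
        if size < 8 or doff < 2 or doff * 4 > size or pos + size > len(data):
            lines.append(
                f"offset {pos}: malformed frame header "
                f"size={size} doff={doff} type={ftype}"
            )
            break
        kind = FRAME_TYPES.get(ftype, ftype)
        lines.append(
            f"offset {pos}: frame type={kind} channel={channel} "
            f"size={size} body={size - doff * 4} bytes"
        )
        pos += size
    if pos < len(data):
        # Anything we could not interpret is shown as raw hex.
        lines.append(
            f"offset {pos}: {len(data) - pos} unparsed bytes: "
            f"{data[pos:].hex()}"
        )
    return lines
```

    Reporting malformed input instead of raising matters here, because many 
corpus files are deliberately not valid AMQP.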
    
    @astitcher btw, how did you go about fixing the two fuzzing issues? Did 
you run proton in a debugger with the reproducer inputs attached in Jira and 
step through until something unexpected occurred, or did you analyze the 
files manually to understand what is wrong with them?

