Suggestions for improvement very welcome! Thank you. Grace O’Hair-Sherman [email protected] [Extended contact information]
GSOC Proposal: Filter to Parse Plain Text strace output to Structured Formats Like JSON

Synopsis:
As it is, the output of strace is not easily machine-readable. I propose to solve this problem by providing a filter that parses strace output and converts it to a structured format. This parser will be written in Python, and its output can be either JavaScript Object Notation (JSON) or MessagePack (http://msgpack.org/).

Here is an example of how partial output of strace run on a hello-world program might be output as JSON (supposing the parser were named strace_to_structured):

Partial output:

    % strace -T ./hello
    execve("./hello", ["./hello"...], [/* 33 vars */]) = 0 <0.000071>
    brk(0) = 0x24e3000 <0.000006>

JSON:

    {
      "strace -T ./hello | strace_to_structured": [
        {
          "syscall": "execve",
          "arguments": ["./hello", "[\"./hello\"...]", "[/* 33 vars */]"],
          "return_val": 0,
          "return_val_hex": "0",
          "time_in_kernel": 0.000071
        },
        {
          "syscall": "brk",
          "arguments": [0],
          "return_val": 38678528,
          "return_val_hex": "0x24e3000",
          "time_in_kernel": 0.000006
        }
      ]
    }

Benefits to Community
Anyone who wants to consume strace output programmatically must currently write their own parser first. This filter will save them that time and effort by giving them a format that is easy to parse.

Deliverables
Preparations completed: I have built strace and reviewed the previous JSON work done in the project.
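To make the Synopsis example concrete, here is a minimal sketch of how the two sample lines could be turned into the JSON shown above. It is an illustration only, not the proposed implementation: the function names (split_args, convert_arg, parse_line, to_structured) are hypothetical, the regular expression only covers plain "name(args) = value <time>" lines, and lines that do not match (for example calls that return an error string) are simply skipped.

```python
import json
import re

# Assumption: only lines shaped like 'name(args) = value <seconds>' are handled.
LINE_RE = re.compile(
    r"^(?P<syscall>\w+)\((?P<args>.*)\)\s+=\s+"
    r"(?P<ret>0x[0-9a-fA-F]+|-?\d+)"
    r"(?:\s+<(?P<time>\d+\.\d+)>)?\s*$"
)


def split_args(text):
    """Split an argument list at top-level commas, ignoring commas
    inside quoted strings or brackets."""
    args, current = [], []
    depth, in_string, escaped = 0, False, False
    for ch in text:
        if in_string:
            current.append(ch)
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
            current.append(ch)
        elif ch in "[{(":
            depth += 1
            current.append(ch)
        elif ch in "]})":
            depth -= 1
            current.append(ch)
        elif ch == "," and depth == 0:
            args.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    tail = "".join(current).strip()
    if tail:
        args.append(tail)
    return args


def convert_arg(arg):
    """Unquote complete strings, convert integers, keep anything else raw."""
    if len(arg) >= 2 and arg[0] == '"' and arg[-1] == '"':
        return arg[1:-1]
    try:
        return int(arg, 0)
    except ValueError:
        return arg


def parse_line(line):
    """Return a dict for one strace line, or None if the line is not handled."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    value = int(match.group("ret"), 0)
    record = {
        "syscall": match.group("syscall"),
        "arguments": [convert_arg(a) for a in split_args(match.group("args"))],
        "return_val": value,
        # Assumption: zero is written as "0", as in the example above.
        "return_val_hex": hex(value) if value else "0",
    }
    if match.group("time") is not None:
        record["time_in_kernel"] = float(match.group("time"))
    return record


def to_structured(lines, command):
    """Wrap all parsed records under the command string, as in the example."""
    records = [r for r in map(parse_line, lines) if r is not None]
    return {command: records}


if __name__ == "__main__":
    sample = [
        'execve("./hello", ["./hello"...], [/* 33 vars */]) = 0 <0.000071>',
        "brk(0) = 0x24e3000 <0.000006>",
    ]
    command = "strace -T ./hello | strace_to_structured"
    print(json.dumps(to_structured(sample, command), indent=2))
```

A real filter would also need to handle error returns, unfinished and resumed calls, signals, and the other strace flags mentioned in the schedule; those are deliberately left out here.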
Deadlines:

23 May - 29 May -- Investigation & research into what useful JSON and MessagePack output would look like.
    Investigate where to put the Python program in SourceForge and how to package and distribute the program (with help from community mailing list).
    (Spring quarter classes at university)

30 May - 5 June -- Set up repository; get dummy I/O working.
    Propose JSON and MessagePack formats and get review from community mailing list.
    (Spring quarter classes)

6 June - 12 June -- Create prototype that can create JSON output for one test from strace-code/test.
    (Spring quarter final examinations at university)

13 June - 19 June -- Decide on how to validate JSON output. Perhaps use a Python program that can consume and validate JSON.

20 June - 26 June -- Create automated test using initial test program.
    Run filter with more existing strace programs, fixing problems as they appear.
    (GSOC Midterm evaluation submission period)

27 June - 3 July -- Write usage text that is emitted by filter when presented with unknown flags.
    Ensure filter exits cleanly when interrupted.

11 July - 17 July -- Document project so far. (Should this go on the project wiki?)

18 July - 24 July -- Enhance filter to output MessagePack (and ensure it works with one test from strace-code/test).

25 July - 31 July -- Run filter with MessagePack output and with more existing strace programs, fixing problems as they appear.

1 August - 7 August -- Ensure filter correctly reads strace output when it is run with flags (e.g. -T, -v) and correctly outputs corresponding MessagePack.

8 August - 14 August -- Stretch goal: write a demo program that consumes the filter output and prints a summary of average time taken by different system calls.

15 August - 23 August 19:00 UTC -- Final week: tidy code, write tests, improve documentation and submit code sample.
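As a rough idea of what the stretch-goal demo might look like, the sketch below reads filter output in the JSON shape from the Synopsis and averages time_in_kernel per system call. The function name average_time_by_syscall and the embedded sample document are hypothetical, and the code assumes each record carries the "syscall" and "time_in_kernel" keys used in that example (times as reported by strace -T, in seconds).

```python
import json
from collections import defaultdict


def average_time_by_syscall(document):
    """Return {syscall name: mean time_in_kernel} for a parsed JSON document.

    Assumes the top-level object maps a command string to a list of records,
    as in the Synopsis example. Records without a time are ignored.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for records in document.values():
        for record in records:
            if "time_in_kernel" not in record:
                continue
            totals[record["syscall"]] += record["time_in_kernel"]
            counts[record["syscall"]] += 1
    return {name: totals[name] / counts[name] for name in totals}


# Sample input copied from the Synopsis example (arguments shortened).
SAMPLE_JSON = """
{
  "strace -T ./hello | strace_to_structured": [
    {"syscall": "execve", "arguments": [], "return_val": 0,
     "return_val_hex": "0", "time_in_kernel": 0.000071},
    {"syscall": "brk", "arguments": [0], "return_val": 38678528,
     "return_val_hex": "0x24e3000", "time_in_kernel": 0.000006}
  ]
}
"""

if __name__ == "__main__":
    averages = average_time_by_syscall(json.loads(SAMPLE_JSON))
    for name, seconds in sorted(averages.items()):
        print(f"{name}: {seconds:.6f} s")
```

Because the summary only depends on the parsed structure, the same function would work unchanged on MessagePack output once it has been decoded into the equivalent dictionaries and lists.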
Related Work:
A similar project was proposed and implemented during the 2014 Google Summer of Code, the main difference being that it was supposed to be directly a part of strace. It seems that this project’s scope may have been too big and it was never integrated with strace. This proposal has a smaller scope in that it will be a separate script that does post-processing on strace output. Another difference is that this project will result in a program with options for different output formats, i.e. JSON or MessagePack. (Inspired by this post: goo.gl/2yvCTG)

Biographical Information:
I am a second-year computer science major at University of California, Santa Cruz. I have taken Computer Architecture, Algorithms and Abstract Data Types, Computer Systems and Assembly Language, Introduction to Data Structures, and Accelerated Introduction to Programming. By summer I will have taken Analysis of Algorithms as well. Almost all these classes have involved UNIX or Linux Bash and Makefiles.

I started developing using Ubuntu two years ago when I interned at Gametime United. I also used Git and wrote JSON, both manually and automatically by writing a Python script. I have experience meeting project deadlines; last summer I designed, coded, and shipped an iOS application from start to finish in less than eight weeks. (It is called Amino Ally: goo.gl/WTGgUz ) I haven’t done any open source projects yet, although I’m a member of my school’s Linux Users’ Group, so I’m really excited for this opportunity to get more involved. The relevant skills that will help me achieve this project’s goal include Bash, Makefiles, Git, Python, and JSON.

During the last 10 weeks of Google Summer of Code I will be available full time to work on my project. I have university classes during the first two weeks and final examinations during part of the third week, but I will nonetheless make sure to work at least 20 hours in each of those three weeks.
I consider this a serious full-time commitment and I will make up the 60 hours missed during the first three weeks by working 46 hours a week for the remaining 10 weeks.

_______________________________________________
Strace-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/strace-devel
