Clojurists - I'm fairly new to Clojure and didn't realize how broken I've become from using imperative languages all my life. I'm stumped as to how to parse a Varnish (www.varnish-cache.org) log file using Clojure. The main problem is that Varnish writes several log lines for a single request, and those lines are interleaved with lines from other threads. These log files can be several gigabytes in size, so a stable sort of the entire log by thread id is out of the question.
Below I've included a small example log file and an example output Clojure data structure. Let me thank everyone in advance for any hints / help they can provide on this seemingly simple problem.

*Rules of the Varnish Log File*

- The first number on each line is the thread id (not unique; it gets reused frequently).
- Each ReqStart marks the start of a request, and the last number on that line is the unique transaction id (e.g. 118591777).
- ReqEnd denotes the end of the processing of the request by the thread.
- Each line is written atomically, but many threads write at once, so the lines of one request are interspersed with lines from other requests (threads).
- These log files can be VERY large (10+ gigabytes in the case of my application), so a stable sort by thread id, or anything else that loads the entire file into memory, is out of the question.

*Example Varnish Log file*

    40 ReqEnd c 118591771 1350759605.775758028 1350759611.249602079 5.866879225 5.473801851 0.000042200
    15 ReqStart c 10.102.41.121 4187 118591777
    15 RxRequest c GET
    15 RxURL c /json/engagement
    15 RxHeader c host: www.example.com
    30 ReqStart c 10.102.41.121 3906 118591802
    15 RxHeader c Accept: application/json
    30 RxRequest c GET
    30 RxURL c /ws/boxtops/user/
    30 RxHeader c host: www.example.com
    15 ReqEnd c 118591777 1350759605.775758028 1350759611.249602079 5.866879225 5.473801851 0.000042200
    30 RxHeader c Accept: application/xml
    30 ReqEnd c 118591802 1350759611.326084614 1350759611.329720259 0.005002737 0.003598213 0.000037432
    15 ReqStart c 10.102.41.121 4187 118591808
    15 RxRequest c GET
    15 RxURL c /ws/boxtops/user/
    30 ReqStart c 10.102.41.121 3906 118591810
    15 RxHeader c host: www.example.com
    15 RxHeader c Accept: application/xml
    30 RxRequest c GET
    30 RxURL c /registration/success
    30 RxHeader c host: www.example.com
    46 TxRequest - GET
    30 RxHeader c Accept: text/html
    46 TxURL - /registration/success
    15 ReqEnd c 118591808 1350759611.442447424 1350759611.444925785 0.016906023 0.002441406 0.000036955
    30 ReqEnd c 118591810 1350759611.521781683 1350759611.525400877 0.098322868 0.003532171 0.000087023

*Desired Output*

    {
     118591802 {
       :ReqStart  ["10.102.41.121 3906 118591802"]
       :RxRequest ["GET"]
       :RxURL     ["/ws/boxtops/user/"]
       :RxHeader  ["host: www.example.com" "Accept: application/xml"]
       ;; or better yet:
       ;; :RxHeader {:host "www.example.com" :Accept "application/xml"}
       :ReqEnd    ["118591802 1350759611.326084614 1350759611.329720259 0.005002737 0.003598213 0.000037432"]
     }
     118591777 {
       :ReqStart  ["10.102.41.121 4187 118591777"]
       :RxRequest ["GET"]
       :RxURL     ["/json/engagement"]
       :RxHeader  ["host: www.example.com" "Accept: application/json"]
       :ReqEnd    ["118591777 1350759605.775758028 1350759611.249602079 5.866879225 5.473801851 0.000042200"]
     }
     118591808 {
       :ReqStart  ["10.102.41.121 4187 118591808"]
       :RxRequest ["GET"]
       :RxURL     ["/ws/boxtops/user/"]
       :RxHeader  ["host: www.example.com" "Accept: application/xml"]
       :ReqEnd    ["118591808 1350759611.442447424 1350759611.444925785 0.016906023 0.002441406 0.000036955"]
     }
     118591810 {
       :ReqStart  ["10.102.41.121 3906 118591810"]
       :RxRequest ["GET"]
       :RxURL     ["/registration/success"]
       :RxHeader  ["host: www.example.com" "Accept: text/html"]
       :ReqEnd    ["118591810 1350759611.521781683 1350759611.525400877 0.098322868 0.003532171 0.000087023"]
     }
    }
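
To make the rules of the log file concrete, here is a minimal sketch of one possible streaming approach; it is not a tested or definitive solution. It keeps only the in-flight requests in memory, keyed by thread id, and emits a request as soon as its ReqEnd line arrives. The function names (parse-line, step, requests) and the file name "varnish.log" are made up for illustration. The sketch assumes every line has the shape "thread-id tag flag payload", that each ReqStart line ends in a numeric transaction id, and that lines belonging to a thread with no open request (such as the thread 40 and 46 lines in the example log) can simply be dropped.

```clojure
(require '[clojure.java.io :as io]
         '[clojure.string :as str])

(defn parse-line
  "Splits a log line into [thread-id tag payload]; returns nil if the
  line does not have the assumed 'thread-id tag flag payload' shape."
  [line]
  (when-let [[_ tid tag _ payload]
             (re-matches #"\s*(\d+)\s+(\S+)\s+(\S)\s*(.*)" line)]
    [tid (keyword tag) (str/trim payload)]))

(defn step
  "Folds one parsed line into the map of in-flight requests (keyed by
  thread id). Returns [new-in-flight finished], where finished is nil
  or a [txid request] pair for a request that just ended."
  [in-flight [tid tag payload]]
  (case tag
    :ReqStart [(assoc in-flight tid {:ReqStart [payload]}) nil]
    :ReqEnd   (if-let [req (get in-flight tid)]
                ;; the transaction id is the last token of the ReqStart line
                (let [txid (Long/parseLong
                             (last (str/split (first (:ReqStart req)) #"\s+")))]
                  [(dissoc in-flight tid)
                   [txid (assoc req :ReqEnd [payload])]])
                ;; ReqEnd without a ReqStart: ignore it
                [in-flight nil])
    ;; any other tag: append to the open request of this thread, if any
    (if (contains? in-flight tid)
      [(update-in in-flight [tid tag] (fnil conj []) payload) nil]
      [in-flight nil])))

(defn requests
  "Lazy seq of [txid request] pairs built from a seq of log lines."
  [lines]
  (letfn [(go [in-flight lines]
            (lazy-seq
              (when-let [[line & more] (seq lines)]
                (let [parsed (parse-line line)
                      [next-state done] (if parsed
                                          (step in-flight parsed)
                                          [in-flight nil])]
                  (if done
                    (cons done (go next-state more))
                    (go next-state more))))))]
    (go {} lines)))

(comment
  ;; Hypothetical usage: consume the requests one at a time instead of
  ;; collecting them all into a single map.
  (with-open [rdr (io/reader "varnish.log")]
    (doseq [[txid req] (requests (line-seq rdr))]
      (println txid (:RxURL req)))))
```

Because the result is a lazy sequence, memory use should be bounded by the number of requests open at any one time rather than by the size of the file, provided the consumer does not hold on to the head of the sequence. Pouring the pairs into a map with (into {} ...) gives the shape shown under Desired Output, but for a 10+ gigabyte log that map itself may not fit in memory.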