Hi all, I am attempting to measure some performance metrics (such as runtime, memory usage, network communication, etc.) using an external bash script that grabs some machine stats.
I am having difficulty figuring out where to call this script from within Giraph. In particular, I would like to call it at several key points in Giraph's execution: input/setup, the beginning of computation, and output. The problem is that I can't tell where these "phases" actually happen in Giraph's source, so I don't know where to place the external calls.

An added difficulty is that I want the script to be called once per machine/worker, not once per thread. That means it should not live inside the vertex computation code, for example.

Summary: my goal is to call an external script once per machine at the beginning of setup, computation (at/before superstep 0), and output.

1. Is this possible?
2. If so, could anyone please point me to where these phases happen in the source, in a place that would work for making such an external call? I am guessing this would be the MasterThread file, since that is where all the GiraphTimers are used.
3. Any general advice would be appreciated.

Thanks and regards,
Steve
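
P.S. In case it helps clarify what I am after, here is a rough sketch of the kind of guard I have in mind. It is plain Java with no Giraph calls, and the class name, method name, and phase labels are all made up for illustration. It only guarantees one call per phase per JVM, so it assumes one worker JVM per machine; where to invoke it from inside Giraph is exactly the part I am unsure about.

```java
import java.io.IOException;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper, not part of Giraph: fires an external command
// at most once per phase name within a single JVM.
final class PhaseStatsHook {
    // Phase names that have already fired in this JVM; shared by all threads.
    private static final Set<String> FIRED = ConcurrentHashMap.newKeySet();

    private PhaseStatsHook() { }

    /**
     * Runs the command the first time the given phase name is seen.
     * Returns true if this call launched the command, false if the
     * phase had already fired (from this thread or any other).
     */
    static boolean runOnce(String phase, String... command)
            throws IOException, InterruptedException {
        // add() on a concurrent set is atomic, so exactly one caller wins.
        if (!FIRED.add(phase)) {
            return false;
        }
        Process process = new ProcessBuilder(command)
                .inheritIO()   // let the script write to the JVM's stdout/stderr
                .start();
        process.waitFor();     // block until the stats script has finished
        return true;
    }
}
```

The intended use would be something like PhaseStatsHook.runOnce("setup", "/path/to/collect_stats.sh") at each of the three points, where the script path is a placeholder. If several worker JVMs can end up on the same machine, this would need something machine-wide instead, such as a lock file.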
