Hi there, I'm planning to run some performance measurements on my Hadoop Pig code to see how it scales. Does anyone have suggestions on how to do that?
My idea is to measure the completion time in two ways: first on a fixed cluster size while increasing the input data, then with fixed input data while adding cluster nodes. Does anyone have experience doing that? I thought of writing a script that starts a timer, executes the pig command, and stops the timer when the command finishes. Maybe there's a better way?

Best, Will
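
P.S. To make the idea concrete, here is a rough sketch of the kind of wrapper script I have in mind. It only measures wall-clock time around the pig command, repeated a few times per input. The script name, the `INPUT` parameter name, and the input paths are placeholders I made up for illustration, and it assumes `pig` is on the PATH and the Pig script reads its input via `$INPUT`.

```python
import subprocess
import sys
import time


def time_command(cmd, runs=3):
    """Run cmd `runs` times and return the wall-clock seconds of each run."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True raises CalledProcessError if the command exits non-zero,
        # so a failed job is not silently recorded as a timing.
        subprocess.run(cmd, check=True)
        timings.append(time.perf_counter() - start)
    return timings


def pig_command(script, input_path):
    """Build the pig invocation.

    Assumption: the Pig script loads its data from a parameter called INPUT.
    """
    return ["pig", "-param", f"INPUT={input_path}", script]


def main(argv):
    # argv[1] is the Pig script, the remaining arguments are input paths
    # of increasing size (all hypothetical).
    script, input_paths = argv[1], argv[2:]
    for input_path in input_paths:
        timings = time_command(pig_command(script, input_path))
        mean = sum(timings) / len(timings)
        print(f"{input_path}\tmin={min(timings):.1f}s\tmean={mean:.1f}s")


if __name__ == "__main__" and len(sys.argv) >= 3:
    main(sys.argv)
```

I would call it with the Pig script followed by the input paths, e.g. `python time_pig.py myscript.pig /data/small /data/medium /data/large`, and rerun the same call after each change in cluster size. Repeating each run and looking at both the minimum and the mean is meant to smooth out noise from other jobs on the cluster.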
