I have three Pig scripts that load data from the same log file but filter and group it differently. When I combine the three into one script and LOAD only once, performance seems to improve, and now I am curious: what exactly does LOAD do?
How does LOAD work internally? Does Pig save the results of the LOAD to a separate location in HDFS? Could someone please explain how LOAD relates to MapReduce? Thanks.
