Vincent Barat
Thu, 18 Mar 2010 15:23:51 -0700
Hi,

I wonder whether it is faster to first extract only the fields of interest from a bag of tuples before performing other operations on it, or whether the optimizer handles this automatically.
For example, is:

    ssessions = FOREACH sessions GENERATE imei;
    imei_sessions = GROUP ssessions BY imei;
    imei_session_count = FOREACH imei_sessions GENERATE group, COUNT(ssessions);

faster than:

    imei_sessions = GROUP sessions BY imei;
    imei_session_count = FOREACH imei_sessions GENERATE group, COUNT(sessions);

Thanks for your help