Hi,

I'm doing some log processing in Pig where I extract the typical Apache log fields, filter, do some transformations, and then write the processed data to a file.

I'm finding that maintaining the list of extracted fields is somewhat cumbersome, though (and using * is too sloppy for a maintainable script), so I'm wondering if I can package them in a tuple. Currently I'm doing:

    register file:/home/hadoop/lib/pig/piggybank.jar
    logs = load '$input' USING LogLoader as (remoteAddr, remoteLogname, user, time:chararray, method, uri:chararray, proto, status, bytes, referer, userAgent);

and logs has 11 fields. Could I do something like:

    register file:/home/hadoop/lib/pig/piggybank.jar
    logs = load '$input' USING LogLoader as (remoteAddr, remoteLogname, user, time:chararray, method, uri:chararray, proto, status, bytes, referer, userAgent);
    tupled_logs = remoteAddr.. as apache:tuple;

in which tupled_logs has only one field? I've tried this, but I haven't found the magic words yet.

Thanks,
Ranjan
