Basically, I want a way to see the schema of a relation defined in a Pig script from outside of Pig, ideally without having to connect to Hadoop to do so.
So, for example, take a random script:

    a = LOAD blah AS (one:int, two:chararray, three:int);
    b = FOREACH a GENERATE one, two;

Ideally, I want a way to get the result of DESCRIBE b; but from outside of Pig. One ugly way I can think of would be to create a temporary script, append DESCRIBE b;, strip out any STORE and DUMP statements, run the job locally, and keep only the DESCRIBE output. I was hoping there might be a nicer way to do it. If not, how do I run that sort of thing locally, forcing Pig not to go to my Hadoop cluster?

I appreciate your help,
Jon
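
To make the "ugly way" concrete, here is a rough Python sketch of the script-rewriting step only. The function name make_describe_script is made up for illustration, and the sketch assumes statements are separated by ";" with no semicolons inside string literals or comments, which a real script may not satisfy.

```python
import re


def make_describe_script(script: str, alias: str) -> str:
    """Return a copy of a Pig script with STORE/DUMP statements removed
    and a DESCRIBE statement for `alias` appended.

    Hypothetical helper: splitting on ';' is a naive assumption and will
    break on semicolons inside quoted strings or comments.
    """
    statements = [s.strip() for s in script.split(";")]
    # Drop empty fragments and any statement that starts with STORE or DUMP.
    kept = [
        s for s in statements
        if s and not re.match(r"(?i)(store|dump)\b", s)
    ]
    kept.append(f"DESCRIBE {alias}")
    # Re-terminate every statement and put one statement per line.
    return "\n".join(s + ";" for s in kept) + "\n"


if __name__ == "__main__":
    original = (
        "a = LOAD 'blah' AS (one:int, two:chararray, three:int); "
        "b = FOREACH a GENERATE one, two; "
        "STORE b INTO 'out';"
    )
    print(make_describe_script(original, "b"))
```

The output could then be written to a temporary file and run with pig -x local, which selects Pig's local execution mode instead of the cluster; parsing the DESCRIBE line out of Pig's output is left out of this sketch.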
