Haha, that's funny, that's exactly what I ended up doing. Python does the job admirably; now if only my Python UDFs would work :s
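
For anyone finding this thread later, here is a rough sketch of the "ugly" approach in Python. It is not the exact script I used. The helper names (`build_describe_script`, `describe`) are made up for illustration, the regex is a naive assumption that STORE/DUMP statements start at the beginning of a line and contain no semicolons inside quoted strings, and it assumes `pig` is on the PATH and prints the DESCRIBE result to stdout.

```python
import re
import subprocess
import tempfile

# Naive match for whole STORE ...; and DUMP ...; statements (case-insensitive).
# Assumption: each such statement starts at the beginning of a line and has
# no ';' inside a quoted string.
_OUTPUT_STMT = re.compile(
    r"^[ \t]*(?:STORE|DUMP)\b[^;]*;[ \t]*\n?",
    re.IGNORECASE | re.MULTILINE,
)


def build_describe_script(script: str, alias: str) -> str:
    """Drop STORE/DUMP statements and append DESCRIBE for the given alias."""
    stripped = _OUTPUT_STMT.sub("", script)
    return stripped.rstrip() + f"\nDESCRIBE {alias};\n"


def describe(script: str, alias: str) -> str:
    """Run the rewritten script with `pig -x local` and return its stdout.

    Hypothetical helper: it raises CalledProcessError if Pig exits non-zero,
    for example when Pig cannot verify that the input files exist.
    """
    with tempfile.NamedTemporaryFile("w", suffix=".pig", delete=False) as f:
        f.write(build_describe_script(script, alias))
        path = f.name
    result = subprocess.run(
        ["pig", "-x", "local", path],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Calling `describe(open("myscript.pig").read(), "b")` would then hand back whatever Pig prints for `DESCRIBE b;`, without touching the cluster. The file name here is just a placeholder.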
Sent via BlackBerry

-----Original Message-----
From: Dmitriy Ryaboy <[email protected]>
Date: Tue, 28 Dec 2010 15:27:02
To: <[email protected]>
Reply-To: [email protected]
Subject: Re: Getting the results of DEFINE from outside of pig?

Do the ugly thing, and you can run it with pig -x local for local mode (though you might run into trouble with Pig trying to verify the existence of files).

PigUnit does essentially the same thing by overriding the Pig parser and simply replacing the parsing code for STOREs :)

D

On Tue, Dec 28, 2010 at 8:22 AM, Jonathan Coveney <[email protected]> wrote:

> Basically, I want a way to be able to see the schema of something from
> within a pig script outside of pig, ideally without having to connect to
> hadoop to do so.
>
> So for example, we take a random script...
>
> a = LOAD blah AS (one:int, two:chararray, three:int);
> b = FOREACH a GENERATE one, two;
>
> ideally I want a way to get the result of DESCRIBE b; but from outside of
> pig.
>
> One ugly way I can think of would be to sort of create a temporary script,
> append DESCRIBE b;, get rid of any stores and dumps, run the job locally,
> and then only take the result.
>
> I was hoping there might be a nicer way to do it, OR, if not, how do I run
> that sort of thing locally, forcing pig not to go onto my hadoop cluster?
>
> I appreciate your help
> Jon
