Re: Reading files in outputSchema

Alan Gates Mon, 16 Apr 2012 10:50:13 -0700

outputSchema is called on the machine where you start your Pig job (referred 
to as the front end). Where you store the schema file is independent of this, 
however. You can still store it in HDFS and read it from there on your client. 
By definition your client must be able to read and write HDFS files to use Pig 
anyway. It is generally better to store it in HDFS so that you don't have to 
keep multiple copies on multiple clients and risk having copies get out of date.
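
As a rough illustration of the reading step, here is a minimal sketch that 
parses a schema description from an InputStream using only the JDK. The XML 
layout (<field name="..." type="..."/>), the class name XmlSchemaSketch and the 
method readFields are assumptions made for this example, not anything from 
Pig or from the original question. In a real EvalFunc the stream would 
presumably come from HDFS (for example via Hadoop's FileSystem.open(Path)), and 
the name/type pairs would be turned into a Pig Schema inside outputSchema; 
both steps are left out so the sketch runs without Hadoop or Pig on the 
classpath.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class XmlSchemaSketch {

    // Reads every <field name="..." type="..."/> element (a hypothetical
    // layout) and returns {name, type} pairs in document order.
    static List<String[]> readFields(InputStream in) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(in);
        NodeList nodes = doc.getElementsByTagName("field");
        List<String[]> fields = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            Element e = (Element) nodes.item(i);
            fields.add(new String[] {
                e.getAttribute("name"), e.getAttribute("type")
            });
        }
        return fields;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for the file contents; in a UDF this stream would be
        // opened from HDFS on the front end instead.
        String xml = "<schema>"
                + "<field name=\"id\" type=\"long\"/>"
                + "<field name=\"title\" type=\"chararray\"/>"
                + "</schema>";
        InputStream in =
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        for (String[] f : readFields(in)) {
            System.out.println(f[0] + ":" + f[1]);
        }
    }
}
```

Keeping the parsing separate from where the stream comes from means the same 
code works whether the file sits in HDFS or on the local client.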


Alan.

On Apr 13, 2012, at 12:43 AM, Rajgopal Vaithiyanathan wrote:

> Where will outputSchema be executed: on the client or as a MapReduce job?
> I ask because the output schema of my EvalFunc is stored in an XML file. I
> need to read this file and generate the output schema.
> Where should I place this XML file: on the client or in HDFS?
> 
> :)
> Raj
