What is going to generate the "pig -param param1=..." command line?
Couldn't the field names be passed as a single argument to the UDF instead? For example:
REGISTER /opt/apache_pig/pig-0.10.1/contrib/piggybank/java/piggybank.jar;
REGISTER /tmp/custudf.jar;
DEFINE XMLProcessor org.sdc.map.processor.XMLProcessor('$fields');
PRODUCTS = load 'product.xml' using
org.apache.pig.piggybank.storage.XMLLoader('product') as (line:chararray);
PRODUCT = FOREACH PRODUCTS GENERATE FLATTEN(XMLProcessor(line)) as
(id:chararray, name:chararray, description:chararray);
and you call it with "pig -param fields=name,description" (the UDF's
constructor would then receive that comma-separated list as a string).
There always has to be an output format, so wouldn't a %default work for
that parameter?
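
As for what generates the command line: below is a minimal sketch of an
external launcher, assuming Python is available on the machine that runs pig.
It joins whatever field names were requested into the single "fields"
parameter and maps the output format to an exporter script, so the if/else
lives outside Pig. The "exporter" parameter name, the EXPORTERS table and
build_pig_command are made up for illustration; wrapper.pig would still have
to reference $fields and $exporter itself.

```python
import shlex

# Hypothetical mapping from output format to exporter script.
# The script names are the ones mentioned in this thread.
EXPORTERS = {"csv": "excelexporter.pig", "html": "htmlexporter.pig"}


def build_pig_command(fields, outfileformat, wrapper="wrapper.pig"):
    """Return the argv list for a single pig invocation.

    Every parameter gets its own -param flag; the variable-length list of
    fields is collapsed into one comma-separated value.
    """
    if outfileformat not in EXPORTERS:
        raise ValueError("unsupported output format: %s" % outfileformat)
    if not fields:
        raise ValueError("at least one field is required")
    return [
        "pig",
        "-param", "fields=" + ",".join(fields),
        "-param", "exporter=" + EXPORTERS[outfileformat],
        wrapper,
    ]


if __name__ == "__main__":
    cmd = build_pig_command(["name", "description"], "csv")
    # Print the command rather than running it, so the sketch works without Pig.
    print(" ".join(shlex.quote(part) for part in cmd))
```

To actually launch the job, the returned list could be handed to
subprocess.call() instead of being printed.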
2013/2/20 Siddhi Borkar <[email protected]>
> I will not be able to use %default statement in my pig script, as the
> parameters being passed to my pig script are not fixed. I would need a
> conditional check to be done in my pig script to check for each and every
> input parameter if it is passed or not.
> Also, there are no conditional operators (if/else) available in pig.
>
> Following is the pseudocode of the functionality I want to achieve:
>
> Consider pig files:
> 1) xmlparser.pig
> 2) excelexporter.pig
> 3) htmlexporter.pig
>
> 1) xmlparser.pig
> REGISTER /opt/apache_pig/pig-0.10.1/contrib/piggybank/java/piggybank.jar;
> REGISTER /tmp/custudf.jar;
>
> DEFINE XMLProcessor org.sdc.map.processor.XMLProcessor();
> PRODUCTS = load 'product.xml' using
> org.apache.pig.piggybank.storage.XMLLoader('product') as (line:chararray);
> PRODUCT = FOREACH PRODUCTS GENERATE FLATTEN(XMLProcessor(line)) as
> (id:chararray, name:chararray, description:chararray);
>
> Please note, XMLProcessor is a custom java based udf which parses the xml.
>
> 2) excelexporter.pig
> STORE PRODUCT INTO '/tmp/prod.csv' USING
> CSVExcelStorage(',','NO_MULTILINE','UNIX');
>
> 3) htmlexporter.pig
> -- logic for this is not yet implemented
>
> Now the requirement is that I need to write a wrapper pig script which
> invokes the scripts above and generates an output. The parameters that
> will be passed are the input params and the output file format.
>
> For example:
> pig -param param1=name -param param2=description -param outfileformat=csv wrapper.pig
>
> Now what I need to do is based on the params passed to the wrapper pig
> script, I need to send inputs to the xml parser and parse the input params.
> In the above case since name and description are passed as params the xml
> should be parsed only for these 2 fields.
> Any idea how this can be achieved in a pig script?
>
> Also depending on the output file format, I need to invoke the
> corresponding exporter script (html or csv) from my wrapper script. I don’t
> see any conditional operators available (if/else) in pig. Any idea how this
> can be achieved?
>
> -----Original Message-----
> From: Jonathan Coveney [mailto:[email protected]]
> Sent: Wednesday, February 20, 2013 2:38 PM
> To: [email protected]
> Subject: Re: reading input parameters in a pig script
>
> Reiterating Prashant's comments.
>
> In the script though you can have a %default statement which will define
> the default value for a parameter, which can also be overridden. My guess is
> this might let you do what you want?
>
>
> 2013/2/20 Prashant Kommireddi <[email protected]>
>
> > Hi Siddhi,
> >
> > "Is there any way to access these params in the script without
> > referring to the param name?" -- how would you associate a param value
> > to pig statement?
> >
> > I am guessing in this case your pig script is also dynamically generated?
> > You could use PigServer API
> > http://pig.apache.org/docs/r0.10.0/api/org/apache/pig/PigServer.html
> > to generate params in a Java program and embed them into a script.
> >
> > -Prashant
> >
> >
> > On Tue, Feb 19, 2013 at 3:44 PM, Siddhi Borkar <
> > [email protected]> wrote:
> >
> > >
> > > Consider the following command
> > > pig -param param1=test -param param2=test1 -param param3=test2 myscript.pig
> > >
> > > In my case the parameters are dynamic, as in I could either pass
> > > param1 only or I could pass all three params or some extra params.
> > >
> > > Since the parameters are dynamic, in my pig script I will not be
> > > able to reference the parameters as '$param1'. Is there any way to
> > > access these params in the script without referring to the param name?
> > >
> > > ________________________________________
> > > From: Jonathan Coveney [[email protected]]
> > > Sent: Tuesday, February 19, 2013 6:42 PM
> > > To: [email protected]
> > > Subject: Re: reading input parameters in a pig script
> > >
> > > Can you give an example of what you'd like this to look like?
> > >
> > >
> > > 2013/2/19 Siddhi Borkar <[email protected]>
> > >
> > > > Hi ,
> > > >
> > > > I need to pass parameters dynamically to a pig script. Is there any
> > > > way to read the parameters passed and their corresponding values
> > > > without giving the parameter names in the pig script?
> > > >
> > > > Thanks,
> > > > Siddhi
> > > >
> > > > DISCLAIMER
> > > > ==========
> > > > This e-mail may contain privileged and confidential information which
> > > > is the property of Persistent Systems Ltd. It is intended only for the
> > > > use of the individual or entity to which it is addressed. If you are
> > > > not the intended recipient, you are not authorized to read, retain,
> > > > copy, print, distribute or use this message. If you have received this
> > > > communication in error, please notify the sender and delete all copies
> > > > of this message. Persistent Systems Ltd. does not accept any liability
> > > > for virus infected mails.
> > > >
> > >
> >
>