Ah, sorry I missed your earlier reply. I used Python because it's more
flexible: it can generate the Pig script from XML files that describe
all the fields in my input and output files. The same XML files can
also be used with Hive.
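
The script quoted below calls getInputPath and getOutputPath without
defining them. A minimal sketch of what they might look like, assuming
the argument is always a ten-digit yyyymmddhh string; the _hour_dir
helper and the validation are illustrative additions, not part of the
original script:

```python
from datetime import datetime


def _hour_dir(yyyymmddhh):
    # Hypothetical helper: turn "2012091108" into "2012/09/11/08".
    # Reject anything that is not exactly ten digits before parsing.
    if len(yyyymmddhh) != 10 or not yyyymmddhh.isdigit():
        raise ValueError("expected yyyymmddhh, got %r" % (yyyymmddhh,))
    # strptime also rejects impossible values such as month 13 or hour 24.
    parsed = datetime.strptime(yyyymmddhh, "%Y%m%d%H")
    return parsed.strftime("%Y/%m/%d/%H")


def getInputPath(yyyymmddhh):
    # yyyymmddhh to "YYYY/MM/DD/HH/input"
    return _hour_dir(yyyymmddhh) + "/input"


def getOutputPath(yyyymmddhh):
    # yyyymmddhh to "YYYY/MM/DD/HH/output"
    return _hour_dir(yyyymmddhh) + "/output"
```

Deriving both paths from the same hour directory keeps input and
output in step, which is what the original question asked for.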

On Fri, Sep 14, 2012 at 5:04 AM, Ruslan Al-Fakikh <[email protected]> wrote:
> MiaoMiao, Mohit,
>
> If we are talking about embedding Pig into Python, I'd like to add
> that we can also embed Pig into Java using PigServer
> http://wiki.apache.org/pig/EmbeddedPig
>
> MiaoMiao, what's the purpose of embedding here (if we already have the
> parameter substitution feature)? I guess Pig embedding is mostly
> suitable when we want to add IF/ELSE or LOOP functionality.
>
> Thanks
>
> On Thu, Sep 13, 2012 at 6:31 AM, MiaoMiao <[email protected]> wrote:
>> I wrote a Python script to do this:
>>
>> import sys
>> from org.apache.pig.scripting import Pig
>> yyyymmddhh = sys.argv[1]
>> inputPath = getInputPath(yyyymmddhh) #yyyymmddhh to "YYYY/MM/DD/HH/input"
>> outputPath = getOutputPath(yyyymmddhh) #yyyymmddhh to "YYYY/MM/DD/HH/output"
>> pigScript = '''
>> some = load '$input' using PigStorage(',')
>>     as(
>>         id:INT,
>>         value:INT
>>     );
>> final = ..... ;
>> STORE final INTO '$output' using PigStorage(',');
>> '''
>> P = Pig.compile(pigScript)
>> result = P.bind({'input':inputPath, 'output':outputPath}).runSingle()
>> if result.isSuccessful():
>>     print 'Pig job succeeded'
>> else:
>>     raise RuntimeError('Pig job failed')
>>
>> Then you can run it with Pig:
>> pig -x local pig.py 2012091108
>>
>> On Tue, Sep 11, 2012 at 7:11 AM, Mohit Anchlia <[email protected]> wrote:
>>> Our input path is something like YYYY/MM/DD/HH/input and we'd like to write
>>> to YYYY/MM/DD/HH/output. Is it possible to get the input path as a String
>>> and convert it to YYYY/MM/DD/HH/output so that I can use it in the
>>> "store into" clause?
