Under the hood, Hive tables are just files too.
I am not sure what the INSERT OVERWRITE semantics are in edge cases
(for example, if your query fails), but you may be able to simulate it
using the 'fs -mv' and 'rmf' (or 'fs -rmr') commands that Pig provides
for operating on the Hadoop file system.
Note that for safety, Pig will refuse to run if you are trying to
write into a directory that already exists, so you *must* use a move
or a remove if you might already have data in the target location.
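
A minimal sketch of that move-and-remove approach, assuming FE is the
relation built in the quoted script below and that 'A' and 'A_tmp' are
placeholder paths (A_tmp is a hypothetical scratch directory, not
anything Pig or Hive defines). The swap is not atomic, and I have not
verified how a failed store interacts with the commands that follow it,
so check that the store succeeded before trusting the result.

```pig
-- Clear any leftovers from an earlier failed run (rmf does not
-- complain if the path is missing).
rmf A_tmp

-- Write the new result to the scratch directory, not to the live one.
store FE into 'A_tmp';

-- Drop the old data, then move the new result into place.
rmf A
mv A_tmp A
```

Writing to a scratch directory first keeps the old contents of 'A'
around until the new output exists, which is the closest this gets to
overwrite behavior.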

All of that goes out the window for both Hive and Pig if you are using
custom SerDes/StoreFuncs, which can do more or less whatever they
want.

-D

On Wed, May 5, 2010 at 11:07 AM, Thejas Nair <[email protected]> wrote:
> Hi Syed,
>
> 1. Released versions of Pig don't support the concept of a table; there will
> be one in the Owl-specific loaders once they are available. Pig Latin output
> goes into files (if the store command is used) or to STDOUT (if dump is
> used). The behavior if the file already exists is determined by the
> StoreFunc; PigStorage will give an error if the file already exists.
>
>
> Re 2 & 3: here is the translation to Pig Latin.
>
> L = load 'B' as (id, dept_type, dept_id, visible_flag, org_type);
>
> FIL = filter L by visible_flag == 1;
>
> G = group FIL BY (id, dept_type);
>
> FE = foreach G {
>   DEPT_IDS = FIL.dept_id;
>   DIST_DEPT_IDS = distinct DEPT_IDS;
>   generate group.id, 'S' as org_type, group.dept_type,
>            COUNT_STAR(FIL) as cnt, COUNT(DIST_DEPT_IDS) as cnt_distinct;
> }
>
> describe FE;
> FE: {cnt_distinct: long,cnt: long,id: bytearray,dept_type:
> bytearray,org_type: chararray}
>
> store FE into 'A';
>
>
> On 5/4/10 4:36 PM, "Syed Wasti" <[email protected]> wrote:
>
>>
>>
>> Hi,
>> I am new to Hadoop and the Pig Latin language.
>> I am trying to convert the Hive QL below to Pig Latin. Any suggestions,
>> please?
>>
>> INSERT OVERWRITE TABLE A
>> SELECT id, org_type, dept_type, cnt, cnt_distinct
>> FROM (SELECT id, 'S' org_type, dept_type, COUNT(1) cnt, COUNT(DISTINCT
>> dept_id) cnt_distinct
>>          FROM B
>>          WHERE visible_flag = 1
>>          GROUP BY id, dept_type) t;
>>
>> Questions:
>> 1. Is there an option to overwrite the table? Or what does Pig Latin offer?
>> 2. You can see in the inner query ("'S' org_type") that I am creating a new
>> column and inserting 'S' as its value. What does Pig Latin offer for this?
>> 3. Related to Q2, with "COUNT(1) cnt" I am counting the rows for each
>> (id, dept_type) combination and putting that count in a new column.
>> How can I do this in Pig?
>>
>> Thanks for your help.
>>
>> Regards
>> MD
>>
>>
>
>
