Hello,
When I say ³not working,² I mean that the new partitions are not loaded
into hive. Strangely, the hive action completes successfully and does not
report any errors. From the MR job logs:
OK
Time taken: 1.49 seconds
OK
Time taken: 0.023 seconds
OK
Time taken: 1.811 seconds
OK
Time taken: 0.328 seconds
OK
Time taken: 1.566 seconds
I assume that the five ³OK² messages correspond to the five hive commands
in my script. From that, it appears that the Œmsck repair table checks¹
command completes successfully. However, I believe it should be taking
much longer than 1.5 seconds. For reference, when my hive script is run
manually, we have something like:
OK
Time taken: 1.33 seconds
OK
Time taken: 0.01 seconds
OK
Time taken: 2.681 seconds
OK
Time taken: 0.155 seconds
OK
Partitions not in
metastore: checks:p_date=2013-04-01/p_merchant_id=98
checks:p_date=2013-04-
02/p_merchant_id=98 checks:p_date=2013-04-03/p_merchant_id=98
checks:p_date
=2013-04-04/p_merchant_id=98 checks:p_date=2013-04-05/p_merchant_id=98
chec
ks:p_date=2013-04-06/p_merchant_id=98 checks:p_date=2013-04-07/p_merchant_i
d=98 checks:p_date=2013-04-08/p_merchant_id=98
checks:p_date=2013-04-09/p_m
erchant_id=98 checks:p_date=2013-04-10/p_merchant_id=98
checks:p_date=2013-
04-11/p_merchant_id=98 checks:p_date=2013-04-12/p_merchant_id=98
checks:p_d
ate=2013-04-13/p_merchant_id=98 checks:p_date=2013-04-14/p_merchant_id=98
c
hecks:p_date=2013-04-15/p_merchant_id=98
checks:p_date=2013-04-16/p_merchan
t_id=98 checks:p_date=2013-04-17/p_merchant_id=98
checks:p_date=2013-04-18/
p_merchant_id=98 checks:p_date=2013-04-19/p_merchant_id=98
checks:p_date=20
13-04-20/p_merchant_id=98 checks:p_date=2013-04-21/p_merchant_id=98
checks:
p_date=2013-04-22/p_merchant_id=98 checks:p_date=2013-04-23/p_merchant_id=9
8 checks:p_date=2013-04-24/p_merchant_id=98
checks:p_date=2013-04-25/p_merc
hant_id=98 checks:p_date=2013-04-26/p_merchant_id=98
checks:p_date=2013-04-
27/p_merchant_id=98 checks:p_date=2013-04-28/p_merchant_id=98
checks:p_date
=2013-04-29/p_merchant_id=98 checks:p_date=2013-04-30/p_merchant_id=98
chec
ks:p_date=2013-05-01/p_merchant_id=98 checks:p_date=2013-11-01/p_merchant_i
d=142 checks:p_date=2013-11-02/p_merchant_id=142
Repair: Added partition to metastore
checks:p_date=2013-04-01/p_merchant_id=98
Repair: Added partition to metastore
checks:p_date=2013-04-02/p_merchant_id=98
Repair: Added partition to metastore
checks:p_date=2013-04-03/p_merchant_id=98
Repair: Added partition to metastore
checks:p_date=2013-04-04/p_merchant_id=98
Repair: Added partition to metastore
checks:p_date=2013-04-05/p_merchant_id=98
Repair: Added partition to metastore
checks:p_date=2013-04-06/p_merchant_id=98
Repair: Added partition to metastore
checks:p_date=2013-04-07/p_merchant_id=98
Repair: Added partition to metastore
checks:p_date=2013-04-08/p_merchant_id=98
Repair: Added partition to metastore
checks:p_date=2013-04-09/p_merchant_id=98
Repair: Added partition to metastore
checks:p_date=2013-04-10/p_merchant_id=98
Repair: Added partition to metastore
checks:p_date=2013-04-11/p_merchant_id=98
Repair: Added partition to metastore
checks:p_date=2013-04-12/p_merchant_id=98
Repair: Added partition to metastore
checks:p_date=2013-04-13/p_merchant_id=98
Repair: Added partition to metastore
checks:p_date=2013-04-14/p_merchant_id=98
Repair: Added partition to metastore
checks:p_date=2013-04-15/p_merchant_id=98
Repair: Added partition to metastore
checks:p_date=2013-04-16/p_merchant_id=98
Repair: Added partition to metastore
checks:p_date=2013-04-17/p_merchant_id=98
Repair: Added partition to metastore
checks:p_date=2013-04-18/p_merchant_id=98
Repair: Added partition to metastore
checks:p_date=2013-04-19/p_merchant_id=98
Repair: Added partition to metastore
checks:p_date=2013-04-20/p_merchant_id=98
Repair: Added partition to metastore
checks:p_date=2013-04-21/p_merchant_id=98
Repair: Added partition to metastore
checks:p_date=2013-04-22/p_merchant_id=98
Repair: Added partition to metastore
checks:p_date=2013-04-23/p_merchant_id=98
Repair: Added partition to metastore
checks:p_date=2013-04-24/p_merchant_id=98
Repair: Added partition to metastore
checks:p_date=2013-04-25/p_merchant_id=98
Repair: Added partition to metastore
checks:p_date=2013-04-26/p_merchant_id=98
Repair: Added partition to metastore
checks:p_date=2013-04-27/p_merchant_id=98
Repair: Added partition to metastore
checks:p_date=2013-04-28/p_merchant_id=98
Repair: Added partition to metastore
checks:p_date=2013-04-29/p_merchant_id=98
Repair: Added partition to metastore
checks:p_date=2013-04-30/p_merchant_id=98
Repair: Added partition to metastore
checks:p_date=2013-05-01/p_merchant_id=98
Repair: Added partition to metastore
checks:p_date=2013-11-01/p_merchant_id=142
Repair: Added partition to metastore
checks:p_date=2013-11-02/p_merchant_id=142
Time taken: 6.12 seconds
I am confused as to why running the script manually would work but the
hive action would not. Thanks for your help!
Best,
Andrew
On 2/8/14, 1:02 PM, "Harsh J" <[email protected]> wrote:
>When you say it is 'not working', what do you mean? What error do you
>observe when you check task logs of the spawned launcher MR job of the
>Hive Action?
>
>On Fri, Feb 7, 2014 at 11:09 PM, Andrew O'Neill <[email protected]>
>wrote:
>> Hello everyone,
>>
>> I am having an issue running the command "msck repair table
>><tablename>" from an ooze hive action. My hive action is completing
>>successfully; the table is created but the data doesn't appear in hive
>>until I manually run "msck repair table checks;". Why would this command
>>work when run manually, but not work when run via oozie hive action?
>>Again, every other part of the hive script appears to be working as
>>expected. My hive script "updatepartitions.q" is as follows:
>>
>> create database IF NOT EXISTS ${scope};
>> use ${scope};
>> drop table checks;
>> create external table checks
>> partitioned by (p_date string, p_merchantId bigint)
>> row format serde 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
>> stored as inputformat
>>'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
>> outputformat
>>'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
>> location '${datasetBase}/${scope}/checks'
>> tblproperties (
>> 'avro.schema.url'='${avroSchemaFile}'
>> );
>> msck repair table checks;
>>
>>
>> And my hive action is:
>>
>> <hive xmlns="uri:oozie:hive-action:0.2">
>> <job-tracker>${jobTracker}</job-tracker>
>> <name-node>${nameNode}</name-node>
>> <job-xml>${currentAutomation}/job-defaults.xml</job-xml>
>> <script>updatepartitions.q</script>
>> <param>scope=${scope}</param>
>> <param>datasetBase=${datasetBase}</param>
>> <param>avroSchemaFile=${avroSchemaFile}</param>
>> </hive>
>>
>> Any thoughts or feedback are much appreciated. Thanks!
>>
>>
>> Best,
>> Andrew
>
>
>
>--
>Harsh J