Hadoop Map/Reduce and Hive clarification

Deepak Halale Sat, 12 Sep 2009 16:28:32 -0700

Hi,
I am new to Hadoop and need some clarifications.
a) How can I automate running Map/Reduce jobs and loading data into Hive?
Do I need to create a cron job, or is there a better way?


b) I have 2 tables as the source for M/R jobs: Order Master and Order Detail.
OrderMaster has the order header columns
(OrderId, CustId, PaymentMethod, DeliveryMethod, etc.).
OrderDetail has each order's item-level information (viz.
OrderId, ItemId, Quantity, SalesPrice, CostPrice, DeliveryAddress,
DeliveryState, DeliveryZip, DeliveryCountry).
The relation between Master and Detail is 1 to many and OrderId is the key.

If I generate a tab-delimited file from each table, how is Reduce going to
aggregate the data from OrderDetail? For example, what if I have to sum the
OrderRevenue by order?
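
To make the aggregation in question concrete, here is a minimal sketch in plain Python that imitates the map, shuffle, and reduce phases without using any Hadoop API. It assumes OrderRevenue means Quantity * SalesPrice summed over an order's detail rows, which the message does not define. The sample rows and the function names (map_line, shuffle, reduce_order, run) are made up for illustration.

```python
from collections import defaultdict

# Hypothetical tab-delimited OrderDetail rows (sample data, not from the post).
# Columns: OrderId, ItemId, Quantity, SalesPrice, CostPrice
# (the delivery columns are omitted to keep the sketch short).
ORDER_DETAIL_TSV = (
    "1001\tA1\t2\t10.00\t6.00\n"
    "1001\tB7\t1\t5.50\t3.00\n"
    "1002\tA1\t3\t10.00\t6.00\n"
)


def map_line(line):
    """Map phase: emit (OrderId, line revenue) for one detail row.

    Assumption: revenue for a row is Quantity * SalesPrice.
    """
    fields = line.split("\t")
    order_id = fields[0]
    quantity = int(fields[2])
    sales_price = float(fields[3])
    yield order_id, quantity * sales_price


def shuffle(pairs):
    """Shuffle phase: group all emitted values by key (OrderId)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_order(order_id, revenues):
    """Reduce phase: called once per OrderId with all of its values."""
    return order_id, sum(revenues)


def run(tsv_text):
    """Run the three phases in sequence over the input text."""
    pairs = [
        pair
        for line in tsv_text.splitlines()
        if line
        for pair in map_line(line)
    ]
    grouped = shuffle(pairs)
    return dict(reduce_order(key, values) for key, values in sorted(grouped.items()))


revenue_by_order = run(ORDER_DETAIL_TSV)
print(revenue_by_order)
```

The point of the sketch is that the reducer never sees individual files or rows directly: it receives one key (here OrderId) together with every value the mappers emitted for that key, so summing per order is just a sum over that group.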


Thanks

Deepak
