Hi,
Suppose I get a stream of data.
Now I want to filter this stream data and then insert it into a Hive table.
For example, let's say a file datas.txt has the following info:
guru12delhi
prasad13gurgaon
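One common way to do this (a sketch only; the table and column names below are invented for illustration, and the filter condition is just an example) is to load the raw lines into a staging table, then filter with an INSERT ... SELECT:

```sql
-- Hypothetical staging table holding each raw line as a string.
CREATE TABLE raw_datas (line STRING);
LOAD DATA LOCAL INPATH 'datas.txt' INTO TABLE raw_datas;

-- Target table; keep only the rows that match the filter.
CREATE TABLE filtered_datas (line STRING);
INSERT OVERWRITE TABLE filtered_datas
SELECT line
FROM raw_datas
WHERE line LIKE '%delhi%';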
On Wed, Aug 25, 2010 at 1:07 AM, lei liu liulei...@gmail.com wrote:
When Hadoop runs a job submitted by Hive, Hadoop needs
hive_exec.jar. How does Hive add hive_exec.jar to the Hadoop job?
Please tell me where that code is in Hive.
Thanks,
LiuLei
I think what you are looking for is a
Hi Neil,
On Wed, Aug 25, 2010 at 2:41 PM, Neil Xu neil.x...@gmail.com wrote:
You can set the input path and output path for each job, and run the jobs in
order.
ex. TwoJobs.java
public class TwoJobs extends Configured implements Tool {
public static class Job1Mapper extends MapReduceBase
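A minimal sketch of that chaining idea with the old (org.apache.hadoop.mapred) API, following the TwoJobs example above; this requires the Hadoop jars on the classpath, and the intermediate path and job wiring here are illustrative assumptions, not the original poster's code:

```java
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

// Sketch only: runs two MapReduce jobs in order, feeding the
// first job's output path to the second job as its input path.
public class TwoJobs extends Configured implements Tool {

    public int run(String[] args) throws Exception {
        // Job 1: reads the raw input, writes to an intermediate path.
        JobConf job1 = new JobConf(getConf(), TwoJobs.class);
        FileInputFormat.setInputPaths(job1, new Path(args[0]));
        FileOutputFormat.setOutputPath(job1, new Path("/tmp/job1-out"));
        JobClient.runJob(job1); // blocks until job 1 finishes

        // Job 2: its input is job 1's output, so it runs strictly after.
        JobConf job2 = new JobConf(getConf(), TwoJobs.class);
        FileInputFormat.setInputPaths(job2, new Path("/tmp/job1-out"));
        FileOutputFormat.setOutputPath(job2, new Path(args[1]));
        JobClient.runJob(job2);
        return 0;
    }

    public static void main(String[] args) throws Exception {
        System.exit(ToolRunner.run(new TwoJobs(), args));
    }
}
```

The mapper/reducer classes would be set on each JobConf as in any ordinary job; the point is only that JobClient.runJob blocks, so the second job cannot start before the first completes.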
Yes, it is optimized by Hive. There will be only 1 MR job, even if the columns
selected are different.
-namit
From: Neil Xu [neil.x...@gmail.com]
Sent: Wednesday, August 25, 2010 2:40 AM
To: hive-user@hadoop.apache.org
Subject: How is Union All
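For reference, a query shaped like the one being discussed (the table and column names here are invented for illustration); per the answer above, Hive compiles both branches into a single MapReduce job:

```sql
-- Two UNION ALL branches over the same table, selecting different
-- expressions; Hive merges them into one MR job.
SELECT u.k, u.v FROM (
  SELECT key AS k, value AS v FROM src
  UNION ALL
  SELECT key AS k, upper(value) AS v FROM src
) u;
```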
Hi,
I executed ALTER TABLE tablename ADD COLUMNS (newcolumns STRING) to add a
column.
Before I added this column, the table had 4 columns; the new column comes after
the 4th column and before the partition column. The new data now exists in the
5th column in HDFS, but when I execute SELECT * FROM tablename,
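A sketch of the situation described (the table name tablename is from the message; the column names are invented for illustration):

```sql
-- Original table: 4 data columns plus a partition column.
CREATE TABLE tablename (c1 STRING, c2 STRING, c3 STRING, c4 STRING)
PARTITIONED BY (dt STRING);

-- Adds newcolumns as the 5th data column, before the partition column.
ALTER TABLE tablename ADD COLUMNS (newcolumns STRING);

-- SELECT * now returns: c1, c2, c3, c4, newcolumns, dt
SELECT * FROM tablename;
```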
Hi Maxim,
I misunderstood what you want. You need a job chain where an MR job (not
Hive) can be run automatically after a Hive job is done, and temp files can
also be cleaned up automatically?
I have no idea either, but in our company, a scheduling system is
implemented to manage different kinds