The best way to answer this is that all Hadoop components work remotely, assuming you have the proper configuration and library files (the same ones used on the remote cluster).
I attached a "HiveLet" (a made-up term); it was my first program for testing the API. It is more or less a one-shot: run the query and exit.
You need to run the metastore database in server mode for concurrent access:
http://wiki.apache.org/hadoop/HiveDerbyServerMode
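If it helps, here is a minimal sketch (my own illustration, not the wiki page verbatim) of the two properties that switch the metastore from embedded Derby to a Derby network server. The host, port, and database name below are placeholders, and normally these settings would live in hive-site.xml rather than in code:

import org.apache.hadoop.hive.conf.HiveConf;

public class RemoteMetastoreConf {
    // Placeholder endpoint: a Derby network server on metastore-host, Derby's default port 1527.
    public static HiveConf build() {
        HiveConf conf = new HiveConf(RemoteMetastoreConf.class);
        conf.set("javax.jdo.option.ConnectionURL",
                 "jdbc:derby://metastore-host:1527/metastore_db;create=true");
        conf.set("javax.jdo.option.ConnectionDriverName",
                 "org.apache.derby.jdbc.ClientDriver");
        return conf;
    }
}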
It is slightly more complicated if your desktop is Windows, but still doable.
You would need:
- the Hadoop conf directory
- the Hive conf directory
- the Hadoop distribution (technically only the jars)
- the Hive distribution (technically only the jars)
When you start Hadoop/Hive, they both pick up the locations of the components from those configurations and will happily run against a remote machine (firewall issues aside).
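As a quick sanity check, a sketch along these lines (my own illustration, not part of the attached program) prints the endpoints the client-side HiveConf resolved from the conf directories on the classpath; the property names are the classic fs.default.name / mapred.job.tracker / javax.jdo.option.ConnectionURL ones:

import org.apache.hadoop.hive.conf.HiveConf;

public class ShowClusterEndpoints {
    public static void main(String[] args) {
        // HiveConf loads hive-site.xml plus the Hadoop *-site.xml files it finds on the classpath.
        HiveConf conf = new HiveConf(ShowClusterEndpoints.class);
        System.out.println("NameNode:   " + conf.get("fs.default.name"));
        System.out.println("JobTracker: " + conf.get("mapred.job.tracker"));
        System.out.println("Metastore:  " + conf.get("javax.jdo.option.ConnectionURL"));
    }
}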
import java.util.Vector;

import org.apache.hadoop.hive.cli.CliSessionState;
import org.apache.hadoop.hive.cli.OptionsProcessor;
import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hadoop.hive.ql.Driver;
import org.apache.hadoop.hive.ql.processors.SetProcessor;
import org.apache.hadoop.hive.ql.session.SessionState;

public class TestHive {
    public static void main(String[] args) throws Exception {
        // Parse command-line options the same way the Hive CLI does.
        OptionsProcessor oproc = new OptionsProcessor();
        if (!oproc.process_stage1(args)) {
            System.out.println("Problem processing args");
        }

        // Initialize logging and start a CLI session using the HiveConf found on the classpath.
        SessionState.initHiveLog4j();
        CliSessionState ss = new CliSessionState(new HiveConf(SessionState.class));
        SessionState.start(ss);
        if (!oproc.process_stage2(ss)) {
            System.out.println("Problem with stage2");
        }

        // 'set' commands go through SetProcessor; queries go through the Driver.
        SetProcessor sp = new SetProcessor();
        Driver qp = new Driver();
        int sret = sp.run("set mapred.map.tasks=1");
        sret = sp.run("set mapred.reduce.tasks=1");
        int ret = qp.run("SELECT people.* from people");

        // Pull back the result rows a batch at a time and print them,
        // clearing the buffer between batches so rows are not repeated.
        Vector<String> res = new Vector<String>();
        while (qp.getResults(res)) {
            System.out.println("ResSize:" + res.size());
            for (String row : res) {
                System.out.print(row + "\n");
            }
            res.clear();
        }
    } // end main
} // end TestHive
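To compile and run it, put both conf directories and the jars from the two distributions on the classpath; the program then talks to whatever cluster and metastore those configurations point at.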