You may want to check why system time is high. Check your system call stats;
that should give you some clues.
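
As a starting point, here is a minimal sketch of how the user versus system split could be read on Linux. It assumes the aggregate "cpu" line of /proc/stat in the field order documented in proc(5); the function name cpu_shares is made up for illustration. The numbers are totals since boot, so for a live view you would sample twice and compare the differences.

```python
import os


def cpu_shares(stat_text):
    """Return the fraction of CPU time spent in each mode, from /proc/stat text."""
    names = ["user", "nice", "system", "idle", "iowait", "irq", "softirq"]
    for line in stat_text.splitlines():
        fields = line.split()
        # The aggregate line is labelled exactly "cpu"; per-core lines are cpu0, cpu1, ...
        if fields and fields[0] == "cpu":
            ticks = [int(v) for v in fields[1:1 + len(names)]]
            total = sum(ticks)
            if total == 0:
                raise ValueError("all CPU counters are zero")
            return {name: t / total for name, t in zip(names, ticks)}
    raise ValueError("no aggregate 'cpu' line found")


# Only attempt the live read where /proc/stat exists (i.e. on Linux).
if os.path.exists("/proc/stat"):
    with open("/proc/stat") as f:
        shares = cpu_shares(f.read())
    print("user: {:.1%}  system: {:.1%}".format(shares["user"], shares["system"]))
```

If the system share is large, attaching `strace -c -p <pid>` to the busy process for a short while is one way to get per-call counts.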
-Bharath
From: Robert Dyer
To: user@hadoop.apache.org; Bharath Mundlapudi
Sent: Monday, December 10, 2012 7:32 PM
Subject: Re: Strange machine
What was the job or query you were running?
A couple of suggestions:
1. Reduce the data set size with job chaining.
2. Increase the reduce task heap size.
3. If you are using Hive/Pig, you may want to tune your query.
-Bharath
From: Manoj Babu
To: user@hadoop.apache.org
Sen
Are you seeing any performance impact from this cache increase? It is normal
for a Linux system to use a large amount of memory for cache.
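
To see how much of the "used" memory is really just cache, a rough sketch like the following can help. It assumes the usual /proc/meminfo layout ("Key:  value kB"); the helper names meminfo_kb and reclaimable_kb are made up here, and the sum is only an approximation of what the kernel could hand back to applications.

```python
import os


def meminfo_kb(text):
    """Parse /proc/meminfo-style text into a dict of integer values (mostly kB)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts:
            values[key.strip()] = int(parts[0])
    return values


def reclaimable_kb(info):
    """Rough estimate: free memory plus buffers plus page cache."""
    return info["MemFree"] + info.get("Buffers", 0) + info.get("Cached", 0)


# Only attempt the live read where /proc/meminfo exists (i.e. on Linux).
if os.path.exists("/proc/meminfo"):
    with open("/proc/meminfo") as f:
        info = meminfo_kb(f.read())
    print("cached: {} kB, free + buffers + cached: {} kB".format(
        info.get("Cached", 0), reclaimable_kb(info)))
```

A large "Cached" value with no swapping and no slowdown is usually nothing to worry about.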
-Bharath
From: Andy Isaacson
To: user@hadoop.apache.org
Sent: Monday, December 10, 2012 11:23 AM
Subject: Re: Strange machine behavior
W
If your cluster holds little data (say, less than a few GB), then the answer is
yes, but it is an expensive route. For large data sets, traditional means are
not feasible and are expensive.
If you want a cost-optimal solution, you could set up another local/remote
cluster and try distcp or simply cop
You should never expose internal host names in the JavaScript/HTML.
The flow can be
Browser --> Tomcat --(REST, HDFS Client)--> HDFS
Your web app can make REST requests to HDFS, and you could use a JAX-RS
implementation for the REST calls in your web app.
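
To illustrate the split between what the browser sees and what the server calls, here is a small sketch in Python (the same idea applies to a JAX-RS client inside Tomcat). It assumes WebHDFS is enabled and uses its /webhdfs/v1/<path>?op=<OP> URL form. The NameNode address, the /files route, and both function names are hypothetical.

```python
from urllib.parse import quote, urlencode

# Hypothetical internal NameNode HTTP address; lives in server-side config only.
NAMENODE_HTTP = "http://namenode.internal.example:50070"


def webhdfs_url(hdfs_path, op, **params):
    """Build the WebHDFS URL the web app calls on the server side."""
    if not hdfs_path.startswith("/"):
        raise ValueError("HDFS path must be absolute")
    query = urlencode(dict(op=op, **params))
    return "{}/webhdfs/v1{}?{}".format(NAMENODE_HTTP, quote(hdfs_path), query)


def public_download_link(hdfs_path):
    """Build the link sent to the browser; it points at the web app, not at HDFS."""
    return "/files" + quote(hdfs_path)


print(public_download_link("/data/report.csv"))  # what the browser sees
print(webhdfs_url("/data/report.csv", "OPEN"))   # what the server requests
```

Because an OPEN request is redirected to a DataNode, the web app should follow that redirect itself and stream the bytes back, so no cluster host name ever reaches the page.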
I must warn that user experience will suffer by any of th