Re: problem with HDFS caching in Hadoop 2.3

2014-03-09 Thread hequn cheng
Hi, have you solved your problem? I have the same problem. It seems that the cache behavior has not been triggered. 2014-03-07 23:37 GMT+08:00 hwpstorage hwpstor...@gmail.com: Hello, It looks like the HDFS caching does not work well. The cached log file is around 200MB. The Hadoop cluster

Re: MR2 Job over LZO data

2014-03-09 Thread Gordon Wang
Can you run MR jobs (not Pig jobs) that take LZO files as input? If you cannot run MR jobs, you may want to check the LZO compression configuration in core-site.xml. Make sure the dynamic library is in HADOOP_HOME/lib/native/. Here is a FAQ about how to configure lzo
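
The check suggested above (is the native library actually under HADOOP_HOME/lib/native/?) can be scripted. This is only a rough sketch: the function name `find_native_libs` is made up for illustration, and the default pattern `libgplcompression*` assumes the library was built by the hadoop-lzo project; adjust it if your build produces a different file name.

```python
import os
from pathlib import Path


def find_native_libs(hadoop_home, pattern="libgplcompression*"):
    """Return sorted file names in <hadoop_home>/lib/native matching pattern.

    The default pattern is an assumption about the hadoop-lzo build output.
    """
    native_dir = Path(hadoop_home) / "lib" / "native"
    if not native_dir.is_dir():
        # No native directory at all: nothing can be loaded from it.
        return []
    return sorted(p.name for p in native_dir.glob(pattern))


if __name__ == "__main__":
    home = os.environ.get("HADOOP_HOME")
    if not home:
        print("HADOOP_HOME is not set")
    else:
        libs = find_native_libs(home)
        print(libs if libs else "no matching native library found")
```

An empty result only says the file is not where Hadoop looks by default; it does not by itself prove that the codec settings in core-site.xml are wrong.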

What's the best practice for managing Hadoop dependencies?

2014-03-09 Thread Fengyun RAO
First of all, I should say that I am using CDH5 beta and manage my project with Maven. I have googled and read a lot, e.g. https://issues.apache.org/jira/browse/MAPREDUCE-1700 http://www.datasalt.com/2011/05/handling-dependencies-and-configuration-in-java-hadoop-projects-efficiently/ I believe the

Re: What's the best practice for managing Hadoop dependencies?

2014-03-09 Thread Stanley Shi
Waiting for others to share best practices. I think you can use Eclipse to manage the Maven project and view the full dependency hierarchy. If some jar (for example, guava) exists in both the Hadoop dependency chain and your own requirements, set the scope of your own dependency to provided. Regards, *Stanley Shi,* On
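
As a rough illustration of the scope suggestion, the sketch below reads a pom.xml and reports the declared scope of each direct dependency, so you can see which ones are not yet marked provided. The function name `dependency_scopes` and the trimmed sample POM (version elements omitted) are made up for this example; it only inspects the `<dependencies>` section of a single POM and does not resolve the transitive hierarchy the way Eclipse or Maven would.

```python
import xml.etree.ElementTree as ET

# Namespace used by Maven POM files.
POM_NS = {"m": "http://maven.apache.org/POM/4.0.0"}


def dependency_scopes(pom_text):
    """Map 'groupId:artifactId' to its declared scope.

    A dependency without a <scope> element gets Maven's default, 'compile'.
    """
    root = ET.fromstring(pom_text)
    scopes = {}
    for dep in root.findall("m:dependencies/m:dependency", POM_NS):
        group = dep.findtext("m:groupId", default="", namespaces=POM_NS)
        artifact = dep.findtext("m:artifactId", default="", namespaces=POM_NS)
        scope = dep.findtext("m:scope", default="compile", namespaces=POM_NS)
        scopes[f"{group.strip()}:{artifact.strip()}"] = scope.strip()
    return scopes


# Hypothetical, trimmed POM used only to exercise the function.
SAMPLE_POM = """
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <dependencies>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-client</artifactId>
      <scope>provided</scope>
    </dependency>
    <dependency>
      <groupId>com.google.guava</groupId>
      <artifactId>guava</artifactId>
    </dependency>
  </dependencies>
</project>
"""

if __name__ == "__main__":
    for name, scope in dependency_scopes(SAMPLE_POM).items():
        flag = "" if scope == "provided" else "  <- not provided"
        print(f"{name}: {scope}{flag}")
```

In the sample, guava has no scope element and therefore shows up as compile, which is the case where the advice above would have you add a provided scope.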