Re: InputFormat and InputSplit - Network location name contains /:

2014-04-10 Thread Harsh J
Do not use the InputSplit's getLocations() API to supply your file path, if that's what you've done in your current InputFormat implementation; it is not intended for such things. If you're looking to store a single file path, use the FileSplit class, or if not as simple as that, do use it as a bas

Re: download hadoop-2.4

2014-04-10 Thread Zhijie Shen
The official release can be found at: http://www.apache.org/dyn/closer.cgi/hadoop/common/ But you can also choose to check out the code from the svn/git repository. On Thu, Apr 10, 2014 at 8:08 PM, Mingjiang Shi wrote: > http://svn.apache.org/repos/asf/hadoop/common/tags/release-2.4.0/ > > > On Fri

Re: download hadoop-2.4

2014-04-10 Thread Mingjiang Shi
http://svn.apache.org/repos/asf/hadoop/common/tags/release-2.4.0/ On Fri, Apr 11, 2014 at 10:23 AM, lei liu wrote: > Hadoop-2.4 is released; where can I download the hadoop-2.4 code from? > Thanks, > LiuLei -- Cheers -MJ

download hadoop-2.4

2014-04-10 Thread lei liu
Hadoop-2.4 is released; where can I download the hadoop-2.4 code from? Thanks, LiuLei

Re: how can i archive old data in HDFS?

2014-04-10 Thread Stanley Shi
AFAIK, there are no tools for this now. Regards, Stanley Shi On Fri, Apr 11, 2014 at 9:09 AM, ch huang wrote: > Hi, mailing list: > How can I archive old data in HDFS? I have a lot of old data that will not be used, but it takes a lot of space to store. I want to > archive and zip the old data, HDFS

which dir in HDFS can be clean ?

2014-04-10 Thread ch huang
Hi, mailing list: my HDFS cluster has been running for about 1 year, and I find that many directories are very large. I wonder if some of them can be cleaned, like /var/log/hadoop-yarn/apps?

Re: use setrep change number of file replicas,but not work

2014-04-10 Thread ch huang
I set the replica number from 3 to 2, but when I dump the NN metrics, PendingDeletionBlocks is zero. Why? If the check thread sleeps for an interval and then does its check work, how long is that interval? On Thu, Apr 10, 2014 at 10:50 AM, Harsh J wrote: > The replica deletion is asynchronous. You can track

Re: use setrep change number of file replicas,but not work

2014-04-10 Thread ch huang
I can use fsck to get over-replicated blocks, but how can I track pending deletes? On Thu, Apr 10, 2014 at 10:50 AM, Harsh J wrote: > The replica deletion is asynchronous. You can track its deletions via > the NameNode's over-replicated blocks and the pending delete metrics. > > On Thu, Apr 10, 2

how can i archive old data in HDFS?

2014-04-10 Thread ch huang
Hi, mailing list: how can I archive old data in HDFS? I have a lot of old data that will not be used, but it takes a lot of space to store. I want to archive and zip the old data. Can HDFS do this operation?

Re: hadoop 2.0 upgrade to 2.4

2014-04-10 Thread Alejandro Abdelnur
Motty, https://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH5/latest/CDH5-Installation-Guide/CDH5-Installation-Guide.html provides instructions to upgrade from CDH4 to CDH5 (which bundles Hadoop 2.3.0). If your intention is to use CDH5, that should help you. If you have further ques

hadoop 2.0 upgrade to 2.4

2014-04-10 Thread motty cruz
Hi All, I currently have a Hadoop 2.0 cluster in production and want to upgrade to the latest release. Current version: [root@doop1 ~]# hadoop version Hadoop 2.0.0-cdh4.6.0 The cluster has the following services: hbase, hive, hue, impala, mapreduce, oozie, sqoop, zookeeper. Can someone point me to a howto upgra

InputFormat and InputSplit - Network location name contains /:

2014-04-10 Thread Patcharee Thongtra
Hi, I wrote a custom InputFormat and InputSplit to handle NetCDF files, which I use with a custom Pig load function. When I submitted a job by running a Pig script, I got the error below. From the error log, the network location name is "hdfs://service-1-0.local:8020/user/patcharee/netcdf_data/wrfout

Re: not able to run map reduce job example on aws machine

2014-04-10 Thread Harsh J
"java.lang.IllegalArgumentException: Does not contain a valid host:port authority: poc_hadoop04:46162" Hostnames cannot carry an underscore '_' character per RFC 952 and its extensions. Please fix your hostname to not carry one. On Thu, Apr 10, 2014 at 5:14 PM, Rahul Singh wrote: > here is my ma
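
The underscore problem described above can be reproduced with the JDK alone. The following is a minimal sketch, not Hadoop's own parsing code: it assumes that the host:port string is checked through java.net.URI, whose server-based authority syntax does not accept '_' in a hostname. The class name HostPortCheck, the helper hasValidAuthority, the "dummy" scheme, and the renamed host poc-hadoop04 are hypothetical and used only for illustration.

```java
import java.net.URI;

public class HostPortCheck {
    // Returns true when java.net.URI can extract both a host and a port
    // from a "host:port" string. The "dummy" scheme is only a placeholder
    // so that the string parses as a hierarchical URI.
    public static boolean hasValidAuthority(String hostPort) {
        try {
            URI uri = URI.create("dummy://" + hostPort);
            // For an authority that is not a valid server-based one
            // (for example a hostname containing '_'), getHost() is null
            // and getPort() is -1.
            return uri.getHost() != null && uri.getPort() != -1;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Hostname from the error message, and a hypothetical renamed host.
        String[] targets = {"poc_hadoop04:46162", "poc-hadoop04:46162"};
        for (String target : targets) {
            System.out.println(target + " -> " + hasValidAuthority(target));
        }
    }
}
```

Running a check like this against every hostname in the cluster before starting the daemons is one way to catch names that need to be changed.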

Re: not able to run map reduce job example on aws machine

2014-04-10 Thread Rahul Singh
Here is my mapred.site.xml config: <property> <name>mapred.job.tracker</name> <value>localhost:54311</value> <description>The host and port that the MapReduce job tracker runs at. If "local", then jobs are run in-process as a single map and reduce task.</description> </property> Also, the job runs fine in memory if I remove the dependency on yarn, i.e. if

Re: File requests to Namenode

2014-04-10 Thread Diwakar Hadoop
Thanks! Diwakar > On Apr 9, 2014, at 9:22 PM, Harsh J wrote: > > You could look at metrics the NN publishes, or look at/process the > HDFS audit log. > >> On Wed, Apr 9, 2014 at 6:36 PM, Diwakar Sharma >> wrote: >> How and where to check how many datanode block addres

Re: not able to run map reduce job example on aws machine

2014-04-10 Thread Kiran Dangeti
Rahul, Please check the port name given in mapred.site.xml Thanks Kiran On Thu, Apr 10, 2014 at 3:23 PM, Rahul Singh wrote: > Hi, > I am getting following exception while running word count example, > > 14/04/10 15:17:09 INFO mapreduce.Job: Task Id : > attempt_1397123038665_0001_m_00_2, S

not able to run map reduce job example on aws machine

2014-04-10 Thread Rahul Singh
Hi, I am getting the following exception while running the word count example: 14/04/10 15:17:09 INFO mapreduce.Job: Task Id : attempt_1397123038665_0001_m_00_2, Status : FAILED Container launch failed for container_1397123038665_0001_01_04 : java.lang.IllegalArgumentException: Does not contain