Re: namenode directory disappear after machines restart

2012-05-21 Thread Marcos Ortiz
This is usual behavior on Unix/Linux systems. When you restart the system, the contents of the /tmp directory are cleaned, precisely because the purpose of this directory is to keep files temporarily. For that reason, the data directory for the HDFS filesystem should be another directory, /var/
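A minimal hdfs-site.xml excerpt illustrating the point (the /var paths below are only examples; any directory that survives reboots will do):

    <!-- keep NameNode metadata and DataNode blocks out of /tmp -->
    <property>
      <name>dfs.name.dir</name>
      <value>/var/lib/hadoop/hdfs/name</value>
    </property>
    <property>
      <name>dfs.data.dir</name>
      <value>/var/lib/hadoop/hdfs/data</value>
    </property>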

Re: unable to resolve the heap space error even when running the examples

2012-04-12 Thread Marcos Ortiz
Can you show us the logs of your NN/DN? On 04/12/2012 03:28 AM, SRIKANTH KOMMINENI (RIT Student) wrote: Tried that, it didn't work for a lot of combinations of values. On Thu, Apr 12, 2012 at 3:25 AM, Mapred Learn wrote: Try exporting HADOOP_HEAPSIZE to b
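For reference, HADOOP_HEAPSIZE lives in conf/hadoop-env.sh and is given in MB; the value below is only illustrative, not a recommendation:

    # conf/hadoop-env.sh -- maximum heap for the Hadoop daemons, in MB (default is 1000)
    export HADOOP_HEAPSIZE=2000

Note that the heap for MapReduce child tasks is controlled separately, via mapred.child.java.opts (e.g. -Xmx512m) in mapred-site.xml.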

Re: Yahoo Hadoop Tutorial with new APIs?

2012-04-04 Thread Marcos Ortiz
to update those tutorials if people at Yahoo take input from the outside world :) I want to help with this too, so we need to talk with our Hadoop colleagues to do this. Regards and best wishes. Regards, Jagat - Original Message - From: Marcos Ortiz Sent: 04/04/12 08:32 AM To: common-u

Yahoo Hadoop Tutorial with new APIs?

2012-04-04 Thread Marcos Ortiz
Regards to all the list. There are many people who use the Hadoop Tutorial released by Yahoo at http://developer.yahoo.com/hadoop/tutorial/ The main issue here is that this tutorial is written with the old APIs (Hadoop 0.18 I

Re: Retry question

2012-03-18 Thread Marcos Ortiz
HDFS is built precisely with these concerns in mind. If you read a 60 GB file and the rack goes down, the system will transparently present another copy to you, based on your replication factor. A block can also become unavailable due to corruption, and in this case it can be re-replicated to other liv
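As an aside, you can watch this machinery at work: fsck lists each block of a file together with the datanodes holding its replicas. A sketch, with a hypothetical path:

    # show the file, its blocks, and the replica locations
    hadoop fsck /user/foo/60gb-file -files -blocks -locations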

Re: Best practice to setup Sqoop,Pig and Hive for a hadoop cluster ?

2012-03-15 Thread Marcos Ortiz
On 03/15/2012 09:22 AM, Manu S wrote: Thanks a lot Bijoy, that makes sense :) Suppose I have a MySQL database on some other node (not in the Hadoop cluster), can I import the tables into my HDFS using Sqoop? Yes, this is the main purpose of Sqoop. On the Cloudera site, you have the complete docume
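A minimal Sqoop invocation for that MySQL-to-HDFS case might look like this (host, database, table, user and target directory are all placeholders):

    # pull one table from an external MySQL node into HDFS; -P prompts for the password
    sqoop import \
      --connect jdbc:mysql://dbhost/mydb \
      --table mytable \
      --username myuser \
      -P \
      --target-dir /user/hadoop/mytable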

Re: Error while using libhdfs C API

2012-03-09 Thread Marcos Ortiz
on Eugene Ciurana's Refcard called "Deploying Hadoop", where he did an amazing job of explaining some tricky configuration tips in a few pages. Regards From: Marcos Ortiz [m

Notes about HA Name Node

2012-03-08 Thread Marcos Ortiz
Regards to all the list. I was reading the NoSQL Weekly, and I saw that the HDFS developers are working on one of the most requested features for HDFS: High Availability for the NameNode (HA NameNode). Where can I find more information about this development? Best wishes -- Marcos Luis Ortíz

Re: Error while using libhdfs C API

2012-03-06 Thread Marcos Ortiz
Which platform are you using? Did you update the dynamic linker runtime bindings (ldconfig)? ldconfig $HOME/hadoop/c++/Linux-amd64/lib Regards On 03/06/2012 02:38 AM, Amritanshu Shekhar wrote: Hi, I was trying to link the 64-bit libhdfs in my application program but it seems there is an issue w
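A sketch of the steps involved, using the $HOME/hadoop layout from the thread (adjust the paths to your install; at run time libjvm must also be findable):

    # register the libhdfs directory with the runtime linker, as in the reply above
    ldconfig $HOME/hadoop/c++/Linux-amd64/lib
    # alternative that needs no root privileges
    export LD_LIBRARY_PATH=$HOME/hadoop/c++/Linux-amd64/lib:$LD_LIBRARY_PATH
    # compile and link an application against libhdfs
    gcc myapp.c -I$HOME/hadoop/src/c++/libhdfs \
        -L$HOME/hadoop/c++/Linux-amd64/lib -lhdfs -o myapp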

Re: Block size in HDFS

2011-06-10 Thread Marcos Ortiz
On 06/10/2011 10:35 AM, Pedro Costa wrote: Hi, if I define HDFS to use blocks of 64 MB, and I store a 1KB file in HDFS, will this file occupy 64MB in HDFS? Thanks. HDFS is not very efficient at storing small files, because each file is stored in a block (of 64 MB in your case), and the block m
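One way to check this on a running cluster (the path is hypothetical): -du reports the bytes a file actually consumes, and fsck shows the block backing it:

    hadoop fs -du /user/pedro/tiny-file                # actual size, about 1 KB here
    hadoop fsck /user/pedro/tiny-file -files -blocks   # the single block behind it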

Re: I/O Error on rsync to hdfs-fuse mount

2011-06-09 Thread Marcos Ortiz
Did you check dmesg to see what is happening in your system? It seems that it's a disk problem. On 6/9/2011 10:07 PM, J. Ryan Earl wrote: Are there any glaring culprits to check for errors like this: receiving incremental file list rsync: failed to set times on "/mnt/hdfs/user/hadoop/reports_
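A quick triage sketch for suspected disk trouble (the device name is an example; smartctl comes from the smartmontools package and needs root):

    dmesg | grep -i -e error -e 'i/o'   # recent kernel I/O complaints
    smartctl -H /dev/sda                # the drive's own health self-assessment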

Re: cant remove files from tmp

2011-06-06 Thread Marcos Ortiz
mapred 22 Jun 6 12:46 . drwxr-xr-x 3 mapred mapred 19 May 26 21:17 .. ?- ? ? ? ?? input-dir -----Original Message----- From: Marcos Ortiz [mailto:mlor...@uci.cu] Sent: Monday, June 06, 2011 1:17 PM To: hdfs-us

Re: cant remove files from tmp

2011-06-06 Thread Marcos Ortiz
* Why are you using the root user for these operations? * What are your permissions on your data directory? (ls -la /part/data) Regards. On 6/6/2011 3:41 PM, Jain, Prem wrote: I have a wrecked datanode which is giving me a hard time restarting. It keeps complaining of Datanode dead, pid file exists
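Two quick checks for that situation; the paths are examples, since both the data directory and the pid-file location vary by install:

    ls -la /part/data         # ownership should match the user the datanode runs as, not root
    ls -la /var/run/hadoop/   # "pid file exists" usually points to a stale pid file left by a crashed process

If no datanode process is actually running, removing the stale pid file and restarting is the usual remedy.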

Re: Changing dfs.block.size

2011-06-06 Thread Marcos Ortiz
One more piece of advice here: you can test the right block size in an environment similar to your production system before deploying the real system, and that way you can avoid these kinds of changes. On 6/6/2011 3:09 PM, J. Ryan Earl wrote: Hello, So I have a question about changing dfs.block

Re: Changing dfs.block.size

2011-06-06 Thread Marcos Ortiz
I think that you should run several maintenance tasks after making these changes. * Start the balancer tool to redistribute the blocks by moving them from over-utilized datanodes to under-utilized datanodes. Remember to change the dfs.balance.bandwidthPerSec property in the hdfs-site.xml file. * Ru
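A sketch of that balancer step; the threshold and bandwidth values are illustrative only:

    <!-- hdfs-site.xml: bytes/sec the balancer may use per datanode
         (10485760 = 10 MB/s; the default is 1048576, i.e. 1 MB/s) -->
    <property>
      <name>dfs.balance.bandwidthPerSec</name>
      <value>10485760</value>
    </property>

    # then run it; threshold is the allowed deviation from average utilization, in percent
    start-balancer.sh -threshold 10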

Re: Unable to start hadoop-0.20.2 but able to start hadoop-0.20.203 cluster

2011-05-31 Thread Marcos Ortiz
On 05/31/2011 10:06 AM, Xu, Richard wrote: 1 namenode, 1 datanode. dfs.replication=3. We also tried 0, 1, 2, same result. *From:* Yaozhen Pan [mailto:itzhak@gmail.com] *Sent:* Tuesday, May 31, 2011 10:34 AM *To:* hdfs-user@hadoop.apache.org *Subject:* Re: Unable to start hadoop-0.20.2 but

Re: Starting Datanode

2011-05-21 Thread Marcos Ortiz
makes any sense at all. Looks like a genuine typo/oversight/bug. Do file a JIRA for this, and fix it to use "-client" alone, if I understand what it really means to do if EUID == 0. On Sat, May 21, 2011 at 4:27 AM, Anh Nguyen wrote: On 05/20/2011 02:33 PM, Marcos Ortiz wrote: On 05/20

Re: Starting Datanode

2011-05-20 Thread Marcos Ortiz
On 05/20/2011 04:08 PM, Anh Nguyen wrote: On 05/20/2011 02:06 PM, Marcos Ortiz wrote: On 05/20/2011 04:27 PM, Marcos Ortiz wrote: On 05/20/2011 03:46 PM, Anh Nguyen wrote: On 05/20/2011 01:15 PM, Marcos Ortiz wrote: On 05/20/2011 01:02 PM, Anh Nguyen wrote: Hi, I just upgraded to hadoop

Re: our experiences with various filesystems and tuning options

2011-05-10 Thread Marcos Ortiz
On 05/10/2011 06:56 AM, Jonathan Disher wrote: In a previous life, I had extreme problems with XFS, including kernel panics and data loss under high load. Those were database servers, not Hadoop nodes, and it was a few years ago. But ext3/ext4 seem to be stable enough, and they're more wide

Re: our experiences with various filesystems and tuning options

2011-05-10 Thread Marcos Ortiz
On 05/10/2011 06:29 AM, Rita wrote: I keep asking because I wasn't able to use an XFS filesystem larger than 3-4TB. If the XFS filesystem is larger than 4TB, HDFS won't recognize the space. I am on a 64-bit RHEL 5.3 host. On Tue, May 10, 2011 at 6:30 AM, Will Maier

Re: Other FS Pointer?

2011-05-04 Thread Marcos Ortiz Valmaseda
For example: * Amazon S3 (Amazon Simple Storage Service): http://aws.amazon.com/s3/ On the Hadoop wiki, there is a complete guide to working with Hadoop on Amazon S3: http://wiki.apache.org/hadoop/AmazonS3 * IBM GPFS: http://www.ibm.com/systems/gpfs/ https://issues.apache.org/jira/browse
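For the S3 case, the wiki guide comes down to a few core-site.xml properties; a sketch (the bucket name and keys are placeholders, and making S3 the default filesystem is optional — you can also address s3n:// URIs directly):

    <property>
      <name>fs.default.name</name>
      <value>s3n://your-bucket</value> <!-- s3n = the "native" S3 filesystem -->
    </property>
    <property>
      <name>fs.s3n.awsAccessKeyId</name>
      <value>YOUR_ACCESS_KEY</value>
    </property>
    <property>
      <name>fs.s3n.awsSecretAccessKey</name>
      <value>YOUR_SECRET_KEY</value>
    </property>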

Re: Question regarding datanode been wiped by hadoop

2011-04-12 Thread Marcos Ortiz
On 4/12/2011 10:46 AM, felix gao wrote: What reason/condition would cause a datanode's blocks to be removed? Our cluster had one of its datanodes crash because of bad RAM. After the system was upgraded and the datanode/tasktracker brought online the next day, we noticed the amount of

Re: hadoop branch-0.20-append Build error:build.xml:933: exec returned: 1

2011-04-11 Thread Marcos Ortiz
On 4/11/2011 10:45 PM, Alex Luya wrote: BUILD FAILED .../branch-0.20-append/build.xml:927: The following error occurred while executing this line: ../branch-0.20-append/build.xml:933: exec returned: 1 Total time: 1 minute 17 seconds + RESULT=1 + '[' 1 '!=' 0 ']' + echo 'Build Failed

Re: cloudera CDH3 error: namenode running,but:Error: JAVA_HOME is not set and Java could not be found

2011-03-16 Thread Marcos Ortiz
On Wed, 2011-03-16 at 23:19 +0800, Alex Luya wrote: > I downloaded the Cloudera CDH3 beta (hadoop-0.20.2+228) and modified three > files: hdfs.xml, core-site.xml and hadoop-env.sh. I do have JAVA_HOME set > in hadoop-env.sh, and then I try to run start-dfs.sh and get > this error, but the strange thing is that
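For comparison, the relevant line in conf/hadoop-env.sh is just an export; the JDK path below is an example and must match the machine's actual install:

    # conf/hadoop-env.sh -- the start scripts read this on every node,
    # so setting JAVA_HOME only in your shell profile is not enough
    export JAVA_HOME=/usr/lib/jvm/java-6-sun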

Re: how does hdfs determine what node to use?

2011-03-10 Thread Marcos Ortiz
On 3/10/2011 8:37 AM, Rita wrote: Thanks Stu. I too was sure there was an algorithm. Is there a place where I can read more about it? I want to know if it picks a block according to the load average or does it always pick "rack0" first? On Wed, Mar 9, 2011 at 10:24 PM,