building from subversion repository
Hi, I'm trying to build Hadoop from a current checkout of the repository and am receiving the following messages. Can someone enlighten me as to what I'm doing wrong, please? Thanks, George...

[INFO] BUILD FAILURE
[INFO]
[INFO] Total time: 1:55.493s
[INFO] Finished at: Sun Feb 17 03:58:15 PST 2013
[INFO] Final Memory: 31M/332M
[INFO]
[ERROR] Could not find goal 'protoc' in plugin org.apache.hadoop:hadoop-maven-plugins:3.0.0-SNAPSHOT among available goals - [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoNotFoundException
RE: Can I perform a MR on my local filesystem
Hi, Thank you Niels and thank you Nitin for your replies. Actually, I want to run MR on a cloud store, which is open source. So I thought of implementing a file system for it and plugging it into Hadoop, just as S3/KFS are plugged in. This would enable a Hadoop client to talk to my cloud store. But I am not yet clear on how to run MR on the cloud using the JobTracker/TaskTracker framework of Hadoop. The link Niels gave shows that I can run MR on the local file system. So is there any way of telling the JobTracker to read data from a set of nodes, deploy TaskTracker daemons on those nodes (which would be my cloud store in this case), and fetch the result of the MR? Note: I do not want to fetch the data to my local computer, as is the case with S3. Fetching the data would defeat the purpose of using Hadoop (which is moving compute to data). Thanks, Nikhil

From: Agarwal, Nikhil
Sent: Sunday, February 17, 2013 11:53 AM
To: 'user@hadoop.apache.org'
Subject: Can I perform a MR on my local filesystem

Hi, Recently I followed a blog to run Hadoop on a single-node cluster. I wanted to ask: in a single-node set-up of Hadoop, is it necessary to have the data copied into Hadoop's HDFS before running a MR job on it? Can I run MR on my local file system too, without copying the data to HDFS? In the Hadoop source code I saw there are implementations of other file systems too, like S3, KFS, FTP, etc., so how exactly does a MR job run on the S3 data store? How do the JobTracker or TaskTracker run with S3? I would be very thankful to get a reply to this. Thanks  Regards, Nikhil
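For reference, on Hadoop 1.x a custom FileSystem implementation like the one Nikhil describes is typically wired in through configuration, the same way S3/KFS are. The scheme name ("mycloud"), class name, and URI below are hypothetical placeholders for illustration, not a tested setup:

```xml
<!-- core-site.xml: register a hypothetical custom FileSystem.
     "mycloud", the class name, and the host are placeholders. -->
<property>
  <name>fs.mycloud.impl</name>
  <value>com.example.fs.MyCloudFileSystem</value>
</property>
<!-- Optionally make it the default filesystem so MR jobs read from it. -->
<property>
  <name>fs.default.name</name>
  <value>mycloud://store-host:9000/</value>
</property>
```

With such a mapping in place, paths like mycloud://store-host:9000/data resolve through the custom class, mirroring how fs.s3.impl maps s3:// URIs.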
Re: building from subversion repository
Hi George, The error below is your issue:

[ERROR] Could not find goal 'protoc' in plugin org.apache.hadoop:hadoop-maven-plugins:3.0.0-SNAPSHOT among available goals - [Help 1]

To build trunk, a Protocol Buffers (protobuf) compiler installation of version 2.4 or newer is required, because we have it as a dependency. This is mentioned on http://wiki.apache.org/hadoop/HowToContribute, http://wiki.apache.org/hadoop/QwertyManiac/BuildingHadoopTrunk and also in the SVN trunk's BUILDING.txt (look for "proto"). Once it is installed in your OS, the following command should show output like this, and your build should continue successfully:

➜ ~ protoc --version
libprotoc 2.4.1

On Sun, Feb 17, 2013 at 5:43 PM, George R Goffe grgo...@yahoo.com wrote: Hi, I'm trying to build hadoop from a current check out of the repository and am receiving the following messages. Can someone enlighten me as to what I'm doing wrong please? Thanks, George... [...] -- Harsh J
Re: QJM deployment
Hi Azuryy, Thanks for your feedback on the docs! I've filed https://issues.apache.org/jira/browse/HDFS-4508 on your behalf to address them. Feel free to file JIRAs with documentation complaints, along with change patches, to have them improved yourself :)

On Sun, Feb 17, 2013 at 2:25 PM, Azuryy Yu azury...@gmail.com wrote: Hi, could someone kindly improve the 2.0.3-alpha QJM deployment document? I could not follow it successfully, thanks. Such as: 1) "by running the command hdfs namenode -bootstrapStandby on the unformatted NameNode" - it should say to start the formatted NameNode first. 2) "If you are converting a non-HA NameNode to be HA, you should run the command hdfs -initializeSharedEdits, which will initialize the JournalNodes with the edits data from the local NameNode edits directories." - run this command on which node? And does this command exist? The QJM Deployment details section is bad. -- Harsh J
Re: executing hadoop commands from python?
I was stuck with a similar issue before and couldn't come up with a more viable alternative than this: if the output of the hadoop command is not that big, you can take it into your Python script and process it there. I use the following snippet (Python 2, using the commands module) to clean the output of ls and store it in a list for processing. In your case you can do a len() on the list to get the file count:

fscommand = "hadoop dfs -ls /path/in/%s/*/ 2> /dev/null" % (hdfs)
hadoop_cmd = commands.getoutput(fscommand)
lines = hadoop_cmd.split("\n")[1:]
strlines = [map(lambda a: a.strip(), line.split(' ')[-3:]) for line in lines]

On Sun, Feb 17, 2013 at 4:17 AM, jamal sasha jamalsha...@gmail.com wrote: Hi, This might be more of a python-centric question but I was wondering if anyone has tried it out... I am trying to run a few hadoop commands from a python program... For example, from the command line you can do:

bin/hadoop dfs -ls /hdfs/query/path

and it returns all the files in the hdfs query path, so very similar to unix. Now I am trying to do basically the same from python, and do some manipulation with the result:

exec_str = "path/to/hadoop/bin/hadoop dfs -ls " + query_path
os.system(exec_str)

Now, I am trying to grab this output to do some manipulation on it. For example... count the number of files? I looked into the subprocess module, but these are not native shell commands, hence I'm not sure whether I can apply those concepts. How to solve this? Thanks -- regards, Anuj Maurice
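A more robust variant of the approach above is to capture the output with the subprocess module rather than os.system or commands (which is Python 2 only). A minimal sketch; the hadoop binary path and HDFS path are placeholders, and the -ls column layout assumed here matches Hadoop 1.x:

```python
import subprocess

def list_hdfs(path, hadoop="hadoop"):
    """Run `hadoop dfs -ls <path>` and return the raw output lines,
    skipping the 'Found N items' header that -ls prints first."""
    out = subprocess.check_output([hadoop, "dfs", "-ls", path])
    return out.decode().splitlines()[1:]

def count_files(ls_lines):
    """Count file entries in `hadoop dfs -ls` output.
    Directory entries start with 'd' in the permissions column."""
    return sum(1 for line in ls_lines if line and not line.startswith("d"))
```

Usage would be something like `count_files(list_hdfs("/hdfs/query/path"))`; because check_output raises CalledProcessError on a non-zero exit, a missing path fails loudly instead of being silently swallowed by a shell redirect.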
Re: executing hadoop commands from python?
Instead of 'scraping' this way, consider using a library such as Pydoop (http://pydoop.sourceforge.net), which provides Pythonic ways and APIs to interact with Hadoop components. There are also other libraries covered at http://blog.cloudera.com/blog/2013/01/a-guide-to-python-frameworks-for-hadoop/ for example.

On Sun, Feb 17, 2013 at 4:17 AM, jamal sasha jamalsha...@gmail.com wrote: [...] -- Harsh J
Re: Namenode failures
It just happened again. This was after a fresh format of HDFS/HBase and I am attempting to re-import the (backed up) data. http://pastebin.com/3fsWCNQY So now if I restart the namenode, I will lose data from the past 3 hours. What is causing this? How can I avoid it in the future? Is there an easy way to monitor (other than a script grep'ing the logs) the checkpoints to see when this happens?

On Sat, Feb 16, 2013 at 2:39 PM, Robert Dyer psyb...@gmail.com wrote: Forgot to mention: Hadoop 1.0.4

On Sat, Feb 16, 2013 at 2:38 PM, Robert Dyer psyb...@gmail.com wrote: I am at a bit of wits' end here. Every single time I restart the namenode, I get this crash:

2013-02-16 14:32:42,616 INFO org.apache.hadoop.hdfs.server.common.Storage: Image file of size 168058 loaded in 0 seconds.
2013-02-16 14:32:42,618 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: java.lang.NullPointerException
    at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addChild(FSDirectory.java:1099)
    at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addChild(FSDirectory.java:)
    at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addNode(FSDirectory.java:1014)
    at org.apache.hadoop.hdfs.server.namenode.FSDirectory.unprotectedAddFile(FSDirectory.java:208)
    at org.apache.hadoop.hdfs.server.namenode.FSEditLog.loadFSEdits(FSEditLog.java:631)
    at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSEdits(FSImage.java:1021)
    at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:839)
    at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:377)
    at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:100)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:388)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.init(FSNamesystem.java:362)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:276)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:496)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1279)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1288)

I am following best practices here, as far as I know. I have the namenode writing into 3 directories (2 local, 1 NFS). All 3 of these dirs have the exact same files in them. I also run a secondary checkpoint node. This one appears to have started failing a week ago, so checkpoints were *not* being done since then. Thus I can get the NN up and running, but with week-old data! What is going on here? Why does my NN data *always* wind up causing this exception over time? Is there some easy way to get notified when the checkpointing starts to fail? -- Robert Dyer rd...@iastate.edu -- Robert Dyer rd...@iastate.edu
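On the question of getting notified when checkpointing stops, one low-tech option besides grep'ing the logs is to watch the age of the fsimage file in the secondary's checkpoint directory, since a successful checkpoint rewrites it. A minimal sketch; the directory layout assumed here (a `current/fsimage` under fs.checkpoint.dir, as in Hadoop 1.x) and the 2-hour threshold are assumptions to adapt:

```python
import os
import time

def checkpoint_age_seconds(checkpoint_dir):
    """Seconds since the fsimage under the given checkpoint dir was
    last written, or None if no fsimage exists there yet."""
    image = os.path.join(checkpoint_dir, "fsimage")
    if not os.path.isfile(image):
        return None
    return time.time() - os.path.getmtime(image)

def checkpoint_is_stale(checkpoint_dir, max_age_secs=2 * 3600):
    """True if no checkpoint has landed within max_age_secs
    (or none exists at all) -- i.e. time to alert someone."""
    age = checkpoint_age_seconds(checkpoint_dir)
    return age is None or age > max_age_secs
```

Run from cron against the secondary's fs.checkpoint.dir/current and email or page when checkpoint_is_stale returns True.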
Re: Namenode failures
Hello Robert, It seems that your edit logs and fsimage have somehow got corrupted. It looks somewhat similar to this one: https://issues.apache.org/jira/browse/HDFS-686 Have you made any changes to the 'dfs.name.dir' directory lately? Do you have enough space where the metadata is getting stored? You can make use of the offline image viewer to diagnose the fsimage file. Warm Regards, Tariq https://mtariq.jux.com/ cloudfront.blogspot.com

On Mon, Feb 18, 2013 at 3:31 AM, Robert Dyer psyb...@gmail.com wrote: [...]
Re: Namenode failures
On Sun, Feb 17, 2013 at 4:41 PM, Mohammad Tariq donta...@gmail.com wrote: Hello Robert, It seems that your edit logs and fsimage have got corrupted somehow. It looks somewhat similar to this one https://issues.apache.org/jira/browse/HDFS-686

Similar, but the trace is different.

Have you made any changes to the 'dfs.name.dir' directory lately?

No.

Do you have enough space where metadata is getting stored?

Yes. All 3 locations have plenty of space (hundreds of GB).

You can make use of the offline image viewer to diagnose the fsimage file.

On Mon, Feb 18, 2013 at 3:31 AM, Robert Dyer psyb...@gmail.com wrote: [...] -- Robert Dyer rd...@iastate.edu
Re: Namenode failures
On Sun, Feb 17, 2013 at 4:41 PM, Mohammad Tariq donta...@gmail.com wrote: You can make use of the offline image viewer to diagnose the fsimage file.

Is this not included in the 1.0.x branch? All of the documentation I find for it says to run 'bin/hdfs oev' but I do not have a 'bin/hdfs'.

On Mon, Feb 18, 2013 at 3:31 AM, Robert Dyer psyb...@gmail.com wrote: [...] -- Robert Dyer rd...@iastate.edu
Re: Namenode failures
Hi Robert, Are you by any chance adding files carrying unusual encoding? If it's possible, could you send us a bundle of the corrupted log set (all of the dfs.name.dir contents) so we can inspect what seems to be causing the corruption? The only identified (but rarely occurring) bug around this part in 1.0.4 would be https://issues.apache.org/jira/browse/HDFS-4423. The other major corruption bug I know of is already fixed in your version, being https://issues.apache.org/jira/browse/HDFS-3652 specifically. We've not had this report from other users, so having a reproduced file set (data not required) would be most helpful. If you have logs leading up to the shutdown and crash as well, that'd be good to have too.

P.s. How exactly are you shutting down the NN each time? A kill -9 or a regular SIGTERM shutdown?

On Mon, Feb 18, 2013 at 4:31 AM, Robert Dyer rd...@iastate.edu wrote: On Sun, Feb 17, 2013 at 4:41 PM, Mohammad Tariq donta...@gmail.com wrote: You can make use of offine image viewer to diagnose the fsimage file. Is this not included in the 1.0.x branch? All of the documentation I find for it says to run 'bin/hdfs oev' but I do not have a 'bin/hdfs'. [...] -- Harsh J
Re: Namenode failures
On Sun, Feb 17, 2013 at 5:08 PM, Harsh J ha...@cloudera.com wrote: Hi Robert, Are you by any chance adding files carrying unusual encoding?

I don't believe so. The only files I push to HDFS are SequenceFiles (with protobuf objects in them) and HBase's regions, which again are just protobuf objects. I don't use any special encodings in the protobufs.

If its possible, can we be sent a bundle of the corrupted log set (all of the dfs.name.dir contents) to inspect what seems to be causing the corruption?

I can give the logs, dfs data dir(s), and 2nn dirs. https://www.dropbox.com/s/heijq65pmb3esvd/hdfs-bug.tar.gz

The only identified (but rarely occurring) bug around this part in 1.0.4 would be https://issues.apache.org/jira/browse/HDFS-4423. The other major corruption bug I know of is already fixed in your version, being https://issues.apache.org/jira/browse/HDFS-3652 specifically. We've not had this report from other users so having a reproduced file set (data not required) would be most helpful. If you have logs leading to the shutdown and crash as well, that'd be good to have too. P.s. How exactly are you shutting down the NN each time? A kill -9 or a regular SIGTERM shutdown?

I shut down the NN with 'bin/stop-dfs.sh'.

On Mon, Feb 18, 2013 at 4:31 AM, Robert Dyer rd...@iastate.edu wrote: [...] -- Harsh J -- Robert Dyer rd...@iastate.edu
RE: why my test result on dfs short circuit read is slower?
I have tried tuning io.file.buffer.size to 128K instead of 4K. Short-circuit read performance is still worse than reading through the datanode. I am starting to wonder: does short-circuit read really help under the Hadoop 1.1.1 version? Googling, I found a few people mention they got a 2x gain or so on CDH etc. But I really can't find anything else I can do to make it even just catch up with the normal read path.

It seems to me that, with short-circuit read enabled, BlockReaderLocal reads data in 512/4096-byte units (checksum check enabled/skipped), while when it goes through the datanode, BlockSender.sendChunks will read and send data in 64K-byte units. Is that true? And if so, wouldn't that explain why reading through the datanode is faster, since it reads data in a bigger block size?

Best Regards, Raymond Liu

-----Original Message-----
From: Liu, Raymond [mailto:raymond@intel.com]
Sent: Saturday, February 16, 2013 2:23 PM
To: user@hadoop.apache.org
Subject: RE: why my test result on dfs short circuit read is slower?

Hi Arpit Gupta, Yes, this way also confirms that short-circuit read is enabled on my cluster:

13/02/16 14:07:34 DEBUG hdfs.DFSClient: Short circuit read is true
13/02/16 14:07:34 DEBUG hdfs.DFSClient: New BlockReaderLocal for file /mnt/DP_disk4/raymond/hdfs/data/current/subdir63/blk_-2736548898990727638 of size 134217728 startOffset 0 length 134217728 short circuit checksum false

So, is there any possibility that some other setting might cause short-circuit read to have worse performance than reading through the datanode? Raymond

Another way to check if short-circuit read is configured correctly: as the user who is configured for short-circuit read, issue the following command on a node where you expect the data to be read locally.

export HADOOP_ROOT_LOGGER=debug,console; hadoop dfs -cat /path/to/file_on_hdfs

On the console you should see something like "hdfs.DFSClient: New BlockReaderLocal for file". This would confirm that short-circuit read is happening. -- Arpit Gupta Hortonworks Inc.
http://hortonworks.com/

On Feb 15, 2013, at 9:53 PM, Liu, Raymond raymond@intel.com wrote: Hi Harsh, Yes, I did set both of these, though in hdfs-site.xml rather than hbase-site.xml. And I have double-confirmed that local reads are performed, since there are no errors in the datanode logs, and by watching lo network IO.

If you want HBase to leverage the shortcircuit, the DN config dfs.block.local-path-access.user should be set to the user running HBase (i.e. hbase, for example), and the hbase-site.xml should have dfs.client.read.shortcircuit defined in all its RegionServers. Doing this wrong could result in a performance penalty and some warn-logging, as local reads will be attempted but will begin to fail.

On Sat, Feb 16, 2013 at 8:40 AM, Liu, Raymond raymond@intel.com wrote: Hi, I tried to use short-circuit read to improve my HBase cluster MR scan performance. I have the following settings in hdfs-site.xml: dfs.client.read.shortcircuit set to true, and dfs.block.local-path-access.user set to the MR job runner. The cluster is 1+4 nodes and each data node has 16 CPUs/4 HDDs, with all HBase tables major-compacted, thus all data is local. I had hoped that short-circuit read would improve the performance, while the test result is that with short-circuit read enabled, the performance actually dropped 10-15%. Say, scanning a 50G table costs around 100s instead of 90s. My hadoop version is 1.1.1, any idea on this? Thx! Best Regards, Raymond Liu -- Harsh J
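For readers following along, the two properties discussed in this thread go into hdfs-site.xml on the datanodes; the user name below is a placeholder for whoever runs the reading process (the MR job runner here, or hbase for HBase):

```xml
<!-- hdfs-site.xml: short-circuit local reads on Hadoop 1.1.x.
     Replace "mruser" with the actual user doing the reading. -->
<property>
  <name>dfs.client.read.shortcircuit</name>
  <value>true</value>
</property>
<property>
  <name>dfs.block.local-path-access.user</name>
  <value>mruser</value>
</property>
```

The datanodes must be restarted for the local-path-access whitelist to take effect, and a mismatch between this user and the actual reader causes the attempted-then-failed local reads (and the performance penalty) Harsh describes above.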
Re: product recommendations engine
Yeah... you can make this work.

First, if your setup is relatively small, then you won't need Hadoop. Second, having lots of kinds of actions is a very reasonable thing to have. My own suggestion is that you analyze each of these for its predictive power independently and then combine them at recommendation time.

My own suggestion for how to deploy the recommendation model is in the form of a search engine that has fields for each kind of recommendation cue that you need. You can combine any or all of these cues in the process of doing a non-textual search, using the recent history of the user as the query. This search-abuse style of recommendations is pretty easy to deploy, and PHP has a reasonably good package for sending queries to Solr, which is the search engine I tend to recommend.

You should also make provision for A/B testing on different recommendation approaches and combinations of inputs. This is pretty straightforward, but usually requires some sort of experimental condition assignment, and definitely requires good log recording and analysis.

That said, this isn't a tiny project. It involves quite a bit of work. It isn't terribly hard at any point and the overall architecture is pretty straightforward, but there is a good bit of work to be done.

On Sun, Feb 17, 2013 at 4:21 PM, Douglass Davis <douglassdavi...@gmail.com> wrote:

Hello,

I don't have any prior experience with Hadoop. I am also not a statistics expert. I am a software engineer; however, after looking at the docs, Hadoop still seems pretty intimidating to set up.

I am interested in doing product recommendations. However, I want to store many things about user behavior, for example whether they click on a link in an email, how they rate a product, whether they buy it, etc. Then I would like to come up with similar items that a user may like. I have seen an example just based on user ratings, but would like to add much more data.
Also, I think clustering could be used to recommend based on similar descriptions, attributes, and keywords. Or I could use a combination of the two approaches.

Another question: I wonder if Hadoop takes into account the passage of time. For example, a user may rate something high, then change their rating a couple of months later.

Lastly, my site is based on PHP. I need to be able to integrate that with Hadoop. How feasible is this approach?

I saw a clustering example, and a recommendation example based on user ratings. Is there any other advice, or are there docs or examples you could point me to, that deal with any of these issues?

Thanks,
Doug
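The "search-abuse" approach Ted describes can be sketched without any PHP specifics: index one field per behavior type, then query Solr with the user's recent history, boosting the stronger signals. The core name ("items"), field names ("clicked", "bought"), and boost values below are all hypothetical, assuming a stock local Solr; a PHP Solr client would build the same query string:

```bash
#!/bin/bash
# Hypothetical sketch: build a boosted Solr query from a user's recent history.
# Field names and boosts are illustrative, not from the thread.

CLICKED="item12 item34"   # items the user recently clicked
BOUGHT="item56"           # items the user recently bought

# One OR-clause per cue field; purchases boosted above clicks.
Q="clicked:(${CLICKED// / OR })^1.0 OR bought:(${BOUGHT// / OR })^5.0"
echo "$Q"

# Sending it to a local Solr would look like (commented out; needs a running Solr):
# curl -G 'http://localhost:8983/solr/items/select' \
#   --data-urlencode "q=$Q" --data-urlencode 'rows=10'
```

A/B testing then amounts to assigning users to different boost combinations and logging which variant produced each recommendation click.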
RE: why my test result on dfs short circuit read is slower?
Alright, I think that in my sequential-read scenario, it is possible for short-circuit read to actually be slower than reading through the datanode.

When reading through the datanode, the FS read operation is done by the datanode daemon, while data processing is done by the client. Thus, while the client is processing data, the datanode can read further data at the same time and write it to the local socket for the client to read; it takes very little time for the client to read from a local socket. In contrast, when the client reads the native FS directly, all the work is done by the client, which blocks for more time reading data from the native FS than it would reading from a local socket.

Overall, when the CPU is not the bottleneck and the datanode always has further data prepared for the client (due to the sequential-read scenario), the result is that short-circuit read is slower even though it costs less CPU.

The CPU/idle/IOWait load seems to support my guess:

For a scan-only job:
Read through datanode: CPU 30-35% / IOWait ~50% / 300 seconds
Short-circuit read: CPU 25-30% / IOWait ~40% / 330 seconds
Short-circuit read is 10% slower.

For a job doing more calculation:
Read through datanode: CPU 80-90% / IOWait ~5-10% / 190 seconds
Short-circuit read: CPU ~90% / IOWait ~2-3% / 160 seconds
Short-circuit read is 15% faster.

So short-circuit read is not always faster; especially when the CPU is not the bottleneck and reads are sequential, it can be slower. This is the best explanation I can come up with. Any thoughts?

Raymond
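The chunk-size difference raised earlier in the thread (512/4096-byte units in BlockReaderLocal versus 64K units in BlockSender.sendChunks) can be felt on a plain local file, with no Hadoop involved. This is an illustrative sketch only, not the Hadoop read path itself:

```bash
#!/bin/bash
# Illustrative only: read the same local file in 4K vs 64K units,
# mirroring the small-unit (BlockReaderLocal) vs 64K (BlockSender) read paths.

FILE=$(mktemp)
dd if=/dev/zero of="$FILE" bs=1M count=64 2>/dev/null   # 64MB test file

# Reads needed per 64MB at each unit size:
SMALL_READS=$((64 * 1024 / 4))    # 4K units
LARGE_READS=$((64 * 1024 / 64))   # 64K units
echo "4K units: $SMALL_READS reads; 64K units: $LARGE_READS reads"

# Timing the two unit sizes (the 4K pass issues 16x the read calls):
time dd if="$FILE" of=/dev/null bs=4k 2>/dev/null
time dd if="$FILE" of=/dev/null bs=64k 2>/dev/null

rm -f "$FILE"
```

On a cached file this mostly measures per-call overhead, which is exactly the cost the short-circuit client pays itself while a datanode-path client could instead overlap that work with processing.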
Re: some ideas for QJM and NFS
Oh, yes, you are right, George. I'll probably do it in the next days.

On Mon, Feb 18, 2013 at 2:47 PM, George Datskos <george.dats...@jp.fujitsu.com> wrote:

Hi Azuryy,

So you have measurements for hadoop-1.0.4 and hadoop-2.0.3+QJM, but I think you should also measure hadoop-2.0.3 _without_ QJM, so you can know for sure whether the performance degradation is actually related to QJM or not.

George

Hi, Harsh J is a good guy; I've seen this JIRA: https://issues.apache.org/jira/browse/HDFS-4508

I have a test cluster on hadoop-1.0.4 that I've upgraded to hadoop-2.0.3-alpha. My cluster is very small, four nodes in total. Then I did some tests on the original Hadoop and the new Hadoop. The test is very simple: I have a 450MB data file, and I just put it on HDFS. Block size: 128MB, replicas: 2. The following is the result:

[root@webdm test]# ll testspeed.tar.gz
-rw-r--r-- 1 root root 452M Feb 18 13:54 testspeed.tar.gz

//On hadoop-1.0.4
[root@webdm test]# date +%Y-%m-%d_%H:%M:%S; hadoop dfs -put testspeed.tar.gz / ; date +%Y-%m-%d_%H:%M:%S
2013-02-18_13:54:24
Warning: $HADOOP_HOME is deprecated.
2013-02-18_13:54:58

//On hadoop-2.0.3-alpha with QJM
[root@webdm test]# date +%Y-%m-%d_%H:%M:%S; hdfs dfs -put testspeed.tar.gz / ; date +%Y-%m-%d_%H:%M:%S
2013-02-18_14:13:29
13/02/18 14:13:30 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2013-02-18_14:14:33

I do think the QJM HA feature affects performance, because each writer in QJM will: fence the old writer; sync the in-progress log; start a new log segment; then write. Only when the writer receives a successful response from a quorum of JNs is the write considered finished. But NFS HA just writes the log segment locally and to NFS; when it receives a successful response from NFS, it is finished.

So I suggest we always keep these two HA features in the future, even in the stable release.
Which one to use depends on your infrastructure.

Thanks.
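As a sanity check on the timestamps in this thread, they translate to rough throughput for the 452MB put (integer arithmetic, so values are approximate):

```bash
#!/bin/bash
# Rough throughput from the timings quoted above (452MB put).
SIZE_MB=452
T_104=34   # hadoop-1.0.4:       13:54:24 -> 13:54:58
T_QJM=64   # hadoop-2.0.3 + QJM: 14:13:29 -> 14:14:33

echo "hadoop-1.0.4:       $((SIZE_MB / T_104)) MB/s"
echo "hadoop-2.0.3 + QJM: $((SIZE_MB / T_QJM)) MB/s"
```

So the QJM run shows roughly half the throughput, which is what prompted George's request for a no-QJM baseline.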
RE: why my test result on dfs short circuit read is slower?
Probably readahead played a key role in the first scenario (the scan-only job)? The default LONG_READ_THRESHOLD_BYTES (BlockSender.java) is 256K in the current codebase, and the ReadaheadPool takes effect on the normal read path.

Regards,
Liang

From: Liu, Raymond [raymond@intel.com]
Sent: February 18, 2013 14:04
To: user@hadoop.apache.org
Subject: RE: why my test result on dfs short circuit read is slower?
Re: some ideas for QJM and NFS
Hi,

I did it on hadoop-2.0.3-alpha without HA, as follows:

[root@webdm test]# date +%Y-%m-%d_%H:%M:%S; hdfs dfs -put testspeed.tar.gz / ; date +%Y-%m-%d_%H:%M:%S
2013-02-18_15:20:01
13/02/18 15:20:02 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2013-02-18_15:20:30

So the performance is a little bit better than hadoop-1.0.4.

On Mon, Feb 18, 2013 at 2:53 PM, Azuryy Yu <azury...@gmail.com> wrote:

Oh, yes, you are right, George. I'll probably do it in the next days.
RE: some ideas for QJM and NFS
Hi Azuryy, just want to confirm one thing: your JNs are not deployed on the same machines as the DNs, right?

Regards,
Liang

From: Azuryy Yu [azury...@gmail.com]
Sent: February 18, 2013 15:22
To: user@hadoop.apache.org
Subject: Re: some ideas for QJM and NFS
Re: some ideas for QJM and NFS
All JNs are deployed on the same nodes as the DNs.

On Mon, Feb 18, 2013 at 3:35 PM, 谢良 <xieli...@xiaomi.com> wrote:

Hi Azuryy, just want to confirm one thing: your JNs are not deployed on the same machines as the DNs, right?

Regards,
Liang
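Liang's question matters because JNs colocated with DNs make every quorum edit-log sync compete with block I/O on the same disks, which could account for part of the slowdown. Moving the journal to dedicated hosts is a matter of pointing the shared-edits URI elsewhere; the hostnames and journal ID below are placeholders:

```xml
<!-- hdfs-site.xml: QJM shared edits on dedicated JournalNode hosts.
     jn1..jn3 and "mycluster" are placeholders; 8485 is the default JN RPC port. -->
<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://jn1:8485;jn2:8485;jn3:8485/mycluster</value>
</property>
```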