building from subversion repository

2013-02-17 Thread George R Goffe
Hi,

I'm trying to build Hadoop from a current checkout of the repository and am 
receiving the following messages. Can someone enlighten me as to what I'm 
doing wrong, please?

Thanks,

George...




[INFO] BUILD FAILURE
[INFO] 
[INFO] Total time: 1:55.493s
[INFO] Finished at: Sun Feb 17 03:58:15 PST 2013
[INFO] Final Memory: 31M/332M
[INFO] 
[ERROR] Could not find goal 'protoc' in plugin 
org.apache.hadoop:hadoop-maven-plugins:3.0.0-SNAPSHOT among available goals - 
[Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please 
read the following articles:
[ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/MojoNotFoundException

RE: Can I perform a MR on my local filesystem

2013-02-17 Thread Agarwal, Nikhil
Hi,

Thank you Niels and thank you Nitin for your replies.

Actually, I want to run MR on a cloud store, which is open source. So I thought 
of implementing a Hadoop FileSystem for it and plugging it in, just as S3/KFS 
are plugged in. This would enable a Hadoop client to talk to my cloud store. 
But I am not yet clear on how to run MR on the cloud using the 
JobTracker/TaskTracker framework of Hadoop.

The link given by Niels shows that I can run MR on a local file system. So is 
there any way of telling the JobTracker to read data from a set of nodes, 
deploy TaskTracker daemons on those nodes (which would be my cloud store in 
this case), and fetch the result of the MR job?

Note: I do not want to fetch the data to my local computer, as is the case with 
S3. Fetching the data would defeat the purpose of using Hadoop (which is moving 
compute to data).

Thanks,
Nikhil

From: Agarwal, Nikhil
Sent: Sunday, February 17, 2013 11:53 AM
To: 'user@hadoop.apache.org'
Subject: Can I perform a MR on my local filesystem


Hi,

Recently I followed a blog to run Hadoop on a single-node cluster.

I wanted to ask: in a single-node setup of Hadoop, is it necessary to have the 
data copied into Hadoop's HDFS before running MR on it? Can I run MR on my 
local file system too, without copying the data to HDFS?

In the Hadoop source code I saw there are implementations of other file systems 
too, like S3, KFS, FTP, etc. So how exactly does MR happen on an S3 data store? 
How do the JobTracker and TaskTracker run against S3?



I would be very thankful to get a reply to this.



Thanks & Regards,

Nikhil



Re: building from subversion repository

2013-02-17 Thread Harsh J
Hi George,

The error below is your issue:

 [ERROR] Could not find goal 'protoc' in plugin 
 org.apache.hadoop:hadoop-maven-plugins:3.0.0-SNAPSHOT among available goals 
 - [Help 1]

To build trunk, a Protocol Buffers (protobuf) compiler installation of
at least version 2.4 is required, since we have that as a dependency.
This is mentioned on http://wiki.apache.org/hadoop/HowToContribute,
http://wiki.apache.org/hadoop/QwertyManiac/BuildingHadoopTrunk and
also in the SVN trunk's BUILDING.txt (look for proto).

Once it is installed in your OS, the following command should work as
shown, and your build should then proceed successfully:

➜  ~  protoc --version
libprotoc 2.4.1


--
Harsh J


Re: QJM deployment

2013-02-17 Thread Harsh J
Hi Azuryy,

Thanks for your feedback on the docs! I've filed
https://issues.apache.org/jira/browse/HDFS-4508 on your behalf to
address them. Feel free to file JIRAs for documentation complaints,
along with change patches, to have them improved yourself :)

On Sun, Feb 17, 2013 at 2:25 PM, Azuryy Yu azury...@gmail.com wrote:
 Hi,

 could someone kindly improve the 2.0.3-alpha QJM deployment document? I
 cannot understand it well, thanks.

 such as:
 1) by running the command hdfs namenode -bootstrapStandby on the
 unformatted NameNode

 -- it should say that the formatted NameNode must be started first.
 2) If you are converting a non-HA NameNode to be HA, you should run the
 command hdfs -initializeSharedEdits, which will initialize the
 JournalNodes with the edits data from the local NameNode edits directories.

 -- run this command on which node? And does this command even exist?


 The QJM Deployment details section is bad





--
Harsh J


Re: executing hadoop commands from python?

2013-02-17 Thread anuj maurice
I was stuck with a similar issue before and couldn't come up with a more
viable alternative than this: if the output of the hadoop command is not
too big, you can read it into your Python script and process it there.

I use the following snippet to clean the output of ls and store it in a
Python list for processing. In your case you can take the len of the list
to get the file count.

import commands  # Python 2 only; superseded by subprocess

# List matching HDFS dirs, discard stderr, skip the "Found N items" header,
# and keep the last three space-separated fields of each listing line.
fscommand = "hadoop dfs -ls /path/in/%s/*/ 2> /dev/null" % (hdfs)
hadoop_cmd = commands.getoutput(fscommand)
lines = hadoop_cmd.split("\n")[1:]
strlines = [map(lambda a: a.strip(), line.split(' ')[-3:]) for line in lines]
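
On newer Pythons the same thing can be done with the subprocess module
(which was asked about); a minimal sketch, with the hadoop binary location
and the HDFS path as placeholders:

import subprocess

# Run the ls and capture stdout; raises CalledProcessError on failure.
output = subprocess.check_output(
    ["path/to/hadoop/bin/hadoop", "dfs", "-ls", "/hdfs/query/path"])

# Drop the "Found N items" header line, then count the remaining entries.
entries = output.splitlines()[1:]
print(len(entries))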




On Sun, Feb 17, 2013 at 4:17 AM, jamal sasha jamalsha...@gmail.com wrote:

 Hi,

   This might be more of a python centric question but was wondering if
 anyone has tried it out...

 I am trying to run few hadoop commands from python program...

 For example if from command line, you do:

   bin/hadoop dfs -ls /hdfs/query/path

 it returns all the files in the hdfs query path..
 So very similar to unix


 Now I am trying to basically do this from python.. and do some
 manipulation from it.

 exec_str = "path/to/hadoop/bin/hadoop dfs -ls " + query_path
 os.system(exec_str)

 Now, I am trying to grab this output to do some manipulation in it.
 For example.. count number of files?
 I looked into subprocess module but then... these are not native shell
 commands. hence not sure whether i can apply those concepts
 How to solve this?

 Thanks





-- 
regards ,
Anuj Maurice


Re: executing hadoop commands from python?

2013-02-17 Thread Harsh J
Instead of 'scraping' this way, consider using a library such as
Pydoop (http://pydoop.sourceforge.net), which provides Pythonic APIs
to interact with Hadoop components. There are also other libraries
covered at
http://blog.cloudera.com/blog/2013/01/a-guide-to-python-frameworks-for-hadoop/
for example.
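
For instance, counting the files under a path then needs no output parsing
at all. A minimal sketch, assuming Pydoop is installed and can reach the
cluster (the path is a placeholder):

import pydoop.hdfs as hdfs

# List the directory through libhdfs instead of scraping shell output.
entries = hdfs.ls("/hdfs/query/path")
print(len(entries))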


--
Harsh J


Re: Namenode failures

2013-02-17 Thread Robert Dyer
It just happened again.  This was after a fresh format of HDFS/HBase and I
am attempting to re-import the (backed up) data.

  http://pastebin.com/3fsWCNQY

So now if I restart the namenode, I will lose data from the past 3 hours.

What is causing this?  How can I avoid it in the future?  Is there an easy
way to monitor (other than a script grep'ing the logs) the checkpoints to
see when this happens?
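
One low-tech answer to the monitoring question: check the age of the
secondary's checkpoint image from cron and alert when it goes stale. A
minimal sketch, assuming the 1.0.x layout where fs.checkpoint.dir contains
current/fsimage; the path and the threshold are placeholders:

import os
import sys
import time

CHECKPOINT_IMAGE = "/path/to/fs.checkpoint.dir/current/fsimage"
MAX_AGE = 2 * 60 * 60  # seconds; alert if no checkpoint for two hours

# The fsimage mtime advances each time the 2NN completes a checkpoint.
age = time.time() - os.path.getmtime(CHECKPOINT_IMAGE)
if age > MAX_AGE:
    sys.exit("checkpoint is %d minutes old; the 2NN may be failing" % (age // 60))
print("checkpoint OK (%d minutes old)" % (age // 60))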


On Sat, Feb 16, 2013 at 2:39 PM, Robert Dyer psyb...@gmail.com wrote:

 Forgot to mention: Hadoop 1.0.4


 On Sat, Feb 16, 2013 at 2:38 PM, Robert Dyer psyb...@gmail.com wrote:

 I am at my wits' end here.  Every single time I restart the
 namenode, I get this crash:

 2013-02-16 14:32:42,616 INFO
 org.apache.hadoop.hdfs.server.common.Storage: Image file of size 168058
 loaded in 0 seconds.
 2013-02-16 14:32:42,618 ERROR
 org.apache.hadoop.hdfs.server.namenode.NameNode:
 java.lang.NullPointerException
 at
 org.apache.hadoop.hdfs.server.namenode.FSDirectory.addChild(FSDirectory.java:1099)
 at
 org.apache.hadoop.hdfs.server.namenode.FSDirectory.addChild(FSDirectory.java:)
 at
 org.apache.hadoop.hdfs.server.namenode.FSDirectory.addNode(FSDirectory.java:1014)
 at
 org.apache.hadoop.hdfs.server.namenode.FSDirectory.unprotectedAddFile(FSDirectory.java:208)
 at
 org.apache.hadoop.hdfs.server.namenode.FSEditLog.loadFSEdits(FSEditLog.java:631)
 at
 org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSEdits(FSImage.java:1021)
 at
 org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:839)
 at
 org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:377)
 at
 org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:100)
 at
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:388)
 at
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.init(FSNamesystem.java:362)
 at
 org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:276)
 at
 org.apache.hadoop.hdfs.server.namenode.NameNode.init(NameNode.java:496)
 at
 org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1279)
 at
 org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1288)

 I am following best practices here, as far as I know.  I have the
 namenode writing into 3 directories (2 local, 1 NFS).  All 3 of these dirs
 have the exact same files in them.

 I also run a secondary checkpoint node.  This one appears to have started
 failing a week ago, so checkpoints were *not* being done since then.  Thus
 I can get the NN up and running, but with week-old data!

  What is going on here?  Why does my NN data *always* wind up causing
 this exception over time?  Is there some easy way to get notified when the
 checkpointing starts to fail?




 --

 Robert Dyer
 rd...@iastate.edu




-- 

Robert Dyer
rd...@iastate.edu


Re: Namenode failures

2013-02-17 Thread Mohammad Tariq
Hello Robert,

 It seems that your edit logs and fsimage have got
corrupted somehow. It looks somewhat similar to this one:
https://issues.apache.org/jira/browse/HDFS-686

Have you made any changes to the 'dfs.name.dir' directory
lately? Do you have enough space where the metadata is getting
stored? You can make use of the offline image viewer to diagnose
the fsimage file.

Warm Regards,
Tariq
https://mtariq.jux.com/
cloudfront.blogspot.com





Re: Namenode failures

2013-02-17 Thread Robert Dyer
On Sun, Feb 17, 2013 at 4:41 PM, Mohammad Tariq donta...@gmail.com wrote:

 Hello Robert,

  It seems that your edit logs and fsimage have got
 corrupted somehow. It looks somewhat similar to this one
 https://issues.apache.org/jira/browse/HDFS-686


Similar, but the trace is different.


 Have you made any changes to the 'dfs.name.dir' directory
 lately?


No.


 Do you have enough space where metadata is getting
 stored?


Yes.  All 3 locations have plenty of space (hundreds of GB).


 You can make use of the offline image viewer to diagnose
 the fsimage file.


-- 

Robert Dyer
rd...@iastate.edu


Re: Namenode failures

2013-02-17 Thread Robert Dyer
On Sun, Feb 17, 2013 at 4:41 PM, Mohammad Tariq donta...@gmail.com wrote:

 You can make use of the offline image viewer to diagnose
 the fsimage file.


Is this not included in the 1.0.x branch?  All of the documentation I find
for it says to run 'bin/hdfs oev' but I do not have a 'bin/hdfs'.



-- 

Robert Dyer
rd...@iastate.edu


Re: Namenode failures

2013-02-17 Thread Harsh J
Hi Robert,

Are you by any chance adding files carrying unusual encodings? If it's
possible, could you send us a bundle of the corrupted log set (all of the
dfs.name.dir contents) so we can inspect what seems to be causing the
corruption?

The only identified (but rarely occurring) bug around this part in
1.0.4 would be https://issues.apache.org/jira/browse/HDFS-4423. The
other major corruption bug I know of is already fixed in your version,
being https://issues.apache.org/jira/browse/HDFS-3652 specifically.

We've not had this report from other users, so having a reproduced file
set (data not required) would be most helpful. If you have the logs leading
up to the shutdown and crash as well, those would be good to have too.

P.S. How exactly are you shutting down the NN each time? A kill -9 or
a regular SIGTERM shutdown?



--
Harsh J


Re: Namenode failures

2013-02-17 Thread Robert Dyer
On Sun, Feb 17, 2013 at 5:08 PM, Harsh J ha...@cloudera.com wrote:

 Hi Robert,

 Are you by any chance adding files carrying unusual encoding?


I don't believe so.  The only files I push to HDFS are SequenceFiles (with
protobuf objects in them) and HBase's regions, which again are just protobuf
objects.  I don't use any special encodings in the protobufs.


 If its
 possible, can we be sent a bundle of the corrupted log set (all of the
 dfs.name.dir contents) to inspect what seems to be causing the
 corruption?


I can give the logs, dfs data dir(s), and 2nn dirs.

https://www.dropbox.com/s/heijq65pmb3esvd/hdfs-bug.tar.gz


 The only identified (but rarely occurring) bug around this part in
 1.0.4 would be https://issues.apache.org/jira/browse/HDFS-4423. The
 other major corruption bug I know of is already fixed in your version,
 being https://issues.apache.org/jira/browse/HDFS-3652 specifically.

 We've not had this report from other users so having a reproduced file
 set (data not required) would be most helpful. If you have logs
 leading to the shutdown and crash as well, that'd be good to have too.

 P.s. How exactly are you shutting down the NN each time? A kill -9 or
 a regular SIGTERM shutdown?


I shut down the NN with 'bin/stop-dfs.sh'.






-- 

Robert Dyer
rd...@iastate.edu


RE: why my test result on dfs short circuit read is slower?

2013-02-17 Thread Liu, Raymond
I have tried tuning io.file.buffer.size to 128K instead of 4K; short-circuit 
read performance is still worse than reading through the datanode.

I am starting to wonder: does short-circuit read really help under the hadoop 
1.1.1 version?
Googling, I find a few people mentioning they got a 2x gain or so on CDH etc., 
but I really can't find anything else I can do to make it even just catch up 
with the normal read path.

 
 It seems to me that, with short-circuit read enabled, BlockReaderLocal
 reads data in 512/4096-byte units (checksum check enabled/skipped),
 
 while when it goes through the datanode, BlockSender.sendChunks will read
 and send data in 64K-byte units?
 
 Is that true? And if so, wouldn't that explain why reading through the
 datanode is faster, since it reads data in bigger chunks?
 
 Best Regards,
 Raymond Liu
 
 
  -Original Message-
  From: Liu, Raymond [mailto:raymond@intel.com]
  Sent: Saturday, February 16, 2013 2:23 PM
  To: user@hadoop.apache.org
  Subject: RE: why my test result on dfs short circuit read is slower?
 
  Hi Arpit Gupta
 
  Yes,  this way also confirms that short circuit read is enabled on my 
  cluster.
 
  13/02/16 14:07:34 DEBUG hdfs.DFSClient: Short circuit read is true
 
  13/02/16 14:07:34 DEBUG hdfs.DFSClient: New BlockReaderLocal for file
 
 /mnt/DP_disk4/raymond/hdfs/data/current/subdir63/blk_-2736548898990727
  638 of size 134217728 startOffset 0 length 134217728 short circuit
  checksum false
 
  So , any possibility that other setting might impact short circuit
  read to has worse performance than read through datanode?
 
  Raymond
 
 
 
  Another way to check if short circuit read is configured correctly.
 
  As the user who is configured for short circuit read issue the
  following
  command on a node where you expect the data to be read locally.
 
  export HADOOP_ROOT_LOGGER=debug,console; hadoop dfs -cat
  /path/to/file_on_hdfs
 
  On the console you should see something like hdfs.DFSClient: New
  BlockReaderLocal for file
 
  This would confirm that short circuit read is happening.
 
  --
  Arpit Gupta
  Hortonworks Inc.
  http://hortonworks.com/
 
  On Feb 15, 2013, at 9:53 PM, Liu, Raymond raymond@intel.com
 wrote:
 
 
  Hi Harsh
 
  Yes, I did set both of these. While not in hbase-site.xml but hdfs-site.xml.
 
  And I have double confirmed that local reads are performed, since
  there are no Error in datanode logs, and by watching lo network IO.
 
 
 
  If you want HBase to leverage the shortcircuit, the DN config
  dfs.block.local-path-access.user should be set to the user running HBase
 (i.e.
  hbase, for example), and the hbase-site.xml should have
  dfs.client.read.shortcircuit defined in all its RegionServers. Doing
  this wrong could result in performance penalty and some warn-logging,
  as local reads will be attempted but will begin to fail.
 
  On Sat, Feb 16, 2013 at 8:40 AM, Liu, Raymond raymond@intel.com
  wrote:
 
  Hi
 
     I tried to use short circuit read to improve my hbase cluster
  MR scan performance.
 
 
     I have the following setting in hdfs-site.xml
 
     dfs.client.read.shortcircuit set to true
     dfs.block.local-path-access.user set to MR job runner.
 
     The cluster is 1+4 node and each data node have 16cpu/4HDD,
  with all hbase table major compact thus all data is local.
 
     I have hoped that the short circuit read will improve the
  performance.
 
 
     While the test result is that with short circuit read enabled,
  the performance actually dropped 10-15%. Say scan a 50G table cost
  around 100s instead of 90s.
 
 
     My hadoop version is 1.1.1, any idea on this? Thx!
 
  Best Regards,
  Raymond Liu
 
 
 
 
  --
  Harsh J



Re: product recommendations engine

2013-02-17 Thread Ted Dunning
Yeah... you can make this work.

First, if your setup is relatively small, then you won't need Hadoop.

Second, having lots of kinds of actions is a very reasonable thing to do.
My own suggestion is that you analyze each of these independently for its
predictive power and then combine them at recommendation time.

My own suggestion for how to deploy the recommendation model is in the form
of a search engine that has fields for each kind of recommendation cue that
you need to have.  You can combine any or all of these cues in the process
of doing a non-textual search using the recent history of the user as the
query.

This search-abuse style of recommendations is pretty easy to deploy and PHP
has a reasonably good package for sending queries to Solr, which is the
search engine I tend to recommend.
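
To make the shape of such a query concrete, here is a minimal sketch in
Python (the Solr URL, core name, field names, and indicator values are all
assumptions; a PHP Solr client would build the same kind of request):

import json
import urllib
import urllib2

# The user's recent history becomes the query terms against indicator fields.
params = urllib.urlencode({
    "q": "purchase_indicators:(item123 item456) OR click_indicators:(item789)",
    "wt": "json",
    "rows": 10,
})
response = urllib2.urlopen("http://localhost:8983/solr/items/select?" + params)
print(json.load(response)["response"]["docs"])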

You should also make a provision for A/B testing on different
recommendation approaches and combinations of inputs.  This is pretty
straightforward, but usually requires some sort of experimental condition
assignment and definitely requires good log recording and analysis.

That said, this isn't a tiny project.  It involves quite a bit of work.  It
isn't terribly hard at any point and the overall architecture is pretty
straightforward, but there is a good bit of work to be done.

On Sun, Feb 17, 2013 at 4:21 PM, Douglass Davis
douglassdavi...@gmail.comwrote:

 Hello,

 I don't have any prior experience with Hadoop.  I am also not a statistics
 expert.  I am a software engineer, however, after looking at the docs,
 Hadoop still seems pretty intimidating to set up.

 I am interested in doing product recommendations.  However, I want to
 store many things about user behavior, for example whether they click on a
 link in an email, how they rate a product, whether they buy it, etc.  Then
 I would like to come up with similar items that a user may like.  I have
 seen an example just based on user ratings, but would like to add much more
 data.

 Also, I think the clustering could be used in terms of recommending based
 on similar descriptions, attributes, and keywords.

 Or, I could use a combination of the two approaches.

 Another question, I wonder if Hadoop takes into account the passage of
 time.  For example, a user may rate something high, then change their
 rating a couple months later.

 Lastly, my site is based on PHP.  I need to be able to integrate that with
 Hadoop.

 How feasible is this approach?  I saw a clustering example, and a
 recommendation example based on user ratings.  Are there any other advice,
 docs, or examples that you could point me to that deals with any of these
 issues?

 Thanks,
 Doug







RE: why my test result on dfs short circuit read is slower?

2013-02-17 Thread Liu, Raymond
Alright, I think that in my sequential-read scenario it is possible that 
short-circuit read is actually slower than reading through the datanode.

When reading through the datanode, the FS read is done by the datanode daemon, 
while the data processing is done by the client.
Thus while the client is processing data, the datanode can read further data at 
the same time and write it to the local socket for the client to read.
It takes very little time for the client to read from the local socket.

When the client reads the native FS directly, all the work is done by the 
client: it is blocked longer reading data from the native FS than it would be 
reading from the local socket.

Overall, when the CPU is not the bottleneck and the datanode always has further 
data ready for the client (as in the sequential-read scenario), the result is 
that short-circuit read is slower even though it costs less CPU load. 
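
To make that argument concrete, a toy model (the per-chunk times here are
invented purely for illustration):

# Reading through the datanode overlaps disk I/O with client processing;
# short-circuit read does both in the client, one after the other.
t_read, t_process = 1.0, 0.8          # hypothetical seconds per chunk
pipelined = max(t_read, t_process)    # datanode reads ahead while client works
serialized = t_read + t_process       # client reads, then processes
print("via datanode:  %.1f s/chunk" % pipelined)
print("short-circuit: %.1f s/chunk" % serialized)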

The CPU/IOWait load also seems to justify my guess:

= For a scan-only job: =

Read through datanode:  CPU : 30-35% / IOWait : ~50% / 300 seconds
Shortcircuit read:      CPU : 25-30% / IOWait : ~40% / 330 seconds
Short-circuit read is 10% slower.

= For a job doing more calculation: =
Read through datanode:  CPU : 80-90% / IOWait : ~5-10% / 190 seconds
Shortcircuit read:      CPU : ~90%   / IOWait : ~2-3%  / 160 seconds
Short-circuit read is 15% faster.

So short-circuit read is not always faster; especially when the CPU is not the 
bottleneck and reads are sequential, it can be slower. This is the best 
explanation I can come up with for now. Any thoughts?

Raymond

 
  

Re: some ideas for QJM and NFS

2013-02-17 Thread Azuryy Yu
Oh, yes, you are right, George. I'll probably do it in the next days.


On Mon, Feb 18, 2013 at 2:47 PM, George Datskos 
george.dats...@jp.fujitsu.com wrote:

  Hi Azuryy,

 So you have measurements for hadoop-1.0.4 and hadoop-2.0.3+QJM, but I
 think you should also measure hadoop-2.0.3 _without_ QJM so you can know for
 sure whether the performance degradation is actually related to QJM or not.


 George


   Hi,

  HarshJ is a good guy, I've seen this JIRA:
 https://issues.apache.org/jira/browse/HDFS-4508

  I have a test cluster on hadoop-1.0.4, which I've upgraded to
 hadoop-2.0.3-alpha. My cluster is very small, four nodes in total.

  Then I ran some tests on the original Hadoop and the new Hadoop. The test
 is very simple: I have a data file of 450MB, and I just put it on HDFS.

  block size: 128MB, replica: 2

  the following is the result:

 [root@webdm test]# ll testspeed.tar.gz
 -rw-r--r-- 1 root root 452M Feb 18 13:54 testspeed.tar.gz
 [root@webdm test]#

  //On the hadoop-1.0.4
  [root@webdm test]# date +%Y-%m-%d_%H:%M:%S; hadoop dfs -put
 testspeed.tar.gz / ; date +%Y-%m-%d_%H:%M:%S
 2013-02-18_13:54:24
 Warning: $HADOOP_HOME is deprecated.
 2013-02-18_13:54:58

  //On the hadoop-2.0.3-alpha with QJM
  [root@webdm test]# date +%Y-%m-%d_%H:%M:%S; hdfs dfs -put
 testspeed.tar.gz / ; date +%Y-%m-%d_%H:%M:%S
 2013-02-18_14:13:29
 13/02/18 14:13:30 WARN util.NativeCodeLoader: Unable to load native-hadoop
 library for your platform... using builtin-java classes where applicable
 2013-02-18_14:14:33

  I do think the QJM HA feature affects performance, because each QJM
 writer will: fence the old writer; sync the in-progress log; start a new
 log segment; then write. Only when the writer receives a successful
 response from a quorum of JNs is the write considered finished.

  But NFS HA just writes the log segment locally and to NFS; when it
 receives a successful response from NFS, the write is finished.

  So I just suggest we always keep these two HA features in the future,
 even in the stable release; which one to use depends on your
 infrastructure.

  Thanks.





RE: why my test result on dfs short circuit read is slower?

2013-02-17 Thread 谢良
Probably readahead played a key role in the first scenario (the scan-only 
job)? The default LONG_READ_THRESHOLD_BYTES (BlockSender.java) is 256k in the 
current codebase, and the ReadaheadPool takes effect on the normal read path.


Regards,
Liang

From: Liu, Raymond [raymond@intel.com]
Sent: February 18, 2013 14:04
To: user@hadoop.apache.org
Subject: RE: why my test result on dfs short circuit read is slower?


Re: some ideas for QJM and NFS

2013-02-17 Thread Azuryy Yu
Hi,

I did it on hadoop-2.0.3-alpha without HA, as follows:

[root@webdm test]# date +%Y-%m-%d_%H:%M:%S; hdfs dfs -put testspeed.tar.gz
/ ; date +%Y-%m-%d_%H:%M:%S
2013-02-18_15:20:01
13/02/18 15:20:02 WARN util.NativeCodeLoader: Unable to load native-hadoop
library for your platform... using builtin-java classes where applicable
2013-02-18_15:20:30

So the performance without HA is a little bit better than hadoop-1.0.4.
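
For comparison, the rough effective throughput of the three timed runs
(452MB file; wall-clock durations read off the date stamps above):

# Back-of-the-envelope: 452 MB divided by the measured seconds per -put run.
runs = [
    ("hadoop-1.0.4",              34),  # 13:54:24 -> 13:54:58
    ("hadoop-2.0.3-alpha + QJM",  64),  # 14:13:29 -> 14:14:33
    ("hadoop-2.0.3-alpha, no HA", 29),  # 15:20:01 -> 15:20:30
]
for name, seconds in runs:
    print("%-27s %5.1f MB/s" % (name, 452.0 / seconds))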







RE: some ideas for QJM and NFS

2013-02-17 Thread 谢良
Hi Azuryy, just want to confirm one thing: your JNs are not deployed on the 
same machines as the DNs, right?

Regards,
Liang

From: Azuryy Yu [azury...@gmail.com]
Sent: February 18, 2013 15:22
To: user@hadoop.apache.org
Subject: Re: some ideas for QJM and NFS



Re: RE: some ideas for QJM and NFS

2013-02-17 Thread Azuryy Yu
All JNs are deployed on the same nodes as the DNs.


On Mon, Feb 18, 2013 at 3:35 PM, 谢良 xieli...@xiaomi.com wrote:

  Hi Azuryy, just want to confirm one thing: your JNs are not deployed on
 the same machines as the DNs, right?
