Re: Fair scheduler.

2012-10-17 Thread Harsh J
Hi, Regular users never write into the mapred.system.dir AFAICT. That directory is just for the JT to use to mark its presence and to expose the distributed filesystem it will be relying on. Users write to their respective staging directories, which lie elsewhere and are per-user. Let me post

Re: HDFS federation

2012-10-17 Thread Visioner Sadak
That means I will need a cluster set up, right? Pseudo-distributed mode won't work. On Wed, Oct 17, 2012 at 5:15 AM, lohit lohit.vijayar...@gmail.com wrote: You can try out federation by creating 3 different conf directories and starting 3 different NameNodes out of those configurations. These

Re: unable to access using webhdfs in 0.23.3

2012-10-17 Thread Visioner Sadak
http://112.30.123.711:50070/webhdfs/v1/user/test.jpg?op=OPEN when I use my Linux box IP 112.30.123.711 it points to http://localhost:50075/webhdfs/v1/user/test.jpg?op=OPEN&namenoderpcaddress=112.30.123.711:8020&offset=0 with localhost; how do I bring the Linux box IP in here instead??? On Wed, Oct 17,

HDFS upgrade

2012-10-17 Thread Amit Sela
Hi all, I want to upgrade a 1TB cluster from Hadoop 0.20.3 to Hadoop 1.0.3. I am interested to know how long the HDFS upgrade takes and, in general, how long it takes from deploying the new version until the cluster is back to running heavy MapReduce? I'd also appreciate it if someone could

Re: wait at the end of job

2012-10-17 Thread Radim Kolar
It tracks data read into the buffer, not processed data. Sitting at 100 percent is okay.

Re: HDFS using SAN

2012-10-17 Thread Kevin O'dell
You may want to take a look at the NetApp white paper on this. They have a SAN solution as their Hadoop offering. http://www.netapp.com/templates/mediaView?m=tr-3969.pdf&cc=us&wid=130618138&mid=56872393 On Tue, Oct 16, 2012 at 7:28 PM, Pamecha, Abhishek apame...@x.com wrote: Yes, for MR, my

Re: HDFS using SAN

2012-10-17 Thread Tom Deutsch
And of course IBM has supported our GPFS and SONAS customers for a couple of years already. --- Sent from my Blackberry so please excuse typing and spelling errors. - Original Message - From: Kevin O'dell [kevin.od...@cloudera.com] Sent: 10/17/2012

Re: HDFS using SAN

2012-10-17 Thread Mohamed Riadh Trad
Back up your data! On 17 Oct 2012, at 15:25, Kevin O'dell wrote: You may want to take a look at the NetApp white paper on this. They have a SAN solution as their Hadoop offering. http://www.netapp.com/templates/mediaView?m=tr-3969.pdf&cc=us&wid=130618138&mid=56872393 On Tue, Oct

Re: Question about namenode HA

2012-10-17 Thread Chao Shi
Can we get rid of ZK completely? Since JNs are like a simplified version of ZK, it should be possible to use them for election. I think it's pretty easy: - JN exposes the latest heartbeat information via RPC (the active NN heartbeats the JNs every 1 second) - zkfc decides whether the current active NN is
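For concreteness, a minimal Java sketch of the check being proposed here; the JournalNodeClient interface and its lastActiveHeartbeatMillis() RPC are invented purely for illustration (nothing like it exists in the released code), and the quorum rule and staleness threshold are assumptions:

import java.util.List;

public class JnBasedHealthCheck {

    // Invented for illustration only; no such RPC exists in the released code.
    interface JournalNodeClient {
        long lastActiveHeartbeatMillis();
    }

    // The failover controller would consider the active NN dead only if a
    // quorum of JournalNodes report a stale heartbeat (threshold is arbitrary).
    static boolean activeLooksDead(List<JournalNodeClient> jns, long staleAfterMs) {
        long now = System.currentTimeMillis();
        int staleVotes = 0;
        for (JournalNodeClient jn : jns) {
            if (now - jn.lastActiveHeartbeatMillis() > staleAfterMs) {
                staleVotes++;
            }
        }
        return staleVotes > jns.size() / 2;
    }
}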

Re: Fair scheduler.

2012-10-17 Thread Goldstone, Robin J.
Yes, you would think that users shouldn't need to write to mapred.system.dir, yet that seems to be the case. I posted details about my configuration along with full stack traces last week. I won't re-post everything but essentially I have mapred.system.dir defined as a directory in HDFS owned by
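As a quick way to see what the JT will run into, here is a small, untested Java sketch that resolves mapred.system.dir from the loaded configuration and prints the owner and permissions of that path on the default filesystem; the fallback path in conf.get() is just a placeholder, not a real default:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CheckSystemDir {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Resolve the configured JT system directory (the fallback is a placeholder).
        Path systemDir = new Path(conf.get("mapred.system.dir", "/tmp/hadoop/mapred/system"));
        FileSystem fs = FileSystem.get(conf);
        FileStatus status = fs.getFileStatus(systemDir);
        System.out.println(systemDir + " owner=" + status.getOwner()
            + " group=" + status.getGroup()
            + " perms=" + status.getPermission());
    }
}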

Re: Fair scheduler.

2012-10-17 Thread Harsh J
Hey Robin, Thanks for the detailed post. Just looked at your older thread, and you're right, the JT does write into its system dir for users' job info and token files when initializing the Job. The bug you ran into and the exception+trace you got make sense now. I just didn't see it on version

Re: HDFS federation

2012-10-17 Thread Visioner Sadak
Got that, Anil. Thanks a ton, friends, for your help. On Wed, Oct 17, 2012 at 8:35 PM, Anil Gupta anilgupt...@gmail.com wrote: Hi Visioner, It won't work in pseudo-distributed mode because you need to run at least 2 NNs, and if you run 2 NNs then you need to configure them separately on different
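For what the client side of such a setup looks like, a minimal Java sketch: each federated NameNode serves its own independent namespace, so a client simply opens a FileSystem against each one. The hostnames and port below are placeholders, not values from this thread.

import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class FederationClientSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Each federated NameNode owns an independent namespace, so a client
        // addresses each one by its own URI (hostnames and port are placeholders).
        FileSystem ns1 = FileSystem.get(URI.create("hdfs://nn1.example.com:8020"), conf);
        FileSystem ns2 = FileSystem.get(URI.create("hdfs://nn2.example.com:8020"), conf);
        System.out.println("namespace 1: " + ns1.getUri());
        System.out.println("namespace 2: " + ns2.getUri());
    }
}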

Anyone else having problems hitting Apache's site?

2012-10-17 Thread Michael Segel
I'm having issues connecting to the API pages off the Apache site. Is it just me? Thx -Mike

Re: Anyone else having problems hitting Apache's site?

2012-10-17 Thread Matt Bornski
http://www.downforeveryoneorjustme.com/hadoop.apache.org and my web browser say it's just you On Wed, Oct 17, 2012 at 9:02 AM, Michael Segel michael_se...@hotmail.com wrote: I'm having issues connecting to the API pages off the Apache site. Is it just me? Thx -Mike

Re: Fair scheduler.

2012-10-17 Thread Harsh J
No, you're right - to define the queue names at the cluster level, mapred.queue.names is the right config. To specify a queue at the job level, mapred.job.queue.name is the right config. On Wed, Oct 17, 2012 at 11:10 PM, Patai Sangbutsarakum silvianhad...@gmail.com wrote: Harsh.. i am
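As a small illustration of the two properties, a sketch of setting them programmatically; the queue names below are made up, and in practice mapred.queue.names normally lives in mapred-site.xml on the cluster while the job-level property can also be passed as -Dmapred.job.queue.name=... on the command line.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapred.JobConf;

public class QueueConfigSketch {
    public static void main(String[] args) {
        // Cluster level (normally set in mapred-site.xml): the full list of queues.
        Configuration clusterConf = new Configuration();
        clusterConf.set("mapred.queue.names", "default,research,adhoc"); // queue names are made up

        // Job level: pick one of the configured queues for this submission,
        // equivalent to passing -Dmapred.job.queue.name=research on the command line.
        JobConf jobConf = new JobConf();
        jobConf.set("mapred.job.queue.name", "research");
        System.out.println("Submitting to queue: " + jobConf.get("mapred.job.queue.name"));
    }
}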

Re: unable to access using webhdfs in 0.23.3

2012-10-17 Thread Arpit Gupta
Does any other webhdfs call work for you? For example, can you do this: http://localhost:50070/webhdfs/v1/user/test.jpg?op=LISTSTATUS I tried 0.23.4, set up a single-node cluster, and was able to make the calls. -- Arpit Gupta Hortonworks Inc. http://hortonworks.com/ On Oct 17, 2012, at 11:17

Re: speculative execution in yarn

2012-10-17 Thread Vinod Kumar Vavilapalli
Speculative execution is a per-job concept, so in the 2.* release line it is the MR AM's responsibility. Because there is two-level scheduling - one level at the RM and one at the AM - AMs have no way of figuring out whether there are other jobs or not. In general, AMs send container requests in a greedy manner and
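Since it is a per-job setting, a minimal sketch of toggling it in the job's own configuration; the property names below are the Hadoop 2.x MR ones and are assumed rather than quoted from this thread (the older 1.x names are mapred.map.tasks.speculative.execution and mapred.reduce.tasks.speculative.execution).

import org.apache.hadoop.conf.Configuration;

public class SpeculationToggle {
    public static void main(String[] args) {
        // Speculative execution is decided per job by the MR ApplicationMaster,
        // so it is switched on or off in the job's own configuration.
        Configuration conf = new Configuration();
        conf.setBoolean("mapreduce.map.speculative", true);
        conf.setBoolean("mapreduce.reduce.speculative", false);
        System.out.println("map speculation: "
            + conf.getBoolean("mapreduce.map.speculative", true));
    }
}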

Re: unable to access using webhdfs in 0.23.3

2012-10-17 Thread Visioner Sadak
Yes, operations like liststatus and GETCONTENTSUMMARY etc. are working; just OPEN is not working because the redirect URL it generates is http://localhost:50075/webhdfs/v1/user/test.jpg?op=OPEN&namenoderpcaddress=112.30.123.711:8020&offset=0. Now when I replace the localhost with 112.30.123.711 it works
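For anyone debugging the same thing, a small Java probe that issues the OPEN call but does not follow the redirect, so the hostname the NameNode puts in the Location header is visible directly. The host and path are the ones from this thread; the interpretation that localhost there reflects how the DataNode registered or advertised itself is an assumption, not something confirmed in the thread.

import java.net.HttpURLConnection;
import java.net.URL;

public class WebHdfsOpenProbe {
    public static void main(String[] args) throws Exception {
        // NameNode host and file path taken from the thread; substitute your own.
        URL url = new URL("http://112.30.123.711:50070/webhdfs/v1/user/test.jpg?op=OPEN");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        // Do not follow the 307 redirect; we only want to inspect the Location header.
        conn.setInstanceFollowRedirects(false);
        System.out.println("Status:   " + conn.getResponseCode());
        System.out.println("Location: " + conn.getHeaderField("Location"));
        conn.disconnect();
    }
}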

Hadoop on Isilon problem

2012-10-17 Thread Artem Ervits
Anyone using Hadoop running on Isilon NAS? I am trying to submit a job with a user other than the one running Hadoop and I'm getting the following error: Exception in thread "main" java.io.IOException: Permission denied at java.io.UnixFileSystem.createFileExclusively(Native Method)

hadoop current properties

2012-10-17 Thread Kartashov, Andy
Is there a command line in Hadoop, or a Java method, to display what all (if not individual) of Hadoop's current properties are set to? Rgds, AK
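One answer, sketched in Java using only the plain Configuration API: constructing a Configuration loads the default and site files found on the classpath, and writeXml() prints every resolved property. fs.default.name below is just an example key to show a single-property lookup.

import org.apache.hadoop.conf.Configuration;

public class DumpConf {
    public static void main(String[] args) throws Exception {
        // Constructing a Configuration loads core-default.xml, core-site.xml and
        // anything else registered as a default resource on the classpath.
        Configuration conf = new Configuration();

        // Print every resolved property as XML.
        conf.writeXml(System.out);
        System.out.println();

        // Or look up a single property by name.
        System.out.println("fs.default.name = " + conf.get("fs.default.name"));
    }
}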

Re: Using a hard drive instead of

2012-10-17 Thread Colin McCabe
Hi Mark, HDFS contains a write-ahead log which will protect you from power failure. It's called the edit log. If you want warm failover, you can use HDFS HA, which is available in recent versions of HDFS. Hope this helps. Colin On Wed, Oct 17, 2012 at 3:44 PM, Mark Kerzner

RE: HDFS using SAN

2012-10-17 Thread Pamecha, Abhishek
Tom, do you mean you are using GPFS instead of HDFS? Also, if you can share, are you deploying it as a DAS setup or a SAN? Thanks, Abhishek From: Tom Deutsch [mailto:tdeut...@us.ibm.com] Sent: Wednesday, October 17, 2012 6:31 AM To: user Subject: Re: HDFS using SAN And of course IBM has

Re: Hadoop on Isilon problem

2012-10-17 Thread Rita
Out of curiosity, what does running HDFS give you when running through an Isilon cluster? On Wed, Oct 17, 2012 at 3:59 PM, Mohit Anchlia mohitanch...@gmail.com wrote: Look at the directory permissions? On Wed, Oct 17, 2012 at 12:18 PM, Artem Ervits are9...@nyp.org wrote: Anyone using Hadoop

RE: HDFS using SAN

2012-10-17 Thread Pamecha, Abhishek
In a SAN? Would it be a concern if I am relying on HDFS to do the replication and using the SAN only as a dumb storage tier? In that case, the only difference is remote vs. local access. Reliability may actually be even better in a SAN because I would assume any reasonable SAN would provide decent

Not sure Kerberos principal needs a Linux user account

2012-10-17 Thread Zheng, Kai
Sorry, let me resend the message with a subject; I just forgot it. Hi, when Kerberos authentication is used instead of the default simple method, is a Linux user account needed to run a MapReduce job for a principal? Why? For example, for a Kerberos principal

Hive Query with UDF

2012-10-17 Thread Sam Mohamed
I have some encrypted data in an HDFS CSV that I've created a Hive table for, and I want to run a Hive query that first encrypts the query param, then does the lookup. I have a UDF that does encryption as follows: public class ParamEncrypt extends UDF { public Text evaluate(String name)
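For reference, a minimal compilable version of that UDF; the encrypt() body here is a pure placeholder (a string reversal), since the real cipher was not shown in the message:

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

public class ParamEncrypt extends UDF {
    public Text evaluate(String name) {
        if (name == null) {
            return null;
        }
        // Stand-in for the real encryption routine, which was not shown.
        return new Text(encrypt(name));
    }

    private String encrypt(String clear) {
        // Placeholder only: reverse the string so the class compiles and runs.
        return new StringBuilder(clear).reverse().toString();
    }
}

Registered with ADD JAR and CREATE TEMPORARY FUNCTION, it can then be called directly in the WHERE clause; note that matching against stored ciphertext this way only works if the encryption is deterministic (the same input always produces the same output).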

RE: Hive Query with UDF

2012-10-17 Thread Sam Mohamed
Thanks for the quick response. The idea is that we are selling an encryption product to customers who use HDFS. Hence, encryption is a requirement. Any other suggestions? Sam From: Michael Segel [michael_se...@hotmail.com] Sent: Wednesday, October

Re: Hive Query with UDF

2012-10-17 Thread Michael Segel
You really don't want to do that. It becomes a nightmare in that you now ship a derivative of Hive and then have to maintain it and keep it in lockstep with Hive from Apache. There are other options and designs, but since this is for a commercial product, I'm not going to talk about them. Keep

Re: Hadoop on Isilon problem

2012-10-17 Thread Artem Ervits
With Isilon, there is no need for an hdfs-site configuration file. Isilon takes care of replication, although you can certainly add Hadoop replication. The biggest plus is the scalability of the storage layer. We keep a lot of our data on Isilon, and importing it into HDFS would result in two locations of

hadoop streaming with custom RecordReader class

2012-10-17 Thread Jason Wang
Hi all, I'm experimenting with hadoop streaming on build 1.0.3. To give background info, I'm streaming a text file into a mapper written in C. Using the default settings, streaming uses TextInputFormat, which creates one record from each line. The problem I am having is that I need record

Re: hadoop streaming with custom RecordReader class

2012-10-17 Thread Harsh J
Hi Jason, A few questions (in order): 1. Does Hadoop's own NLineInputFormat not suffice? http://hadoop.apache.org/docs/current/api/org/apache/hadoop/mapred/lib/NLineInputFormat.html 2. Do you make sure to pass your jar into the front-end too? $ export HADOOP_CLASSPATH=/path/to/your/jar $
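If the requirement turns out to be multi-line records (as opposed to N lines per mapper, which NLineInputFormat covers), one possible approach is a small old-API InputFormat along these lines, passed to streaming via -inputformat together with the classpath/jar steps above. This is an unverified sketch, and the multiline.lines.per.record property name is invented for illustration.

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileSplit;
import org.apache.hadoop.mapred.InputSplit;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.LineRecordReader;
import org.apache.hadoop.mapred.RecordReader;
import org.apache.hadoop.mapred.Reporter;

// Groups a fixed number of text lines into one streaming record.
public class MultiLineInputFormat extends FileInputFormat<LongWritable, Text> {

    @Override
    public RecordReader<LongWritable, Text> getRecordReader(
            InputSplit split, JobConf job, Reporter reporter) throws IOException {
        int linesPerRecord = job.getInt("multiline.lines.per.record", 4); // property name is made up
        return new MultiLineRecordReader(new LineRecordReader(job, (FileSplit) split), linesPerRecord);
    }

    static class MultiLineRecordReader implements RecordReader<LongWritable, Text> {
        private final LineRecordReader lines;
        private final int linesPerRecord;

        MultiLineRecordReader(LineRecordReader lines, int linesPerRecord) {
            this.lines = lines;
            this.linesPerRecord = linesPerRecord;
        }

        @Override
        public boolean next(LongWritable key, Text value) throws IOException {
            // Concatenate up to linesPerRecord lines into a single value.
            Text line = lines.createValue();
            StringBuilder record = new StringBuilder();
            int read = 0;
            while (read < linesPerRecord && lines.next(key, line)) {
                if (read > 0) {
                    record.append('\n');
                }
                record.append(line.toString());
                read++;
            }
            value.set(record.toString());
            return read > 0;
        }

        @Override public LongWritable createKey() { return lines.createKey(); }
        @Override public Text createValue() { return lines.createValue(); }
        @Override public long getPos() throws IOException { return lines.getPos(); }
        @Override public float getProgress() throws IOException { return lines.getProgress(); }
        @Override public void close() throws IOException { lines.close(); }
    }
}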