Re: Using a hard drive instead of

2012-10-12 Thread Ravi Prakash
Maybe at a slight tangent, but for each write operation on HDFS (e.g. create a file, delete a file, create a directory), the NN waits until the edit has been *flushed* to disk. So I can imagine such a hypothetical(?) disk would tremendously speed up the NN even as it is. Mark, can you please

Re: Getting the configuration object for a MapReduce job from the DataNode

2012-10-22 Thread Ravi Prakash
Hi Adrian, Please use user@hadoop.apache.org for user-related questions. Which version of Hadoop are you using? Where do you want the object? In a map/reduce task? For the currently executing job or for a different job? In 0.23, you can use the RM webservices.

Re: simple tutorial or procedure for configuring HDFS FEDERATION 0.23.3 in a cluster

2012-10-27 Thread Ravi Prakash
A simple search ought to have found this for you.     http://hadoop.apache.org/docs/r0.23.0/hadoop-yarn/hadoop-yarn-site/Federation.html From: Visioner Sadak visioner.sa...@gmail.com To: user@hadoop.apache.org Sent: Saturday, October 27, 2012 2:03 AM

Re: mapreduce.job.end-notification settings

2012-11-11 Thread Ravi Prakash
D-oh! Thanks for discovering this. Sorry for my silly mistake. Filed and patched https://issues.apache.org/jira/browse/MAPREDUCE-4786 .    Thanks Ravi From: Harsh J ha...@cloudera.com To: user@hadoop.apache.org Sent: Friday, November 9, 2012 2:24 PM

Re: Unexpected Hadoop behavior: map task re-running after reducer has been running

2013-03-11 Thread Ravi Prakash
This is not unexpected behavior. If there are fetch failures on the Reduce (i.e. it's not able to get the map outputs) then a map may be rerun. From: David Parks davidpark...@yahoo.com To: user@hadoop.apache.org user@hadoop.apache.org Sent: Monday, March 11,

Re: Job end notification does not always work (Hadoop 2.x)

2013-06-22 Thread Ravi Prakash
Hi Prashant, I would tend to agree with you. Although job-end notification is only a best-effort mechanism (i.e. we cannot always guarantee notification for example when the AM OOMs), I agree with you that we can do more. If you feel strongly about this, please create a JIRA and possibly
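The best-effort notification mechanism mentioned above is driven by a handful of job-level settings. A minimal sketch, assuming the standard `mapreduce.job.end-notification.*` properties (the URL below is a placeholder; values are illustrative):

```xml
<!-- mapred-site.xml (or per-job conf): best-effort job-end notification.
     $jobId and $jobStatus are substituted by the framework before the HTTP call. -->
<property>
  <name>mapreduce.job.end-notification.url</name>
  <value>http://example.com/notify?jobId=$jobId&amp;status=$jobStatus</value>
</property>
<property>
  <!-- how many times to retry the callback before giving up -->
  <name>mapreduce.job.end-notification.retry.attempts</name>
  <value>5</value>
</property>
<property>
  <!-- milliseconds between retries -->
  <name>mapreduce.job.end-notification.retry.interval</name>
  <value>1000</value>
</property>
```

As the reply notes, even with retries configured the notification is lost if the AM itself dies (e.g. an OOM), which is why it remains best-effort.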

Re: Job end notification does not always work (Hadoop 2.x)

2013-06-23 Thread Ravi Prakash
should send a notification by other means. On Sat, Jun 22, 2013 at 2:38 PM, Ravi Prakash ravi...@ymail.com wrote: Hi Prashant, I would tend to agree with you. Although job-end notification is only a best-effort mechanism (i.e. we cannot always guarantee notification for example when the AM

Re: MapReduce job not running - i think i keep all correct configuration.

2013-06-23 Thread Ravi Prakash
Hi Pavan, I assure you this configuration works. The problem is very likely in your configuration files. Please look them over once again. Also did you restart your daemons after changing the configuration? Some configurations necessarily require a restart. Ravi.

Re: 答复: Help about build cluster on boxes which already has one?

2013-06-25 Thread Ravi Prakash
What version of Hadoop are you planning on using? You will probably have to partition the resources too. e.g. If you are using 0.23 / 2.0, the NM available resources memory will have to be split on all the nodes From: Sandeep L sandeepvre...@outlook.com To:

Re: intermediate results files

2013-07-02 Thread Ravi Prakash
Hi John! If your block is going to be replicated to three nodes, then in the default block placement policy, 2 of them will be on the same rack, and a third one will be on a different rack. Depending on the network bandwidths available intra-rack and inter-rack, writing with replication

Re: Extra start-up overhead with hadoop-2.1.0-beta

2013-08-07 Thread Ravi Prakash
I believe https://issues.apache.org/jira/browse/MAPREDUCE-5399 causes performance degradation in cases where there are a lot of reducers. I can imagine it causing degradation if the configuration files are super big / some other weird cases. From: Krishna

Re: Yahoo provided hadoop tutorial having issue with eclipse configuration

2013-09-09 Thread Ravi Prakash
Kiran, hadoop-0.18 is a VERY old version (Probably 5 years old). Please consider trying out a newer version. You can follow these steps inside a VM to get a single node cluster running: https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/SingleCluster.html HTH Ravi.

Re: assign tasks to specific nodes

2013-09-09 Thread Ravi Prakash
http://lucene.472066.n3.nabble.com/Assigning-reduce-tasks-to-specific-nodes-td4022832.html From: Mark Olimpiati markq2...@gmail.com To: user@hadoop.apache.org Sent: Friday, September 6, 2013 1:47 PM Subject: assign tasks to specific nodes Hi guys,     I'm

Re: Is there any way to partially process HDFS edits?

2013-09-25 Thread Ravi Prakash
Tom! I would guess that just giving the NN JVM lots of memory (64Gb / 96Gb) should be the easiest way. From: Tom Brown tombrow...@gmail.com To: user@hadoop.apache.org user@hadoop.apache.org Sent: Wednesday, September 25, 2013 11:29 AM Subject: Is there any

Re: Uploading a file to HDFS

2013-10-01 Thread Ravi Prakash
Karim! Look at DFSOutputStream.java:DataStreamer HTH Ravi From: Karim Awara karim.aw...@kaust.edu.sa To: user user@hadoop.apache.org Sent: Thursday, September 26, 2013 7:51 AM Subject: Re: Uploading a file to HDFS Thanks for the reply. when the client

Re: modify HDFS

2013-10-01 Thread Ravi Prakash
Karim! You should read BUILDING.txt . I usually generate the eclipse files using mvn eclipse:eclipse Then I can import all the projects into eclipse as eclipse projects. This is useful for code navigation and completion etc.; however, I still build using the command line: mvn -Pdist

Re: modify HDFS

2013-10-02 Thread Ravi Prakash
environment)? -- Best Regards, Karim Ahmed Awara On Wed, Oct 2, 2013 at 1:13 AM, Ravi Prakash ravi...@ymail.com wrote: Karim! You should read BUILDING.txt . I usually generate the eclipse files using mvn eclipse:eclipse Then I can import all the projects into eclipse as eclipse projects

Re: datanode tuning

2013-10-07 Thread Ravi Prakash
To: common-u...@hadoop.apache.org common-u...@hadoop.apache.org; Ravi Prakash ravi...@ymail.com Sent: Monday, October 7, 2013 5:55 AM Subject: Re: datanode tuning Thanks Ravi. The number of nodes isn't a lot but the size is rather large. Each data node has about 14-16T (560-640T). For the datanode

Re: Yarn never use TeraSort#TotalOrderPartitioner when run TeraSort job?

2013-10-18 Thread Ravi Prakash
Sam, I would guess that the jar file you think is running, is not actually the one. I am guessing that in the task classpath, there is a normal jar file (without your changes) which is being picked up before your modified jar file. On Thursday, October 17, 2013 10:13 PM, sam liu

Re: issure about different heapsize on namenode and datanode

2013-10-18 Thread Ravi Prakash
Hi! You can go to the JMX page: http://namenode:50070/jmx to find out what the heap memory usage is. Yes, we know that there is a problem in the scripts. I believe it's being handled as part of https://issues.apache.org/jira/browse/HADOOP-9902 On Friday, October 18, 2013 2:07 AM, ch

Re: ResourceManager webapp code runs OOM

2013-10-22 Thread Ravi Prakash
Hi Prashant! You can set yarn.resourcemanager.max-completed-applications in yarn-site.xml of the RM to limit the maximum number of apps it keeps track of (It defaults to 1). You're right that the Heap may also be increased. HTH Ravi On Monday, October 21, 2013 5:54 PM, Prashant Kommireddi
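A minimal sketch of the setting the reply refers to (the value is illustrative; pick a cap that fits your RM heap):

```xml
<!-- yarn-site.xml on the ResourceManager: cap how many completed
     applications the RM retains in memory and shows in its web UI. -->
<property>
  <name>yarn.resourcemanager.max-completed-applications</name>
  <value>1000</value>
</property>
```

Lowering this trades web-UI history for RM memory, which is exactly the OOM relief being discussed.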

Re: Hadoop core jar class update

2013-10-24 Thread Ravi Prakash
Viswanathan, What version of Hadoop are you using? What is the change? On Wednesday, October 23, 2013 2:20 PM, Viswanathan J jayamviswanat...@gmail.com wrote: Hi guys, If I update(very small change) the hadoop-core mapred class for one of the OOME patch and compiled the jar. If I deploy

Re: hadoop2.2.0 compile Failed - no such instruction

2013-10-24 Thread Ravi Prakash
Hi Rico! What was the command line you used to build? On Wednesday, October 23, 2013 11:44 PM, codepeak gcodep...@gmail.com wrote: Hi all,        I have a problem when compile the hadoop 2.2.0, the apache only offers 32bit distribution, but I need 64bit, so I have to compile it myself. My

Re: Trash in yarn

2013-10-28 Thread Ravi Prakash
Hi Siddharth, The Trash feature is enabled by setting fs.trash.interval . I'm not sure about your question on hive. What do you mean by the trash helping with dropped tables? On Friday, October 25, 2013 3:08 AM, Siddharth Tiwari siddharth.tiw...@live.com wrote: How can I enable trash in
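The Trash feature mentioned in the reply is enabled with a single property. A minimal sketch (the 1440-minute value is illustrative):

```xml
<!-- core-site.xml: keep deleted files in each user's .Trash directory
     for 1440 minutes (24 hours) before they are permanently removed.
     0 (the default) disables trash entirely. -->
<property>
  <name>fs.trash.interval</name>
  <value>1440</value>
</property>
```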

Re: Reduce Merge Memtomem Parameter

2013-10-28 Thread Ravi Prakash
Hi! This parameter triggers a sort of fetched map outputs on the reducer node when the number of in-memory map outputs exceeds memToMemMergeOutputsThreshold . It is disabled by default. I am guessing this was put in on the premise that it might be faster to sort a smaller number of streams even in

Re: How / when does On-disk merge work?

2013-10-28 Thread Ravi Prakash
Hi! Tom White's Hadoop: The Definitive Guide is probably the best source for information on this (apart from the code itself ;-) Look at MergeManagerImpl.java btw in case you are so inclined). HTH Ravi   On Friday, October 25, 2013 2:36 PM, - commodor...@ymail.com wrote: Hi All, Can

Re: How to open .gv file for Yarn event model

2014-04-02 Thread Ravi Prakash
Hi Azuryy! You have to use dot to convert it to png On Tuesday, April 1, 2014 6:38 PM, Azuryy Yu azury...@gmail.com wrote: Hi, I compiled Yarn event model using maven, but how to open .gv file to view it? Thanks.

Re: speed of replication for under replicated blocks by namenode

2014-05-13 Thread Ravi Prakash
Hi Chandra! Replication is done according to priority (e.g. a block where only 1 replica out of 3 remains is higher priority than one where 2 out of 3 remain). Every time a DN heartbeats into the NN, it *may* be assigned some replication work according to

Re: LVM to JBOD conversion without data loss

2014-05-13 Thread Ravi Prakash
One way I can think of is decommissioning the nodes and then basically re-imaging them however you want to. Is that not an option? On 05/12/14 00:18, Bharath Kumar wrote: Hi I have a query regarding JBOD, is it possible to migrate from LVM to JBOD

Re: enable regular expression on which parameter?

2014-05-13 Thread Ravi Prakash
Avinash! That JIRA is still open and does not seem to have been fixed. There are a lot of issues with providing regexes though. A long standing issue has been https://issues.apache.org/jira/browse/HDFS-13 which makes it even harder HTH Ravi On

Re: Can anyone help me resolve this Error: unable to create new native thread

2014-08-14 Thread Ravi Prakash
Hi Chris! When is this error caused? Which logs do you see this in? Are you sure you are setting the ulimit for the correct user? What application are you trying to run which is causing you to run up against this limit? HTH Ravi On Saturday, August 9, 2014 6:07 AM, Chris MacKenzie

Re: Seek behavior difference between NativeS3FsInputStream and DFSInputStream

2014-11-02 Thread Ravi Prakash
Hi Venkata! Please feel free to open a JIRA and upload a patch. You might also try the new s3a implementation (instead of s3n), but there's a chance that the behavior will be the same. Cheers Ravi On 10/31/14 03:23, Ravuri, Venkata Puneet wrote:

Re: Reliability of timestamps in logs

2015-01-26 Thread Ravi Prakash
Are you running NTP? On Friday, January 23, 2015 12:42 AM, Fabio anyte...@gmail.com wrote: Hi guys, while analyzing SLS logs I noticed some unexpected behaviors, such as resources requests sent before the AM container gets to a RUNNING state. For this reason I started wondering how

Re: yarn jobhistory server not displaying all jobs

2015-01-26 Thread Ravi Prakash
Hi Matt! Take a look at the mapreduce.jobhistory.* configuration parameters here for the delay in moving finished jobs to the HistoryServer: https://hadoop.apache.org/docs/stable/hadoop-mapreduce-client/hadoop-mapreduce-client-core/mapred-default.xml I've seen this error hadoop is not allowed

Re: NN config questions

2015-01-26 Thread Ravi Prakash
Hi Dave! Here is the class which is used to store all the edits: https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLog.java#L575 HTH Ravi On Monday, January 26, 2015 10:32 AM, dlmar...@comcast.net

Re: Launching Hadoop map reduce job from a servlet

2015-01-21 Thread Ravi Prakash
Hi Rab! I think you have a comma in between mapreduce and framework.name where it should be a period. You can also try looking at the job's logs to see if the configuration for mapreduce.framework.name was indeed passed or not. HTH On Friday, January 16, 2015 9:55 AM, rab ra

Re: hadoop cluster with non-uniform disk spec

2015-02-11 Thread Ravi Prakash
Hi Chen! Are you running the balancer? What are you setting dfs.datanode.available-space-volume-choosing-policy.balanced-space-threshold and dfs.datanode.available-space-volume-choosing-policy.balanced-space-preference-fraction to? On Wednesday, February 11, 2015 7:44 AM, Chen Song
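The two properties asked about belong to the available-space volume choosing policy, which helps DataNodes with non-uniform disks. A hedged sketch (values are illustrative, not recommendations):

```xml
<!-- hdfs-site.xml on DataNodes: prefer volumes with more free space
     when choosing where to place new block replicas. -->
<property>
  <name>dfs.datanode.fsdataset.volume.choosing.policy</name>
  <value>org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy</value>
</property>
<property>
  <!-- volumes whose free space differs by less than this many bytes
       (here 10 GB) are treated as balanced -->
  <name>dfs.datanode.available-space-volume-choosing-policy.balanced-space-threshold</name>
  <value>10737418240</value>
</property>
<property>
  <!-- fraction of new blocks directed to the volumes with more free space -->
  <name>dfs.datanode.available-space-volume-choosing-policy.balanced-space-preference-fraction</name>
  <value>0.75</value>
</property>
```

Note the balancer evens out data across DataNodes, while this policy evens out data across disks within one DataNode; the two address different imbalances.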

Re: Reliability of timestamps in logs

2015-01-27 Thread Ravi Prakash
you mean there is a chance it's updating the clock while the job is running? Regards Fabio On 01/26/2015 08:00 PM, Ravi Prakash wrote: Are you running NTP? On Friday, January 23, 2015 12:42 AM, Fabio anyte...@gmail.com wrote: Hi guys, while analyzing SLS logs I

Re: HDFS balancer ending with error

2015-01-29 Thread Ravi Prakash
You can try running fsck. On Thursday, January 29, 2015 6:34 AM, Juraj jiv fatcap@gmail.com wrote: Hello, I noticed our CDH5 cluster is not balanced and balancer role is missing. So i added via cloudera manager new balancer role and clicked on rebalance in hdfs menu. After

Re: How to debug why example not finishing (or even starting)

2015-01-29 Thread Ravi Prakash
Do you have capacity on your cluster? Did you submit it to the right queue? Go to your scheduler page: http://YOUR-RESOURCE-MANAGER-HOSTNAME:8088/cluster/scheduler On Thursday, January 29, 2015 4:48 AM, Frank Lanitz frank.lan...@sql-ag.de wrote: Hi, Sorry for spamming the list. ;)

Re: Unit tests on Hadoop Cluster

2015-01-05 Thread Ravi Prakash
Hi Kamal! Thanks for your initiative. Please take a look at MiniDFSCluster / MiniJournalCluster / MiniYarnCluster etc. In your unit tests you can essentially start a cluster in a single JVM. You can look at TestQJMWithFaults.java HTH Ravi On Sunday, January 4, 2015 10:09 PM, kamaldeep

Re: Kill one node on map start

2015-02-09 Thread Ravi Prakash
In unit tests MiniMRYarnCluster is used to do this kind of stuff. On Friday, February 6, 2015 3:51 AM, Telles Nobrega tellesnobr...@gmail.com wrote: Hi, I'm working on a experiment and I need to do something like, start a hadoop job (wordcount, terasort, pi) and let the application

Re: missing data blocks after active name node crashes

2015-02-10 Thread Ravi Prakash
Hi Chen! From my understanding, every operation on the Namenode is logged (and flushed) to disk / QJM / shared storage. This includes the addBlock operation. So when a client requests to write a new block, the metadata is logged by the active NN, so even if it crashes later on, the new active

Re: execute job in a remote jobtracker in YARN?

2015-02-13 Thread Ravi Prakash
Hi! There is no JobTracker in YARN. There is an ApplicationMaster. And there is a ResourceManager. Which do you mean? You can use the ResourceManager REST API to submit new applications

Re: RM doesn't show which host is submitted job

2015-02-05 Thread Ravi Prakash
Hi Nur! Thanks for your report. Please feel free to open a JIRA in the YARN project for this: https://issues.apache.org/jira/browse/YARN A patch would be great. Look at ClientRMService.submitApplication() Cheers Ravi On Wednesday, February 4, 2015 9:34 PM, Nur Kholis Majid

Re: Unable to see application in http://localhost:8088/cluster/apps

2015-03-17 Thread Ravi Prakash
Perhaps yarn.resourcemanager.max-completed-applications ?   On Tuesday, March 17, 2015 10:02 AM, hitarth trivedi t.hita...@gmail.com wrote: Hi, When I submit a job to yarn ResourceManager the job is successful, and even the Apps Submitted, Apps Running, Apps Completed counters

Re: can block size for namenode be different from datanode block size?

2015-03-25 Thread Ravi Prakash
Hi Mich! The block size you are referring to is used only on the datanodes. The file that the namenode writes (fsimage OR editlog) is not chunked using this block size. HTH Ravi On Wednesday, March 25, 2015 8:12 AM, Dr Mich Talebzadeh m...@peridale.co.uk wrote: Hi, The block

Re: recoverLeaseInternal: why current leasehonder is forbidden to append the file?

2015-02-24 Thread Ravi Prakash
Hi Dmitry! I suspect it's because we don't want two streams from the same DFSClient to write to the same file. The Lease.holder is a simple string which corresponds usually to DFSClient_someid . HTH Ravi. On Tuesday, February 24, 2015 12:12 AM, Dmitry Simonov dimmobor...@gmail.com wrote:

Re: suspend and resume a job in execution?

2015-02-20 Thread Ravi Prakash
I am not aware of an API that would let you do this. You may be able to move an application to a queue with 0 resources to achieve the desired behavior but I'm not entirely sure. On Wednesday, February 18, 2015 9:24 AM, xeonmailinglist xeonmailingl...@gmail.com wrote: By job, I

Re: YARN and LinuxContainerExecutor in simple security mode

2015-07-06 Thread Ravi Prakash
- as long as I don't explicitly allow for this using   hadoop.proxyuser.username.groups   hadoop.proxyuser.username.hosts user processes spawned by yarn on worker nodes will always run with the uid of that user. Is that right?   Thanks,   Tomasz On 29.06.2015 at 21:43, Ravi Prakash wrote: Hi

Re: Web Address appears to be ignored

2015-05-19 Thread Ravi Prakash
Ewan! This sounds like a bug. Please open a JIRA. ThanksRavi On Tuesday, May 19, 2015 8:09 AM, Ewan Higgs ewan.hi...@ugent.be wrote: Hi all, I am setting up a Hadoop cluster where the nodes have FQDNames inside the cluster, but the DNS where these names are registered is behind

Re: Unable to pass complete tests on 2.7.1

2015-08-17 Thread Ravi Prakash
Hi Tucker! Sadly, unit tests failing is usual for hadoop builds. You can use -DskipTests to build without running unit tests, or -fn (fail-never) to continue despite failures. The maven-plugin helps us manage generated source code (e.g. protobuf files generate more java files which need to be

Re: hdfs: weird lease expiration issue

2015-08-21 Thread Ravi Prakash
Hi Bogdan! This is because the second application attempt appears to HDFS as a new client. Are you sure the second client experienced write errors because *its* lease was removed? Yongjun has a great writeup : http://blog.cloudera.com/blog/2015/02/understanding-hdfs-recovery-processes-part-1/

Re: YARN and LinuxContainerExecutor in simple security mode

2015-06-29 Thread Ravi Prakash
Hi Tomasz! It is tricky to set up, but there are no implications to security if you configure it correctly. Please read the discussion on [YARN-2424] LCE should support non-cgroups, non-secure mode - ASF JIRA HTH Ravi

Re: Documentation inconsistency about append write in HDFS

2015-08-03 Thread Ravi Prakash
Thanks Thanh! Yes! Could you please post a patch? On Sunday, August 2, 2015 8:50 PM, Thanh Hong Dai hdth...@tma.com.vn wrote:

Re: Web based file manager for HDFS?

2015-07-27 Thread Ravi Prakash
Hi Caesar! I'm going to try to get that functionality as part of [HDFS-7588] Improve the HDFS Web UI browser to allow chowning / chmoding, creating dirs and uploading files - ASF JIRA in the next 2 months. Ravi

Re: Usage of data node to run on commodity hardware

2016-06-07 Thread Ravi Prakash
Hi Krishna! I don't see why you couldn't start Hadoop in this configuration. Performance would obviously be suspect. Maybe by configuring your network topology script, you could even improve the performance. Most mobiles use ARM processors. I know some cool people ran Hadoop v1 on Raspberry Pis

Re: HDFS Federation

2016-06-06 Thread Ravi Prakash
Perhaps use the "viewfs://" protocol prepended to your path? On Sun, Jun 5, 2016 at 1:10 PM, Kun Ren wrote: > Hi Genius, > > I just configured HDFS Federation, and try to use it(2 namenodes, one is > for /my, another is for /your). When I run the command: > hdfs dfs -ls /,
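To make `hdfs dfs -ls /` show both federated namespaces under one root, clients need a viewfs mount table. A minimal sketch, assuming two NameNodes serving /my and /your (hostnames and the cluster name are placeholders):

```xml
<!-- core-site.xml on the client: present the two federated namespaces
     as one filesystem rooted at viewfs://clusterX/ -->
<property>
  <name>fs.defaultFS</name>
  <value>viewfs://clusterX</value>
</property>
<property>
  <!-- /my resolves to the first namenode's namespace -->
  <name>fs.viewfs.mounttable.clusterX.link./my</name>
  <value>hdfs://namenode1:8020/my</value>
</property>
<property>
  <!-- /your resolves to the second namenode's namespace -->
  <name>fs.viewfs.mounttable.clusterX.link./your</name>
  <value>hdfs://namenode2:8020/your</value>
</property>
```

With this in place, `hdfs dfs -ls /` lists the mount points rather than only one NameNode's root.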

Re: HDFS in Kubernetes

2016-06-06 Thread Ravi Prakash
Klaus! Good luck with your attempt to run HDFS inside Kubernetes! Please keep us posted. For creating a new file, a DFSClient : 1. First calls addBlock on the NameNode.

Re: Upgrading production hadoop system from 2.5.1 to 2.7.2

2016-03-24 Thread Ravi Prakash
y important to us. We cannot tolerate any data loss with > the update. Do you remember how long it took for you to upgrade it from > 2.4.1 to 2.7.1 ? > > Thanks, > Chathuri > > On Wed, Mar 23, 2016 at 7:09 PM, Ravi Prakash <ravihad...@gmail.com> > wrote: > >&g

Re: No edits files in dfs.namenode.edits.dir

2016-05-19 Thread Ravi Prakash
No! You are probably writing the edits file somewhere still. An `lsof` on the namenode process may be more revealing. Obviously this depends on configuration, but unless you have some really crazy settings, I'm pretty sure the edits would be persisted to disk. On Wed, May 18, 2016 at 2:47 AM,

Re: Regarding WholeInputFileFormat Java Heap Size error

2016-05-12 Thread Ravi Prakash
Shubh! You can perhaps introduce an artificial delay in your map task and then take a JAVA heap dump of the MapTask JVM to analyze where the memory is going. It's hard to speculate otherwise. On Wed, May 11, 2016 at 10:15 PM, Shubh hadoopExp wrote: > > > > Hi All, > >

Re: New cluster help

2016-07-14 Thread Ravi Prakash
Hi Tombin! Is this the first cluster you're ever setting up? Are you able to run an "hdfs dfs -ls /" successfully? How about putting files into HDFS? I'd take it one step at a time if I were you. i.e. 1. Set up a simple HDFS cluster (without SSL) 2. Turn on SSL 3. Then try to run HBase. Is step

Re: Node Manager crashes with OutOfMemory error

2016-07-26 Thread Ravi Prakash
Hi Rahul! Which version of Hadoop are you using? What non-default values of configuration are you setting? You can set HeapDumpOnOutOfMemoryError on the command line while starting up your nodemanagers and see the resulting heap dump in Eclipse MAT / jvisualvm / yourkit to see where are the

Re: Where's official Docker image for Hadoop?

2016-07-20 Thread Ravi Prakash
Would something like this be useful as a starting point? https://github.com/apache/hadoop/tree/trunk/dev-support/docker (this is checked into apache/trunk) The DockerContainerExecutor was an alpha feature that didn't really get much traction and is not what you think it is. (If configured on the

Re: Building a distributed system

2016-07-18 Thread Ravi Prakash
Welcome to the community Richard! I suspect Hadoop can be more useful than just splitting and stitching back data. Depending on your use cases, it may come in handy to manage your machines, restart failed tasks, scheduling work when data becomes available etc. I wouldn't necessarily count it out.

Re: unsubscribe

2016-07-05 Thread Ravi Prakash
Please send an email to user-unsubscr...@hadoop.apache.org On Wed, Jun 29, 2016 at 8:02 AM, Bob Krier wrote: > >

Re: unsubscribe

2016-07-05 Thread Ravi Prakash
Please send an email to user-unsubscr...@hadoop.apache.org On Wed, Jun 29, 2016 at 8:04 AM, Mike Rapuano wrote: > > > -- > > > Michael Rapuano > > Dev/Ops Engineer > > 617-498-7800 | 617-468-1774 > > 25 Drydock Ave > > Boston, MA 02210 > >

Re: MapReduce Job State: PREP over 8 hours, state no change

2016-08-08 Thread Ravi Prakash
That's unusual. Are you able to submit a simple sleep job? You can do this using: yarn jar $HADOOP_PREFIX/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-*-tests.jar sleep -m 1 -r 1 This should finish it in under a minute. Otherwise I'd suspect that your cluster is misconfigured. HTH

Re: Yarn web UI shows more memory used than actual

2016-08-15 Thread Ravi Prakash
Hi Suresh! YARN's accounting for memory on each node is completely different from the Linux kernel's accounting of memory used. e.g. I could launch a MapReduce task which in reality allocates just 100 Mb, and tell YARN to give it 8 Gb. The kernel would show the memory requested by the task, the

Re: Confusion between dfs.replication and dfs.namenode.replication.min options in hdfs-site.xml

2017-02-02 Thread Ravi Prakash
Hi Andrey! Your assumption is absolutely correct. dfs.namenode.replication.min is what you should set to 2 in your case. You should also look at dfs.client.block.write.replace-datanode-on-failure.policy, dfs.client.block.write.replace-datanode-on-failure.enable and
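A hedged sketch of the distinction the reply draws, for Andrey's case (values match the scenario in the question; adjust to your durability needs):

```xml
<!-- hdfs-site.xml: target 3 replicas per block, but consider a write
     successful once at least 2 replicas have reached datanodes.
     The NameNode later re-replicates up to dfs.replication in the background. -->
<property>
  <name>dfs.replication</name>
  <value>3</value>
</property>
<property>
  <name>dfs.namenode.replication.min</name>
  <value>2</value>
</property>
```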

Re: Some Questions about Node Manager Memory Used

2017-01-24 Thread Ravi Prakash
Hi Zhuo Chen! Yarn has a few methods to account for memory. By default, it is guaranteeing your (hive) application a certain amount of memory. It depends totally on the application whether it uses all of that memory, or as in your case, leaves plenty of headroom in case it needs to expand in the

Re: WordCount MapReduce error

2017-02-23 Thread Ravi Prakash
00GB free space on that drive > so space shouldn't be the issue. > 3. The application has all the required permissions. > > Additionally, something I've tested is that if I set the number of reduce > tasks in the WordCount.java file to 0 (job.setNumReduceTask = 0) then I get > the

Re: WordCount MapReduce error

2017-02-22 Thread Ravi Prakash
Hi Vasil! It seems like the WordCount application is expecting to open the intermediate file but failing. Do you see a directory under D:/tmp/hadoop-Vasil Grigirov/ . I can think of a few reasons. I'm sorry I am not familiar with the Filesystem on Windows 10. 1. Spaces in the file name are not

Re: WordCount MapReduce error

2017-02-22 Thread Ravi Prakash
he WordCount.java file to 0 (job.setNumReduceTask = 0) then I get > the success files for the Map task in my output directory. So the Map tasks > work fine but the Reduce is messing up. Is it possible that my build is > somewhat incorrect even though it said everything was successfully bu

Re: HDFS Shell tool

2017-02-09 Thread Ravi Prakash
Great job Vity! Thanks a lot for sharing. Have you thought about using WebHDFS? Thanks Ravi On Thu, Feb 9, 2017 at 7:12 AM, Vitásek, Ladislav wrote: > Hello Hadoop fans, > I would like to inform you about our tool we want to share. > > We created a new utility - HDFS Shell

Re: HDFS Shell tool

2017-02-10 Thread Ravi Prakash
Ravi, > I am glad you like it. > Why should I use WebHDFS? Our cluster sysops, include me, prefer command > line. :-) > > -Vity > > 2017-02-09 22:21 GMT+01:00 Ravi Prakash <ravihad...@gmail.com>: > >> Great job Vity! >> >> Thanks a lot for sharing. H

Re: Yarn containers creating child process

2017-02-13 Thread Ravi Prakash
Hi Sandesh! A *yarn* task is just like any other process on the operating system. Depending on which ContainerExecutor you use, you should launch the yarn task with appropriate limits in place. Although I have never tried it, on Linux you could use setrlimit or

Re: How to fix "HDFS Missing replicas"

2017-02-13 Thread Ravi Prakash
Hi Ascot! Just out of curiosity, which version of hadoop are you using? fsck has some other options (e.g. -blocks will print out the block report too, -list-corruptfileblocks prints out the list of missing blocks and files they belong to) . I suspect you may also want to specify the

Re: HDFS fsck command giving health as corrupt for '/'

2017-02-16 Thread Ravi Prakash
Hi Nishant! I'd suggest reading the HDFS user guide to begin with and becoming familiar with the architecture. https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html . Where are the blocks stored on the datanodes? Were they on persistent storage on the EC2

Re: YARN Resource Allocation When Memory is Very Small

2016-08-30 Thread Ravi Prakash
Hi Nico! The RM is configured with a minimum allocation. Take a look at "yarn.scheduler.minimum-allocation-mb" . You can also read through code here:
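The rounding-up behavior the reply describes is controlled in the RM's yarn-site.xml. A minimal sketch (the value is illustrative):

```xml
<!-- yarn-site.xml on the ResourceManager: any container request smaller
     than this is rounded up to 512 MB, so "very small" asks still
     consume at least one minimum allocation. -->
<property>
  <name>yarn.scheduler.minimum-allocation-mb</name>
  <value>512</value>
</property>
```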

Re: Capacity scheduler for yarn oin 2.7.3 - problem with job scheduling to created queue.

2016-11-08 Thread Ravi Prakash
Hi Rafal! Have you been able to launch the job successfully first without configuring node-labels? Do you really need node-labels? How much total memory do you have on the cluster? Node labels are usually for specifying special capabilities of the nodes (e.g. some nodes could have GPUs and your

Re: Yarn 2.7.3 - capacity scheduler container allocation to nodes?

2016-11-09 Thread Ravi Prakash
Hi Rafal! Have you been able to launch the job successfully first without configuring node-labels? Do you really need node-labels? How much total memory do you have on the cluster? Node labels are usually for specifying special capabilities of the nodes (e.g. some nodes could have GPUs and your

Re: How to mount HDFS as a local file system?

2016-11-10 Thread Ravi Prakash
Or you could use NFS https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-hdfs/HdfsNfsGateway.html . In our experience, both of them still need some work for stability and correctness. On Thu, Nov 10, 2016 at 10:00 AM, wrote: > Fuse is your tool: > >

Re: Fw:Re:How to add custom field to hadoop MR task log?

2016-11-04 Thread Ravi Prakash
Hi Maria! You have to be careful which log4j.properties file is on the classpath of the task which was launched. Often times there are multiple log4j.properties file, perhaps in the classpaths or in one of the jars on the classpath. Are you sure the log4j.properties file you edited is the only

Re: Bug in ORC file code? (OrcSerde)?

2016-10-19 Thread Ravi Prakash
Michael! Although there is a little overlap in the communities, I strongly suggest you email u...@orc.apache.org ( https://orc.apache.org/help/ ) I don't know if you have to be subscribed to a mailing list to get replies to your email address. Ravi On Wed, Oct 19, 2016 at 11:29 AM, Michael

Re: HDFS Replication Issue

2016-10-12 Thread Ravi Prakash
Hi Eric! Did you follow https://hadoop.apache.org/docs/current2/hadoop-project-dist/hadoop-common/SingleCluster.html to set up your single node cluster? Did you set dfs.replication in hdfs-site.xml ? The logs you posted don't have enough information to debug the issue. *IF* everything has been
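For the single-node case the reply asks about, the usual fix is to drop the replication factor to match the number of DataNodes. A minimal sketch:

```xml
<!-- hdfs-site.xml: a single-node cluster can only hold one replica per
     block; leaving the default of 3 makes every block under-replicated. -->
<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>
```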

Re: Hadoop: precomputing data

2016-10-12 Thread Ravi Prakash
I guess one of the questions is what is your false negative rate in Approach 1 Step 1? Of course, if you are limited by resources you may have to go with Approach 1. On Thu, Oct 6, 2016 at 6:14 AM, venito camelas wrote: > I'm designing a prototype using *Hadoop* for

Re: HDFS Issues.

2016-10-12 Thread Ravi Prakash
There are a few conditions for the Namenode to come out of safemode. # Number of datanodes, # Number of blocks that have been reported. How many blocks have the datanodes reported? On Tue, Oct 4, 2016 at 1:22 PM, Steve Brenneis wrote: > I have an HDFS cluster of three

Re: Newbie Ambari Question

2016-10-12 Thread Ravi Prakash
I suspect https://ambari.apache.org/mail-lists.html may be more useful. On Thu, Oct 6, 2016 at 2:45 AM, Deepak Goel wrote: > > Hey > > Namaskara~Nalama~Guten Tag~Bonjour > > Sorry, Is this the right forum for asking a question "Ambari Hadoop > Installation" from Hortonworks?

Re: file permission issue

2016-10-17 Thread Ravi Prakash
Hi! https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceLocalizationService.java#L1524 Just fyi, there are different kinds of

Re: Does the JobHistoryServer register itself with ZooKeeper?

2016-11-16 Thread Ravi Prakash
Are you talking about the Mapreduce JobHistoryServer? I am not aware of it needing Zookeeper for anything. What gave you that impression? On Wed, Nov 16, 2016 at 11:32 AM, Benson Qiu wrote: > I'm looking for a way to check for connectivity to the JobHistoryServer. > >

Re: unsubscribe

2016-10-31 Thread Ravi Prakash
Please email user-unsubscr...@hadoop.apache.org On Mon, Oct 31, 2016 at 2:29 AM, 风雨无阻 <232341...@qq.com> wrote: > unsubscribe >

Re: why the default value of 'yarn.resourcemanager.container.liveness-monitor.interval-ms' in yarn-default.xml is so high?

2016-11-03 Thread Ravi Prakash
at happens if RM finds that one NM's heartbeat is missing but it is not > 10 min yet (yarn.nm.liveness-monitor.expiry-interval-ms time is not > expired yet) > Will a new application still make container request to that NM via RM? > > Thanks > Tanvir > > > > > > O

Re: Issue in Rollback (after rolling upgrade) from hadoop 2.7.2 to 2.5.2

2016-10-13 Thread Ravi Prakash
Hi Dinesh! This is obviously a very hazardous situation you are in (if your data is important), so I'd suggest moving carefully. Make as many backups of as many things you can. The usual mechanism that Hadoop uses when upgrading is to rename directories of the old format and keep them around

Re: Where does Hadoop get username and group mapping from for linux shell username and group mapping?

2016-10-14 Thread Ravi Prakash
Chen! It gets it from whatever is configured on the Namenode. https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-hdfs/HdfsPermissionsGuide.html#Group_Mapping HTH Ravi On Thu, Oct 13, 2016 at 7:43 PM, chen dong wrote: > Hi, > > Currently I am working on a

Re: hadoop cluster container memory limit

2016-10-14 Thread Ravi Prakash
Hi! Look at yarn.nodemanager.resource.memory-mb in https://hadoop.apache.org/docs/r2.7.2/hadoop-yarn/hadoop-yarn-common/yarn-default.xml I'm not sure how 11.25Gb comes in. How did you deploy the cluster? Ravi On Thu, Oct 13, 2016 at 9:07 PM, agc studio wrote: > Hi
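The property the reply points to is set per NodeManager. A minimal sketch (8192 MB is illustrative; size it below the machine's physical RAM to leave room for the OS and daemons):

```xml
<!-- yarn-site.xml on each NodeManager: total memory this node
     advertises to the ResourceManager for running containers. -->
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>8192</value>
</property>
```

If a deployment tool (e.g. a cluster manager) set this for you, an unexpected figure like 11.25 GB usually traces back to its auto-tuning rather than the Hadoop default.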
