Re: PendingDeletionBlocks immediately after Namenode failover

2017-11-13 Thread Ravi Prakash
Hi Michael! Thank you for the report. I'm sorry I don't have advice other than the generic advice, like please try a newer version of Hadoop (say Hadoop-2.8.2) . You seem to already know that the BlockManager is the place to look. If you found it to be a legitimate issue which could affect

Re: Vulnerabilities to UserGroupInformation / credentials in a Spark Cluster

2017-10-31 Thread Ravi Prakash
oop source code submitted > through the Yarn Client. This is an issue for map reduce as well. eg: > https://pravinchavan.wordpress.com/2013/04/25/223/ > > On Mon, Oct 30, 2017 at 1:15 PM, Ravi Prakash <ravihad...@gmail.com> > wrote: > >> Hi Blaze! >> >>

Re: Unable to append to a file in HDFS

2017-10-31 Thread Ravi Prakash
e eternal lease on my first file. >> >> Thanks again for your time. >> >> -Tarik >> >> On Mon, Oct 30, 2017 at 2:19 PM, Ravi Prakash <ravihad...@gmail.com> >> wrote: >> >>> Hi Tarik! >>> >>> You're welcome! If you look

Re: Unable to append to a file in HDFS

2017-10-30 Thread Ravi Prakash
lo Ravi - > > Thank you for your response. I have read about the soft and hard lease > limits, however no matter how long I wait I am never able to write again to > the file that I first created and wrote to the first time. > > Thanks again. > > -Tarik > > On Mo

Re: Vulnerabilities to UserGroupInformation / credentials in a Spark Cluster

2017-10-30 Thread Ravi Prakash
Hi Blaze! Thanks for digging into this. I'm sure security related features could use more attention. Tokens for one user should be isolated from other users. I'm sorry I don't know how spark uses them. Would this question be more appropriate on the spark mailing list?

Re: Unable to append to a file in HDFS

2017-10-30 Thread Ravi Prakash
Hi Tarik! The lease is owned by a client. If you launch 2 client programs, they will be viewed as separate (even though the user is same). Are you sure you closed the file when you first wrote it? Did the client program which wrote the file, exit cleanly? In any case, after the namenode lease
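When the question of a stuck lease comes up, there is a debug command that asks the NameNode to recover it immediately instead of waiting out the hard-lease limit. A command sketch (requires a running HDFS cluster; the path and retry count are placeholders, not from this thread):

```
# Force lease recovery on a file a dead client left open
hdfs debug recoverLease -path /path/to/stuck/file -retries 5
```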

Re:

2017-10-30 Thread Ravi Prakash
And one of the good things about open-source projects like Hadoop, you can read all about why :-) : https://issues.apache.org/jira/browse/HADOOP-4952 Enjoy! Ravi On Mon, Oct 30, 2017 at 11:54 AM, Ravi Prakash <ravihad...@gmail.com> wrote: > Hi Doris! > > FileContext was created t

Re:

2017-10-30 Thread Ravi Prakash
Hi Doris! FileContext was created to overcome some of the limitations that we learned FileSystem had after a lot of experience. Unfortunately, a lot of code (I'm guessing maybe even the majority) still uses FileSystem. I suspect FileContext is probably the interface you want to use. HTH, Ravi

Re: Hadoop 2.8.0: Job console output suggesting non-existent rmserver 8088:proxy URI

2017-09-13 Thread Ravi Prakash
is wrong with your cluster :( . HTH Ravi On Tue, Sep 12, 2017 at 10:27 PM, Kevin Buckley < kevin.buckley.ecs.vuw.ac...@gmail.com> wrote: > On 9 September 2017 at 05:17, Ravi Prakash <ravihad...@gmail.com> wrote: > > > I'm not sure my reply will be entirely helpful, but here

Re: Apache ambari

2017-09-08 Thread Ravi Prakash
Hi Sidharth! The question seems relevant to the Ambari list : https://ambari.apache.org/mail-lists.html Cheers Ravi On Fri, Sep 8, 2017 at 1:15 AM, sidharth kumar wrote: > Hi, > > Apache ambari is open source. So,can we setup Apache ambari to manage > existing Apache

Re: Hadoop 2.8.0: Job console output suggesting non-existent rmserver 8088:proxy URI

2017-09-08 Thread Ravi Prakash
Hi Kevin! I'm not sure my reply will be entirely helpful, but here goes. The ResourceManager either proxies your request to the ApplicationMaster (if the application is running), or (once the application is finished) serves it itself if the job is in the "cache" (usually the last 1

Re: When is an hdfs-* service restart required?

2017-09-07 Thread Ravi Prakash
Hi Kellen! The first part of the configuration is a good indication of which service you need to restart. Unfortunately the only way to be completely sure is to read the codez. e.g. most hdfs configuration is mapped to variables in DFSConfigKeys $ find . -name *.java | grep -v test | xargs grep
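The search being sketched can be tried end-to-end. Here is a self-contained demo against a throwaway source tree (the tree, file names, and the config key are illustrative stand-ins for the real Hadoop sources):

```shell
# Build a tiny stand-in source tree
mkdir -p /tmp/demo-src/test
cat > /tmp/demo-src/DFSConfigKeys.java <<'EOF'
public class DFSConfigKeys {
  public static final String DFS_HEARTBEAT_INTERVAL_KEY = "dfs.heartbeat.interval";
}
EOF
cat > /tmp/demo-src/test/TestFoo.java <<'EOF'
public class TestFoo {}
EOF
# The pattern from the email: search all non-test .java files for a config key
find /tmp/demo-src -name '*.java' | grep -v test | xargs grep -l 'dfs.heartbeat.interval'
```

Run against a real Hadoop checkout, the same pipeline points you at DFSConfigKeys and thus at which daemon reads a given property.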

Re: unsubscribe

2017-08-29 Thread Ravi Prakash
Hi Corne! Please send an email to user-unsubscr...@hadoop.apache.org as mentioned on https://hadoop.apache.org/mailing_lists.html Thanks On Sun, Aug 27, 2017 at 10:25 PM, Corne Van Rensburg wrote: > [image: Softsure] > > unsubscribe > > > > *Corne Van RensburgManaging

Re:

2017-08-29 Thread Ravi Prakash
Hi Dominique, Please send an email to user-unsubscr...@hadoop.apache.org as mentioned on https://hadoop.apache.org/mailing_lists.html Thanks Ravi 2017-08-26 10:49 GMT-07:00 Dominique Rozenberg : > unsubscribe > > > > > > [image: cid:image001.jpg@01D10A65.E830C520] > >

Re: Recommendation for Resourcemanager GC configuration

2017-08-23 Thread Ravi Prakash
Hi Puneet Can you take a heap dump and see where most of the churn is? Is it lots of small applications / few really large applications with small containers etc. ? Cheers Ravi On Wed, Aug 23, 2017 at 9:23 AM, Ravuri, Venkata Puneet wrote: > Hello, > > > > I wanted to know if

Re: Some Configs in hdfs-default.xml

2017-08-23 Thread Ravi Prakash
Hi Doris! I'm not sure what the difference between lab / production use is. All configuration affects some behavior of the Hadoop system. Usually the defaults are good for small clusters. For larger clusters, it becomes worthwhile to tune the configuration. 1.

Re: Restoring Data to HDFS with distcp from standard input /dev/stdin

2017-08-16 Thread Ravi Prakash
Hi Heitor! Welcome to the Hadoop community. Think of the "hadoop distcp" command as a script which launches other JAVA programs on the Hadoop worker nodes. The script collects the list of sources, divides it among the several worker nodes and waits for the worker nodes to actually do the copying
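As a concrete sketch of the command being described (the NameNode addresses and paths below are placeholders, not from the thread; this needs live clusters to actually run):

```
# Copy /data from one cluster to another; the MR job on the worker
# nodes does the actual byte copying, not the machine running distcp
hadoop distcp hdfs://nn1:8020/data hdfs://nn2:8020/backup/data
```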

Re: Forcing a file to update its length

2017-08-09 Thread Ravi Prakash
Hi David! A FileSystem class is an abstraction for the file system. It doesn't make sense to do an hsync on a file system (should the file system sync all files currently open / just the user's etc.) . With appropriate flags maybe you can make it make sense, but we don't have that functionality.

Re: modify the MapTask.java but no change

2017-08-07 Thread Ravi Prakash
Hi DuanYu! Most likely, the MapTask class loaded is not from your jar file. Here's a look at how Oracle JAVA loads classes : http://docs.oracle.com/javase/8/docs/technotes/tools/findingclasses.html . Check the classpath that your MapTask is started with. HTH Ravi On Fri, Aug 4, 2017 at 7:09

Re: Replication Factor Details

2017-08-02 Thread Ravi Prakash
Hi Hilmi! The topology script / DNSToSwitchMapping tell the NameNode about the topology of the cluster : https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/RackAwareness.html You can trace through

Re: Shuffle buffer size in presence of small partitions

2017-07-31 Thread Ravi Prakash
Hi Robert! I'm sorry I do not have a Windows box and probably don't understand the shuffle process well enough. Could you please create a JIRA in the MapReduce project if you would like this fixed upstream? https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=116=MAPREDUCE Thanks Ravi

Re: How to write a Job for importing Files from an external Rest API into Hadoop

2017-07-31 Thread Ravi Prakash
Hi Ralph! Although not totally similar to your use case, DistCp may be the closest thing to what you want. https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/DistCp.java . The client builds a file list, and then submits an MR job to copy

Re: MapReduce and Spark jobs not starting

2017-07-28 Thread Ravi Prakash
Hi Nishant! You should be able to look at the datanode and nodemanager log files to find out why they died after you ran the 76 mappers. It is extremely unusual (I haven't heard of a verified case for over 4-5 years) of a job killing nodemanagers unless your cluster is configured poorly. Which

Re: how to get info about which data in hdfs or file system that a MapReduce job visits?

2017-07-27 Thread Ravi Prakash
Hi Jaxon! MapReduce is just an application (one of many including Tez, Spark, Slider etc.) that runs on Yarn. Each YARN application decides to log whatever it wants. For MapReduce,

Re: Lots of Exception for "cannot assign requested address" in datanode logs

2017-07-27 Thread Ravi Prakash
cating under-replicating blocks > on DN2. > > > > Can this be related to properties I added for increasing replication rate? > > > > Regards > > Om Prakash > > > > *From:* Ravi Prakash [mailto:ravihad...@gmail.com] > *Sent:* 27 July 2017 01:26 >

Re: Lots of Exception for "cannot assign requested address" in datanode logs

2017-07-26 Thread Ravi Prakash
Hi Omprakash! DatanodeRegistration happens when the Datanode first heartbeats to the Namenode. In your case, it seems some other application has acquired the port 50010 . You can check this with the command "netstat -anp | grep 50010" . Are you trying to run 2 datanode processes on the same

Re: Regarding Simulation of Hadoop

2017-07-24 Thread Ravi Prakash
t of Computer Science,* > *Career Point University of Kota, Rajasthan* > > On 17 July 2017 at 23:22, Ravi Prakash <ravihad...@gmail.com> wrote: > >> Hi Vinod! >> >> You can look at static code analysis tools. I'm sure there are ones >> specific for security.

Re: Regarding Simulation of Hadoop

2017-07-17 Thread Ravi Prakash
Hi Vinod! You can look at static code analysis tools. I'm sure there are ones specific for security. I'd suggest you to set up a Kerberized hadoop cluster first. HTH Ravi On Sat, Jul 15, 2017 at 2:08 AM, vinod Saraswat wrote: > Dear Sir/Mam, > > > > I am Vinod Sharma

Re: Lots of warning messages and exception in namenode logs

2017-06-29 Thread Ravi Prakash
; *To:* omprakash <ompraka...@cdac.in> > *Cc:* Arpit Agarwal <aagar...@hortonworks.com>; > common-u...@hadoop.apache.org <user@hadoop.apache.org>; Ravi Prakash < > ravihad...@gmail.com> > > *Subject:* RE: Lots of warning messages and exception in n

Re: Lots of warning messages and exception in namenode logs

2017-06-27 Thread Ravi Prakash
*P.S : I have 1 datanode active out of 2. * > > > > I can also see from Namenode UI that the no. of under replicated blocks > are growing. > > > > Any idea? Or this is OK. > > > > regards > > > > > > *From:* omprakash [mailto:ompraka

Re: Can hdfs client 2.6 read file of hadoop 2.7 ?

2017-06-26 Thread Ravi Prakash
Hi Jeff! Yes. hadoop-2.6 clients are able to read files on a hadoop-2.7 cluster. The document I could find is http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/Compatibility.html . "Both Client-Server and Server-Server compatibility is preserved within a major release" HTH

Re: Lots of warning messages and exception in namenode logs

2017-06-22 Thread Ravi Prakash
=[ARCHIVE]}, > newBlock=true) For more information, please enable DEBUG log level on > org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy and > org.apache.hadoop.net.NetworkTopology > > > > *From: *omprakash <ompraka...@cdac.in> > *Date: *Wednesday, June 2

Re: Lots of warning messages and exception in namenode logs

2017-06-21 Thread Ravi Prakash
Hi Omprakash! What is your default replication set to? What kind of disks do your datanodes have? Were you able to start a cluster with a simple configuration before you started tuning it? HDFS tries to create the default number of replicas for a block on different datanodes. The Namenode tries

Re: Hadoop Application Report from WebUI

2017-06-09 Thread Ravi Prakash
Hi Hilmi! I'm not sure, but maybe the offset in the block at which the task started processing? Ravi On Fri, Jun 9, 2017 at 7:43 AM, Hilmi Egemen Ciritoğlu < hilmi.egemen.cirito...@gmail.com> wrote: > Hi all, > > I can see following informations on hadoop yarn web-ui report(8088) for > each

Re: Hadoop error in shuffle in fetcher: Exceeded MAX_FAILED_UNIQUE_FETCHES

2017-06-07 Thread Ravi Prakash
Hi Seonyoung! Please take a look at this file : https://github.com/apache/hadoop/blob/branch-2.7.1/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/ShuffleHandler.java#L208 . This is an auxiliary service that runs inside the

Re: Spark 2.0.1 & 2.1.1 fails on Hadoop-3.0.0-alhpa2

2017-05-10 Thread Ravi Prakash
Hi Jasson! You will have to build Spark again with Hadoop-3.0.0-alpha2. This was done as part of https://issues.apache.org/jira/browse/HADOOP-12563 . HTH Ravi On Tue, May 9, 2017 at 4:33 PM, Jasson Chenwei wrote: > hi, all > > I just upgraded my Hadoop from 2.7.3 to

Re: unsubscribe

2017-05-01 Thread Ravi Prakash
Hi Jason! Could you please send an email to user-unsubscr...@hadoop.apache.org and general-unsubscr...@hadoop.apache.org as mentioned here : https://hadoop.apache.org/mailing_lists.html ? Thanks On Sat, Apr 29, 2017 at 11:34 AM, Jason wrote: > unsubscribe > > On Thu,

Re: unsubscribe

2017-04-28 Thread Ravi Prakash
Hi Marc, Could you please send an email to user-unsubscr...@hadoop.apache.org and general-unsubscr...@hadoop.apache.org as mentioned here : https://hadoop.apache.org/mailing_lists.html ? Thanks On Thu, Apr 27, 2017 at 5:18 AM, Bourre, Marc < marc.bou...@ehealthontario.on.ca> wrote: >

Re: unsubscribe

2017-04-28 Thread Ravi Prakash
Hi Krishna! Could you please send an email to user-unsubscr...@hadoop.apache.org and general-unsubscr...@hadoop.apache.org as mentioned here : https://hadoop.apache.org/mailing_lists.html ? Thanks On Wed, Apr 26, 2017 at 7:58 PM, Krishna < ramakrishna.srinivas.mur...@gmail.com> wrote: > > >

Re: Noob question about Hadoop job that writes output to HBase

2017-04-22 Thread Ravi Prakash
Hi Evelina! You've posted the logs for the MapReduce ApplicationMaster . From this I can see the reducer timed out after 600 secs : 2017-04-21 00:24:07,747 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report from

Re: Running a script/executable stored in HDFS from a mapper

2017-04-21 Thread Ravi Prakash
Perhaps you want to look at Hadoop Streaming? https://hadoop.apache.org/docs/r2.7.1/hadoop-streaming/HadoopStreaming.html On Fri, Apr 21, 2017 at 12:30 AM, Philippe Kernévez wrote: > Hi Evelina, > > Files in HDFS are not executable. > You first need to copy it on a local tmp
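Hadoop Streaming's contract is simply "mapper and reducer are any programs that read stdin and write stdout", so the shape of a streaming job can be simulated with plain pipes (the commands here are local stand-ins, not a real cluster job; `sort` plays the shuffle phase):

```shell
# "mapper" (tr): emit one token per line; sort: shuffle; "reducer" (uniq -c + awk): count per key
printf 'a b a\nb\n' | tr ' ' '\n' | sed '/^$/d' | sort | uniq -c | awk '{print $2, $1}'
```

The real job replaces the pipe ends with `-mapper` and `-reducer` arguments to the hadoop-streaming jar.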

Re: About the name of "dfs.namenode.checkpoint.dir" and "dfs.namenode.checkpoint.edits.dir"

2017-03-24 Thread Ravi Prakash
Hi Huxiaodong! Thanks for your email. "dfs.namenode.checkpoint.dir" is used in a lower level abstraction (called FSImage) : https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java#L1374 . Incidentally to have

Re: Hadoop AWS module (Spark) is inventing a secret-ket each time

2017-03-08 Thread Ravi Prakash
Sorry to hear about your travails. I think you might be better off asking the spark community: http://spark.apache.org/community.html On Wed, Mar 8, 2017 at 3:22 AM, Jonhy Stack wrote: > Hi, > > I'm trying to read a s3 bucket from Spark and up until today Spark always >

Re: last exception: java.io.IOException: Call to e26-node.fqdn.com/10.12.1.209:60020 failed on local exception

2017-03-07 Thread Ravi Prakash
You should probably email Hbase mailing lists rather than Hadoop : https://hbase.apache.org/mail-lists.html On Thu, Mar 2, 2017 at 10:02 AM, Motty Cruz wrote: > Hello, in the past two weeks, I see the following error on HBase Thrift > servers, we have total of about 10

Re: Journal nodes , QJM requirement

2017-02-28 Thread Ravi Prakash
Thanks for the question Amit and your response Surendra! I think Amit has raised a good question. I can only guess towards the "need" for *journaling* while using a QJM. I'm fairly certain that if you look through all the comments in https://issues.apache.org/jira/browse/HDFS-3077 and its

Re: WordCount MapReduce error

2017-02-23 Thread Ravi Prakash
00GB free space on that drive > so space shouldn't be the issue. > 3. The application has all the required permissions. > > Additionally, something I've tested is that if I set the number of reduce > tasks in the WordCount.java file to 0 (job.setNumReduceTask = 0) then I get > the

Re: WordCount MapReduce error

2017-02-22 Thread Ravi Prakash
he WordCount.java file to 0 (job.setNumReduceTask = 0) then I get > the success files for the Map task in my output directory. So the Map tasks > work fine but the Reduce is messing up. Is it possible that my build is > somewhat incorrect even though it said everything was successfully bu

Re: WordCount MapReduce error

2017-02-22 Thread Ravi Prakash
Hi Vasil! It seems like the WordCount application is expecting to open the intermediate file but failing. Do you see a directory under D:/tmp/hadoop-Vasil Grigirov/ . I can think of a few reasons. I'm sorry I am not familiar with the Filesystem on Windows 10. 1. Spaces in the file name are not

Re: HDFS fsck command giving health as corrupt for '/'

2017-02-16 Thread Ravi Prakash
Hi Nishant! I'd suggest reading the HDFS user guide to begin with and becoming familiar with the architecture. https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html . Where are the blocks stored on the datanodes? Were they on persistent storage on the EC2

Re: How to fix "HDFS Missing replicas"

2017-02-13 Thread Ravi Prakash
Hi Ascot! Just out of curiosity, which version of hadoop are you using? fsck has some other options (e.g. -blocks will print out the block report too, -list-corruptfileblocks prints out the list of missing blocks and files they belong to) . I suspect you may also want to specify the

Re: Yarn containers creating child process

2017-02-13 Thread Ravi Prakash
Hi Sandesh! A *yarn* task is just like any other process on the operating system. Depending on which ContainerExecutor you use, you should launch the yarn task with appropriate limits in place. Although I have never tried it, on Linux you could use setrlimit or

Re: HDFS Shell tool

2017-02-10 Thread Ravi Prakash
Ravi, > I am glad you like it. > Why should I use WebHDFS? Our cluster sysops, include me, prefer command > line. :-) > > -Vity > > 2017-02-09 22:21 GMT+01:00 Ravi Prakash <ravihad...@gmail.com>: > >> Great job Vity! >> >> Thanks a lot for sharing. H

Re: HDFS Shell tool

2017-02-09 Thread Ravi Prakash
Great job Vity! Thanks a lot for sharing. Have you thought about using WebHDFS? Thanks Ravi On Thu, Feb 9, 2017 at 7:12 AM, Vitásek, Ladislav wrote: > Hello Hadoop fans, > I would like to inform you about our tool we want to share. > > We created a new utility - HDFS Shell

Re: Confusion between dfs.replication and dfs.namenode.replication.min options in hdfs-site.xml

2017-02-02 Thread Ravi Prakash
Hi Andrey! Your assumption is absolutely correct. dfs.namenode.replication.min is what you should set to 2 in your case. You should also look at dfs.client.block.write.replace-datanode-on-failure.policy, dfs.client.block.write.replace-datanode-on-failure.enable and
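A minimal hdfs-site.xml sketch of the two properties being contrasted (the values are illustrative for Andrey's case, not recommendations):

```xml
<property>
  <name>dfs.replication</name>
  <!-- target replication the cluster works toward for new files -->
  <value>3</value>
</property>
<property>
  <name>dfs.namenode.replication.min</name>
  <!-- a write is acknowledged once this many replicas exist -->
  <value>2</value>
</property>
```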

Re: Some Questions about Node Manager Memory Used

2017-01-24 Thread Ravi Prakash
Hi Zhuo Chen! Yarn has a few methods to account for memory. By default, it is guaranteeing your (hive) application a certain amount of memory. It depends totally on the application whether it uses all of that memory, or as in your case, leaves plenty of headroom in case it needs to expand in the

Re: Why is the size of a HDFS file changed?

2017-01-09 Thread Ravi Prakash
delimiter in the file. > However, I don't think this is the reason causing the problem, because > there are files also using "^A" as delimiter but with no problem. > BTW, the reason using "^A" as delimiter is these files are hive data. > > On Sat, Jan 7, 2017

Re: Why is the size of a HDFS file changed?

2017-01-06 Thread Ravi Prakash
Is there a carriage return / new line / some other whitespace which `cat` may be appending? On Thu, Jan 5, 2017 at 6:09 PM, Mungeol Heo wrote: > Hello, > > Suppose, I name the HDFS file which cause the problem as A. > > hdfs dfs -ls A > -rw-r--r-- 3 web_admin hdfs
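A single trailing newline is a real byte on disk, which is enough to explain a one-byte size mismatch when comparing a file against `cat` output. A local demonstration (file names are placeholders):

```shell
printf 'hello' > /tmp/no_newline.txt    # 5 bytes, no trailing newline
printf 'hello\n' > /tmp/newline.txt     # 6 bytes: the '\n' is stored too
wc -c < /tmp/no_newline.txt
wc -c < /tmp/newline.txt
```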

Re: Small mistake (?) in doc about HA with Journal Nodes

2016-12-05 Thread Ravi Prakash
Hi Alberto! The assumption is that *multiple* machines could be running the Namenode process. Only one of them would be active, while the other Namenode processes would be in Standby mode. The number of machines is suggested to be odd so that it's easier to form consensus. To handle the failure

Re: Does the JobHistoryServer register itself with ZooKeeper?

2016-11-16 Thread Ravi Prakash
Are you talking about the Mapreduce JobHistoryServer? I am not aware of it needing Zookeeper for anything. What gave you that impression? On Wed, Nov 16, 2016 at 11:32 AM, Benson Qiu wrote: > I'm looking for a way to check for connectivity to the JobHistoryServer. > >

Re: How to mount HDFS as a local file system?

2016-11-10 Thread Ravi Prakash
Or you could use NFS https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-hdfs/HdfsNfsGateway.html . In our experience, both of them still need some work for stability and correctness. On Thu, Nov 10, 2016 at 10:00 AM, wrote: > Fuse is your tool: > >

Re: Yarn 2.7.3 - capacity scheduler container allocation to nodes?

2016-11-09 Thread Ravi Prakash
Hi Rafal! Have you been able to launch the job successfully first without configuring node-labels? Do you really need node-labels? How much total memory do you have on the cluster? Node labels are usually for specifying special capabilities of the nodes (e.g. some nodes could have GPUs and your

Re: Capacity scheduler for yarn oin 2.7.3 - problem with job scheduling to created queue.

2016-11-08 Thread Ravi Prakash
Hi Rafal! Have you been able to launch the job successfully first without configuring node-labels? Do you really need node-labels? How much total memory do you have on the cluster? Node labels are usually for specifying special capabilities of the nodes (e.g. some nodes could have GPUs and your

Re: Fw:Re:How to add custom field to hadoop MR task log?

2016-11-04 Thread Ravi Prakash
Hi Maria! You have to be careful which log4j.properties file is on the classpath of the task which was launched. Often times there are multiple log4j.properties file, perhaps in the classpaths or in one of the jars on the classpath. Are you sure the log4j.properties file you edited is the only

Re: why the default value of 'yarn.resourcemanager.container.liveness-monitor.interval-ms' in yarn-default.xml is so high?

2016-11-03 Thread Ravi Prakash
at happens if RM finds that one NM's heartbeat is missing but it is not > 10 min yet (yarn.nm.liveness-monitor.expiry-interval-ms time is not > expired yet) > Will a new application still make container request to that NM via RM? > > Thanks > Tanvir > > > > > > O

Re: unsubscribe

2016-10-31 Thread Ravi Prakash
Please email user-unsubscr...@hadoop.apache.org On Mon, Oct 31, 2016 at 2:29 AM, 风雨无阻 <232341...@qq.com> wrote: > unsubscribe >

Re: Bug in ORC file code? (OrcSerde)?

2016-10-19 Thread Ravi Prakash
Michael! Although there is a little overlap in the communities, I strongly suggest you email u...@orc.apache.org ( https://orc.apache.org/help/ ) I don't know if you have to be subscribed to a mailing list to get replies to your email address. Ravi On Wed, Oct 19, 2016 at 11:29 AM, Michael

Re: file permission issue

2016-10-17 Thread Ravi Prakash
Hi! https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceLocalizationService.java#L1524 Just fyi, there are different kinds of

Re: hadoop cluster container memory limit

2016-10-14 Thread Ravi Prakash
Hi! Look at yarn.nodemanager.resource.memory-mb in https://hadoop.apache.org/docs/r2.7.2/hadoop-yarn/hadoop-yarn-common/yarn-default.xml I'm not sure how 11.25Gb comes in. How did you deploy the cluster? Ravi On Thu, Oct 13, 2016 at 9:07 PM, agc studio wrote: > Hi
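A yarn-site.xml sketch of the property being pointed at (8192 is illustrative; pick a value that leaves headroom for the OS and daemons on each node):

```xml
<property>
  <!-- total memory this NodeManager offers YARN for containers -->
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>8192</value>
</property>
```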

Re: Where does Hadoop get username and group mapping from for linux shell username and group mapping?

2016-10-14 Thread Ravi Prakash
Chen! It gets it from whatever is configured on the Namenode. https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-hdfs/HdfsPermissionsGuide.html#Group_Mapping HTH Ravi On Thu, Oct 13, 2016 at 7:43 PM, chen dong wrote: > Hi, > > Currently I am working on a
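The relevant core-site.xml knob, shown with what recent 2.x releases ship as the default (the mapping runs on the NameNode host, so that host's OS users/groups are what matter):

```xml
<property>
  <name>hadoop.security.group.mapping</name>
  <!-- resolves groups via JNI against the NameNode host's OS,
       falling back to shelling out to the `id` command -->
  <value>org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback</value>
</property>
```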

Re: Issue in Rollback (after rolling upgrade) from hadoop 2.7.2 to 2.5.2

2016-10-13 Thread Ravi Prakash
Hi Dinesh! This is obviously a very hazardous situation you are in (if your data is important), so I'd suggest moving carefully. Make as many backups of as many things you can. The usual mechanism that Hadoop uses when upgrading is to rename directories of the old format and keep them around

Re: Hadoop: precomputing data

2016-10-12 Thread Ravi Prakash
I guess one of the questions is what is your false negative rate in Approach 1 Step 1? Of course if you are limited by resources you may have to go with Approach 1. On Thu, Oct 6, 2016 at 6:14 AM, venito camelas wrote: > I'm designing a prototype using *Hadoop* for

Re: Newbie Ambari Question

2016-10-12 Thread Ravi Prakash
I suspect https://ambari.apache.org/mail-lists.html may be more useful. On Thu, Oct 6, 2016 at 2:45 AM, Deepak Goel wrote: > > Hey > > Namaskara~Nalama~Guten Tag~Bonjour > > Sorry, Is this the right forum for asking a question "Ambari Hadoop > Installation" from Hortonworks?

Re: HDFS Issues.

2016-10-12 Thread Ravi Prakash
There are a few conditions for the Namenode to come out of safemode. # Number of datanodes, # Number of blocks that have been reported. How many blocks have the datanodes reported? On Tue, Oct 4, 2016 at 1:22 PM, Steve Brenneis wrote: > I have an HDFS cluster of three

Re: HDFS Replication Issue

2016-10-12 Thread Ravi Prakash
Hi Eric! Did you follow https://hadoop.apache.org/docs/current2/hadoop-project-dist/hadoop-common/SingleCluster.html to set up your single node cluster? Did you set dfs.replication in hdfs-site.xml ? The logs you posted don't have enough information to debug the issue. *IF* everything has been

Re: YARN Resource Allocation When Memory is Very Small

2016-08-30 Thread Ravi Prakash
Hi Nico! The RM is configured with a minimum allocation. Take a look at "yarn.scheduler.minimum-allocation-mb" . You can also read through code here:
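The floor being described lives in yarn-site.xml; every container request is rounded up to at least this value (1024 MB is the shipped default):

```xml
<property>
  <name>yarn.scheduler.minimum-allocation-mb</name>
  <!-- the RM rounds every container request up to this minimum -->
  <value>1024</value>
</property>
```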

Re: Yarn web UI shows more memory used than actual

2016-08-15 Thread Ravi Prakash
Hi Suresh! YARN's accounting for memory on each node is completely different from the Linux kernel's accounting of memory used. e.g. I could launch a MapReduce task which in reality allocates just 100 Mb, and tell YARN to give it 8 Gb. The kernel would show the memory requested by the task, the

Re: MapReduce Job State: PREP over 8 hours, state no change

2016-08-08 Thread Ravi Prakash
That's unusual. Are you able to submit a simple sleep job? You can do this using: yarn jar $HADOOP_PREFIX/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-*-tests.jar sleep -m 1 -r 1 This should finish in under a minute. Otherwise I'd suspect that your cluster is misconfigured. HTH

Re: Node Manager crashes with OutOfMemory error

2016-07-26 Thread Ravi Prakash
Hi Rahul! Which version of Hadoop are you using? What non-default values of configuration are you setting? You can set HeapDumpOnOutOfMemoryError on the command line while starting up your nodemanagers and see the resulting heap dump in Eclipse MAT / jvisualvm / yourkit to see where are the
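A sketch of wiring up the suggested flag for the NodeManager (the JVM options are standard HotSpot flags; the exact env var consumed, and the dump path, vary by Hadoop version and deployment, so treat this yarn-env.sh fragment as an assumption to verify):

```shell
# yarn-env.sh: dump the heap when the NM dies with OutOfMemoryError,
# then open the .hprof in Eclipse MAT / jvisualvm / YourKit
export YARN_NODEMANAGER_OPTS="$YARN_NODEMANAGER_OPTS -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/hadoop"
```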

Re: Where's official Docker image for Hadoop?

2016-07-20 Thread Ravi Prakash
Would something like this be useful as a starting point? https://github.com/apache/hadoop/tree/trunk/dev-support/docker (this is checked into apache/trunk) The DockerContainerExecutor was an alpha feature that didn't really get much traction and is not what you think it is. (If configured on the

Re: Building a distributed system

2016-07-18 Thread Ravi Prakash
Welcome to the community Richard! I suspect Hadoop can be more useful than just splitting and stitching back data. Depending on your use cases, it may come in handy to manage your machines, restart failed tasks, scheduling work when data becomes available etc. I wouldn't necessarily count it out.

Re: New cluster help

2016-07-14 Thread Ravi Prakash
Hi Tombin! Is this the first cluster you're ever setting up? Are you able to run an "hdfs dfs -ls /" successfully? How about putting files into HDFS? I'd take it one step at a time if I were you. i.e. 1. Set up a simple HDFS cluster (without SSL) 2. Turn on SSL 3. Then try to run HBase. Is step

Re: unsubscribe

2016-07-05 Thread Ravi Prakash
Please send an email to user-unsubscr...@hadoop.apache.org On Wed, Jun 29, 2016 at 8:02 AM, Bob Krier wrote: > >

Re: unsubscribe

2016-07-05 Thread Ravi Prakash
Please send an email to user-unsubscr...@hadoop.apache.org On Wed, Jun 29, 2016 at 8:04 AM, Mike Rapuano wrote: > > > -- > > > Michael Rapuano > > Dev/Ops Engineer > > 617-498-7800 | 617-468-1774 > > 25 Drydock Ave > > Boston, MA 02210 > >

Re: Usage of data node to run on commodity hardware

2016-06-07 Thread Ravi Prakash
Hi Krishna! I don't see why you couldn't start Hadoop in this configuration. Performance would obviously be suspect. Maybe by configuring your network topology script, you could even improve the performance. Most mobiles have ARM processors. I know some cool people ran Hadoop v1 on Raspberry Pis

Re: HDFS in Kubernetes

2016-06-06 Thread Ravi Prakash
Klaus! Good luck with your attempt to run HDFS inside Kubernetes! Please keep us posted. For creating a new file, a DFSClient : 1. First calls addBlock on the NameNode.

Re: HDFS Federation

2016-06-06 Thread Ravi Prakash
Perhaps use the "viewfs://" protocol prepended to your path? On Sun, Jun 5, 2016 at 1:10 PM, Kun Ren wrote: > Hi Genius, > > I just configured HDFS Federation, and try to use it(2 namenodes, one is > for /my, another is for /your). When I run the command: > hdfs dfs -ls /,
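A core-site.xml sketch of a viewfs mount table for Kun Ren's two-NameNode setup (the mount-table name `clusterX` and the NameNode authorities are placeholders); with fs.defaultFS pointing at the viewfs, `hdfs dfs -ls /` lists both mount points:

```xml
<property>
  <name>fs.defaultFS</name>
  <value>viewfs://clusterX/</value>
</property>
<property>
  <name>fs.viewfs.mounttable.clusterX.link./my</name>
  <value>hdfs://nn1:8020/my</value>
</property>
<property>
  <name>fs.viewfs.mounttable.clusterX.link./your</name>
  <value>hdfs://nn2:8020/your</value>
</property>
```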

Re: No edits files in dfs.namenode.edits.dir

2016-05-19 Thread Ravi Prakash
No! You are probably writing the edits file somewhere still. An `lsof` on the namenode process may be more revealing. Obviously this depends on configuration, but unless you have some really crazy settings, I'm pretty sure the edits would be persisted to disk. On Wed, May 18, 2016 at 2:47 AM,

Re: Regarding WholeInputFileFormat Java Heap Size error

2016-05-12 Thread Ravi Prakash
Shubh! You can perhaps introduce an artificial delay in your map task and then take a JAVA heap dump of the MapTask JVM to analyze where the memory is going. Its hard to speculate otherwise. On Wed, May 11, 2016 at 10:15 PM, Shubh hadoopExp wrote: > > > > Hi All, > >

Re: Upgrading production hadoop system from 2.5.1 to 2.7.2

2016-03-24 Thread Ravi Prakash
y important to us. We cannot tolerate any data loss with > the update. Do you remember how long it took for you to upgrade it from > 2.4.1 to 2.7.1 ? > > Thanks, > Chathuri > > On Wed, Mar 23, 2016 at 7:09 PM, Ravi Prakash <ravihad...@gmail.com> > wrote: > >&g

Re: INotify stability

2015-09-16 Thread Ravi Prakash
Hi Mohammad! Thanks for reporting the issue. Could you please take a heap dump of the NN and analyze it to see where the memory is being spent? Thanks Ravi On Tuesday, September 15, 2015 11:53 AM, Mohammad Islam wrote: Hi, We were using INotify feature in one of

Re: hdfs: weird lease expiration issue

2015-08-21 Thread Ravi Prakash
Hi Bogdan! This is because the second application attempt appears to HDFS as a new client. Are you sure the second client experienced write errors because *its* lease was removed? Yongjun has a great writeup : http://blog.cloudera.com/blog/2015/02/understanding-hdfs-recovery-processes-part-1/

Re: Unable to pass complete tests on 2.7.1

2015-08-17 Thread Ravi Prakash
Hi Tucker! Sadly, unit tests failing is usual for hadoop builds. You can use -DskipTests to build without running unit tests, or -fn (fail-never) to continue despite failures. The maven-plugin helps us manage generated source code (e.g. protobuf files generate more java files which need to be

Re: Documentation inconsistency about append write in HDFS

2015-08-03 Thread Ravi Prakash
Thanks Thanh! Yes! Could you please post a patch? On Sunday, August 2, 2015 8:50 PM, Thanh Hong Dai hdth...@tma.com.vn wrote:

Re: Web based file manager for HDFS?

2015-07-27 Thread Ravi Prakash
Hi Caesar! I'm going to try to get that functionality as part of HDFS-7588 (Improve the HDFS Web UI browser to allow chowning / chmoding, creating dirs and uploading files) in the next 2 months. Ravi

Re: YARN and LinuxContainerExecutor in simple security mode

2015-07-06 Thread Ravi Prakash
- as long as I dont explicitly allow for this using hadoop.proxyuser.username.groups hadoop.proxyuser.username.hosts user processes spawned by yarn on worknodes will always run with the uid of that user. Is that right? Thanks, Tomasz On 29.06.2015 at 21:43, Ravi Prakash wrote: Hi

Re: YARN and LinuxContainerExecutor in simple security mode

2015-06-29 Thread Ravi Prakash
Hi Tomasz! It is tricky to set up, but there are no implications to security if you configure it correctly. Please read the discussion on YARN-2424 (LCE should support non-cgroups, non-secure mode). HTH Ravi

Re: Web Address appears to be ignored

2015-05-19 Thread Ravi Prakash
Ewan! This sounds like a bug. Please open a JIRA. Thanks Ravi On Tuesday, May 19, 2015 8:09 AM, Ewan Higgs ewan.hi...@ugent.be wrote: Hi all, I am setting up a Hadoop cluster where the nodes have FQDNames inside the cluster, but the DNS where these names are registered is behind
