Re: Huge DataNode Virtual Memory Usage
I think it may have been 6676016: http://java.sun.com/javase/6/webnotes/6u10.html We were able to reproduce this at the time through heavy Lucene indexing plus our internal document pre-processing logic, which churned a lot of objects. We still see similar issues with _10, but much rarer. Going to _13 may shed some light; you could be tickling another similar bug, but I didn't see anything obvious.

C

On May 9, 2009, at 12:30 AM, Stefan Will wrote:

Chris, Thanks for the tip ... However I'm already running 1.6.0_10:

java version "1.6.0_10"
Java(TM) SE Runtime Environment (build 1.6.0_10-b33)
Java HotSpot(TM) 64-Bit Server VM (build 11.0-b15, mixed mode)

Do you know of a specific bug # in the JDK bug database that addresses this?

Cheers, Stefan

From: Chris Collins
Reply-To:
Date: Fri, 8 May 2009 20:34:21 -0700
To: "core-user@hadoop.apache.org"
Subject: Re: Huge DataNode Virtual Memory Usage

Stefan, there was a nasty memory leak in 1.6.x before 1.6.0_10. It manifested itself during major GC. We saw this on Linux and Solaris, and it improved dramatically with an upgrade.

C

On May 8, 2009, at 6:12 PM, Stefan Will wrote:

Hi, I just ran into something rather scary: one of my datanode processes that I'm running with -Xmx256M, and a maximum number of Xceiver threads of 4095, had a virtual memory size of over 7GB (!). I know that the VM size on Linux isn't necessarily equal to the actual memory used, but I wouldn't expect it to be an order of magnitude higher either. I ran pmap on the process, and it showed around 1000 thread stack blocks of roughly 1MB each (which is the default size on the 64-bit JDK). The largest block was 3GB in size, and I can't figure out what it is for. Does anyone have any insights into this? Anything that can be done to prevent this other than restarting the DFS regularly?

-- Stefan
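The pmap numbers quoted above already account for a big slice of the virtual size: roughly 1000 stacks at the 1MB 64-bit default is about a gigabyte reserved before any heap. Below is a minimal sketch of that effect, assuming a HotSpot JVM on Linux; the thread count and stack size are illustrative values, not taken from a real datanode, and the usual knobs would be -Xss for the stack size and the dfs.datanode.max.xcievers setting (spelled that way in the config) for the thread count.

// Rough illustration (not DataNode code): each Java thread reserves its full
// stack in virtual memory up front, so many Xceiver-style threads inflate the
// process's virtual size even when -Xmx is small. The 1MB stack here is a
// hypothetical value matching the 64-bit default mentioned above.
public class StackReservationSketch {
    public static void main(String[] args) throws Exception {
        final long stackBytes = 1L << 20;            // 1MB per thread (assumed default)
        final int threads = 1000;                    // roughly what pmap showed
        System.out.println("Reserved for stacks alone: "
                + (threads * stackBytes) / (1 << 20) + " MB of virtual memory");
        for (int i = 0; i < threads; i++) {
            Thread t = new Thread(null, new Runnable() {
                public void run() {
                    try { Thread.sleep(60000); } catch (InterruptedException e) { }
                }
            }, "xceiver-like-" + i, stackBytes);     // explicit stack size per thread
            t.setDaemon(true);
            t.start();
        }
        Thread.sleep(30000);                         // keep the process alive so it can be inspected with pmap <pid>
    }
}

Running pmap against this process should show the same ~1MB anonymous blocks Stefan describes, one per thread; it says nothing about the unexplained 3GB mapping, which is a separate question.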
Re: Huge DataNode Virtual Memory Usage
Stefan, there was a nasty memory leak in 1.6.x before 1.6.0_10. It manifested itself during major GC. We saw this on Linux and Solaris, and it improved dramatically with an upgrade.

C

On May 8, 2009, at 6:12 PM, Stefan Will wrote:

Hi, I just ran into something rather scary: one of my datanode processes that I'm running with -Xmx256M, and a maximum number of Xceiver threads of 4095, had a virtual memory size of over 7GB (!). I know that the VM size on Linux isn't necessarily equal to the actual memory used, but I wouldn't expect it to be an order of magnitude higher either. I ran pmap on the process, and it showed around 1000 thread stack blocks of roughly 1MB each (which is the default size on the 64-bit JDK). The largest block was 3GB in size, and I can't figure out what it is for. Does anyone have any insights into this? Anything that can be done to prevent this other than restarting the DFS regularly?

-- Stefan
Re: Is there any performance issue with Jrockit JVM for Hadoop
A couple of years back we did a lot of experimentation between Sun's VM and JRockit. We had initially assumed that JRockit was going to scream, since that's what the press were saying. In short, what we discovered was that certain JDK library usage was a little bit faster with JRockit, but for core VM performance such as synchronization and primitive operations the Sun VM outperformed it. We were not taking account of startup time, just raw code execution. As I said, this was a couple of years back, so things may have changed.

C

On May 7, 2009, at 2:17 AM, Grace wrote:

I am running the test on 0.18.1 and 0.19.1. Both versions have the same issue with the JRockit JVM. It is for the example sort job, sorting 20G of data on 1+2 nodes. Following is the result (version 0.18.1). The sort job running with the JRockit JVM took 260 secs more than with the Sun JVM.

|| JVM     || Completion Time ||
|| JRockit || 786,315 msec    ||
|| Sun     || 526,602 msec    ||

Furthermore, under version 0.19.1, I have set the JVM reuse parameter to -1. It seems to bring no improvement for the JRockit JVM.

On Thu, May 7, 2009 at 4:32 PM, JQ Hadoop wrote:

I believe the JRockit JVM has a slightly higher startup time than the Sun JVM, but that should not make a lot of difference, especially if JVMs are reused in 0.19. Which Hadoop version are you using? What Hadoop job are you running? And what performance do you get? Thanks, JQ

-----Original Message-----
From: Grace
Sent: Wednesday, May 06, 2009 1:07 PM
To: core-user@hadoop.apache.org
Subject: Is there any performance issue with Jrockit JVM for Hadoop

Hi all, This is Grace. I am replacing the Sun JVM with the JRockit JVM for Hadoop, keeping all the same Java options and configuration as with the Sun JVM. However, it is very strange that the performance using the JRockit JVM is poorer than with Sun; for example, the map stage became slower. Has anyone encountered a similar problem? Could you please give some advice about it? Thanks a lot. Regards, Grace
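The "core VM performance such as synchronization, primitive operations" comparison is easy to spot-check. Below is a crude, unscientific sketch along those lines; run the same class under the Sun JVM and under JRockit and compare the printed timings. The iteration count is arbitrary and nothing here comes from the original benchmark.

// Crude microbenchmark of the kind of "core VM" work mentioned above:
// uncontended synchronized increments vs. plain increments.
// Results are only indicative; the JIT, warm-up and machine all matter.
public class SyncMicrobench {
    private static long plainCounter = 0;
    private static long syncCounter = 0;
    private static final Object LOCK = new Object();

    public static void main(String[] args) {
        final int iters = 50000000;
        run(1000000);                         // warm-up so the JIT compiles both loops
        long[] t = run(iters);
        System.out.println("plain increments:        " + t[0] / 1000000 + " ms");
        System.out.println("synchronized increments: " + t[1] / 1000000 + " ms");
        System.out.println("(counters: " + plainCounter + ", " + syncCounter + ")");
    }

    static long[] run(int iters) {
        long t0 = System.nanoTime();
        for (int i = 0; i < iters; i++) { plainCounter++; }
        long t1 = System.nanoTime();
        for (int i = 0; i < iters; i++) { synchronized (LOCK) { syncCounter++; } }
        long t2 = System.nanoTime();
        return new long[] { t1 - t0, t2 - t1 };
    }
}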
Re: Hadoop datanode crashed - SIGBUS
I had some pretty bad issues with leaks in _07. _10, by the way, has a lot of bug fixes; I don't know if it would fix this problem. As for flags, I wouldn't know. One thing you could try is to match the memory region that the program counter falls in. If you use jstack or jmap (can't remember which), it will give you a dump of all the libraries and their memory address ranges. From that you may see if the program counter matches anything interesting. Other than that I would go with Brian's recommendations.

C

On Dec 1, 2008, at 1:59 PM, Sagar Naik wrote:

Hi, I don't have additional information on it. If you know any other flag that I need to turn on, please do tell me. The flags that are currently on are "-XX:+HeapDumpOnOutOfMemoryError -XX:+UseParallelGC -Dcom.sun.management.jmxremote". But this is what is listed in the stdout (datanode.out) file:

Java version: java version "1.6.0_07"
Java(TM) SE Runtime Environment (build 1.6.0_07-b06)
Java HotSpot(TM) Server VM (build 10.0-b23, mixed mode)

I will try to stress test the memory. -Sagar

Chris Collins wrote:

Was there anything mentioned as part of the tombstone message about a "problematic frame"? What Java are you using? There are a few reasons for SIGBUS errors; one is illegal address alignment, but from Java that's very unlikely. There were some issues with the native zip library in older VMs. As Brian pointed out, sometimes this points to a hardware issue.

C

On Dec 1, 2008, at 1:32 PM, Sagar Naik wrote:

Brian Bockelman wrote: Hardware/memory problems?

I'm not sure.

SIGBUS is relatively rare; it sometimes indicates a hardware error in the memory system, depending on your arch.

*uname -a:*
Linux hdimg53 2.6.15-1.2054_FC5smp #1 SMP Tue Mar 14 16:05:46 EST 2006 i686 i686 i386 GNU/Linux

*top's top*
Cpu(s): 0.1% us, 1.1% sy, 0.0% ni, 98.0% id, 0.8% wa, 0.0% hi, 0.0% si
Mem: 8288280k total, 1575680k used, 6712600k free, 5392k buffers
Swap: 16386292k total, 68k used, 16386224k free, 522408k cached

8 core, Xeon 2GHz

Brian

On Dec 1, 2008, at 3:00 PM, Sagar Naik wrote:

A couple of the datanodes crashed with the following error. The /tmp is 15% occupied.

#
# An unexpected error has been detected by Java Runtime Environment:
#
# SIGBUS (0x7) at pc=0xb4edcb6a, pid=10111, tid=1212181408
# [Too many errors, abort]

Please suggest how I should go about debugging this particular problem. -Sagar

Thanks to Brian -Sagar
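If the jmap output is hard to line up by eye, the "match the memory region" step can also be done directly against the process's address map. Below is a rough sketch, assuming a Linux /proc/<pid>/maps layout; the class name is made up and the pc value is simply the one from the crash report quoted above.

import java.io.BufferedReader;
import java.io.FileReader;

// Hypothetical helper (not part of Hadoop): find which mapped region of a
// process contains a faulting program counter, e.g. the pc=0xb4edcb6a from
// the SIGBUS report above. Assumes Linux and a readable /proc/<pid>/maps.
public class FindRegionForPc {
    public static void main(String[] args) throws Exception {
        String pid = args.length > 0 ? args[0] : "self";
        long pc = Long.parseLong("b4edcb6a", 16);   // program counter from the crash message
        BufferedReader r = new BufferedReader(new FileReader("/proc/" + pid + "/maps"));
        String line;
        while ((line = r.readLine()) != null) {
            // lines look like: b4e00000-b4f00000 r-xp 00000000 08:01 12345 /usr/lib/libfoo.so
            String[] range = line.split("\\s+")[0].split("-");
            try {
                long start = Long.parseLong(range[0], 16);
                long end = Long.parseLong(range[1], 16);
                if (pc >= start && pc < end) {
                    System.out.println("pc falls inside: " + line);
                }
            } catch (NumberFormatException ignored) {
                // skip kernel mappings whose addresses don't fit in a signed long
            }
        }
        r.close();
    }
}

If the region turns out to belong to a native library (libzip, a JNI extension, and so on), that narrows the suspects considerably; an anonymous mapping points more towards heap, stacks, or hardware.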
Re: Hadoop datanode crashed - SIGBUS
Was there anything mentioned as part of the tombstone message about a "problematic frame"? What Java are you using? There are a few reasons for SIGBUS errors; one is illegal address alignment, but from Java that's very unlikely. There were some issues with the native zip library in older VMs. As Brian pointed out, sometimes this points to a hardware issue.

C

On Dec 1, 2008, at 1:32 PM, Sagar Naik wrote:

Brian Bockelman wrote: Hardware/memory problems?

I'm not sure.

SIGBUS is relatively rare; it sometimes indicates a hardware error in the memory system, depending on your arch.

*uname -a:*
Linux hdimg53 2.6.15-1.2054_FC5smp #1 SMP Tue Mar 14 16:05:46 EST 2006 i686 i686 i386 GNU/Linux

*top's top*
Cpu(s): 0.1% us, 1.1% sy, 0.0% ni, 98.0% id, 0.8% wa, 0.0% hi, 0.0% si
Mem: 8288280k total, 1575680k used, 6712600k free, 5392k buffers
Swap: 16386292k total, 68k used, 16386224k free, 522408k cached

8 core, Xeon 2GHz

Brian

On Dec 1, 2008, at 3:00 PM, Sagar Naik wrote:

A couple of the datanodes crashed with the following error. The /tmp is 15% occupied.

#
# An unexpected error has been detected by Java Runtime Environment:
#
# SIGBUS (0x7) at pc=0xb4edcb6a, pid=10111, tid=1212181408
# [Too many errors, abort]

Please suggest how I should go about debugging this particular problem. -Sagar

Thanks to Brian -Sagar
Re: Can anyone recommend me a inter-language data file format?
Consider talking to Doug Cutting. He is playing with the idea of a variant of JSON; I am sure he would love your help. Specifically, he is looking at a coding scheme that is easy to read, does not duplicate key names per record, and supports file splits.

C

On Nov 1, 2008, at 8:20 PM, Zhou, Yunqing wrote:

An embedded database cannot handle large-scale data very efficiently; I have about 1 billion records. These records should be passed through some modules. I mean a data exchange format similar to XML but more flexible and efficient.

On Sun, Nov 2, 2008 at 10:49 AM, lamfeeling <[EMAIL PROTECTED]> wrote:

Consider an embedded database? Berkeley DB, written in C++, with interfaces for many languages.

At 2008-11-02 10:15:22, "Zhou, Yunqing" <[EMAIL PROTECTED]> wrote:

The project I am focused on has many modules written in different languages (several modules are Hadoop jobs). So I'd like to use a common record-based data file format for data exchange. XML is not efficient for appending new records. SequenceFile does not seem to have APIs for languages other than Java. Protocol Buffers' Hadoop API seems to be under development. Any recommendation for this? Thanks
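For what it's worth, the SequenceFile path being discussed is only a few lines from Java. Below is a minimal sketch; the output path and record contents are placeholders, and the createWriter overload shown is the simple key/value one from the 0.18/0.19-era API.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

// Minimal sketch of appending records to a SequenceFile, the Java-only
// format mentioned above. Path and record contents are made up.
public class SeqFileSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path out = new Path("/tmp/records.seq");
        SequenceFile.Writer writer =
            SequenceFile.createWriter(fs, conf, out, Text.class, Text.class);
        try {
            writer.append(new Text("record-key-1"), new Text("record-value-1"));
        } finally {
            writer.close();
        }
    }
}

The append-friendliness is exactly what the poster wants; the catch, as noted, is that readers in languages other than Java would have to reimplement the container format.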
Re: Can anyone recommend me a inter-language data file format?
Sleepycat has a Java edition: http://www.oracle.com/technology/products/berkeley-db/index.html It has an "interesting" open source license. If you don't need to ship it on an install disk, you're probably good to go with that too. You could also consider Derby.

C

On Nov 1, 2008, at 7:49 PM, lamfeeling wrote:

Consider an embedded database? Berkeley DB, written in C++, with interfaces for many languages.

At 2008-11-02 10:15:22, "Zhou, Yunqing" <[EMAIL PROTECTED]> wrote:

The project I am focused on has many modules written in different languages (several modules are Hadoop jobs). So I'd like to use a common record-based data file format for data exchange. XML is not efficient for appending new records. SequenceFile does not seem to have APIs for languages other than Java. Protocol Buffers' Hadoop API seems to be under development. Any recommendation for this? Thanks
Re: Internet-Based Secure Clustered FS?
Have you considered Amazon S3? I don't know how strict your security requirements are. There are lots of companies using it just for offsite data storage and also with EC2.

C

On Jun 17, 2008, at 6:48 PM, Kenneth Miller wrote:

All, I'm looking for a solution that would allow me to securely use VPSs (hosted VMs) or hosted dedicated servers as nodes in a distributed file system. My bandwidth/speed requirements aren't high, my space requirements are potentially huge and ever growing, and superb security is a must, but I really don't want to worry about hosting the DFS in-house. Is there any solution that's capable of this, and/or is anyone currently doing this?

Regards, Kenneth Miller
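If S3 fits, Hadoop of that vintage can address it directly through its s3:// FileSystem. Below is a rough sketch, assuming the block-based S3 FileSystem bundled with 0.1x releases; the bucket name and credentials are placeholders, and the fs.s3.* property names should be checked against your version's hadoop-default.xml.

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Sketch: use Hadoop's S3-backed FileSystem as offsite storage. The bucket
// and credentials are placeholders; verify the property names for your release.
public class S3StoreSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.s3.awsAccessKeyId", "YOUR_ACCESS_KEY");      // placeholder
        conf.set("fs.s3.awsSecretAccessKey", "YOUR_SECRET_KEY");  // placeholder
        FileSystem s3 = FileSystem.get(URI.create("s3://my-backup-bucket"), conf);
        // copy a local file into the bucket-backed filesystem
        s3.copyFromLocalFile(new Path("/tmp/data.seq"), new Path("/backups/data.seq"));
        s3.close();
    }
}

Note that this gives you offsite storage, not encryption or access isolation; whether it meets "superb security" depends on what is layered on top.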
Re: client connect as different username?
Thanks Nicholas, I read it yet again (OK, only the third time). Yes, it talks of whoami; I actually knew that from single-stepping the client too, but I was still stuck. It mentions the POSIX model, which I had kind of guessed from the javadocs. Doug's note clearly states that no "foo" account need exist on the namenode, and that the only exception is the user that started the server. I didn't get that clarity from the permissions doc. Perhaps an example for the case where there are users other than the one that started the server would help; I would have thought this was a common one.

In our office we dumped this on a bunch of Linux boxes that all share the same username, but all our developers are using Macs with their own user names, and they don't expect to have their own user on the Linux boxes (because we are lazy that way). For instance, all it requires to give a Mac user with login "bob" access to things under /bob is for me to go in as the superuser and do something like:

hadoop dfs -mkdir /bob
hadoop dfs -chown bob /bob

where bob literally doesn't exist on the HDFS box and was not mentioned prior to those two commands.

On Jun 11, 2008, at 10:00 PM, [EMAIL PROTECTED] wrote:

This information can be found in http://hadoop.apache.org/core/docs/current/hdfs_permissions_guide.html

Nicholas

----- Original Message -----
From: Chris Collins <[EMAIL PROTECTED]>
To: core-user@hadoop.apache.org
Sent: Wednesday, June 11, 2008 9:31:18 PM
Subject: Re: client connect as different username?

Thanks Doug, should this be added to the permissions doc or to the FAQ? See you in Sonoma.

C

On Jun 11, 2008, at 9:15 PM, Doug Cutting wrote:

Chris Collins wrote: You are referring to creating a directory in HDFS? Because if I am user chris and the HDFS only has user foo, then I can't create a directory because I don't have perms; in fact I can't even connect.

Today, users and groups are declared by the client. The namenode only records and checks against user and group names provided by the client. So if someone named "foo" writes a file, then that file is owned by someone named "foo" and anyone named "foo" is the owner of that file. No "foo" account need exist on the namenode.

The one (important) exception is the "superuser". Whatever user name starts the namenode is the superuser for that filesystem. And if "/" is not world writable, a new filesystem will not contain a home directory (or anywhere else) writable by other users. So, in a multiuser Hadoop installation, the superuser needs to create home directories and project directories for other users and set their protections accordingly before other users can do anything. Perhaps this is what you've run into?

Doug
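The same two commands can also be issued from the Java client API if you are scripting user setup. A small sketch, which assumes it runs as the user that started the namenode (the superuser); the namenode URI is a placeholder, and passing null for the group is intended to leave it unchanged:

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Programmatic equivalent of "hadoop dfs -mkdir /bob; hadoop dfs -chown bob /bob".
// Must run as the superuser; "bob" does not need to exist anywhere on the namenode host.
public class CreateHomeDir {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(URI.create("hdfs://linuxbox:9000"), conf);
        Path home = new Path("/bob");
        fs.mkdirs(home);                    // -mkdir /bob
        fs.setOwner(home, "bob", null);     // -chown bob /bob
        fs.close();
    }
}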
Re: Programatically initializing and starting HDFS cluster
I am also interested in this option, since I will probably be hacking at such a thing in the next few weeks. I am also curious whether you can run MR jobs in process rather than launching each time. The scenario is when initialization takes just way too long for a map-reduce shard to be executed in this model. For example, say you are trying to compute the top n terms within a set of documents, where the top n are the rarest terms in some model corpus; perhaps you have a df index, or perhaps you have a huge NLP engine that's used for entity extraction. Any of these needs a chunk of memory and a chunk of time to initialize on each pass. Here, of course, you really would need not only to specify the job but somehow constrain the candidate nodes it can run on based upon their ability to run it.

C

On Jun 12, 2008, at 2:02 AM, Robert Krüger wrote:

Hi, for our developers I would like to write a few lines of Java code that, given a base directory, sets up an HDFS filesystem, initializes it if it is not there yet, and then starts the service(s) in process. This is to run on each developer's machine, probably within a Tomcat instance. I don't want to do this (if I don't have to) in a bunch of shell scripts. Could anyone point to code samples that do similar things, or give any other hints that make this easier than looking at what the command line tools do and reverse engineering it from there?

Thanks in advance, Robert
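One concrete code sample to look at is MiniDFSCluster, the helper Hadoop's own unit tests use to run a namenode and datanodes in process. Below is a minimal sketch, assuming the class from the Hadoop test jar (org.apache.hadoop.dfs.MiniDFSCluster in the 0.1x line, org.apache.hadoop.hdfs.MiniDFSCluster later); the base directory is a placeholder and the constructor arguments may vary slightly by release.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.dfs.MiniDFSCluster;   // from the hadoop-*-test jar; later versions: org.apache.hadoop.hdfs
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Sketch: start an in-process HDFS on a developer machine, as asked above.
// The base directory is a placeholder.
public class InProcessHdfs {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // MiniDFSCluster reads this property to decide where to put its name/data dirs
        System.setProperty("test.build.data", "/path/to/devel/hdfs-base");
        // 1 datanode; format=true wipes and re-formats, pass false to reuse existing data
        MiniDFSCluster cluster = new MiniDFSCluster(conf, 1, true, null);
        FileSystem fs = cluster.getFileSystem();
        fs.mkdirs(new Path("/sanity-check"));
        System.out.println("in-process HDFS is up at " + fs.getUri());
        // ... hand fs to the rest of the app; call cluster.shutdown() on exit
        cluster.shutdown();
    }
}

Keep in mind that MiniDFSCluster was written for tests, so its formatting and shutdown behaviour deserve care before using it as a shared developer instance inside Tomcat.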
Re: client connect as different username?
Thanks Doug, should this be added to the permissions doc or to the FAQ? See you in Sonoma.

C

On Jun 11, 2008, at 9:15 PM, Doug Cutting wrote:

Chris Collins wrote: You are referring to creating a directory in HDFS? Because if I am user chris and the HDFS only has user foo, then I can't create a directory because I don't have perms; in fact I can't even connect.

Today, users and groups are declared by the client. The namenode only records and checks against user and group names provided by the client. So if someone named "foo" writes a file, then that file is owned by someone named "foo" and anyone named "foo" is the owner of that file. No "foo" account need exist on the namenode.

The one (important) exception is the "superuser". Whatever user name starts the namenode is the superuser for that filesystem. And if "/" is not world writable, a new filesystem will not contain a home directory (or anywhere else) writable by other users. So, in a multiuser Hadoop installation, the superuser needs to create home directories and project directories for other users and set their protections accordingly before other users can do anything. Perhaps this is what you've run into?

Doug
Re: client connect as different username?
We know whoami is called, thanks; I found that out painfully the first day I played with this, because in dev my IDE is started not from a shell, and therefore the PATH it inherits does not include /usr/bin. The HDFS client hides the fact that ProcessBuilder barfs with a file-not-found behind a "login exception", "whoami". Not as clear as I would have liked :-}|

You are referring to creating a directory in HDFS? Because if I am user chris and the HDFS only has user foo, then I can't create a directory because I don't have perms; in fact I can't even connect. I believe another emailer holds the answer, which was blindly dumb on my part for not trying: adding a user in Unix and creating a group that those users belong to.

Thanks Chris

On Jun 11, 2008, at 5:36 PM, Allen Wittenauer wrote:

On 6/11/08 5:17 PM, "Chris Collins" <[EMAIL PROTECTED]> wrote: The finer point to this is that in development you may be logged in as user x and have a shared HDFS instance that a number of people are using. In that mode it's not practical to sudo, as you have all your development tools set up for user x. HDFS is set up with a single user; what is the procedure to add users to that HDFS instance? It has to support it, surely? It's really not obvious: looking in the HDFS docs that come with the distro, nothing springs out, and the hadoop command line tool doesn't have anything that vaguely looks like a way to create a user.

User information is sent from the client. The code literally does a 'whoami' and 'groups' and sends that information to the server.

Shared data should be handled just like you would in UNIX:
- create a directory
- set permissions to be insecure
- go crazy
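Allen's "create a directory, set permissions to be insecure" recipe, sketched via the Java API rather than the shell (path and namenode URI are placeholders; the CLI equivalent would be -mkdir followed by -chmod 777, run as the superuser):

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;

// Sketch of the shared-directory approach: one world-writable area that any
// client-declared user can use. Path and namenode URI are placeholders.
public class SharedDirSetup {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(URI.create("hdfs://linuxbox:9000"), conf);
        Path shared = new Path("/shared");
        fs.mkdirs(shared);
        fs.setPermission(shared, new FsPermission((short) 0777)); // deliberately insecure, per the advice above
        fs.close();
    }
}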
Re: client connect as different username?
The finer point to this is that in development you may be logged in as user x and have a shared HDFS instance that a number of people are using. In that mode it's not practical to sudo, as you have all your development tools set up for user x. HDFS is set up with a single user; what is the procedure to add users to that HDFS instance? It has to support it, surely? It's really not obvious: looking in the HDFS docs that come with the distro, nothing springs out, and the hadoop command line tool doesn't have anything that vaguely looks like a way to create a user. Help is greatly appreciated; I am sure it's somewhere blindingly obvious. How are other people doing this, other than sudoing to one single user name?

Thanks ChRiS

On Jun 11, 2008, at 5:11 PM, [EMAIL PROTECTED] wrote:

The best way is to use the sudo command to execute the hadoop client. Does it work for you?

Nicholas

----- Original Message -----
From: Bob Remeika <[EMAIL PROTECTED]>
To: core-user@hadoop.apache.org
Sent: Wednesday, June 11, 2008 12:56:14 PM
Subject: client connect as different username?

Apologies if this is an RTM response, but I looked and wasn't able to find anything concrete. Is it possible to connect to HDFS via the HDFS client under a different username than I am currently logged in as? Here is our situation: I am user bobr on the client machine. I need to add something to the HDFS cluster as the user "companyuser". Is this possible with the current set of APIs, or do I have to upload and "chown"?

Thanks, Bob
RE: Couple of basic hdfs starter issues
I should update this to stupidity on my part (though the hidden shell execution within the client, whose error gets masked, is somewhat fickle). Of course, if I don't start the thing up via the IDE but from the command line, it gets past this problem (then a security issue, but that one is probably a more obvious thing). Still, if anyone has an idea what happened to language id and the carrot2 stuff inside Nutch, that would be appreciated.

C

-----Original Message-----
From: chris collins [mailto:[EMAIL PROTECTED]]
Sent: Sat 6/7/2008 10:54 AM
To: core-user@hadoop.apache.org
Subject: Couple of basic hdfs starter issues

Sorry in advance if these "challenges" are covered in a document somewhere. I have set up Hadoop on a CentOS 64-bit Linux box. I have verified that it is up and running only through seeing the Java processes running and that I can access it from the admin UI. The Hadoop version is 1.7.0, but I also tried 1.6.4 for the following issue:

From a Mac OS X box using Java 1.5 I am trying to run the following:

String home = "hdfs://linuxbox:9000";
URI uri = new URI(home);
Configuration conf = new Configuration();
FileSystem fs = FileSystem.get(uri, conf);

The call to FileSystem.get throws an IOException stating that there is a login error with message "whoami". When I single-step through the code, there is an attempt to figure out what user is running this process by creating a ProcessBuilder with "whoami". This fails with a "not found" error. I believe this is because you have to have a fully qualified path for ProcessBuilder on the Mac??? I also verified that my hadoop-default.xml and hadoop-site.xml are in fact found in the classpath. All this is being attempted via a debug session in the IntelliJ IDE. Any ideas on what I am doing wrong? I am sure it's a configuration blunder on my part.

Further, we used to use an old copy of Nutch; of course now the Hadoop part of Nutch is its own jar file, so I upgraded the Nutch jars too. We were using a few things within the Nutch project that seem to have gone away:

net.sf incarnation of the Snowball stemmer (I fixed this by pulling the source directly from the author).
language identification... any idea where it went?
carrot2 clustering... any idea where that went?

Thanks in advance.

Chris
Couple of basic hdfs starter issues
Sorry in advance if these "challenges" are covered in a document somewhere. I have set up Hadoop on a CentOS 64-bit Linux box. I have verified that it is up and running only through seeing the Java processes running and that I can access it from the admin UI. The Hadoop version is 1.7.0, but I also tried 1.6.4 for the following issue:

From a Mac OS X box using Java 1.5 I am trying to run the following:

String home = "hdfs://linuxbox:9000";
URI uri = new URI(home);
Configuration conf = new Configuration();
FileSystem fs = FileSystem.get(uri, conf);

The call to FileSystem.get throws an IOException stating that there is a login error with message "whoami". When I single-step through the code, there is an attempt to figure out what user is running this process by creating a ProcessBuilder with "whoami". This fails with a "not found" error. I believe this is because you have to have a fully qualified path for ProcessBuilder on the Mac??? I also verified that my hadoop-default.xml and hadoop-site.xml are in fact found in the classpath. All this is being attempted via a debug session in the IntelliJ IDE. Any ideas on what I am doing wrong? I am sure it's a configuration blunder on my part.

Further, we used to use an old copy of Nutch; of course now the Hadoop part of Nutch is its own jar file, so I upgraded the Nutch jars too. We were using a few things within the Nutch project that seem to have gone away:

net.sf incarnation of the Snowball stemmer (I fixed this by pulling the source directly from the author).
language identification... any idea where it went?
carrot2 clustering... any idea where that went?

Thanks in advance.

Chris
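The masked failure described above is reproducible outside Hadoop with plain ProcessBuilder. Below is a small sketch (nothing Hadoop-specific, just JDK behaviour) showing that a bare command name resolves only through the PATH the JVM inherited, which is why the IDE-launched client failed while the same code worked from a shell:

import java.io.IOException;

// Reproduces the masked failure: ProcessBuilder resolves bare command names
// against the PATH this JVM inherited, so an IDE launched without /usr/bin on
// its PATH cannot exec "whoami" and throws IOException ("not found"), which
// the HDFS client then surfaces as a login error.
public class WhoamiPathCheck {
    public static void main(String[] args) {
        System.out.println("PATH seen by this JVM: " + System.getenv("PATH"));
        try {
            new ProcessBuilder("whoami").start();          // fails if whoami isn't on PATH
            System.out.println("bare 'whoami' launched fine");
        } catch (IOException e) {
            System.out.println("bare 'whoami' failed: " + e.getMessage());
        }
        try {
            new ProcessBuilder("/usr/bin/whoami").start(); // fully qualified path works regardless of PATH
            System.out.println("'/usr/bin/whoami' launched fine");
        } catch (IOException e) {
            System.out.println("'/usr/bin/whoami' failed: " + e.getMessage());
        }
    }
}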