Thank you for your time and suggestions. I've already tried Starfish, but not jmap; I'll check it out.

Thanks again,
Mark
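P.S. For the archive, here is the sort of configuration I plan to test, based on the thread Charles linked. This is only an untested sketch: the property name is the Hadoop 1.x mapred.child.java.opts, port 8999 is an arbitrary choice, and a fixed port only works if each node runs a single task JVM at a time (i.e. the "absolute minimum number of tasks" setup).

  <!-- mapred-site.xml: open a JMX port on each child task JVM so that
       jConsole / VisualVM can attach to it remotely (sketch only) -->
  <property>
    <name>mapred.child.java.opts</name>
    <value>-Xmx512m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=8999 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false</value>
  </property>

Once a task is running, the stock JDK tools can also be pointed at it on the slave node itself:

  jps                 # find the pid of the Child task JVM
  jmap -histo <pid>   # histogram of live objects by class
  jmap -heap <pid>    # heap configuration and current usage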
On Wed, Feb 29, 2012 at 1:17 PM, Charles Earl <charles.ce...@gmail.com> wrote:

> I assume you have also tried running locally and using the JDK
> performance tools (e.g. jmap) to gain insight, by configuring Hadoop to
> run the absolute minimum number of tasks?
> Perhaps the discussion
> http://grokbase.com/t/hadoop/common-user/11ahm67z47/how-do-i-connect-java-visual-vm-to-a-remote-task
> might be relevant?
>
> On Feb 29, 2012, at 3:53 PM, Mark question wrote:
>
>> I've used Hadoop profiling (.prof) to show the stack trace, but it was
>> hard to follow. I've only used jConsole locally, since I couldn't find a
>> way to set a port number for child processes when running them remotely.
>> Linux commands (top, /proc) showed me that virtual memory is almost twice
>> my physical memory, which means swapping is happening, and that is
>> exactly what I'm trying to avoid.
>>
>> So basically, is there a way to assign a port to child processes to
>> monitor them remotely (asked before by Xun), or would you recommend
>> another monitoring tool?
>>
>> Thank you,
>> Mark
>>
>> On Wed, Feb 29, 2012 at 11:35 AM, Charles Earl <charles.ce...@gmail.com> wrote:
>>
>>> Mark,
>>> So if I understand, it is more the memory management that you are
>>> interested in, rather than a need to run an existing C or C++
>>> application on the MapReduce platform?
>>> Have you done profiling of the application?
>>> C
>>>
>>> On Feb 29, 2012, at 2:19 PM, Mark question wrote:
>>>
>>>> Thanks Charles .. I'm running Hadoop for research, to perform
>>>> duplicate detection methods. To go deeper, I need to understand what's
>>>> slowing my program down, which usually starts with analyzing memory to
>>>> predict the best input size for a map task. So you're saying piping can
>>>> help me control memory even though it's running on a VM eventually?
>>>>
>>>> Thanks,
>>>> Mark
>>>>
>>>> On Wed, Feb 29, 2012 at 11:03 AM, Charles Earl <charles.ce...@gmail.com> wrote:
>>>>
>>>>> Mark,
>>>>> Both streaming and pipes allow this, perhaps more so pipes at the
>>>>> level of the MapReduce task. Can you provide more details on the
>>>>> application?
>>>>>
>>>>> On Feb 29, 2012, at 1:56 PM, Mark question wrote:
>>>>>
>>>>>> Hi guys, thought I should ask this before I use it ... will using C
>>>>>> over Hadoop give me the usual C memory management? For example,
>>>>>> malloc(), sizeof()? My guess is no, since this will all eventually be
>>>>>> turned into bytecode, but I need more control over memory, which is
>>>>>> obviously hard for me to do with Java.
>>>>>>
>>>>>> Let me know of any advantages you know about for streaming in C over
>>>>>> Hadoop.
>>>>>> Thank you,
>>>>>> Mark
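One clarification for the archive, since this is where my confusion started: with Hadoop Streaming the mapper is launched as a separate native process that reads records on stdin and writes key<TAB>value lines on stdout, so malloc(), free() and sizeof behave exactly as in any ordinary C program; only the Java wrapper task runs inside the JVM, and nothing in the C code is turned into bytecode. A minimal, hypothetical sketch of such a mapper (the duplicate-detection logic here is just a placeholder):

  #include <stdio.h>
  #include <stdlib.h>
  #include <string.h>

  /* Hypothetical Hadoop Streaming mapper: an ordinary native process.
   * One input record per line on stdin, "key\tvalue" lines on stdout. */
  int main(void) {
      char buf[4096];
      while (fgets(buf, sizeof(buf), stdin) != NULL) {
          size_t len = strcspn(buf, "\r\n");
          buf[len] = '\0';                  /* strip the newline */
          char *copy = malloc(len + 1);     /* plain C allocation, no JVM */
          if (copy == NULL) return 1;
          memcpy(copy, buf, len + 1);
          /* placeholder: emit the record as key and its length as value;
           * real duplicate detection would emit a fingerprint instead */
          printf("%s\t%zu\n", copy, len);
          free(copy);
      }
      return 0;
  }

It would be launched with something along these lines (the streaming jar path varies by Hadoop version and distribution, and dedup_mapper is just an example binary name):

  hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-streaming-*.jar \
      -input in/ -output out/ \
      -mapper ./dedup_mapper -reducer /bin/cat \
      -file dedup_mapper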