Re: task jvm bootstrapping via distributed cache

2013-01-17 Thread Stan Rosenberg
Hi, As I suspected, cache files are symlinked after a child JVM is started: TaskRunner.setupWorkDir is called from org.apache.hadoop.mapred.Child.main. This is unfortunate, as it makes it impossible to leverage the distributed cache for deploying JVM agents. I could submit a JIRA if

Re: task jvm bootstrapping via distributed cache

2012-08-04 Thread rahul p
Hi Arun, I am new to Hadoop and big data. Can you help me start working on the basics? My experience is in ETL and BI DWH. Rahul On Aug 4, 2012 12:33 AM, "Stan Rosenberg" wrote: > Arun, > > I don't believe the symlink is of help. The symlink is created in the > task's current working directory (cwd

Re: task jvm bootstrapping via distributed cache

2012-08-03 Thread Stan Rosenberg
On Fri, Aug 3, 2012 at 4:19 PM, Arun C Murthy wrote: > Just do -javaagent:./profiler.jar? > Yep, that should work. Thanks!

Re: task jvm bootstrapping via distributed cache

2012-08-03 Thread Arun C Murthy
Just do -javaagent:./profiler.jar? On Aug 3, 2012, at 9:32 AM, Stan Rosenberg wrote: > Arun, > > I don't believe the symlink is of help. The symlink is created in the > task's current working directory (cwd), but I don't know what cwd is > when I launch with 'hadoop jar ...'. > > Thanks, > >

Re: task jvm bootstrapping via distributed cache

2012-08-03 Thread Stan Rosenberg
On Fri, Aug 3, 2012 at 1:31 PM, Harsh J wrote: > > What this would do is merely take your passed -files jar (client-common) and > symlink it into the JVM's working directory (the task's working directory) > _before_ the JVM is begun, as "foo.jar". So if I pass additionally, JVM opts > that refer t

Re: task jvm bootstrapping via distributed cache

2012-08-03 Thread Harsh J
Stan, What Arun says would surely work. For instance, read this command: hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.0.0.jar pi -files "share/hadoop/mapreduce/hadoop-mapreduce-client-common-2.0.0.jar#foo.jar" -Dmapred.child.java.opts="-javaagent:./foo.jar" 1 1 What this would
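For readers who want to see how the pieces of Harsh's command fit together, here is a minimal sketch that only assembles and prints it, without running Hadoop. AGENT_JAR, LINK_NAME and build_cmd are illustrative names introduced here, not Hadoop settings; the jar paths are the ones quoted in the message. Whether the symlink already exists when the child JVM processes -javaagent appears to depend on the Hadoop version: a later message in this thread (2013-01-17) reports that cache files are symlinked only after the child JVM has started.

```shell
# Sketch only: prints the command from the message above; it does not invoke Hadoop.
# AGENT_JAR and LINK_NAME are illustrative variable names (assumptions).
AGENT_JAR="share/hadoop/mapreduce/hadoop-mapreduce-client-common-2.0.0.jar"
LINK_NAME="foo.jar"

build_cmd() {
  # "-files path#name" ships the file via the distributed cache and requests
  # a symlink called "name" in the task's working directory.
  # The -javaagent path is relative, so it resolves against that same directory.
  printf '%s ' hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.0.0.jar pi \
    -files "${AGENT_JAR}#${LINK_NAME}" \
    "-Dmapred.child.java.opts=-javaagent:./${LINK_NAME}" 1 1
  printf '\n'
}

build_cmd
```

The point of the fragment name after '#' is that the client never needs to know the task's absolute working directory; both the symlink and the agent path are expressed relative to it.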

Re: task jvm bootstrapping via distributed cache

2012-08-03 Thread Stan Rosenberg
Arun, I don't believe the symlink is of help. The symlink is created in the task's current working directory (cwd), but I don't know what cwd is when I launch with 'hadoop jar ...'. Thanks, stan On Fri, Aug 3, 2012 at 2:39 AM, Arun C Murthy wrote: > Stan, > > You can ask TT to create a symli

Re: task jvm bootstrapping via distributed cache

2012-08-02 Thread Arun C Murthy
Stan, You can ask TT to create a symlink to your jar shipped via DistCache: http://hadoop.apache.org/common/docs/r1.0.3/mapred_tutorial.html#DistributedCache That should give you what you want. hth, Arun On Jul 30, 2012, at 3:23 PM, Stan Rosenberg wrote: > Hi, > > I am seeking a way to le

Re: task jvm bootstrapping via distributed cache

2012-07-31 Thread Stan Rosenberg
I am guessing this is either a well-known problem or an edge case. In any case, would it be a bad idea to designate predetermined output paths? E.g., DistributedCache.addCacheFileInto(uri, conf, outputPath) would attempt to copy the cached file into the specified path resolving to a task's local f