Re: Load .so library error when Hadoop calls JNI interfaces

2009-04-30 Thread Ian jonhson
Thanks for answering.

I run my Hadoop on a single node, not in cluster mode.



On Thu, Apr 30, 2009 at 11:21 AM, jason hadoop jason.had...@gmail.com wrote:
 You need to make sure that the shared library is available on the
 tasktracker nodes, either by installing it, or by pushing it around via the
 distributed cache
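
 (For illustration, a minimal sketch of the distributed-cache approach with the
 old mapred API; the HDFS path, class name, and link name below are hypothetical:)

 import java.net.URI;
 import org.apache.hadoop.filecache.DistributedCache;
 import org.apache.hadoop.mapred.JobConf;

 public class ShipNativeLib {
     public static void main(String[] args) throws Exception {
         JobConf conf = new JobConf(ShipNativeLib.class);
         // make the cached file show up as a symlink in each task's
         // working directory, under the name after the '#'
         DistributedCache.createSymlink(conf);
         DistributedCache.addCacheFile(
             new URI("hdfs://namenode:9000/libs/libmyclass.so#libmyclass.so"), conf);
         // ... set mapper/reducer, input/output paths, then submit the job;
         // a task can then load the library by absolute path:
         //   System.load(new java.io.File("libmyclass.so").getAbsolutePath());
     }
 }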



 On Wed, Apr 29, 2009 at 8:19 PM, Ian jonhson jonhson@gmail.com wrote:

 Dear all,

 I wrote plugin code for Hadoop which calls the interfaces
 in a C++-built .so library. The plugin code is written in Java,
 so I prepared a JNI class to encapsulate the C interfaces.

 The Java code executes successfully when I compile and run it
 standalone. However, it does not work when embedded
 in Hadoop. The exception shown is (found in the Hadoop logs):


 -----------  screen dump  -----------

 # grep myClass logs/* -r
 logs/hadoop-hadoop-tasktracker-testbed0.container.org.out:Exception in thread "JVM Runner jvm_200904261632_0001_m_-1217897050 spawned." java.lang.UnsatisfiedLinkError: org.apache.hadoop.mapred.myClass.myClassfsMount(Ljava/lang/String;)I
 logs/hadoop-hadoop-tasktracker-testbed0.container.org.out:      at org.apache.hadoop.mapred.myClass.myClassfsMount(Native Method)
 logs/hadoop-hadoop-tasktracker-testbed0.container.org.out:Exception in thread "JVM Runner jvm_200904261632_0001_m_-1887898624 spawned." java.lang.UnsatisfiedLinkError: org.apache.hadoop.mapred.myClass.myClassfsMount(Ljava/lang/String;)I
 logs/hadoop-hadoop-tasktracker-testbed0.container.org.out:      at org.apache.hadoop.mapred.myClass.myClassfsMount(Native Method)
 ...

 

 It seems the library cannot be loaded in Hadoop. My code
 (myClass.java) looks like:


 ---  myClass.java  --

 import java.io.IOException;
 import java.lang.reflect.Field;

 import org.apache.commons.logging.Log;
 import org.apache.commons.logging.LogFactory;

 public class myClass
 {
     public static final Log LOG =
         LogFactory.getLog("org.apache.hadoop.mapred.myClass");

     public myClass() {
         try {
             //System.setProperty("java.library.path", "/usr/local/lib");

             /* The above line does not work, so I have to do something
              * like the following line.
              */
             addDir(new String("/usr/local/lib"));
             System.loadLibrary("myclass");
         } catch (UnsatisfiedLinkError e) {
             LOG.info("Cannot load library:\n " + e.toString());
         } catch (IOException ioe) {
             LOG.info("IO error:\n " + ioe.toString());
         }
     }

     /* Since System.setProperty() does not work, I have to add the
      * following function to force the path to be added to
      * java.library.path.
      */
     public static void addDir(String s) throws IOException {
         try {
             Field field = ClassLoader.class.getDeclaredField("usr_paths");
             field.setAccessible(true);
             String[] paths = (String[]) field.get(null);
             for (int i = 0; i < paths.length; i++) {
                 if (s.equals(paths[i])) {
                     return;
                 }
             }
             String[] tmp = new String[paths.length + 1];
             System.arraycopy(paths, 0, tmp, 0, paths.length);
             tmp[paths.length] = s;
             field.set(null, tmp);
         } catch (IllegalAccessException e) {
             throw new IOException("Failed to get permissions to set library path");
         } catch (NoSuchFieldException e) {
             throw new IOException("Failed to get field handle to set library path");
         }
     }

     public native int myClassfsMount(String subsys);
     public native int myClassfsUmount(String subsys);

 }

 
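
 (One way to sidestep java.library.path entirely, assuming the library lives at
 the poster's /usr/local/lib location, is to load it by absolute path:)

     // load by absolute path instead of searching java.library.path
     System.load("/usr/local/lib/libmyclass.so");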


 I don't know what is missing in my code, and am wondering whether there are any
 rules in Hadoop I should follow to achieve my goal.

 FYI, myClassfsMount() and myClassfsUmount() open a socket to call
 services from a daemon. I hope this design is not what causes the
 failure in my code.


 Any comments?


 Thanks in advance,

 Ian




 --
 Alpha Chapters of my book on Hadoop are available
 http://www.apress.com/book/view/9781430219422



Re: Load .so library error when Hadoop calls JNI interfaces

2009-04-30 Thread He Yongqiang
put your .so file in every tracker's Hadoop-install/lib/native/Linux-xxx-xx/

Or
 
In your code, try to do:

  String oldPath = System.getProperty("java.library.path");
  System.setProperty("java.library.path", oldPath == null ?
      local_path_of_lib_file : oldPath + pathSeparator + local_path_of_lib_file);
  System.loadLibrary("XXX");

However, you also need to fetch the library to the local node, either
through the DistributedCache (like Jason said) or by putting it in HDFS
and getting it from there yourself.

On 09-4-30 5:14 PM, Ian jonhson jonhson@gmail.com wrote:

 You mean that the current hadoop does not support JNI calls, right?
 Are there any solutions to achieve the calls from C interfaces?
 
 2009/4/30 He Yongqiang heyongqi...@software.ict.ac.cn:
 Does hadoop now support jni calls in Mappers or Reducers? If yes, how? If
 not, I think we should create a jira issue for supporting that.
 
 
Re: Load .so library error when Hadoop calls JNI interfaces

2009-04-30 Thread He Yongqiang
Does hadoop now support jni calls in Mappers or Reducers? If yes, how? If
not, I think we should create a jira issue for supporting that.


On 09-4-30 4:02 PM, Ian jonhson jonhson@gmail.com wrote:

 Thanks for answering.
 
 I run my Hadoop on a single node, not in cluster mode.
 
 
 




Re: I need help

2009-04-30 Thread Steve Loughran

Razen Alharbi wrote:

Thanks everybody,

The issue was that Hadoop writes all the output to stderr instead of stdout,
and I don't know why. I would really love to know why the usual Hadoop job
progress is written to stderr.


because there is a line in log4j.properties telling it to do just that?

log4j.appender.console.target=System.err
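
(Pointing it at stdout instead is a one-line change in conf/log4j.properties,
if that's what you want:)

log4j.appender.console.target=System.out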


--
Steve Loughran  http://www.1060.org/blogxter/publish/5
Author: Ant in Action   http://antbook.org/


Re: Load .so library error when Hadoop calls JNI interfaces

2009-04-30 Thread Rakhi Khatwani
Hi Jason,
 when will the full version of your book be available??

On Thu, Apr 30, 2009 at 8:51 AM, jason hadoop jason.had...@gmail.com wrote:

 You need to make sure that the shared library is available on the
 tasktracker nodes, either by installing it, or by pushing it around via the
 distributed cache





Re: Load .so library error when Hadoop calls JNI interfaces

2009-04-30 Thread Ian jonhson
2009/4/30 He Yongqiang heyongqi...@software.ict.ac.cn:
 put your .so file in every tracker's Hadoop-install/lib/native/Linux-xxx-xx/

 Or

 In your code, try to do:

  String oldPath = System.getProperty("java.library.path");
  System.setProperty("java.library.path", oldPath == null ?
      local_path_of_lib_file : oldPath + pathSeparator + local_path_of_lib_file);
  System.loadLibrary("XXX");



I have copied the .so and .a files to Hadoop-install/lib/native/Linux-xxx-xx/
and called System.loadLibrary("XXX"); in my code, but nothing happens.

Then I tried the second solution mentioned above, and the same problem
occurred (the .so files were already in the native directory).



 However, you also need to fetch the library to the local node, either
 through the DistributedCache (like Jason said) or by putting it in HDFS
 and getting it from there yourself.


Do I need to copy the libraries to the local machine, since I run Hadoop
on a single node?

How can I do that, either by fetching or putting from HDFS?


 On 09-4-30 5:14 PM, Ian jonhson jonhson@gmail.com wrote:

 You mean that the current hadoop does not support JNI calls, right?
 Are there any solutions to achieve the calls from C interfaces?

 2009/4/30 He Yongqiang heyongqi...@software.ict.ac.cn:
 Does hadoop now support jni calls in Mappers or Reducers? If yes, how? If
 not, I think we should create a jira issue for supporting that.




Re: Load .so library error when Hadoop calls JNI interfaces

2009-04-30 Thread Ian jonhson
You mean that the current hadoop does not support JNI calls, right?
Are there any solutions to achieve the calls from C interfaces?

2009/4/30 He Yongqiang heyongqi...@software.ict.ac.cn:
 Does hadoop now support jni calls in Mappers or Reducers? If yes, how? If
 not, I think we should create a jira issue for supporting that.







Re: unable to see anything in stdout

2009-04-30 Thread Aaron Kimball
First thing I would do is to run the job in the local jobrunner (as a single
process on your local machine without involving the cluster):

JobConf conf = ...
// set other params, mapper, etc. here
conf.set("mapred.job.tracker", "local"); // use localjobrunner
conf.set("fs.default.name", "file:///"); // read from local hard disk instead of hdfs

JobClient.runJob(conf);


This will actually print stdout, stderr, etc. to your local terminal. Try
this on a single input file. This will let you confirm that it does, in
fact, write to stdout.

- Aaron

On Thu, Apr 30, 2009 at 9:00 AM, Asim linka...@gmail.com wrote:

 Hi,

 I am not able to see any job output in userlogs/task_id/stdout. It
 remains empty even though I have many println statements. Are there
 any steps to debug this problem?

 Regards,
 Asim



Re: Specifying System Properties in the had

2009-04-30 Thread Aaron Kimball
So you want a different "-Dfoo=test" on each node? It's probably grabbing
the setting from the node where the job was submitted, and this overrides
the settings on each task node.

Try adding <final>true</final> to the property block on the tasktrackers,
then restart Hadoop and try again. This will prevent the job from overriding
the setting.
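
For example (a sketch of Marc's property block below, with the final flag added):

<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx200m -Dfoo=test</value>
  <final>true</final>
</property>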

- Aaron

On Thu, Apr 30, 2009 at 9:25 AM, Marc Limotte mlimo...@feeva.com wrote:

 I'm trying to set a System Property in the Hadoop config, so my jobs will
 know which cluster they are running on.  I think I should be able to do this
 with "-Dname=value" in mapred.child.java.opts (example below), but the
 setting is ignored.

 In hadoop-site.xml I have:

 <property>
   <name>mapred.child.java.opts</name>
   <value>-Xmx200m -Dfoo=test</value>
 </property>

 But the job conf through the web server indicates:

 mapred.child.java.opts  -Xmx1024M -Duser.timezone=UTC

 I'm using Hadoop-0.17.2.1.
 Any tips on why my setting is not picked up?

 Marc




Re: Specifying System Properties in the had

2009-04-30 Thread Tom White
Another way to do this would be to set a property in the Hadoop config itself.

In the job launcher you would have something like:

JobConf conf = ...
conf.set("foo", "test");

Then you can read the property in your map or reduce task.
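
(A minimal sketch of the reading side with the old mapred API; the mapper and
the "foo" property follow the example above and are otherwise hypothetical:)

import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class FooMapper extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, Text> {
    private String foo;

    public void configure(JobConf job) {
        foo = job.get("foo");  // read the value set by the job launcher
    }

    public void map(LongWritable key, Text value,
            OutputCollector<Text, Text> out, Reporter reporter)
            throws IOException {
        out.collect(new Text(foo), value);  // use the property as needed
    }
}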

Tom

On Thu, Apr 30, 2009 at 3:25 PM, Aaron Kimball aa...@cloudera.com wrote:
 So you want a different -Dfoo=test on each node? It's probably grabbing
 the setting from the node where the job was submitted, and this overrides
 the settings on each task node.

 Try adding <final>true</final> to the property block on the tasktrackers,
 then restart Hadoop and try again. This will prevent the job from overriding
 the setting.

 - Aaron





Infinite Loop Resending status from task tracker

2009-04-30 Thread Lance Riedel
Has anyone seen this before? Our task tracker produced a 2.7 gig log file
in a few hours. The entry is all the same (every 2 ms):


2009-04-30 02:34:40,207 INFO org.apache.hadoop.mapred.TaskTracker: Resending 'status' to 'ec2-xx-xx-xx-xx.compute-1.amazonaws.com' with reponseId '5341
2009-04-30 02:34:40,398 INFO org.apache.hadoop.mapred.TaskTracker: Resending 'status' to 'ec2-xx-xx-xx-xx.compute-1.amazonaws.com' with reponseId '5341
2009-04-30 02:34:40,403 INFO org.apache.hadoop.mapred.TaskTracker: Resending 'status' to 'ec2-xx-xx-xx-xx.compute-1.amazonaws.com' with reponseId '5341
2009-04-30 02:34:40,411 INFO org.apache.hadoop.mapred.TaskTracker: Resending 'status' to 'ec2-xx-xx-xx-xx.compute-1.amazonaws.com' with reponseId '5341
2009-04-30 02:34:40,414 INFO org.apache.hadoop.mapred.TaskTracker: Resending 'status' to 'ec2-xx-xx-xx-xx.compute-1.amazonaws.com' with reponseId '5341
2009-04-30 02:34:40,417 INFO org.apache.hadoop.mapred.TaskTracker: Resending 'status' to 'ec2-xx-xx-xx-xx.compute-1.amazonaws.com' with reponseId '5341
2009-04-30 02:34:40,420 INFO org.apache.hadoop.mapred.TaskTracker: Resending 'status' to 'ec2-xx-xx-xx-xx.compute-1.amazonaws.com' with reponseId '5341

... (And on and on and on...)


These are the few lines before it started:

2009-04-30 02:34:29,780 INFO org.apache.hadoop.mapred.TaskTracker.clienttrace: src: xxx.xxx.xxx.xxx:50060, dest: 10.253.178.95:40268, bytes: 3341324, op: MAPRED_SHUFFLE, cliID: attempt_200904291917_0352_m_06_0
2009-04-30 02:34:31,522 INFO org.apache.hadoop.mapred.TaskTracker: Sent out 418891 bytes for reduce: 12 from map: attempt_200904291917_0352_m_07_0 given 418891/418887 from 4301462 with (22, 171)
2009-04-30 02:34:31,522 INFO org.apache.hadoop.mapred.TaskTracker.clienttrace: src: xxx.xxx.xxx.xxx:50060, dest: xxx.xxx.xxx.xxx:40268, bytes: 418891, op: MAPRED_SHUFFLE, cliID: attempt_200904291917_0352_m_07_0
2009-04-30 02:34:35,382 INFO org.apache.hadoop.mapred.TaskTracker: attempt_200904291917_0352_r_03_0 0.3030303% reduce > copy (10 of 11 at 0.32 MB/s) >
2009-04-30 02:34:38,385 INFO org.apache.hadoop.mapred.TaskTracker: attempt_200904291917_0352_r_03_0 0.3030303% reduce > copy (10 of 11 at 0.32 MB/s) >
2009-04-30 02:34:40,207 INFO org.apache.hadoop.mapred.TaskTracker: Resending 'status' to 'ec2-xx-xx-xx-xx.compute-1.amazonaws.com' with reponseId '5341
2009-04-30 02:34:40,398 INFO org.apache.hadoop.mapred.TaskTracker: Resending 'status' to 'ec2-xx-xx-xx-xx.compute-1.amazonaws.com' with reponseId '5341
2009-04-30 02:34:40,403 INFO org.apache.hadoop.mapred.TaskTracker: Resending 'status' to 'ec2-xx-xx-xx-xx.compute-1.amazonaws.com' with reponseId '5341
2009-04-30 02:34:40,411 INFO org.apache.hadoop.mapred.TaskTracker: Resending 'status' to 'ec2-xx-xx-xx-xx.compute-1.amazonaws.com' with reponseId '5341
2009-04-30 02:34:40,414 INFO org.apache.hadoop.mapred.TaskTracker: Resending 'status' to 'ec2-xx-xx-xx-xx.compute-1.amazonaws.com' with reponseId '5341
2009-04-30 02:34:40,417 INFO org.apache.hadoop.mapred.TaskTracker: Resending 'status' to 'ec2-xx-xx-xx-xx.compute-1.amazonaws.com' with reponseId '5341
2009-04-30 02:34:40,420 INFO org.apache.hadoop.mapred.TaskTracker: Resending 'status' to 'ec2-xx-xx-xx-xx.compute-1.amazonaws.com' with reponseId '5341


--And on for 2+ gigs


Re: Master crashed

2009-04-30 Thread Mayuran Yogarajah

Alex Loddengaard wrote:

I'm confused.  Why are you trying to stop things when you're bringing the
name node back up?  Try running start-all.sh instead.

Alex

  
Won't that try to start the daemons on the slave nodes again? They're 
already running.


M

On Tue, Apr 28, 2009 at 4:00 PM, Mayuran Yogarajah 
mayuran.yogara...@casalemedia.com wrote:

  

The master in my cluster crashed, the dfs/mapred java processes are
still running on the slaves.  What should I do next? I brought the master
back up and ran stop-mapred.sh and stop-dfs.sh and it said this:

slave1.test.com: no tasktracker to stop
slave1.test.com: no datanode to stop

Not sure what happened here, please advise.

thanks,
M






Re: Infinite Loop Resending status from task tracker

2009-04-30 Thread Todd Lipcon
Hi Lance,

Can I ask what version you were running when you saw this? Is it
reproducible? I'm trying to look at the code path that might produce such a
behavior and want to make sure I'm looking at the right version.

Thanks
-Todd

On Thu, Apr 30, 2009 at 9:33 AM, Lance Riedel la...@dotspots.com wrote:




Re: Infinite Loop Resending status from task tracker

2009-04-30 Thread Lance Riedel
I have not been able to reproduce.  We are using version 0.19.1 with the
following patches:

4780-2v19.patch (Jira 4780)
closeAll3.patch (Jira 3998)

Thanks,
Lance

On Apr 30, 2009, at 10:40 AM, Todd Lipcon wrote:


Hi Lance,

Can I ask what version you were running when you saw this? Is it
reproducible? I'm trying to look at the code path that might produce such a
behavior and want to make sure I'm looking at the right version.

Thanks
-Todd

On Thu, Apr 30, 2009 at 9:33 AM, Lance Riedel la...@dotspots.com wrote:







Re: Infinite Loop Resending status from task tracker

2009-04-30 Thread Todd Lipcon
Hey Lance,

Did you see any error messages in the JobTracker logs around the time this
started? I think I understand how this might happen.

Thanks,
-Todd

On Thu, Apr 30, 2009 at 10:45 AM, Lance Riedel la...@dotspots.com wrote:

 I have not been able to reproduce.  We are using version 0.19.1 with the
 following patches:
 4780-2v19.patch (Jira 4780)
 closeAll3.patch (Jira 3998)

 Thanks,
 Lance







Re: Infinite Loop Resending status from task tracker

2009-04-30 Thread Lance Riedel
Here are the job tracker logs from the same time (and yes.. there is
something there!!):


2009-04-30 02:34:28,484 INFO org.apache.hadoop.mapred.JobTracker: Serious problem.  While updating status, cannot find taskid attempt_200904291917_0252_r_03_0


2009-04-30 02:34:40,215 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 54311, call heartbeat(org.apache.hadoop.mapred.tasktrackersta...@1a93388, false, true, 5341) from 10.253.134.191:42688: error: java.io.IOException: java.lang.NullPointerException

java.io.IOException: java.lang.NullPointerException
        at org.apache.hadoop.mapred.JobTracker.getTasksToSave(JobTracker.java:2130)
        at org.apache.hadoop.mapred.JobTracker.heartbeat(JobTracker.java:1923)
        at sun.reflect.GeneratedMethodAccessor72.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:481)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:894)
2009-04-30 02:34:40,215 INFO org.apache.hadoop.mapred.JobTracker: Serious problem.  While updating status, cannot find taskid attempt_200904291917_0296_r_14_1
2009-04-30 02:34:40,217 INFO org.apache.hadoop.mapred.JobTracker: Adding task 'attempt_200904291917_0352_r_13_0' to tip task_200904291917_0352_r_13, for tracker 'tracker_domU-12-31-38-00-F0-41.compute-1.internal:localhost.localdomain/127.0.0.1:42479'
2009-04-30 02:34:40,217 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'attempt_200904291917_0343_m_03_0' from 'tracker_domU-12-31-38-00-F0-41.compute-1.internal:localhost.localdomain/127.0.0.1:42479'
2009-04-30 02:34:40,217 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'attempt_200904291917_0343_m_07_0' from 'tracker_domU-12-31-38-00-F0-41.compute-1.internal:localhost.localdomain/127.0.0.1:42479'


And then.. a LOT more


2009-04-30 02:34:40,433 INFO org.apache.hadoop.mapred.JobTracker: Serious problem.  While updating status, cannot find taskid attempt_200904291917_0252_r_03_0
2009-04-30 02:34:40,433 WARN org.apache.hadoop.mapred.TaskInProgress: Recieved duplicate status update of 'KILLED' for 'attempt_200904291917_0352_m_10_1' of TIP 'task_200904291917_0352_m_10'
2009-04-30 02:34:40,433 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 54311, call heartbeat(org.apache.hadoop.mapred.tasktrackersta...@1b7b4c1, false, true, 5341) from 10.253.134.191:42688: error: java.io.IOException: java.lang.NullPointerException

java.io.IOException: java.lang.NullPointerException
        at org.apache.hadoop.mapred.JobTracker.getTasksToSave(JobTracker.java:2130)
        at org.apache.hadoop.mapred.JobTracker.heartbeat(JobTracker.java:1923)
        at sun.reflect.GeneratedMethodAccessor72.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:481)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:894)
2009-04-30 02:34:40,441 INFO org.apache.hadoop.mapred.JobTracker: Serious problem.  While updating status, cannot find taskid attempt_200904291917_0252_r_03_0
2009-04-30 02:34:40,441 WARN org.apache.hadoop.mapred.TaskInProgress: Recieved duplicate status update of 'KILLED' for 'attempt_200904291917_0352_m_10_1' of TIP 'task_200904291917_0352_m_10'
2009-04-30 02:34:40,442 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 54311, call heartbeat(org.apache.hadoop.mapred.tasktrackersta...@1598c57, false, true, 5341) from 10.253.134.191:42688: error: java.io.IOException: java.lang.NullPointerException

java.io.IOException: java.lang.NullPointerException
        at org.apache.hadoop.mapred.JobTracker.getTasksToSave(JobTracker.java:2130)
        at org.apache.hadoop.mapred.JobTracker.heartbeat(JobTracker.java:1923)
        at sun.reflect.GeneratedMethodAccessor72.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:481)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:894)
2009-04-30 02:34:40,444 INFO org.apache.hadoop.mapred.JobTracker: Serious problem.  While updating status, cannot find taskid attempt_200904291917_0252_r_03_0
2009-04-30 02:34:40,444 WARN org.apache.hadoop.mapred.TaskInProgress: Recieved duplicate status update of 'KILLED' for 'attempt_200904291917_0352_m_10_1' of TIP 'task_200904291917_0352_m_10'
2009-04-30 02:34:40,444 INFO org.apache.hadoop.ipc.Server: IPC Server handler 7 on 54311, call

Re: Infinite Loop Resending status from task tracker

2009-04-30 Thread Todd Lipcon
Hey Lance,

Thanks for the logs. They definitely confirmed my suspicion. There are three
problems here:

1) If the JobTracker throws an exception during processing of a heartbeat,
the tasktracker retries with no delay, since lastHeartbeat isn't updated in
TaskTracker.offerService. This is related to HADOOP-3987

2) If the TaskTracker sends a task in COMMIT_PENDING state with an invalid
task id, the jobtracker will trigger a NullPointerException in
JobTracker.getTasksToSave. Instead it should probably create a
KillTaskAction. I just filed a JIRA to track this issue:

https://issues.apache.org/jira/browse/HADOOP-5761

3) The TaskTracker somehow had a task attempt in COMMIT_PENDING state that
the JobTracker didn't know about. How it got there is a separate problem
that's a bit harder to track down.

Thanks
-Todd

On Thu, Apr 30, 2009 at 11:17 AM, Lance Riedel la...@dotspots.com wrote:


Re: Master crashed

2009-04-30 Thread Scott Carey

On 4/30/09 10:18 AM, Mayuran Yogarajah mayuran.yogara...@casalemedia.com
wrote:

 Alex Loddengaard wrote:
 I'm confused.  Why are you trying to stop things when you're bringing the
 name node back up?  Try running start-all.sh instead.
 
 Alex
 
  
 Won't that try to start the daemons on the slave nodes again? They're
 already running.
 

That doesn't matter, start-all.sh detects already running processes and does
not bring up duplicates. You can run it 100x in a row without a stop if you
wanted:

namenode running as process 12621. Stop it first.
datanode running as process 28540. Stop it first.
jobtracker running as process 12814. Stop it first.
tasktracker running as process 28763. Stop it first.



 M
 On Tue, Apr 28, 2009 at 4:00 PM, Mayuran Yogarajah 
 mayuran.yogara...@casalemedia.com wrote:
 
  
 The master in my cluster crashed, the dfs/mapred java processes are
 still running on the slaves.  What should I do next? I brought the master
 back up and ran stop-mapred.sh and stop-dfs.sh and it said this:
 
 slave1.test.com: no tasktracker to stop
 slave1.test.com: no datanode to stop
 
 Not sure what happened here, please advise.
 
 thanks,
 M
 

 
 



Implementing compareTo in user-written keys where one extends the other is error prone

2009-04-30 Thread Marshall Schor
Hi.  I had difficulties in getting Reduce sorting to work - it took me a good part
of a day to figure out what was going wrong, so I'm sharing this in hopes of
learning something from the community or getting Hadoop improved to avoid this kind
of error for future users.

I have 2 key classes: one holds a String; the other extends it and adds a
boolean.

I implemented the first key class (let's call it Super)

public class Super implements WritableComparable<Super> {
 . . .
  public int compareTo(Super o) {
// sort on string value
. . .
  }

I implemented the 2nd key class (let's call it Sub)

public class Sub extends Super {
 . . .
  public int compareTo(Sub o) {
// sort on boolean value
. . .
// if equal, use the super:
... else
 return super.compareTo(o);
  }


With this setup, I used the Sub class as a mapper output key, and
expected the sort on the boolean value to happen first, then for equal
values there, the sort on the string values.

What actually happened, was that the sort on the boolean value was
skipped completely, and only the sort on the string was done.

The reason for this is that (in the 0.19.1 release) the WritableComparator
instance that is created (using the defaults - no custom comparator)
knows the class is Sub, deserializes the two keys as instances of that
class, and calls the compareTo method on one, passing it the other key.
Both of these keys are of type Sub.  However, they are passed via this
code in WritableComparator:

  public int compare(WritableComparable a, WritableComparable b) {
    return a.compareTo(b);
  }

Java uses the interface spec for WritableComparable that was declared,
in this case WritableComparable<Super>, and infers that the arg type for
compareTo is Super.  So it skips calling the compareTo in Sub, and
just calls the one in Super.

The workaround is to change the signature of Sub's compareTo method to
match the spec in the interface, namely it has to take the Super as an
argument, and then cast it to Sub.
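
(For example, a sketch of the corrected signature, assuming a hypothetical
boolean field in Sub:)

public class Sub extends Super {
    private boolean flag;  // hypothetical field

    @Override
    public int compareTo(Super o) {  // matches WritableComparable<Super>
        Sub other = (Sub) o;         // cast before comparing
        if (flag != other.flag) {
            return flag ? 1 : -1;    // sort on the boolean value first
        }
        return super.compareTo(o);   // then fall back to the String sort
    }
}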

This seems like a very error prone design.  Am I doing something wrong,
or can this be improved so that this kind of error is avoided?

-Marshall Schor



classpath for finding Key classes

2009-04-30 Thread Marshall Schor
Hi - I have a classpath question.

In hadoop, one can define the Java classes to be used for Keys and Values.  I
am doing this.  When I make my giant Jar file holding everything needed for
running my application, I include these classes.

However, I've discovered that that is not enough, it seems (in the 0.19.1
version - in case that matters :-) ).  The job startup process reads
the configuration, finds the names of my Key classes, and
tries to load them.  But it is not yet using the giant Jar for my job,
so it doesn't find them.

A work-around that I've found is to include my giant Jar as the argument
to -libjars - that seems to get the class path set up so the startup /
validation code can find my classes.  This seems wasteful - having the
giant jar in two places...

Is there a best practices way to do this that's better than this?

Thanks. -Marshall Schor



Re: Implementing compareTo in user-written keys where one extends the other is error prone

2009-04-30 Thread Owen O'Malley
If you use custom key types, you really should be defining a  
RawComparator. It will perform much much better.


-- Owen
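
(For illustration, a minimal sketch of such a comparator, assuming the key
serializes a single vint-length-prefixed UTF-8 string as written by
Text.writeString; the class names are hypothetical:)

import org.apache.hadoop.io.WritableComparator;
import org.apache.hadoop.io.WritableUtils;

public class SuperRawComparator extends WritableComparator {

    public SuperRawComparator() {
        super(Super.class);
    }

    // Compare the serialized bytes directly; no key objects are
    // deserialized during the sort.
    public int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        int n1 = WritableUtils.decodeVIntSize(b1[s1]);  // skip length prefix
        int n2 = WritableUtils.decodeVIntSize(b2[s2]);
        return compareBytes(b1, s1 + n1, l1 - n1, b2, s2 + n2, l2 - n2);
    }

    // Registration is typically done from a static block in the key
    // class itself, so it runs whenever the key class is loaded:
    static {
        WritableComparator.define(Super.class, new SuperRawComparator());
    }
}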