Hi,
I am trying to use Spring Data Hadoop with CDH4 to write a MapReduce job.

On startup, I get the following exception:

Exception in thread "SimpleAsyncTaskExecutor-1" java.lang.ExceptionInInitializerError
        at org.springframework.data.hadoop.mapreduce.JobExecutor$2.run(JobExecutor.java:183)
        at java.lang.Thread.run(Thread.java:722)
Caused by: java.lang.NullPointerException
        at org.springframework.util.ReflectionUtils.makeAccessible(ReflectionUtils.java:405)
        at org.springframework.data.hadoop.mapreduce.JobUtils.<clinit>(JobUtils.java:123)
        ... 2 more

I suspect there is a problem with my Hadoop-related dependencies. I couldn't
find any reference showing how to configure Spring Data Hadoop together with
CDH4, but Costin's build plan shows that it can be done:
https://build.springsource.org/browse/SPRINGDATAHADOOP-CDH4-JOB1
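
To illustrate how I read the trace (this is an assumption, not something I have confirmed in the Spring Data Hadoop sources): a static initializer that looks up a field by reflection and then makes it accessible fails with exactly this pair of exceptions when the field does not exist in the class version found on the classpath. The sketch below uses only the JDK; `NewApiJob`, `Helper`, and the field name `info` are hypothetical stand-ins, not real Hadoop or Spring names.

```java
import java.lang.reflect.Field;

public class StaticInitFailureDemo {

    /** Hypothetical stand-in for a library class whose internals differ between versions. */
    static class NewApiJob {
        // This version has no field called "info".
        private String status = "DEFINE";
    }

    /** Hypothetical helper that caches a reflective field lookup in a static initializer. */
    static class Helper {
        static final Field CACHED;
        static {
            Field f = findField(NewApiJob.class, "info"); // null: the field is missing
            f.setAccessible(true);                        // NullPointerException here
            CACHED = f;
        }
        static void touch() {
            // Calling this forces class initialization.
        }
    }

    /** Walks the class hierarchy and returns the named field, or null if it is absent. */
    static Field findField(Class<?> type, String name) {
        for (Class<?> c = type; c != null; c = c.getSuperclass()) {
            for (Field f : c.getDeclaredFields()) {
                if (f.getName().equals(name)) {
                    return f;
                }
            }
        }
        return null;
    }

    /** Triggers initialization of Helper and returns the resulting error, or null on success. */
    static Throwable initFailure() {
        try {
            Helper.touch();
            return null;
        } catch (LinkageError e) {
            // First use: ExceptionInInitializerError wrapping the NPE.
            // Later uses of the same class: NoClassDefFoundError.
            return e;
        }
    }

    public static void main(String[] args) {
        Throwable t = initFailure();
        System.out.println("error: " + t);
        System.out.println("cause: " + (t == null ? null : t.getCause()));
    }
}
```

If this reading is right, the NullPointerException is only a symptom, and the real question is which Hadoop classes end up on the classpath.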


**Maven Setup**

<properties>
        <spring.hadoop.version>1.0.0.BUILD-SNAPSHOT</spring.hadoop.version>
        <hadoop.version>2.0.0-cdh4.1.3</hadoop.version>
</properties>

<dependencies>
        ...
        <dependency>
                <groupId>org.springframework.data</groupId>
                <artifactId>spring-data-hadoop</artifactId>
                <version>${spring.hadoop.version}</version>
        </dependency>

        <dependency>
                <groupId>org.apache.hadoop</groupId>
                <artifactId>hadoop-common</artifactId>
                <version>${hadoop.version}</version>
        </dependency>

        <dependency>
                <groupId>org.apache.hadoop</groupId>
                <artifactId>hadoop-client</artifactId>
                <version>${hadoop.version}</version>
        </dependency>

        <dependency>
                <groupId>org.apache.hadoop</groupId>
                <artifactId>hadoop-streaming</artifactId>
                <version>${hadoop.version}</version>
        </dependency>

        <dependency>
                <groupId>org.apache.hadoop</groupId>
                <artifactId>hadoop-test</artifactId>
                <version>2.0.0-mr1-cdh4.1.3</version>
        </dependency>

        <dependency>
                <groupId>org.apache.hadoop</groupId>
                <artifactId>hadoop-tools</artifactId>
                <version>2.0.0-mr1-cdh4.1.3</version>
        </dependency>
        ...
</dependencies>
...
<repositories>   
        <repository>
                <id>cloudera</id>
                <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
                <snapshots>
                        <enabled>false</enabled>
                </snapshots>
        </repository>

        <repository>
                <id>spring-snapshot</id>
                <name>Spring Maven SNAPSHOT Repository</name>
                <url>http://repo.springframework.org/snapshot</url>
        </repository>
</repositories>
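
Because the POM above mixes `2.0.0-cdh4.1.3` and `2.0.0-mr1-cdh4.1.3` artifacts, it may help to see which jar actually supplies the job classes at runtime. The following is a minimal sketch using only the JDK; the class name `WhichJar` and the method `locate` are made up for illustration, and the list of class names to check is just a guess at what is relevant.

```java
import java.security.CodeSource;

public class WhichJar {

    /** Returns the location a class was loaded from, or a short note if it cannot be loaded. */
    static String locate(String className) {
        try {
            // initialize=false: load the class without running its static initializers.
            Class<?> c = Class.forName(className, false, WhichJar.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            return src == null
                    ? "(bootstrap or unknown source)"
                    : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not loadable: " + e + ")";
        }
    }

    public static void main(String[] args) {
        // Class names chosen as examples; adjust to whatever is of interest.
        String[] names = {
            "org.apache.hadoop.conf.Configuration",
            "org.apache.hadoop.mapreduce.Job",
            "org.apache.hadoop.mapred.JobClient"
        };
        for (String name : names) {
            System.out.println(name + " -> " + locate(name));
        }
    }
}
```

Run with the same classpath as the application, this prints one jar location per class, which should show whether the MR1 or the MR2 jars are being picked up.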

**Application Context**

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xmlns:hdp="http://www.springframework.org/schema/hadoop"
        xmlns:context="http://www.springframework.org/schema/context"
        xmlns:hadoop="http://www.springframework.org/schema/hadoop"
        xsi:schemaLocation="
                http://www.springframework.org/schema/beans
                http://www.springframework.org/schema/beans/spring-beans.xsd
                http://www.springframework.org/schema/hadoop
                http://www.springframework.org/schema/hadoop/spring-hadoop.xsd
                http://www.springframework.org/schema/context/spring-context.xsd
                http://www.springframework.org/schema/integration
                http://www.springframework.org/schema/context
                http://www.springframework.org/schema/context/spring-context-3.1.xsd">

        <context:property-placeholder location="classpath:hadoop.properties" />

        <hdp:configuration id="hadoopConfiguration">
                fs.defaultFS=${hd.fs}
        </hdp:configuration>

        <hdp:job id="wordCountJob" input-path="${input.path}"
                output-path="${output.path}" mapper="com.example.WordMapper"
                reducer="com.example.WordReducer" />
                
        <hdp:job-runner job-ref="wordCountJob" run-at-startup="true"
                wait-for-completion="true" />

</beans>

**Cluster version**

Hadoop 2.0.0-cdh4.1.3


**Note:**

This small unit test runs fine with the current configuration:

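import java.util.Collection;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.junit.Assert;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.hadoop.fs.FsShell;
import org.springframework.test.context.ContextConfiguration;
import org.springframework.test.context.junit4.SpringJUnit4ClassRunner;

@RunWith(SpringJUnit4ClassRunner.class)
@ContextConfiguration(locations = { "classpath:/applicationContext.xml" })
public class Starter {

        // Injected from the <hdp:configuration> bean in applicationContext.xml
        @Autowired
        private Configuration configuration;

        // Lists /user on the cluster to verify that HDFS access works
        @Test
        public void shellOps() {
                Assert.assertNotNull(this.configuration);
                FsShell fsShell = new FsShell(this.configuration);
                final Collection<FileStatus> coll = fsShell.ls("/user");
                System.out.println(coll);
        }
}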


It would be great if someone could share an example configuration.

Best Regards,
Christian.
