Hi,

For Spring Data Hadoop problems, it's best to use the designated forum [1]. That being said, I've tried to reproduce your error but couldn't: I've upgraded the build to CDH 4.1.3, which runs fine against the VM on the CI (4.1.1).
Maybe you have some other libraries on the client classpath?
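
One way to check the classpath, as a rough sketch only: print where a class is actually loaded from and whether it declares the fields in question. The class name ClasspathProbe and its helper methods are made up for illustration; only plain JDK reflection is used, and the default class name is the Hadoop Job class mentioned in this thread.

```java
import java.lang.reflect.Field;
import java.security.CodeSource;

public class ClasspathProbe {

    /** Returns the jar or directory a class was loaded from, or a placeholder if unknown. */
    static String locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return "(bootstrap/unknown)";
        }
        return source.getLocation().toString();
    }

    /** Returns true if the class itself declares a field with the given name. */
    static boolean hasDeclaredField(Class<?> type, String name) {
        for (Field field : type.getDeclaredFields()) {
            if (field.getName().equals(name)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Pass another class name as the first argument to probe a different class.
        String className = args.length > 0 ? args[0] : "org.apache.hadoop.mapreduce.Job";
        try {
            Class<?> type = Class.forName(className);
            System.out.println(className + " loaded from " + locationOf(type));
            System.out.println("declares 'state': " + hasDeclaredField(type, "state"));
            System.out.println("declares 'info': " + hasDeclaredField(type, "info"));
        } catch (ClassNotFoundException e) {
            System.out.println(className + " is not on the classpath");
        }
    }
}
```

Run it with the same classpath as the failing application; if the reported jar is not the one you expect, that points to a conflicting dependency.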

From the stacktrace, it looks like the org.apache.hadoop.mapreduce.Job class has no 'state' or 'info' fields...
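
To illustrate why a missing field would surface this way, here is a minimal sketch that does not use Spring or Hadoop at all. JobLike, Holder, and findField are hypothetical stand-ins; the assumption is that the field lookup returns null instead of throwing, so the failure only appears when the null field is made accessible inside a static initializer.

```java
import java.lang.reflect.Field;

public class StaticInitFailureDemo {

    /** Stand-in for a class that lacks the expected 'state' field. */
    static class JobLike {
        private String name = "demo";
    }

    /** Looks up a declared field and returns null when it is missing (assumed behaviour). */
    static Field findField(Class<?> type, String name) {
        try {
            return type.getDeclaredField(name);
        } catch (NoSuchFieldException e) {
            return null;
        }
    }

    /** Holder whose static initializer assumes the 'state' field exists. */
    static class Holder {
        static final Field STATE = findField(JobLike.class, "state");

        static {
            // STATE is null here, so this call throws a NullPointerException
            // and class initialization fails.
            STATE.setAccessible(true);
        }

        static void touch() {
            // Calling this forces initialization of Holder.
        }
    }

    private static Throwable cached;

    /** Triggers initialization of Holder once and returns the resulting error. */
    static synchronized Throwable trigger() {
        if (cached == null) {
            try {
                Holder.touch();
            } catch (ExceptionInInitializerError e) {
                cached = e;
            }
        }
        return cached;
    }

    public static void main(String[] args) {
        Throwable error = trigger();
        System.out.println(error + ", cause: " + (error == null ? null : error.getCause()));
    }
}
```

The shape matches the trace in your mail: an ExceptionInInitializerError whose cause is a NullPointerException raised from the static initializer.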

Anyway, let's continue the discussion on the forum.

Cheers,
[1] http://forum.springsource.org/forumdisplay.php?87-Hadoop

On 02/12/13 2:51 PM, Christian Schneider wrote:
Hi,
I am trying to use Spring Data Hadoop with CDH4 to write a MapReduce job.

On startup, I get the following exception:

Exception in thread "SimpleAsyncTaskExecutor-1" java.lang.ExceptionInInitializerError
        at org.springframework.data.hadoop.mapreduce.JobExecutor$2.run(JobExecutor.java:183)
        at java.lang.Thread.run(Thread.java:722)
Caused by: java.lang.NullPointerException
        at org.springframework.util.ReflectionUtils.makeAccessible(ReflectionUtils.java:405)
        at org.springframework.data.hadoop.mapreduce.JobUtils.<clinit>(JobUtils.java:123)
        ... 2 more

I guess there is a problem with my Hadoop-related dependencies. I couldn't find
any reference showing how to configure Spring Data Hadoop together with CDH4, but
Costin's build shows that it can be done:
https://build.springsource.org/browse/SPRINGDATAHADOOP-CDH4-JOB1


**Maven Setup**

<properties>
        <spring.hadoop.version>1.0.0.BUILD-SNAPSHOT</spring.hadoop.version>
        <hadoop.version>2.0.0-cdh4.1.3</hadoop.version>
</properties>

<dependencies>
        ...
        <dependency>
                <groupId>org.springframework.data</groupId>
                <artifactId>spring-data-hadoop</artifactId>
                <version>${spring.hadoop.version}</version>
        </dependency>

        <dependency>
                <groupId>org.apache.hadoop</groupId>
                <artifactId>hadoop-common</artifactId>
                <version>${hadoop.version}</version>
        </dependency>

        <dependency>
                <groupId>org.apache.hadoop</groupId>
                <artifactId>hadoop-client</artifactId>
                <version>${hadoop.version}</version>
        </dependency>

        <dependency>
                <groupId>org.apache.hadoop</groupId>
                <artifactId>hadoop-streaming</artifactId>
                <version>${hadoop.version}</version>
        </dependency>

        <dependency>
                <groupId>org.apache.hadoop</groupId>
                <artifactId>hadoop-test</artifactId>
                <version>2.0.0-mr1-cdh4.1.3</version>
        </dependency>

        <dependency>
                <groupId>org.apache.hadoop</groupId>
                <artifactId>hadoop-tools</artifactId>
                <version>2.0.0-mr1-cdh4.1.3</version>
        </dependency>
        ...
</dependencies>
...
<repositories>
        <repository>
                <id>cloudera</id>
                <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
                <snapshots>
                        <enabled>false</enabled>
                </snapshots>
        </repository>

        <repository>
                <id>spring-snapshot</id>
                <name>Spring Maven SNAPSHOT Repository</name>
                <url>http://repo.springframework.org/snapshot</url>
        </repository>
</repositories>

**Application Context**

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xmlns:hdp="http://www.springframework.org/schema/hadoop"
        xmlns:context="http://www.springframework.org/schema/context"
        xmlns:hadoop="http://www.springframework.org/schema/hadoop"
        xsi:schemaLocation="
http://www.springframework.org/schema/beans
http://www.springframework.org/schema/beans/spring-beans.xsd
http://www.springframework.org/schema/hadoop
http://www.springframework.org/schema/hadoop/spring-hadoop.xsd
http://www.springframework.org/schema/context/spring-context.xsd
http://www.springframework.org/schema/integration
http://www.springframework.org/schema/context
http://www.springframework.org/schema/context/spring-context-3.1.xsd">

        <context:property-placeholder location="classpath:hadoop.properties" />

        <hdp:configuration id="hadoopConfiguration">
                fs.defaultFS=${hd.fs}
        </hdp:configuration>

        <hdp:job id="wordCountJob" input-path="${input.path}"
                output-path="${output.path}" mapper="com.example.WordMapper"
                reducer="com.example.WordReducer" />
                
        <hdp:job-runner job-ref="wordCountJob" run-at-startup="true"
                wait-for-completion="true" />

</beans>

**Cluster version**

Hadoop 2.0.0-cdh4.1.3


**Note:**

This small unit test runs fine with the current configuration:

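import java.util.Collection;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.junit.Assert;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.hadoop.fs.FsShell;
import org.springframework.test.context.ContextConfiguration;
import org.springframework.test.context.junit4.SpringJUnit4ClassRunner;

@RunWith(SpringJUnit4ClassRunner.class)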
@ContextConfiguration(locations = { "classpath:/applicationContext.xml" })
public class Starter {

         @Autowired
         private Configuration configuration;
                
         @Test
         public void shellOps() {
                 Assert.assertNotNull(this.configuration);
                 FsShell fsShell = new FsShell(this.configuration);
                 final Collection<FileStatus> coll = fsShell.ls("/user");
                 System.out.println(coll);
         }
}


It would be nice if someone could give me an example configuration.

Best Regards,
Christian.


--
Costin
