Yeah, unfortunately that will be up to them to fix, though it wouldn't hurt to 
send them a JIRA mentioning this.

Matei

> On Nov 25, 2014, at 2:58 PM, Corey Nolet <cjno...@gmail.com> wrote:
> 
> I was wiring up my job in the shell while I was learning Spark/Scala. I'm 
> getting more comfortable with them both now, so I've been mostly testing 
> through IntelliJ with mock data as inputs.
> 
> I think the problem lies more with Hadoop than with Spark, as the Job object 
> checks its state and throws an exception when the toString() method is called 
> before the Job has actually been submitted.
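> 
> In the shell, even just constructing the Job trips it, because the REPL 
> prints the result of every line it evaluates. A minimal repro sketch:
> 
>   scala> import org.apache.hadoop.mapreduce.Job
>   scala> val job = Job.getInstance(sc.hadoopConfiguration)
>   // the REPL calls job.toString() to render the result, and Hadoop's
>   // toString() throws because the job is still in the DEFINE state
>   // rather than RUNNING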
> 
> On Tue, Nov 25, 2014 at 5:31 PM, Matei Zaharia <matei.zaha...@gmail.com> wrote:
> How are you creating the object in your Scala shell? Maybe you can write a 
> function that directly returns the RDD, without assigning the object to a 
> temporary variable.
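> 
> Something along these lines, for example (just a sketch; MyInputFormat, 
> MyKey, and MyValue are placeholders for your own format and types):
> 
>   import org.apache.hadoop.mapreduce.Job
>   import org.apache.spark.SparkContext
>   import org.apache.spark.rdd.RDD
> 
>   def createRDD(sc: SparkContext): RDD[(MyKey, MyValue)] = {
>     // the Job never escapes the function, so the shell never tries
>     // to print it (and never calls its toString())
>     val job = Job.getInstance(sc.hadoopConfiguration)
>     // ... call your input format's static setup methods on job here ...
>     sc.newAPIHadoopRDD(job.getConfiguration,
>       classOf[MyInputFormat], classOf[MyKey], classOf[MyValue])
>   }
> 
> The same trick works inline too: wrapping the whole thing in a block means 
> the shell only prints the final value of the line (the RDD), not the Job.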
> 
> Matei
> 
>> On Nov 5, 2014, at 2:54 PM, Corey Nolet <cjno...@gmail.com> wrote:
>> 
>> Looking closer at the stack trace in the Scala shell, it appears to be the 
>> call to toString() that is causing the construction of the Job object to 
>> fail. Is there a way to suppress this output, since it appears to be 
>> hindering my ability to new up this object?
>> 
>> On Wed, Nov 5, 2014 at 5:49 PM, Corey Nolet <cjno...@gmail.com> wrote:
>> I'm trying to use a custom input format with SparkContext.newAPIHadoopRDD. 
>> Creating the new RDD works fine, but setting up the configuration via the 
>> static methods on input formats that require a Hadoop Job object is proving 
>> to be difficult.
>> 
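>> Concretely, the kind of setup I mean is something like this (a sketch, with 
>> FileInputFormat standing in for my custom format and a made-up path):
>> 
>>   import org.apache.hadoop.fs.Path
>>   import org.apache.hadoop.mapreduce.Job
>>   import org.apache.hadoop.mapreduce.lib.input.FileInputFormat
>> 
>>   val job = Job.getInstance(sc.hadoopConfiguration)
>>   FileInputFormat.addInputPath(job, new Path("/some/input"))  // needs a Job
>> 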
>> Trying to new up my own Job object with the SparkContext.hadoopConfiguration 
>> is throwing the exception on line 283 of this grepcode:
>> 
>> http://grepcode.com/file/repo1.maven.org/maven2/org.apache.hadoop/hadoop-mapreduce-client-core/2.5.0/org/apache/hadoop/mapreduce/Job.java#Job
>> 
>> Looking in the SparkContext code, I'm seeing that it's newing up Job objects 
>> just fine using nothing but the configuration. Using SparkContext.textFile() 
>> appears to be working for me. Any ideas? Has anyone else run into this as 
>> well? Is it possible to have a method like SparkContext.getJob() or 
>> something similar?
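>> 
>> (Purely hypothetical, but I mean something with roughly this shape on 
>> SparkContext, so callers never have to construct the Job themselves:
>> 
>>   // hypothetical; no such method exists today
>>   def getJob(): Job = Job.getInstance(hadoopConfiguration)
>> 
>> )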
>> 
>> Thanks.
>> 
>> 
> 
> 
