Thanks a million for your support!

From: Shahab Yunus [mailto:shahab.yu...@gmail.com]
Sent: Thursday, 18 September 2014 13:40
To: user@hadoop.apache.org
Subject: Re: ClassCastException on running map-reduce jobs + tests on Windows (mongo-hadoop)

You will have to convert BSONWritable to BSONObject yourself. You can abstract
this parsing into a separate class/object model and reuse it, but as far as I
understand, objects being serialized or deserialized have to be Writable
(conforming to the interface that Hadoop defines, and WritableComparable if
they are going to act as keys).

So given that, you will either have to do the parsing yourself, or design your
downstream modules that expect BSONObject to accept BSONWritable instead. That
way you won't need to parse, but the downside is that your downstream users
will then be tied to the Hadoop API, resulting in a potentially undesirable
dependency.
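To illustrate the wrapper approach in code, here is a minimal, self-contained sketch of the pattern. The names (WritableLike, Document, DocumentWritable) are hypothetical stand-ins so the snippet compiles without Hadoop on the classpath: WritableLike takes the place of org.apache.hadoop.io.Writable, and DocumentWritable plays the role BSONWritable plays for BSONObject. If I remember correctly, BSONWritable itself has exactly this shape (it can be constructed from a BSONObject and hands the wrapped document back).

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for org.apache.hadoop.io.Writable (hypothetical).
interface WritableLike { }

// Stand-in for BSONObject: a plain document model with no Hadoop dependency.
class Document {
    private final Map<String, Object> fields = new HashMap<>();
    Object get(String key) { return fields.get(key); }
    void put(String key, Object value) { fields.put(key, value); }
}

// Stand-in for BSONWritable: wraps the document at the Hadoop boundary,
// so downstream code keeps depending only on the plain Document type.
class DocumentWritable implements WritableLike {
    private final Document doc;
    DocumentWritable(Document doc) { this.doc = doc; }
    Document getDoc() { return doc; }   // unwrap for downstream users
}

public class WrapperDemo {
    public static void main(String[] args) {
        Document d = new Document();
        d.put("price", 42);
        DocumentWritable w = new DocumentWritable(d);   // wrap going into Hadoop
        Document unwrapped = w.getDoc();                // unwrap coming back out
        System.out.println(unwrapped.get("price"));     // prints 42
    }
}
```

The trade-off described above still applies: if downstream modules accept DocumentWritable directly, they skip the unwrap step but pick up the Hadoop dependency.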

Some links that might be helpful regarding this design and its background:

http://learnhadoopwithme.wordpress.com/tag/writablecomparable/

Page 93 and onwards from Tom White's book, Hadoop: The Definitive Guide.

Regarding your Windows experience, I don't have much knowledge in that area. 
Sorry :(

Regards,
Shahab

On Thu, Sep 18, 2014 at 2:58 AM, Blanca Hernandez 
<blanca.hernan...@willhaben.at> wrote:
Thanks,

I made the changes and everything works fine!! Many thanks!!

Now I am having problems converting BSONWritable to BSONObject and vice versa.
Is there an automatic way to do it?
Or should I write a parser myself?

And regarding the tests on Windows, any experience?

Thanks again!!

Best regards,

Blanca


From: Shahab Yunus [mailto:shahab.yu...@gmail.com]
Sent: Wednesday, 17 September 2014 17:20
To: user@hadoop.apache.org
Subject: Re: ClassCastException on running map-reduce jobs + tests on Windows (mongo-hadoop)

You are using String as the outputKey (config.setOutputKey(String.class)).
java.lang.String is not Writable. Change it to Text, just like you did for the
Mapper.
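Concretely, against the driver configuration quoted further down in the thread, the fix would look like this (a sketch of only the key setters; the other config lines stay unchanged):

```java
// The job's output key class must be a Writable/WritableComparable type,
// so Text replaces String in both key setters.
config.setMapperOutputKey(Text.class);   // already Text after the mapper fix
config.setOutputKey(Text.class);         // was String.class -- the cause of the CCE
```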

Regards,
Shahab

On Wed, Sep 17, 2014 at 10:43 AM, Blanca Hernandez 
<blanca.hernan...@willhaben.at> wrote:
Thanks for answering:

hadoop jar /tmp/hadoop-test.jar at.willhaben.hadoop.AveragePriceCalculationJob

In the AveragePriceCalculationJob I have my configuration:


private static class AveragePriceCalculationJob extends MongoTool {
    private AveragePriceCalculationJob(AveragePriceNode currentNode, String currentId, int nodeNumber) {
        Configuration conf = new Configuration();
        MongoConfig config = new MongoConfig(conf);
        setConf(conf);
        // change for my values
        config.setInputFormat(MongoInputFormat.class);
        config.setOutputFormat(MongoOutputFormat.class);

        config.setMapperOutputKey(Text.class);
        config.setMapperOutputValue(BSONObject.class);
        config.setOutputKey(String.class);
        config.setOutputValue(BSONWritable.class);

        config.setInputURI("myUrl");
        config.setOutputURI("myUrl");
        config.setMapper(AveragePriceMapper.class);
        config.setReducer(AveragePriceReducer.class);
    }
}


And the main method:


public static void main(String[] args) throws InterruptedException, IOException, ClassNotFoundException {
    // … some code

    try {
        ToolRunner.run(new AveragePriceCalculationJob(currentNode, currentId, nodeNumber), args);
    } catch (Exception e) {
        e.printStackTrace();
    }
}


Best regards,

Blanca

From: Shahab Yunus [mailto:shahab.yu...@gmail.com]
Sent: Wednesday, 17 September 2014 16:37
To: user@hadoop.apache.org
Subject: Re: ClassCastException on running map-reduce jobs + tests on Windows (mongo-hadoop)

Can you provide the driver code for this job?

Regards,
Shahab

On Wed, Sep 17, 2014 at 10:28 AM, Blanca Hernandez 
<blanca.hernan...@willhaben.at> wrote:
Hi again, I changed the String objects to org.apache.hadoop.io.Text objects
(why is String not accepted?), and now I get another exception, so I don't
really know whether I solved something or broke something:


java.lang.Exception: java.lang.NullPointerException
        at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.lang.NullPointerException
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.init(MapTask.java:988)
        at org.apache.hadoop.mapred.MapTask.createSortingCollector(MapTask.java:391)
        at org.apache.hadoop.mapred.MapTask.access$100(MapTask.java:80)
        at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:675)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:747)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
        at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)

If I could debug it in my IDE, I think I could work faster, but I have the
problems already described. How am I testing now? Building a jar, copying it to
the server and running a hadoop jar command (not a very efficient approach…).

Could you give me a hand with this? Anyone out there on Windows + IntelliJ
IDEA? Many thanks!



From: Blanca Hernandez [mailto:blanca.hernan...@willhaben.at]
Sent: Wednesday, 17 September 2014 15:27
To: user@hadoop.apache.org
Subject: ClassCastException on running map-reduce jobs + tests on Windows (mongo-hadoop)

Hi!

I am getting a ClassCastException and don't really understand why…

Here my mapper:

public class AveragePriceMapper extends Mapper<String, BSONObject, String, BSONObject> {
    @Override
    public void map(final String key, final BSONObject val, final Context context) throws IOException, InterruptedException {
        String id = "result_of_making_some_operations";
        context.write(id, val);
    }
}

And in my configuration:

config.setMapperOutputKey(String.class);
config.setMapperOutputValue(BSONObject.class);


On running my generated jar on the server, everything seems to work OK until:

14/09/17 15:20:36 INFO mapred.MapTask: Processing split: MongoInputSplit{URI=mongodb://user:pass@host:27017/my_db.my_collection, authURI=null, min={ "_id" : { "$oid" : "541666d8e4b07265e257a42e"}}, max={ }, query={ }, sort={ }, fields={ }, notimeout=false}
14/09/17 15:20:36 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
14/09/17 15:20:36 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
14/09/17 15:20:36 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
14/09/17 15:20:36 INFO mapred.MapTask: soft limit at 83886080
14/09/17 15:20:36 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
14/09/17 15:20:36 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
14/09/17 15:20:36 INFO mapred.LocalJobRunner: map task executor complete.
14/09/17 15:20:36 WARN mapred.LocalJobRunner: job_local1701078621_0001
java.lang.Exception: java.lang.ClassCastException: class java.lang.String
        at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.lang.ClassCastException: class java.lang.String
        at java.lang.Class.asSubclass(Class.java:3126)
        at org.apache.hadoop.mapred.JobConf.getOutputKeyComparator(JobConf.java:885)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.init(MapTask.java:981)
        at org.apache.hadoop.mapred.MapTask.createSortingCollector(MapTask.java:391)
        at org.apache.hadoop.mapred.MapTask.access$100(MapTask.java:80)
        at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:675)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:747)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
        at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)


Did I miss something??


Another issue I am worried about: working on a Windows system makes everything
quite complicated with Hadoop. I have it installed and running, as well as my
MongoDB database (I am using the connector they provide). When I run the same
main class that I use in the hadoop jar call on the server (in the example
above), but from my IDE, I get this exception:

PriviledgedActionException as:hernanbl cause:java.io.IOException: Failed to set 
permissions of path: 
\tmp\hadoop-hernanbl\mapred\staging\hernanbl1600842219\.staging to 0700

How could I make it run?


Many thanks!!

Best regards,

Blanca


