[jira] [Commented] (TEZ-3330) Error on avro M/R job with Tez: missing configuration property

2016-10-10 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15562979#comment-15562979
 ] 

Siddharth Seth commented on TEZ-3330:
-

Thanks for the review. Committing. I think the findbugs warning is being fixed 
by TEZ-3464.

> Error on avro M/R job with Tez: missing configuration property
> --
>
> Key: TEZ-3330
> URL: https://issues.apache.org/jira/browse/TEZ-3330
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Manuel Godbert
>Assignee: Siddharth Seth
>  Labels: newbie
> Attachments: TEZ-3330.01.patch, TEZ-3330.temp.2.patch, 
> TEZ-3330.temp.patch
>
>
> I tried running the simple avro M/R job MapredColorCount, that I found in the 
> examples of avro release 1.7.7.
> It failed with the following trace:
> {code}
> errorMessage=Shuffle Runner Failed:org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$ShuffleError: Error while doing final merge
>     at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.callInternal(Shuffle.java:378)
>     at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.callInternal(Shuffle.java:337)
>     at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:744)
> Caused by: java.lang.NullPointerException
>     at java.io.StringReader.<init>(StringReader.java:50)
>     at org.apache.avro.Schema$Parser.parse(Schema.java:917)
>     at org.apache.avro.Schema.parse(Schema.java:966)
>     at org.apache.avro.mapred.AvroJob.getMapOutputSchema(AvroJob.java:78)
>     at org.apache.avro.mapred.AvroKeyComparator.setConf(AvroKeyComparator.java:39)
>     at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:76)
>     at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:136)
>     at org.apache.tez.runtime.library.common.ConfigUtils.getIntermediateInputKeyComparator(ConfigUtils.java:133)
>     at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MergeManager.finalMerge(MergeManager.java:915)
>     at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MergeManager.close(MergeManager.java:540)
>     at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.callInternal(Shuffle.java:376)
>     ... 6 more
> {code}
> Digging a bit I saw that during shuffle Tez can't access some of the 
> configuration properties of the job. In our example it is the 
> avro.output.schema that is missing.
> With some more complicated code I could get one step further and a similar 
> issue happened when the valuesIterator for the reducer was being built:
> {code}
> java.lang.NullPointerException
>     at java.io.StringReader.<init>(StringReader.java:50)
>     at org.apache.avro.Schema$Parser.parse(Schema.java:917)
>     at org.apache.avro.Schema.parse(Schema.java:966)
>     at org.apache.avro.mapred.AvroJob.getMapOutputSchema(AvroJob.java:78)
>     at org.apache.avro.mapred.AvroSerialization.getDeserializer(AvroSerialization.java:53)
>     at org.apache.hadoop.io.serializer.SerializationFactory.getDeserializer(SerializationFactory.java:90)
>     at org.apache.tez.runtime.library.common.ValuesIterator.<init>(ValuesIterator.java:80)
>     at org.apache.tez.runtime.library.input.OrderedGroupedKVInput.createValuesIterator(OrderedGroupedKVInput.java:287)
> {code}
> I am using HDP2.4, Tez 0.7.0, avro 1.7.4
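
The failure mode described above can be reproduced in miniature. The following is only a sketch under stated assumptions: a plain java.util.Map stands in for the job configuration, and the class and method names (MissingSchemaSketch, lookup, failsWithNpe) are hypothetical and not part of Tez, Hadoop or Avro. It shows that when a property such as avro.output.schema is absent from the configuration a component sees, the lookup yields null, and constructing a StringReader over that null throws the NullPointerException that sits at the top of both traces.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;

// Illustrative stand-in only: a plain Map plays the role of the job configuration.
final class MissingSchemaSketch {

    // Hypothetical helper: returns null when the key is absent, like a config lookup without a default.
    static String lookup(Map<String, String> conf, String key) {
        return conf.get(key);
    }

    // Returns true if building a StringReader over the looked-up value throws a NullPointerException.
    static boolean failsWithNpe(Map<String, String> conf, String key) {
        try {
            new StringReader(lookup(conf, key));
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Configuration as submitted with the job: the schema property is set.
        Map<String, String> submitted = new HashMap<>();
        submitted.put("avro.output.schema", "\"string\"");

        // Configuration as seen by a component that never received the property.
        Map<String, String> seenDuringShuffle = new HashMap<>();

        System.out.println("submitted conf fails: " + failsWithNpe(submitted, "avro.output.schema"));
        System.out.println("shuffle conf fails:   " + failsWithNpe(seenDuringShuffle, "avro.output.schema"));
    }
}
```

The point of the sketch is that the exception is only a symptom: the parsing code is fine, and the fix has to make the property visible to the component that performs the lookup.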



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TEZ-3330) Error on avro M/R job with Tez: missing configuration property

2016-10-07 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15556156#comment-15556156
 ] 

Hitesh Shah commented on TEZ-3330:
--

+1. Not sure about the findbugs warning, which is in tez-dag and therefore unrelated.



[jira] [Commented] (TEZ-3330) Error on avro M/R job with Tez: missing configuration property

2016-10-07 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15556137#comment-15556137
 ] 

TezQA commented on TEZ-3330:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12832178/TEZ-3330.01.patch
  against master revision dceb365.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 8 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-TEZ-Build/2025//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-TEZ-Build/2025//artifact/patchprocess/newPatchFindbugsWarningstez-dag.html
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2025//console

This message is automatically generated.


[jira] [Commented] (TEZ-3330) Error on avro M/R job with Tez: missing configuration property

2016-09-20 Thread Manuel Godbert (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15506084#comment-15506084
 ] 

Manuel Godbert commented on TEZ-3330:
-

I am afraid I do not understand what you expect from me. I am not used to git 
patches, just basic push and pull... I will let you finalize the work!



[jira] [Commented] (TEZ-3330) Error on avro M/R job with Tez: missing configuration property

2016-09-19 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15504312#comment-15504312
 ] 

Siddharth Seth commented on TEZ-3330:
-

Getting this patch in will need some test changes. I'll see if I can get to 
this sometime; otherwise, [~manuel.godbert], if you can, please update the 
patch so that it can be committed.



[jira] [Commented] (TEZ-3330) Error on avro M/R job with Tez: missing configuration property

2016-09-15 Thread Manuel Godbert (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15493310#comment-15493310
 ] 

Manuel Godbert commented on TEZ-3330:
-

Thanks, this solves the issue!



[jira] [Commented] (TEZ-3330) Error on avro M/R job with Tez: missing configuration property

2016-07-22 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15390313#comment-15390313
 ] 

TezQA commented on TEZ-3330:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12819719/TEZ-3330.temp.2.patch
  against master revision 97fa44f.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-TEZ-Build/1873//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1873//console

This message is automatically generated.


[jira] [Commented] (TEZ-3330) Error on avro M/R job with Tez: missing configuration property

2016-07-22 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15390172#comment-15390172
 ] 

Siddharth Seth commented on TEZ-3330:
-

bq. the "this.conf.addResource(conf)" in the patch does not affect properties 
already present in the initial conf.
Good point. That's not how it should have been done anyway. There's already a 
helper to do this. Let me try making a quick change to the patch. 
[~mandecannes] - feel free to update the patch as well, as you see fit.
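
To make the point about merging concrete, here is a small model of the two behaviours. This is an analogy only: it uses java.util.Map rather than Hadoop's Configuration, the class and method names (ConfMergeSketch, mergeKeepingExisting, mergeOverwriting) and the property name example.key are hypothetical, and it is not the helper referred to above, which this thread does not name. The first method models a merge in which keys already present keep their old value, as reported for the patch; the second copies every entry explicitly so the incoming value wins.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Analogy only: Maps stand in for configuration objects.
final class ConfMergeSketch {

    // Models the reported behaviour: a key already present in base keeps its old value.
    static Map<String, String> mergeKeepingExisting(Map<String, String> base, Map<String, String> extra) {
        Map<String, String> out = new LinkedHashMap<>(base);
        for (Map.Entry<String, String> e : extra.entrySet()) {
            out.putIfAbsent(e.getKey(), e.getValue());
        }
        return out;
    }

    // Explicit copy: every entry from extra is written, replacing any existing value.
    static Map<String, String> mergeOverwriting(Map<String, String> base, Map<String, String> extra) {
        Map<String, String> out = new LinkedHashMap<>(base);
        out.putAll(extra);
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> initial = new LinkedHashMap<>();
        initial.put("example.key", "from-initial-conf");

        Map<String, String> payload = new LinkedHashMap<>();
        payload.put("example.key", "from-job-conf");
        payload.put("avro.output.schema", "\"string\"");

        System.out.println(mergeKeepingExisting(initial, payload));
        System.out.println(mergeOverwriting(initial, payload));
    }
}
```

Both strategies add the missing avro.output.schema entry; they differ only for keys that exist on both sides, which is the case raised in the quoted remark.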

> Error on avro M/R job with Tez: missing configuration property
> --
>
> Key: TEZ-3330
> URL: https://issues.apache.org/jira/browse/TEZ-3330
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Manuel Godbert
> Attachments: TEZ-3330.temp.patch
>
>
> I tried running the simple avro M/R job MapredColorCount, that I found in the 
> examples of avro release 1.7.7.
> It failed with the following trace:
> {code}
> errorMessage=Shuffle Runner 
> Failed:org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$ShuffleError:
>  Error while doing final merge
> at 
> org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.callInternal(Shuffle.java:378)
> at 
> org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.callInternal(Shuffle.java:337)
> at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:744)
> Caused by: java.lang.NullPointerException
> at java.io.StringReader.(StringReader.java:50)
> at org.apache.avro.Schema$Parser.parse(Schema.java:917)
> at org.apache.avro.Schema.parse(Schema.java:966)
> at org.apache.avro.mapred.AvroJob.getMapOutputSchema(AvroJob.java:78)
> at 
> org.apache.avro.mapred.AvroKeyComparator.setConf(AvroKeyComparator.java:39)
> at 
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:76)
> at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:136)
> at 
> org.apache.tez.runtime.library.common.ConfigUtils.getIntermediateInputKeyComparator(ConfigUtils.java:133)
> at 
> org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MergeManager.finalMerge(MergeManager.java:915)
> at 
> org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MergeManager.close(MergeManager.java:540)
> at 
> org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.callInternal(Shuffle.java:376)
> ... 6 more
> {code}
> Digging a bit I saw that during shuffle Tez can't access some of the 
> configuration properties of the job. In our example it is the 
> avro.output.schema that is missing.
> With some more complicated code I could get one step further and a similar 
> issue happened when the valuesIterator for the reducer was being built:
> {code}
> java.lang.NullPointerException
> at java.io.StringReader.(StringReader.java:50)
> at org.apache.avro.Schema$Parser.parse(Schema.java:917)
> at org.apache.avro.Schema.parse(Schema.java:966)
> at org.apache.avro.mapred.AvroJob.getMapOutputSchema(AvroJob.java:78)
> at 
> org.apache.avro.mapred.AvroSerialization.getDeserializer(AvroSerialization.java:53)
> at 
> org.apache.hadoop.io.serializer.SerializationFactory.getDeserializer(SerializationFactory.java:90)
> at 
> org.apache.tez.runtime.library.common.ValuesIterator.(ValuesIterator.java:80)
> at 
> org.apache.tez.runtime.library.input.OrderedGroupedKVInput.createValuesIterator(OrderedGroupedKVInput.java:287)
> {code}
> I am using HDP2.4, Tez 0.7.0, avro 1.7.4
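
Both traces in the description end in the java.io.StringReader constructor, which is what one would expect if the schema text looked up from the configuration were null. The following is a minimal sketch of that failure in isolation, assuming the lookup simply returns null when avro.output.schema is absent. The class MissingSchemaSketch, its methods, and the map-backed configuration are hypothetical stand-ins, not the Avro or Hadoop code.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;

public class MissingSchemaSketch {
    // Hypothetical stand-in for a configuration lookup:
    // returns null when the property was never set.
    public static String lookupSchema(Map<String, String> conf) {
        return conf.get("avro.output.schema");
    }

    // Mimics handing the looked-up schema text to a parser through a StringReader.
    // Reports whether constructing the reader threw a NullPointerException.
    public static boolean failsWithNpe(Map<String, String> conf) {
        try {
            new StringReader(lookupSchema(conf));
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println("property missing, NPE thrown: " + failsWithNpe(conf));

        // Placeholder schema text, used only to show the non-null path.
        conf.put("avro.output.schema", "\"string\"");
        System.out.println("property present, NPE thrown: " + failsWithNpe(conf));
    }
}
```

The sketch only shows why a missing property surfaces as a NullPointerException rather than as a clearer "property not set" error. It says nothing about why the property is missing on the Tez side.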



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TEZ-3330) Error on avro M/R job with Tez: missing configuration property

2016-07-19 Thread Manuel Godbert (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15384408#comment-15384408
 ] 

Manuel Godbert commented on TEZ-3330:
-

Hello, thanks for the patch. I just tested it: it solves the shuffle error but 
not the second issue. The full trace is:

{code}
task:java.lang.NullPointerException
at java.io.StringReader.<init>(StringReader.java:50)
at org.apache.avro.Schema$Parser.parse(Schema.java:917)
at org.apache.avro.Schema.parse(Schema.java:966)
at org.apache.avro.mapred.AvroJob.getMapOutputSchema(AvroJob.java:78)
at org.apache.avro.mapred.AvroSerialization.getDeserializer(AvroSerialization.java:53)
at org.apache.hadoop.io.serializer.SerializationFactory.getDeserializer(SerializationFactory.java:90)
at org.apache.tez.runtime.library.common.ValuesIterator.<init>(ValuesIterator.java:81)
at org.apache.tez.runtime.library.input.OrderedGroupedKVInput.createValuesIterator(OrderedGroupedKVInput.java:280)
at org.apache.tez.runtime.library.input.OrderedGroupedKVInput.waitForInputReady(OrderedGroupedKVInput.java:176)
at org.apache.tez.runtime.library.input.OrderedGroupedKVInput.getReader(OrderedGroupedKVInput.java:240)
at org.apache.tez.mapreduce.processor.reduce.ReduceProcessor.run(ReduceProcessor.java:130)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:344)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:181)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:172)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:172)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:168)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
{code}

Regards


[jira] [Commented] (TEZ-3330) Error on avro M/R job with Tez: missing configuration property

2016-07-13 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375782#comment-15375782
 ] 

Siddharth Seth commented on TEZ-3330:
-

I don't think there's any way to do this at the moment. Attaching a temporary 
patch for this. I don't think fixing this properly is trivial, although we 
could just skip the ConfigBuilders altogether.



[jira] [Commented] (TEZ-3330) Error on avro M/R job with Tez: missing configuration property

2016-07-12 Thread Manuel Godbert (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15372798#comment-15372798
 ] 

Manuel Godbert commented on TEZ-3330:
-

I actually already tried that, with no success: the configuration property 
becomes available during shuffle, but its value is the constant from 
tez-site.xml, not the value built dynamically at job setup.
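
To illustrate the observation, here is a minimal sketch, assuming the task-side view is resolved as "value shipped with the job, else value from the site file". That resolution order, the class SiteValueSketch, and the placeholder schema strings are all assumptions made for illustration, not the actual Tez lookup logic.

```java
import java.util.HashMap;
import java.util.Map;

public class SiteValueSketch {
    // Hypothetical resolution order: prefer the value shipped with the job,
    // fall back to the static site-file value otherwise.
    public static String resolve(String key,
                                 Map<String, String> shippedWithJob,
                                 Map<String, String> siteDefaults) {
        String value = shippedWithJob.get(key);
        return value != null ? value : siteDefaults.get(key);
    }

    public static void main(String[] args) {
        String key = "avro.output.schema";

        // Constant placed in the site file as a workaround (placeholder text).
        Map<String, String> siteDefaults = new HashMap<>();
        siteDefaults.put(key, "SITE_CONSTANT_SCHEMA");

        // The schema built at job setup never reaches the task in this scenario,
        // so the shipped map does not contain the key at all.
        Map<String, String> shippedWithJob = new HashMap<>();

        // The key is no longer missing, but it resolves to the site constant.
        System.out.println(resolve(key, shippedWithJob, siteDefaults));
    }
}
```

Under these assumptions the workaround removes the null value but substitutes a fixed schema, which cannot match a schema that differs from job to job.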



[jira] [Commented] (TEZ-3330) Error on avro M/R job with Tez: missing configuration property

2016-07-11 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15371859#comment-15371859
 ] 

Hitesh Shah commented on TEZ-3330:
--

For now, can you try adding the configs in question into tez-site.xml and see 
if that gets you past the error? 



[jira] [Commented] (TEZ-3330) Error on avro M/R job with Tez: missing configuration property

2016-07-11 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15371177#comment-15371177
 ] 

Siddharth Seth commented on TEZ-3330:
-

That makes sense. Maybe we should consider removing the filtering completely.



[jira] [Commented] (TEZ-3330) Error on avro M/R job with Tez: missing configuration property

2016-07-08 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15367878#comment-15367878
 ] 

Hitesh Shah commented on TEZ-3330:
--

[~sseth] This is likely due to how we keep the configs small in the 
inputs/outputs by filtering out the non-required settings. In MR mode, should 
we just pass in all configs into each Input and Output given that we have no 
guarantees on what is being used/not-used?  
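
The effect of such filtering can be sketched as follows. This is a simplified illustration, assuming a prefix-based allowlist. The class ConfigFilterSketch, the allowlist contents, and the sample values are hypothetical and do not reproduce the actual Tez filtering rules.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ConfigFilterSketch {
    // Hypothetical allowlist of key prefixes kept in an Input/Output payload.
    static final List<String> ALLOWED_PREFIXES = Arrays.asList("tez.runtime.");

    // Keeps only the entries whose key starts with an allowed prefix.
    public static Map<String, String> filtered(Map<String, String> jobConf) {
        Map<String, String> out = new TreeMap<>();
        for (Map.Entry<String, String> entry : jobConf.entrySet()) {
            for (String prefix : ALLOWED_PREFIXES) {
                if (entry.getKey().startsWith(prefix)) {
                    out.put(entry.getKey(), entry.getValue());
                    break;
                }
            }
        }
        return out;
    }

    // The alternative raised above for MR mode: pass every setting through.
    public static Map<String, String> passThrough(Map<String, String> jobConf) {
        return new TreeMap<>(jobConf);
    }

    public static void main(String[] args) {
        Map<String, String> jobConf = new HashMap<>();
        jobConf.put("tez.runtime.io.sort.mb", "100");          // illustrative value
        jobConf.put("avro.output.schema", "JOB_SCHEMA_JSON");  // placeholder text

        System.out.println("filtered:     " + filtered(jobConf).keySet());
        System.out.println("pass-through: " + passThrough(jobConf).keySet());
    }
}
```

With an allowlist like this, a user-defined key such as avro.output.schema is dropped from the payload, while the pass-through variant keeps it at the cost of a larger payload per Input and Output.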
