[jira] [Commented] (TEZ-2741) Hive on Tez does not work well with Sequence Files Schema changes

2016-05-26 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15303501#comment-15303501
 ] 

Gopal V commented on TEZ-2741:
--

[~rajesh.balamohan]: the reported bug cannot be reproduced using Hive alone.

The bug scenario requires Pig-written SequenceFiles in the same directory as 
Hive-written ones (specifically, the Pig data was not generated via HCatalog).
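
One way to see which files in a directory carry a different key class is to 
look at the class names recorded in each SequenceFile header. The sketch below 
is plain Java for illustration only, not Hive or Tez code. It assumes the 
header layout of recent SequenceFile versions (the bytes 'S', 'E', 'Q', a 
version byte, then the key and value class names as length-prefixed UTF-8 
strings) and only handles names short enough for a single-byte length. The 
names SeqHeaderSketch, keyClassOf and fakeHeader are hypothetical, and the 
header is built in memory rather than read from HDFS.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class SeqHeaderSketch {
  // Reads a string stored as one length byte followed by UTF-8 bytes.
  // Assumption: names longer than 127 bytes (multi-byte lengths) are not handled.
  static String readShortString(ByteBuffer in) {
    int len = in.get();
    if (len < 0) {
      throw new IllegalStateException("multi-byte length not handled in this sketch");
    }
    byte[] buf = new byte[len];
    in.get(buf);
    return new String(buf, StandardCharsets.UTF_8);
  }

  // Returns the key class name recorded at the start of the header bytes.
  static String keyClassOf(ByteBuffer header) {
    if (header.get() != 'S' || header.get() != 'E' || header.get() != 'Q') {
      throw new IllegalArgumentException("not a SequenceFile header");
    }
    header.get(); // version byte, ignored here
    return readShortString(header);
  }

  // Builds a header prefix in memory, for demonstration only.
  static ByteBuffer fakeHeader(String keyClass, String valueClass) {
    byte[] k = keyClass.getBytes(StandardCharsets.UTF_8);
    byte[] v = valueClass.getBytes(StandardCharsets.UTF_8);
    if (k.length > 127 || v.length > 127) {
      throw new IllegalArgumentException("class name too long for this sketch");
    }
    ByteBuffer b = ByteBuffer.allocate(4 + 1 + k.length + 1 + v.length);
    b.put((byte) 'S').put((byte) 'E').put((byte) 'Q').put((byte) 6);
    b.put((byte) k.length).put(k);
    b.put((byte) v.length).put(v);
    b.flip();
    return b;
  }

  public static void main(String[] args) {
    String expected = "org.apache.hadoop.io.BytesWritable";
    ByteBuffer header =
        fakeHeader("org.apache.hadoop.io.LongWritable", "org.apache.hadoop.io.Text");
    String actual = keyClassOf(header);
    if (!expected.equals(actual)) {
      System.out.println("key class mismatch: " + actual + " is not " + expected);
    }
  }
}
```

Running a check of this kind over every file under a partition would show 
whether the directory mixes key classes before a query is attempted.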



> Hive on Tez does not work well with Sequence Files Schema changes
> -
>
> Key: TEZ-2741
> URL: https://issues.apache.org/jira/browse/TEZ-2741
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Rajat Jain
>Assignee: Gopal V
> Attachments: TEZ-2741.1.patch, garbled_text
>
>
> {code}
> hive> create external table foo (a string) partitioned by (p string) stored 
> as sequencefile location 'hdfs:///user/hive/foo'
> # A useless file with some text in hdfs
> hive> create external table tmp_foo (a string) location 
> 'hdfs:///tmp/random_data'
> hive> insert overwrite table foo partition (p = '1') select * from tmp_foo
> {code}
> After this step, {{foo}} contains one partition with a text file.
> Now use this Java program to generate the second sequence file (but with a 
> different key class)
> {code}
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.fs.Path;
> import org.apache.hadoop.io.BytesWritable;
> import org.apache.hadoop.io.LongWritable;
> import org.apache.hadoop.io.Text;
> import org.apache.hadoop.mapreduce.Job;
> import org.apache.hadoop.mapreduce.Mapper;
> import org.apache.hadoop.mapreduce.Reducer;
> import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
> import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;
> import java.io.IOException;
> public class SequenceFileWriter {
>   public static void main(String[] args) throws IOException,
>       InterruptedException, ClassNotFoundException {
>     Configuration conf = new Configuration();
>     Job job = new Job(conf);
>     job.setJobName("Convert Text");
>     job.setJarByClass(Mapper.class);
>     job.setMapperClass(Mapper.class);
>     job.setReducerClass(Reducer.class);
>     // increase if you need sorting or a special number of files
>     job.setNumReduceTasks(0);
>     job.setOutputKeyClass(LongWritable.class);
>     job.setOutputValueClass(Text.class);
>     job.setOutputFormatClass(SequenceFileOutputFormat.class);
>     job.setInputFormatClass(TextInputFormat.class);
>     TextInputFormat.addInputPath(job, new Path("/tmp/random_data"));
>     SequenceFileOutputFormat.setOutputPath(job,
>         new Path("/user/hive/foo/p=2/"));
>     // submit and wait for completion
>     job.waitForCompletion(true);
>   }
> }
> {code}
> Now run {{select count(*) from foo;}}. It passes with MapReduce, but fails 
> with Tez with the following error:
> {code}
> hive> set hive.execution.engine=tez;
> hive> select count(*) from foo;
> Status: Failed
> Vertex failed, vertexName=Map 1, vertexId=vertex_1438013895843_0007_1_00, 
> diagnostics=[Task failed, taskId=task_1438013895843_0007_1_00_00, 
> diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running 
> task:java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: 
> java.io.IOException: While processing file 
> hdfs://localhost:9000/user/hive/foo/p=2/part-m-0. wrong key class: 
> org.apache.hadoop.io.BytesWritable is not class 
> org.apache.hadoop.io.LongWritable
>   at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171)
>   at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
>   at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:337)
>   at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)
>   at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1635)
>   at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
>   at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
>   at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: 

[jira] [Commented] (TEZ-3266) DAG failed when yarn resources is rare like " No groups available for user A" because DAGAppMaster launched and exit_with_sucessful immediately.

2016-05-26 Thread Feng Yuan (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15303488#comment-15303488
 ] 

Feng Yuan commented on TEZ-3266:


Hi [~hitesh], I confirm that I set "tez.session.client.timeout.secs" to -1, 
but that still does not work around the problem.

App logs:
Container: container_1463493135662_92725_01_01 on XXX
=
LogType:stderr
Log Upload Time:26-May-2016 16:46:58
LogLength:0
Log Contents:

LogType:stdout
Log Upload Time:26-May-2016 16:46:58
LogLength:948
Log Contents:
Heap
 PSYoungGen  total 301056K, used 23304K [0x0007ccc8, 
0x0007e1c0, 0x0008)
  eden space 258560K, 9% used 
[0x0007ccc8,0x0007d5c75578,0x0007dc90)
lgrp 0 space 129280K, 4% used 
[0x0007ccc8,0x0007cd18cd88,0x0007d4ac)
lgrp 1 space 129280K, 14% used 
[0x0007d4ac,0x0007d5c75578,0x0007dc90)
  from space 42496K, 0% used 
[0x0007df28,0x0007df28,0x0007e1c0)
  to   space 42496K, 0% used 
[0x0007dc90,0x0007dc90,0x0007df28)
 ParOldGen   total 686592K, used 0K [0x00076660, 
0x00079048, 0x0007ccc8)
  object space 686592K, 0% used 
[0x00076660,0x00076660,0x00079048)
 PSPermGen   total 21504K, used 6480K [0x00076140, 
0x00076290, 0x00076660)
  object space 21504K, 30% used 
[0x00076140,0x000761a540a8,0x00076290)

LogType:syslog
Log Upload Time:26-May-2016 16:46:58
LogLength:0
Log Contents:

I see the app is killed from the ACCEPTED state:
2016-05-26 16:46:53,964 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Storing application with id application_1463493135662_92725
2016-05-26 16:46:53,964 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=bre  IP=192.168.44.40  OPERATION=Submit Application Request  TARGET=ClientRMService  RESULT=SUCCESS  APPID=application_1463493135662_92725
2016-05-26 16:46:53,964 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1463493135662_92725 State change from NEW to NEW_SAVING
2016-05-26 16:46:53,964 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Storing info for app: application_1463493135662_92725
2016-05-26 16:46:53,966 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1463493135662_92725 State change from NEW_SAVING to SUBMITTED
2016-05-26 16:46:53,967 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Accepted application application_1463493135662_92725 from user: bre, in queue: default, currently num of applications: 3
2016-05-26 16:46:53,967 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1463493135662_92725 State change from SUBMITTED to ACCEPTED
2016-05-26 16:46:53,976 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=bre  OPERATION=AM Allocated Container  TARGET=SchedulerApp  RESULT=SUCCESS  APPID=application_1463493135662_92725  CONTAINERID=container_1463493135662_92725_01_01
2016-05-26 16:46:53,977 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: Storing attempt: AppId: application_1463493135662_92725 AttemptId: appattempt_1463493135662_92725_01 MasterContainer: Container: [ContainerId: container_1463493135662_92725_01_01, NodeId: bjlg-44p129-hadoop115.bfdabc.com:2717, NodeHttpAddress: bjlg-44p129-hadoop115.bfdabc.com:8042, Resource: , Priority: 0, Token: Token { kind: ContainerToken, service: 192.168.44.129:2717 }, ]
2016-05-26 16:46:58,379 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1463493135662_92725 State change from ACCEPTED to KILLING
2016-05-26 16:46:58,381 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: Watcher event type: NodeDataChanged with state:SyncConnected for path:/rmstore/ZKRMStateRoot/RMAppRoot/application_1463493135662_92725/appattempt_1463493135662_92725_01 for Service org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore in state org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: STARTED
2016-05-26 16:46:58,382 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Updating application application_1463493135662_92725 with final state: KILLED
2016-05-26 16:46:58,382 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1463493135662_92725 State change from KILLING to FINAL_SAVING
2016-05-26 16:46:58,382 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Updating info for app: application_1463493135662_92725
2016-05-26 16:46:58,382 INFO 

[jira] [Commented] (TEZ-3265) Add preconditions check in SortSpan when available mem is less than metasize

2016-05-26 Thread Rajesh Balamohan (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15303369#comment-15303369
 ] 

Rajesh Balamohan commented on TEZ-3265:
---

Agreed. I saw "reserved.remaining()=14680064, reserved.metasize=16777216" in 
one of the jobs, which was odd given the existing checks. Increasing the memory 
was the easier workaround, but I will check why this happened.
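
For illustration, the failure mode and the proposed check can be sketched in 
plain Java. This is not the PipelinedSorter code: the class and method names 
(SortSpanPreconditionSketch, skipMetadata) are hypothetical, and the sketch 
only assumes that the span advances the buffer position by metasize, which is 
what makes java.nio.Buffer.position throw a bare IllegalArgumentException when 
metasize is larger than the remaining bytes.

```java
import java.nio.ByteBuffer;

final class SortSpanPreconditionSketch {
  // Advances the buffer past the metadata region, failing early with a
  // descriptive message instead of the message-less exception from Buffer.
  static ByteBuffer skipMetadata(ByteBuffer reserved, int metasize) {
    if (metasize > reserved.remaining()) {
      throw new IllegalArgumentException(String.format(
          "metasize (%d) exceeds reserved.remaining() (%d)",
          metasize, reserved.remaining()));
    }
    reserved.position(reserved.position() + metasize);
    return reserved;
  }

  public static void main(String[] args) {
    // Values taken from the log line quoted in this issue.
    ByteBuffer reserved = ByteBuffer.allocate(14680064);
    try {
      skipMetadata(reserved, 16777216);
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

With a check like this the task would still fail, but the exception would name 
both values rather than leaving them to be found in the INFO log.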

> Add preconditions check in SortSpan when available mem is less than metasize
> 
>
> Key: TEZ-3265
> URL: https://issues.apache.org/jira/browse/TEZ-3265
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Priority: Minor
> Attachments: TEZ-3265.1.patch
>
>
> {noformat}
> 2016-05-21 09:01:48,523 [INFO] [TezChild] |impl.PipelinedSorter|: Reducer 3: reserved.remaining()=14680064, reserved.metasize=16777216
> ...
> ...
> ...
> ...
> Caused by: java.lang.IllegalArgumentException
>   at java.nio.Buffer.position(Buffer.java:244)
>   at org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter$SortSpan.<init>(PipelinedSorter.java:737)
>   at org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter.sort(PipelinedSorter.java:255)
>   at org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter.collect(PipelinedSorter.java:310)
>   at org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter.write(PipelinedSorter.java:283)
>   at org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput$1.write(OrderedPartitionedKVOutput.java:164)
>   at org.apache.hadoop.hive.ql.exec.tez.TezProcessor$TezKVOutputCollector.collect(TezProcessor.java:198)
>   at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.collect(ReduceSinkOperator.java:542)
>   at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.process(ReduceSinkOperator.java:386)
>   ... 45 more
> {noformat}
> It would be good to have a Preconditions check in SortSpan instead of 
> throwing exception from Buffer.java.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TEZ-3063) Tez UI : Display Input, Output, Processor, Source and Sink configurations under a vertex

2016-05-26 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15303129#comment-15303129
 ] 

Hitesh Shah commented on TEZ-3063:
--

Visually, the new configs tab looks great. 

Minor nits: 
  - As part of some basic UX testing with a couple of users, we noticed that a 
lot of folks accidentally clicked on the vertex name, which took them to the 
configs tab of a different vertex when they expected to see, say, the configs 
of an Input from Map 1. Can we change the vertex name to be a non-link for now?
  - "Configurations: 250" (s/Configurations 250/Configurations: 250/) or 
"250 Config Properties" would be better, as "Configurations 1" makes it sound 
like an index/ID.
  - For "Configurations 0", is the data present but empty, or not present at 
all? We can probably just use "Configuration not available".

> Tez UI : Display Input, Output, Processor, Source and Sink configurations 
> under a vertex
> 
>
> Key: TEZ-3063
> URL: https://issues.apache.org/jira/browse/TEZ-3063
> Project: Apache Tez
>  Issue Type: Sub-task
>Reporter: Sreenath Somarajapuram
>Assignee: Sreenath Somarajapuram
> Attachments: TEZ-3063.1.patch, Vertex configs.png
>
>
> New tab/page:
> - Create a new configurations tab under Vertex.
> - Add a visualization as in the attached wireframe image with Processor, 
> Sources, Sinks, Inputs and Outputs.
> Interaction:
> - On clicking the user would be shown the related details (name, desc, class 
> etc), and a tabular representation of the configuration.
> - On clicking a vertex name on the visualization, user would be redirected to 
> the configurations page of the respective vertex.





[jira] [Commented] (TEZ-3271) Provide mapreduce failures.maxpercent equivalent

2016-05-26 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302945#comment-15302945
 ] 

TezQA commented on TEZ-3271:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12806460/TEZ-3271.2.patch
  against master revision 89802b1.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in :
   org.apache.tez.dag.app.dag.impl.TestVertexImpl
  org.apache.tez.test.TestTaskErrorsUsingLocalMode
  org.apache.tez.test.TestFaultTolerance
  org.apache.tez.test.TestExceptionPropagation
  org.apache.tez.test.TestLocalMode
  org.apache.tez.history.TestHistoryParser

Test results: 
https://builds.apache.org/job/PreCommit-TEZ-Build/1755//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1755//console

This message is automatically generated.

> Provide mapreduce failures.maxpercent equivalent
> 
>
> Key: TEZ-3271
> URL: https://issues.apache.org/jira/browse/TEZ-3271
> Project: Apache Tez
>  Issue Type: New Feature
>Reporter: Jonathan Eagles
>Assignee: Jonathan Eagles
> Attachments: TEZ-3271.1.patch, TEZ-3271.2.patch
>
>
> mapreduce.map.failures.maxpercent
> mapreduce.reduce.failures.maxpercent
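
As a rough illustration of what an equivalent setting would have to decide, 
the sketch below models a failure-percent tolerance check in plain Java. It 
is not taken from the attached patch: the class and method names are 
hypothetical, and the comparison (failed * 100 > maxPercent * total) is an 
assumption about the semantics.

```java
final class FailureToleranceSketch {
  // Returns true when the failed tasks exceed the tolerated percentage,
  // i.e. when the vertex or job should be treated as failed.
  // Integer arithmetic avoids rounding; long avoids overflow.
  static boolean exceedsTolerance(int failedTasks, int totalTasks, int maxFailuresPercent) {
    return (long) failedTasks * 100 > (long) maxFailuresPercent * totalTasks;
  }

  public static void main(String[] args) {
    // With a tolerance of 0 percent, a single failed task is fatal.
    System.out.println(exceedsTolerance(1, 146, 0));
    // With a tolerance of 1 percent, one failure out of 146 tasks is tolerated.
    System.out.println(exceedsTolerance(1, 146, 1));
  }
}
```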





Failed: TEZ-3271 PreCommit Build #1755

2016-05-26 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/TEZ-3271
Build: https://builds.apache.org/job/PreCommit-TEZ-Build/1755/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 1475 lines...]
awk: cannot open /home/jenkins/jenkins-slave/workspace/PreCommit-TEZ-Build@2/../patchprocess/testrun.txt (No such file or directory)





==
==
Adding comment to Jira.
==
==


Comment added.
652ac13effa98ae017c22e23ac9d864504c81892 logged out


==
==
Finished build.
==
==


Build step 'Execute shell' marked build as failure
Archiving artifacts
[description-setter] Could not determine description.
Recording test results
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any



###
## FAILED TESTS (if any) 
##
24 tests failed.
FAILED:  org.apache.tez.dag.app.dag.impl.TestVertexImpl.testVertexFailure

Error Message:
expected: but was:

Stack Trace:
java.lang.AssertionError: expected: but was:
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:743)
at org.junit.Assert.assertEquals(Assert.java:118)
at org.junit.Assert.assertEquals(Assert.java:144)
at org.apache.tez.dag.app.dag.impl.TestVertexImpl.testVertexFailure(TestVertexImpl.java:3296)


FAILED:  org.apache.tez.dag.app.dag.impl.TestVertexImpl.testKilledTasksHandling

Error Message:
expected: but was:

Stack Trace:
java.lang.AssertionError: expected: but was:
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:743)
at org.junit.Assert.assertEquals(Assert.java:118)
at org.junit.Assert.assertEquals(Assert.java:144)
at org.apache.tez.dag.app.dag.impl.TestVertexImpl.testKilledTasksHandling(TestVertexImpl.java:3395)


FAILED:  org.apache.tez.dag.app.dag.impl.TestVertexImpl.testVertexTaskFailure

Error Message:
expected: but was:

Stack Trace:
java.lang.AssertionError: expected: but was:
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:743)
at org.junit.Assert.assertEquals(Assert.java:118)
at org.junit.Assert.assertEquals(Assert.java:144)
at org.apache.tez.dag.app.dag.impl.TestVertexImpl.testVertexTaskFailure(TestVertexImpl.java:3447)


FAILED:  org.apache.tez.history.TestHistoryParser.testParserWithFailedJob

Error Message:
null

Stack Trace:
java.lang.AssertionError: null
at org.junit.Assert.fail(Assert.java:86)
at org.junit.Assert.assertTrue(Assert.java:41)
at org.junit.Assert.assertTrue(Assert.java:52)
at org.apache.tez.history.TestHistoryParser.testParserWithFailedJob(TestHistoryParser.java:381)


FAILED:  

[jira] [Updated] (TEZ-3277) Tez UI: Improve the error bar

2016-05-26 Thread Sreenath Somarajapuram (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-3277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sreenath Somarajapuram updated TEZ-3277:

Description: 
- When a request fails, the error bar must display the complete URL that was hit
- Right now we just support closing of the bar. We must have some way to open 
it.

- Investigate if we can display an error history.
- Investigate how the messages can be made better.

  was:
- When a request fails, the error bar must display the complete URL that was hit
- Right now we just support closing of the bar. We must have some way to open 
it.
- Investigate if we can display an error history.


> Tez UI: Improve the error bar
> -
>
> Key: TEZ-3277
> URL: https://issues.apache.org/jira/browse/TEZ-3277
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Sreenath Somarajapuram
>Assignee: Sreenath Somarajapuram
>
> - When a request fails, the error bar must display the complete URL that was 
> hit
> - Right now we just support closing of the bar. We must have some way to open 
> it.
> - Investigate if we can display an error history.
> - Investigate how the messages can be made better.





[jira] [Updated] (TEZ-3271) Provide mapreduce failures.maxpercent equivalent

2016-05-26 Thread Jonathan Eagles (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-3271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles updated TEZ-3271:
-
Attachment: TEZ-3271.2.patch



[jira] [Commented] (TEZ-3265) Add preconditions check in SortSpan when available mem is less than metasize

2016-05-26 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302550#comment-15302550
 ] 

Siddharth Seth commented on TEZ-3265:
-

I'm not sure the precondition check will ever be hit - given the calculation 
above it.
{code}
if (capacity < (metasize + dataSize)) {
  // try to allocate less meta space, because we have sample data
  metasize = METASIZE * (capacity / (perItem + METASIZE));
}
{code}
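
To make the arithmetic concrete, here is a small standalone sketch of that 
calculation applied to the values from the log line in this issue. The 
METASIZE constant (16), the perItem value (100) and the dataSize value (0) 
are assumptions chosen for illustration, and the class and method names are 
hypothetical.

```java
final class MetasizeRescaleSketch {
  // Assumption: bytes of metadata per record; the real constant may differ.
  static final int METASIZE = 16;

  // Mirrors the snippet above: shrink metasize when it cannot fit together
  // with the data in the available capacity.
  static int rescale(int capacity, int metasize, int dataSize, int perItem) {
    if (capacity < (metasize + dataSize)) {
      // capacity / (perItem + METASIZE) is the number of records that fit,
      // so the result can never exceed capacity when perItem >= 0.
      metasize = METASIZE * (capacity / (perItem + METASIZE));
    }
    return metasize;
  }

  public static void main(String[] args) {
    int capacity = 14680064;  // reserved.remaining() from the log
    int metasize = 16777216;  // reserved.metasize from the log
    int rescaled = rescale(capacity, metasize, 0, 100);
    System.out.println("rescaled metasize = " + rescaled
        + ", fits = " + (rescaled <= capacity));
  }
}
```

Under these assumptions the rescaled metasize always fits in the capacity, 
which is why a later precondition on the same values would not be expected to 
fire if this branch had run.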



[jira] [Commented] (TEZ-3274) Vertex with MRInput and shuffle input does not respect slow start

2016-05-26 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15302549#comment-15302549
 ] 

Bikas Saha commented on TEZ-3274:
-

There probably isn't. We could use this one. Or, if you need an urgent point 
fix in this jira, some scheduling heuristics could be added optionally to 
RootInputInitializer. Though I am not sure what exactly is happening. Since 
these tasks also read data from HDFS, why would we not want them to start ASAP 
if there is spare capacity? Slow start effectively also tries to start tasks 
as soon as possible (in fact sooner than their inputs are ready, so I am not 
sure why it was called slow start when it could have been called eager start :) ).
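
As a simplified model of what the two fractions mean, the sketch below 
computes how many tasks of a vertex would be eligible to start for a given 
number of completed source tasks: none below the minimum fraction, all at or 
above the maximum fraction, and a linear ramp in between. It is not the 
ShuffleVertexManager implementation. The class and method names are 
hypothetical and the fractions used in main are illustrative values only.

```java
final class SlowStartSketch {
  // Number of this vertex's tasks eligible to be scheduled, given how many
  // source tasks have completed and the min/max source fractions.
  static int tasksToSchedule(int totalTasks, int completedSources, int totalSources,
                             double minFraction, double maxFraction) {
    if (totalSources == 0) {
      return totalTasks; // nothing to wait for
    }
    double done = (double) completedSources / totalSources;
    if (done < minFraction) {
      return 0;
    }
    if (done >= maxFraction) {
      return totalTasks;
    }
    // Linear ramp between the two fractions.
    double ramp = (done - minFraction) / (maxFraction - minFraction);
    return (int) Math.floor(ramp * totalTasks);
  }

  public static void main(String[] args) {
    for (int completed = 0; completed <= 10; completed += 2) {
      System.out.println(completed + "/10 sources done -> "
          + tasksToSchedule(100, completed, 10, 0.25, 0.75) + " tasks eligible");
    }
  }
}
```

In this model, starting every task immediately corresponds to setting both 
fractions to 0, which matches the behavior the issue reports for vertices 
managed by RootInputVertexManager.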

> Vertex with MRInput and shuffle input does not respect slow start
> -
>
> Key: TEZ-3274
> URL: https://issues.apache.org/jira/browse/TEZ-3274
> Project: Apache Tez
>  Issue Type: Bug
>Reporter: Jonathan Eagles
>
> Vertices with shuffle input and MRInput choose RootInputVertexManager (and 
> not ShuffleVertexManager) and start containers and tasks immediately. In this 
> scenario, resources can be wasted since they do not respect 
> tez.shuffle-vertex-manager.min-src-fraction and 
> tez.shuffle-vertex-manager.max-src-fraction. 





[jira] [Commented] (TEZ-3273) In one vexter has some task failed,DAG will stuck forever.

2016-05-26 Thread Feng Yuan (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15301881#comment-15301881
 ] 

Feng Yuan commented on TEZ-3273:


Thanks, I will check it.

> In one vexter has some task failed,DAG will stuck forever.
> --
>
> Key: TEZ-3273
> URL: https://issues.apache.org/jira/browse/TEZ-3273
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.5.2
> Environment: hive0.14 hadoop2.6
>Reporter: Feng Yuan
>
> Map 1: 145(+0,-1)/146 Reducer 2: 0/415
> Map 1: 145(+0,-1)/146 Reducer 2: 0/415
> Map 1: 145(+0,-1)/146 Reducer 2: 0/415
> Map 1: 145(+0,-1)/146 Reducer 2: 0/415
> Map 1: 145(+0,-1)/146 Reducer 2: 0/415
> Map 1: 145(+0,-1)/146 Reducer 2: 0/415
> Map 1: 145(+0,-1)/146 Reducer 2: 0/415
> stuck forever~





[jira] [Updated] (TEZ-3265) Add preconditions check in SortSpan when available mem is less than metasize

2016-05-26 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-3265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated TEZ-3265:
--
Attachment: TEZ-3265.1.patch



[jira] [Commented] (TEZ-3273) In one vexter has some task failed,DAG will stuck forever.

2016-05-26 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15301598#comment-15301598
 ] 

Jeff Zhang commented on TEZ-3273:
-

Use the instructions here to enable log aggregation: 
http://hortonworks.com/blog/simplifying-user-logs-management-and-access-in-yarn/



[jira] [Commented] (TEZ-3273) In one vexter has some task failed,DAG will stuck forever.

2016-05-26 Thread Feng Yuan (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15301586#comment-15301586
 ] 

Feng Yuan commented on TEZ-3273:


I have already done the above. Is it possible for logs to disappear if the 
app fails, the AM fails, or in some other scenario?
