[jira] [Comment Edited] (HIVE-14091) some errors are not propagated to LLAP external clients

2016-06-29 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15355592#comment-15355592
 ] 

Sergey Shelukhin edited comment on HIVE-14091 at 6/29/16 6:38 PM:
--

Do you want to update the patch? Was the entire process failure path removed or 
just taskFailed? Is the error still propagated? Note the c/p above. How does 
the test fail with the call removed?


was (Author: sershe):
Do you want to update the patch? Was the entire process failure path removed or 
just taskFailed?

> some errors are not propagated to LLAP external clients
> ---
>
> Key: HIVE-14091
> URL: https://issues.apache.org/jira/browse/HIVE-14091
> Project: Hive
>  Issue Type: Bug
>Reporter: Jason Dere
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14091.01.patch, HIVE-14091.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-14091) some errors are not propagated to LLAP external clients

2016-06-27 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15352006#comment-15352006
 ] 

Siddharth Seth edited comment on HIVE-14091 at 6/27/16 11:06 PM:
-

The main change here is to close the socket when an exception occurs, correct? 
The hope is that this makes the InputStream read return immediately, since 
the interrupt does not work. AFAIK this is best effort, and there's a 
comment in the patch that says the same.
This will cause any reads on the InputStream to fail, likely with a 
ClosedChannelException (or equivalent). Do we need to handle this in a specific 
manner in the reader code, at least to indicate the kind of error so that 
debugging is easier?
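
To illustrate the mechanism being discussed, here is a minimal sketch, not the code from the patch. It assumes a plain java.net.Socket on the loopback interface; the class and method names are hypothetical. With this kind of socket the blocked read fails with a SocketException, whereas a channel-backed stream may instead surface a ClosedChannelException or similar.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical demo class; not part of Hive or LLAP.
class SocketCloseUnblocksRead {

    /** Returns the exception seen by the reader thread, or null if none was seen in time. */
    static IOException closeWhileReading() {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
            // The accepted peer never writes, so a read on the client side blocks.
            try (Socket peer = server.accept()) {
                AtomicReference<IOException> seen = new AtomicReference<>();
                CountDownLatch done = new CountDownLatch(1);
                Thread reader = new Thread(() -> {
                    try {
                        InputStream in = client.getInputStream();
                        in.read(); // blocks until data arrives or the socket is closed
                    } catch (IOException e) {
                        seen.set(e); // keep the failure so the caller can inspect it
                    } finally {
                        done.countDown();
                    }
                });
                reader.start();
                Thread.sleep(200);   // give the reader a chance to block in read()
                client.close();      // closing the socket fails the pending read
                done.await(5, TimeUnit.SECONDS);
                return seen.get();
            }
        } catch (IOException | InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Reader saw: " + closeWhileReading());
    }
}
```

The reader only learns that the socket was closed, not why, which is the debugging concern raised above.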

Mostly unrelated to this jira.
{code}
case ERROR:
  throw new IOException("Received reader event error: " + event.getMessage());
default:
  throw new IOException("Got reader event type " + event.getEventType() + ", expected error event");
{code}
This gets rid of the original exception. Would be worth propagating the 
exception further up, or at least logging it.
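
A minimal sketch of what propagating could look like, assuming the original exception is still in scope where the IOException is built. The helper name wrap and the sample message are hypothetical and are not taken from the patch.

```java
import java.io.IOException;
import java.nio.channels.ClosedChannelException;

// Hypothetical helper; illustrates exception chaining only.
class ReaderErrorWrapping {

    /** Builds the reader error while keeping the original exception as the cause. */
    static IOException wrap(String eventMessage, Throwable original) {
        // IOException(String, Throwable) keeps the original stack trace reachable via getCause().
        return new IOException("Received reader event error: " + eventMessage, original);
    }

    public static void main(String[] args) {
        Throwable original = new ClosedChannelException();
        IOException wrapped = wrap("sample message", original);
        // The chained cause is printed under "Caused by:" in the stack trace.
        wrapped.printStackTrace();
    }
}
```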

I don't think the addition of taskFailed on the Responder is required. This 
will be invoked in any case when the Umbilical heartbeat implementation invokes 
responder.heartbeat. Adding the method means the error is sent to the 
responder twice.
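
For illustration only: if both notification paths were kept, a guard along these lines would stop the second report from reaching the client. This is a hypothetical sketch with made-up names, not the Responder interface from the patch, and dropping the extra call is the simpler fix.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical class; not the Responder interface from Hive.
class OnceOnlyFailureReporter {
    private final AtomicBoolean reported = new AtomicBoolean(false);
    private final List<String> delivered = new ArrayList<>();

    /** Delivers the failure only on the first call; returns false for any repeat. */
    boolean reportFailure(String message) {
        if (!reported.compareAndSet(false, true)) {
            return false; // a failure was already delivered via another path
        }
        synchronized (delivered) {
            delivered.add(message);
        }
        return true;
    }

    /** Number of failures actually delivered (0 or 1). */
    int deliveredCount() {
        synchronized (delivered) {
            return delivered.size();
        }
    }

    public static void main(String[] args) {
        OnceOnlyFailureReporter reporter = new OnceOnlyFailureReporter();
        reporter.reportFailure("from taskFailed");
        reporter.reportFailure("from heartbeat");
        System.out.println("delivered: " + reporter.deliveredCount());
    }
}
```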

Should the socket also be cleaned up during ReaderBase.close()?






was (Author: sseth):
The main change here is to close the socket in case of an exception, correct? 
and hope that this causes the InputStream read to return immediately - since 
the interrupt does not work. Afaik - this is best effort - and there's a 
comment in the patch which indicates the same.
This will cause any reads on the InputStream to fail - likely with a 
ClosedChannelException (or equivalent). Do we need to handle this in a specific 
manner in the reader code - at least to indicate the kind of error so that 
debugging is easier.

Mostly unrelated to this jira.
{code}
case ERROR:
  throw new IOException("Received reader event error: " + 
event.getMessage());
default:
  throw new IOException("Got reader event type " + 
event.getEventType() + ", expected error event");
{code}
This gets rid of the original exception. Would be worth propagating the 
exception further up, or at least logging it.

I don't think the addition of taskFailed on the Responder is required. This 
will be invoked in any case when the Umbilical heartbeat implementation invokes 
responder.heartbeat. (adding the method implies the error being sent twice to 
the responder)

Should the socket also be cleaned up during ReaderBase.close()

Kind of related to the patch.
{code}





> some errors are not propagated to LLAP external clients
> ---
>
> Key: HIVE-14091
> URL: https://issues.apache.org/jira/browse/HIVE-14091
> Project: Hive
>  Issue Type: Bug
>Reporter: Jason Dere
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14091.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-14091) some errors are not propagated to LLAP external clients

2016-06-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15348611#comment-15348611
 ] 

Sergey Shelukhin edited comment on HIVE-14091 at 6/24/16 9:00 PM:
--

I added an exception to the end-to-end test. Without the patch, the test times 
out. With the patch, the error looks like this:
{noformat}
Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 36.07 sec <<< FAILURE! - in org.apache.hive.jdbc.TestJdbcWithMiniLlap
testLlapInputFormatEndToEnd(org.apache.hive.jdbc.TestJdbcWithMiniLlap)  Time elapsed: 5.089 sec  <<< ERROR!
java.io.IOException: Received reader event error: Received an error for task ID attempt_8772768970312654090_0001_0_00_00_0: Error while running task ( failure ) : attempt_8772768970312654090_0001_0_00_00_0:java.lang.RuntimeException: java.lang.Exception: boom!
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:211)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:168)
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:355)
    at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:72)
    at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:60)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1656)
    at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:60)
    at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:36)
    at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.Exception: boom!
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getMRInput(MapRecordProcessor.java:497)
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:171)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:184)
    ... 14 more

    at org.apache.hadoop.hive.llap.LlapBaseRecordReader.next(LlapBaseRecordReader.java:163)
    at org.apache.hadoop.hive.llap.LlapBaseRecordReader.next(LlapBaseRecordReader.java:47)
    at org.apache.hadoop.hive.llap.LlapRowRecordReader.next(LlapRowRecordReader.java:107)
    at org.apache.hadoop.hive.llap.LlapRowRecordReader.next(LlapRowRecordReader.java:56)
    at org.apache.hive.jdbc.TestJdbcWithMiniLlap.getLlapIFRowCount(TestJdbcWithMiniLlap.java:201)
    at org.apache.hive.jdbc.TestJdbcWithMiniLlap.testLlapInputFormatEndToEnd(TestJdbcWithMiniLlap.java:223)
{noformat}


was (Author: sershe):
I added some exception to the end-to-end test. Without the patch, the test 
times out. With the patch, the error looks like this:
{noformat}
Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 36.07 sec <<< FAILURE! - in org.apache.hive.jdbc.TestJdbcWithMiniLlap
testLlapInputFormatEndToEnd(org.apache.hive.jdbc.TestJdbcWithMiniLlap)  Time elapsed: 5.089 sec  <<< ERROR!
java.io.IOException: Received reader event error: Received an error for task ID attempt_8772768970312654090_0001_0_00_00_0: Error while running task ( failure ) : attempt_8772768970312654090_0001_0_00_00_0:java.lang.RuntimeException: java.lang.Exception: boom!
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:211)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:168)
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:355)
    at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:72)
    at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:60)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1656)
    at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:60)
    at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:36)
    at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)