subject:"\[jira\] \[Updated\] \(SPARK\-21067\) Thrift Server \- CTAS fail with Unable to move source"

[jira] [Updated] (SPARK-21067) Thrift Server - CTAS fail with Unable to move source

2019-06-17 Thread Dominic Ricard (JIRA)



 [ 
https://issues.apache.org/jira/browse/SPARK-21067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dominic Ricard updated SPARK-21067:
---
Affects Version/s: 2.4.3

> Thrift Server - CTAS fail with Unable to move source
> 
>
> Key: SPARK-21067
> URL: https://issues.apache.org/jira/browse/SPARK-21067
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.1.1, 2.2.0, 2.4.0, 2.4.3
> Environment: Yarn
> Hive MetaStore
> HDFS (HA)
>Reporter: Dominic Ricard
>Priority: Major
> Attachments: SPARK-21067.patch
>
>
> After upgrading our Thrift cluster to 2.1.1, we ran into an issue where CTAS 
> would fail, sometimes...
> Most of the time, the CTAS would work only once, after starting the thrift 
> server. After that, dropping the table and re-issuing the same CTAS would 
> fail with the following message (Sometime, it fails right away, sometime it 
> work for a long period of time):
> {noformat}
> Error: org.apache.spark.sql.AnalysisException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Unable to move source 
> hdfs://nameservice1//tmp/hive-staging/thrift_hive_2017-06-12_16-56-18_464_7598877199323198104-31/-ext-1/part-0
>  to destination 
> hdfs://nameservice1/user/hive/warehouse/dricard.db/test/part-0; 
> (state=,code=0)
> {noformat}
> We have already found the following Jira 
> (https://issues.apache.org/jira/browse/SPARK-11021) which state that the 
> {{hive.exec.stagingdir}} had to be added in order for Spark to be able to 
> handle CREATE TABLE properly as of 2.0. As you can see in the error, we have 
> ours set to "/tmp/hive-staging/\{user.name\}"
> Same issue with INSERT statements:
> {noformat}
> CREATE TABLE IF NOT EXISTS dricard.test (col1 int); INSERT INTO TABLE 
> dricard.test SELECT 1;
> Error: org.apache.spark.sql.AnalysisException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Unable to move source 
> hdfs://nameservice1/tmp/hive-staging/thrift_hive_2017-06-12_20-41-12_964_3086448130033637241-16/-ext-1/part-0
>  to destination 
> hdfs://nameservice1/user/hive/warehouse/dricard.db/test/part-0; 
> (state=,code=0)
> {noformat}
> This worked fine in 1.6.2, which we currently run in our Production 
> Environment but since 2.0+, we haven't been able to CREATE TABLE consistently 
> on the cluster.
> SQL to reproduce issue:
> {noformat}
> DROP SCHEMA IF EXISTS dricard CASCADE; 
> CREATE SCHEMA dricard; 
> CREATE TABLE dricard.test (col1 int); 
> INSERT INTO TABLE dricard.test SELECT 1; 
> SELECT * from dricard.test; 
> DROP TABLE dricard.test; 
> CREATE TABLE dricard.test AS select 1 as `col1`;
> SELECT * from dricard.test
> {noformat}
> Thrift server usually fails at INSERT...
> Tried the same procedure in a spark context using spark.sql() and didn't 
> encounter the same issue.
> Full stack Trace:
> {noformat}
> 17/06/14 14:52:18 ERROR thriftserver.SparkExecuteStatementOperation: Error 
> executing query, currentState RUNNING,
> org.apache.spark.sql.AnalysisException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Unable to move source 
> hdfs://nameservice1/tmp/hive-staging/thrift_hive_2017-06-14_14-52-18_521_5906917519254880890-5/-ext-1/part-0
>  to desti
> nation hdfs://nameservice1/user/hive/warehouse/dricard.db/test/part-0;
> at 
> org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:106)
> at 
> org.apache.spark.sql.hive.HiveExternalCatalog.loadTable(HiveExternalCatalog.scala:766)
> at 
> org.apache.spark.sql.hive.execution.InsertIntoHiveTable.sideEffectResult$lzycompute(InsertIntoHiveTable.scala:374)
> at 
> org.apache.spark.sql.hive.execution.InsertIntoHiveTable.sideEffectResult(InsertIntoHiveTable.scala:221)
> at 
> org.apache.spark.sql.hive.execution.InsertIntoHiveTable.doExecute(InsertIntoHiveTable.scala:407)
> at 
> org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114)
> at 
> org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114)
> at 
> org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:135)
> at 
> org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
> at 
> org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:132)
> at 
> org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:113)
> at 
> org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:92)
> at 
> org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:92)
> at org.apache.spark.sql.Dataset.(Dataset.scala:185)
> at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:64)
> at org.apache.spa

[jira] [Updated] (SPARK-21067) Thrift Server - CTAS fail with Unable to move source

2019-03-11 Thread Dominic Ricard (JIRA)



 [ 
https://issues.apache.org/jira/browse/SPARK-21067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dominic Ricard updated SPARK-21067:
---
Affects Version/s: 2.4.0

> Thrift Server - CTAS fail with Unable to move source
> 
>
> Key: SPARK-21067
> URL: https://issues.apache.org/jira/browse/SPARK-21067
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.1.1, 2.2.0, 2.4.0
> Environment: Yarn
> Hive MetaStore
> HDFS (HA)
>Reporter: Dominic Ricard
>Priority: Major
> Attachments: SPARK-21067.patch
>
>
> After upgrading our Thrift cluster to 2.1.1, we ran into an issue where CTAS 
> would fail, sometimes...
> Most of the time, the CTAS would work only once, after starting the thrift 
> server. After that, dropping the table and re-issuing the same CTAS would 
> fail with the following message (Sometime, it fails right away, sometime it 
> work for a long period of time):
> {noformat}
> Error: org.apache.spark.sql.AnalysisException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Unable to move source 
> hdfs://nameservice1//tmp/hive-staging/thrift_hive_2017-06-12_16-56-18_464_7598877199323198104-31/-ext-1/part-0
>  to destination 
> hdfs://nameservice1/user/hive/warehouse/dricard.db/test/part-0; 
> (state=,code=0)
> {noformat}
> We have already found the following Jira 
> (https://issues.apache.org/jira/browse/SPARK-11021) which state that the 
> {{hive.exec.stagingdir}} had to be added in order for Spark to be able to 
> handle CREATE TABLE properly as of 2.0. As you can see in the error, we have 
> ours set to "/tmp/hive-staging/\{user.name\}"
> Same issue with INSERT statements:
> {noformat}
> CREATE TABLE IF NOT EXISTS dricard.test (col1 int); INSERT INTO TABLE 
> dricard.test SELECT 1;
> Error: org.apache.spark.sql.AnalysisException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Unable to move source 
> hdfs://nameservice1/tmp/hive-staging/thrift_hive_2017-06-12_20-41-12_964_3086448130033637241-16/-ext-1/part-0
>  to destination 
> hdfs://nameservice1/user/hive/warehouse/dricard.db/test/part-0; 
> (state=,code=0)
> {noformat}
> This worked fine in 1.6.2, which we currently run in our Production 
> Environment but since 2.0+, we haven't been able to CREATE TABLE consistently 
> on the cluster.
> SQL to reproduce issue:
> {noformat}
> DROP SCHEMA IF EXISTS dricard CASCADE; 
> CREATE SCHEMA dricard; 
> CREATE TABLE dricard.test (col1 int); 
> INSERT INTO TABLE dricard.test SELECT 1; 
> SELECT * from dricard.test; 
> DROP TABLE dricard.test; 
> CREATE TABLE dricard.test AS select 1 as `col1`;
> SELECT * from dricard.test
> {noformat}
> Thrift server usually fails at INSERT...
> Tried the same procedure in a spark context using spark.sql() and didn't 
> encounter the same issue.
> Full stack Trace:
> {noformat}
> 17/06/14 14:52:18 ERROR thriftserver.SparkExecuteStatementOperation: Error 
> executing query, currentState RUNNING,
> org.apache.spark.sql.AnalysisException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Unable to move source 
> hdfs://nameservice1/tmp/hive-staging/thrift_hive_2017-06-14_14-52-18_521_5906917519254880890-5/-ext-1/part-0
>  to desti
> nation hdfs://nameservice1/user/hive/warehouse/dricard.db/test/part-0;
> at 
> org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:106)
> at 
> org.apache.spark.sql.hive.HiveExternalCatalog.loadTable(HiveExternalCatalog.scala:766)
> at 
> org.apache.spark.sql.hive.execution.InsertIntoHiveTable.sideEffectResult$lzycompute(InsertIntoHiveTable.scala:374)
> at 
> org.apache.spark.sql.hive.execution.InsertIntoHiveTable.sideEffectResult(InsertIntoHiveTable.scala:221)
> at 
> org.apache.spark.sql.hive.execution.InsertIntoHiveTable.doExecute(InsertIntoHiveTable.scala:407)
> at 
> org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114)
> at 
> org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114)
> at 
> org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:135)
> at 
> org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
> at 
> org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:132)
> at 
> org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:113)
> at 
> org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:92)
> at 
> org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:92)
> at org.apache.spark.sql.Dataset.(Dataset.scala:185)
> at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:64)
> at org.apache.spark.sql.

[jira] [Updated] (SPARK-21067) Thrift Server - CTAS fail with Unable to move source

2019-02-21 Thread Xianyin Xin (JIRA)



 [ 
https://issues.apache.org/jira/browse/SPARK-21067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xianyin Xin updated SPARK-21067:

Attachment: SPARK-21067.patch

> Thrift Server - CTAS fail with Unable to move source
> 
>
> Key: SPARK-21067
> URL: https://issues.apache.org/jira/browse/SPARK-21067
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.1.1, 2.2.0
> Environment: Yarn
> Hive MetaStore
> HDFS (HA)
>Reporter: Dominic Ricard
>Priority: Major
> Attachments: SPARK-21067.patch
>
>
> After upgrading our Thrift cluster to 2.1.1, we ran into an issue where CTAS 
> would fail, sometimes...
> Most of the time, the CTAS would work only once, after starting the thrift 
> server. After that, dropping the table and re-issuing the same CTAS would 
> fail with the following message (Sometime, it fails right away, sometime it 
> work for a long period of time):
> {noformat}
> Error: org.apache.spark.sql.AnalysisException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Unable to move source 
> hdfs://nameservice1//tmp/hive-staging/thrift_hive_2017-06-12_16-56-18_464_7598877199323198104-31/-ext-1/part-0
>  to destination 
> hdfs://nameservice1/user/hive/warehouse/dricard.db/test/part-0; 
> (state=,code=0)
> {noformat}
> We have already found the following Jira 
> (https://issues.apache.org/jira/browse/SPARK-11021) which state that the 
> {{hive.exec.stagingdir}} had to be added in order for Spark to be able to 
> handle CREATE TABLE properly as of 2.0. As you can see in the error, we have 
> ours set to "/tmp/hive-staging/\{user.name\}"
> Same issue with INSERT statements:
> {noformat}
> CREATE TABLE IF NOT EXISTS dricard.test (col1 int); INSERT INTO TABLE 
> dricard.test SELECT 1;
> Error: org.apache.spark.sql.AnalysisException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Unable to move source 
> hdfs://nameservice1/tmp/hive-staging/thrift_hive_2017-06-12_20-41-12_964_3086448130033637241-16/-ext-1/part-0
>  to destination 
> hdfs://nameservice1/user/hive/warehouse/dricard.db/test/part-0; 
> (state=,code=0)
> {noformat}
> This worked fine in 1.6.2, which we currently run in our Production 
> Environment but since 2.0+, we haven't been able to CREATE TABLE consistently 
> on the cluster.
> SQL to reproduce issue:
> {noformat}
> DROP SCHEMA IF EXISTS dricard CASCADE; 
> CREATE SCHEMA dricard; 
> CREATE TABLE dricard.test (col1 int); 
> INSERT INTO TABLE dricard.test SELECT 1; 
> SELECT * from dricard.test; 
> DROP TABLE dricard.test; 
> CREATE TABLE dricard.test AS select 1 as `col1`;
> SELECT * from dricard.test
> {noformat}
> Thrift server usually fails at INSERT...
> Tried the same procedure in a spark context using spark.sql() and didn't 
> encounter the same issue.
> Full stack Trace:
> {noformat}
> 17/06/14 14:52:18 ERROR thriftserver.SparkExecuteStatementOperation: Error 
> executing query, currentState RUNNING,
> org.apache.spark.sql.AnalysisException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Unable to move source 
> hdfs://nameservice1/tmp/hive-staging/thrift_hive_2017-06-14_14-52-18_521_5906917519254880890-5/-ext-1/part-0
>  to desti
> nation hdfs://nameservice1/user/hive/warehouse/dricard.db/test/part-0;
> at 
> org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:106)
> at 
> org.apache.spark.sql.hive.HiveExternalCatalog.loadTable(HiveExternalCatalog.scala:766)
> at 
> org.apache.spark.sql.hive.execution.InsertIntoHiveTable.sideEffectResult$lzycompute(InsertIntoHiveTable.scala:374)
> at 
> org.apache.spark.sql.hive.execution.InsertIntoHiveTable.sideEffectResult(InsertIntoHiveTable.scala:221)
> at 
> org.apache.spark.sql.hive.execution.InsertIntoHiveTable.doExecute(InsertIntoHiveTable.scala:407)
> at 
> org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114)
> at 
> org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114)
> at 
> org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:135)
> at 
> org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
> at 
> org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:132)
> at 
> org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:113)
> at 
> org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:92)
> at 
> org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:92)
> at org.apache.spark.sql.Dataset.(Dataset.scala:185)
> at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:64)
> at org.apache.spark.sql.SparkSes

[jira] [Updated] (SPARK-21067) Thrift Server - CTAS fail with Unable to move source

2017-07-11 Thread Dominic Ricard (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-21067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dominic Ricard updated SPARK-21067:
---
Affects Version/s: 2.2.0

> Thrift Server - CTAS fail with Unable to move source
> 
>
> Key: SPARK-21067
> URL: https://issues.apache.org/jira/browse/SPARK-21067
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.1.1, 2.2.0
> Environment: Yarn
> Hive MetaStore
> HDFS (HA)
>Reporter: Dominic Ricard
>
> After upgrading our Thrift cluster to 2.1.1, we ran into an issue where CTAS 
> would fail, sometimes...
> Most of the time, the CTAS would work only once, after starting the thrift 
> server. After that, dropping the table and re-issuing the same CTAS would 
> fail with the following message (Sometime, it fails right away, sometime it 
> work for a long period of time):
> {noformat}
> Error: org.apache.spark.sql.AnalysisException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Unable to move source 
> hdfs://nameservice1//tmp/hive-staging/thrift_hive_2017-06-12_16-56-18_464_7598877199323198104-31/-ext-1/part-0
>  to destination 
> hdfs://nameservice1/user/hive/warehouse/dricard.db/test/part-0; 
> (state=,code=0)
> {noformat}
> We have already found the following Jira 
> (https://issues.apache.org/jira/browse/SPARK-11021) which state that the 
> {{hive.exec.stagingdir}} had to be added in order for Spark to be able to 
> handle CREATE TABLE properly as of 2.0. As you can see in the error, we have 
> ours set to "/tmp/hive-staging/\{user.name\}"
> Same issue with INSERT statements:
> {noformat}
> CREATE TABLE IF NOT EXISTS dricard.test (col1 int); INSERT INTO TABLE 
> dricard.test SELECT 1;
> Error: org.apache.spark.sql.AnalysisException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Unable to move source 
> hdfs://nameservice1/tmp/hive-staging/thrift_hive_2017-06-12_20-41-12_964_3086448130033637241-16/-ext-1/part-0
>  to destination 
> hdfs://nameservice1/user/hive/warehouse/dricard.db/test/part-0; 
> (state=,code=0)
> {noformat}
> This worked fine in 1.6.2, which we currently run in our Production 
> Environment but since 2.0+, we haven't been able to CREATE TABLE consistently 
> on the cluster.
> SQL to reproduce issue:
> {noformat}
> DROP SCHEMA IF EXISTS dricard CASCADE; 
> CREATE SCHEMA dricard; 
> CREATE TABLE dricard.test (col1 int); 
> INSERT INTO TABLE dricard.test SELECT 1; 
> SELECT * from dricard.test; 
> DROP TABLE dricard.test; 
> CREATE TABLE dricard.test AS select 1 as `col1`;
> SELECT * from dricard.test
> {noformat}
> Thrift server usually fails at INSERT...
> Tried the same procedure in a spark context using spark.sql() and didn't 
> encounter the same issue.
> Full stack Trace:
> {noformat}
> 17/06/14 14:52:18 ERROR thriftserver.SparkExecuteStatementOperation: Error 
> executing query, currentState RUNNING,
> org.apache.spark.sql.AnalysisException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Unable to move source 
> hdfs://nameservice1/tmp/hive-staging/thrift_hive_2017-06-14_14-52-18_521_5906917519254880890-5/-ext-1/part-0
>  to desti
> nation hdfs://nameservice1/user/hive/warehouse/dricard.db/test/part-0;
> at 
> org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:106)
> at 
> org.apache.spark.sql.hive.HiveExternalCatalog.loadTable(HiveExternalCatalog.scala:766)
> at 
> org.apache.spark.sql.hive.execution.InsertIntoHiveTable.sideEffectResult$lzycompute(InsertIntoHiveTable.scala:374)
> at 
> org.apache.spark.sql.hive.execution.InsertIntoHiveTable.sideEffectResult(InsertIntoHiveTable.scala:221)
> at 
> org.apache.spark.sql.hive.execution.InsertIntoHiveTable.doExecute(InsertIntoHiveTable.scala:407)
> at 
> org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114)
> at 
> org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114)
> at 
> org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:135)
> at 
> org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
> at 
> org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:132)
> at 
> org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:113)
> at 
> org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:92)
> at 
> org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:92)
> at org.apache.spark.sql.Dataset.(Dataset.scala:185)
> at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:64)
> at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:592)
> at org.apache.spark.sql.SQLCo

[jira] [Updated] (SPARK-21067) Thrift Server - CTAS fail with Unable to move source

2017-06-14 Thread Dominic Ricard (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-21067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dominic Ricard updated SPARK-21067:
---
Description: 
After upgrading our Thrift cluster to 2.1.1, we ran into an issue where CTAS 
would fail, sometimes...

Most of the time, the CTAS would work only once, after starting the thrift 
server. After that, dropping the table and re-issuing the same CTAS would fail 
with the following message (Sometime, it fails right away, sometime it work for 
a long period of time):

{noformat}
Error: org.apache.spark.sql.AnalysisException: 
org.apache.hadoop.hive.ql.metadata.HiveException: Unable to move source 
hdfs://nameservice1//tmp/hive-staging/thrift_hive_2017-06-12_16-56-18_464_7598877199323198104-31/-ext-1/part-0
 to destination 
hdfs://nameservice1/user/hive/warehouse/dricard.db/test/part-0; 
(state=,code=0)
{noformat}

We have already found the following Jira 
(https://issues.apache.org/jira/browse/SPARK-11021) which state that the 
{{hive.exec.stagingdir}} had to be added in order for Spark to be able to 
handle CREATE TABLE properly as of 2.0. As you can see in the error, we have 
ours set to "/tmp/hive-staging/\{user.name\}"

Same issue with INSERT statements:
{noformat}
CREATE TABLE IF NOT EXISTS dricard.test (col1 int); INSERT INTO TABLE 
dricard.test SELECT 1;
Error: org.apache.spark.sql.AnalysisException: 
org.apache.hadoop.hive.ql.metadata.HiveException: Unable to move source 
hdfs://nameservice1/tmp/hive-staging/thrift_hive_2017-06-12_20-41-12_964_3086448130033637241-16/-ext-1/part-0
 to destination 
hdfs://nameservice1/user/hive/warehouse/dricard.db/test/part-0; 
(state=,code=0)
{noformat}

This worked fine in 1.6.2, which we currently run in our Production Environment 
but since 2.0+, we haven't been able to CREATE TABLE consistently on the 
cluster.

SQL to reproduce issue:
{noformat}
DROP SCHEMA IF EXISTS dricard CASCADE; 
CREATE SCHEMA dricard; 
CREATE TABLE dricard.test (col1 int); 
INSERT INTO TABLE dricard.test SELECT 1; 
SELECT * from dricard.test; 
DROP TABLE dricard.test; 
CREATE TABLE dricard.test AS select 1 as `col1`;
SELECT * from dricard.test
{noformat}

Thrift server usually fails at INSERT...

Tried the same procedure in a spark context using spark.sql() and didn't 
encounter the same issue.

Full stack Trace:
{noformat}
17/06/14 14:52:18 ERROR thriftserver.SparkExecuteStatementOperation: Error 
executing query, currentState RUNNING,
org.apache.spark.sql.AnalysisException: 
org.apache.hadoop.hive.ql.metadata.HiveException: Unable to move source 
hdfs://nameservice1/tmp/hive-staging/thrift_hive_2017-06-14_14-52-18_521_5906917519254880890-5/-ext-1/part-0
 to desti
nation hdfs://nameservice1/user/hive/warehouse/dricard.db/test/part-0;
at 
org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:106)
at 
org.apache.spark.sql.hive.HiveExternalCatalog.loadTable(HiveExternalCatalog.scala:766)
at 
org.apache.spark.sql.hive.execution.InsertIntoHiveTable.sideEffectResult$lzycompute(InsertIntoHiveTable.scala:374)
at 
org.apache.spark.sql.hive.execution.InsertIntoHiveTable.sideEffectResult(InsertIntoHiveTable.scala:221)
at 
org.apache.spark.sql.hive.execution.InsertIntoHiveTable.doExecute(InsertIntoHiveTable.scala:407)
at 
org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114)
at 
org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114)
at 
org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:135)
at 
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at 
org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:132)
at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:113)
at 
org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:92)
at 
org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:92)
at org.apache.spark.sql.Dataset.(Dataset.scala:185)
at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:64)
at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:592)
at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:699)
at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:231)
at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:174)
at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:171)
at java.security.AccessController.doPrivileged(Native Method)
at javax.se

[jira] [Updated] (SPARK-21067) Thrift Server - CTAS fail with Unable to move source

2017-06-14 Thread Dominic Ricard (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-21067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dominic Ricard updated SPARK-21067:
---
Description: 
After upgrading our Thrift cluster to 2.1.1, we ran into an issue where CTAS 
would fail, sometimes...

Most of the time, the CTAS would work only once, after starting the thrift 
server. After that, dropping the table and re-issuing the same CTAS would fail 
with the following message (Sometime, it fails right away, sometime it work for 
a long period of time):

{noformat}
Error: org.apache.spark.sql.AnalysisException: 
org.apache.hadoop.hive.ql.metadata.HiveException: Unable to move source 
hdfs://nameservice1//tmp/hive-staging/thrift_hive_2017-06-12_16-56-18_464_7598877199323198104-31/-ext-1/part-0
 to destination 
hdfs://nameservice1/user/hive/warehouse/dricard.db/test/part-0; 
(state=,code=0)
{noformat}

We have already found the following Jira 
(https://issues.apache.org/jira/browse/SPARK-11021) which state that the 
{{hive.exec.stagingdir}} had to be added in order for Spark to be able to 
handle CREATE TABLE properly as of 2.0. As you can see in the error, we have 
ours set to "/tmp/hive-staging/\{user.name\}"

Same issue with INSERT statements:
{noformat}
CREATE TABLE IF NOT EXISTS dricard.test (col1 int); INSERT INTO TABLE 
dricard.test SELECT 1;
Error: org.apache.spark.sql.AnalysisException: 
org.apache.hadoop.hive.ql.metadata.HiveException: Unable to move source 
hdfs://nameservice1/tmp/hive-staging/thrift_hive_2017-06-12_20-41-12_964_3086448130033637241-16/-ext-1/part-0
 to destination 
hdfs://nameservice1/user/hive/warehouse/dricard.db/test/part-0; 
(state=,code=0)
{noformat}

This worked fine in 1.6.2, which we currently run in our Production Environment 
but since 2.0+, we haven't been able to CREATE TABLE consistently on the 
cluster.

SQL to reproduce issue:
{noformat}
DROP SCHEMA IF EXISTS dricard CASCADE; 
CREATE SCHEMA dricard; 
CREATE TABLE dricard.test (col1 int); 
INSERT INTO TABLE dricard.test SELECT 1; 
SELECT * from dricard.test; 
DROP TABLE dricard.test; 
CREATE TABLE dricard.test AS select 1 as `col1`;
SELECT * from dricard.test
{noformat}

Thrift server usually fails at INSERT...

Tried the same procedure in a spark context using spark.sql() and didn't 
encounter the same issue.

  was:
After upgrading our Thrift cluster to 2.1.1, we ran into an issue where CTAS 
would fail, sometimes...

Most of the time, the CTAS would work only once, after starting the thrift 
server. After that, dropping the table and re-issuing the same CTAS would fail 
with the following message (Sometime, it fails right away, sometime it work for 
a long period of time):

{noformat}
Error: org.apache.spark.sql.AnalysisException: 
org.apache.hadoop.hive.ql.metadata.HiveException: Unable to move source 
hdfs://nameservice1//tmp/hive-staging/thrift_hive_2017-06-12_16-56-18_464_7598877199323198104-31/-ext-1/part-0
 to destination 
hdfs://nameservice1/user/hive/warehouse/dricard.db/test/part-0; 
(state=,code=0)
{noformat}

We have already found the following Jira 
(https://issues.apache.org/jira/browse/SPARK-11021) which state that the 
{{hive.exec.stagingdir}} had to be added in order for Spark to be able to 
handle CREATE TABLE properly as of 2.0. As you can see in the error, we have 
ours set to "/tmp/hive-staging/\{user.name\}"

Same issue with INSERT statements:
{noformat}
CREATE TABLE IF NOT EXISTS dricard.test (col1 int); INSERT INTO TABLE 
dricard.test SELECT 1;
Error: org.apache.spark.sql.AnalysisException: 
org.apache.hadoop.hive.ql.metadata.HiveException: Unable to move source 
hdfs://nameservice1/tmp/hive-staging/thrift_hive_2017-06-12_20-41-12_964_3086448130033637241-16/-ext-1/part-0
 to destination 
hdfs://nameservice1/user/hive/warehouse/dricard.db/test/part-0; 
(state=,code=0)
{noformat}

This worked fine in 1.6.2, which we currently run in our Production Environment 
but since 2.0+, we haven't been able to CREATE TABLE consistently on the 
cluster.



> Thrift Server - CTAS fail with Unable to move source
> 
>
> Key: SPARK-21067
> URL: https://issues.apache.org/jira/browse/SPARK-21067
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.1.1
> Environment: Yarn
> Hive MetaStore
> HDFS (HA)
>Reporter: Dominic Ricard
>
> After upgrading our Thrift cluster to 2.1.1, we ran into an issue where CTAS 
> would fail, sometimes...
> Most of the time, the CTAS would work only once, after starting the thrift 
> server. After that, dropping the table and re-issuing the same CTAS would 
> fail with the following message (Sometime, it fails right away, sometime it 
> work for a long period of time):
> {noformat}
> Error: org.apache.spark.sql.AnalysisException: 
> org.apache.hadoop.hive.ql

[jira] [Updated] (SPARK-21067) Thrift Server - CTAS fail with Unable to move source

2017-06-13 Thread Dominic Ricard (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-21067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dominic Ricard updated SPARK-21067:
---
Description: 
After upgrading our Thrift cluster to 2.1.1, we ran into an issue where CTAS 
would fail, sometimes...

Most of the time, the CTAS would work only once, after starting the thrift 
server. After that, dropping the table and re-issuing the same CTAS would fail 
with the following message (Sometime, it fails right away, sometime it work for 
a long period of time):

{noformat}
Error: org.apache.spark.sql.AnalysisException: 
org.apache.hadoop.hive.ql.metadata.HiveException: Unable to move source 
hdfs://nameservice1//tmp/hive-staging/thrift_hive_2017-06-12_16-56-18_464_7598877199323198104-31/-ext-1/part-0
 to destination 
hdfs://nameservice1/user/hive/warehouse/dricard.db/test/part-0; 
(state=,code=0)
{noformat}

We have already found the following Jira 
(https://issues.apache.org/jira/browse/SPARK-11021) which state that the 
{{hive.exec.stagingdir}} had to be added in order for Spark to be able to 
handle CREATE TABLE properly as of 2.0. As you can see in the error, we have 
ours set to "/tmp/hive-staging/\{user.name\}"

Same issue with INSERT statements:
{noformat}
CREATE TABLE IF NOT EXISTS dricard.test (col1 int); INSERT INTO TABLE 
dricard.test SELECT 1;
Error: org.apache.spark.sql.AnalysisException: 
org.apache.hadoop.hive.ql.metadata.HiveException: Unable to move source 
hdfs://nameservice1/tmp/hive-staging/thrift_hive_2017-06-12_20-41-12_964_3086448130033637241-16/-ext-1/part-0
 to destination 
hdfs://nameservice1/user/hive/warehouse/dricard.db/test/part-0; 
(state=,code=0)
{noformat}

This worked fine in 1.6.2, which we currently run in our Production Environment 
but since 2.0+, we haven't been able to CREATE TABLE consistently on the 
cluster.


  was:
After upgrading our Thrift cluster to 2.1.1, we ran into an issue where CTAS 
would fail, sometimes...

Most of the time, the CTAS would work only once after starting the thrift 
server. After that, dropping the table and re-issuing the same CTAS would fail 
with the following message (Sometime, it fails right away, sometime it work for 
a long period of time):

{noformat}
Error: org.apache.spark.sql.AnalysisException: 
org.apache.hadoop.hive.ql.metadata.HiveException: Unable to move source 
hdfs://nameservice1//tmp/hive-staging/thrift_hive_2017-06-12_16-56-18_464_7598877199323198104-31/-ext-1/part-0
 to destination 
hdfs://nameservice1/user/hive/warehouse/dricard.db/test/part-0; 
(state=,code=0)
{noformat}

We have already found the following Jira 
(https://issues.apache.org/jira/browse/SPARK-11021) which state that the 
{{hive.exec.stagingdir}} had to be added in order for Spark to be able to 
handle CREATE TABLE properly as of 2.0. As you can see in the error, we have 
ours set to "/tmp/hive-staging/\{user.name\}"

Same issue with INSERT statements:
{noformat}
CREATE TABLE IF NOT EXISTS dricard.test (col1 int); INSERT INTO TABLE 
dricard.test SELECT 1;
Error: org.apache.spark.sql.AnalysisException: 
org.apache.hadoop.hive.ql.metadata.HiveException: Unable to move source 
hdfs://nameservice1/tmp/hive-staging/thrift_hive_2017-06-12_20-41-12_964_3086448130033637241-16/-ext-1/part-0
 to destination 
hdfs://nameservice1/user/hive/warehouse/dricard.db/test/part-0; 
(state=,code=0)
{noformat}

This worked fine in 1.6.2, which we currently run in our Production Environment 
but since 2.0+, we haven't been able to CREATE TABLE consistently on the 
cluster.



> Thrift Server - CTAS fail with Unable to move source
> 
>
> Key: SPARK-21067
> URL: https://issues.apache.org/jira/browse/SPARK-21067
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.1.1
> Environment: Yarn
> Hive MetaStore
> HDFS (HA)
>Reporter: Dominic Ricard
>
> After upgrading our Thrift cluster to 2.1.1, we ran into an issue where CTAS 
> would fail, sometimes...
> Most of the time, the CTAS would work only once, after starting the thrift 
> server. After that, dropping the table and re-issuing the same CTAS would 
> fail with the following message (Sometime, it fails right away, sometime it 
> work for a long period of time):
> {noformat}
> Error: org.apache.spark.sql.AnalysisException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Unable to move source 
> hdfs://nameservice1//tmp/hive-staging/thrift_hive_2017-06-12_16-56-18_464_7598877199323198104-31/-ext-1/part-0
>  to destination 
> hdfs://nameservice1/user/hive/warehouse/dricard.db/test/part-0; 
> (state=,code=0)
> {noformat}
> We have already found the following Jira 
> (https://issues.apache.org/jira/browse/SPARK-11021) which state that the 
> {{hive.exec.stagingdir}} had to be added in order for

[jira] [Updated] (SPARK-21067) Thrift Server - CTAS fail with Unable to move source

[jira] [Updated] (SPARK-21067) Thrift Server - CTAS fail with Unable to move source

[jira] [Updated] (SPARK-21067) Thrift Server - CTAS fail with Unable to move source

[jira] [Updated] (SPARK-21067) Thrift Server - CTAS fail with Unable to move source

[jira] [Updated] (SPARK-21067) Thrift Server - CTAS fail with Unable to move source

[jira] [Updated] (SPARK-21067) Thrift Server - CTAS fail with Unable to move source

[jira] [Updated] (SPARK-21067) Thrift Server - CTAS fail with Unable to move source

7 matches

Site Navigation

Mail list logo

Footer information