[ 
https://issues.apache.org/jira/browse/SPARK-35313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kaushik Muniandi updated SPARK-35313:
-------------------------------------
    Description: 
Got an error while running code through an Airflow DAG.

The exception occurred while running an ETL job on an external Hive table 
stored as Parquet in S3, with AWS Glue as the metastore. Here's the error 
message:

 

java.lang.RuntimeException: Caught Hive MetaException attempting to get 
partition metadata by filter from Hive. You can set the Spark configuration 
setting spark.sql.hive.manageFilesourcePartitions to false to work around this 
problem, however this will result in degraded performance. Please report a bug: 
https://issues.apache.org/jira/browse/SPARK

 

Caused by: MetaException(message:Unknown exception occurred. (Service: AWSGlue; 
Status Code: 500; Error Code: InternalServiceException; Request ID: 
73267997-1795-45a3-965f-8bb2a6b7b3ac))

 

The same issue also occurred when running in a Databricks notebook. Screenshots 
are attached for both cases.

  was:
Got an error while running a code through Airflow DAG.

Exception while running an ETL job on an External table created on Hive stored 
as parquet in S3 with AWS Glue as metastore. Here's the error message:

 

java.lang.RuntimeException: Caught Hive MetaException attempting to get 
partition metadata by filter from Hive. You can set the Spark configuration 
setting spark.sql.hive.manageFilesourcePartitions to false to work around this 
problem, however this will result in degraded performance. Please report a bug: 
https://issues.apache.org/jira/browse/SPARK |

 

Caused by: MetaException(message:Unknown exception occurred. (Service: AWSGlue; 
Status Code: 500; Error Code: InternalServiceException; Request ID: 
73267997-1795-45a3-965f-8bb2a6b7b3ac))


> Hive MetaException attempting to get partition metadata by filter from Hive
> ---------------------------------------------------------------------------
>
>                 Key: SPARK-35313
>                 URL: https://issues.apache.org/jira/browse/SPARK-35313
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Submit
>    Affects Versions: 3.0.1
>         Environment: Got an error while running code through an Airflow DAG. 
> Data size: ~2 TB, with a little over 28 billion rows in the table.
> The error occurred when Parquet was read from S3 and written to another S3 
> location using spark.read.parquet, running on Databricks 7.5 on top of an EMR 
> r5.8xlarge cluster
>            Reporter: Kaushik Muniandi
>            Priority: Major
>         Attachments: spark_issue.JPG
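
For reference, the workaround named in the error message (setting 
spark.sql.hive.manageFilesourcePartitions to false) could be applied roughly 
as sketched below. This is a minimal sketch, not a confirmed fix for this 
issue: the helper names spark_submit_args and build_session and the file name 
etl_job.py are hypothetical, and, as the error message itself warns, the 
setting is expected to degrade performance.

```python
# Setting taken verbatim from the error message. It is a workaround only and
# is expected to degrade performance on large partitioned tables.
WORKAROUND_CONF = {"spark.sql.hive.manageFilesourcePartitions": "false"}


def spark_submit_args(app, conf=None):
    """Build a spark-submit argument list that passes each setting via --conf."""
    conf = WORKAROUND_CONF if conf is None else conf
    args = ["spark-submit"]
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    args.append(app)
    return args


def build_session(app_name, conf=None):
    """Create a Hive-enabled SparkSession with the settings applied.

    Requires pyspark; it is imported lazily so that spark_submit_args can be
    used on a machine without it (for example, an Airflow scheduler).
    """
    from pyspark.sql import SparkSession

    conf = WORKAROUND_CONF if conf is None else conf
    builder = SparkSession.builder.appName(app_name).enableHiveSupport()
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()


if __name__ == "__main__":
    # etl_job.py is a placeholder for the actual job script.
    print(" ".join(spark_submit_args("etl_job.py")))
```

When the job is launched from an Airflow DAG through spark-submit, the 
argument list from spark_submit_args can be passed to the submitting task. In 
a notebook, the setting has to be in place before the table is queried.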


