[
https://issues.apache.org/jira/browse/SPARK-39990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
melin updated SPARK-39990:
--------------------------
Description:
The Hive metastore (HMS) restricts field names to alphanumerics and
underscores. A custom catalog that does not use HMS may not have this
restriction. For example, when reading Excel data and writing Parquet, column
names often contain special characters such as spaces and parentheses.
Current hacky workaround, which uses AspectJ advice to skip the checks:
{code:java}
// Each advice replaces the matched method and never calls pjp.proceed(),
// so the original field-name check is skipped.
@Around("execution(public * org.apache.spark.sql.execution.datasources.DataSourceUtils.checkFieldNames(..))")
public void checkFieldNames_1(ProceedingJoinPoint pjp) throws Throwable {
    LOG.info("skip checkFieldNames 1");
}

@Around("execution(public * org.apache.spark.sql.execution.datasources.parquet.ParquetSchemaConverter$.checkFieldNames(..))")
public void checkFieldNames_2(ProceedingJoinPoint pjp) throws Throwable {
    LOG.info("skip checkFieldNames 2");
}

@Around("execution(public * org.apache.spark.sql.execution.datasources.parquet.ParquetSchemaConverter$.checkFieldName(..))")
public void checkFieldNames_3(ProceedingJoinPoint pjp) throws Throwable {
    LOG.info("skip checkFieldNames 3");
}
{code}
CREATE OR REPLACE TABLE huaixin_rp.bigdata.parquet_orders_rp5 USING PARQUET
select 12 as id, 'ceity' as `address(地 址)`
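A minimal sketch of the requested behavior, assuming a single boolean switch that turns the field-name check on or off. The class `FieldNameValidator`, its `strict` flag, and the sample column names are hypothetical and are not part of Spark. The pattern only mirrors the alphanumerics-and-underscores rule described above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not a Spark class: validates field names only
// when the (assumed) strict switch is enabled.
final class FieldNameValidator {
    // Alphanumerics and underscores only, the rule this issue attributes to HMS.
    private static final Pattern HMS_SAFE = Pattern.compile("[A-Za-z0-9_]+");

    private final boolean strict;

    public FieldNameValidator(boolean strict) {
        this.strict = strict;
    }

    // With the switch off every name is accepted; with it on, only HMS-safe names.
    public boolean isAllowed(String name) {
        return !strict || HMS_SAFE.matcher(name).matches();
    }

    // Throws on the first name that the current mode rejects.
    public void checkFieldNames(List<String> names) {
        for (String name : names) {
            if (!isAllowed(name)) {
                throw new IllegalArgumentException("Invalid field name: " + name);
            }
        }
    }

    public static void main(String[] args) {
        // Hypothetical column names, as might come from a spreadsheet header.
        List<String> names = Arrays.asList("id", "unit price (USD)");

        // Switch off: names with spaces and parentheses pass.
        new FieldNameValidator(false).checkFieldNames(names);

        // Switch on: the second name is rejected.
        try {
            new FieldNameValidator(true).checkFieldNames(names);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With such a switch, a catalog that does not use HMS could disable the check instead of intercepting the methods with AspectJ.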
[~hyukjin.kwon]
> Restrict special characters in field name, which can be controlled by
> switches
> -------------------------------------------------------------------------------
>
> Key: SPARK-39990
> URL: https://issues.apache.org/jira/browse/SPARK-39990
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 3.4.0
> Reporter: melin
> Priority: Major
>