[
https://issues.apache.org/jira/browse/SPARK-30262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
chenliang updated SPARK-30262:
------------------------------
Description:
For Spark2.3.0+, we could get the Partitions Statistics Info.But in some
specail case, The Info like totalSize,rawDataSize,rowCount maybe empty. When
we do some ddls like
{code:java}
desc formatted partitions {code}
,the NumberFormatException is showed as below:
{code:java}
spark-sql> desc formatted table1 partition(year='2019', month='10', day='17',
hour='23');
19/10/19 00:02:40 ERROR SparkSQLDriver: Failed in [desc formatted table1
partition(year='2019', month='10', day='17', hour='23')]
java.lang.NumberFormatException: Zero length BigInteger
at java.math.BigInteger.(BigInteger.java:411)
at java.math.BigInteger.(BigInteger.java:597)
at scala.math.BigInt$.apply(BigInt.scala:77)
at
org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$31.apply(HiveClientImpl.scala:1056)
at
org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$31.apply(HiveClientImpl.scala:1056)
at scala.Option.map(Option.scala:146)
at
org.apache.spark.sql.hive.client.HiveClientImpl$.org$apache$spark$sql$hive$client$HiveClientImpl$$readHiveStats(HiveClientImpl.scala:1056)
at
org.apache.spark.sql.hive.client.HiveClientImpl$.fromHivePartition(HiveClientImpl.scala:1048)
at
org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$getPartitionOption$1$$anonfun$apply$16.apply(HiveClientImpl.scala:659)
at
org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$getPartitionOption$1$$anonfun$apply$16.apply(HiveClientImpl.scala:659)
at scala.Option.map(Option.scala:146)
at
org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$getPartitionOption$1.apply(HiveClientImpl.scala:659)
at
org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$getPartitionOption$1.apply(HiveClientImpl.scala:656)
at
org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$withHiveState$1.apply(HiveClientImpl.scala:281)
at
org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:219)
at
org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:218)
at
org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:264)
at
org.apache.spark.sql.hive.client.HiveClientImpl.getPartitionOption(HiveClientImpl.scala:656)
at
org.apache.spark.sql.hive.client.HiveClient$class.getPartitionOption(HiveClient.scala:194)
at
org.apache.spark.sql.hive.client.HiveClientImpl.getPartitionOption(HiveClientImpl.scala:84)
at
org.apache.spark.sql.hive.client.HiveClient$class.getPartition(HiveClient.scala:174)
at
org.apache.spark.sql.hive.client.HiveClientImpl.getPartition(HiveClientImpl.scala:84)
at
org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$getPartition$1.apply(HiveExternalCatalog.scala:1125)
at
org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$getPartition$1.apply(HiveExternalCatalog.scala:1124)
{code}
was:
For Spark2.3.0+, we could get the Partitions Statistics Info.But in some
specail case, The Info like totalSize,rawDataSize,rowCount maybe empty. When
we do some ddls like
{code:java}
desc formatted partitions {code}
,the NumberFormatException is showed as below:
{code:java}
spark-sql> desc formatted gulfstream.ods_binlog_business_config_whole
partition(year='2019', month='10', day='17', hour='23');
19/10/19 00:02:40 ERROR SparkSQLDriver: Failed in [desc formatted
gulfstream.ods_binlog_business_config_whole partition(year='2019', month='10',
day='17', hour='23')]
java.lang.NumberFormatException: Zero length BigInteger
at java.math.BigInteger.(BigInteger.java:411)
at java.math.BigInteger.(BigInteger.java:597)
at scala.math.BigInt$.apply(BigInt.scala:77)
at
org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$31.apply(HiveClientImpl.scala:1056)
at
org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$31.apply(HiveClientImpl.scala:1056)
at scala.Option.map(Option.scala:146)
at
org.apache.spark.sql.hive.client.HiveClientImpl$.org$apache$spark$sql$hive$client$HiveClientImpl$$readHiveStats(HiveClientImpl.scala:1056)
at
org.apache.spark.sql.hive.client.HiveClientImpl$.fromHivePartition(HiveClientImpl.scala:1048)
at
org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$getPartitionOption$1$$anonfun$apply$16.apply(HiveClientImpl.scala:659)
at
org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$getPartitionOption$1$$anonfun$apply$16.apply(HiveClientImpl.scala:659)
at scala.Option.map(Option.scala:146)
at
org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$getPartitionOption$1.apply(HiveClientImpl.scala:659)
at
org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$getPartitionOption$1.apply(HiveClientImpl.scala:656)
at
org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$withHiveState$1.apply(HiveClientImpl.scala:281)
at
org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:219)
at
org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:218)
at
org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:264)
at
org.apache.spark.sql.hive.client.HiveClientImpl.getPartitionOption(HiveClientImpl.scala:656)
at
org.apache.spark.sql.hive.client.HiveClient$class.getPartitionOption(HiveClient.scala:194)
at
org.apache.spark.sql.hive.client.HiveClientImpl.getPartitionOption(HiveClientImpl.scala:84)
at
org.apache.spark.sql.hive.client.HiveClient$class.getPartition(HiveClient.scala:174)
at
org.apache.spark.sql.hive.client.HiveClientImpl.getPartition(HiveClientImpl.scala:84)
at
org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$getPartition$1.apply(HiveExternalCatalog.scala:1125)
at
org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$getPartition$1.apply(HiveExternalCatalog.scala:1124)
{code}
> Fix NumberFormatException when totalSize is empty
> --------------------------------------------------
>
> Key: SPARK-30262
> URL: https://issues.apache.org/jira/browse/SPARK-30262
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.4.3
> Reporter: chenliang
> Priority: Major
> Fix For: 2.3.2, 2.4.3
>
>
> For Spark2.3.0+, we could get the Partitions Statistics Info.But in some
> specail case, The Info like totalSize,rawDataSize,rowCount maybe empty.
> When we do some ddls like
> {code:java}
> desc formatted partitions {code}
> ,the NumberFormatException is showed as below:
> {code:java}
> spark-sql> desc formatted table1 partition(year='2019', month='10', day='17',
> hour='23');
> 19/10/19 00:02:40 ERROR SparkSQLDriver: Failed in [desc formatted table1
> partition(year='2019', month='10', day='17', hour='23')]
> java.lang.NumberFormatException: Zero length BigInteger
> at java.math.BigInteger.(BigInteger.java:411)
> at java.math.BigInteger.(BigInteger.java:597)
> at scala.math.BigInt$.apply(BigInt.scala:77)
> at
> org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$31.apply(HiveClientImpl.scala:1056)
> at
> org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$31.apply(HiveClientImpl.scala:1056)
> at scala.Option.map(Option.scala:146)
> at
> org.apache.spark.sql.hive.client.HiveClientImpl$.org$apache$spark$sql$hive$client$HiveClientImpl$$readHiveStats(HiveClientImpl.scala:1056)
> at
> org.apache.spark.sql.hive.client.HiveClientImpl$.fromHivePartition(HiveClientImpl.scala:1048)
> at
> org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$getPartitionOption$1$$anonfun$apply$16.apply(HiveClientImpl.scala:659)
> at
> org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$getPartitionOption$1$$anonfun$apply$16.apply(HiveClientImpl.scala:659)
> at scala.Option.map(Option.scala:146)
> at
> org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$getPartitionOption$1.apply(HiveClientImpl.scala:659)
> at
> org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$getPartitionOption$1.apply(HiveClientImpl.scala:656)
> at
> org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$withHiveState$1.apply(HiveClientImpl.scala:281)
> at
> org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:219)
> at
> org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:218)
> at
> org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:264)
> at
> org.apache.spark.sql.hive.client.HiveClientImpl.getPartitionOption(HiveClientImpl.scala:656)
> at
> org.apache.spark.sql.hive.client.HiveClient$class.getPartitionOption(HiveClient.scala:194)
> at
> org.apache.spark.sql.hive.client.HiveClientImpl.getPartitionOption(HiveClientImpl.scala:84)
> at
> org.apache.spark.sql.hive.client.HiveClient$class.getPartition(HiveClient.scala:174)
> at
> org.apache.spark.sql.hive.client.HiveClientImpl.getPartition(HiveClientImpl.scala:84)
> at
> org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$getPartition$1.apply(HiveExternalCatalog.scala:1125)
> at
> org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$getPartition$1.apply(HiveExternalCatalog.scala:1124)
> {code}
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]