It seems that column `link_crtd_date` is of type `timestamp`, so you are partitioning by date including time, which produces a huge number of directories. I assume your intent is to partition by date (`partition_date=yyyy-MM-dd` or `year=yyyy/month=MM/day=dd`), so you need to format/split your timestamp accordingly, for example:
```
-- partitioned by 'yyyy-MM-dd'
INSERT OVERWRITE TABLE bsl12.email_edge_lyh_partitioned2 PARTITION (partition_date)
SELECT *, date_format(link_crtd_date, 'yyyy-MM-dd') AS partition_date
FROM bsl12.email_edge_lyh_mth1;

-- partitioned by year/month/day
INSERT OVERWRITE TABLE bsl12.email_edge_lyh_partitioned2 PARTITION (year, month, day)
SELECT *,
       year(link_crtd_date) AS year,
       month(link_crtd_date) AS month,
       day(link_crtd_date) AS day
FROM bsl12.email_edge_lyh_mth1;
```

Note that in a dynamic partition insert the partition columns must come last in the SELECT list, and that `year`/`month`/`day` take the timestamp as their only argument.

Best Regards

Roland Johann
Software Developer/Data Engineer

phenetic GmbH
Lütticher Straße 10, 50674 Köln, Germany

Mobil: +49 172 365 26 46
Mail: roland.joh...@phenetic.io
Web: phenetic.io

Handelsregister: Amtsgericht Köln (HRB 92595)
Geschäftsführer: Roland Johann, Uwe Reimann

> On 23.08.2019, at 09:43, zhangliyun <kelly...@126.com> wrote:
>
> Hi all:
>
> When I use the Spark dynamic partition feature, I run into a problem with HDFS quotas. I found that it is very easy to hit the quota problem (exceeding the maximum quota of a directory).
>
> I have generated an unpartitioned table 'bsl12.email_edge_lyh_mth1' which contains 584M records, and I want to insert it into a partitioned table "bsl12.email_edge_lyh_partitioned2":
>
> --select count(*) from bsl12.email_edge_lyh_mth1; --584652128
> --INSERT OVERWRITE TABLE bsl12.email_edge_lyh_partitioned2 PARTITION (link_crtd_date) SELECT * FROM bsl12.email_edge_lyh_mth1;
>
> When I viewed the temporary directory while the SQL was running, I saw multiple files with link_crtd_date=2018-01-01***; I guess one temporary file is created per record. As there are 584M records in the unpartitioned table, is there any parameter to control the temporary file count and avoid the quota problem?
>
> ```
> 133 hdfs://horton/apps/risk/ars/datamart/email_edge_lyh_partitioned2/.hive-staging_hive_2019-08-22_19-41-38_747_7237025592628396381-1/-ext-10000/_temporary/0/_temporary/attempt_20190822195048_0000_m_001404_0/link_crtd_date=2018-01-01 12%3A35%3A29
> 137 hdfs://horton/apps/risk/ars/datamart/email_edge_lyh_partitioned2/.hive-staging_hive_2019-08-22_19-41-38_747_7237025592628396381-1/-ext-10000/_temporary/0/_temporary/attempt_20190822195048_0000_m_001404_0/link_crtd_date=2018-01-01 12%3A35%3A47
> 136 hdfs://horton/apps/risk/ars/datamart/email_edge_lyh_partitioned2/.hive-staging_hive_2019-08-22_19-41-38_747_7237025592628396381-1/-ext-10000/_temporary/0/_temporary/attempt_20190822195048_0000_m_001404_0/link_crtd_date=2018-01-01 12%3A38%3A23
> 132 hdfs://horton/apps/risk/ars/datamart/email_edge_lyh_partitioned2/.hive-staging_hive_2019-08-22_19-41-38_747_7237025592628396381-1/-ext-10000/_temporary/0/_temporary/attempt_20190822195048_0000_m_001404_0/link_crtd_date=2018-01-01 12%3A38%3A54
> 536 hdfs://horton/apps/risk/ars/datamart/email_edge_lyh_partitioned2/.hive-staging_hive_2019-08-22_19-41-38_747_7237025592628396381-1/-ext-10000/_temporary/0/_temporary/attempt_20190822195048_0000_m_001404_0/link_crtd_date=2018-01-01 12%3A40%3A01
> ```
>
> Best Regards
>
> Kelly Zhang
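
On the temporary-file count from the quoted question: besides coarsening the partition key as above, clustering rows on the partition value before the insert keeps each task writing to only a few partitions. Below is a minimal sketch (my assumption, not from the original thread) reusing the thread's table and column names; `DISTRIBUTE BY` is standard Hive/Spark SQL, though the resulting file counts also depend on shuffle settings such as `spark.sql.shuffle.partitions`:

```
-- Sketch (assumed variant, not the poster's exact statement): shuffle rows so
-- that all records of one calendar day land in the same task; each task then
-- writes one file per partition instead of one file per date it happens to scan.
INSERT OVERWRITE TABLE bsl12.email_edge_lyh_partitioned2 PARTITION (partition_date)
SELECT *, date_format(link_crtd_date, 'yyyy-MM-dd') AS partition_date
FROM bsl12.email_edge_lyh_mth1
DISTRIBUTE BY date_format(link_crtd_date, 'yyyy-MM-dd');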