Hi all:
  when i use spark dynamic partition feature , i met a problem about hdfs 
quota.  I found that it is every easy to meet quota problem (exceed the max 
value of quota of directory)


I have generated a unpartitioned table 'bsl12.email_edge_lyh_mth1' which 
contains 584M records and will insert it to a  partitioned table 
"bsl12.email_edge_lyh_partitioned2"
--select count(*) from bsl12.email_edge_lyh_mth1; --584652128
--INSERT OVERWRITE TABLE bsl12.email_edge_lyh_partitioned2 PARTITION 
(link_crtd_date) SELECT * FROM bsl12.email_edge_lyh_mth1;



when i viewed the temporary directory when sql running, i saw  multiple  file 
with link_crd_date=2018-01-01***, I guess one record one temporary file.  as  
there are 584M data in the unpartitioned table,  is there any parameters for us 
to control the temporary file count  to avoid the quota problem.

```

 

133    
hdfs://horton/apps/risk/ars/datamart/email_edge_lyh_partitioned2/.hive-staging_hive_2019-08-22_19-41-38_747_7237025592628396381-1/-ext-10000/_temporary/0/_temporary/attempt_20190822195048_0000_m_001404_0/link_crtd_date=2018-01-0112%3A35%3A29

137    
hdfs://horton/apps/risk/ars/datamart/email_edge_lyh_partitioned2/.hive-staging_hive_2019-08-22_19-41-38_747_7237025592628396381-1/-ext-10000/_temporary/0/_temporary/attempt_20190822195048_0000_m_001404_0/link_crtd_date=2018-01-01
 12%3A35%3A47

136    
hdfs://horton/apps/risk/ars/datamart/email_edge_lyh_partitioned2/.hive-staging_hive_2019-08-22_19-41-38_747_7237025592628396381-1/-ext-10000/_temporary/0/_temporary/attempt_20190822195048_0000_m_001404_0/link_crtd_date=2018-01-01
 12%3A38%3A23

132    
hdfs://horton/apps/risk/ars/datamart/email_edge_lyh_partitioned2/.hive-staging_hive_2019-08-22_19-41-38_747_7237025592628396381-1/-ext-10000/_temporary/0/_temporary/attempt_20190822195048_0000_m_001404_0/link_crtd_date=2018-01-01
 12%3A38%3A54

536    
hdfs://horton/apps/risk/ars/datamart/email_edge_lyh_partitioned2/.hive-staging_hive_2019-08-22_19-41-38_747_7237025592628396381-1/-ext-10000/_temporary/0/_temporary/attempt_20190822195048_0000_m_001404_0/link_crtd_date=2018-01-01
 12%3A40%3A01



```


Best Regards


Kelly Zhang

Reply via email to