[GitHub] spark issue #15667: [SPARK-18107][SQL] Insert overwrite statement runs much ...

snodawn Sun, 30 Oct 2016 20:47:10 -0700

Github user snodawn commented on the issue:

    https://github.com/apache/spark/pull/15667
  
    @viirya I have tested the new patch, and it performs better than expected. 
Before patching, the statement took about 500~600 seconds; now it takes about 
16 seconds. However, it is still slow when I run SQL like this:
    
    insert overwrite table login4game partition(pt,dt)
    select distinct account_name, role_id, server, '1476979200' as recdate,
        'mix' as platform, 'mix' as pid, 'mix' as dev, pt, dt
    from tbllog_login
    where pt='mix_en' and dt='2016-10-21';
    
    This uses dynamic partitioning in Hive, where we don't need to specify the 
partition values when inserting. I tested it in Hive 2.0.1, where it took 
47.822 seconds, but in Hive 1.2.1 it took 574.33 seconds, about the same as in 
Spark, which took 526.44 seconds.
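
    To make "dynamic partition" concrete, here is a rough plain-Python sketch 
of how the trailing pt and dt columns of the select decide where each row 
lands. It only illustrates the idea and is not how Hive or Spark implement 
it; the function name route_dynamic_partitions and the two sample rows are 
made up for the example.

```python
from collections import defaultdict


def route_dynamic_partitions(rows, columns, partition_cols):
    """Group rows by their partition-column values (hypothetical helper).

    Mimics the idea of a dynamic-partition insert: the partition values are
    read from each row instead of being given in the statement.
    """
    idx = [columns.index(c) for c in partition_cols]
    partitions = defaultdict(list)
    for row in rows:
        # Build a Hive-style partition path such as pt=mix_en/dt=2016-10-21.
        path = "/".join(f"{c}={row[i]}" for c, i in zip(partition_cols, idx))
        # The partition columns live in the path, not in the stored row.
        data = tuple(v for j, v in enumerate(row) if j not in idx)
        partitions[path].append(data)
    return dict(partitions)


columns = ["account_name", "role_id", "server", "recdate",
           "platform", "pid", "dev", "pt", "dt"]
# Sample rows are invented for illustration only.
rows = [
    ("alice", 1, "s1", "1476979200", "mix", "mix", "mix", "mix_en", "2016-10-21"),
    ("bob", 2, "s1", "1476979200", "mix", "mix", "mix", "mix_en", "2016-10-21"),
]

parts = route_dynamic_partitions(rows, columns, ["pt", "dt"])
for path, data in parts.items():
    print(path, len(data))
```

    With the where clause above, every row carries the same pt and dt, so all 
rows end up in a single partition.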


