GitHub user snodawn commented on the issue:
https://github.com/apache/spark/pull/15667
@viirya I have tested the new patch, and it performs better than expected.
Before the patch, the statement took about 500~600 seconds; now the same
statement takes about 16 seconds. However, a statement like the following
still runs slowly:
insert overwrite table login4game partition(pt,dt)
select distinct account_name, role_id, server, '1476979200' as recdate,
  'mix' as platform, 'mix' as pid, 'mix' as dev, pt, dt
from tbllog_login
where pt='mix_en' and dt='2016-10-21';
This uses Hive dynamic partitioning, where the partition values do not need
to be specified in the insert. I tested it in Hive 2.0.1, where it takes
47.822 seconds, but in Hive 1.2.1 it takes 574.33 seconds, about the same as
in Spark, which takes 526.44 seconds.
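
For comparison, here is a rough sketch of the same insert written with static
partition values. Both partition values are already fixed by the where clause,
so they can be named in the partition spec and dropped from the select list.
The two set statements are an assumption about the session configuration; they
are the standard Hive properties that allow an insert in which every partition
column is dynamic. The static form has not been timed here, so whether it
avoids the slowdown is untested.

```sql
-- Assumed session settings for the dynamic-partition form (all partition
-- columns dynamic requires nonstrict mode in Hive).
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;

-- Static-partition variant: pt and dt are given in the partition spec,
-- so they are not selected as trailing columns.
insert overwrite table login4game partition(pt='mix_en', dt='2016-10-21')
select distinct account_name, role_id, server, '1476979200' as recdate,
  'mix' as platform, 'mix' as pid, 'mix' as dev
from tbllog_login
where pt='mix_en' and dt='2016-10-21';
```

Timing the two forms against each other may help show whether the remaining
cost comes from the dynamic partition handling or from the query itself.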