wangyum commented on PR #40462:
URL: https://github.com/apache/spark/pull/40462#issuecomment-1486144770

   If such queries cannot be optimized, their performance will be very poor. 
We fetch the data from MySQL with a single partition and then increase the 
parallelism for downstream computation once the data has been fetched:
   
   ```sql
   CREATE VIEW full_query_log
   AS
   SELECT h.* FROM query_log_hdfs h
   UNION ALL
   SELECT /*+ REBALANCE */ q.*, DATE(start) FROM query_log_mysql q;
   
   SELECT * FROM full_query_log LIMIT 5;
   ```
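   
   One way to check how the `LIMIT` interacts with the rebalance is to look at 
the plan for the query above. This is only a sketch: it assumes the 
`full_query_log` view has been created as shown and uses the standard 
`EXPLAIN FORMATTED` statement. If the limit is not pushed through the 
rebalance, I would expect the plan to show the MySQL scan and the shuffle 
running in full before the limit is applied.
   
   ```sql
   -- Print the plan for the limited query on the view defined above.
   -- Look at where the limit operators sit relative to the rebalance
   -- (shuffle) on the query_log_mysql branch of the UNION ALL.
   EXPLAIN FORMATTED
   SELECT * FROM full_query_log LIMIT 5;
   ```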
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

