DataGamePlay commented on issue #2067:
URL: https://github.com/apache/paimon/issues/2067#issuecomment-3722142630

## Reproduce on 0.8.1

Same issue on **Paimon 0.8.1**.

## Problem Description

Same data, same Hive engine, same query logic:
- Hive table: 2 minutes
- Paimon table: 6 minutes (still not returned)

Same data, same Spark SQL:
- Direct SELECT fields: very slow
- GROUP BY fields: very fast

3. Query with Hive engine:

       SELECT col1, col2 FROM paimon_table WHERE dt>='2025-01-01'; -- 6 minutes, still not returned

4. Query with Spark SQL:

       SELECT col1, col2 FROM paimon_table WHERE dt='2025-01-01'; -- very slow
       SELECT col1, col2 FROM paimon_table WHERE dt='2025-01-01' GROUP BY col1, col2; -- very fast

**Request**:
1. Please help analyze why direct SELECT is very slow even after read optimization, and why GROUP BY is very fast.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
