Re: Catalog API for Partition

JackyLee Fri, 17 Jul 2020 09:02:27 -0700

Hi, Wenchen. Thanks for your attention and reply.

First, these Partition Catalog APIs are not specific to Hive; they can be
used with LakeHouse, MySQL, or any other source that supports partitions.
Second, these Partition Catalog APIs are designed only for better data
management, not for speeding up data scans. The APIs used to speed up Hive
data scans are different from these APIs.
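
To make the distinction concrete, here is a minimal sketch of what a
management-only partition API could look like. It is an illustration under
assumptions, not the proposed Spark API: the names PartitionCatalog,
InMemoryPartitionCatalog, createPartition, dropPartition and listPartitions
are hypothetical, and an in-memory map stands in for Hive, MySQL or any
other partitioned source.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PartitionCatalogSketch {

    /**
     * Hypothetical management-only interface (not Spark's actual API).
     * A partition is identified by its spec, e.g. {dt=2020-07-17}.
     * Nothing here takes part in scan planning or pruning.
     */
    interface PartitionCatalog {
        void createPartition(Map<String, String> spec, Map<String, String> properties);
        boolean dropPartition(Map<String, String> spec);
        List<Map<String, String>> listPartitions();
    }

    /** Assumed in-memory backend; a real source would call its own metastore. */
    static class InMemoryPartitionCatalog implements PartitionCatalog {
        // Maps partition spec -> partition properties, in creation order.
        private final Map<Map<String, String>, Map<String, String>> partitions =
                new LinkedHashMap<>();

        @Override
        public void createPartition(Map<String, String> spec, Map<String, String> properties) {
            if (partitions.containsKey(spec)) {
                throw new IllegalStateException("Partition already exists: " + spec);
            }
            // Defensive copies so later changes by the caller do not leak in.
            partitions.put(new HashMap<>(spec), new HashMap<>(properties));
        }

        @Override
        public boolean dropPartition(Map<String, String> spec) {
            // Returns true only if the partition existed and was removed.
            return partitions.remove(spec) != null;
        }

        @Override
        public List<Map<String, String>> listPartitions() {
            return new ArrayList<>(partitions.keySet());
        }
    }

    public static void main(String[] args) {
        PartitionCatalog catalog = new InMemoryPartitionCatalog();
        catalog.createPartition(Map.of("dt", "2020-07-17"), Map.of("owner", "etl"));
        catalog.createPartition(Map.of("dt", "2020-07-18"), Map.of());
        catalog.dropPartition(Map.of("dt", "2020-07-17"));
        System.out.println(catalog.listPartitions());
    }
}
```

The interface only creates, drops and lists partitions. Anything related to
scan speed, such as partition pruning, would live in a separate scan-side
API, which matches the separation described above.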


Currently, we use the Hive Catalog APIs to speed up Hive data scans and to
write data into Hive. However, we are trying to redefine HiveTable, which
implements FileTable, and use PartitioningPruning to speed up Hive scans.
Personally, I think this is a better way to support Hive in
DataSource V2.

Thanks again.
Jacky Lee



--
Sent from: http://apache-spark-developers-list.1001551.n3.nabble.com/

