Re: Catalog API for Partition

Wenchen Fan Mon, 20 Jul 2020 07:57:02 -0700

Yeah, we don't want the partitions to be Hive-specific. My point is that we
call these "Partition Catalog APIs", which leaves me confused about their
relationship to the "partitions" in `TableCatalog.createTable`. Are the two
orthogonal? Or do you kind of unify them?
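
To make the question concrete, here is a minimal sketch of one possible
reading, not the proposal's actual design. It assumes the "partitions" passed
to `TableCatalog.createTable` only declare the table's partitioning scheme,
while the Partition Catalog APIs manage individual partitions of an existing
table. Every name below (`PartitionSketch`, `TableDef`, `createPartition`,
`dropPartition`, `listPartitions`) is a hypothetical stand-in, not a Spark
class or method.

```java
import java.util.*;

// Hypothetical stand-ins for illustration only; none of this is Spark API.
final class PartitionSketch {

    // Table-level view: the partition columns are declared once, at creation.
    static final class TableDef {
        final String name;
        final List<String> partitionColumns;
        // Partition-level view: concrete partitions keyed by their values,
        // each carrying its own metadata (for example a location).
        final Map<List<String>, Map<String, String>> partitions = new LinkedHashMap<>();

        TableDef(String name, List<String> partitionColumns) {
            this.name = name;
            this.partitionColumns =
                Collections.unmodifiableList(new ArrayList<>(partitionColumns));
        }
    }

    // Registers one partition; the values must match the declared columns.
    static void createPartition(TableDef table, List<String> values,
                                Map<String, String> properties) {
        if (values.size() != table.partitionColumns.size()) {
            throw new IllegalArgumentException("partition values do not match partition columns");
        }
        if (table.partitions.containsKey(values)) {
            throw new IllegalStateException("partition already exists: " + values);
        }
        table.partitions.put(new ArrayList<>(values), new HashMap<>(properties));
    }

    // Removes one partition; returns false if it did not exist.
    static boolean dropPartition(TableDef table, List<String> values) {
        return table.partitions.remove(values) != null;
    }

    // Lists the values of every registered partition, in insertion order.
    static List<List<String>> listPartitions(TableDef table) {
        return new ArrayList<>(table.partitions.keySet());
    }
}
```

Under this reading the two are layered rather than orthogonal: creating the
table fixes `partitionColumns`, and the partition-level calls only add or
remove entries that conform to them. Whether the proposal intends this is
exactly what I am asking.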


On Sat, Jul 18, 2020 at 12:02 AM JackyLee <qcsd2...@163.com> wrote:

> Hi Wenchen, thanks for your attention and reply.
>
> Firstly, these Partition Catalog APIs are not specific to Hive; they can be
> used with LakeHouse, MySQL, or any other source that supports partitions.
> Secondly, these Partition Catalog APIs are designed only for better data
> management, not for speeding up data scans. The APIs used to speed up Hive
> data scans are different from these APIs.
>
> Currently, we use the Hive Catalog APIs to speed up Hive data scans and to
> write data into Hive. However, we are trying to redefine HiveTable to
> implement FileTable and use PartitioningPruning to speed up Hive scans.
> Personally, I think this is a better way to support Hive in DataSource V2.
>
> Thanks again.
> Jacky Lee
