Hi Vinoth,

Since the *schema registry* is the source of truth and the Hive metastore is
a translation of it,
having an option to update multiple metastores in Hudi would help in this
case.
Similar to what Syed mentioned, the same Hudi dataset can be exposed in
multiple
places like Athena, Redshift Spectrum, on-prem Presto, Hive, etc., where the
datasets'
metadata is not shared with each other.
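
To illustrate the idea, here is a rough sketch (not Hudi's actual sync code) of
fanning out the same partition update to several metastores, touching only the
partitions that changed. The function names, the target names, and the
Hive-style `key=value` partition paths are all assumptions for illustration;
only the `ALTER TABLE ... ADD IF NOT EXISTS PARTITION` DDL is standard
Hive/Athena syntax.

```python
from typing import Dict, List


def add_partition_ddl(table: str, partition: Dict[str, str], location: str) -> str:
    """Build a Hive/Athena-compatible DDL statement for one partition."""
    spec = ", ".join(f"{key}='{value}'" for key, value in partition.items())
    return (
        f"ALTER TABLE {table} ADD IF NOT EXISTS "
        f"PARTITION ({spec}) LOCATION '{location}'"
    )


def plan_sync(
    table: str,
    base_path: str,
    changed_partitions: List[Dict[str, str]],
    targets: List[str],
) -> Dict[str, List[str]]:
    """Return, per metastore target, the DDL for the changed partitions only.

    Assumes Hive-style partition directories (key=value) under base_path.
    Actually executing the statements against each metastore is left out.
    """
    statements = []
    for partition in changed_partitions:
        suffix = "/".join(f"{key}={value}" for key, value in partition.items())
        location = f"{base_path.rstrip('/')}/{suffix}"
        statements.append(add_partition_ddl(table, partition, location))
    # Every target receives the same statements, since the metastores
    # do not share metadata with each other.
    return {target: list(statements) for target in targets}


if __name__ == "__main__":
    plan = plan_sync(
        table="orders",  # hypothetical table name
        base_path="s3://bucket/orders",  # hypothetical dataset location
        changed_partitions=[{"dt": "2019-12-29"}],
        targets=["hive", "athena"],
    )
    for target, ddl in plan.items():
        print(target, ddl)
```

With something like this, each commit would only need to pass the partitions it
wrote, instead of a separate script rescanning the whole table for Athena.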

Regards,
Purushotham Pushpavanth



On Wed, 1 Jan 2020 at 00:46, Vinoth Chandar <[email protected]> wrote:

> Can one of the AWS folks please chime in here? IIRC I saw some tweets
> mentioning Hudi/Athena support is in the works.
> Not sure myself.
>
> On Sun, Dec 29, 2019 at 11:33 PM Syed Abdul Kather <[email protected]>
> wrote:
>
> > Hi Team,
> >
> > We have built a "CDC pipeline with Apache Hudi and Debezium". It
> > works very well in our production.
> >
> > But we have an in-house Ambari cluster with a Hive metastore for all ETL
> > purposes and Athena for all analytics purposes. To make the Hudi table
> > work on Athena, we have preserved only the latest version and created the
> > table in Parquet format.
> >
> > Right now the Hive metastore gets updated by Hudi itself. But to keep the
> > Athena metastore in sync, we have written a separate script to manage it.
> > That does not look like the right approach, as only the affected
> > partitions need to be updated on the Athena side.
> >
> > Please suggest the right approach here.
> >
> >             Thanks and Regards,
> >         S SYED ABDUL KATHER
> >
>
