Hi Vinoth,
As Puru mentioned, please advise on supporting multiple metastores, or let
us know if there is a better way.
Thanks and Regards,
S SYED ABDUL KATHER
On Mon, Jan 6, 2020 at 3:47 PM Purushotham Pushpavanthar <
[email protected]> wrote:
> Hi Vinoth,
>
> Since the *schema registry* is the source of truth and the Hive metastore
> is a translation of it, having an option to update multiple metastores in
> Hudi would help in this case.
> Similar to what Syed mentioned, the same Hudi dataset can be exposed in
> multiple places, such as Athena, Redshift Spectrum, on-prem Presto, and
> Hive, where the datasets' metadata is not shared with each other.
>
> Regards,
> Purushotham Pushpavanth
>
>
>
> On Wed, 1 Jan 2020 at 00:46, Vinoth Chandar <[email protected]> wrote:
>
> > Can one of the AWS folks please chime in here? IIRC I saw some tweets
> > mentioning that Hudi/Athena support is in the works.
> > Not sure myself.
> >
> > On Sun, Dec 29, 2019 at 11:33 PM Syed Abdul Kather <[email protected]>
> > wrote:
> >
> > > Hi Team,
> > >
> > > We have built a CDC pipeline with Apache Hudi and Debezium. It
> > > works very well in our production environment.
> > >
> > > But we have an in-house Ambari cluster with a Hive metastore for all ETL
> > > purposes and Athena for all analytics purposes. To make the Hudi table
> > > work on Athena, we have preserved only the latest version and created
> > > the table in Parquet format.
> > >
> > > Right now the Hive metastore gets updated by Hudi itself. But to keep the
> > > Athena metastore in sync, we have written a separate script. That does
> > > not look like the right approach, since only the affected partitions
> > > need to be updated on the Athena side.
> > >
> > > Please suggest the right approach here.
> > >
> > > Thanks and Regards,
> > > S SYED ABDUL KATHER
> > >
> >
>