Hi Vasu,

https://github.com/linkedin/WhereHows might be a good fit.
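
Since you also asked about AWS Glue: registering a curated dataset in the Glue Data Catalog is just a `create_table` call, after which analysts can query it from Athena/Presto or Hive without touching the engineers. A minimal sketch with boto3 — all database, table, and S3 path names below are made-up placeholders:

```python
# Hypothetical sketch: register a Parquet dataset in the AWS Glue Data
# Catalog so it becomes queryable by Athena/Presto and Hive.
# Database/table/path names are placeholders, not real resources.

def build_table_input(table_name, s3_location, columns):
    """Build the TableInput structure expected by glue.create_table()."""
    return {
        "Name": table_name,
        "TableType": "EXTERNAL_TABLE",
        "StorageDescriptor": {
            "Columns": [{"Name": n, "Type": t} for n, t in columns],
            "Location": s3_location,
            "InputFormat": "org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat",
            "OutputFormat": "org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat",
            "SerdeInfo": {
                "SerializationLibrary": "org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe"
            },
        },
    }

table_input = build_table_input(
    "clean_events",                        # placeholder table name
    "s3://my-datalake/curated/events/",    # placeholder S3 path
    [("event_id", "string"), ("ts", "timestamp")],
)

# The actual registration call (needs AWS credentials):
# import boto3
# glue = boto3.client("glue")
# glue.create_table(DatabaseName="datalake", TableInput=table_input)
```

If you point your Spark/Hive metastore config at Glue, tables written with `saveAsTable` show up in the same catalog, so this can be automated as the last step of your cleanup/enrichment jobs.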

Cheers
Tamas

On Thu, Oct 19, 2017 at 23:22, Vasu Gourabathina <vgour...@gmail.com>
wrote:

> All:
>
> This may be off topic for Spark, but I'm sure several of you have used
> some form of this in your big-data implementations, so I wanted to
> reach out.
>
> As part of the data lake and data processing (with Spark, for example),
> we might end up with the files in different formats (via cleanup,
> enrichment, etc.).
>
> In order to make this data available for exploration by analysts and
> data scientists, how do we manage the metadata?
>   - Creating a metadata repository
>   - Making the schemas available to users, so they can use them to
> create Hive tables, query them with Presto, etc.
>
> Can you recommend some patterns (or tools) to help manage the metadata?
> We're trying to reduce the dependency on engineers and make the
> analysts/scientists as self-sufficient as possible.
>
> Azure and AWS Glue Data Catalog seem to address this. Any input on
> these two?
>
> Appreciate any input in advance.
>
> Thanks,
> Vasu.
>
