vishnu-chalil opened a new issue, #7396: URL: https://github.com/apache/gravitino/issues/7396
### Describe the feature

Currently, Gravitino only supports logging lineage data via a log sink. To enhance lineage tracking capabilities, we propose adding Marquez as a supported lineage sink. This integration will allow Gravitino to store and manage metadata in Marquez, enabling better data lineage visibility and usability.

### Motivation

The existing log sink implementation is limited in capturing and utilizing lineage data effectively. Marquez provides a robust, purpose-built solution for metadata and lineage tracking. Integration with Marquez will enable:

- End-to-end lineage tracking across data pipelines.
- Improved metadata visibility for debugging, compliance, and governance.
- Better interoperability with modern data tools (e.g., Airflow, Spark).

### Describe the solution

Support Marquez as a configurable lineage sink via gravitino.conf.

`gravitino.lineage.sinks = marquez
# Configure your Marquez sink class
gravitino.lineage.marquez.sinkClass = org.apache.gravitino.lineage.sink.LineageMarquezSink
# Optional: Configure additional properties for your sink
gravitino.lineage.marquez.url = http://localhost:5000`

Key Features

- Flexible Configuration
  - Supports dynamic Marquez server URLs.
  - Optional authentication (API keys, OAuth2).
- Extensible Design
  - Allows custom implementations via class overrides.
- Backward Compatibility
  - Retains existing log sink as the default option.

### Additional context

_No response_
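To make the proposed configuration more concrete, here is a minimal sketch of how the sink names and the Marquez endpoint might be resolved from the properties above. It is only an illustration under stated assumptions: `MarquezSinkConfigSketch` and its methods are hypothetical and not part of Gravitino, the fallback sink name `log` is assumed from the "log sink as the default" requirement, and `/api/v1/lineage` is assumed to be the Marquez path that accepts OpenLineage events.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.net.URI;
import java.util.Arrays;
import java.util.List;
import java.util.Properties;
import java.util.stream.Collectors;

// Hypothetical helper, not Gravitino code: shows how the proposed
// gravitino.lineage.* properties could be interpreted.
class MarquezSinkConfigSketch {
    static final String PREFIX = "gravitino.lineage.";

    // Parse gravitino.conf-style text ("key = value" lines, "#" comments).
    public static Properties load(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Comma-separated sink names; falls back to "log" (assumed default name)
    // so that existing deployments keep their current behavior.
    public static List<String> sinkNames(Properties props) {
        String raw = props.getProperty(PREFIX + "sinks", "log");
        return Arrays.stream(raw.split(","))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .collect(Collectors.toList());
    }

    // Build the endpoint a Marquez sink would POST lineage events to.
    // The "/api/v1/lineage" path is an assumption about the Marquez API.
    public static URI lineageEndpoint(Properties props, String sinkName) {
        String base = props.getProperty(PREFIX + sinkName + ".url");
        if (base == null) {
            throw new IllegalArgumentException("Missing " + PREFIX + sinkName + ".url");
        }
        String trimmed = base.endsWith("/") ? base.substring(0, base.length() - 1) : base;
        return URI.create(trimmed + "/api/v1/lineage");
    }

    public static void main(String[] args) {
        String conf = String.join("\n",
                "gravitino.lineage.sinks = marquez",
                "# Configure your Marquez sink class",
                "gravitino.lineage.marquez.sinkClass = org.apache.gravitino.lineage.sink.LineageMarquezSink",
                "# Optional: Configure additional properties for your sink",
                "gravitino.lineage.marquez.url = http://localhost:5000");
        Properties props = load(conf);
        for (String name : sinkNames(props)) {
            System.out.println(name + " -> "
                    + props.getProperty(PREFIX + name + ".sinkClass")
                    + " @ " + lineageEndpoint(props, name));
        }
    }
}
```

Keeping every sink's settings under `gravitino.lineage.<name>.*` means a custom implementation could be swapped in by changing only `sinkClass`, which matches the "class overrides" point in the proposal.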
