vladhlinsky opened a new pull request #88: ATLAS-3640 Update 'spark_ml_model_ml_directory' and 'spark_ml_pipeline_ml_directory' relationship definitions
URL: https://github.com/apache/atlas/pull/88

## What changes were proposed in this pull request?

Update the `spark_ml_model_ml_directory` and `spark_ml_pipeline_ml_directory` relationship definitions to use the `DataSet` type instead of its child type `spark_ml_directory`. This is required in order to integrate Spark Atlas Connector's ML event processor.

Previously, Spark Atlas Connector used the `spark_ml_directory` model for the ML model directory, but this was changed in the scope of https://github.com/hortonworks-spark/spark-atlas-connector/issues/61 and https://github.com/hortonworks-spark/spark-atlas-connector/pull/62, so the ML model directory is now a `DataSet` entity (i.e. `hdfs_path`, `fs_path` or `aws_s3_object`).

Thus, the relationship definitions must be updated; otherwise, an attempt to create the relationship leads to:

```
org.apache.atlas.exception.AtlasBaseException: invalid relationshipDef: spark_ml_model_ml_directory: end type 1: spark_ml_directory, end type 2: spark_ml_model
```

since `COMPOSITION` requires `spark_ml_directory` to be set.

The proposed changes are safe for old clients since `DataSet` is the parent type of `spark_ml_directory`.

## How was this patch tested?

Manually and with unit tests.
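
To illustrate the compatibility argument in the description, here is a minimal sketch of a toy type hierarchy in Python. It is not the Atlas API: the `SUPER_TYPES` map and the helpers `is_type_or_subtype` and `end_accepts` are hypothetical names introduced only for this example, and the hierarchy is simplified so that every directory-like type hangs directly off `DataSet`. The sketch assumes a relationship end accepts an entity when the entity's type equals the end type or inherits from it.

```python
# Toy model of type inheritance; NOT the Atlas API.
# Assumption: each type maps to its direct supertypes, simplified so that
# every type named in the description is a direct child of DataSet.
SUPER_TYPES = {
    "DataSet": [],
    "spark_ml_directory": ["DataSet"],
    "hdfs_path": ["DataSet"],
    "fs_path": ["DataSet"],
    "aws_s3_object": ["DataSet"],
}


def is_type_or_subtype(type_name, expected):
    """Return True if type_name is expected or inherits from it."""
    if type_name == expected:
        return True
    return any(
        is_type_or_subtype(parent, expected)
        for parent in SUPER_TYPES.get(type_name, [])
    )


def end_accepts(end_type, entity_type):
    """Hypothetical check: may entity_type sit at a relationship end
    declared with end_type?"""
    return is_type_or_subtype(entity_type, end_type)


OLD_END = "spark_ml_directory"  # end type before this change
NEW_END = "DataSet"             # end type after this change

for entity_type in ("spark_ml_directory", "hdfs_path", "fs_path", "aws_s3_object"):
    print(entity_type, end_accepts(OLD_END, entity_type), end_accepts(NEW_END, entity_type))
```

Under these assumptions, the old end type accepts only `spark_ml_directory`, while the new end type accepts it along with the other `DataSet` subtypes, which is why clients that still send `spark_ml_directory` entities are unaffected.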
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

With regards,
Apache Git Services
