vladhlinsky opened a new pull request #88: ATLAS-3640 Update 
'spark_ml_model_ml_directory' and 'spark_ml_pipeline_ml_directory' relationship 
definitions
URL: https://github.com/apache/atlas/pull/88
 
 
   ## What changes were proposed in this pull request?
   
   Update the `spark_ml_model_ml_directory` and `spark_ml_pipeline_ml_directory` 
relationship definitions to use the `DataSet` type instead of its child type 
`spark_ml_directory`. This is required in order to integrate Spark Atlas 
Connector's ML event processor.
   Previously, Spark Atlas Connector used the `spark_ml_directory` type for the ML 
model directory, but this changed in 
https://github.com/hortonworks-spark/spark-atlas-connector/issues/61 and 
https://github.com/hortonworks-spark/spark-atlas-connector/pull/62, so the ML model 
directory is now a `DataSet` entity (i.e. `hdfs_path`, `fs_path`, or `aws_s3_object`).
   The relationship definitions must therefore be updated; otherwise, an attempt to 
create a relationship fails with: 
   ```
   org.apache.atlas.exception.AtlasBaseException: invalid relationshipDef: 
spark_ml_model_ml_directory: end type 1: spark_ml_directory, end type 2: 
spark_ml_model
   ```
   because the `COMPOSITION` relationship requires a `spark_ml_directory` entity to be set.
   
   The proposed changes are safe for old clients because `DataSet` is the parent type 
of `spark_ml_directory`.
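
The compatibility argument can be illustrated with a minimal sketch. This is not Atlas code: the `SUPERTYPES` table and the `is_type_or_subtype` helper are hypothetical, and the hierarchy is deliberately flattened (real Atlas models may place intermediate types between these types and `DataSet`). It only shows why an end typed as `DataSet` accepts both the old `spark_ml_directory` entities and the path entities the connector now produces, while an end typed as `spark_ml_directory` rejects the latter.

```python
# Hypothetical, flattened supertype table: child type -> parent type.
# Only the parent/child facts stated in the PR description are modelled.
SUPERTYPES = {
    "spark_ml_directory": "DataSet",
    "hdfs_path": "DataSet",
    "fs_path": "DataSet",
    "aws_s3_object": "DataSet",
}


def is_type_or_subtype(entity_type, expected_type):
    """Return True if entity_type equals expected_type or inherits from it."""
    current = entity_type
    while current is not None:
        if current == expected_type:
            return True
        # Walk one step up the hierarchy; None ends the walk.
        current = SUPERTYPES.get(current)
    return False


def accepted_by_end(end_type, entity_types):
    """Map each entity type to whether a relationship end of end_type accepts it."""
    return {t: is_type_or_subtype(t, end_type) for t in entity_types}


if __name__ == "__main__":
    candidates = ["spark_ml_directory", "hdfs_path", "fs_path", "aws_s3_object"]
    print("end typed DataSet:           ", accepted_by_end("DataSet", candidates))
    print("end typed spark_ml_directory:", accepted_by_end("spark_ml_directory", candidates))
```

Under these assumptions, widening the end type to the parent keeps every previously valid entity valid and only adds newly accepted ones, which is why old clients are unaffected.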
   
   ## How was this patch tested?
   
   Manually and with unit tests.
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
