Barbara Eckman created ATLAS-3570:
-------------------------------------
Summary: Atlas typedefs for Machine Learning Models, Feature Sets,
and Feature Engineering Engines
Key: ATLAS-3570
URL: https://issues.apache.org/jira/browse/ATLAS-3570
Project: Atlas
Issue Type: New Feature
Reporter: Barbara Eckman
Currently the base types in Atlas do not include Machine Learning (ML) models,
feature sets, or feature engineering engines. Adding typedefs for them would
let these assets take part in enterprise discovery and versioning.
ENTITIES COULD INCLUDE:
MLModel (overview info), with attributes:
* uniqueId
* version
* businessUseCase
* modelFramework (e.g. scikit-learn)
* modelTypes (e.g. random forest regressor)
* modelClass (e.g. random forest (bagging + decision trees))
* isEnsemble (boolean)
* outcomeTypeDescription (e.g. single float)
* dataScienceOwnerEmail
* githubRepoURL (where the model code is found)
* modelDeploymentDate
* populationScored (e.g. at Comcast, residential or business customers)
* accuracyMeasures
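
As a rough sketch only, the MLModel attribute list above could be expressed as
an Atlas entity typedef payload along the following lines. The issue does not
specify attribute types, a supertype, or which attributes are mandatory, so
the "Asset" supertype, every typeName other than "string", and the
optional/unique flags are assumptions made for illustration. The attr() helper
is hypothetical and not part of Atlas.

```python
import json


def attr(name, type_name="string", optional=True, unique=False,
         cardinality="SINGLE"):
    """Hypothetical helper: build one Atlas attributeDef dictionary."""
    return {
        "name": name,
        "typeName": type_name,
        "isOptional": optional,
        "cardinality": cardinality,
        "isUnique": unique,
        "isIndexable": unique,
    }


ml_model_def = {
    "name": "MLModel",
    "superTypes": ["Asset"],  # assumption: the issue names no supertype
    "typeVersion": "1.0",
    "attributeDefs": [
        attr("uniqueId", optional=False, unique=True),
        attr("version"),
        attr("businessUseCase"),
        attr("modelFramework"),
        attr("modelTypes", "array<string>", cardinality="SET"),  # assumed type
        attr("modelClass"),
        attr("isEnsemble", "boolean"),
        attr("outcomeTypeDescription"),
        attr("dataScienceOwnerEmail"),
        attr("githubRepoURL"),
        attr("modelDeploymentDate", "date"),                 # assumed type
        attr("populationScored"),
        attr("accuracyMeasures", "map<string,string>"),      # assumed type
    ],
}

# Typedefs are normally registered by POSTing a body of this shape to
# /api/atlas/v2/types/typedefs; here the payload is only printed.
payload = {"entityDefs": [ml_model_def]}
print(json.dumps(payload, indent=2))
```

MLModelExecution and MLModelTraining could be described the same way, each
with its own attributeDefs list.
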
MLModelExecution, with attributes:
* exampleInputDatasetURL (URL where a sample input dataset can be found)
* outputTargetDatasetURLs
* opsOwnerEmail
* executionEndpointURL
* dockerContainerURL
* MLFlowPointerURL
* executionNotebookURL (e.g. Databricks, Jupyter)
MLModelTraining, with attributes:
* hyperParameters
* trainingDatasetURLs
* trainingNotebookURL (e.g. Databricks, Jupyter)
FeatureSet (a set of features prepared as input to an ML model), with
attributes:
* version
* locationURL
FeatureEngineeringEngine (the engine that generates the feature set for an ML
model), with attributes:
* version
* ownerEmail
* inputSourceURL
* processingEngineInfoURL (docs on the processing engine)
* githubRepoURL
* outputTargetURL
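
One possible way to fit FeatureSet and FeatureEngineeringEngine into the
existing Atlas base model is sketched below. It assumes, purely for
illustration, that FeatureSet extends the built-in DataSet type and that
FeatureEngineeringEngine extends the built-in Process type, so that feature
generation could appear in lineage. The issue itself does not state this
mapping, and all attributes are typed as optional strings because no types
are given. The attr() and entity() helpers are hypothetical.

```python
import json


def attr(name, type_name="string"):
    """Hypothetical helper: an optional, single-valued attributeDef."""
    return {
        "name": name,
        "typeName": type_name,
        "isOptional": True,
        "cardinality": "SINGLE",
        "isUnique": False,
        "isIndexable": False,
    }


def entity(name, super_type, attr_names):
    """Hypothetical helper: an entityDef whose attributes are all strings."""
    return {
        "name": name,
        "superTypes": [super_type],
        "typeVersion": "1.0",
        "attributeDefs": [attr(a) for a in attr_names],
    }


# Assumption: a feature set is treated as a kind of DataSet.
feature_set = entity("FeatureSet", "DataSet", ["version", "locationURL"])

# Assumption: the engine is treated as a kind of Process.
engine = entity(
    "FeatureEngineeringEngine",
    "Process",
    [
        "version",
        "ownerEmail",
        "inputSourceURL",
        "processingEngineInfoURL",
        "githubRepoURL",
        "outputTargetURL",
    ],
)

payload = {"entityDefs": [feature_set, engine]}
print(json.dumps(payload, indent=2))
```

Whether the URL attributes should stay plain strings or become references to
other Atlas entities is left open here.
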
--
This message was sent by Atlassian Jira
(v8.3.4#803005)