c4emmmm opened a new pull request #9033: [FLINK-13167] Introduce FlinkML 
import/export framework
URL: https://github.com/apache/flink/pull/9033
 
 
   ## What is the purpose of the change
   
   Importing and exporting trained models is an important part of the machine 
learning lifecycle, especially for serving, and it triggered a heated 
discussion in the previous pull request for the FlinkML basic interfaces. Some 
thoughts about this are summarized in the following doc.
   
   
https://docs.google.com/document/d/1B84b-1CvOXtwWQ6_tQyiaHwnSeiRqh-V96Or8uHqCp8/edit?usp=sharing
   
   This pull request introduces the import/export framework based on the 
discussion. 
   
   ## Brief change log
   
   The import and export frameworks have almost the same structure, so we take 
export as the example. The export framework has 3 basic components: 2 interfaces 
for exporter authors to implement, and one utility class for end users to 
acquire an Exporter.
   
   Interfaces:
    - Exporter: Interface of model exporters. Implementations can export 
supported models to a specific target model type, such as PMML.
   
    - ExporterFactory: Interface of factories that create Exporters. Each 
implementation supports only one model type. Typically an ExporterFactory is 
given a model instance, plus a property map if needed, and returns an Exporter 
that can export that model to the target model type. ExporterFactories are 
loaded with Java's ServiceLoader mechanism and should be registered in the 
file 
"META-INF/services/org.apache.flink.ml.api.misc.exportation.ExporterFactory".
   
   Utility:
    - ExporterLoader: Loader to get an Exporter for a specific model format and 
a model instance. Users obtain Exporters through this class rather than calling 
constructors directly. It is a concrete utility class and is not meant to be 
overridden.
   
   In addition, a new enumeration named ModelType is introduced, listing all 
model types FlinkML may support. It will be extended whenever a new exporter 
supports a new type.
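
To make the structure above concrete, here is a minimal, self-contained sketch of how the three export components and ModelType could fit together. It is an illustration only, not the code in this pull request: every method name and signature (`export`, `getTargetType`, `supports`, `create`, `getExporter`), as well as the `LinearModel` and `ToyPmmlExporterFactory` classes and the toy output string, are assumptions made for the example. The real interfaces live in the package named in the services file above and may look different.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.ServiceLoader;

/** Illustrative sketch only; all names and signatures below are assumptions. */
public class ExportSketch {

    /** Stand-in for the ModelType enumeration; PMML is the example from the description. */
    public enum ModelType { PMML }

    /** Hypothetical trained model, used only to have something to export. */
    public static final class LinearModel {
        final double[] weights;
        public LinearModel(double[] weights) { this.weights = weights; }
    }

    /** Assumed shape of an Exporter: turns a supported model into the target format. */
    public interface Exporter {
        String export(Object model);
    }

    /** Assumed shape of an ExporterFactory: one target type per implementation. */
    public interface ExporterFactory {
        ModelType getTargetType();
        boolean supports(Object model);
        Exporter create(Object model, Map<String, String> properties);
    }

    /** Concrete utility class: picks the first factory matching the type and model. */
    public static final class ExporterLoader {
        private ExporterLoader() {}

        /** Discovers factories through java.util.ServiceLoader (needs a services file). */
        public static Exporter getExporter(
                ModelType type, Object model, Map<String, String> properties) {
            return getExporter(
                ServiceLoader.load(ExporterFactory.class), type, model, properties);
        }

        /** Same lookup over an explicit set of factories, handy for tests. */
        public static Exporter getExporter(
                Iterable<ExporterFactory> factories,
                ModelType type, Object model, Map<String, String> properties) {
            for (ExporterFactory factory : factories) {
                if (factory.getTargetType() == type && factory.supports(model)) {
                    return factory.create(model, properties);
                }
            }
            throw new IllegalArgumentException(
                "No exporter for " + type + " and " + model.getClass().getName());
        }
    }

    /** Toy factory; the string it emits is a placeholder, not real PMML. */
    public static final class ToyPmmlExporterFactory implements ExporterFactory {
        @Override public ModelType getTargetType() { return ModelType.PMML; }
        @Override public boolean supports(Object model) { return model instanceof LinearModel; }
        @Override public Exporter create(Object model, Map<String, String> properties) {
            return m -> "<PMML weights=\""
                + Arrays.toString(((LinearModel) m).weights) + "\"/>";
        }
    }

    public static void main(String[] args) {
        // Factories are passed explicitly here; a real deployment would rely on
        // the ServiceLoader-based overload and the registration file instead.
        List<ExporterFactory> factories = Arrays.asList(new ToyPmmlExporterFactory());
        LinearModel model = new LinearModel(new double[] {1.0, 2.0});
        Exporter exporter = ExporterLoader.getExporter(
            factories, ModelType.PMML, model, Collections.<String, String>emptyMap());
        System.out.println(exporter.export(model));
    }
}
```

The overload that takes an explicit `Iterable` is a choice made for this sketch so the lookup can be exercised without a `META-INF/services` entry on the classpath. The import side would mirror this structure with importer counterparts.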
   
   ## Verifying this change
   
   Tests for the factories and loaders will be added with this change (TBD).
   
   ## Does this pull request potentially affect one of the following parts:
   
     - Dependencies (does it add or upgrade a dependency): (no)
     - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
     - The serializers: (no)
     - The runtime per-record code paths (performance sensitive): (no)
     - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)
     - The S3 file system connector: (no)
   
   ## Documentation
   
     - Does this pull request introduce a new feature? (yes)
     - If yes, how is the feature documented? (JavaDocs)
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services
