lindong28 commented on code in PR #196:
URL: https://github.com/apache/flink-ml/pull/196#discussion_r1091380089


##########
flink-ml-lib/src/test/java/org/apache/flink/ml/feature/StandardScalerTest.java:
##########
@@ -206,7 +206,7 @@ public void testSaveLoadAndPredict() throws Exception {
         model = TestUtils.saveAndReload(tEnv, model, 
tempFolder.newFolder().getAbsolutePath());
 
         assertEquals(
-                Arrays.asList("mean", "std"),
+                Arrays.asList("mean", "std", "version", "timestamp"),

Review Comment:
   > What do you mean by verify model data contains those columns? Do you mean 
we only verify model data and leave model version and timestamp unchecked?
   
   Yes that is what I means. IMO StandardScalerTest should verify the expected 
behavior of StandardScalerModel from the perspective of users who want to use 
StandardScalerModel. IMO users of StandardScalerModel should not need to know 
concepts such as model version or timestamp. Users should 
`OnlineStandardScaler` needs to know these information, and this behavior is 
covered by `OnlineStandardScalerTest`.
   
   > I think letting StandardScalerModel have model version and timestamp is 
reasonable, since offline learning could be viewed as a special case of online 
learning
   
   Yes we can tell users this. But is there benefit for users to take the extra 
effort to learn about this? If only the users of `OnlineStandardScaler` needs 
to know this, it seems more intuitive to put all related tests there?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to