Gandagorn commented on a change in pull request #1323:
URL: https://github.com/apache/systemds/pull/1323#discussion_r668004905



##########
File path: src/main/python/tests/examples/tutorials/test_adult.py
##########
@@ -387,6 +387,11 @@ def test_level2(self):
         
################################################################################################################
         X1, M1 = X1.transform_encode(spec=jspec)
 
+        # better alternative for encoding
+        # X1, M = F1.transform_encode(spec=jspec)
+        # X2 = F2.transform_apply(spec=jspec, meta=M)
+        # testX2 = X2.compute(True)

Review comment:
       1. There are only 2 different labels to predict: whether a person makes 
"<=50K" a year or ">50K" a year. In the training set the 2 labels are correctly 
denoted as "<=50K" and ">50K", but in the test set they are called "<=50K." and 
">50K.". The dot at the end of the test set labels prevents us from using the 
transform_apply function correctly.
   2. The DML function we want to use should be able to replace a string in the 
target column of the frame, similar to 
   
   ```
   replace_target_frame = function(String replacement, String to_replace, Frame[Unknown] X)
     return(Frame[Unknown] X)
   {
     for (i in 1:nrow(X)) {
       if (as.scalar(X[i, ncol(X)]) == to_replace) {
         X[i, ncol(X)] = replacement;
       }
     }
   }
   ```
   
   However, when we try to load the function into Python, we get the error 
   > "NotImplementedError: Not Implemented type parsing for function def: Frame[Unknown]X"
   
   
   
   
   

##########
File path: src/main/python/tests/examples/tutorials/test_adult.py
##########
@@ -387,6 +387,11 @@ def test_level2(self):
         
################################################################################################################
         X1, M1 = X1.transform_encode(spec=jspec)
 
+        # better alternative for encoding
+        # X1, M = F1.transform_encode(spec=jspec)
+        # X2 = F2.transform_apply(spec=jspec, meta=M)
+        # testX2 = X2.compute(True)

Review comment:
       Thank you very much! I am still confused about how to use the replace 
function in the Python environment, though. We load the data in Python into a 
frame, and when we try to use `sds.source` with a DML file that includes a 
function with a frame as an argument, it throws the above error.
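
One fallback we could consider, if the frame argument cannot be parsed, is to do the replacement on the Python side before the data is turned into a frame. The sketch below only illustrates the behaviour we want from `replace_target_frame`. The helper name `replace_target` and the sample rows are made up for illustration, and it works on plain lists of rows rather than on a SystemDS frame, so it is not a substitute for the DML function.

```python
def replace_target(rows, to_replace, replacement):
    """Return a copy of rows where a last-column value equal to to_replace
    is swapped for replacement (same intent as the DML function above)."""
    cleaned = []
    for row in rows:
        row = list(row)  # copy so the caller's data is left untouched
        if row[-1] == to_replace:
            row[-1] = replacement
        cleaned.append(row)
    return cleaned


# Made-up rows in the shape of the test split (last column is the label).
test_rows = [["39", "State-gov", "<=50K."], ["52", "Private", ">50K."]]
test_rows = replace_target(test_rows, "<=50K.", "<=50K")
test_rows = replace_target(test_rows, ">50K.", ">50K")
print(test_rows)
```

With the labels normalized this way, the test set would use the same two label strings as the training set, which is what transform_apply needs.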



