ReneEnjilian opened a new pull request, #2133:
URL: https://github.com/apache/systemds/pull/2133

   This PR adds a new builtin function for ADASYN (Adaptive Synthetic Sampling), 
which generates synthetic minority-class samples to counter class imbalance in 
ML datasets (binary classification). The method itself is implemented, but I 
still need to add the test class in Java. I tested the method manually on a 
real dataset, [Pima Indians 
Diabetes](https://www.kaggle.com/datasets/uciml/pima-indians-diabetes-database),
 which the authors also used in the original paper. The generated synthetic 
data looked reasonable when compared to the original data. I will add a more 
detailed description here once I have added the test cases and run more 
experiments on the other datasets mentioned in the paper. 
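
For readers unfamiliar with ADASYN, the idea can be sketched in plain Python as 
below. This is only an illustration of the general algorithm, not the DML 
builtin from this PR. The function name `adasyn`, its parameters (`k`, `beta`, 
`seed`), and the label convention (1 = minority, 0 = majority) are assumptions 
made for this example. As a simplification, new points are interpolated 
towards one of the k nearest minority neighbours.

```python
import math
import random


def adasyn(X, y, k=5, beta=1.0, seed=0):
    """Illustrative ADASYN sketch (hypothetical helper, not the SystemDS builtin).

    X is a list of feature lists, y a list of labels (1 = minority, 0 = majority).
    Returns a list of synthetic minority samples.
    """
    rng = random.Random(seed)
    min_idx = [i for i, label in enumerate(y) if label == 1]
    n_min = len(min_idx)
    n_maj = len(y) - n_min

    # Total number of synthetic samples; beta = 1.0 aims at a balanced dataset.
    total = int(round((n_maj - n_min) * beta))
    if total <= 0 or n_min < 2:
        return []

    def knn(i, candidates):
        # Indices of the (up to) k points closest to X[i], excluding i itself.
        others = [j for j in candidates if j != i]
        others.sort(key=lambda j: math.dist(X[i], X[j]))
        return others[:k]

    # For each minority point: fraction of majority points among its neighbours.
    # A higher ratio means the point is harder to learn and gets more samples.
    ratios = []
    for i in min_idx:
        neighbours = knn(i, range(len(X)))
        ratios.append(sum(1 for j in neighbours if y[j] == 0) / len(neighbours))

    ratio_sum = sum(ratios)
    if ratio_sum == 0:
        # No minority point has majority neighbours: spread samples evenly.
        weights = [1.0 / n_min] * n_min
    else:
        weights = [r / ratio_sum for r in ratios]

    synthetic = []
    for i, w in zip(min_idx, weights):
        minority_neighbours = knn(i, min_idx)
        # Rounding per point means the final count can differ slightly from total.
        for _ in range(int(round(w * total))):
            j = rng.choice(minority_neighbours)
            lam = rng.random()  # interpolation factor in [0, 1)
            synthetic.append([a + lam * (b - a) for a, b in zip(X[i], X[j])])
    return synthetic
```

Each synthetic point lies on the line segment between two existing minority 
points, so generated values stay within the range of the observed minority 
data.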


