PeilinChen01 opened a new pull request, #2498: URL: https://github.com/apache/systemds/pull/2498
### Summary

This pull request contains the midterm progress for the Isolation Forest builtin implementation in SystemDS.

Main changes:

* Refactored the Isolation Forest training and scoring logic.
* Clamped `subsampling_size` to the actual number of input rows when necessary.
* Added support for single-row scoring in `outlierByIsolationForestApply`.
* Added support for single-tree models.
* Added handling for constant data where no valid split is possible.
* Made seeded training reproducible.
* Added edge-case tests for the Isolation Forest builtin.

### Tests

The following command passes locally:

```bash
mvn -Dtest=BuiltinIsolationForestTest test
```

Covered test cases include:

* Basic model training and scoring
* Hybrid execution mode
* Subsampling-size clamping
* Single-row apply
* Single-tree model
* Constant data
* Seed reproducibility
* Anomaly ranking
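
For readers unfamiliar with the edge cases named in the summary, here is a minimal Python sketch of three of them: clamping the subsampling size, treating constant data as unsplittable, and seeded reproducibility. It is an illustration only, not the DML code in this pull request. The helper names `clamp_subsampling_size` and `choose_split` are hypothetical, and restricting the split to non-constant features is an assumption about one reasonable way to handle constant data.

```python
import random


def clamp_subsampling_size(subsampling_size, n_rows):
    """Hypothetical helper: never draw more rows than the input contains."""
    return min(subsampling_size, n_rows)


def choose_split(rows, rng):
    """Hypothetical helper: pick a random (feature, value) split.

    Returns None when every feature is constant, in which case the
    caller would turn the current node into a leaf.
    """
    n_features = len(rows[0])
    # Assumption: only features with at least two distinct values are splittable.
    candidates = [
        j for j in range(n_features)
        if min(r[j] for r in rows) < max(r[j] for r in rows)
    ]
    if not candidates:
        return None
    j = rng.choice(candidates)
    lo = min(r[j] for r in rows)
    hi = max(r[j] for r in rows)
    return j, rng.uniform(lo, hi)


# A requested size of 256 on a 100-row input is clamped to 100.
size = clamp_subsampling_size(256, 100)

# Constant data offers no valid split.
constant_split = choose_split([[1.0, 2.0], [1.0, 2.0]], random.Random(0))

# Two generators created with the same seed choose the same split.
data = [[0.0, 5.0], [1.0, 5.0], [3.0, 5.0]]
split_a = choose_split(data, random.Random(7))
split_b = choose_split(data, random.Random(7))
```

Passing an explicit generator object instead of relying on global random state is what makes the two seeded calls comparable in this sketch.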
