AdityaPandey2612 opened a new pull request, #2403:
URL: https://github.com/apache/systemds/pull/2403

   This PR extends the existing autoencoder implementation to support 
arbitrary-depth encoder–decoder architectures, building directly on the 
original 2-hidden-layer design.
   The current autoencoder code works well for the fixed 2-layer case, but it 
relies on hard-coded assumptions about layer count and parameter layout. This 
makes it difficult to experiment with deeper architectures without duplicating 
large parts of the training logic. The goal of this change is to generalize the 
training pipeline while preserving the original behavior and APIs.
   In short: the existing 2-layer autoencoder remains unchanged, and a new 
generalized path is added for N-layer configurations.
   
   This work is motivated by practical experimentation needs:
   - Enable deeper autoencoders (e.g., multi-layer encoders with mirrored 
decoders)
   - Avoid duplicating training logic for each new depth
   - Reuse the existing SGD, momentum, batching, and decay logic
   - Keep backward compatibility with all existing scripts and tests
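   To make the reuse concrete, here is a minimal Python/NumPy sketch (illustrative only; the PR itself is written in DML, and these names are hypothetical) of why list-based model state lets one update rule serve any depth:

```python
import numpy as np

def sgd_momentum_step(params, grads, velocities, lr, mu):
    """One SGD-with-momentum update applied uniformly over lists of
    parameters (weights and biases), independent of network depth."""
    for i in range(len(params)):
        velocities[i] = mu * velocities[i] - lr * grads[i]
        params[i] = params[i] + velocities[i]
    return params, velocities
```

   Because the loop ranges over the list, the same step works for 2 layers or N layers; learning-rate decay can still be applied to `lr` per epoch outside the loop, exactly as in the fixed-depth code.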
   
   Changes:
   - Added a generalized autoencoder entry point that accepts a list of hidden 
layer sizes
   - Refactored training logic to operate over lists of weights and biases 
instead of fixed positional arguments
   - Introduced helper functions to:
       1. build symmetric encoder/decoder layer layouts
       2. flatten and unflatten model state
       3. run forward and backward passes across arbitrary depth
   - Left the original 2-layer autoencoder path intact
   No existing public APIs were removed or modified.
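   The helper functions listed above can be sketched roughly as follows. This is a hedged Python/NumPy illustration under assumed names (`build_symmetric_layers`, `flatten`, `unflatten`, `forward`), not the actual DML implementation:

```python
import numpy as np

def build_symmetric_layers(n_features, hidden_sizes):
    """Mirror the encoder sizes to get a symmetric encoder/decoder layout,
    e.g. n_features=784, hidden_sizes=[500, 2] -> [784, 500, 2, 500, 784]."""
    return [n_features] + hidden_sizes + hidden_sizes[-2::-1] + [n_features]

def init_model(layer_dims, seed=0):
    """One weight matrix and one bias vector per consecutive pair of dims."""
    rng = np.random.default_rng(seed)
    W = [rng.normal(0.0, 0.1, (layer_dims[i], layer_dims[i + 1]))
         for i in range(len(layer_dims) - 1)]
    b = [np.zeros(layer_dims[i + 1]) for i in range(len(layer_dims) - 1)]
    return W, b

def flatten(W, b):
    """Pack all weights and biases into a single vector."""
    return np.concatenate([p.ravel() for p in W + b])

def unflatten(vec, layer_dims):
    """Inverse of flatten: rebuild the per-layer lists from one vector."""
    W, b, off = [], [], 0
    for i in range(len(layer_dims) - 1):
        n = layer_dims[i] * layer_dims[i + 1]
        W.append(vec[off:off + n].reshape(layer_dims[i], layer_dims[i + 1]))
        off += n
    for i in range(len(layer_dims) - 1):
        n = layer_dims[i + 1]
        b.append(vec[off:off + n])
        off += n
    return W, b

def forward(X, W, b):
    """Forward pass across arbitrary depth (sigmoid at every layer);
    returns all activations so a backward pass can reuse them."""
    acts = [X]
    for Wi, bi in zip(W, b):
        acts.append(1.0 / (1.0 + np.exp(-(acts[-1] @ Wi + bi))))
    return acts
```

   With `hidden_sizes` of length 2 the symmetric layout reduces to the fixed 2-layer case, which is what the consistency tests below exercise.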
   
   Testing:
   - Existing autoencoder tests continue to pass without modification
   - New tests validate:
        1. correct output shapes for multi-layer configurations
        2. consistency of behavior when reducing to the 2-layer case
   
   Known limitations:
   Parameter server training for the generalized path exposes limitations in 
the current list-based gradient aggregation logic: switching the execution mode 
from the default backend to paramserv currently fails, and fixing this would 
require changes to an internal SystemDS parameter server utility. Follow-up 
work may be needed to fully support paramserv for arbitrary-depth models. This 
PR does not change paramserv internals; the generalized training path is 
stable under the default (non-paramserv) execution mode.
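   For context on where the list-based aggregation runs into trouble, the kind of aggregation a parameter server needs can be sketched as follows (hypothetical Python illustration; this is not the SystemDS paramserv utility itself):

```python
import numpy as np

def aggregate_gradients(worker_grads):
    """Average gradients across workers, layer by layer.
    worker_grads is a list with one entry per worker; each entry is
    itself a list with one gradient array per layer."""
    n_workers = len(worker_grads)
    n_layers = len(worker_grads[0])
    return [sum(w[l] for w in worker_grads) / n_workers
            for l in range(n_layers)]
```

   The fixed-depth code could aggregate each positional gradient matrix directly; once gradients live in variable-length lists, the server-side utility has to traverse them as above, which is where the current internals fall short.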
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
