fedorzh commented on issue #9185: Gluon provided ResNet does not get desirable accuracy on CIFAR10
URL: https://github.com/apache/incubator-mxnet/issues/9185#issuecomment-356109100

I also have some symbols stored in my notebook cache, not sure which one is which

```
ResNetV2(
  (features): HybridSequential(
    (0): BatchNorm(fix_gamma=True, eps=1e-05, momentum=0.9, axis=1, in_channels=None)
    (1): Conv2D(3 -> 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
    (2): BatchNorm(fix_gamma=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)
    (3): Activation(relu)
    (4): MaxPool2D(size=(3, 3), stride=(2, 2), padding=(1, 1), ceil_mode=False)
    (5): HybridSequential(
      (0): BasicBlockV2(
        (bn1): BatchNorm(fix_gamma=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)
        (bn2): BatchNorm(fix_gamma=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)
        (conv2): Conv2D(64 -> 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (conv1): Conv2D(64 -> 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      )
      (1): BasicBlockV2(
        (bn1): BatchNorm(fix_gamma=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)
        (bn2): BatchNorm(fix_gamma=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)
        (conv2): Conv2D(64 -> 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (conv1): Conv2D(64 -> 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      )
    )
    (6): HybridSequential(
      (0): BasicBlockV2(
        (bn1): BatchNorm(fix_gamma=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)
        (downsample): Conv2D(64 -> 128, kernel_size=(1, 1), stride=(2, 2), bias=False)
        (bn2): BatchNorm(fix_gamma=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)
        (conv2): Conv2D(128 -> 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (conv1): Conv2D(64 -> 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
      )
      (1): BasicBlockV2(
        (bn1): BatchNorm(fix_gamma=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)
        (bn2): BatchNorm(fix_gamma=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)
        (conv2): Conv2D(128 -> 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (conv1): Conv2D(128 -> 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      )
    )
    (7): HybridSequential(
      (0): BasicBlockV2(
        (bn1): BatchNorm(fix_gamma=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)
        (downsample): Conv2D(128 -> 256, kernel_size=(1, 1), stride=(2, 2), bias=False)
        (bn2): BatchNorm(fix_gamma=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)
        (conv2): Conv2D(256 -> 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (conv1): Conv2D(128 -> 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
      )
      (1): BasicBlockV2(
        (bn1): BatchNorm(fix_gamma=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)
        (bn2): BatchNorm(fix_gamma=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)
        (conv2): Conv2D(256 -> 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (conv1): Conv2D(256 -> 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      )
    )
    (8): HybridSequential(
      (0): BasicBlockV2(
        (bn1): BatchNorm(fix_gamma=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)
        (downsample): Conv2D(256 -> 512, kernel_size=(1, 1), stride=(2, 2), bias=False)
        (bn2): BatchNorm(fix_gamma=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)
        (conv2): Conv2D(512 -> 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (conv1): Conv2D(256 -> 512, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
      )
      (1): BasicBlockV2(
        (bn1): BatchNorm(fix_gamma=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)
        (bn2): BatchNorm(fix_gamma=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)
        (conv2): Conv2D(512 -> 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (conv1): Conv2D(512 -> 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      )
    )
    (9): BatchNorm(fix_gamma=False, eps=1e-05, momentum=0.9, axis=1, in_channels=None)
    (10): Activation(relu)
    (11): GlobalAvgPool2D(size=(1, 1), stride=(1, 1), padding=(0, 0), ceil_mode=True)
    (12): Flatten
  )
  (output): Dense(512 -> 10, linear)
)
```

and

```
ResNetV2(
  (features): HybridSequential(
    (0): BatchNorm(fix_gamma=True, eps=1e-05, momentum=0.9, axis=1, in_channels=3)
    (1): Conv2D(3 -> 16, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
    (2): BatchNorm(fix_gamma=False, eps=1e-05, momentum=0.9, axis=1, in_channels=16)
    (3): Activation(relu)
    (4): MaxPool2D(size=(3, 3), stride=(2, 2), padding=(1, 1), ceil_mode=False)
    (5): HybridSequential(
      (0): BasicBlockV2(
        (bn1): BatchNorm(fix_gamma=False, eps=1e-05, momentum=0.9, axis=1, in_channels=16)
        (bn2): BatchNorm(fix_gamma=False, eps=1e-05, momentum=0.9, axis=1, in_channels=16)
        (conv2): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (conv1): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      )
      (1): BasicBlockV2(
        (bn1): BatchNorm(fix_gamma=False, eps=1e-05, momentum=0.9, axis=1, in_channels=16)
        (bn2): BatchNorm(fix_gamma=False, eps=1e-05, momentum=0.9, axis=1, in_channels=16)
        (conv2): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (conv1): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      )
      (2): BasicBlockV2(
        (bn1): BatchNorm(fix_gamma=False, eps=1e-05, momentum=0.9, axis=1, in_channels=16)
        (bn2): BatchNorm(fix_gamma=False, eps=1e-05, momentum=0.9, axis=1, in_channels=16)
        (conv2): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (conv1): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      )
    )
    (6): HybridSequential(
      (0): BasicBlockV2(
        (bn1): BatchNorm(fix_gamma=False, eps=1e-05, momentum=0.9, axis=1, in_channels=16)
        (downsample): Conv2D(16 -> 32, kernel_size=(1, 1), stride=(2, 2), bias=False)
        (bn2): BatchNorm(fix_gamma=False, eps=1e-05, momentum=0.9, axis=1, in_channels=32)
        (conv2): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (conv1): Conv2D(16 -> 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
      )
      (1): BasicBlockV2(
        (bn1): BatchNorm(fix_gamma=False, eps=1e-05, momentum=0.9, axis=1, in_channels=32)
        (bn2): BatchNorm(fix_gamma=False, eps=1e-05, momentum=0.9, axis=1, in_channels=32)
        (conv2): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (conv1): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      )
      (2): BasicBlockV2(
        (bn1): BatchNorm(fix_gamma=False, eps=1e-05, momentum=0.9, axis=1, in_channels=32)
        (bn2): BatchNorm(fix_gamma=False, eps=1e-05, momentum=0.9, axis=1, in_channels=32)
        (conv2): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (conv1): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      )
    )
    (7): HybridSequential(
      (0): BasicBlockV2(
        (bn1): BatchNorm(fix_gamma=False, eps=1e-05, momentum=0.9, axis=1, in_channels=32)
        (downsample): Conv2D(32 -> 64, kernel_size=(1, 1), stride=(2, 2), bias=False)
        (bn2): BatchNorm(fix_gamma=False, eps=1e-05, momentum=0.9, axis=1, in_channels=64)
        (conv2): Conv2D(64 -> 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (conv1): Conv2D(32 -> 64, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
      )
      (1): BasicBlockV2(
        (bn1): BatchNorm(fix_gamma=False, eps=1e-05, momentum=0.9, axis=1, in_channels=64)
        (bn2): BatchNorm(fix_gamma=False, eps=1e-05, momentum=0.9, axis=1, in_channels=64)
        (conv2): Conv2D(64 -> 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (conv1): Conv2D(64 -> 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      )
      (2): BasicBlockV2(
        (bn1): BatchNorm(fix_gamma=False, eps=1e-05, momentum=0.9, axis=1, in_channels=64)
        (bn2): BatchNorm(fix_gamma=False, eps=1e-05, momentum=0.9, axis=1, in_channels=64)
        (conv2): Conv2D(64 -> 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (conv1): Conv2D(64 -> 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      )
    )
    (8): BatchNorm(fix_gamma=False, eps=1e-05, momentum=0.9, axis=1, in_channels=64)
    (9): Activation(relu)
    (10): GlobalAvgPool2D(size=(1, 1), stride=(1, 1), padding=(0, 0), ceil_mode=True)
    (11): Flatten
  )
  (output): Dense(64 -> 10, linear)
)
```
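
One way to compare the two printouts is to trace the feature-map size through the layers that can change it. This is only a minimal sketch: it assumes 32x32 inputs (the CIFAR-10 image size) and uses the kernel, stride, and padding values exactly as printed above. The names `out_size`, `trace`, and `STEM` are hypothetical helpers for illustration, not part of Gluon.

```python
def out_size(n, kernel, stride, padding):
    """Spatial size after a conv or pool layer with floor rounding (ceil_mode=False)."""
    return (n + 2 * padding - kernel) // stride + 1


# (kernel, stride, padding) of the stem layers shared by both printouts:
# the 7x7 stride-2 convolution and the 3x3 stride-2 max-pool.
STEM = [(7, 2, 3), (3, 2, 1)]


def trace(n, stage_strides):
    """Return the spatial size after each stem layer and each residual stage.

    Assumption: within a stage only the first 3x3 conv (padding 1) carries the
    stage stride; every other conv is 3x3, stride 1, padding 1 and keeps the size.
    """
    sizes = []
    for kernel, stride, padding in STEM:
        n = out_size(n, kernel, stride, padding)
        sizes.append(n)
    for stride in stage_strides:
        n = out_size(n, 3, stride, 1)
        sizes.append(n)
    return sizes


# First printout: four stages (64, 128, 256, 512 channels).
first = trace(32, [1, 2, 2, 2])
# Second printout: three stages (16, 32, 64 channels).
second = trace(32, [1, 2, 2])

print(first)   # [16, 8, 8, 4, 2, 1]
print(second)  # [16, 8, 8, 4, 2]
```

Under these assumptions both networks are already at 8x8 after the stem, and the first one is down to 1x1 before the global pooling layer, which might be worth keeping in mind when comparing their accuracy.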
