Also, when using pretrained weights, people normally mean the layer before the "softmax layer" when they say "last fully connected layer". This is because the weights of that final softmax layer are intrinsically tied to the training task and the softmax itself, while lower layers tend to hold more abstract representations. So in this case I would expect something like this (for example):
X -> L1 -> L2 -> L3 -> Softmax layer (Wx + b), and they take the weights from L3.

Another thing common with convnets is to take the first or second fully connected layer (rather than the last one before the softmax):

X -> L1 (conv) -> L2 (conv) -> L3 (fullconnect) -> L4 (fullconnect) -> Softmax layer (Wx + b)

People often choose L4, because it should contain information from the whole image if the convnet is well designed, along with the extra bit of "mixing"/abstraction from L3. Choosing which pretrained layer to use is usually problem specific, and performance can vary depending on how close the task the pretrained model was trained on is to the task you are trying to solve with the SVM.

On Thu, Nov 24, 2016 at 2:28 PM, Pascal Lamblin <[email protected]> wrote:
> The softmax layer (softmax(Wx + b)) is a classifier that is trained on
> the last fully-connected layer, and backpropagates a gradient so that
> the rest of the network is trained as well.
>
> An SVM is a different classifier, which they connected to the same input
> (x, the output of the last fully-connected layer) and trained
> (without backpropagation, I think).
>
> There is sometimes confusion in the literature between the softmax
> operation itself (exp(x) / exp(x).sum(), which converts unnormalized
> log-probabilities into a probability vector) and the "softmax layer", or
> "logistic regression layer" (softmax(Wx + b)).
>
> On Thu, Nov 24, 2016, Beatriz G. wrote:
>> Hi everyone, I am trying to build a CNN based on ImageNet. The paper
>> I am following says that the architecture is formed by convolutional
>> layers and fully connected layers, and that the last (output) layer is
>> followed by a softmax. Then it says that, after extracting the features
>> from the last fully connected layer, it uses an SVM as the classifier.
>>
>> I do not know if the input of the classifier is the output of the softmax.
>>
>> And I thought that the softmax was a classifier, so I must be wrong.
>>
>> Regards.
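To make the distinction concrete, here is a minimal NumPy sketch (not Theano code; the layer sizes and weight values are made-up placeholders). It computes x, the last fully-connected layer's activations, which is what you would feed to the SVM, and then applies the softmax layer softmax(Wx + b) on top, which is what the network's own classifier uses:

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up sizes: 5 features from the last fully-connected layer, 3 classes.
x = rng.standard_normal(5)          # output of the last fully-connected layer
W = rng.standard_normal((3, 5))     # softmax layer weights (placeholder values)
b = rng.standard_normal(3)          # softmax layer biases (placeholder values)

# The "softmax layer" is softmax(Wx + b): an affine map followed by softmax.
z = W @ x + b                       # unnormalized log-probabilities (logits)
probs = np.exp(z - z.max())         # subtract max for numerical stability
probs /= probs.sum()                # exp(z) / exp(z).sum() -> probability vector

# probs sums to 1 and is what the softmax classifier outputs.
# The SVM, by contrast, is trained directly on x (the fully-connected
# features), never on probs.
print(probs.sum())
```

The feature vectors x, collected over the whole training set, are what you would pass to an external classifier such as scikit-learn's sklearn.svm.LinearSVC as its training matrix; the softmax layer's output is not involved in that step.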
>
> --
> Pascal
