@samskalicky Thanks! Two models come to mind (both for relation 
extraction/classification) that would be easier to implement with a 
diag/trace operator:

* Learning with Noise: Enhance Distantly Supervised Relation Extraction with 
Dynamic Transition Matrix (https://arxiv.org/abs/1705.03995). The model uses a 
matrix for the transition from the true label to the noisy one, and the trace 
of that matrix appears in the loss as a regularizer.
* Neural Relation Extraction with Selective Attention over Instances 
(http://aclweb.org/anthology/P16-1200). The output of this model is the 
diagonal of a matrix after a row-wise softmax. For a batch of inputs, the 
outputs could be obtained by taking the diagonals of all the matrices in the 
batch at once.
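
To make the two use cases concrete, here is a rough sketch of the batched behaviour I have in mind. It uses NumPy rather than MXNet, since the MXNet operator is what is being requested here. The shapes, the random inputs, and the names `beta` and `trace_term` are made up for illustration and are not taken from either paper; how the trace term enters the loss (sign, weight) is paper-specific and not reproduced.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
batch, num_rel = 4, 3  # illustrative sizes only

# Use case 1: trace of a per-example transition matrix as a regularizer.
logits = rng.normal(size=(batch, num_rel, num_rel))
transition = softmax(logits, axis=-1)  # each row sums to 1
# One trace per matrix in the batch, shape (batch,).
batch_trace = np.trace(transition, axis1=-2, axis2=-1)
beta = 0.1  # hypothetical regularization weight
trace_term = beta * batch_trace.mean()

# Use case 2: row-wise softmax, then the diagonal of every matrix at once.
scores = rng.normal(size=(batch, num_rel, num_rel))
probs = softmax(scores, axis=-1)
# One diagonal per matrix in the batch, shape (batch, num_rel).
batch_diag = np.diagonal(probs, axis1=-2, axis2=-1)
```

The point is that both operations act on the last two axes and leave the batch axis alone, so no Python loop over the batch is needed.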

[ Full content available at: 
https://github.com/apache/incubator-mxnet/issues/12327 ]