TaoLv commented on issue #16735: Use single-bit for mask in dropout operator
URL: https://github.com/apache/incubator-mxnet/pull/16735#issuecomment-586662412
> Why does it increase memory load?
If there are N elements, per the Bernoulli distribution generation in VSL,
we still need t
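The extra memory traffic being discussed can be sketched as follows. This is a hypothetical NumPy illustration, not the MXNet or MKL code: VSL-style Bernoulli generation emits one full-width value per element, so producing a 1-bit mask requires a second pass over that full-width buffer to pack it (`dropout_bit_mask` and its arguments are invented names for illustration).

```python
import numpy as np

def dropout_bit_mask(x, p, rng):
    # Stand-in for VSL Bernoulli output: one full-width value per element.
    keep = rng.random(x.size) >= p
    # Extra pass: read the full-width mask again to pack it into bits.
    mask_bits = np.packbits(keep)
    # Inverted dropout: scale kept elements by 1/(1-p), zero the rest.
    y = np.where(keep, x.ravel() / (1.0 - p), 0.0).reshape(x.shape)
    return y, mask_bits
```

The bit mask shrinks the stored mask by 8x (one byte per eight elements), at the cost of the additional pack/unpack passes.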
TaoLv commented on issue #16735: Use single-bit for mask in dropout operator
URL: https://github.com/apache/incubator-mxnet/pull/16735#issuecomment-586608057
> What algorithm is used in TF and pytorch?
@pengzhao-intel I don't think TF has a fused dropout operator. It's
implemented with

TaoLv commented on issue #16735: Use single-bit for mask in dropout operator
URL: https://github.com/apache/incubator-mxnet/pull/16735#issuecomment-586564858
> Does the slowdown come from more computation in the new algorithm or the
sub-optimal implementation?
The new implementation
TaoLv commented on issue #16735: Use single-bit for mask in dropout operator
URL: https://github.com/apache/incubator-mxnet/pull/16735#issuecomment-586069482
Does the `avg_time_Dropout` include backward time? @apeforest
This
TaoLv commented on issue #16735: Use single-bit for mask in dropout operator
URL: https://github.com/apache/incubator-mxnet/pull/16735#issuecomment-585510015
> I'm ok with the result. @TaoLv any concern?
It's still 1.36x slower. I will take another look today.
> If sacrificing
TaoLv commented on issue #16735: Use single-bit for mask in dropout operator
URL: https://github.com/apache/incubator-mxnet/pull/16735#issuecomment-584505352
@apeforest Could you please share your benchmarking scripts?
This i
TaoLv commented on issue #16735: Use single-bit for mask in dropout operator
URL: https://github.com/apache/incubator-mxnet/pull/16735#issuecomment-584493695
> Can MKL VSL support bit-wise mask so we don't have to do the extra
conversion which seems redundant in the MKL case.
Sorry,
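The "extra conversion" mentioned in the quoted question can be sketched on the backward side as well. A hypothetical NumPy illustration (the function name and signature are invented, not MXNet's API): since VSL emits full-width integers rather than bits, a stored 1-bit mask has to be unpacked back to per-element booleans before the gradient can be masked.

```python
import numpy as np

def dropout_backward_from_bits(dy, mask_bits, p):
    # Unpack the 1-bit mask back to per-element booleans -- this is the
    # conversion step that VSL cannot produce or consume directly.
    keep = np.unpackbits(mask_bits, count=dy.size).astype(bool)
    # Apply the inverted-dropout scale 1/(1-p) to the kept gradients.
    return np.where(keep, dy.ravel() / (1.0 - p), 0.0).reshape(dy.shape)
```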
TaoLv commented on issue #16735: Use single-bit for mask in dropout operator
URL: https://github.com/apache/incubator-mxnet/pull/16735#issuecomment-581106905
@eric-haibin-lin some VSL functions [1] are used to generate random numbers
for dropout. VSL is part of MKL library [2].
[1]
TaoLv commented on issue #16735: Use single-bit for mask in dropout operator
URL: https://github.com/apache/incubator-mxnet/pull/16735#issuecomment-573484903
@apeforest Could you please also test the operator performance with
USE_BLAS=mkl?
TaoLv commented on issue #16735: Use single-bit for mask in dropout operator
URL: https://github.com/apache/incubator-mxnet/pull/16735#issuecomment-573422963
@apeforest Thank you for testing it out. Given memory is not always a
concern, can we make bit mask an option for dropout?
TaoLv commented on issue #16735: Use single-bit for mask in dropout operator
URL: https://github.com/apache/incubator-mxnet/pull/16735#issuecomment-570769743
@apeforest Thank you for the nice work! Do you have any numbers to share?
- memory usage of a model in which the dropout workspace is used