It would be awesome if MXNet were the first DL framework to support Nvidia
Volta. What do you all think about cutting a v0.12 release once that
integration is ready?

On Wed, Sep 27, 2017 at 10:38 PM, Jun Wu <[email protected]> wrote:

> I had been working on the sparse tensor project with Haibin. After the
> first stage was wrapped up, I started work on the quantization project
> (INT-8 inference). The benefits of using quantized models for inference
> include much higher throughput than FP32 models, with acceptable accuracy
> loss, and compact models that fit on small devices. The work currently
> targets quantizing ConvNets; we will consider extending it to RNNs after
> getting good results on image models. Meanwhile, it's expected to support
> quantization on CPU, GPU, and mobile devices.
>
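
For anyone who wants a feel for what the INT-8 work above involves: the
core idea is to map FP32 tensors onto 8-bit integers with a per-tensor
scale, so that the heavy ops can run in integer arithmetic. A minimal
NumPy sketch of symmetric linear quantization follows (illustrative only;
the function names are mine, not MXNet's actual quantization API):

    import numpy as np

    def quantize_int8(x):
        # Symmetric linear quantization: map the FP32 range
        # [-max|x|, +max|x|] onto the INT8 range [-127, 127].
        scale = 127.0 / np.max(np.abs(x))
        q = np.clip(np.round(x * scale), -127, 127).astype(np.int8)
        return q, scale

    def dequantize_int8(q, scale):
        # Recover an FP32 approximation of the original tensor.
        return q.astype(np.float32) / scale

    x = np.random.randn(4, 4).astype(np.float32)
    q, scale = quantize_int8(x)
    x_hat = dequantize_int8(q, scale)
    # Per-element rounding error is bounded by roughly 0.5 / scale.
    print(np.max(np.abs(x - x_hat)))

The accuracy loss Jun mentions comes from exactly this rounding step; the
throughput gain comes from running the quantized ops on INT8 hardware paths.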
