Gaurav Gireesh created MXNET-1210:
-------------------------------------

             Summary: Gluon Audio
                 Key: MXNET-1210
                 URL: https://issues.apache.org/jira/browse/MXNET-1210
             Project: Apache MXNet
          Issue Type: New Feature
          Components: Gluon
            Reporter: Gaurav Gireesh


As a user, I would like out-of-the-box audio data loading and audio transforms 
in MXNet that allow me to:
 * load audio files (only .wav files are supported currently) into a Gluon 
AudioDataset (NDArrays),

 * apply popular audio transforms to the audio data (for example scaling, MEL, 
MFCC),

 * load the dataset using Gluon's DataLoader and train a neural network (e.g. 
an MLP) on the transformed audio data,

 * perform a simple audio task such as sound classification, where each audio 
clip has one label (a multiclass sound classification problem).

The feature should also provide an end-to-end example for a task (Urban Sounds 
Classification), including:

 * reading audio files from a folder location (can be extended to an S3 bucket 
later) and loading them into the AudioDataset,

 * applying audio transforms,

 * training a model (a neural network) with the AudioDataset or DataLoader,

 * performing the multiclass classification (running inference).

Design: https://cwiki.apache.org/confluence/display/MXNET/Gluon+-+Audio
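
The loading and transform steps described above could look roughly like the 
following. This is only a sketch built on the Python standard library, not the 
proposed Gluon API: FolderAudioDataset, read_wav, scale, write_sine_wav and 
build_demo_dataset are hypothetical names, the folder layout (one sub-folder 
per label) is an assumption, and plain Python lists stand in for NDArrays. The 
demo data is a pair of synthetic sine tones rather than real Urban Sounds clips.

```python
import math
import os
import struct
import tempfile
import wave


def write_sine_wav(path, freq_hz=440.0, sample_rate=8000, n_samples=800):
    """Write a mono 16-bit PCM sine tone so the sketch has data to load."""
    frames = b"".join(
        struct.pack(
            "<h",
            int(32767 * 0.5 * math.sin(2 * math.pi * freq_hz * i / sample_rate)),
        )
        for i in range(n_samples)
    )
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(frames)


def read_wav(path):
    """Return the samples of a mono 16-bit PCM .wav file as a list of ints."""
    with wave.open(path, "rb") as wav_file:
        if wav_file.getnchannels() != 1 or wav_file.getsampwidth() != 2:
            raise ValueError("this sketch only handles mono 16-bit PCM")
        raw = wav_file.readframes(wav_file.getnframes())
    return list(struct.unpack("<%dh" % (len(raw) // 2), raw))


def scale(samples):
    """Example transform: peak-normalise samples to the range [-1.0, 1.0]."""
    peak = max((abs(s) for s in samples), default=0)
    if peak == 0:
        return [0.0 for _ in samples]
    return [s / peak for s in samples]


class FolderAudioDataset:
    """Hypothetical dataset: root/<label>/<clip>.wav -> (samples, label index)."""

    def __init__(self, root, transform=None):
        # Assumption: every sub-folder of root is one class.
        self.classes = sorted(
            d for d in os.listdir(root) if os.path.isdir(os.path.join(root, d))
        )
        self.items = []
        for index, name in enumerate(self.classes):
            class_dir = os.path.join(root, name)
            for fname in sorted(os.listdir(class_dir)):
                if fname.lower().endswith(".wav"):
                    self.items.append((os.path.join(class_dir, fname), index))
        self.transform = transform

    def __len__(self):
        return len(self.items)

    def __getitem__(self, idx):
        path, label = self.items[idx]
        samples = read_wav(path)
        if self.transform is not None:
            samples = self.transform(samples)
        return samples, label


def build_demo_dataset(root):
    """Create two synthetic classes under root and wrap them in the dataset."""
    for label, freq in (("high_tone", 880.0), ("low_tone", 220.0)):
        os.makedirs(os.path.join(root, label), exist_ok=True)
        write_sine_wav(os.path.join(root, label, "clip0.wav"), freq_hz=freq)
    return FolderAudioDataset(root, transform=scale)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        dataset = build_demo_dataset(tmp)
        samples, label = dataset[0]
        print(len(dataset), dataset.classes[label], len(samples))
```

The class exposes __len__ and __getitem__, which is the indexing protocol that 
Gluon datasets follow, so an equivalent class deriving from 
mxnet.gluon.data.Dataset and returning NDArrays could presumably be handed to 
Gluon's DataLoader for batching and training.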



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
