Gaurav Gireesh created MXNET-1210:
-------------------------------------
Summary: Gluon Audio
Key: MXNET-1210
URL: https://issues.apache.org/jira/browse/MXNET-1210
Project: Apache MXNet
Issue Type: New Feature
Components: Gluon
Reporter: Gaurav Gireesh
As a user, I would like MXNet to provide an out-of-the-box audio data loader
and audio transforms that would allow me to:
* load audio files (only .wav files are supported currently) and make a Gluon
AudioDataset (NDArrays) from them,
* apply some popular audio transforms to the audio data (for example scaling,
MEL, MFCC, etc.),
* load the dataset using Gluon's DataLoader and train a neural network (e.g.
an MLP) with the transformed audio dataset,
* perform a simple audio task such as sound classification, with one label per
audio clip (a multiclass sound classification problem).
* Provide an end-to-end example for a task (Urban Sounds Classification),
including:
  * reading audio files from a folder location (can be extended to an S3
  bucket later) and loading them into the AudioDataset
  * applying audio transforms
  * training a model (a neural network) with the AudioDataset or DataLoader
  * performing the multiclass classification (running inference)
* Design document: https://cwiki.apache.org/confluence/display/MXNET/Gluon+-+Audio
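
To make the folder-to-dataset step above concrete, here is a minimal sketch
that uses only the Python standard library, not MXNet. The names
FolderAudioDataset, read_wav_mono16, scale and write_sine_wav are
hypothetical, and the root/<label>/<clip>.wav folder layout is an assumption;
none of them come from the design document. The class only exposes
__len__ and __getitem__, the same protocol a Gluon Dataset follows, and the
proposed feature would presumably return NDArrays instead of Python lists.

```python
import math
import os
import struct
import tempfile
import wave


def read_wav_mono16(path):
    """Read a 16-bit mono PCM .wav file into a list of integer samples."""
    with wave.open(path, "rb") as wf:
        if wf.getsampwidth() != 2 or wf.getnchannels() != 1:
            raise ValueError("this sketch only handles 16-bit mono PCM")
        raw = wf.readframes(wf.getnframes())
    return list(struct.unpack("<%dh" % (len(raw) // 2), raw))


def scale(samples, max_abs=32768.0):
    """Example transform: map 16-bit samples into the range [-1.0, 1.0)."""
    return [s / max_abs for s in samples]


class FolderAudioDataset:
    """Hypothetical stand-in for the proposed AudioDataset.

    Assumes root/<label>/<clip>.wav and yields (samples, label_index) pairs.
    """

    def __init__(self, root, transform=None):
        self.transform = transform
        # Each sub-folder name is treated as a class label (an assumption).
        self.classes = sorted(
            d for d in os.listdir(root) if os.path.isdir(os.path.join(root, d))
        )
        self.items = []
        for idx, name in enumerate(self.classes):
            folder = os.path.join(root, name)
            for fname in sorted(os.listdir(folder)):
                if fname.lower().endswith(".wav"):
                    self.items.append((os.path.join(folder, fname), idx))

    def __len__(self):
        return len(self.items)

    def __getitem__(self, i):
        path, label = self.items[i]
        samples = read_wav_mono16(path)
        if self.transform is not None:
            samples = self.transform(samples)
        return samples, label


def write_sine_wav(path, freq_hz, n_frames=800, rate=8000):
    """Write a short half-amplitude sine tone, used only as demo input."""
    frames = b"".join(
        struct.pack(
            "<h", int(32767 * 0.5 * math.sin(2 * math.pi * freq_hz * t / rate))
        )
        for t in range(n_frames)
    )
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)
        wf.setframerate(rate)
        wf.writeframes(frames)


if __name__ == "__main__":
    # Build a tiny two-class folder tree and iterate over it.
    root = tempfile.mkdtemp()
    for label, freq in (("siren", 440), ("drilling", 880)):
        os.makedirs(os.path.join(root, label))
        write_sine_wav(os.path.join(root, label, "clip.wav"), freq)
    dataset = FolderAudioDataset(root, transform=scale)
    for i in range(len(dataset)):
        samples, label = dataset[i]
        print(dataset.classes[label], len(samples))
```

Keeping the transform as a plain callable passed to the dataset means that
scaling, MEL or MFCC steps could be swapped in without touching the loading
code, which is one possible way to separate the two concerns listed above.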