[GitHub] Jerryzcn commented on issue #10042: [MXNET-86] Gluon dataloader crash on speech recognition training
Jerryzcn commented on issue #10042: [MXNET-86] Gluon dataloader crash on speech recognition training URL: https://github.com/apache/incubator-mxnet/issues/10042#issuecomment-377629440 Yeah, we can close this. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] Jerryzcn commented on issue #10042: [MXNET-86] Gluon dataloader crash on speech recognition training
Jerryzcn commented on issue #10042: [MXNET-86] Gluon dataloader crash on speech recognition training URL: https://github.com/apache/incubator-mxnet/issues/10042#issuecomment-374372260 @ThomasDelteil The bug is related to a race condition in memory management, where a memory space is freed twice. The latest master adds a lock on the space, which might slow down data loading. However, I'm not sure exactly why. ping @zhreshold
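To illustrate the kind of double free described in this comment, here is a minimal Python sketch of how a lock can make a concurrent free happen only once. It is a toy model, not MXNet's storage manager: `SharedPool`, `free_concurrently`, and the `"batch-0"` handle are hypothetical names introduced for this example.

```python
import threading


class SharedPool:
    """Toy stand-in for a shared-memory pool (hypothetical, not MXNet code)."""

    def __init__(self):
        self._live = set()             # handles that are currently allocated
        self._lock = threading.Lock()  # guards every access to _live
        self.release_count = 0         # how many times memory was actually released

    def alloc(self, handle):
        with self._lock:
            self._live.add(handle)

    def free(self, handle):
        # The liveness check and the release happen under one lock, so two
        # threads can never both observe the handle as live and both free it.
        with self._lock:
            if handle not in self._live:
                return False  # already freed by another thread: do nothing
            self._live.remove(handle)
            self.release_count += 1
            return True


def free_concurrently(pool, handle, n_threads=8):
    """Have several threads try to free the same handle at once."""
    results = []

    def worker():
        results.append(pool.free(handle))

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


pool = SharedPool()
pool.alloc("batch-0")
results = free_concurrently(pool, "batch-0")
print(sum(results), pool.release_count)
```

In this sketch every call to `free` is serialized on the lock. Whether that kind of serialization explains the slowdown mentioned above is not established in this thread.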
[GitHub] Jerryzcn commented on issue #10042: [MXNET-86] Gluon dataloader crash on speech recognition training
Jerryzcn commented on issue #10042: [MXNET-86] Gluon dataloader crash on speech recognition training URL: https://github.com/apache/incubator-mxnet/issues/10042#issuecomment-374319694 @ThomasDelteil Can you try building from source using master? https://github.com/apache/incubator-mxnet/pull/10096 might fix it. I am testing it right now.
[GitHub] Jerryzcn commented on issue #10042: [MXNET-86] Gluon dataloader crash on speech recognition training
Jerryzcn commented on issue #10042: [MXNET-86] Gluon dataloader crash on speech recognition training URL: https://github.com/apache/incubator-mxnet/issues/10042#issuecomment-373186979 The segfault seems to be related to https://github.com/apache/incubator-mxnet/pull/10096
[GitHub] Jerryzcn commented on issue #10042: [MXNET-86] Gluon dataloader crash on speech recognition training
Jerryzcn commented on issue #10042: [MXNET-86] Gluon dataloader crash on speech recognition training URL: https://github.com/apache/incubator-mxnet/issues/10042#issuecomment-372817360 There seem to be other issues as well: after training for a day or so, I got a segfault. This does not happen with a small dataset. The segfault was observed with 1.2.0. I will try a previous version.
[GitHub] Jerryzcn commented on issue #10042: [MXNET-86] Gluon dataloader crash on speech recognition training
Jerryzcn commented on issue #10042: [MXNET-86] Gluon dataloader crash on speech recognition training URL: https://github.com/apache/incubator-mxnet/issues/10042#issuecomment-372816606 The error output only appears when you train on the actual data. When you use the test script, it freezes instead. Reverting to 1.1.0 resolves the problem.