[GitHub] jiarenyf commented on issue #7710: CTC ERROR WITH CUDA ILLEGAL MEMORY ACCESS ERROR
jiarenyf commented on issue #7710: CTC ERROR WITH CUDA ILLEGAL MEMORY ACCESS ERROR
URL: https://github.com/apache/incubator-mxnet/issues/7710#issuecomment-330248910

@szha I found another cause of the problem. In "https://github.com/jiarenyf/mxWrapper/blob/02fd9b0fcd37f7224648efad651a6f83a1f06d78/mxHelper/mxData.py#L158", the labels are initialized to empty, while "https://github.com/jiarenyf/mxWrapper/blob/02fd9b0fcd37f7224648efad651a6f83a1f06d78/mxHelper/model/model.py#L98" uses batch.label directly without taking batch.pad into account. So whenever "data size % batch size == 0" does not hold, the error occurs when the empty labels are accessed.

The data set I gave you for debugging has 800 images, which satisfies "800 % 100 == 0" (100 is the batch size), so the problem never occurs on your side. I found this only because I happened to change the training set size to 111 and the test set size to 33, where "111+33 % 100 != 0".

After I added "https://github.com/jiarenyf/mxWrapper/blob/02fd9b0fcd37f7224648efad651a6f83a1f06d78/mxHelper/mxData.py#L169" (remove the empty labels) and changed "https://github.com/jiarenyf/mxWrapper/blob/02fd9b0fcd37f7224648efad651a6f83a1f06d78/mxHelper/mxData.py#L170" (set "pad" to 0), the problem no longer occurs.

Is there some error or bug in my implementation, "https://github.com/jiarenyf/mxWrapper/blob/master/mxHelper/model/model.py"?

This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
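The batch.pad issue described in the comment above can be illustrated with a small, framework-free sketch. It assumes that a short final batch is filled up to the batch size and that the number of filler rows is reported as "pad", which matches how the comment describes batch.pad. The helper names `num_padded` and `strip_padding` are hypothetical and are not taken from mxWrapper or MXNet.

```python
def num_padded(data_size, batch_size):
    """Number of filler samples in the last batch (0 if the data divides evenly)."""
    remainder = data_size % batch_size
    return 0 if remainder == 0 else batch_size - remainder


def strip_padding(batch_labels, pad):
    """Return only the label rows that belong to real samples.

    Assumes the filler rows sit at the end of the batch.
    """
    if pad < 0 or pad > len(batch_labels):
        raise ValueError("pad must be between 0 and the batch length")
    return batch_labels[:len(batch_labels) - pad]


# With the sizes mentioned in the comment and a batch size of 100:
print(num_padded(800, 100))  # evenly divisible, so no padding
print(num_padded(111, 100))  # last batch holds 11 real samples
print(num_padded(33, 100))   # the only batch holds 33 real samples
```

In a training loop, the same idea would mean slicing the labels (and the matching outputs) by the reported pad before passing them to the loss or metric, rather than reading every row of the batch.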
[GitHub] jiarenyf commented on issue #7710: CTC ERROR WITH CUDA ILLEGAL MEMORY ACCESS ERROR
jiarenyf commented on issue #7710: CTC ERROR WITH CUDA ILLEGAL MEMORY ACCESS ERROR
URL: https://github.com/apache/incubator-mxnet/issues/7710#issuecomment-329039575

@szha Yes. I have updated to version 0.11.0 with "pip install mxnet-cu80==0.11.0".
[GitHub] jiarenyf commented on issue #7710: CTC ERROR WITH CUDA ILLEGAL MEMORY ACCESS ERROR
jiarenyf commented on issue #7710: CTC ERROR WITH CUDA ILLEGAL MEMORY ACCESS ERROR
URL: https://github.com/apache/incubator-mxnet/issues/7710#issuecomment-328841747

@szha I found that the problem is not caused by the data, but perhaps by local variables in Python functions being collected by the gc, so that CUDA accesses illegal memory. After I added "gc.disable()" at the beginning of each epoch and "gc.enable()" at the end of each epoch, the problem has not occurred for at least 5 days. I think this may help you debug ...
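The gc.disable()/gc.enable() workaround from the comment above can be sketched with the standard-library gc module. This is only an illustration of the pattern, not the author's code: the names `gc_paused` and `train_one_epoch` are hypothetical. Note that gc.disable() only turns off the cyclic collector; reference counting still frees objects, so this is a diagnostic workaround and not a fix for the underlying cause.

```python
import gc
from contextlib import contextmanager


@contextmanager
def gc_paused():
    """Disable the cyclic garbage collector for the duration of the block."""
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        yield
    finally:
        # Restore the previous state even if the epoch raises.
        if was_enabled:
            gc.enable()


def train_one_epoch(epoch):
    """Placeholder for the real per-epoch training code (hypothetical)."""
    return epoch


for epoch in range(3):
    with gc_paused():
        train_one_epoch(epoch)
```

Wrapping the epoch in a context manager keeps the collector from staying disabled if an exception interrupts training.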
[GitHub] jiarenyf commented on issue #7710: CTC ERROR WITH CUDA ILLEGAL MEMORY ACCESS ERROR
jiarenyf commented on issue #7710: CTC ERROR WITH CUDA ILLEGAL MEMORY ACCESS ERROR
URL: https://github.com/apache/incubator-mxnet/issues/7710#issuecomment-327673283

@szha I found that adding forceResize to the imageIter kwargs solves the problem ... I also wonder why imageIter center-crops the image when the image size differs from the data shape, without offering an option to force-resize it ...
[GitHub] jiarenyf commented on issue #7710: CTC ERROR WITH CUDA ILLEGAL MEMORY ACCESS ERROR
jiarenyf commented on issue #7710: CTC ERROR WITH CUDA ILLEGAL MEMORY ACCESS ERROR
URL: https://github.com/apache/incubator-mxnet/issues/7710#issuecomment-326861804

@szha Link: http://pan.baidu.com/s/1o7Zpc06 Password: plmp

Sometimes it can train for several epochs, but eventually the problem always occurs ...
[GitHub] jiarenyf commented on issue #7710: CTC ERROR WITH CUDA ILLEGAL MEMORY ACCESS ERROR
jiarenyf commented on issue #7710: CTC ERROR WITH CUDA ILLEGAL MEMORY ACCESS ERROR
URL: https://github.com/apache/incubator-mxnet/issues/7710#issuecomment-326787837

@szha Link: http://pan.baidu.com/s/1o7Zpc06 Password: plmp

Sometimes it can train for several epochs, but eventually the problem always occurs ...
[GitHub] jiarenyf commented on issue #7710: CTC ERROR WITH CUDA ILLEGAL MEMORY ACCESS ERROR
jiarenyf commented on issue #7710: CTC ERROR WITH CUDA ILLEGAL MEMORY ACCESS ERROR
URL: https://github.com/apache/incubator-mxnet/issues/7710#issuecomment-326784827

@szha I installed 2 versions of mxnet on different PCs:
1. mxnet-cu80==0.10.0.post2 and "d220e192897da763ed8e8a6135f9e8b24cc1a2ce"
2. mxnet-cu80==0.11.0rc3 and "53274b4a2b0d73f3fbdb10cfb5f9ed0c8263fda7"
Both have the problem mentioned above ...