[GitHub] asitstands commented on issue #10575: [MXNET-145] Remove global PRNGs of numpy and python used in mx.image package
asitstands commented on issue #10575: [MXNET-145] Remove global PRNGs of numpy and python used in mx.image package
URL: https://github.com/apache/incubator-mxnet/pull/10575#issuecomment-382063974

I experimented with the following two functions. The first one takes 37s and the second one 28s to run the `ImageIter` test above, so the second one matches the performance of the implementation in master that uses the python RNG. The difference comes from the serialization imposed by the engine; I didn't expect it to cause such a large gap. To avoid the serialization we need an independent RNG that is not intended for sharing, but I think it should not be the global python/numpy RNG. I'll open a separate PR for this.

```c++
// Faster than the sampling operator, but still slow
int rand(mx_float low, mx_float high, mx_float* out) {
  API_BEGIN();
  Context ctx = Context::CPU();
  Engine::VarHandle var = Engine::Get()->NewVariable();
  mxnet::Resource resource = mxnet::ResourceManager::Get()->Request(
      ctx, ResourceRequest::kRandom);
  Engine::Get()->PushSync([low, high, out, resource](RunContext rctx) {
      mshadow::Random<cpu, float>* prnd =
          resource.get_random<cpu, float>(rctx.get_stream<cpu>());
      mshadow::Tensor<cpu, 1, float> tmp(out, mshadow::Shape1(1));
      prnd->SampleUniform(&tmp, low, high);
    }, ctx, {}, {var, resource.var},
    FnProperty::kNormal, 0, PROFILER_MESSAGE_FUNCNAME);
  Engine::Get()->WaitForVar(var);
  Engine::Get()->DeleteVariable([](mxnet::RunContext) {}, mxnet::Context(), var);
  API_END();
}

// Fast, but not safe as `resource` is shared
int rand(mx_float low, mx_float high, mx_float* out) {
  API_BEGIN();
  Context ctx = Context::CPU();
  mxnet::Resource resource = mxnet::ResourceManager::Get()->Request(
      ctx, ResourceRequest::kRandom);
  mshadow::Random<cpu, float>* prnd = resource.get_random<cpu, float>(nullptr);
  mshadow::Tensor<cpu, 1, float> tmp(out, mshadow::Shape1(1));
  prnd->SampleUniform(&tmp, low, high);
  API_END();
}
```

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

With regards,
Apache Git Services
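The serialization cost described above can be modeled in a few lines of plain Python. This is a toy sketch, not MXNet code: `ToyEngine`, `rand_via_engine`, and `rand_direct` are illustrative names. Each "engine" draw pushes a task onto a single worker thread and blocks until it finishes (analogous to `PushSync` followed by `WaitForVar`), while the "direct" draw hits a shared RNG with no synchronization, mirroring the second C++ variant.

```python
import queue
import random
import threading

class ToyEngine:
    """Single worker thread that executes pushed tasks in order."""

    def __init__(self):
        self._tasks = queue.Queue()
        worker = threading.Thread(target=self._run, daemon=True)
        worker.start()

    def _run(self):
        while True:
            fn, done = self._tasks.get()
            fn()
            done.set()

    def push_sync(self, fn):
        # Enqueue and block until the worker has run fn,
        # analogous to Engine::PushSync + Engine::WaitForVar.
        done = threading.Event()
        self._tasks.put((fn, done))
        done.wait()

_engine = ToyEngine()
_rng = random.Random(0)

def rand_via_engine(low, high):
    # Every scalar draw pays a full enqueue/wakeup round trip.
    out = []
    _engine.push_sync(lambda: out.append(_rng.uniform(low, high)))
    return out[0]

def rand_direct(low, high):
    # Fast, but unsafe if several threads call it concurrently.
    return _rng.uniform(low, high)
```

Timing many calls of each function makes the per-call synchronization overhead visible, which is the same effect that separates the 37s and 28s runs above.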
[GitHub] asitstands commented on issue #10575: [MXNET-145] Remove global PRNGs of numpy and python used in mx.image package
asitstands commented on issue #10575: [MXNET-145] Remove global PRNGs of numpy and python used in mx.image package
URL: https://github.com/apache/incubator-mxnet/pull/10575#issuecomment-381930133

I'm not sure what causes the slowdown. If it is the overhead of mxnet's operator-calling mechanism, the solution would be to add a simple function to the random number API that generates a scalar random number using mxnet's internal RNGs. I'll test this.
[GitHub] asitstands commented on issue #10575: [MXNET-145] Remove global PRNGs of numpy and python used in mx.image package
asitstands commented on issue #10575: [MXNET-145] Remove global PRNGs of numpy and python used in mx.image package
URL: https://github.com/apache/incubator-mxnet/pull/10575#issuecomment-381877219

Replacing the python/numpy global RNGs with an mxnet-specific instance of a numpy RNG would be better in terms of performance. However, I'm not sure that exposing that RNG as a public API is a good idea. Is it acceptable to keep a numpy RNG as an internal utility in `mx.random` and seed it with `mx.random.seed`?
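The proposal could look roughly like the sketch below. This is hypothetical, not MXNet source: the stdlib `random.Random` stands in for `numpy.random.RandomState`, and `_py_rng`, `seed`, and `uniform_scalar` are illustrative names. The point is that `mx.random.seed` would also seed a module-private RNG that mx.image draws from, leaving the global python/numpy state untouched.

```python
import random

# Module-private RNG, a stand-in for a dedicated numpy.random.RandomState.
# User code never touches it directly; only mx.random.seed reseeds it.
_py_rng = random.Random()

def seed(seed_state):
    # In MXNet this call would also seed the backend device RNGs.
    _py_rng.seed(seed_state)

def uniform_scalar(low=0.0, high=1.0):
    # Internal helper for mx.image-style augmentation decisions.
    return _py_rng.uniform(low, high)
```

Because the RNG is private to the module, reseeding `mx.random` makes augmentation reproducible without perturbing a user's own `random`/`numpy.random` streams.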
[GitHub] asitstands commented on issue #10575: [MXNET-145] Remove global PRNGs of numpy and python used in mx.image package
asitstands commented on issue #10575: [MXNET-145] Remove global PRNGs of numpy and python used in mx.image package
URL: https://github.com/apache/incubator-mxnet/pull/10575#issuecomment-381871354

A simple benchmark using `ImageIter` shows a significant performance degradation. On a Xeon E5-2680, the current master runs the following code in about 28s. With this PR applied the time increases to 42s, a 50% increase in running time.

```python
start = time.time()
data = mx.img.ImageIter(
    batch_size=32,
    data_shape=(3, 224, 224),
    path_imgrec='caltech.rec',  # 9144 image samples
    rand_crop=True,
    rand_resize=True,
    rand_mirror=True,
    rand_gray=0.5,
    brightness=0.5,
    contrast=0.5,
    saturation=0.5)
for i in data:
    pass
mx.nd.waitall()
end = time.time()
print("elapsed time: ", end - start)
```
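Since the reported gap (28s vs. 42s) comes from single runs, averaging a few repetitions helps separate a real regression from run-to-run noise. A minimal stdlib helper for that (`time_runs` is an illustrative name; `run` would be one full `ImageIter` pass wrapped in a zero-argument callable):

```python
import time

def time_runs(run, repeats=3):
    """Time `run()` `repeats` times and return the mean elapsed seconds."""
    elapsed = []
    for _ in range(repeats):
        start = time.perf_counter()
        run()
        elapsed.append(time.perf_counter() - start)
    return sum(elapsed) / len(elapsed)
```

With means in hand, a ~50% increase is clearly outside normal variance for a 28-second workload.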
[GitHub] asitstands commented on issue #10575: [MXNET-145] Remove global PRNGs of numpy and python used in mx.image package
asitstands commented on issue #10575: [MXNET-145] Remove global PRNGs of numpy and python used in mx.image package
URL: https://github.com/apache/incubator-mxnet/pull/10575#issuecomment-381829264

@piiswrong Currently I have no measurements. Is there any benchmark or example code that I can use? If not, I'll write my own.