[GitHub] asitstands commented on issue #10575: [MXNET-145] Remove global PRNGs of numpy and python used in mx.image package

2018-04-17 Thread GitBox
asitstands commented on issue #10575: [MXNET-145] Remove global PRNGs of numpy 
and python used in mx.image package
URL: https://github.com/apache/incubator-mxnet/pull/10575#issuecomment-382063974
 
 
   I experimented with the following two functions. The first one takes 37s and 
the second one takes 28s to run the above `ImageIter` test, so the second one 
has the same performance as the implementation in master that uses the python 
RNG. The difference comes from the serialization imposed by the engine; I 
didn't expect it to cause such a large difference. To avoid the serialization 
we need an independent RNG that is not meant to be shared, but I think it 
should not be the global python/numpy RNG. I'll open a separate PR for this.
   ```c++
   // Faster than the sampling operator but still slow
   int rand(mx_float low, mx_float high, mx_float* out) {
     API_BEGIN();
     Context ctx = Context::CPU();
     Engine::VarHandle var = Engine::Get()->NewVariable();
     mxnet::Resource resource = mxnet::ResourceManager::Get()->Request(
         ctx, ResourceRequest::kRandom);
     Engine::Get()->PushSync([low, high, out, resource](RunContext rctx) {
         mshadow::Random<cpu, mx_float>* prnd =
             resource.get_random<cpu, mx_float>(rctx.get_stream<cpu>());
         mshadow::Tensor<cpu, 1, mx_float> tmp(out, mshadow::Shape1(1));
         prnd->SampleUniform(&tmp, low, high);
       }, ctx, {}, {var, resource.var},
       FnProperty::kNormal, 0, PROFILER_MESSAGE_FUNCNAME);
     Engine::Get()->WaitForVar(var);
     Engine::Get()->DeleteVariable([](mxnet::RunContext) {}, mxnet::Context(), var);
     API_END();
   }

   // Fast but not safe, as the RNG behind `resource` is shared
   int rand(mx_float low, mx_float high, mx_float* out) {
     API_BEGIN();
     Context ctx = Context::CPU();
     mxnet::Resource resource = mxnet::ResourceManager::Get()->Request(
         ctx, ResourceRequest::kRandom);
     mshadow::Random<cpu, mx_float>* prnd =
         resource.get_random<cpu, mx_float>(nullptr);
     mshadow::Tensor<cpu, 1, mx_float> tmp(out, mshadow::Shape1(1));
     prnd->SampleUniform(&tmp, low, high);
     API_END();
   }
   ```
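
   The second variant is unsafe because it draws from a shared generator 
without the engine's serialization. A minimal Python sketch (the names here 
are illustrative, not MXNet API) of the alternative suggested above — an 
independent per-consumer RNG that needs no serialization because nothing is 
shared:

   ```python
   import threading
   import numpy as np

   # Hypothetical sampler: each consumer owns a private RNG instance, so
   # concurrent draws never touch shared state and need no lock or
   # engine-level serialization.
   class ScalarUniform:
       def __init__(self, seed=0):
           self._rng = np.random.RandomState(seed)  # private, not shared

       def __call__(self, low=0.0, high=1.0):
           return float(self._rng.uniform(low, high))

   draws = {}

   def worker(name, seed):
       sampler = ScalarUniform(seed=seed)       # one RNG per thread
       draws[name] = [sampler() for _ in range(3)]

   threads = [threading.Thread(target=worker, args=("t%d" % i, i))
              for i in range(4)]
   for t in threads:
       t.start()
   for t in threads:
       t.join()
   ```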


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] asitstands commented on issue #10575: [MXNET-145] Remove global PRNGs of numpy and python used in mx.image package

2018-04-17 Thread GitBox
asitstands commented on issue #10575: [MXNET-145] Remove global PRNGs of numpy 
and python used in mx.image package
URL: https://github.com/apache/incubator-mxnet/pull/10575#issuecomment-381930133
 
 
   I'm not sure what causes the slowdown. If it is the overhead of mxnet's 
operator-calling mechanism, adding a simple function to the random number API 
that generates a scalar random number from mxnet's internal RNGs would be the 
solution. I'll test this.
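
   As a rough analogy for that overhead (a numpy illustration, not an MXNet 
measurement): drawing scalars one call at a time is dominated by per-call 
cost, which a single batched draw amortizes away.

   ```python
   import time
   import numpy as np

   rng = np.random.RandomState(0)
   n = 100_000

   # One call per scalar: per-call overhead dominates, analogous to
   # dispatching one engine operation per random number.
   t0 = time.time()
   scalars = [rng.uniform(0.0, 1.0) for _ in range(n)]
   per_call = time.time() - t0

   # One batched call amortizes the dispatch cost over all samples.
   t0 = time.time()
   batch = rng.uniform(0.0, 1.0, size=n)
   batched = time.time() - t0

   print("per-call: %.3fs  batched: %.3fs" % (per_call, batched))
   ```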


[GitHub] asitstands commented on issue #10575: [MXNET-145] Remove global PRNGs of numpy and python used in mx.image package

2018-04-17 Thread GitBox
asitstands commented on issue #10575: [MXNET-145] Remove global PRNGs of numpy 
and python used in mx.image package
URL: https://github.com/apache/incubator-mxnet/pull/10575#issuecomment-381877219
 
 
   Replacing the global python/numpy RNGs with an mxnet-specific instance of a 
numpy RNG would be better in terms of performance. However, I'm not sure that 
exposing that RNG as a public API is a good idea. Is it acceptable to keep a 
numpy RNG as an internal utility in `mx.random`, seeded by `mx.random.seed`?
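
   One way such an internal utility could look — a sketch under assumptions: 
the names `_rng`, `seed`, and `_uniform` below are hypothetical, not mxnet's 
actual implementation:

   ```python
   import numpy as np

   # Module-private numpy RNG, independent of the global np.random state.
   _rng = np.random.RandomState()

   def seed(seed_state):
       # The real mx.random.seed would also reseed the engine-side RNGs;
       # here it only reseeds the module-private numpy instance.
       _rng.seed(seed_state)

   def _uniform(low=0.0, high=1.0):
       # Internal helper for mx.image, replacing calls to the global RNGs.
       return float(_rng.uniform(low, high))

   seed(42)
   first = [_uniform() for _ in range(3)]
   seed(42)
   second = [_uniform() for _ in range(3)]   # reproducible after reseeding
   ```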


[GitHub] asitstands commented on issue #10575: [MXNET-145] Remove global PRNGs of numpy and python used in mx.image package

2018-04-17 Thread GitBox
asitstands commented on issue #10575: [MXNET-145] Remove global PRNGs of numpy 
and python used in mx.image package
URL: https://github.com/apache/incubator-mxnet/pull/10575#issuecomment-381871354
 
 
   A simple benchmark using `ImageIter` shows a significant performance 
degradation. On a Xeon E5-2680, the current master runs the following code in 
about 28s; with this PR applied the time increases to 42s, a 50% increase in 
running time.
   ```python
   start = time.time()
   data = mx.img.ImageIter(
       batch_size=32,
       data_shape=(3, 224, 224),
       path_imgrec='caltech.rec',  # 9144 image samples
       rand_crop=True,
       rand_resize=True,
       rand_mirror=True,
       rand_gray=0.5,
       brightness=0.5,
       contrast=0.5,
       saturation=0.5)
   for i in data:
       pass
   mx.nd.waitall()
   end = time.time()
   print("elapsed time: ", end - start)
   ```




[GitHub] asitstands commented on issue #10575: [MXNET-145] Remove global PRNGs of numpy and python used in mx.image package

2018-04-16 Thread GitBox
asitstands commented on issue #10575: [MXNET-145] Remove global PRNGs of numpy 
and python used in mx.image package
URL: https://github.com/apache/incubator-mxnet/pull/10575#issuecomment-381829264
 
 
   @piiswrong Currently I have no measurements. Is there any benchmark or 
example code that I can use? If not, I'll write my own.

