[GitHub] perdasilva commented on a change in pull request #12485: [WIP] test_ImageRecordIter_seed_augmentation flaky test fix

GitBox Sun, 04 Nov 2018 23:21:59 -0800

perdasilva commented on a change in pull request #12485: [WIP] 
test_ImageRecordIter_seed_augmentation flaky test fix
URL: https://github.com/apache/incubator-mxnet/pull/12485#discussion_r230650963


 ##########
 File path: src/io/iter_image_recordio_2.cc
 ##########
 @@ -518,6 +518,17 @@ inline unsigned ImageRecordIOParser2<DType>::ParseChunk(DType* data_dptr, real_t
       cv::Mat res;
       rec.Load(blob.dptr, blob.size);
       cv::Mat buf(1, rec.content_size, CV_8U, rec.content);
+
+      if (idx % 1000 == 0) {
+        if (param_.seed_aug.has_value()) {
+          LOG(INFO) << "aug seed: " << param_.seed_aug.value();
+        }
+        LOG(INFO) << "tid: " << tid << " idx: " << idx << " index: " << rec.image_index();
+      }
+      if (param_.seed_aug.has_value()) {
+        prnds_[tid]->seed(idx + param_.seed_aug.value());
 
 Review comment:
   There are two requirements here: setting the seed must yield reproducible 
results, and the image augmentation must still run in parallel. We need to 
reset the seed for each image because there is no guarantee that the same image 
will be processed by the same thread, or even that it will be the ith image 
processed by that thread on every run (or on different hardware, since the code 
figures out the number of threads to use for processing). Therefore, resetting 
the random number generator at the start of processing each image is the only 
way (at least that I could think of) to guarantee that, with a fixed seed, the 
same random distortions are applied to the same image in a multi-threaded 
environment, independent of the hardware being used. I hope this is clear. It's 
not the easiest topic to discuss in written form lol.
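
To make the argument concrete, here is a minimal sketch of the idea, not the actual MXNet code. `AugmentOne` and `AugmentAll` are hypothetical names, the "augmentation" is just one raw draw from a `std::mt19937`, and the worker loops run one after another instead of on real threads so the example stays dependency-free. The assumption is a strided split of images across workers; only the `seed(idx + seed_aug)` call mirrors the diff above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical stand-in for augmenting one image: optionally reseed the
// worker's generator from (seed_aug + idx), then take one raw draw as the
// "random distortion" applied to image idx.
inline std::uint32_t AugmentOne(std::mt19937& rng, std::uint32_t seed_aug,
                                std::size_t idx, bool reseed_per_image) {
  if (reseed_per_image) {
    rng.seed(static_cast<std::uint32_t>(seed_aug + idx));
  }
  return rng();
}

// Simulates num_workers workers, each with its own generator (analogous to
// prnds_[tid]), processing images in a strided pattern. The worker loops run
// sequentially here; real code would run them on separate threads.
inline std::vector<std::uint32_t> AugmentAll(std::size_t num_images,
                                             unsigned num_workers,
                                             std::uint32_t seed_aug,
                                             bool reseed_per_image) {
  assert(num_workers > 0);
  std::vector<std::uint32_t> out(num_images);
  for (unsigned tid = 0; tid < num_workers; ++tid) {
    // Without per-image reseeding, each worker is seeded once up front.
    std::mt19937 rng(seed_aug + tid);
    for (std::size_t idx = tid; idx < num_images; idx += num_workers) {
      out[idx] = AugmentOne(rng, seed_aug, idx, reseed_per_image);
    }
  }
  return out;
}
```

With `reseed_per_image` set to true, `AugmentAll(16, 1, 42u, true)` and `AugmentAll(16, 4, 42u, true)` return the same vector, because each image's draw depends only on its index and the seed. With it set to false, the value an image gets depends on which worker handled it and how many images that worker had already processed, so changing the worker count changes the result.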

