markobean commented on PR #9617:
URL: https://github.com/apache/nifi/pull/9617#issuecomment-2581391940

   > > I would like the submitted implementation to be considered over EL 
unless there is a strong argument against it. Why do you prefer EL?
   > 
   > I will also note that the current test case that runs the processor 1000 
times in an effort to produce files of expected sizes highlights the problem 
with the non-deterministic nature of the current approach. Such a test is prone 
to inconsistent behavior. That is something that could be addressed separately, but 
it points back to another reason for going with EL.
   
   The same issue exists for the EL solution. Neither approach strictly requires 
1,000 iterations. But because both ultimately rely on a pseudo-random number 
generator, I thought it wise to keep the number of iterations relatively large. 
It's not a compute-intensive processor, so iterating this many times is 
acceptable. On my system, a single iteration takes ~150ms; 1,000 iterations take 
~250ms. It's a small delta for a dramatic increase in data. This ensures a 
very high likelihood that the full spectrum of allowed data sizes is covered. 
And, the asserts will continue to work even if the full range is not exercised. 
For example, say the range is intended to be 1-10 bytes, but the test only 
achieves 2-9 byte files. The absence of 1 and 10 will not cause the unit test 
to fail; it just never reached the edge cases. The only way to fail is if all 
1,000 executions happen to generate the exact same-sized file. Not going to 
happen in our lifetime... unless the randomizer is broken. ;) 
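
   To make the reasoning concrete, here is a rough sketch of the assertion logic described above. It is not the actual test in the PR: the class name, `nextSize`, and `collectSizes` are hypothetical, and `java.util.Random` stands in for whatever the processor uses to pick a size.

```java
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

public class RandomSizeRangeSketch {

    // Hypothetical stand-in for the processor: picks a size in [min, max], inclusive.
    static int nextSize(Random random, int min, int max) {
        return min + random.nextInt(max - min + 1);
    }

    // Runs the stand-in `iterations` times and returns the distinct sizes observed.
    // Fails immediately if any size falls outside [min, max].
    static Set<Integer> collectSizes(Random random, int min, int max, int iterations) {
        Set<Integer> seen = new HashSet<>();
        for (int i = 0; i < iterations; i++) {
            int size = nextSize(random, min, max);
            if (size < min || size > max) {
                throw new AssertionError("size out of range: " + size);
            }
            seen.add(size);
        }
        return seen;
    }

    public static void main(String[] args) {
        Set<Integer> seen = collectSizes(new Random(), 1, 10, 1000);
        // Missing edge values (1 or 10) do not fail the check;
        // only a complete lack of variation does.
        if (seen.size() < 2) {
            throw new AssertionError("every run produced the same size");
        }
        System.out.println("distinct sizes observed: " + seen.size());
    }
}
```

   Assuming a uniform draw over the 10 allowed sizes, the chance that all 1,000 runs land on the same size is 10 × (1/10)^1000 = 10^-999, which is why the variation check is effectively safe.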
   


