jfrazee commented on pull request #4482: URL: https://github.com/apache/nifi/pull/4482#issuecomment-681178575
@mattyb149 I went through this pretty closely and I think it looks good. Both the Probabilistic and Reservoir sampling strategies produce the right distributions over 100k files. Two things I wanted to verify: (1) since the reservoir sampling is in-memory (for good reasons), setting the reservoir size too high will be problematic? and (2) you've opted not to do these as controller services because that's potentially more complicated than most users would need/want?
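For context on point (1), here is a minimal sketch of reservoir sampling (Algorithm R) in Java. It is not the code from this PR; the class name `ReservoirSketch`, the method `sample`, and the parameter `k` are hypothetical names chosen for illustration. It only shows why the configured size bounds memory use: the reservoir keeps up to `k` items in memory for the whole pass over the input.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Random;

// Hypothetical illustration of Algorithm R; not the implementation in this PR.
final class ReservoirSketch {

    // Returns up to k items chosen uniformly at random from the stream.
    // The reservoir list holds at most k items, so memory grows with k,
    // not with the length of the stream.
    static <T> List<T> sample(Iterator<T> stream, int k, Random rng) {
        if (k < 0) {
            throw new IllegalArgumentException("k must be non-negative");
        }
        List<T> reservoir = new ArrayList<>();
        long seen = 0;
        while (stream.hasNext()) {
            T item = stream.next();
            seen++;
            if (reservoir.size() < k) {
                // Fill phase: the first k items are always kept.
                reservoir.add(item);
            } else {
                // Replacement phase: draw j approximately uniformly from [0, seen)
                // and overwrite slot j only when it falls inside the reservoir,
                // which happens with probability k / seen.
                long j = (long) (rng.nextDouble() * seen);
                if (j < k) {
                    reservoir.set((int) j, item);
                }
            }
        }
        return reservoir;
    }
}
```

Calling `ReservoirSketch.sample(records.iterator(), 100, new Random())` on some hypothetical `records` collection would keep at most 100 items resident, whereas a very large `k` keeps that many items on the heap until the pass finishes.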
