Hello,

One of our collections really does not get along with CursorMark. Under very heavy load, the nodes occasionally consume gigabytes of additional heap, for no clear reason, immediately after the entire corpus has been downloaded.
The additional heap consumption is a separate problem, and I hope someone can shed some light on it, but there is another strange behaviour I would like to see explained. Under little load and with a batch size of just a few hundred, the download speed creeps along at 150 docs/s at most. But when I increase the batch size to absurd numbers such as 20k, the speed jumps to 2.5k docs/s, cutting the total time from days to just a few hours.

We only really see the heap and speed differences with one big collection of millions of small documents. They are just query, click and view logs with additional metadata fields such as time, digests, ranks, dates, uids, view time etc.

Can anyone here shed some light on these vague subjects?

Many thanks,
Markus
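
For reference, here is a minimal sketch of the kind of cursorMark download loop described above. It is not the actual client code. It assumes the uniqueKey field is called `id`, that the JSON response writer is used, and that `base_url` points at the collection; `fetch_page` and `download_all` are hypothetical helper names. The `rows` argument is the batch size being varied between a few hundred and 20k.

```python
import json
import urllib.parse
import urllib.request


def fetch_page(base_url, params):
    """Issue one /select request and return the decoded JSON response."""
    url = base_url + "/select?" + urllib.parse.urlencode(params)
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def download_all(base_url, rows, sort="id asc", fetch=fetch_page):
    """Yield every document in the collection using cursorMark paging.

    rows is the batch size per request. The sort must include the
    uniqueKey field (assumed here to be "id").
    """
    cursor = "*"  # start value for a new cursor
    while True:
        params = {
            "q": "*:*",
            "sort": sort,
            "rows": rows,
            "cursorMark": cursor,
            "wt": "json",
        }
        data = fetch(base_url, params)
        yield from data["response"]["docs"]
        next_cursor = data["nextCursorMark"]
        # The cursor stops advancing once the result set is exhausted.
        if next_cursor == cursor:
            break
        cursor = next_cursor
```

With a loop of this shape, the number of round trips is roughly the corpus size divided by `rows`, so any fixed per-request cost is paid far more often at a batch size of a few hundred than at 20k.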