Github user bbende commented on the issue:
https://github.com/apache/nifi/pull/2478
@bdesert Thanks for the updates. I was reviewing the code again and I think
we need to change the way the `ScanHBaseResultHandler` works...
Currently it adds rows to an in-memory list until the bulk size is reached, and
since the bulk size defaults to 0, in the default case the bulk size is never
reached and all the rows are left as "hanging" rows. This means that if someone
scans a table with 1 million rows, all 1 million rows will be held in memory
before being written to the flow file, which would not be good for memory usage.
We should be able to write row by row to the flow file and never add rows to
a list. Inside the handler we can use `session.append(flowFile, (out) ->` to
append one row at a time to the flow file. I think we can then do away with the
"hanging rows" concept entirely, because nothing will be buffered in memory.
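To illustrate the idea, here is a minimal, self-contained sketch of the streaming pattern; `appendToFlowFile` and `handleRow` are hypothetical stand-ins for NiFi's `ProcessSession#append(FlowFile, OutputStreamCallback)` and the result handler's callback, not the actual NiFi API:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.function.Consumer;

// Hypothetical sketch of a streaming result handler. In NiFi the
// content would live in the content repository managed by the
// ProcessSession; the ByteArrayOutputStream here is just a stand-in
// so the example runs on its own.
public class StreamingScanSketch {

    // Stand-in for the flow file's content stream.
    final ByteArrayOutputStream flowFileContent = new ByteArrayOutputStream();

    // Analogous to session.append(flowFile, (out) -> { ... }): the
    // callback writes directly to the content, so each row goes out
    // immediately instead of accumulating in a list.
    void appendToFlowFile(Consumer<OutputStream> writer) {
        writer.accept(flowFileContent);
    }

    // Invoked once per scanned row; note it never buffers rows.
    void handleRow(byte[] rowKey, byte[] rowValue) {
        appendToFlowFile(out -> {
            try {
                out.write(rowKey);
                out.write(':');
                out.write(rowValue);
                out.write('\n');
            } catch (IOException e) {
                throw new RuntimeException(e);
            }
        });
    }

    public static void main(String[] args) {
        StreamingScanSketch sketch = new StreamingScanSketch();
        sketch.handleRow("row1".getBytes(StandardCharsets.UTF_8),
                         "a".getBytes(StandardCharsets.UTF_8));
        sketch.handleRow("row2".getBytes(StandardCharsets.UTF_8),
                         "b".getBytes(StandardCharsets.UTF_8));
        System.out.print(sketch.flowFileContent.toString(StandardCharsets.UTF_8));
    }
}
```

With this shape, memory usage stays flat regardless of how many rows the scan returns, since only one row's bytes are in flight at a time.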
---