Github user MikeThomsen commented on the issue:

    https://github.com/apache/nifi/pull/2294
  
    @mgaido91 I added the charset. I can't believe I missed that...
    
    WRT the batching, I stand by my opinion that we need a sane default there, 
because there has to be a way to ensure that someone doesn't accidentally (or 
on purpose) send HBase an operation that is too big all at once.
    
    To me this is not a theoretical issue, because I ran into something like 
this with PutHBaseRecord while doing genomic data ingestion with NiFi. The data 
set would easily generate 10B, if not 20-25B, tiny writes (a few dozen bytes 
each). I had to really scale back the size of each record set I was sending to 
PutHBaseRecord because it was easy to generate so many Puts that they would 
hammer a region offline unexpectedly.
    
    I'm not an HBase expert by any means, but based on my experience with 
putting a lot of small writes, it seems like a recipe for trouble (and Delete 
objects are tiny writes).
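
    As a rough illustration of the batching idea above, here is a minimal 
sketch of capping how many operations go out in one call. It is not the code 
in this PR: `BatchSketch`, `forEachBatch`, and the `sink` callback are 
hypothetical names, and the sketch assumes the pending operations are already 
collected in a `List`.

```java
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper for illustration only; not part of NiFi or the HBase client.
final class BatchSketch {
    private BatchSketch() {}

    /**
     * Hands `items` to `sink` in consecutive slices of at most `batchSize`
     * elements, so that no single call receives the whole list at once.
     */
    static <T> void forEachBatch(List<T> items, int batchSize, Consumer<List<T>> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            // subList returns a view of the original list, so nothing is copied here.
            sink.accept(items.subList(start, end));
        }
    }
}
```

    In a processor, the `sink` would presumably forward each slice to the 
HBase client (for example via `Table.delete(List<Delete>)`), with `batchSize` 
coming from a processor property that has a conservative default. The property 
and its default value are assumptions here, not something this comment 
specifies.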

