[
https://issues.apache.org/jira/browse/NIFI-3538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16337448#comment-16337448
]
ASF GitHub Bot commented on NIFI-3538:
--------------------------------------
Github user MikeThomsen commented on the issue:
https://github.com/apache/nifi/pull/2294
@mgaido91 I added the charset. I can't believe I missed that...
WRT the batching, I stand by my opinion that we need a sane default there,
because there needs to be a way to ensure someone doesn't accidentally (or on
purpose) send HBase an operation that is too big to handle at once.
To me this is not a theoretical issue, because I ran into something like
this with PutHBaseRecord while doing genomic data ingestion w/ NiFi. The data
set would easily generate 10B, if not 20-25B, tiny (a few dozen bytes each)
writes. I had to really scale back the size of each record set I was sending
to PutHBaseRecord because it was easy to generate so many Puts that it would
hammer a region offline unexpectedly.
I'm not an HBase expert by any means, but an uncapped batch size seems like a
recipe for trouble based on my experience with sending a lot of small writes
(and Delete objects are tiny writes).
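To make the batching idea concrete, here is a rough sketch of what I mean by
a capped batch. It is not the code in the PR: the class name, the method name
and the default of 50 are all made up for illustration, and the sink is a
plain callback so the snippet runs without an HBase cluster. In a real
processor the sink would presumably hand each chunk to the HBase client, e.g.
Table.delete(List<Delete>).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class BatchedDeletes {
    // Hypothetical default; the right value would need tuning per cluster.
    static final int DEFAULT_BATCH_SIZE = 50;

    /**
     * Splits items into chunks of at most batchSize and hands each chunk
     * to the sink, so no single call carries an unbounded number of
     * operations. Returns the number of sink calls made.
     */
    static <T> int flushInBatches(List<T> items, int batchSize,
                                  Consumer<List<T>> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int calls = 0;
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            // Copy the view so the sink may keep or modify its chunk.
            sink.accept(new ArrayList<>(items.subList(start, end)));
            calls++;
        }
        return calls;
    }
}
```

With 125 row keys and a batch size of 50, the sink is called three times with
50, 50 and 25 items, instead of once with all 125.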
> Add DeleteHBase processor(s)
> ----------------------------
>
> Key: NIFI-3538
> URL: https://issues.apache.org/jira/browse/NIFI-3538
> Project: Apache NiFi
> Issue Type: New Feature
> Components: Extensions
> Reporter: Matt Burgess
> Assignee: Mike Thomsen
> Priority: Major
>
> NiFi currently has processors for storing and retrieving cells/rows in HBase,
> but there is no mechanism for deleting records and/or tables.
> I'm not sure if a single DeleteHBase processor could accomplish both, that
> can be discussed under this Jira (and can be split out if necessary).
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)