[
https://issues.apache.org/jira/browse/NIFI-1174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15013584#comment-15013584
]
Mark Payne commented on NIFI-1174:
----------------------------------
[~bbende] - I tried connecting GetTwitter to PutHBaseJSON. When I set the Row
Identifier to ${uuid}, all worked perfectly. When I instead set the Row
Identifier Field Name to "id", I got the error message:
{code}
2015-11-19 13:07:29,055 ERROR [Timer-Driven Process Thread-8]
org.apache.nifi.hbase.PutHBaseJSON
PutHBaseJSON[id=1662f7a6-f7b0-4157-a67a-80beca08c8b3] Invalid FlowFile
StandardFlowFileRecord[uuid=f55c5704-f039-4e46-80d1-f189ad5c160c,claim=StandardContentClaim
[resourceClaim=StandardResourceClaim[id=1447938319221-136, container=default,
section=136], offset=121236,
length=156],offset=0,name=1171666305633.json,size=156] missing table, row,
column familiy, or column qualifier; routing to failure
{code}
I also tried setting the field name to "id_str", since the JSON has two ID
fields: one that is numeric and one that is a string version. I got the same
result either way.
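To make the expectation concrete, here is a minimal sketch of the behavior being tested. It is plain Python rather than NiFi code, and the function name `extract_row_id`, the sample tweet, and the choice to drop the row field from the remaining columns are all assumptions made for illustration. It says nothing about how PutHBaseJSON is actually implemented.

```python
import json


def extract_row_id(flowfile_content, row_field_name):
    """Hypothetical helper: pull the row id out of a flat JSON document.

    Returns (row_id, columns), where columns holds every other top-level field.
    Raises ValueError if the named field is absent.
    """
    doc = json.loads(flowfile_content)
    if row_field_name not in doc:
        raise ValueError("missing row identifier field: " + row_field_name)
    # Assumption: numeric and string values are both acceptable row ids,
    # so the value is simply converted to text.
    row_id = str(doc[row_field_name])
    columns = {k: v for k, v in doc.items() if k != row_field_name}
    return row_id, columns


# Made-up sample with a numeric "id" and a string "id_str", as in the tweets.
tweet = '{"id": 667402, "id_str": "667402", "text": "hello"}'
row_id, columns = extract_row_id(tweet, "id")
```

Under these assumptions, both "id" and "id_str" would yield a usable row, which is why the "missing table, row, ..." error was unexpected.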
I am also concerned about the number of WARN log messages that are produced.
Since there are 4 or 5 different "complex" fields in the JSON, I see a lot of
warning messages indicating that those fields are not being transferred. I
would recommend that, rather than warning for each of those, we build up a
single message listing the fields that are not being sent and log only that
one message. Even then, though, we end up warning on each FlowFile. How do you
feel about having a property that allows the user to specify how to handle
objects that have "complex" fields (non-flat JSON)? Maybe provide 3 options:
Fail (route FlowFile to failure), Warn (log a warning), Ignore (just log at a
debug level)?
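A rough sketch of that idea follows, again in plain Python rather than NiFi code. The names `ComplexFieldStrategy`, `split_fields`, and `handle_document`, the message wording, and the decision to treat both nested objects and arrays as "complex" are assumptions made only to illustrate the proposal.

```python
import json
from enum import Enum


class ComplexFieldStrategy(Enum):
    """Hypothetical property values for handling non-flat JSON."""
    FAIL = "Fail"
    WARN = "Warn"
    IGNORE = "Ignore"


def split_fields(doc):
    """Separate flat fields from complex ones (assumed: dicts and lists)."""
    flat = {k: v for k, v in doc.items() if not isinstance(v, (dict, list))}
    complex_names = sorted(k for k in doc if k not in flat)
    return flat, complex_names


def handle_document(content, strategy):
    """Return (relationship, flat_fields, log_level, message).

    At most one message is built per document, naming every skipped field.
    """
    flat, skipped = split_fields(json.loads(content))
    if not skipped:
        return "success", flat, None, None
    message = "Skipping complex fields: " + ", ".join(skipped)
    if strategy is ComplexFieldStrategy.FAIL:
        return "failure", {}, "ERROR", message
    level = "WARN" if strategy is ComplexFieldStrategy.WARN else "DEBUG"
    return "success", flat, level, message


# Made-up document with two complex fields and two flat ones.
sample = '{"id_str": "1", "user": {"name": "a"}, "entities": {"urls": []}, "text": "hi"}'
result = handle_document(sample, ComplexFieldStrategy.WARN)
```

With Warn, the sample produces one message naming both skipped fields instead of one warning per field; with Ignore, the same message drops to debug level; with Fail, the document goes to failure.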
Otherwise, it works very well! Since I already had an HBase Client Service
created to test the PutHBaseCell, this was super simple to set up. Very nicely
done overall!
> Create a Put HBase processor that can put multiple cells
> --------------------------------------------------------
>
> Key: NIFI-1174
> URL: https://issues.apache.org/jira/browse/NIFI-1174
> Project: Apache NiFi
> Issue Type: Improvement
> Reporter: Bryan Bende
> Assignee: Bryan Bende
> Priority: Minor
> Attachments: NIFI-1174.patch
>
>
> We recently added a PutHBaseCell processor which works great for writing one
> individual cell at a time, but it can require a significant amount of work in
> a flow to create a row with multiple cells.
> We should support a variation of this processor that can accept a flow file
> with key/value pairs in the content of the flow file (possibly JSON). The
> key/value pairs would then be turned into the cells for the given row and
> added in one put operation.