[
https://issues.apache.org/jira/browse/FLINK-2055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16017665#comment-16017665
]
ASF GitHub Bot commented on FLINK-2055:
---------------------------------------
Github user nragon commented on the issue:
https://github.com/apache/flink/pull/2332
I've made a custom solution that works for my use cases. Note that the
attached code does not run as-is; it is only a skeleton.
This prototype uses asynchbase and tries to handle the throttling issues
mentioned above. It does this by limiting pending requests per client to 1000
(configurable, depending on HBase capacity and response times) and
skipping records once that threshold is reached. Each skipped record is stamped
with the system timestamp, and only the most recent skipped record is kept
for a later update.
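A minimal sketch of the skip-on-threshold idea described above, in plain Java with no asynchbase dependency. The class and method names (`SkippingThrottle`, `tryAcquire`, `release`) are hypothetical and are not taken from the attached HBaseSink.txt; the sketch also assumes the limit applies to requests that are still pending.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical helper: caps pending requests and remembers the latest skipped record. */
class SkippingThrottle<T> {

    /** A skipped record together with the system time at which it was skipped. */
    static final class Stamped<T> {
        final T record;
        final long timestamp;

        Stamped(T record, long timestamp) {
            this.record = record;
            this.timestamp = timestamp;
        }
    }

    private final int maxPending; // e.g. 1000, tuned to HBase capacity
    private final AtomicInteger pending = new AtomicInteger();
    private final AtomicReference<Stamped<T>> lastSkipped = new AtomicReference<>();

    SkippingThrottle(int maxPending) {
        this.maxPending = maxPending;
    }

    /**
     * Returns true if the caller may issue a request for this record.
     * Otherwise the record replaces any earlier skipped record and false is returned.
     */
    boolean tryAcquire(T record, long nowMillis) {
        while (true) {
            int current = pending.get();
            if (current >= maxPending) {
                lastSkipped.set(new Stamped<>(record, nowMillis));
                return false;
            }
            if (pending.compareAndSet(current, current + 1)) {
                return true;
            }
        }
    }

    /** Call from the request's completion callback; returns the remaining pending count. */
    int release() {
        return pending.decrementAndGet();
    }

    int pending() {
        return pending.get();
    }

    /** The most recently skipped record, or null if nothing has been skipped. */
    Stamped<T> lastSkipped() {
        return lastSkipped.get();
    }
}
```

In a sink, `tryAcquire(value, System.currentTimeMillis())` would guard each write, and `release()` would run in the callback attached to the asynchronous request.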
In my use case I always run a keyBy -> reduce before the sink, which keeps
the aggregation state, so every record passed to the HBase sink carries
the latest aggregated value from the previous operators. When all requests
are done (`pending == 0`), I compare the last skipped record with the last
requested record: if the skipped timestamp is less than the requested timestamp,
HBase already has the latest aggregation.
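The comparison can be sketched as a small pure function. The names here are hypothetical, and the "otherwise write the skipped record" branch is an assumption: the comment only states the case where HBase is already up to date.

```java
/** Hypothetical helper for the check performed once all requests have completed. */
final class SkipReconciler {

    private SkipReconciler() {}

    /**
     * Returns true if the most recently skipped record should still be written.
     *
     * @param pending         number of requests still pending
     * @param lastSkippedTs   timestamp of the latest skipped record, or null if none was skipped
     * @param lastRequestedTs timestamp of the latest record actually sent to HBase
     */
    static boolean needsFlush(int pending, Long lastSkippedTs, long lastRequestedTs) {
        if (pending != 0) {
            return false; // requests still pending; decide once they finish
        }
        if (lastSkippedTs == null) {
            return false; // nothing was skipped, so nothing is missing
        }
        // Skipped before the last request: HBase already holds a newer aggregate.
        // Assumption: otherwise the skipped record is newer and must be written.
        return lastSkippedTs >= lastRequestedTs;
    }
}
```

Because the upstream reduce emits the full aggregate each time, writing only the latest skipped record is enough to bring HBase up to date; the intermediate skipped values carry no extra information.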
There is plenty of room for improvement; I just didn't have the time.
[HBaseSink.txt](https://github.com/apache/flink/files/1014991/HBaseSink.txt)
> Implement Streaming HBaseSink
> -----------------------------
>
> Key: FLINK-2055
> URL: https://issues.apache.org/jira/browse/FLINK-2055
> Project: Flink
> Issue Type: New Feature
> Components: Streaming Connectors
> Affects Versions: 0.9
> Reporter: Robert Metzger
> Assignee: Erli Ding
>
> As per:
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Write-Stream-to-HBase-td1300.html
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)