If you're doing bulk ingestion of JSON, I would recommend using
PutHBaseRecord. I wrote and contributed it when my team ran into similar
limitations doing genomic data ingestion (tens of billions of Puts from
the 1000 Genomes Project). If you run into problems with it, just post
them and poke me.
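On the regression itself: the cost described below comes from resolving the
master address on every flow file. A minimal sketch of one way to avoid that,
by caching the lookup and reusing it, might look like the following. This is
not the actual NiFi patch; CachedLookup, TransitUriCacheDemo, the host name,
and the URI layout are all hypothetical, and the lambda only stands in for the
real connection.getAdmin().getClusterStatus().getMaster().getHostAndPort()
call.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical helper: caches an expensive lookup until invalidate() is called. */
final class CachedLookup<T> {
    private final Supplier<T> loader;
    private T value;        // guarded by "this"
    private boolean loaded; // guarded by "this"

    CachedLookup(Supplier<T> loader) {
        this.loader = loader;
    }

    /** Runs the loader on first use, then returns the cached value. */
    synchronized T get() {
        if (!loaded) {
            value = loader.get();
            loaded = true;
        }
        return value;
    }

    /** Drops the cached value, e.g. after a connection reset or master failover. */
    synchronized void invalidate() {
        loaded = false;
        value = null;
    }
}

final class TransitUriCacheDemo {
    public static void main(String[] args) {
        AtomicInteger remoteCalls = new AtomicInteger();

        // Stand-in for the cluster-status round trip (assumption: the real
        // call would go through the HBase Admin API instead).
        CachedLookup<String> masterAddress = new CachedLookup<>(() -> {
            remoteCalls.incrementAndGet();
            return "hbase-master.example.com:16000";
        });

        // Simulate 1000 flow file transfers, each needing a transit URI.
        for (int i = 0; i < 1000; i++) {
            // Illustrative URI layout only, not necessarily NiFi's format.
            String transitUri = "hbase://" + masterAddress.get() + "/my_table/row" + i;
        }

        System.out.println("remote calls: " + remoteCalls.get()); // prints "remote calls: 1"
    }
}
```

With something like this, the lookup happens once per connection rather than
once per flow file, and invalidate() would need to be called whenever the
connection is rebuilt so a master failover is eventually picked up.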
On Fri, Feb 9, 2018 at 6:56 PM, Joe Witt <joe.w...@gmail.com> wrote:
> thanks for reporting and if you can do a contrib that would be great!
> On Feb 9, 2018 6:56 PM, "Martini, Adam" <adam.mart...@nike.com> wrote:
> > Hello NiFi Dev Community,
> > This commit (part of the NiFi 1.5.0 release) created serious
> > performance issues for HBase Put operations:
> > 116c8463428c1fb51bfb7a8adfcf23c32fded964.
> > The override of the “toTransitUri” method makes a call to
> > “connection.getAdmin().getClusterStatus().getMaster().getHostAndPort()”
> > upon every flow file transfer, which essentially doubles the traffic
> > through the HBase connector. The performance of our PutHBaseJSON
> > processor dropped to 1/3 of its previous level after deploying NiFi 1.5.0.
> > Please let us know a timeline for a fix. We are building and testing our
> > own tar ball in the interim to fix the issue and are happy to contribute
> > our code back to the project if you would like.
> > All the best and thank you.
> > Adam Martini
> > Senior Developer, Nike Digital