Re: NiFi 1.5.0 HBase_1_1_2_ClientService performance bug

Mike Thomsen Fri, 09 Feb 2018 16:11:06 -0800

Adam,

If you're doing bulk ingestion of JSON, I would recommend using
PutHBaseRecord. I wrote and contributed it when my team ran into similar
limitations doing genomic data ingestion (several tens of billions of Puts
from the 1000 Genomes Project). If you run into problems with it, just post
them and poke me.


Mike

On Fri, Feb 9, 2018 at 6:56 PM, Joe Witt <[email protected]> wrote:

> adam
>
> thanks for reporting and if you can do a contrib that would be great!
>
> thanks
> joe
>
> On Feb 9, 2018 6:56 PM, "Martini, Adam" <[email protected]> wrote:
>
> > Hello NiFi Dev Community,
> >
> > This commit (part of the NiFi 1.5.0 release) created serious
> > performance issues for HBase Put operations:
> > "116c8463428c1fb51bfb7a8adfcf23c32fded964".
> >
> > The override of the “toTransitUri” method makes a call to
> > “connection.getAdmin().getClusterStatus().getMaster().getHostAndPort()”
> > upon every flow file transfer, which essentially doubles the traffic
> > through the HBase connector.  The performance of our PutHBaseJSON
> > processor dropped to 1/3 after deploying NiFi 1.5.0.
> >
> > Please let us know a timeline for a fix.  We are building and testing our
> > own tarball in the interim to fix the issue and are happy to contribute
> > our code back to the project if you would like.
> >
> > All the best and thank you.
> >
> > Adam Martini
> > Senior Developer, Nike Digital
> >
> >
> >
>
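
The report quoted above describes a remote lookup being made once per flow file while building the transit URI. As an illustration only, and not the actual NiFi patch, here is a minimal Java sketch of one way to avoid that: look the master address up once and reuse the cached value. Every name in it (TransitUriCacheSketch, CachedLookup, fetchMasterAddress, the example host, and the URI layout) is hypothetical; fetchMasterAddress merely stands in for the connection.getAdmin().getClusterStatus().getMaster().getHostAndPort() call, and no HBase or NiFi classes are used.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class TransitUriCacheSketch {

    /** Caches the first non-null value returned by an expensive lookup. */
    static final class CachedLookup<T> implements Supplier<T> {
        private final Supplier<T> delegate;
        private volatile T value;

        CachedLookup(Supplier<T> delegate) {
            this.delegate = delegate;
        }

        @Override
        public T get() {
            T v = value;
            if (v == null) {
                synchronized (this) {
                    v = value;
                    if (v == null) {
                        // Only the first caller pays for the remote lookup.
                        v = delegate.get();
                        value = v;
                    }
                }
            }
            return v;
        }

        /** Drops the cached value, e.g. after a reconnect (assumed policy). */
        void invalidate() {
            synchronized (this) {
                value = null;
            }
        }
    }

    /** Counts how often the simulated remote call is made. */
    static final AtomicInteger remoteCalls = new AtomicInteger();

    /** Hypothetical stand-in for the cluster-status call from the report. */
    static String fetchMasterAddress() {
        remoteCalls.incrementAndGet();
        return "hbase-master.example.com:16000";
    }

    /** Builds a transit URI; the layout here is an assumption. */
    static String toTransitUri(Supplier<String> masterAddress,
                               String table, String rowKey) {
        return "hbase://" + masterAddress.get() + "/" + table + "/" + rowKey;
    }

    public static void main(String[] args) {
        CachedLookup<String> master =
                new CachedLookup<>(TransitUriCacheSketch::fetchMasterAddress);
        for (int i = 0; i < 1000; i++) {
            toTransitUri(master, "events", "row-" + i);
        }
        System.out.println("remote calls: " + remoteCalls.get());
    }
}
```

With the cache in place, the simulated remote call runs once no matter how many transit URIs are built. A real service would also have to decide when to invalidate the cached address, for example when the connection is re-established or the master fails over.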
