Hi,
Yes, I tried fetching around 40 million rows. It took a while, but the query completed. I'll try the Avro approach.

How do I break the select into multiple parts? Could you briefly explain the partition flow to start with?

Thanks,
Mohit

From: Shawn Weeks <[email protected]>
Sent: 27 June 2018 18:51
To: [email protected]
Subject: Re: SelectHiveQl gets stuck when query table containning 12 Billion rows

It's probably not stuck doing nothing; using a JDBC connection to fetch 12 billion rows is going to be painful no matter what you do. At those kinds of sizes you're probably better off having Hive create a temporary table in Avro format and then consuming the Avro files from HDFS into NiFi. The largest number of rows I've pulled into NiFi via JDBC in a single query is around 10-20 million, and that took a long time.

You can also try breaking the select into multiple parts and running them simultaneously. I've done something similar where I first ran a query to get all of the partitions and then executed a select for each partition in parallel.

Thanks
Shawn

_____

From: Mohit <[email protected]>
Sent: Wednesday, June 27, 2018 8:14:25 AM
To: [email protected]
Subject: SelectHiveQl gets stuck when query table containning 12 Billion rows

Hi all,

I'm trying to fetch data from Hive using SelectHiveQL. It works fine for small to medium-sized tables, but when I try to fetch data from a large table with around 12 billion rows, it gets stuck for hours and appears to do nothing. I have set the Max Rows Per Flow File property to 10 million. We have a 4-node NiFi cluster with 150 GB of RAM on each node.

Is there any configuration that needs to be changed to make this work?

Regards,
Mohit
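
For reference, the partition-based approach Shawn describes in the thread above (list the partitions first, then run one select per partition) could be sketched roughly as follows. This is only an illustration of the query-splitting step, not the flow Shawn actually built. It assumes the partition specs arrive as lines in the `col=value/col=value` form that Hive's `SHOW PARTITIONS` prints, that partition values can be compared as string literals, and that values contain no specially encoded characters. The table name `events` and both function names are hypothetical.

```python
from typing import Iterable, List


def partition_to_predicate(spec: str) -> str:
    """Convert one partition spec, e.g. 'dt=2018-06-27/region=eu',
    into a WHERE-clause predicate on the partition columns."""
    clauses = []
    for part in spec.strip().split("/"):
        # Split only on the first '=' so values containing '=' survive.
        column, _, value = part.partition("=")
        # Assumption: values are compared as quoted string literals.
        escaped = value.replace("'", "\\'")
        clauses.append(f"`{column}` = '{escaped}'")
    return " AND ".join(clauses)


def build_partition_queries(table: str, partition_specs: Iterable[str]) -> List[str]:
    """Build one SELECT per partition so each can be run independently."""
    return [
        f"SELECT * FROM {table} WHERE {partition_to_predicate(spec)}"
        for spec in partition_specs
        if spec.strip()  # skip blank lines in the partition listing
    ]


if __name__ == "__main__":
    # Hypothetical partition listing for a hypothetical table 'events'.
    specs = ["dt=2018-06-26", "dt=2018-06-27/region=eu"]
    for query in build_partition_queries("events", specs):
        print(query)
```

Each generated statement can then be handed to its own SelectHiveQL execution, so the per-partition selects run in parallel across the cluster instead of as one 12-billion-row fetch.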
