Hi Clemens,

The reason is here:
https://github.com/apache/kudu/blob/master/java/kudu-mapreduce/src/main/java/org/apache/kudu/mapreduce/KuduTableInputFormat.java#L72
I'd still be worried about that. You can look at what HBase does for inspiration in fixing this:
https://github.com/apache/hbase/blob/master/hbase-mapreduce/src/main/java/org/apache/hadoop/hbase/mapreduce/TableInputFormatBase.java#L238

Thanks,

J-D

On Wed, Dec 27, 2017 at 6:01 AM, Clemens Valiente <[email protected]> wrote:

> Hi,
>
> I hope everyone had a nice long holiday weekend!
>
> I have a question regarding the kudu-mapreduce package, in particular this
> line in the KuduTableInputFormat where getSplits() shuts down our Kudu
> client:
>
> https://github.com/cloudera/kudu/blob/master/java/kudu-mapreduce/src/main/java/org/apache/kudu/mapreduce/KuduTableInputFormat.java#L164
>
> Is there a reason for shutting down the client here?
>
> This does not work with Hive:
>
> In FetchInputFormatSplit, Hive uses the same InputFormat for fetching the
> splits and getting the recordReader (in our case, it is the
> KuduTableInputFormat.TableRecordReader).
>
> If Hive then tries to initialize that record reader, it runs into an error
> here:
>
> https://github.com/cloudera/kudu/blob/master/java/kudu-mapreduce/src/main/java/org/apache/kudu/mapreduce/KuduTableInputFormat.java#L397
>
> since the TableRecordReader uses the same client of the
> KuduTableInputFormat that was already shut down by getSplits()
>
> Since the client is already shut down by the close() method on the
> KuduTableInputFormat, I don't see a need to do the same in getSplits()
>
> If there are no objections or a reason for keeping the shutdown call
> there, I would open a ticket and submit a patch for this.
>
> Cheers
>
> Clemens
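
For illustration only, here is a minimal sketch of one way to reconcile the two positions in this thread: getSplits() still releases its client, but the input format recreates the client on demand, so a record reader built from the same instance does not hit a client that was already shut down. This is not the Kudu or HBase code. FakeClient, SketchInputFormat, readSplit() and the split names are hypothetical stand-ins, and the sketch assumes the worry is about a client being left open after getSplits().

```java
import java.util.ArrayList;
import java.util.List;

public class LazyClientSketch {

    /** Hypothetical stand-in for a real client; it only tracks whether it is open. */
    static class FakeClient {
        private boolean open = true;

        void shutdown() {
            open = false;
        }

        String scan(String split) {
            if (!open) {
                throw new IllegalStateException("client already shut down");
            }
            return "rows for " + split;
        }
    }

    /** Hypothetical input format that owns one client and reopens it on demand. */
    static class SketchInputFormat implements AutoCloseable {
        private FakeClient client;

        /** Returns the current client, creating a new one if none is open. */
        private FakeClient client() {
            if (client == null) {
                client = new FakeClient();
            }
            return client;
        }

        /** Computes splits, then releases the client so none is left open. */
        List<String> getSplits() {
            try {
                client(); // a real implementation would use this to locate tablets
                List<String> splits = new ArrayList<>();
                splits.add("split-0");
                splits.add("split-1");
                return splits;
            } finally {
                shutdownClient();
            }
        }

        /** Reads one split; works after getSplits() because the client is recreated. */
        String readSplit(String split) {
            return client().scan(split);
        }

        private void shutdownClient() {
            if (client != null) {
                client.shutdown();
                client = null; // forget the dead client so the next use opens a new one
            }
        }

        @Override
        public void close() {
            shutdownClient();
        }
    }

    public static void main(String[] args) {
        // Same instance used for splits and for reading, as Hive is described as doing.
        try (SketchInputFormat format = new SketchInputFormat()) {
            for (String split : format.getSplits()) {
                System.out.println(format.readSplit(split));
            }
        }
    }
}
```

The trade-off in this sketch is an extra client creation when the same instance is reused, in exchange for never holding a client open between the two calls.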
