Hi,

I hope everyone had a nice long holiday weekend!


I have a question regarding the kudu-mapreduce package, in particular this line 
in the KuduTableInputFormat where getSplits() shuts down our Kudu client:

https://github.com/cloudera/kudu/blob/master/java/kudu-mapreduce/src/main/java/org/apache/kudu/mapreduce/KuduTableInputFormat.java#L164

Is there a reason for shutting down the client here?

This does not work with Hive:

In FetchInputFormatSplit, Hive uses the same InputFormat instance both to 
fetch the splits and to get the record reader (in our case, 
KuduTableInputFormat.TableRecordReader).

If Hive then tries to initialize that record reader, it runs into an error here:

https://github.com/cloudera/kudu/blob/master/java/kudu-mapreduce/src/main/java/org/apache/kudu/mapreduce/KuduTableInputFormat.java#L397

since the TableRecordReader uses the same client as the KuduTableInputFormat, 
and that client was already shut down by getSplits().


Since the client is already shut down by the close() method on the 
KuduTableInputFormat, I don't see a need to do the same in getSplits().
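
To make the sequence concrete, here is a minimal sketch of the problem and of 
the change I have in mind. It does not use the real Kudu classes: StubClient, 
StubInputFormat and readerInitializes are hypothetical stand-ins that only 
model "one client shared between getSplits() and the record reader", so 
please read it as an illustration of the call order rather than the actual 
kudu-mapreduce code.

```java
import java.util.ArrayList;
import java.util.List;

public class SharedClientSketch {

    /** Hypothetical stand-in for a client; it only tracks whether it was shut down. */
    static class StubClient {
        private boolean shutDown = false;

        void shutdown() {
            shutDown = true;
        }

        String openTable(String name) {
            if (shutDown) {
                throw new IllegalStateException("client already shut down");
            }
            return name;
        }
    }

    /** Hypothetical input format that owns one client and shares it with its reader. */
    static class StubInputFormat {
        private final StubClient client = new StubClient();
        private final boolean shutdownInGetSplits;

        StubInputFormat(boolean shutdownInGetSplits) {
            this.shutdownInGetSplits = shutdownInGetSplits;
        }

        List<String> getSplits() {
            try {
                client.openTable("t");
                List<String> splits = new ArrayList<>();
                splits.add("split-0");
                return splits;
            } finally {
                // Models the shutdown call in question (assumption: it runs unconditionally).
                if (shutdownInGetSplits) {
                    client.shutdown();
                }
            }
        }

        /** Models the record reader's initialize(): it reuses the format's client. */
        String initializeReader(String split) {
            return client.openTable("t") + "/" + split;
        }

        void close() {
            client.shutdown();
        }
    }

    /** Calls getSplits() and then initializes a reader on the same format instance. */
    static boolean readerInitializes(boolean shutdownInGetSplits) {
        StubInputFormat format = new StubInputFormat(shutdownInGetSplits);
        List<String> splits = format.getSplits();
        try {
            format.initializeReader(splits.get(0));
            return true;
        } catch (IllegalStateException e) {
            return false;
        } finally {
            format.close();
        }
    }

    public static void main(String[] args) {
        System.out.println("shutdown in getSplits(): " + readerInitializes(true));   // false
        System.out.println("shutdown only in close(): " + readerInitializes(false)); // true
    }
}
```

With the shutdown left to close(), the same instance can serve both calls, 
which is the pattern Hive relies on.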

If there are no objections and no reason to keep the shutdown call there, I 
will open a ticket and submit a patch for this.


Cheers

Clemens
