Hi Clemens,

The reason is documented here:
https://github.com/apache/kudu/blob/master/java/kudu-mapreduce/src/main/java/org/apache/kudu/mapreduce/KuduTableInputFormat.java#L72

I'd still be worried about that. You can look at what HBase does for
inspiration in fixing this:
https://github.com/apache/hbase/blob/master/hbase-mapreduce/src/main/java/org/apache/hadoop/hbase/mapreduce/TableInputFormatBase.java#L238
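
As far as I can tell from the linked HBase code, getSplits() there only closes the connection when it had to open one itself, and anything that runs later re-creates it on demand. Below is a minimal sketch of that pattern. FakeClient, SketchInputFormat, ensureClient() and readSplit() are made-up stand-ins for illustration, not the Kudu or HBase API, so treat it as an outline rather than a patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ReusableInputFormatSketch {

    /** Hypothetical stand-in for a client that holds network resources. */
    static final class FakeClient implements AutoCloseable {
        private boolean closed = false;

        List<String> locateTablets() {
            ensureOpen();
            return Arrays.asList("tablet-0", "tablet-1");
        }

        String scan(String tablet) {
            ensureOpen();
            return "rows of " + tablet;
        }

        private void ensureOpen() {
            if (closed) {
                throw new IllegalStateException("client already shut down");
            }
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    /** Hypothetical input format that may be reused after getSplits(). */
    static final class SketchInputFormat {
        private FakeClient client;

        /** Creates the client if needed; returns true if this call created it. */
        private boolean ensureClient() {
            if (client != null) {
                return false;
            }
            client = new FakeClient();
            return true;
        }

        /** Releases the client only if this call was the one that opened it. */
        private void closeIfOwned(boolean closeOnFinish) {
            if (closeOnFinish) {
                client.close();
                client = null;
            }
        }

        List<String> getSplits() {
            boolean closeOnFinish = ensureClient();
            try {
                return new ArrayList<>(client.locateTablets());
            } finally {
                closeIfOwned(closeOnFinish);
            }
        }

        /** Plays the role of the record reader: safe to call after getSplits(). */
        String readSplit(String split) {
            boolean closeOnFinish = ensureClient();
            try {
                return client.scan(split);
            } finally {
                closeIfOwned(closeOnFinish);
            }
        }
    }

    public static void main(String[] args) {
        // Same instance for both steps, as Hive does in the case described below.
        SketchInputFormat format = new SketchInputFormat();
        for (String split : format.getSplits()) {
            System.out.println(format.readSplit(split));
        }
    }
}
```

The point of the sketch is that resources are still released after getSplits(), but a later call on the same instance gets a fresh client instead of hitting one that was already shut down.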

Thanks,

J-D

On Wed, Dec 27, 2017 at 6:01 AM, Clemens Valiente <
[email protected]> wrote:

> Hi,
>
> I hope everyone had a nice long holiday weekend!
>
>
> I have a question regarding the kudu-mapreduce package, in particular this
> line in the KuduTableInputFormat where getSplits() shuts down our Kudu
> client:
>
> https://github.com/cloudera/kudu/blob/master/java/kudu-mapreduce/src/main/java/org/apache/kudu/mapreduce/KuduTableInputFormat.java#L164
>
> Is there a reason for shutting down the client here?
>
> This does not work with Hive:
>
> In FetchInputFormatSplit, Hive uses the same InputFormat for fetching the
> splits and getting the recordReader (in our case, it is the
> KuduTableInputFormat.TableRecordReader).
>
> If Hive then tries to initialize that record reader, it runs into an error
> here:
>
> https://github.com/cloudera/kudu/blob/master/java/kudu-mapreduce/src/main/java/org/apache/kudu/mapreduce/KuduTableInputFormat.java#L397
>
> since the TableRecordReader uses the same client as the
> KuduTableInputFormat, and that client was already shut down by getSplits().
>
>
> Since the client is already shut down by the close() method on the
> KuduTableInputFormat, I don't see a need to do the same in getSplits().
>
> If there are no objections or a reason for keeping the shutdown call
> there, I would open a ticket and submit a patch for this.
>
>
> Cheers
>
> Clemens
>
>
