Hey,

No worries.

TBH I haven't been in the MR world for a while, so if you think it makes
more sense like that and that we won't be opening tons of KuduClients then
I'm all for it.

Thanks,

J-D

On Wed, Dec 27, 2017 at 2:47 PM, Clemens Valiente <
[email protected]> wrote:

> Hi J-D,
>
> thanks for the answer. I missed that part in the comment, teaches me to
> check more closely in the future. Now I understand the reason.
>
> It might probably be a better idea to instantiate a separate KuduClient
> for each RecordReader since they are not really designed to share the same
> client anyway.
>
> Thoughts?
>
>
> Clemens
>
> ________________________________
> From: Jean-Daniel Cryans <[email protected]>
> Sent: 27 December 2017 23:37:54
> To: [email protected]
> Subject: Re: Possible Race Condition in kudu-mapreduce
>
> Hi Clemens,
>
> The reason is here
> https://github.com/apache/kudu/blob/master/java/kudu-
> mapreduce/src/main/java/org/apache/kudu/mapreduce/
> KuduTableInputFormat.java#L72
>
> I'd still be worried about that. You can look at what HBase does for
> inspiration in fixing this:
> https://github.com/apache/hbase/blob/master/hbase-
> mapreduce/src/main/java/org/apache/hadoop/hbase/mapreduce/
> TableInputFormatBase.java#L238
>
> Thanks,
>
> J-D
>
> On Wed, Dec 27, 2017 at 6:01 AM, Clemens Valiente <
> [email protected]> wrote:
>
> > Hi,
> >
> > I hope everyone had a nice long holiday weekend!
> >
> >
> > I have a question regarding the kudu-mapreduce package, in particular
> this
> > line in the KuduTableInputFormat where getSplits() shuts down our Kudu
> > client:
> >
> > https://github.com/cloudera/kudu/blob/master/java/kudu-
> > mapreduce/src/main/java/org/apache/kudu/mapreduce/
> > KuduTableInputFormat.java#L164
> >
> > Is there a reason for shutting down the client here?
> >
> > This does not work with Hive:
> >
> > In FetchInputFormatSplit, Hive uses the same InputFormat for fetching the
> > splits and getting the recordReader (in our case, it is the
> > KuduTableInputFormat.TableRecordReader).
> >
> > If Hive then tries to initialize that record reader, it runs into an
> error
> > here:
> >
> > https://github.com/cloudera/kudu/blob/master/java/kudu-
> > mapreduce/src/main/java/org/apache/kudu/mapreduce/
> > KuduTableInputFormat.java#L397
> >
> > since the TableRecordReader uses the same client of the
> > KuduTableInputFormat that was already shut down by getSplits()
> >
> >
> > Since the client is already shut down by the close() method on the
> > KuduTableInputFormat, I don't see a need to do the same in getSplits()
> >
> > If there are no objections or a reason for keeping the shutdown call
> > there, I would open a ticket and submit a patch for this.
> >
> >
> > Cheers
> >
> > Clemens
> >
> >
>

Reply via email to