Re: Hive on Tez fails with Sequence files having different key classes

Rajat Jain Tue, 28 Jul 2015 23:26:11 -0700

Will try that. Thanks.

On Tue, Jul 28, 2015 at 10:07 PM Gopal Vijayaraghavan <[email protected]>
wrote:


>
>
> > Noted. Filing this issue because we have legacy data which was generated
> >in incompatible ways, and it works fine with MR. We'll try to change the
> >data ourselves.
>
> Sure, the easy workaround for this is to do "insert overwrite table foo as
> select * from table" (or partition self-insert) with
>
> set hive.tez.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;
>
> That will not be fast to query/process since for each row the schema check
> kicks in and has to ask "What's the schema for this row?" to the hive
> IOContext.
>
> While the Hive+Tez fast path makes the schema differentiation way before
> that with an
>
>  // this is the bit where we make sure we don't group across partition
>  // schema boundaries
>
>  if (schemaEvolved(s, prevSplit, groupAcrossFiles, work)) {
>
> HTH.
>
> Cheers,
> Gopal
>
>
> --

Sent from mobile device. Excuse brevity and tyops.
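
For reference, the workaround quoted above might look roughly like the following in a Hive session. This is only a sketch: `foo` is the placeholder table name from the quoted message, the statement assumes an unpartitioned table rewritten onto itself, and the `as` from the informal wording is dropped because Hive's INSERT syntax does not take it.

```sql
-- Assumption: `foo` stands in for the legacy table with mixed key classes.
-- Switch Tez to the slower, per-row schema-checking input format for this session.
set hive.tez.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;

-- Rewrite the table onto itself so all files are regenerated consistently.
insert overwrite table foo select * from foo;
```

For a partitioned table the same idea would apply one partition at a time (the "partition self-insert" mentioned in the quoted message).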
