Hi All,

I seriously need help with this issue. Any reference or pointer for troubleshooting or fixing it would be helpful.
Regards
Biswa

On Fri, Mar 25, 2016 at 11:24 PM, Biswajit Nayak <biswa...@altiscale.com> wrote:

> Prasanth,
>
> Apologies for the delay in response.
>
> Below is the orcfiledump of the empty orc file from a broken partition.
>
> $ hive --orcfiledump /hive/testdb.db/table_orc/year=2016/month=1/day=29/000000_0
> Structure for /hive/testdb.db/table_orc/year=2016/month=1/day=29/000000_0
> File Version: 0.12 with HIVE_8732
> 16/03/25 17:49:09 INFO orc.ReaderImpl: Reading ORC rows from /hive/testdb.db/table_orc/year=2016/month=1/day=29/000000_0 with {include: null, offset: 0, length: 9223372036854775807}
> 16/03/25 17:49:09 INFO orc.RecordReaderFactory: Schema is not specified on read. Using file schema.
> Rows: 0
> Compression: SNAPPY
> Compression size: 262144
> Type: struct<>
>
> Stripe Statistics:
>
> File Statistics:
>   Column 0: count: 0 hasNull: false
>
> Stripes:
>
> File length: 49 bytes
> Padding length: 0 bytes
> Padding ratio: 0%
> $
>
> I am still not able to figure out what is causing this odd behaviour.
>
> Regards
> Biswa
>
> On Thu, Mar 10, 2016 at 3:12 PM, Prasanth Jayachandran <pjayachand...@hortonworks.com> wrote:
>
>> Alternatively you can send orcfiledump output for the empty orc file from
>> broken partition.
>>
>> Thanks
>> Prasanth
>>
>> On Mar 10, 2016, at 5:11 PM, Prasanth Jayachandran <pjayachand...@hortonworks.com> wrote:
>>
>> Could you attach the empty orc files from one of the broken partitions
>> somewhere? I can run some tests on them to see why it's happening.
>>
>> Thanks
>> Prasanth
>>
>> On Mar 8, 2016, at 12:02 AM, Biswajit Nayak <biswa...@altiscale.com> wrote:
>>
>> Both the parameters are set to false by default.
>>
>> hive> set hive.optimize.index.filter;
>> hive.optimize.index.filter=false
>> hive> set hive.orc.splits.include.file.footer;
>> hive.orc.splits.include.file.footer=false
>> hive>
>>
>> >>>I suspect this might be related to having 0 row files in the buckets not
>> having any recorded schema.
>>
>> Yes, there are a few files with 0 rows, but the query works with other
>> partitions (which also have 0-row files). Out of 30 partitions (for a
>> month), 3-4 partitions have this issue. Even reloading the data does not
>> fix it. The query works fine in MR now, but fails in Tez.
>>
>> On Tue, Mar 8, 2016 at 2:43 AM, Gopal Vijayaraghavan <gop...@apache.org> wrote:
>>
>>> > c varchar(2)
>>> ...
>>> > Num Buckets: 7
>>>
>>> I suspect this might be related to having 0 row files in the buckets not
>>> having any recorded schema.
>>>
>>> You can also experiment with hive.optimize.index.filter=false, to see if
>>> the zero row case is artificially produced via predicate push-down.
>>>
>>> That shouldn't be a problem unless you've turned on
>>> hive.orc.splits.include.file.footer=true (recommended to be false).
>>>
>>> Your row-locations don't actually match any Apache source jar in my
>>> builds, are there any other patches to consider?
>>>
>>> Cheers,
>>> Gopal
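
Since only some partitions with 0-row files fail, it may help to list which bucket files carry the empty `struct<>` schema seen in the orcfiledump output quoted above. The following is only a rough sketch: `classify_orc_dump` is a hypothetical helper name, it assumes the dump prints `Rows:` and `Type:` lines in the form shown in that output, and the file paths are whatever you pass on the command line.

```shell
#!/bin/sh
# Hypothetical helper: reads `hive --orcfiledump` text on stdin and prints
# EMPTY_SCHEMA when the file reports 0 rows and a struct<> type, else OK.
# Assumes "Rows:" and "Type:" lines formatted as in the dump quoted above.
classify_orc_dump() {
  awk '
    /^Rows:/ { rows = $2 }
    /^Type:/ { type = $2 }
    END {
      if (rows == "0" && type == "struct<>") print "EMPTY_SCHEMA"
      else print "OK"
    }'
}

# Classify every ORC file path given as an argument (does nothing without
# arguments). stderr is discarded to hide the INFO log lines.
for f in "$@"; do
  printf '%s\t%s\n' "$f" "$(hive --orcfiledump "$f" 2>/dev/null | classify_orc_dump)"
done
```

Run against the files of a working partition and a broken one, this would show whether the two differ in how many empty-schema files they contain, which is the case Gopal suspected.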