Re: getting read past EOF for Double column

Owen O'Malley Mon, 18 Dec 2017 10:28:04 -0800

This is a bug. Please file a JIRA. It looks like a change went in that made
the DoubleTreeReader fail if it is called on a batch of size 0.
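
To illustrate the failure mode, here is a minimal sketch in Python. It is not
ORC's actual reader code: DoubleReader, ListReader, next_batch and the
8-byte little-endian encoding are hypothetical stand-ins chosen for the
example. It shows a list reader asking its child for the total number of
elements, and the child returning early on a batch of size 0 so that an empty
stream is never touched.

```python
import struct
from io import BytesIO


class DoubleReader:
    """Toy leaf reader (hypothetical): decodes 8-byte little-endian doubles."""

    def __init__(self, data: bytes):
        self.stream = BytesIO(data)

    def next_batch(self, batch_size: int) -> list:
        # Guard: an empty batch must not touch the stream at all, because
        # the stream may itself be empty when the parent has no elements.
        if batch_size == 0:
            return []
        raw = self.stream.read(8 * batch_size)
        if len(raw) < 8 * batch_size:
            raise EOFError("read past EOF")
        return list(struct.unpack("<%dd" % batch_size, raw))


class ListReader:
    """Toy list reader (hypothetical): per-row lengths plus one child reader."""

    def __init__(self, lengths, child):
        self.lengths = list(lengths)
        self.child = child
        self.row = 0

    def next_batch(self, batch_size: int) -> list:
        lengths = self.lengths[self.row:self.row + batch_size]
        self.row += len(lengths)
        # The child is asked for the total element count of this batch,
        # which is 0 when every list in the batch is empty.
        values = self.child.next_batch(sum(lengths))
        rows, pos = [], 0
        for n in lengths:
            rows.append(values[pos:pos + n])
            pos += n
        return rows
```

With three empty lists and an empty child stream,
ListReader([0, 0, 0], DoubleReader(b"")).next_batch(3) yields three empty rows
instead of raising. Without the guard, the same call would depend on how the
underlying stream behaves when asked for zero bytes.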


Thanks,
   Owen

On Mon, Dec 18, 2017 at 10:19 AM, Owen O'Malley <[email protected]>
wrote:

> Actually, the metadata is reasonable, it is just that there is an array
> above that column that doesn't have any elements.
>
> So the tree down to column 36 looks like:
>
> column 0: (struct) count: 42692
>   column 1: data (struct) count: 42692
>     column 21: listingAssociated (array) count: 42692
>       column 22: (struct) count: 0
>         column 32: sla (array) count: 0
>           column 33: (struct) count: 0
>             column 34: shippingTier (struct) count: 0
>               column 35: charge (struct) count: 0
>                 column 36: amount (double) count: 0
>
> Since there are 0 instances of column 22, there aren't any instances of the
> columns below it. So the reader should not call down to read the data,
> because there are no values.
>
> Which version of ORC are you using to read with?
>
> Thanks,
>    Owen
>
>
> On Mon, Dec 18, 2017 at 5:38 AM, Piyush Mukati <[email protected]>
> wrote:
>
>> Hi,
>> I wrote an ORC file with a MapReduce job, but when reading the file I
>> get "read past EOF for a double column".
>> After debugging, I found that we are trying to read an empty stream. I
>> suspect the file metadata is corrupt.
>>
>> The column metadata says:
>> *Column 36: count: 0 hasNull: false sum: 0.0*
>> I don't understand how hasNull can be false while count is zero, when
>> other columns have nonzero counts.
>>
>> I am out of ideas for debugging. Please point me in the direction I
>> should debug further.
>> Please find attached the metadata and the stack trace.
>> Thanks.
>>
>
>
