[ 
https://issues.apache.org/jira/browse/HIVE-13818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15296111#comment-15296111
 ] 

Gopal V edited comment on HIVE-13818 at 5/23/16 8:50 AM:
---------------------------------------------------------

Updated theory: the issue disappeared when I applied {{cast as bigint}} to every 
join column, so that the keys are no longer 4-byte ints (4 + 1 byte = 5).

{code}
select  i_item_id,
        s_state,
        avg(ss_quantity) agg1,
        avg(ss_list_price) agg2,
        avg(ss_coupon_amt) agg3,
        avg(ss_sales_price) agg4
 from store_sales, customer_demographics, date_dim, store, item
 where cast(store_sales.ss_sold_date_sk as bigint) = cast(date_dim.d_date_sk as bigint) and
       cast(store_sales.ss_item_sk as bigint) = cast(item.i_item_sk as bigint) and
       cast(store_sales.ss_store_sk as bigint) = cast(store.s_store_sk as bigint) and
       cast(store_sales.ss_cdemo_sk as bigint) = cast(customer_demographics.cd_demo_sk as bigint) and
       customer_demographics.cd_gender = 'F' and
       customer_demographics.cd_marital_status = 'D' and
       customer_demographics.cd_education_status = 'Unknown' and
       date_dim.d_year = 1998 and
       store.s_state in ('KS','AL', 'MN', 'AL', 'SC', 'VT')
 group by i_item_id, s_state
 order by i_item_id
         ,s_state
 limit 10;
{code}

BinarySortableDeserializeRead.java:213 is actually in the Long parsing path, so 
the reader might be accidentally deserializing an Int key using the Long codepath.

{code}
208     case LONG:
209       {
210         final boolean invert = columnSortOrderIsDesc[fieldIndex];
211         long v = inputByteBuffer.read(invert) ^ 0x80;
212         for (int i = 0; i < 7; i++) {
213           v = (v << 8) + (inputByteBuffer.read(invert) & 0xff);
214         }
215         currentLong = v;
216       }
217       break;
{code}
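
To make the suspected failure concrete, here is a minimal standalone sketch, not Hive code. It assumes an int key is laid out as 4 big-endian bytes with the sign bit flipped and a long key as 8 such bytes, leaves out the extra 1 byte mentioned above, and uses {{DataInputStream}} as a stand-in for the input buffer. {{KeyWidthSketch}} and its method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;

// Illustrative only: KeyWidthSketch and its methods are hypothetical, not Hive classes.
final class KeyWidthSketch {

    // Assumed int layout: 4 big-endian bytes with the sign bit flipped.
    static byte[] writeSortableInt(int v) {
        return new byte[] {
            (byte) ((v >> 24) ^ 0x80), (byte) (v >> 16), (byte) (v >> 8), (byte) v
        };
    }

    // Assumed long layout: 8 big-endian bytes with the sign bit flipped.
    static byte[] writeSortableLong(long v) {
        byte[] out = new byte[8];
        for (int i = 0; i < 8; i++) {
            out[i] = (byte) (v >> (56 - 8 * i));
        }
        out[0] ^= 0x80;
        return out;
    }

    // Same shape as the LONG case quoted above: one sign-flipped byte, then 7 more.
    // Returns null when the key runs out of bytes before all 8 have been read.
    static Long tryReadLong(byte[] key) {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(key));
        try {
            long v = in.readByte() ^ 0x80;
            for (int i = 0; i < 7; i++) {
                v = (v << 8) + (in.readByte() & 0xff);
            }
            return v;
        } catch (EOFException e) {
            return null; // ran off the end of the key
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(tryReadLong(writeSortableLong(42L))); // 8-byte key decodes
        System.out.println(tryReadLong(writeSortableInt(42)));   // 4-byte key hits EOF
    }
}
```

Under these assumptions an 8-byte key round-trips, while a 4-byte key runs out of input partway through the 7-iteration loop, which would be consistent with the EOFException being reported from the Long path.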

Sort-order problems with var-int encoding might be the reason int & long are 
encoded with different fixed byte widths inside BinarySortable.
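
As a rough illustration of why a fixed-width layout is attractive for sortable keys, the sketch below assumes the same 4-byte, sign-flipped, big-endian int layout and checks that a plain unsigned byte-by-byte comparison agrees with numeric order. It is a standalone example rather than Hive code, and {{SortableOrderSketch}} is a hypothetical name.

```java
// Illustrative only: SortableOrderSketch is a hypothetical class, not part of Hive.
final class SortableOrderSketch {

    // Assumed layout: 4 big-endian bytes with the sign bit flipped.
    static byte[] encode(int v) {
        return new byte[] {
            (byte) ((v >> 24) ^ 0x80), (byte) (v >> 16), (byte) (v >> 8), (byte) v
        };
    }

    // Lexicographic comparison that treats every byte as unsigned (memcmp style).
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        int[] ascending = {Integer.MIN_VALUE, -2, -1, 0, 1, 255, 256, Integer.MAX_VALUE};
        for (int i = 0; i + 1 < ascending.length; i++) {
            boolean ordered = compareBytes(encode(ascending[i]), encode(ascending[i + 1])) < 0;
            System.out.println(ascending[i] + " before " + ascending[i + 1] + ": " + ordered);
        }
    }
}
```

Flipping the sign bit moves negative values below non-negative ones in unsigned byte order, and the fixed width keeps the most significant byte first, so raw byte comparison matches numeric comparison for every pair in the list.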


> (Part 2) EOFException with fast hashtable
> -----------------------------------------
>
>                 Key: HIVE-13818
>                 URL: https://issues.apache.org/jira/browse/HIVE-13818
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>            Reporter: Matt McCline
>            Assignee: Matt McCline
>            Priority: Critical
>         Attachments: HIVE-13818.01.patch
>
>
> Changes for HIVE-13682 did fix a bug in Fast Hash Tables, but evidently not 
> this issue according to Gopal/Rajesh/Nita.


