[
https://issues.apache.org/jira/browse/TRAFODION-2138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15406581#comment-15406581
]
ASF GitHub Bot commented on TRAFODION-2138:
-------------------------------------------
GitHub user sureshsubbiah opened a pull request:
https://github.com/apache/incubator-trafodion/pull/637
[TRAFODION-2138] Hive scan on wide tables can result in lost rows or error
The fix has three parts:
a) RangeTailIOSize now defaults to the maximum length of a row when that
length exceeds 16KB. Each ESP range looks ahead by this amount; previously
the lookahead was hard-coded to 16KB.
b) A whole buffer that does not contain the start of any row is no longer
treated as an error.
c) We raise an error at compile time if the maximum row size is greater
than the size of the IO buffer, since we need the whole row (from start to
finish) in a contiguous buffer to do the rest of our logic. The IO buffer
currently defaults to 64 MB.
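As an illustration of parts a) and b), here is a minimal C++ sketch of the
range/tail rules (ScanRange, rangeTailIOSize, and rowStartsInRange are
hypothetical names for this note, not the actual executor identifiers):

#include <cstdint>

// Each ESP owns a half-open byte range of the file and reads an extra
// "tail" past its end, so a row that starts inside the range but ends in
// the next one can still be assembled in a single contiguous buffer.
struct ScanRange {
  int64_t startOffset;  // first byte this ESP owns
  int64_t bytesToRead;  // size of the owned range
};

// Part a): the tail defaults to the maximum row length once that length
// exceeds the old hard-coded 16KB lookahead.
int64_t rangeTailIOSize(int64_t maxRowLen) {
  const int64_t kOldTail = 16 * 1024;
  return (maxRowLen > kOldTail) ? maxRowLen : kOldTail;
}

// An ESP emits only rows that start inside its own range; rows that start
// earlier are finished by the previous range's tail. Part b): a buffer
// that contains no row start is simply skipped instead of raising an
// error.
bool rowStartsInRange(int64_t rowStart, const ScanRange &r) {
  return rowStart >= r.startOffset &&
         rowStart < r.startOffset + r.bytesToRead;
}

Part c) then reduces to a compile-time check that the maximum row length
fits within one IO buffer (64 MB by default).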
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/sureshsubbiah/incubator-trafodion traf-2094
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/incubator-trafodion/pull/637.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #637
----
commit 6d266221b959c14319a760716939f261a2460276
Author: Suresh Subbiah <[email protected]>
Date: 2016-08-03T20:35:43Z
[TRAFODION-2138] Hive scan on wide tables can result in lost rows or error
The fix has three parts:
a) RangeTailIOSize now defaults to the maximum length of a row when that
length exceeds 16KB. Each ESP range looks ahead by this amount; previously
the lookahead was hard-coded to 16KB.
b) A whole buffer that does not contain the start of any row is no longer
treated as an error.
c) We raise an error at compile time if the maximum row size is greater
than the size of the IO buffer, since we need the whole row (from start to
finish) in a contiguous buffer to do the rest of our logic. The IO buffer
currently defaults to 64 MB.
----
> Hive scan on wide tables can result in lost rows or error
> ---------------------------------------------------------
>
> Key: TRAFODION-2138
> URL: https://issues.apache.org/jira/browse/TRAFODION-2138
> Project: Apache Trafodion
> Issue Type: Bug
> Components: sql-exe
> Affects Versions: 2.0-incubating
> Reporter: Suresh Subbiah
> Assignee: Suresh Subbiah
> Fix For: 2.0-incubating
>
>
> When a Hive table has wide rows (say, greater than 16KB) and the scan is
> parallelized, one of these two problems may arise:
> a) An error with sqlcode 8446 is raised:
> *** ERROR[8446] An error occurred during hdfs buffer fetch. Error Detail: No
> record delimiter found in buffer from hdfsRead.
> b) Some rows are lost.
> Both situations occur because the scan is broken up into ranges and the
> lookahead into the next range is determined by HDFS_IO_RANGE_TAIL, which
> defaults to 16KB. This setting is also hard-coded in a crucial location,
> so changing the default does not help.
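> To put numbers on the mismatch: in the 4-ESP plan below, the
> 2,611,279-byte file is split into ranges of about 652,820 bytes, while a
> single row of the test table can exceed 1,000,000 bytes. A 16KB lookahead
> past a range boundary cannot reach the delimiter of such a row, so the
> row is either dropped or raises error 8446.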
> A test case to reproduce the problem:
> cqd traf_max_character_col_length '1000000';
> drop table if exists t031t1;
> create table t031t1 (z int not null primary key, a varchar(1000000),
> b char(1000000));
> insert into t031t1 values (1, repeat('a', 1000000, 1000000) , 'def');
> insert into t031t1 values (2, repeat('a', 1000, 1000) , 'zzz');
> insert into t031t1 values (3, repeat('a', 10000, 10000) , 'zzz');
> insert into t031t1 values (4, repeat('a', 100000, 100000) , 'zzz');
> insert into t031t1 values (5, repeat('a', 100, 100) , 'zzz');
> insert into t031t1 values (6, repeat('a', 10, 10) , 'zzz');
> insert into t031t1 values (7, repeat('a', 1, 1) , 'zzz');
> insert into t031t1 values (8, repeat('a', 1000000, 1000000) , null);
> insert into t031t1 values (9, repeat('a', 500000, 500000) , null);
> insert into t031t1 values (10, repeat('a', 100, 100) , null);
> -- in hive
> create table t031hive(z int, a string, b string);
> -- in sqlci
> cqd hive_max_string_length '1000000';
> insert into hive.hive.t031hive select * from t031t1;
> create external table t031hive (z int, a varchar(1000000),
> b varchar(1000000)) for hive.hive.t031hive;
> control query shape sort_groupby(exchange(cut));
> -- 2 esp plan
> cqd hive_min_bytes_per_esp_partition '2000000' ;
> prepare s2 from select count(*) from hive.hive.t031hive;
> explain options 'f' s2 ;
> execute s2 ;
> cqd hive_min_bytes_per_esp_partition '1000000' ;
> -- 4 esp plan
> prepare s2 from select count(*) from hive.hive.t031hive;
> explain options 'f' s2 ;
> execute s2 ;
> The row boundaries and ESP scan range boundaries are shown below:
> hexdump -v -e '7/1 "%5_ad:%-5_c" "\n"' t031.txt | grep "\n"
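> (In this hexdump format, %_ad prints the decimal byte offset and %_c the
> escaped character, so the pipeline lists the file offset of each record
> delimiter.)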
> -- 4 esp plan
> 1000006:\n esp0 end (652820)
> 1001013:\n
> 1011020:\n
> 1111027:\n
> 1111134:\n
> 1111151:\n
> 1111159:\n
> 2111165:\n esp1 end (1305640) esp2 end (1958460)
> 2611171:\n
> 2611278:\n
> -- 2 esp plan
> 1000006:\n
> 1001013:\n
> 1011020:\n
> 1111027:\n
> 1111134:\n
> 1111151:\n
> 1111159:\n
> 2111165:\n esp0 end (1305640)
> 2611171:\n
> 2611278:\n
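> Reading the 4-ESP plan against these offsets: ESP 2's range
> [1305640, 1958460) lies entirely inside the row that ends at offset
> 2111165, so no row starts anywhere in ESP 2's buffers (the case that fix
> part b makes legal), and ESP 1 must look ahead 2111165 - 1305640 = 805525
> bytes past its range end to complete that row, far beyond the old 16KB
> tail.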
> Number of esps to scan: 4
> Esp#   Range#  StartOffset   BytesRead    FileName
> ====== ====== ============ ============ ==============================
>      0      0            0       652820 t031hive/T031HIVE-0-20160716231120-1 (split_e)
>      1      1       652820       652820 t031hive/T031HIVE-0-20160716231120-1 (split_b/e)
>      2      2      1305640       652820 t031hive/T031HIVE-0-20160716231120-1 (split_b/e)
>      3      3      1958460       652819 t031hive/T031HIVE-0-20160716231120-1 (split_b)
> Summary of bytes read per ESP (2611279 = 100 percent):
> ESP 0 reads 652820 bytes ( 100 percent of avg)
> ESP 1 reads 652820 bytes ( 100 percent of avg)
> ESP 2 reads 652820 bytes ( 100 percent of avg)
> ESP 3 reads 652819 bytes ( 99 percent of avg)
> hdfsRootDir: hdfs://localhost:15600/user/hive/warehouse/t031hive
> modTSforDir_ = 1468710680, numOfPartCols_ = 0
> Number of ranges to scan: 2
> Number of esps to scan: 2
> Esp#   Range#  StartOffset   BytesRead    FileName
> ====== ====== ============ ============ ==============================
>      0      0            0      1305640 t031hive/T031HIVE-0-20160716231120-1 (split_e)
>      1      1      1305640      1305639 t031hive/T031HIVE-0-20160716231120-1 (split_b)
> Summary of bytes read per ESP (2611279 = 100 percent):
> ESP 0 reads 1305640 bytes ( 100 percent of avg)
> ESP 1 reads 1305639 bytes ( 99 percent of avg)
> hdfsRootDir: hdfs://localhost:15600/user/hive/warehouse/t031hive
> modTSforDir_ = 1468710680, numOfPartCols_ = 0
> The fix has three parts:
> a) RangeTailIOSize now defaults to the maximum length of a row when that
> length exceeds 16KB. Each ESP range looks ahead by this amount; previously
> the lookahead was hard-coded to 16KB.
> b) A whole buffer that does not contain the start of any row is no longer
> treated as an error.
> c) We raise an error at compile time if the maximum row size is greater
> than the size of the IO buffer, since we need the whole row (from start to
> finish) in a contiguous buffer to do the rest of our logic. The IO buffer
> currently defaults to 64 MB.