[
https://issues.apache.org/jira/browse/TRAFODION-2138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15406581#comment-15406581
]
ASF GitHub Bot commented on TRAFODION-2138:
-------------------------------------------
GitHub user sureshsubbiah opened a pull request:
https://github.com/apache/incubator-trafodion/pull/637
[TRAFODION-2138] Hive scan on wide tables can result in lost rows or error
The fix has three parts:
a) RangeTailIOSize now defaults to the maximum length of a row when that
length exceeds 16KB. Each ESP range looks ahead by this amount; previously
the lookahead was hard-coded to 16KB.
b) A whole buffer that does not contain the start of any row is no longer
treated as an error.
c) We raise an error at compile time if the maximum row size is greater
than the size of the IO buffer, since we need the whole row (from start to
finish) in a contiguous buffer to do the rest of our logic. The IO buffer
currently defaults to 64 MB.
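As an illustration of parts a) and b), here is a minimal C++ sketch of the
range/tail rules (ScanRange, rangeTailIOSize, and rowStartsInRange are
hypothetical names for this note, not the actual executor identifiers):

#include <cstdint>

// Each ESP owns a half-open byte range of the file and reads an extra
// "tail" past its end, so a row that starts inside the range but ends in
// the next one can still be assembled in a single contiguous buffer.
struct ScanRange {
  int64_t startOffset;  // first byte this ESP owns
  int64_t bytesToRead;  // size of the owned range
};

// Part a): the tail defaults to the maximum row length once that length
// exceeds the old hard-coded 16KB lookahead.
int64_t rangeTailIOSize(int64_t maxRowLen) {
  const int64_t kOldTail = 16 * 1024;
  return (maxRowLen > kOldTail) ? maxRowLen : kOldTail;
}

// An ESP emits only rows that start inside its own range; rows that start
// earlier are finished by the previous range's tail. Part b): a buffer
// that contains no row start is simply skipped instead of raising an
// error.
bool rowStartsInRange(int64_t rowStart, const ScanRange &r) {
  return rowStart >= r.startOffset &&
         rowStart < r.startOffset + r.bytesToRead;
}

Part c) then reduces to a compile-time check that the maximum row length
fits within one IO buffer (64 MB by default).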
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/sureshsubbiah/incubator-trafodion traf-2094
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/incubator-trafodion/pull/637.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #637
----
commit 6d266221b959c14319a760716939f261a2460276
Author: Suresh Subbiah <[email protected]>
Date: 2016-08-03T20:35:43Z
[TRAFODION-2138] Hive scan on wide tables can result in lost rows or error
The fix has three parts:
a) RangeTailIOSize now defaults to the maximum length of a row when that
length exceeds 16KB. Each ESP range looks ahead by this amount; previously
the lookahead was hard-coded to 16KB.
b) A whole buffer that does not contain the start of any row is no longer
treated as an error.
c) We raise an error at compile time if the maximum row size is greater
than the size of the IO buffer, since we need the whole row (from start to
finish) in a contiguous buffer to do the rest of our logic. The IO buffer
currently defaults to 64 MB.
----
> Hive scan on wide tables can result in lost rows or error
> ---------------------------------------------------------
>
> Key: TRAFODION-2138
> URL: https://issues.apache.org/jira/browse/TRAFODION-2138
> Project: Apache Trafodion
> Issue Type: Bug
> Components: sql-exe
> Affects Versions: 2.0-incubating
> Reporter: Suresh Subbiah
> Assignee: Suresh Subbiah
> Fix For: 2.0-incubating
>
>
> When a Hive table has wide rows (say, greater than 16KB) and the scan is
> parallelized, one of these two problems may arise:
> a) An error with sqlcode 8446 is raised:
> *** ERROR[8446] An error occurred during hdfs buffer fetch. Error Detail: No
> record delimiter found in buffer from hdfsRead.
> b) Some rows are lost.
> Both situations occur because the scan is broken up into ranges and the
> lookahead into the next range is determined by HDFS_IO_RANGE_TAIL, which
> defaults to 16KB. This setting is also hard-coded in a crucial location,
> so changing the default does not help.
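> To put numbers on the mismatch: in the 4-ESP plan below, the
> 2,611,279-byte file is split into ranges of about 652,820 bytes, while a
> single row of the test table can exceed 1,000,000 bytes. A 16KB lookahead
> past a range boundary cannot reach the delimiter of such a row, so the
> row is either dropped or raises error 8446.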
> A test case to reproduce the problem:
> cqd traf_max_character_col_length '1000000';
> drop table if exists t031t1;
> create table t031t1 (z int not null primary key, a varchar(1000000),
> b char(1000000));
> insert into t031t1 values (1, repeat('a', 1000000, 1000000) , 'def');
> insert into t031t1 values (2, repeat('a', 1000, 1000) , 'zzz');
> insert into t031t1 values (3, repeat('a', 10000, 10000) , 'zzz');
> insert into t031t1 values (4, repeat('a', 100000, 100000) , 'zzz');
> insert into t031t1 values (5, repeat('a', 100, 100) , 'zzz');
> insert into t031t1 values (6, repeat('a', 10, 10) , 'zzz');
> insert into t031t1 values (7, repeat('a', 1, 1) , 'zzz');
> insert into t031t1 values (8, repeat('a', 1000000, 1000000) , null);
> insert into t031t1 values (9, repeat('a', 500000, 500000) , null);
> insert into t031t1 values (10, repeat('a', 100, 100) , null);
> -- in hive
> create table t031hive(z int, a string, b string);
> -- in sqlci
> cqd hive_max_string_length '1000000';
> insert into hive.hive.t031hive select * from t031t1;
> create external table t031hive (z int, a varchar(1000000),
> b varchar(1000000)) for hive.hive.t031hive;
> control query shape sort_groupby(exchange(cut));
> -- 2 esp plan
> cqd hive_min_bytes_per_esp_partition '2000000' ;
> prepare s2 from select count(*) from hive.hive.t031hive;
> explain options 'f' s2 ;
> execute s2 ;
> cqd hive_min_bytes_per_esp_partition '1000000' ;
> -- 4 esp plan
> prepare s2 from select count(*) from hive.hive.t031hive;
> explain options 'f' s2 ;
> execute s2 ;
> The row boundaries and ESP scan range boundaries are shown below:
> hexdump -v -e '7/1 "%5_ad:%-5_c" "\n"' t031.txt | grep "\n"
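> (In this hexdump format, %_ad prints the decimal byte offset and %_c the
> escaped character, so the pipeline lists the file offset of each record
> delimiter.)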
> -- 4 esp plan
> 1000006:\n esp0 end (652820)
> 1001013:\n
> 1011020:\n
> 1111027:\n
> 1111134:\n
> 1111151:\n
> 1111159:\n
> 2111165:\n esp1 end (1305640) esp2 end (1958460)
> 2611171:\n
> 2611278:\n
> -- 2 esp plan
> 1000006:\n
> 1001013:\n
> 1011020:\n
> 1111027:\n
> 1111134:\n
> 1111151:\n
> 1111159:\n
> 2111165:\n esp0 end (1305640)
> 2611171:\n
> 2611278:\n
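> Reading the 4-ESP plan against these offsets: ESP 2's range
> [1305640, 1958460) lies entirely inside the row that ends at offset
> 2111165, so no row starts anywhere in ESP 2's buffers (the case that fix
> part b makes legal), and ESP 1 must look ahead 2111165 - 1305640 = 805525
> bytes past its range end to complete that row, far beyond the old 16KB
> tail.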
> Number of esps to scan: 4
> Esp#   Range#  StartOffset   BytesRead    FileName
> ====== ====== ============ ============ ==============================
>      0      0            0       652820 t031hive/T031HIVE-0-20160716231120-1 (split_e)
>      1      1       652820       652820 t031hive/T031HIVE-0-20160716231120-1 (split_b/e)
>      2      2      1305640       652820 t031hive/T031HIVE-0-20160716231120-1 (split_b/e)
>      3      3      1958460       652819 t031hive/T031HIVE-0-20160716231120-1 (split_b)
> Summary of bytes read per ESP (2611279 = 100 percent):
> ESP 0 reads 652820 bytes ( 100 percent of avg)
> ESP 1 reads 652820 bytes ( 100 percent of avg)
> ESP 2 reads 652820 bytes ( 100 percent of avg)
> ESP 3 reads 652819 bytes ( 99 percent of avg)
> hdfsRootDir: hdfs://localhost:15600/user/hive/warehouse/t031hive
> modTSforDir_ = 1468710680, numOfPartCols_ = 0
> Number of ranges to scan: 2
> Number of esps to scan: 2
> Esp#   Range#  StartOffset   BytesRead    FileName
> ====== ====== ============ ============ ==============================
>      0      0            0      1305640 t031hive/T031HIVE-0-20160716231120-1 (split_e)
>      1      1      1305640      1305639 t031hive/T031HIVE-0-20160716231120-1 (split_b)
> Summary of bytes read per ESP (2611279 = 100 percent):
> ESP 0 reads 1305640 bytes ( 100 percent of avg)
> ESP 1 reads 1305639 bytes ( 99 percent of avg)
> hdfsRootDir: hdfs://localhost:15600/user/hive/warehouse/t031hive
> modTSforDir_ = 1468710680, numOfPartCols_ = 0
> The fix has three parts:
> a) RangeTailIOSize now defaults to the maximum length of a row when that
> length exceeds 16KB. Each ESP range looks ahead by this amount; previously
> the lookahead was hard-coded to 16KB.
> b) A whole buffer that does not contain the start of any row is no longer
> treated as an error.
> c) We raise an error at compile time if the maximum row size is greater
> than the size of the IO buffer, since we need the whole row (from start to
> finish) in a contiguous buffer to do the rest of our logic. The IO buffer
> currently defaults to 64 MB.