Hi Team,

I am creating this table:

CREATE TABLE IF NOT EXISTS orctest2 (
  id string,
  id2 string,
  id3 string,
  id4 string
)
STORED AS ORC
tblproperties ("orc.stripe.size"="1048576", "orc.row.index.stride"="3333");

The stripe size is set to 1MB. After loading data, the table file is about 60MB:

-rwxr-xr-x 1 root root 61335650 Dec 2 18:08 000000_0

However, it generated 1492 stripes, and each stripe is only about 40KB:

Stripes:
  Stripe: offset: 3 data: 39124 rows: 5000 tail: 68 index: 292
    Stream: column 0 section ROW_INDEX start: 3 length 16
    Stream: column 1 section ROW_INDEX start: 19 length 69
    Stream: column 2 section ROW_INDEX start: 88 length 69
    Stream: column 3 section ROW_INDEX start: 157 length 69
    Stream: column 4 section ROW_INDEX start: 226 length 69
    Stream: column 1 section DATA start: 295 length 9762
    Stream: column 1 section LENGTH start: 10057 length 19
    Stream: column 2 section DATA start: 10076 length 9762
    Stream: column 2 section LENGTH start: 19838 length 19
    Stream: column 3 section DATA start: 19857 length 9762
    Stream: column 3 section LENGTH start: 29619 length 19
    Stream: column 4 section DATA start: 29638 length 9762
    Stream: column 4 section LENGTH start: 39400 length 19
    Encoding column 0: DIRECT
    Encoding column 1: DIRECT_V2
    Encoding column 2: DIRECT_V2
    Encoding column 3: DIRECT_V2
    Encoding column 4: DIRECT_V2

Does anybody know how the ORC stripe size is determined?

Thanks.

--
Thanks,
www.openkb.info
(Open KnowledgeBase for Hadoop/Database/OS/Network/Tool)
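
P.S. As a quick sanity check on the numbers above, here is a minimal Python sketch. The variable names are my own; the values are copied from the file listing, the table properties, and the first stripe in the dump. It only redoes the arithmetic (an average on-disk stripe of about 41,110 bytes against a configured 1,048,576) and does not explain why the stripes come out this small.

```python
# Values copied from the post; the variable names are illustrative only.
file_size_bytes = 61_335_650          # size of 000000_0 from the ls listing
stripe_count = 1492                   # number of stripes reported
configured_stripe_bytes = 1_048_576   # "orc.stripe.size" table property

# First stripe as reported in the dump: index, data, and tail (footer) sizes.
index_bytes, data_bytes, tail_bytes = 292, 39_124, 68
first_stripe_bytes = index_bytes + data_bytes + tail_bytes

# Per-stream lengths from the dump, to confirm they add up to the totals.
row_index_lengths = [16, 69, 69, 69, 69]   # columns 0..4, ROW_INDEX
data_stream_lengths = [9762, 19] * 4       # DATA + LENGTH for columns 1..4

# Average on-disk stripe size, and how far it is from the configured size.
avg_stripe_bytes = file_size_bytes / stripe_count
ratio = configured_stripe_bytes / avg_stripe_bytes

print(f"index streams sum to {sum(row_index_lengths)} (dump says {index_bytes})")
print(f"data streams sum to {sum(data_stream_lengths)} (dump says {data_bytes})")
print(f"first stripe on disk: {first_stripe_bytes} bytes")
print(f"average stripe on disk: {avg_stripe_bytes:.0f} bytes")
print(f"configured size is about {ratio:.1f}x the average stripe")
```

So the per-stream lengths are consistent with the stripe header line, and the on-disk stripes are roughly 25 times smaller than the configured stripe size.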