[jira] [Commented] (HIVE-22330) Maximize smallBuffer usage in BytesColumnVector
[ https://issues.apache.org/jira/browse/HIVE-22330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16950277#comment-16950277 ]

Attila Zsolt Piros commented on HIVE-22330:
-------------------------------------------

I think the intention behind using *smallBuffer* was to avoid fragmented allocation. As you can see, when *nextElemLength* exceeds *MAX_SIZE_FOR_SMALL_BUFFER* (1 MB), a new buffer is allocated for the data, and in that case the only possible exception is an OOM. So I think the same should hold when the *smallBuffer* is chosen: no other exception should be thrown (besides the implicit OOM).

> Maximize smallBuffer usage in BytesColumnVector
> -----------------------------------------------
>
>                 Key: HIVE-22330
>                 URL: https://issues.apache.org/jira/browse/HIVE-22330
>             Project: Hive
>          Issue Type: Improvement
>    Affects Versions: 4.0.0
>            Reporter: Karen Coppage
>            Assignee: Karen Coppage
>            Priority: Major
>         Attachments: HIVE-22330.01.patch
>
>
> When BytesColumnVector is populated with values, it normally creates a new
> (byte[]) buffer object to hold the values, but if the values array is
> <= 1 MB, it reuses a single "smallBuffer" instead of creating a new buffer.
> Every time the smallBuffer is too small for the data we want to store there,
> its size is doubled; when the size ends up larger than 1 GB
> (or Integer.MAX_VALUE / 2), then the next time we try to double the size,
> overflow occurs and an error is thrown.
> A quick fix here is to set the smallBuffer size to Integer.MAX_VALUE in this
> case.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
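
To make the doubling overflow described in the issue concrete, here is a minimal sketch in plain Java. It is not Hive's actual BytesColumnVector code: the class SmallBufferGrowth and the methods naiveDoubled and nextCapacity are hypothetical names used only for illustration. It shows why doubling an int size above Integer.MAX_VALUE / 2 goes negative, and one way to clamp the size to Integer.MAX_VALUE as the issue proposes. Only sizes are computed here; actually allocating an array that large may still fail with an OOM.

```java
// Hypothetical illustration, not Hive's BytesColumnVector implementation.
final class SmallBufferGrowth {

    // Naive doubling in int arithmetic: the result wraps around to a
    // negative number once current exceeds Integer.MAX_VALUE / 2.
    static int naiveDoubled(int current) {
        return current * 2;
    }

    // Overflow-safe variant: compute the doubled size in long arithmetic,
    // make sure it covers the required length, then clamp the result to
    // Integer.MAX_VALUE (the "quick fix" suggested in the issue).
    static int nextCapacity(int current, int required) {
        long doubled = 2L * current;
        long wanted = Math.max(doubled, (long) required);
        return (int) Math.min(wanted, (long) Integer.MAX_VALUE);
    }

    public static void main(String[] args) {
        int oneGiB = 1 << 30; // just above Integer.MAX_VALUE / 2
        System.out.println("naive doubling:   " + naiveDoubled(oneGiB));
        System.out.println("clamped doubling: " + nextCapacity(oneGiB, oneGiB + 1));
    }
}
```

With the clamped variant the only remaining failure mode is the allocation itself, which matches the point made in the comment above that no exception other than the implicit OOM should be thrown.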
[jira] [Updated] (HIVE-21916) Avoid overflow as a result of casting to bigint at the "ceil", "ceiling" and "floor" SQL functions
[ https://issues.apache.org/jira/browse/HIVE-21916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Attila Zsolt Piros updated HIVE-21916:
--------------------------------------
    Description: 
The return type of the ceil, ceiling and floor SQL functions is bigint, and this leads to overflow:
{code:java}
hive> select version(), ceil(1.2345678901234e+200), ceiling(1.2345678901234e+200), floor(1.2345678901234e+200);
OK
4.0.0-SNAPSHOT r11f78562ab36333cc1d0a3f6051d9846c9c92132  9223372036854775807  9223372036854775807  9223372036854775807
{code}
The explain returned:
{code:java}
+----------------------------------------------------+
|                      Explain                       |
+----------------------------------------------------+
| STAGE DEPENDENCIES:                                |
|   Stage-0 is a root stage                          |
|                                                    |
| STAGE PLANS:                                       |
|   Stage: Stage-0                                   |
|     Fetch Operator                                 |
|       limit: -1                                    |
|       Processor Tree:                              |
|         TableScan                                  |
|           alias: _dummy_table                      |
|           Row Limit Per Split: 1                   |
|           Statistics: Num rows: 1 Data size: 10 Basic stats: COMPLETE Column stats: COMPLETE |
|           Select Operator                          |
|             expressions: '4.0.0-SNAPSHOT r11f78562ab36333cc1d0a3f6051d9846c9c92132' (type: string), 9223372036854775807L (type: bigint), 9223372036854775807L (type: bigint), 9223372036854775807L (type: bigint) |
|             outputColumnNames: _col0, _col1, _col2, _col3 |
|             Statistics: Num rows: 1 Data size: 164 Basic stats: COMPLETE Column stats: COMPLETE |
|             ListSink                               |
|                                                    |
+----------------------------------------------------+
{code}
Meanwhile, in other SQL engines:

*PostgreSQL:*
{code:java}
postgres=# select version(), ceil(1.2345678901234e+200), ceiling(1.2345678901234e+200), floor(1.2345678901234e+200);
 version | ceil | ceiling | floor
---+---+---+---
 PostgreSQL 11.3 (Debian 11.3-1.pgdg90+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516, 64-bit | 12345678901234 000 | 1234567890123400 0 | 12345678901234 000
(1 row)
{code}
*MySQL:*
{code:java}
mysql> select version(), ceil(1.2345678901234e+200), ceiling(1.2345678901234e+200), floor(1.2345678901234e+200);
+---+---+---+---+
| version() | ceil(1.2345678901234e+200) | ceiling(1.2345678901234e+200) | floor(1.2345678901234e+200) |
+---+---+---+---+
| 5.7.26 | 12345678901234000 |
[jira] [Updated] (HIVE-21916) Avoid overflow as a result of casting to bigint at the "ceil", "ceiling" and "floor" SQL functions
[ https://issues.apache.org/jira/browse/HIVE-21916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Attila Zsolt Piros updated HIVE-21916:
--------------------------------------
    Description: 
The return type of the ceil, ceiling and floor SQL functions is bigint, and this leads to overflow:
{code:java}
hive> select version(), ceil(1.2345678901234e+200), ceiling(1.2345678901234e+200), floor(1.2345678901234e+200);
OK
4.0.0-SNAPSHOT r11f78562ab36333cc1d0a3f6051d9846c9c92132  9223372036854775807  9223372036854775807  9223372036854775807
{code}
The explain returned:
{code}
expressions: '4.0.0-SNAPSHOT r11f78562ab36333cc1d0a3f6051d9846c9c92132' (type: string), 9223372036854775807L (type: bigint), 9223372036854775807L (type: bigint), 9223372036854775807L (type: bigint)
{code}
Meanwhile, in other SQL engines:

*PostgreSQL:*
{code:java}
postgres=# select version(), ceil(1.2345678901234e+200), ceiling(1.2345678901234e+200), floor(1.2345678901234e+200);
 version | ceil | ceiling | floor
---+---+---+---
 PostgreSQL 11.3 (Debian 11.3-1.pgdg90+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516, 64-bit | 12345678901234 000 | 1234567890123400 0 | 12345678901234 000
(1 row)
{code}
*MySQL:*
{code:java}
mysql> select version(), ceil(1.2345678901234e+200), ceiling(1.2345678901234e+200), floor(1.2345678901234e+200);
+---+---+---+---+
| version() | ceil(1.2345678901234e+200) | ceiling(1.2345678901234e+200) | floor(1.2345678901234e+200) |
+---+---+---+---+
| 5.7.26 | 12345678901234000 | 12345678901234000 | 12345678901234000 |
[jira] [Updated] (HIVE-21916) Avoid overflow as a result of casting to bigint at the "ceil", "ceiling" and "floor" SQL functions
[ https://issues.apache.org/jira/browse/HIVE-21916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Attila Zsolt Piros updated HIVE-21916:
--------------------------------------
    Summary: Avoid overflow as a result of casting to bigint at the "ceil", "ceiling" and "floor" SQL functions  (was: Avoid overflow as a result of casting to long at the "ceil", "ceiling" and "floor" SQL functions)

> Avoid overflow as a result of casting to bigint at the "ceil", "ceiling" and "floor" SQL functions
> --------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-21916
>                 URL: https://issues.apache.org/jira/browse/HIVE-21916
>             Project: Hive
>          Issue Type: Improvement
>    Affects Versions: 4.0.0
>            Reporter: Attila Zsolt Piros
>            Priority: Major
>
> The ceil, ceiling and floor SQL functions return type is long and this leads to overflow:
> {code}
> hive> select version(), ceil(1.2345678901234e+200), ceiling(1.2345678901234e+200), floor(1.2345678901234e+200);
> OK
> 4.0.0-SNAPSHOT r11f78562ab36333cc1d0a3f6051d9846c9c92132  9223372036854775807  9223372036854775807  9223372036854775807
> {code}
>
> Meanwhile at other SQL engines.
> *PostgreSQL:*
> {code}
> postgres=# select version(), ceil(1.2345678901234e+200), ceiling(1.2345678901234e+200), floor(1.2345678901234e+200);
>  version | ceil | ceiling | floor
> ---+---+---+---
>  PostgreSQL 11.3 (Debian 11.3-1.pgdg90+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516, 64-bit | 12345678901234 000 | 1234567890123400 0 | 12345678901234 000
> (1 row)
> {code}
> *MySQL:*
>
> {code}
> mysql> select version(), ceil(1.2345678901234e+200), ceiling(1.2345678901234e+200), floor(1.2345678901234e+200);
> +---+---+---+---+
> | version() | ceil(1.2345678901234e+200) | ceiling(1.2345678901234e+200) | floor(1.2345678901234e+200) |
> +---+---+---+---+
> | 5.7.26 | 12345678901234000 |
[jira] [Updated] (HIVE-21916) Avoid overflow as a result of casting to long at the "ceil", "ceiling" and "floor" SQL functions
[ https://issues.apache.org/jira/browse/HIVE-21916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Attila Zsolt Piros updated HIVE-21916:
--------------------------------------
    Summary: Avoid overflow as a result of casting to long at the "ceil", "ceiling" and "floor" SQL functions  (was: Avoid overflow as a result of casting to long for the "ceil", "ceiling" and "floor" SQL functions)

> Avoid overflow as a result of casting to long at the "ceil", "ceiling" and "floor" SQL functions
> ------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-21916
>                 URL: https://issues.apache.org/jira/browse/HIVE-21916
>             Project: Hive
>          Issue Type: Improvement
>    Affects Versions: 4.0.0
>            Reporter: Attila Zsolt Piros
>            Priority: Major
>
> The ceil, ceiling and floor SQL functions return type is long and this leads to overflow:
> {code}
> hive> select version(), ceil(1.2345678901234e+200), ceiling(1.2345678901234e+200), floor(1.2345678901234e+200);
> OK
> 4.0.0-SNAPSHOT r11f78562ab36333cc1d0a3f6051d9846c9c92132  9223372036854775807  9223372036854775807  9223372036854775807
> {code}
>
> Meanwhile at other SQL engines.
> *PostgreSQL:*
> {code}
> postgres=# select version(), ceil(1.2345678901234e+200), ceiling(1.2345678901234e+200), floor(1.2345678901234e+200);
>  version | ceil | ceiling | floor
> ---+---+---+---
>  PostgreSQL 11.3 (Debian 11.3-1.pgdg90+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516, 64-bit | 12345678901234 000 | 1234567890123400 0 | 12345678901234 000
> (1 row)
> {code}
> *MySQL:*
>
> {code}
> mysql> select version(), ceil(1.2345678901234e+200), ceiling(1.2345678901234e+200), floor(1.2345678901234e+200);
> +---+---+---+---+
> | version() | ceil(1.2345678901234e+200) | ceiling(1.2345678901234e+200) | floor(1.2345678901234e+200) |
> +---+---+---+---+
> | 5.7.26 | 12345678901234000 | 12345678901234000
[jira] [Updated] (HIVE-21916) Avoid overflow as a result of casting to long for the "ceil", "ceiling" and "floor" SQL functions
[ https://issues.apache.org/jira/browse/HIVE-21916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Attila Zsolt Piros updated HIVE-21916:
--------------------------------------
    Summary: Avoid overflow as a result of casting to long for the "ceil", "ceiling" and "floor" SQL functions  (was: Avoid overflow because of casting in case of the "ceil", "ceiling" and "floor" SQL functions)

> Avoid overflow as a result of casting to long for the "ceil", "ceiling" and "floor" SQL functions
> -------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-21916
>                 URL: https://issues.apache.org/jira/browse/HIVE-21916
>             Project: Hive
>          Issue Type: Improvement
>    Affects Versions: 4.0.0
>            Reporter: Attila Zsolt Piros
>            Priority: Major
>
> The ceil, ceiling and floor SQL functions return type is long and this leads to overflow:
> {code}
> hive> select version(), ceil(1.2345678901234e+200), ceiling(1.2345678901234e+200), floor(1.2345678901234e+200);
> OK
> 4.0.0-SNAPSHOT r11f78562ab36333cc1d0a3f6051d9846c9c92132  9223372036854775807  9223372036854775807  9223372036854775807
> {code}
>
> Meanwhile at other SQL engines.
> *PostgreSQL:*
> {code}
> postgres=# select version(), ceil(1.2345678901234e+200), ceiling(1.2345678901234e+200), floor(1.2345678901234e+200);
>  version | ceil | ceiling | floor
> ---+---+---+---
>  PostgreSQL 11.3 (Debian 11.3-1.pgdg90+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516, 64-bit | 12345678901234 000 | 1234567890123400 0 | 12345678901234 000
> (1 row)
> {code}
> *MySQL:*
>
> {code}
> mysql> select version(), ceil(1.2345678901234e+200), ceiling(1.2345678901234e+200), floor(1.2345678901234e+200);
> +---+---+---+---+
> | version() | ceil(1.2345678901234e+200) | ceiling(1.2345678901234e+200) | floor(1.2345678901234e+200) |
> +---+---+---+---+
> | 5.7.26 | 12345678901234000 | 12345678901234000
[jira] [Updated] (HIVE-21916) Avoid overflow because of casting in case of the "ceil", "ceiling" and "floor" SQL functions
[ https://issues.apache.org/jira/browse/HIVE-21916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Attila Zsolt Piros updated HIVE-21916:
--------------------------------------
    Description: 
The return type of the ceil, ceiling and floor SQL functions is long, and this leads to overflow:
{code}
hive> select version(), ceil(1.2345678901234e+200), ceiling(1.2345678901234e+200), floor(1.2345678901234e+200);
OK
4.0.0-SNAPSHOT r11f78562ab36333cc1d0a3f6051d9846c9c92132  9223372036854775807  9223372036854775807  9223372036854775807
{code}
Meanwhile, in other SQL engines:

*PostgreSQL:*
{code}
postgres=# select version(), ceil(1.2345678901234e+200), ceiling(1.2345678901234e+200), floor(1.2345678901234e+200);
 version | ceil | ceiling | floor
---+---+---+---
 PostgreSQL 11.3 (Debian 11.3-1.pgdg90+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516, 64-bit | 12345678901234 000 | 1234567890123400 0 | 12345678901234 000
(1 row)
{code}
*MySQL:*
{code}
mysql> select version(), ceil(1.2345678901234e+200), ceiling(1.2345678901234e+200), floor(1.2345678901234e+200);
+---+---+---+---+
| version() | ceil(1.2345678901234e+200) | ceiling(1.2345678901234e+200) | floor(1.2345678901234e+200) |
+---+---+---+---+
| 5.7.26 | 12345678901234000 | 12345678901234000 | 12345678901234000 |
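
The value 9223372036854775807 that Hive returns in HIVE-21916 is Long.MAX_VALUE, which is what a saturating double-to-long cast such as Java's produces for any value above the long range. The sketch below reproduces that effect in plain Java and contrasts it with a decimal result of the kind PostgreSQL and MySQL return. It is only an illustration: CeilOverflowDemo, ceilAsLong, floorAsLong and ceilAsDecimal are hypothetical names, and this is not Hive's UDF code nor the fix chosen for the issue.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical illustration, not Hive's ceil/floor implementation.
final class CeilOverflowDemo {

    // Casting a double that is larger than Long.MAX_VALUE to long
    // saturates at Long.MAX_VALUE instead of failing.
    static long ceilAsLong(double x) {
        return (long) Math.ceil(x);
    }

    static long floorAsLong(double x) {
        return (long) Math.floor(x);
    }

    // Keeping the value in an arbitrary-precision decimal avoids the
    // overflow; the literal is parsed from text so no precision is lost.
    static BigDecimal ceilAsDecimal(String literal) {
        return new BigDecimal(literal).setScale(0, RoundingMode.CEILING);
    }

    public static void main(String[] args) {
        double x = 1.2345678901234e+200;
        System.out.println(ceilAsLong(x));
        System.out.println(floorAsLong(x));
        System.out.println(ceilAsDecimal("1.2345678901234e+200").toPlainString());
    }
}
```

The decimal variant yields the full 201-digit integer, while both long variants collapse to the same saturated value, so ceil and floor become indistinguishable for out-of-range inputs.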