[jira] [Commented] (HIVE-22330) Maximize smallBuffer usage in BytesColumnVector

2019-10-13 Thread Attila Zsolt Piros (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16950277#comment-16950277
 ] 

Attila Zsolt Piros commented on HIVE-22330:
---

I think the intention behind the *smallBuffer* was to avoid fragmented 
allocation. As you can see, when *nextElemLength* exceeds 
*MAX_SIZE_FOR_SMALL_BUFFER* (1MB), a new buffer is allocated for the data, and 
in that case the only possible exception is OOM. So I think the same should 
hold when the *smallBuffer* is chosen: no other exception should be thrown 
(besides the implicit OOM).
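
The allocation choice described in this comment can be sketched roughly as follows. This is a simplified illustration, not Hive's actual code: the class and both helper methods are hypothetical, and only the 1MB threshold and the names *nextElemLength* and *MAX_SIZE_FOR_SMALL_BUFFER* come from the comment.

```java
class SmallBufferChoice {
    // 1MB threshold, as stated in the comment above.
    static final int MAX_SIZE_FOR_SMALL_BUFFER = 1024 * 1024;

    // Hypothetical helper: true when the value gets its own freshly allocated array.
    static boolean usesDedicatedBuffer(int nextElemLength) {
        return nextElemLength > MAX_SIZE_FOR_SMALL_BUFFER;
    }

    // Hypothetical helper: picks the array that will hold the next value.
    static byte[] bufferFor(int nextElemLength, byte[] smallBuffer) {
        if (usesDedicatedBuffer(nextElemLength)) {
            // For a valid (non-negative) length this can only fail with OutOfMemoryError.
            return new byte[nextElemLength];
        }
        // Shared buffer path: the expectation is that it behaves the same way.
        return smallBuffer;
    }

    public static void main(String[] args) {
        byte[] smallBuffer = new byte[16 * 1024];
        System.out.println(bufferFor(512, smallBuffer) == smallBuffer);             // true
        System.out.println(bufferFor(2 * 1024 * 1024, smallBuffer) == smallBuffer); // false
    }
}
```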
   

> Maximize smallBuffer usage in BytesColumnVector
> ---
>
> Key: HIVE-22330
> URL: https://issues.apache.org/jira/browse/HIVE-22330
> Project: Hive
> Issue Type: Improvement
> Affects Versions: 4.0.0
> Reporter: Karen Coppage
> Assignee: Karen Coppage
> Priority: Major
> Attachments: HIVE-22330.01.patch
>
>
> When BytesColumnVector is populated with values, it either creates a new 
> (byte[]) buffer object to hold the values or, if the values array is <=1MB, 
> reuses a single "smallBuffer" instead of creating a new buffer. Every time 
> the smallBuffer is too small for the data we want to store there, its size 
> is doubled; once the size ends up larger than 1 GB (or Integer.MAX_VALUE / 2), 
> the next attempt to double the size overflows and an error is thrown.
> A quick fix here is to set the smallBuffer size to Integer.MAX_VALUE in this 
> case.
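
The doubling overflow described in the quoted issue, and the proposed quick fix, can be illustrated with a minimal sketch. The class and the `nextBufferSize` helper are hypothetical and only model the size arithmetic; they are not taken from the attached patch, and no buffer is actually allocated here.

```java
class SmallBufferGrowth {
    // Hypothetical helper: doubles the buffer size, but falls back to
    // Integer.MAX_VALUE when doubling would overflow the int range.
    static int nextBufferSize(int currentSize) {
        if (currentSize > Integer.MAX_VALUE / 2) {
            return Integer.MAX_VALUE;
        }
        return currentSize * 2;
    }

    public static void main(String[] args) {
        int oneGiB = 1 << 30; // 1073741824, just above Integer.MAX_VALUE / 2
        // Naive doubling wraps around to a negative size.
        System.out.println(oneGiB * 2);             // -2147483648
        // Capped doubling stays within the int range.
        System.out.println(nextBufferSize(oneGiB)); // 2147483647
        System.out.println(nextBufferSize(1024));   // 2048
    }
}
```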



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-21916) Avoid overflow as a result of casting to bigint at the "ceil", "ceiling" and "floor" SQL functions

2019-06-24 Thread Attila Zsolt Piros (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Attila Zsolt Piros updated HIVE-21916:
--
Description: 
The return type of the ceil, ceiling and floor SQL functions is bigint, which 
leads to overflow:
{code:java}
hive> select version(), ceil(1.2345678901234e+200), 
ceiling(1.2345678901234e+200), floor(1.2345678901234e+200);
OK
4.0.0-SNAPSHOT r11f78562ab36333cc1d0a3f6051d9846c9c92132    9223372036854775807    9223372036854775807    9223372036854775807
{code}
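
The value 9223372036854775807 is Long.MAX_VALUE, which is what Java produces when a double that is too large is narrowed to a 64-bit integer. The sketch below only demonstrates that language behaviour; the `ceilAsLong` helper is hypothetical, and whether Hive's implementation performs exactly this cast is an assumption.

```java
class CeilCastOverflow {
    // Hypothetical helper: a ceil whose double result is narrowed to long.
    static long ceilAsLong(double value) {
        return (long) Math.ceil(value);
    }

    public static void main(String[] args) {
        double input = 1.2345678901234e+200;
        // The narrowing conversion saturates instead of failing.
        System.out.println(ceilAsLong(input));                   // 9223372036854775807
        System.out.println(ceilAsLong(input) == Long.MAX_VALUE); // true
    }
}
```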
The explain returned:
{code:java}
++
| Explain |
++
| STAGE DEPENDENCIES: |
| Stage-0 is a root stage |
| |
| STAGE PLANS: |
| Stage: Stage-0 |
| Fetch Operator |
| limit: -1 |
| Processor Tree: |
| TableScan |
| alias: _dummy_table |
| Row Limit Per Split: 1 |
| Statistics: Num rows: 1 Data size: 10 Basic stats: COMPLETE Column stats: COMPLETE |
| Select Operator |
| expressions: '4.0.0-SNAPSHOT r11f78562ab36333cc1d0a3f6051d9846c9c92132' (type: string), 9223372036854775807L (type: bigint), 9223372036854775807L (type: bigint), 9223372036854775807L (type: bigint) |
| outputColumnNames: _col0, _col1, _col2, _col3 |
| Statistics: Num rows: 1 Data size: 164 Basic stats: COMPLETE Column stats: COMPLETE |
| ListSink |
| |
++
{code}
Meanwhile, in other SQL engines:

*PostgreSQL:*
{code:java}
postgres=# select version(), ceil(1.2345678901234e+200), ceiling(1.2345678901234e+200), floor(1.2345678901234e+200);
 version | ceil | ceiling | floor
---------+------+---------+-------
 PostgreSQL 11.3 (Debian 11.3-1.pgdg90+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516, 64-bit | 12345678901234[...]000 | 1234567890123400[...]0 | 12345678901234[...]000
(1 row)
{code}
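
For comparison, an arbitrary-precision ceiling in Java keeps every digit of the result. This is only a sketch of the general idea using java.math.BigDecimal; the `ceil` helper is hypothetical and is not a proposal for how Hive should implement the fix.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

class CeilWithoutOverflow {
    // Hypothetical helper: rounds a decimal literal up to the nearest integer
    // without ever narrowing to a fixed-width type.
    static BigDecimal ceil(String literal) {
        return new BigDecimal(literal).setScale(0, RoundingMode.CEILING);
    }

    public static void main(String[] args) {
        String digits = ceil("1.2345678901234e+200").toPlainString();
        System.out.println(digits.length());                     // 201
        System.out.println(digits.startsWith("12345678901234")); // true
        System.out.println(ceil("1.5").toPlainString());         // 2
    }
}
```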
*MySQL:*
  
{code:java}
mysql> select version(), ceil(1.2345678901234e+200), ceiling(1.2345678901234e+200), floor(1.2345678901234e+200);
+---+---+---+---+
| version() | ceil(1.2345678901234e+200) | ceiling(1.2345678901234e+200) | floor(1.2345678901234e+200) |
+---+---+---+---+
| 5.7.26 | 12345678901234000 |

[jira] [Updated] (HIVE-21916) Avoid overflow as a result of casting to bigint at the "ceil", "ceiling" and "floor" SQL functions

2019-06-24 Thread Attila Zsolt Piros (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Attila Zsolt Piros updated HIVE-21916:
--
Description: 
The return type of the ceil, ceiling and floor SQL functions is bigint, which 
leads to overflow:
{code:java}
hive> select version(), ceil(1.2345678901234e+200), 
ceiling(1.2345678901234e+200), floor(1.2345678901234e+200);
OK
4.0.0-SNAPSHOT r11f78562ab36333cc1d0a3f6051d9846c9c92132    9223372036854775807    9223372036854775807    9223372036854775807
{code}
The explain returned:


{code}
expressions: '4.0.0-SNAPSHOT r11f78562ab36333cc1d0a3f6051d9846c9c92132' (type: string), 9223372036854775807L (type: bigint), 9223372036854775807L (type: bigint), 9223372036854775807L (type: bigint)
{code}

Meanwhile, in other SQL engines:

*PostgreSQL:*
{code:java}
postgres=# select version(), ceil(1.2345678901234e+200), ceiling(1.2345678901234e+200), floor(1.2345678901234e+200);
 version | ceil | ceiling | floor
---------+------+---------+-------
 PostgreSQL 11.3 (Debian 11.3-1.pgdg90+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516, 64-bit | 12345678901234[...]000 | 1234567890123400[...]0 | 12345678901234[...]000
(1 row)
{code}
*MySQL:*
  
{code:java}
mysql> select version(), ceil(1.2345678901234e+200), ceiling(1.2345678901234e+200), floor(1.2345678901234e+200);
+---+---+---+---+
| version() | ceil(1.2345678901234e+200) | ceiling(1.2345678901234e+200) | floor(1.2345678901234e+200) |
+---+---+---+---+
| 5.7.26 | 12345678901234000 | 12345678901234000 | 12345678901234000 |

[jira] [Updated] (HIVE-21916) Avoid overflow as a result of casting to bigint at the "ceil", "ceiling" and "floor" SQL functions

2019-06-24 Thread Attila Zsolt Piros (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Attila Zsolt Piros updated HIVE-21916:
--
Summary: Avoid overflow as a result of casting to bigint at the "ceil", 
"ceiling" and "floor" SQL functions  (was: Avoid overflow as a result of 
casting to long at the "ceil", "ceiling" and "floor" SQL functions)

> Avoid overflow as a result of casting to bigint at the "ceil", "ceiling" and 
> "floor" SQL functions
> --
>
> Key: HIVE-21916
> URL: https://issues.apache.org/jira/browse/HIVE-21916
> Project: Hive
> Issue Type: Improvement
> Affects Versions: 4.0.0
> Reporter: Attila Zsolt Piros
> Priority: Major
>
> The return type of the ceil, ceiling and floor SQL functions is long, which 
> leads to overflow:
> {code}
> hive> select version(), ceil(1.2345678901234e+200), 
> ceiling(1.2345678901234e+200), floor(1.2345678901234e+200);
> OK
> 4.0.0-SNAPSHOT r11f78562ab36333cc1d0a3f6051d9846c9c92132    9223372036854775807    9223372036854775807    9223372036854775807
> {code}
>  
> Meanwhile, in other SQL engines:
> *PostgreSQL:*
> {code}
> postgres=# select version(), ceil(1.2345678901234e+200), ceiling(1.2345678901234e+200), floor(1.2345678901234e+200);
>  version | ceil | ceiling | floor
> ---------+------+---------+-------
>  PostgreSQL 11.3 (Debian 11.3-1.pgdg90+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516, 64-bit | 12345678901234[...]000 | 1234567890123400[...]0 | 12345678901234[...]000
> (1 row)
> {code}
> *MySQL:*
>  
> {code}
> mysql> select version(), ceil(1.2345678901234e+200), ceiling(1.2345678901234e+200), floor(1.2345678901234e+200);
> +---+---+---+---+
> | version() | ceil(1.2345678901234e+200) | ceiling(1.2345678901234e+200) | floor(1.2345678901234e+200) |
> +---+---+---+---+
> | 5.7.26 | 12345678901234000 |
> 

[jira] [Updated] (HIVE-21916) Avoid overflow as a result of casting to long at the "ceil", "ceiling" and "floor" SQL functions

2019-06-24 Thread Attila Zsolt Piros (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Attila Zsolt Piros updated HIVE-21916:
--
Summary: Avoid overflow as a result of casting to long at the "ceil", 
"ceiling" and "floor" SQL functions  (was: Avoid overflow as a result of 
casting to long for the "ceil", "ceiling" and "floor" SQL functions)


[jira] [Updated] (HIVE-21916) Avoid overflow as a result of casting to long for the "ceil", "ceiling" and "floor" SQL functions

2019-06-24 Thread Attila Zsolt Piros (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Attila Zsolt Piros updated HIVE-21916:
--
Summary: Avoid overflow as a result of casting to long for the "ceil", 
"ceiling" and "floor" SQL functions  (was: Avoid overflow because of casting in 
case of the "ceil", "ceiling" and "floor" SQL functions)


[jira] [Updated] (HIVE-21916) Avoid overflow because of casting in case of the "ceil", "ceiling" and "floor" SQL functions

2019-06-24 Thread Attila Zsolt Piros (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Attila Zsolt Piros updated HIVE-21916:
--
Description: 
The return type of the ceil, ceiling and floor SQL functions is long, which 
leads to overflow:
{code}
hive> select version(), ceil(1.2345678901234e+200), 
ceiling(1.2345678901234e+200), floor(1.2345678901234e+200);
OK
4.0.0-SNAPSHOT r11f78562ab36333cc1d0a3f6051d9846c9c92132    9223372036854775807    9223372036854775807    9223372036854775807
{code}
 
Meanwhile, in other SQL engines:

*PostgreSQL:*
{code}
postgres=# select version(), ceil(1.2345678901234e+200), ceiling(1.2345678901234e+200), floor(1.2345678901234e+200);
 version | ceil | ceiling | floor
---------+------+---------+-------
 PostgreSQL 11.3 (Debian 11.3-1.pgdg90+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516, 64-bit | 12345678901234[...]000 | 1234567890123400[...]0 | 12345678901234[...]000
(1 row)
{code}

*MySQL:*
 
{code}
mysql> select version(), ceil(1.2345678901234e+200), ceiling(1.2345678901234e+200), floor(1.2345678901234e+200);
+---+---+---+---+
| version() | ceil(1.2345678901234e+200) | ceiling(1.2345678901234e+200) | floor(1.2345678901234e+200) |
+---+---+---+---+
| 5.7.26 | 12345678901234000 | 12345678901234000 | 12345678901234000 |