[jira] [Updated] (HIVE-12837) Better memory estimation/allocation for hybrid grace hash join during hash table loading

2016-05-05 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-12837:
-
   Resolution: Fixed
Fix Version/s: 2.1.0
   Status: Resolved  (was: Patch Available)

Thanks [~vikram.dixit] and [~sershe] for the review. Committed to master.

> Better memory estimation/allocation for hybrid grace hash join during hash 
> table loading
> 
>
> Key: HIVE-12837
> URL: https://issues.apache.org/jira/browse/HIVE-12837
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.1.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Fix For: 2.1.0
>
> Attachments: HIVE-12837.1.patch, HIVE-12837.2.patch, 
> HIVE-12837.3.patch, HIVE-12837.4.patch, HIVE-12837.5.patch
>
>
> This is to avoid an edge case when the memory available is very little (less 
> than a single write buffer size), and we start loading the hash table. Since 
> the write buffer is lazily allocated, we will easily run out of memory before 
> even checking if we should spill any hash partition.
> e.g.
> Total memory available: 210 MB
> Size of ref array of BytesBytesMultiHashMap for each hash partition: ~16 MB
> Size of write buffer: 8 MB (lazy allocation)
> Number of hash partitions: 16
> Number of hash partitions created in memory: 13
> Number of hash partitions created on disk: 3
> Available memory left after HybridHashTableContainer initialization: 
> 210-16*13=2MB
> Now let's say a row is to be loaded into a hash partition in memory, it will 
> try to allocate an 8MB write buffer for it, but we only have 2MB, thus OOM.
> Solution is to perform the check for possible spilling earlier so we can 
> spill partitions if memory is about to be full, to avoid OOM.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12837) Better memory estimation/allocation for hybrid grace hash join during hash table loading

2016-04-25 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-12837:
-
Attachment: HIVE-12837.5.patch

I don't think test failures are related. But to make sure, I cloned patch 4 to 
patch 5, for another round of precommit test.

> Better memory estimation/allocation for hybrid grace hash join during hash 
> table loading
> 
>
> Key: HIVE-12837
> URL: https://issues.apache.org/jira/browse/HIVE-12837
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.1.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-12837.1.patch, HIVE-12837.2.patch, 
> HIVE-12837.3.patch, HIVE-12837.4.patch, HIVE-12837.5.patch
>
>
> This is to avoid an edge case when the memory available is very little (less 
> than a single write buffer size), and we start loading the hash table. Since 
> the write buffer is lazily allocated, we will easily run out of memory before 
> even checking if we should spill any hash partition.
> e.g.
> Total memory available: 210 MB
> Size of ref array of BytesBytesMultiHashMap for each hash partition: ~16 MB
> Size of write buffer: 8 MB (lazy allocation)
> Number of hash partitions: 16
> Number of hash partitions created in memory: 13
> Number of hash partitions created on disk: 3
> Available memory left after HybridHashTableContainer initialization: 
> 210-16*13=2MB
> Now let's say a row is to be loaded into a hash partition in memory, it will 
> try to allocate an 8MB write buffer for it, but we only have 2MB, thus OOM.
> Solution is to perform the check for possible spilling earlier so we can 
> spill partitions if memory is about to be full, to avoid OOM.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12837) Better memory estimation/allocation for hybrid grace hash join during hash table loading

2016-04-22 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-12837:
-
Attachment: HIVE-12837.4.patch

Rebased patch 4 for test

> Better memory estimation/allocation for hybrid grace hash join during hash 
> table loading
> 
>
> Key: HIVE-12837
> URL: https://issues.apache.org/jira/browse/HIVE-12837
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.1.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-12837.1.patch, HIVE-12837.2.patch, 
> HIVE-12837.3.patch, HIVE-12837.4.patch
>
>
> This is to avoid an edge case when the memory available is very little (less 
> than a single write buffer size), and we start loading the hash table. Since 
> the write buffer is lazily allocated, we will easily run out of memory before 
> even checking if we should spill any hash partition.
> e.g.
> Total memory available: 210 MB
> Size of ref array of BytesBytesMultiHashMap for each hash partition: ~16 MB
> Size of write buffer: 8 MB (lazy allocation)
> Number of hash partitions: 16
> Number of hash partitions created in memory: 13
> Number of hash partitions created on disk: 3
> Available memory left after HybridHashTableContainer initialization: 
> 210-16*13=2MB
> Now let's say a row is to be loaded into a hash partition in memory, it will 
> try to allocate an 8MB write buffer for it, but we only have 2MB, thus OOM.
> Solution is to perform the check for possible spilling earlier so we can 
> spill partitions if memory is about to be full, to avoid OOM.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12837) Better memory estimation/allocation for hybrid grace hash join during hash table loading

2016-01-18 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-12837:
-
Attachment: HIVE-12837.3.patch

patch 3 for test

> Better memory estimation/allocation for hybrid grace hash join during hash 
> table loading
> 
>
> Key: HIVE-12837
> URL: https://issues.apache.org/jira/browse/HIVE-12837
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.1.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-12837.1.patch, HIVE-12837.2.patch, 
> HIVE-12837.3.patch
>
>
> This is to avoid an edge case when the memory available is very little (less 
> than a single write buffer size), and we start loading the hash table. Since 
> the write buffer is lazily allocated, we will easily run out of memory before 
> even checking if we should spill any hash partition.
> e.g.
> Total memory available: 210 MB
> Size of ref array of BytesBytesMultiHashMap for each hash partition: ~16 MB
> Size of write buffer: 8 MB (lazy allocation)
> Number of hash partitions: 16
> Number of hash partitions created in memory: 13
> Number of hash partitions created on disk: 3
> Available memory left after HybridHashTableContainer initialization: 
> 210-16*13=2MB
> Now let's say a row is to be loaded into a hash partition in memory, it will 
> try to allocate an 8MB write buffer for it, but we only have 2MB, thus OOM.
> Solution is to perform the check for possible spilling earlier so we can 
> spill partitions if memory is about to be full, to avoid OOM.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12837) Better memory estimation/allocation for hybrid grace hash join during hash table loading

2016-01-12 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-12837:
-
Attachment: HIVE-12837.2.patch

patch 2 for testing

> Better memory estimation/allocation for hybrid grace hash join during hash 
> table loading
> 
>
> Key: HIVE-12837
> URL: https://issues.apache.org/jira/browse/HIVE-12837
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.1.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-12837.1.patch, HIVE-12837.2.patch
>
>
> This is to avoid an edge case when the memory available is very little (less 
> than a single write buffer size), and we start loading the hash table. Since 
> the write buffer is lazily allocated, we will easily run out of memory before 
> even checking if we should spill any hash partition.
> e.g.
> Total memory available: 210 MB
> Size of ref array of BytesBytesMultiHashMap for each hash partition: ~16 MB
> Size of write buffer: 8 MB (lazy allocation)
> Number of hash partitions: 16
> Number of hash partitions created in memory: 13
> Number of hash partitions created on disk: 3
> Available memory left after HybridHashTableContainer initialization: 
> 210-16*13=2MB
> Now let's say a row is to be loaded into a hash partition in memory, it will 
> try to allocate an 8MB write buffer for it, but we only have 2MB, thus OOM.
> Solution is to perform the check for possible spilling earlier so we can 
> spill partitions if memory is about to be full, to avoid OOM.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12837) Better memory estimation/allocation for hybrid grace hash join during hash table loading

2016-01-11 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-12837:
-
Attachment: HIVE-12837.1.patch

> Better memory estimation/allocation for hybrid grace hash join during hash 
> table loading
> 
>
> Key: HIVE-12837
> URL: https://issues.apache.org/jira/browse/HIVE-12837
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.1.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-12837.1.patch
>
>
> This is to avoid an edge case when the memory available is very little (less 
> than a single write buffer size), and we start loading the hash table. Since 
> the write buffer is lazily allocated, we will easily run out of memory before 
> even checking if we should spill any hash partition.
> e.g.
> Total memory available: 210 MB
> Size of ref array of BytesBytesMultiHashMap for each hash partition: ~16 MB
> Size of write buffer: 8 MB (lazy allocation)
> Number of hash partitions: 16
> Number of hash partitions created in memory: 13
> Number of hash partitions created on disk: 3
> Available memory left after HybridHashTableContainer initialization: 
> 210-16*13=2MB
> Now let's say a row is to be loaded into a hash partition in memory, it will 
> try to allocate an 8MB write buffer for it, but we only have 2MB, thus OOM.
> Solution is to perform the check for possible spilling earlier so we can 
> spill partitions if memory is about to be full, to avoid OOM.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12837) Better memory estimation/allocation for hybrid grace hash join during hash table loading

2016-01-11 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-12837:
-
Description: 
This is to avoid an edge case when the memory available is very little (less 
than a single write buffer size), and we start loading the hash table. Since 
the write buffer is lazily allocated, we will easily run out of memory before 
even checking if we should spill any hash partition.

e.g.
Total memory available: 210 MB
Size of ref array of BytesBytesMultiHashMap for each hash partition: ~16 MB
Size of write buffer: 8 MB (lazy allocation)
Number of hash partitions: 16
Number of hash partitions created in memory: 13
Number of hash partitions created on disk: 3
Available memory left after HybridHashTableContainer initialization: 
210-16*13=2MB

Now let's say a row is to be loaded into a hash partition in memory, it will 
try to allocate an 8MB write buffer for it, but we only have 2MB, thus OOM.

Solution is to perform the check for possible spilling earlier so we can spill 
partitions if memory is about to be full, to avoid OOM.

  was:
This is to avoid an edge case when the memory available is very little (less 
than a single write buffer size), and we start loading the hash table. Since 
the write buffer is lazily allocated, we will easily run out of memory before 
even checking if we should spill any hash partition.

e.g.
Total memory available: 210 MB
Size of ref array of BytesBytesMultiHashMap for each hash partition: ~16 MB
Size of write buffer: 8 MB (lazy allocation)
# hash partitions: 16
# hash partitions created in memory: 13
# hash partitions created on disk: 3
Available memory left after HybridHashTableContainer initialization: 
210-16*13=2MB

Now let's say a row is to be loaded into a hash partition in memory, it will 
try to allocate an 8MB write buffer for it, but we only have 2MB, thus OOM.

Solution is to perform the check for possible spilling earlier so we can spill 
partitions if memory is about to be full, to avoid OOM.


> Better memory estimation/allocation for hybrid grace hash join during hash 
> table loading
> 
>
> Key: HIVE-12837
> URL: https://issues.apache.org/jira/browse/HIVE-12837
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.1.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
>
> This is to avoid an edge case when the memory available is very little (less 
> than a single write buffer size), and we start loading the hash table. Since 
> the write buffer is lazily allocated, we will easily run out of memory before 
> even checking if we should spill any hash partition.
> e.g.
> Total memory available: 210 MB
> Size of ref array of BytesBytesMultiHashMap for each hash partition: ~16 MB
> Size of write buffer: 8 MB (lazy allocation)
> Number of hash partitions: 16
> Number of hash partitions created in memory: 13
> Number of hash partitions created on disk: 3
> Available memory left after HybridHashTableContainer initialization: 
> 210-16*13=2MB
> Now let's say a row is to be loaded into a hash partition in memory, it will 
> try to allocate an 8MB write buffer for it, but we only have 2MB, thus OOM.
> Solution is to perform the check for possible spilling earlier so we can 
> spill partitions if memory is about to be full, to avoid OOM.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)