[ 
https://issues.apache.org/jira/browse/HIVE-23085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17071637#comment-17071637
 ] 

Syed Shameerur Rahman commented on HIVE-23085:
----------------------------------------------


{noformat}
A memory mapped file throws a SIGBUS and kills the whole process when the disk 
it is mapped onto is full.
{noformat}
 We can still hit this issue even with a single memory-mapped disk. If the disk 
space is shared between the cache and shuffle data (or any other application), 
I can run into this issue at any time because the OS writes pages back lazily.


{noformat}
This can be fixed by pre-allocating all the disk space on startup, but that 
made startup much much slower (added 7 minutes to startup times).
{noformat}

Yes, I tried pre-allocating. Since there is no fallocate in Java, I tried a 
hacky approach using FileChannel.transferTo() from a sparse file. It 
consistently took around 16 minutes to allocate 256 GB of cache with the 
default arenaCount and maxAlloc size.
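
For reference, a minimal sketch of what pre-allocation without fallocate can 
look like in plain Java. This is not the transferTo() approach described above 
and not code from the patch; it simply writes zero-filled chunks so that the 
file system has to back every block (assuming a file system that does not 
compress or deduplicate zeros). The class name Preallocate, the method 
preallocate() and the 1 MiB chunk size are hypothetical choices for 
illustration.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class Preallocate {
    // Hypothetical chunk size for illustration; not taken from the LLAP code.
    private static final int CHUNK_BYTES = 1 << 20;

    /**
     * Writes sizeBytes of zeros to the file so that blocks are actually
     * allocated, instead of leaving a sparse file. Returns the bytes written.
     */
    static long preallocate(Path file, long sizeBytes) throws IOException {
        // Direct buffers are zero-initialized, so this is a reusable block of zeros.
        ByteBuffer zeros = ByteBuffer.allocateDirect(CHUNK_BYTES);
        long written = 0;
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            while (written < sizeBytes) {
                zeros.clear();
                zeros.limit((int) Math.min(CHUNK_BYTES, sizeBytes - written));
                // A single write() may be partial, so drain the buffer.
                while (zeros.hasRemaining()) {
                    written += ch.write(zeros);
                }
            }
            // Flush to disk so that a full disk shows up as an IOException
            // here rather than later, when the file is memory mapped.
            ch.force(true);
        }
        return written;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("llap-cache", ".bin");
        try {
            long n = preallocate(file, 8L * CHUNK_BYTES);
            System.out.println("Preallocated " + n + " bytes at " + file);
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

The cost is the same as in the numbers above: every byte of the cache has to 
be written once at startup, so the time grows linearly with the cache size.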



> LLAP: Support Multiple NVMe-SSD disk Locations While Using SSD Cache
> --------------------------------------------------------------------
>
>                 Key: HIVE-23085
>                 URL: https://issues.apache.org/jira/browse/HIVE-23085
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Syed Shameerur Rahman
>            Assignee: Syed Shameerur Rahman
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0
>
>         Attachments: HIVE-23085.01.patch
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently we can configure only one SSD location while using the SSD cache in 
> LLAP. This severely limits the ability of some machines to use their full 
> disk capacity. For example, *AWS* provides *r5d.4xlarge* instances, which 
> come with *2 x 300 GB NVMe SSD disks*; with the current design, only one of 
> the mounted *NVMe SSD* disks can be used for caching. Hence this issue adds 
> support for caching data at multiple SSD-mounted locations.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
