[
https://issues.apache.org/jira/browse/HIVE-23085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17070213#comment-17070213
]
Gopal Vijayaraghavan commented on HIVE-23085:
---------------------------------------------
A memory-mapped file raises SIGBUS and kills the whole process when the disk
it is mapped onto fills up.
This can be fixed by pre-allocating all the disk space on startup, but doing
so made startup much slower (it added 7 minutes to startup times).
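For reference, a minimal sketch of the pre-allocation approach (hypothetical
code, not the actual LLAP implementation): writing real zero blocks up front
forces the filesystem to allocate every page, so a later write through the
mapping cannot fail with ENOSPC and raise SIGBUS. The helper name and the
8 MB chunk size are assumptions.
{code:java}
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class PreallocatedMmap {
  /**
   * Hypothetical sketch: force the filesystem to reserve real blocks before
   * mapping, so page faults on the mapped region never need new allocation.
   * Writing the zeroes is the step that makes startup slow for a large cache.
   */
  static MappedByteBuffer mapPreallocated(String path, long size) throws Exception {
    try (RandomAccessFile raf = new RandomAccessFile(path, "rw");
         FileChannel ch = raf.getChannel()) {
      // setLength() alone may create a sparse file, which does NOT protect
      // against SIGBUS; actual data has to be written to allocate the blocks.
      ByteBuffer zeros = ByteBuffer.allocate(8 * 1024 * 1024); // assumed chunk size
      long written = 0;
      while (written < size) {
        zeros.clear();
        zeros.limit((int) Math.min(zeros.capacity(), size - written));
        written += ch.write(zeros, written);
      }
      ch.force(true); // flush so the blocks are durably allocated
      // Note: FileChannel.map() limits a single mapping to 2 GB; a real cache
      // would map the file in multiple regions.
      return ch.map(FileChannel.MapMode.READ_WRITE, 0, size);
    }
  }
}
{code}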
bq. I am just curious to know: does the extra level of abstraction
significantly degrade the cache performance?
Oddly, in my measurements, LVM was a tiny bit faster than using the
independent disks one by one, particularly for random write operations into
the cache (i.e. evict + replace). It seemed to behave like RAID-0 rather than
adding a performance hit on the IO side.
> LLAP: Support Multiple NVMe-SSD disk Locations While Using SSD Cache
> --------------------------------------------------------------------
>
> Key: HIVE-23085
> URL: https://issues.apache.org/jira/browse/HIVE-23085
> Project: Hive
> Issue Type: Improvement
> Reporter: Syed Shameerur Rahman
> Assignee: Syed Shameerur Rahman
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-23085.01.patch
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Currently we can configure only one SSD location while using the SSD cache in
> LLAP. This prevents some machines from using their disk capacity to the
> fullest. For example, *AWS* provides the *r5d.4xlarge* instance type, which
> comes with *2 x 300 GB NVMe SSD disks*; with the current design, only one of
> the mounted *NVMe SSD* disks can be used for caching. Hence, this change adds
> support for caching data at multiple SSD mount locations, as sketched below.
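A minimal sketch of how multiple cache locations might be consumed
(hypothetical; the comma-separated path format and the round-robin placement
policy are assumptions, not necessarily what the attached patch does):
{code:java}
import java.io.File;
import java.util.concurrent.atomic.AtomicLong;

/**
 * Hypothetical sketch (not the actual HIVE-23085 patch): spread new cache
 * files across several mounted SSD paths so that both NVMe disks on an
 * r5d.4xlarge instance get used, instead of only one.
 */
public class MultiPathCacheDirs {
  private final File[] roots;
  private final AtomicLong counter = new AtomicLong();

  /** Accepts e.g. "/mnt/nvme0/llap-cache,/mnt/nvme1/llap-cache". */
  public MultiPathCacheDirs(String commaSeparatedPaths) {
    String[] parts = commaSeparatedPaths.split(",");
    roots = new File[parts.length];
    for (int i = 0; i < parts.length; i++) {
      roots[i] = new File(parts[i].trim());
    }
  }

  /** Returns the directory for the next cache file, cycling across disks. */
  public File nextDir() {
    int idx = (int) (counter.getAndIncrement() % roots.length);
    return roots[idx];
  }
}
{code}
Round-robin is only one possible policy; weighting by free space per mount
would be another option.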