[ 
https://issues.apache.org/jira/browse/IMPALA-13794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Fehr updated IMPALA-13794:
--------------------------------
    Description: 
Currently we use HdfsTable's memory usage estimates for Iceberg tables, even 
though they have different characteristics.

We should come up with a calculation that is better suited for Iceberg tables.

The current estimate does not account for several things, e.g.:
* IcebergFileDescriptor objects are larger than plain FileDescriptor objects, 
especially when the table is partitioned
* the overhead of IcebergContentFileStore's data structures
    * V3: Deletion Vectors stored in IcebergContentFileStore.dataFileToDV_
* the internal Iceberg BaseTable object

We also need to cross-check the estimate against the Jamm weigher (which we use 
in our caches).
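
As a rough illustration of the shape such a calculation could take, here is a 
minimal sketch. Everything in it is an assumption made for illustration: the 
class name, the method names, and all byte constants are hypothetical 
placeholders, not values measured in Impala and not existing Impala APIs. It 
only shows how the items listed above could each contribute a term to the 
estimate, and how the result could be compared with a deep-size measurement 
(for example one obtained from Jamm).

```java
/**
 * Hypothetical sketch of an Iceberg-specific memory estimate.
 * All names and constants are illustrative assumptions, not Impala code.
 */
final class IcebergMemEstimateSketch {
  // Placeholder per-object costs in bytes (assumed, not measured).
  static final long PLAIN_FD_BYTES = 200;            // plain file descriptor
  static final long ICEBERG_FD_EXTRA_BYTES = 100;    // extra Iceberg metadata per descriptor
  static final long PARTITION_INFO_EXTRA_BYTES = 64; // extra cost when the table is partitioned
  static final long MAP_ENTRY_OVERHEAD_BYTES = 48;   // per-entry cost of the store's maps
  static final long DV_ENTRY_BYTES = 128;            // per deletion-vector entry (V3 tables)

  private IcebergMemEstimateSketch() {}

  /** Sums one term per item: descriptors, store overhead, deletion vectors, BaseTable. */
  static long estimateBytes(long numDataFiles, long numDeleteFiles,
      long numDeletionVectors, boolean partitioned, long baseTableBytes) {
    long perDescriptor = PLAIN_FD_BYTES + ICEBERG_FD_EXTRA_BYTES
        + (partitioned ? PARTITION_INFO_EXTRA_BYTES : 0);
    long numFiles = numDataFiles + numDeleteFiles;
    long descriptorBytes = numFiles * perDescriptor;
    long storeOverheadBytes = numFiles * MAP_ENTRY_OVERHEAD_BYTES;
    long deletionVectorBytes = numDeletionVectors * DV_ENTRY_BYTES;
    return descriptorBytes + storeOverheadBytes + deletionVectorBytes + baseTableBytes;
  }

  /**
   * Relative error of the estimate against a measured deep size, e.g. a value
   * returned by a Jamm MemoryMeter. The measurement itself is out of scope here.
   */
  static double relativeError(long estimatedBytes, long measuredBytes) {
    if (measuredBytes <= 0) {
      throw new IllegalArgumentException("measuredBytes must be positive");
    }
    return Math.abs(estimatedBytes - measuredBytes) / (double) measuredBytes;
  }
}
```

In a real implementation the constants would have to be derived from 
measurements, and the cross-check against the weigher would show which terms 
are off.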

  was:
Currently we use HdfsTable's memory usage estimates for Iceberg tables, even 
though they have different characteristics.

We should come up with a calculation that is better suited for Iceberg tables.


> Calculate memory usage estimates more precisely for Iceberg tables
> ------------------------------------------------------------------
>
>                 Key: IMPALA-13794
>                 URL: https://issues.apache.org/jira/browse/IMPALA-13794
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Catalog
>            Reporter: Zoltán Borók-Nagy
>            Assignee: Jason Fehr
>            Priority: Major
>              Labels: impala-iceberg, ramp-up



