milenkovicm opened a new issue, #1851:
URL: https://github.com/apache/datafusion-ballista/issues/1851
**Is your feature request related to a problem or challenge? Please describe
what you are trying to do.**
`ExecutorMetadata` contains `ExecutorSpecification` and
`ExecutorOperatingSystemSpecification`:
```rust
pub struct ExecutorMetadata {
    /// Unique executor identifier.
    pub id: String,
    /// Hostname or IP address of the executor.
    pub host: String,
    /// Port number for data transfer.
    pub port: u16,
    /// Port number for gRPC communication.
    pub grpc_port: u16,
    /// Resource specification for this executor.
    pub specification: ExecutorSpecification,
    /// OS and hardware info for this executor
    pub os_info: ExecutorOperatingSystemSpecification,
}
```
where `ExecutorOperatingSystemSpecification` carries considerably more data
(quite a few of the fields are strings):
```rust
pub struct ExecutorOperatingSystemSpecification {
    /// System name
    pub system_name: String,
    /// Kernel version
    pub kernel_ver: String,
    /// OS version
    pub os_ver: String,
    /// OS version (long)
    pub os_ver_long: String,
    /// Number of physical cores available on this executor
    pub physical_cores: u32,
    /// Number of physical disks available on this executor
    pub num_disks: u32,
    /// Total disk space on this executor, in bytes
    pub total_disk_space: u64,
    /// Total available disk space on this executor, in bytes
    pub total_available_disk_space: u64,
    /// Open files limit on this executor
    pub open_files_limit: u64,
}
```
The problem is that `ExecutorMetadata` is part of `PartitionLocation`:
```rust
#[derive(Debug, Clone)]
pub struct PartitionLocation {
    /// The source partition ID from the map stage.
    pub map_partition_id: usize,
    /// The partition identifier.
    pub partition_id: PartitionId,
    /// Metadata about the executor hosting this partition.
    pub executor_meta: ExecutorMetadata,
    /// Statistics about the partition data.
    pub partition_stats: PartitionStats,
    /// shuffle file id
    pub file_id: Option<u64>,
    /// whether this partition uses sort shuffle
    pub is_sort_shuffle: bool,
}
```
`PartitionLocation` represents partition locations and is passed back and
forth between the scheduler and the executors. It may also be cached as part
of the stage information, so the extra metadata takes up space and might
contribute to #1836.
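To make the cost concrete, here is a minimal sketch that counts the string bytes copied on every clone of the executor metadata. The structs are simplified copies of the ones above (`specification` is left out because its fields are not shown in this issue), and `string_bytes` is a hypothetical helper, not an existing Ballista function.

```rust
/// Simplified copy of the struct from this issue.
#[derive(Debug, Clone, Default)]
pub struct ExecutorOperatingSystemSpecification {
    pub system_name: String,
    pub kernel_ver: String,
    pub os_ver: String,
    pub os_ver_long: String,
    pub physical_cores: u32,
    pub num_disks: u32,
    pub total_disk_space: u64,
    pub total_available_disk_space: u64,
    pub open_files_limit: u64,
}

/// Simplified copy of the struct from this issue.
/// Assumption: `specification` is omitted, its fields are not shown here.
#[derive(Debug, Clone, Default)]
pub struct ExecutorMetadata {
    pub id: String,
    pub host: String,
    pub port: u16,
    pub grpc_port: u16,
    pub os_info: ExecutorOperatingSystemSpecification,
}

/// Hypothetical helper: returns `(addressing_bytes, os_info_bytes)`, the
/// string payload of the fields that identify the executor (`id`, `host`)
/// versus the string payload that only describes its operating system.
pub fn string_bytes(meta: &ExecutorMetadata) -> (usize, usize) {
    let addressing = meta.id.len() + meta.host.len();
    let os = &meta.os_info;
    let os_info = os.system_name.len()
        + os.kernel_ver.len()
        + os.os_ver.len()
        + os.os_ver_long.len();
    (addressing, os_info)
}
```

Every `PartitionLocation` clone copies both parts, and each `String` is also a separate heap allocation, so the OS strings are paid for once per partition location rather than once per executor.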
**Describe the solution you'd like**
Trim down or change the data structure that represents executor metadata.
**Describe alternatives you've considered**
We can either make some fields of the executor metadata optional or introduce
a completely different data structure. Either way, we should not send
unnecessary data to the executor.
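The two alternatives could look roughly like the sketch below. It is only an illustration under stated assumptions: the structs are abbreviated, and `without_os_info` and `ExecutorEndpoint` are hypothetical names that do not exist in Ballista today.

```rust
/// OS and hardware details (fields abbreviated for this sketch).
#[derive(Debug, Clone, Default, PartialEq)]
pub struct ExecutorOperatingSystemSpecification {
    pub system_name: String,
    pub kernel_ver: String,
    pub physical_cores: u32,
}

/// Alternative 1: keep a single struct, but make the heavy part optional.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct ExecutorMetadata {
    pub id: String,
    pub host: String,
    pub port: u16,
    pub grpc_port: u16,
    /// `None` when the metadata travels inside a partition location.
    pub os_info: Option<ExecutorOperatingSystemSpecification>,
}

impl ExecutorMetadata {
    /// Hypothetical helper: a copy that drops the OS details without
    /// cloning them first.
    pub fn without_os_info(&self) -> Self {
        Self {
            id: self.id.clone(),
            host: self.host.clone(),
            port: self.port,
            grpc_port: self.grpc_port,
            os_info: None,
        }
    }
}

/// Alternative 2: a separate, hypothetical type that holds only what is
/// needed to reach the executor hosting a partition.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ExecutorEndpoint {
    pub id: String,
    pub host: String,
    pub port: u16,
    pub grpc_port: u16,
}

impl From<&ExecutorMetadata> for ExecutorEndpoint {
    fn from(meta: &ExecutorMetadata) -> Self {
        Self {
            id: meta.id.clone(),
            host: meta.host.clone(),
            port: meta.port,
            grpc_port: meta.grpc_port,
        }
    }
}
```

The optional field is the smaller change, since existing code keeps using `ExecutorMetadata`. A separate type makes it impossible to ship OS details by accident, at the cost of touching every place that reads `executor_meta`.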