milenkovicm commented on code in PR #1853:
URL: 
https://github.com/apache/datafusion-ballista/pull/1853#discussion_r3386621203


##########
ballista/core/proto/ballista.proto:
##########
@@ -367,8 +370,16 @@ message ExecutorMetadata {
   string host = 2;
   uint32 port = 3;
   uint32 grpc_port = 4;
-  ExecutorSpecification specification = 5;

Review Comment:
   could we make this properties optional rather than flatten them?



##########
ballista/core/proto/ballista.proto:
##########
@@ -272,10 +272,13 @@ message PartitionLocation {
   uint32 map_partition_id = 1;
   // partition_id of the shuffle, a composition of(job_id + map_stage_id + 
partition_id).
   PartitionId partition_id = 2;
-  ExecutorMetadata executor_meta = 3;
   PartitionStats partition_stats = 4;
-  optional uint64 file_id = 6;
-  bool is_sort_shuffle = 7;
+  string executor_id = 6;
+  string host = 7;
+  uint32 port = 8;
+  uint32 grpc_port = 9;

Review Comment:
   this is going to be replicated in each and every partition location, which 
is not optimal if we take into account 1000s of them. 
   
   Would it be possible to serialise `PartitionLocation` into actual vector of 
locations and map of executors, where key is going to be executor_id (as bytes) 
? 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to