KR-bluejay commented on issue #1316:
URL: 
https://github.com/apache/datafusion-ballista/issues/1316#issuecomment-3280251438

   Thanks for the great ideas. As I'm still getting familiar with the internals 
of Ballista, I might be missing some key points, but I wanted to share a few 
thoughts that came to mind.
   
   1) **Relative paths under Flight store**
      I like the idea, with one caveat: if we switch to relative paths, we 
should harden against path traversal (e.g. `../..`). A central `PathResolver` 
that joins with the Flight store root, canonicalizes, and rejects anything 
escaping the root (plus explicit checks like `starts_with(root)`) would make 
this safe.
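
A minimal sketch of what such a resolver could look like, under stated assumptions: `PathResolver` and its `resolve` method are hypothetical names, not existing Ballista APIs, and this version works purely lexically (it rejects `..`, absolute paths, and Windows prefixes instead of calling `std::fs::canonicalize`), so it does not catch symlinks that point outside the root. A real implementation would likely canonicalize as well once the file exists.

```rust
use std::path::{Component, Path, PathBuf};

/// Hypothetical helper (not part of Ballista): resolves client-supplied
/// relative paths against a fixed Flight store root.
pub struct PathResolver {
    root: PathBuf,
}

impl PathResolver {
    pub fn new(root: impl Into<PathBuf>) -> Self {
        Self { root: root.into() }
    }

    /// Join `relative` onto the root, rejecting anything that could escape it.
    pub fn resolve(&self, relative: &str) -> Result<PathBuf, String> {
        let mut resolved = self.root.clone();
        for component in Path::new(relative).components() {
            match component {
                // Ordinary path segments are appended as-is.
                Component::Normal(part) => resolved.push(part),
                // `.` is harmless and simply skipped.
                Component::CurDir => {}
                // `..`, absolute roots, and Windows prefixes are refused outright.
                Component::ParentDir | Component::RootDir | Component::Prefix(_) => {
                    return Err(format!("path escapes store root: {relative}"));
                }
            }
        }
        // Explicit final check, as a second line of defence.
        if !resolved.starts_with(&self.root) {
            return Err(format!("path escapes store root: {relative}"));
        }
        Ok(resolved)
    }
}
```

Rejecting `..` outright rather than normalizing it is a deliberately conservative choice: a legitimate shuffle path should never need to walk upward.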
   
   2) **Delete-on-client-disconnect**
      It sounds useful, but I think it should be **configurable** and possibly 
gated by a **grace period** (and/or liveness recheck). In practice clients can 
drop due to transient network issues; auto-deleting immediately might be risky. 
Spark has similar knobs around cleanup; having a setting like:
      - `delete_on_disconnect = off|immediate|grace(N seconds)`
      - scope: `session_only | session_and_job_state | everything`
      could help operators choose the right behavior. Also for streaming-style 
clients that “fire-and-forget” and await async responses, immediate deletion 
might be surprising.
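
To make the proposed setting concrete, here is a rough sketch of how it could be modeled. Everything in it is an assumption for illustration: the type names, the `grace(30)` string syntax (seconds as a bare integer), and the `should_delete` helper are hypothetical and do not correspond to existing Ballista configuration.

```rust
use std::str::FromStr;
use std::time::Duration;

/// Hypothetical policy mirroring `delete_on_disconnect = off|immediate|grace(N seconds)`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DeleteOnDisconnect {
    Off,
    Immediate,
    Grace(Duration),
}

/// Hypothetical scope mirroring `session_only | session_and_job_state | everything`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CleanupScope {
    SessionOnly,
    SessionAndJobState,
    Everything,
}

impl FromStr for DeleteOnDisconnect {
    type Err = String;

    /// Assumed syntax: "off", "immediate", or "grace(<seconds>)".
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        let s = s.trim();
        match s {
            "off" => Ok(Self::Off),
            "immediate" => Ok(Self::Immediate),
            _ => {
                let secs = s
                    .strip_prefix("grace(")
                    .and_then(|rest| rest.strip_suffix(')'))
                    .ok_or_else(|| format!("unknown delete_on_disconnect value: {s}"))?
                    .trim()
                    .parse::<u64>()
                    .map_err(|e| format!("invalid grace period in {s}: {e}"))?;
                Ok(Self::Grace(Duration::from_secs(secs)))
            }
        }
    }
}

impl DeleteOnDisconnect {
    /// Decide whether cleanup should run, given how long the client has been
    /// gone and whether a liveness recheck found it connected again.
    pub fn should_delete(self, disconnected_for: Duration, client_reconnected: bool) -> bool {
        match self {
            Self::Off => false,
            Self::Immediate => true,
            Self::Grace(period) => !client_reconnected && disconnected_for >= period,
        }
    }
}
```

With a grace policy, a client that drops for a few seconds and reconnects keeps its state, while one that stays away past the period is cleaned up.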
   
   3) **Flight as file owner (Actions)**  
   The idea is interesting, but I would not adopt it now. Ballista’s strength 
should come from **parallel/distributed performance**, and introducing a 
new centralized owner for deletion at this stage risks adding **coordination 
overhead and bottlenecks** that could actually hurt scalability.  
   Given current resources, fully reworking the ownership model also feels too 
heavy to justify at this point.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

