Gaurangi94 commented on pull request #29630:
URL: https://github.com/apache/spark/pull/29630#issuecomment-686316853


   > I'm not sure your PR really deals with reading from multiple directories. 
The change is listing -> glob with *. Could you please elaborate what is the 
difference? The change also doesn't have any new unit tests verifying the 
changes.
   > 
   > As a general comment on the idea, having multiple root directories is 
still possible, but it is probably better to use just a static list (IMHO) instead of 
a regex, as listing with a glob pattern is known to be very slow.
   > 
   > One thing I'm afraid of with multiple root directories is that SHS is 
already very complicated from a thread-safety point of view, even when we only 
allow a single root directory, and this may make things more complicated. I'm on 
the fence on doing this until we are clear that this won't make SHS more 
complicated.
   
   Thanks for your response! By multiple directories I meant that a single 
pattern could potentially match more than one directory. For an external file 
system, a glob pattern might be better, since we would have to make just one 
call over the network. It would also be easier for the user to specify just one 
setting instead of multiple values. What do you think?
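
   As a rough illustration of the single-setting idea, here is a minimal sketch 
that resolves one glob pattern to every matching directory. It uses `java.nio` 
on a local file system purely as a stand-in for the Hadoop `FileSystem` API, 
and the class and method names (`LogDirGlob`, `resolveLogDirs`) are 
hypothetical, not code from this PR.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class LogDirGlob {
    // Hypothetical helper: return every directory directly under `parent`
    // whose name matches `glob`, sorted by path.
    static List<Path> resolveLogDirs(Path parent, String glob) throws IOException {
        List<Path> dirs = new ArrayList<>();
        // A single filtered listing of the parent; the glob is applied to each entry name.
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(parent, glob)) {
            for (Path p : stream) {
                if (Files.isDirectory(p)) {
                    dirs.add(p); // skip plain files that happen to match
                }
            }
        }
        Collections.sort(dirs);
        return dirs;
    }

    public static void main(String[] args) throws IOException {
        // Assumed invocation: java LogDirGlob <parent-dir> <glob>
        Path parent = Paths.get(args.length > 0 ? args[0] : ".");
        String glob = args.length > 1 ? args[1] : "*";
        for (Path dir : resolveLogDirs(parent, glob)) {
            System.out.println(dir);
        }
    }
}
```

   With a static list the user would have to name each directory explicitly, 
whereas here one pattern such as `logs-*` covers all of them. Whether this is 
cheaper on a remote file system depends on how that file system implements 
globbing, so this sketch says nothing about the performance concern raised 
above.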
   
   I will add the unit tests. Thanks for pointing that out.
   
   MHS will function only as a read-only server. Could thread safety still be 
an issue in that case?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


