HeartSaVioR commented on issue #23260: [SPARK-26311][YARN] New feature: custom 
log URL for stdout/stderr
URL: https://github.com/apache/spark/pull/23260#issuecomment-446738674
 
 
   @squito 
   > Or maybe I'm still not quite following, and there is some 3rd party piece 
here, outside of spark & yarn, which collects the logs and can serve them later 
on, whether or not the NM can serve the logs?
   
   Exactly, and log collection can also happen while the app is still running.
   
   For now I would say there is a 3rd party here. That said, the Hadoop side is 
starting to leverage `clusterId`, which I guess means Hadoop is also going to be 
multi-cluster aware, so it's not impossible for Hadoop/YARN to include 
multi-cluster-aware centralized services in the future.
   
   @vanzin 
   > because the user has to do that (well, the admin could put the values in 
the default Spark properties file)
   
   In practice, the admin will put the value in the Spark properties file. I 
agree it doesn't sound good if end users can override it, but I'm not sure Spark 
can prevent that. Please let me know if there's a way for Spark to read a value 
only from the Spark properties file and not allow end users to override it at 
submission time. I'm not aware of one, and I'll use it once it exists.
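
   For context, Spark's documented precedence is: properties set directly on 
`SparkConf` win over flags passed to `spark-submit`, which in turn win over the 
properties file. A minimal Python sketch of that ordering follows. The function 
name and the key `spark.example.logUrl` are hypothetical and used only for 
illustration; they are not Spark APIs or real configuration keys.

```python
def resolve_properties(defaults_file, submit_flags, spark_conf):
    """Merge three property sources; later sources override earlier ones."""
    resolved = {}
    # Lowest precedence first: properties file, then submit flags, then SparkConf.
    for source in (defaults_file, submit_flags, spark_conf):
        resolved.update(source)
    return resolved


# Hypothetical key and values, for illustration only.
admin_defaults = {"spark.example.logUrl": "https://logs.example.com/admin"}
user_flags = {"spark.example.logUrl": "https://logs.example.com/user"}

resolved = resolve_properties(admin_defaults, user_flags, {})
print(resolved["spark.example.logUrl"])
```

   Under this ordering the value passed at submission replaces the admin's 
default, which is the override discussed above.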
   
   > I'm not sure about what's the behavior while the app is running. If you go 
to the live UI, and click on the log link, where does that take you?
   
   Centralized log services (whichever ones exist) will serve the logs at 
unique URLs, and Spark will always point to these URLs. 
   
   Suppose the log service knows the status of the NM and the application. Then 
the service can do anything that is served today. If the NM is live, the 
service could redirect/forward to the NM's log URL, or just serve a stored log 
file that is continuously pulled from the NM. (In the latter case the log may 
be a bit outdated, but that just depends on when the pull happens, which is a 
detail of the log service and not something Spark should worry about.) If the 
NM is not live, the service will serve the stored log file pulled before the NM 
went offline.
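
   The decision described above could be sketched roughly as follows. This is 
only an illustration of a hypothetical log service, not anything Spark or YARN 
provides; the function name, parameters, and action strings are all assumptions.

```python
from typing import Optional, Tuple


def route_log_request(nm_is_live: bool,
                      nm_log_url: str,
                      stored_log_path: Optional[str],
                      prefer_redirect: bool = True) -> Tuple[str, Optional[str]]:
    """Decide how a hypothetical log service answers a request for one log file.

    Returns (action, target), where action is "redirect", "serve_stored",
    or "not_found".
    """
    if nm_is_live and prefer_redirect:
        # NM is up: forward the caller to the NM's own log URL.
        return ("redirect", nm_log_url)
    if stored_log_path is not None:
        # Serve the copy pulled from the NM; it may lag behind the live log.
        return ("serve_stored", stored_log_path)
    if nm_is_live:
        # Nothing has been pulled yet, but the NM can still serve the log.
        return ("redirect", nm_log_url)
    return ("not_found", None)
```

   Spark only needs the one URL of the service; whether the answer comes from 
the NM or from a stored copy stays a detail of the service.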
   
   Either way, I hope we don't end up dealing only with a static URL, and that 
we provide some flexibility to 3rd parties and end users.
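
   To illustrate the difference from a static URL, here is a minimal sketch of 
a templated one. The `{{NAME}}` placeholder syntax, the attribute names, and 
the host are assumptions made for this example and are not taken from this 
comment.

```python
def build_log_url(pattern: str, attributes: dict) -> str:
    """Replace each {{NAME}} placeholder in pattern with its attribute value."""
    url = pattern
    for name, value in attributes.items():
        url = url.replace("{{" + name + "}}", value)
    return url


# Hypothetical pattern and attribute values, for illustration only.
pattern = "https://logs.example.com/{{CLUSTER_ID}}/{{CONTAINER_ID}}/{{FILE_NAME}}"
stdout_url = build_log_url(pattern, {
    "CLUSTER_ID": "cluster-a",
    "CONTAINER_ID": "container_01",
    "FILE_NAME": "stdout",
})
print(stdout_url)
```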
