In the comments below, "batch" means a Livy job that is not interactive. This is not to be confused with batch vs. streaming jobs in Spark: a Livy batch job can run a Spark streaming or batch workload, and a Livy interactive job can likewise run either.
On Wed, May 29, 2019 at 5:23 PM Ravindra Chandrakar <ravindra.chandra...@gmail.com> wrote:

> Hello,
>
> I would like to understand the following.
>
> 1. What if the Livy server dies? What will happen to existing jobs? Will
> they still be in a running state in the Spark cluster? If yes, how to
> track their status?

If HA is enabled and the jobs are running on YARN, Livy recovers the jobs on restart.

> 2. Is there a High Availability mode deployment available for the Apache
> Livy server, like a secondary Livy server or something?

There is no primary/secondary HA, but Livy can recover jobs after a crash or restart.

> 3. Can we submit more than one job using the batches API? If yes, is
> there any limit on the upper number? How to submit more than one job
> using a single batches API call?

Yes. Each job gets its own unique ID. The upper limit for now is about 2 billion (when Integer.MAX_VALUE overflows), provided the jobs are not submitted in a burst; for bursts, Livy has a configurable rate limiter.

> 4. If multiple job submission in a single batches API call is allowed,
> then
> 1. Are those jobs going to run in parallel or sequentially?
> 2. Can I define dependencies between these jobs?

Each Livy batch job is independent of the other batch jobs. Livy is not aware of dependencies between batch (or interactive) jobs; dependencies should be handled by the user outside Livy. Each batch job runs as its own Spark application.

> 5. How can I debug a job that I've submitted using the batches API?

Start with the Spark UI or the YARN ResourceManager.

> Thanks,
> Ravindra Chandrakar
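Regarding the recovery behavior in answers 1 and 2: the recovery mode and the state store have to be configured explicitly in livy.conf. A minimal sketch (the filesystem path and ZooKeeper quorum are placeholders you would replace with your own):

```
# Enable session/batch recovery after a Livy crash or restart.
livy.server.recovery.mode = recovery

# Where Livy persists session state: "filesystem" or "zookeeper".
livy.server.recovery.state-store = filesystem

# Location of the state store (an HDFS path here; for ZooKeeper this
# would be the quorum, e.g. host1:2181,host2:2181).
livy.server.recovery.state-store.url = hdfs:///livy/recovery
```

With this in place, jobs that are already running on YARN keep running while Livy is down, and Livy re-attaches to them on restart.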
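To make answer 3 concrete: each POST to /batches creates exactly one batch job, so submitting several jobs means several POST calls, each returning its own ID that you can poll via GET /batches/{id}/state. A minimal sketch using only the standard library (the host, jar paths, and main class are hypothetical; the /batches endpoints and the "file"/"className"/"args" fields are Livy's batches API):

```python
import json
import time
import urllib.request

LIVY_URL = "http://livy-server:8998"  # hypothetical host; 8998 is Livy's default port

def batch_payload(jar, klass, args):
    """Build the JSON body for POST /batches ('file' and 'className'
    are the field names the batches API expects)."""
    return {"file": jar, "className": klass, "args": args}

def submit_batch(payload):
    """POST /batches; the response includes the new batch's id and state."""
    req = urllib.request.Request(
        f"{LIVY_URL}/batches",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

def wait_for_batch(batch_id, interval=10):
    """Poll GET /batches/{id}/state until the job leaves the
    starting/running states, then return the terminal state."""
    while True:
        with urllib.request.urlopen(f"{LIVY_URL}/batches/{batch_id}/state") as resp:
            state = json.load(resp)["state"]
        if state not in ("starting", "running"):
            return state
        time.sleep(interval)

if __name__ == "__main__":
    # Two independent jobs -> two separate POST /batches calls.
    for jar in ("hdfs:///jobs/etl.jar", "hdfs:///jobs/report.jar"):
        info = submit_batch(batch_payload(jar, "com.example.Main", ["--date", "2019-05-29"]))
        print(info["id"], wait_for_batch(info["id"]))
```

Because the jobs are independent Spark applications, any sequencing or dependency between them would live in the loop above (or in an external scheduler), not in Livy.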
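For answer 5, besides the Spark UI and the YARN ResourceManager, Livy itself exposes the driver log via GET /batches/{id}/log, which is often the quickest way to find the YARN application id for a failed batch. A sketch (the host is hypothetical; scanning the log for an `application_` token is a simple heuristic, not an official API):

```python
import json
import urllib.request

LIVY_URL = "http://livy-server:8998"  # hypothetical host

def fetch_batch_log(batch_id, start=0, size=100):
    """GET /batches/{id}/log?from=..&size=.. returns a window of the
    log lines Livy captured for this batch."""
    url = f"{LIVY_URL}/batches/{batch_id}/log?from={start}&size={size}"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)["log"]

def yarn_app_id(log_lines):
    """Scan log lines for a YARN application id so the job can be
    looked up in the ResourceManager UI (heuristic)."""
    for line in log_lines:
        for token in line.split():
            if token.startswith("application_"):
                return token
    return None

if __name__ == "__main__":
    lines = fetch_batch_log(0)
    print(yarn_app_id(lines))
```

With the application id in hand, `yarn logs -applicationId <id>` gives the full aggregated executor and driver logs.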