Thanks, Jassy, for the proposal. I left a few comments in the Google doc.

Marco Gaido <marcogaid...@gmail.com> wrote on Thu, May 23, 2019 at 2:26 PM:

> Hi Jassy,
>
> I think overall the idea makes sense, and it may help spread Livy
> adoption by making it easier to start and work with.
> Thanks for the proposal and the design doc.
> I'd love to see more discussion about it in the design doc, and I think
> maybe in a week or so we may start a vote on this.
>
> Thanks,
> Marco
>
> On Wed, May 22, 2019 at 11:09, Jassy Wang <jassywang1...@gmail.com>
> wrote:
>
> > Hi all,
> >
> > Hello, I'm from Beijing DiDi Infinity Technology and Development Co.,
> > Ltd. My company uses Livy to build a Spark service platform, and we
> > hope to unify the submission, monitoring, and management of Spark jobs
> > through Livy. Currently, Livy only supports submitting jobs through the
> > programmatic interface or the REST API. However, individual users and
> > large-scale platforms that previously used the Spark client mostly
> > submitted jobs through shell scripts. For those users, Livy does not
> > provide a convenient way to submit jobs.
> >
> > Livy client is a higher-level Spark client. The difference between the
> > Livy client and the Spark client is that jobs are submitted through
> > Livy, so users don't need to install the Spark client locally. The Livy
> > client is designed to replace the various versions of the Spark client
> > scattered across machines. To minimize the user's perception of the
> > client change, the Livy client accepts the Spark client's command-line
> > parameters, and because it goes through Livy, users get Livy's security
> > verification, session management, multi-tenancy, and other features.
> > Currently, in my company, 12K+ Spark jobs are submitted through Livy
> > daily, and 900+ of them are submitted through the Livy client.
> >
> > The Livy client accepts the same command-line input as the Spark
> > client, so users can run it from a shell script. It parses the
> > parameters in the user's command line one by one, converts them into a
> > map, and accesses Livy through the REST API to submit the job and get
> > its progress. Control at the Spark level is handed over to the backend
> > Livy server cluster, and unified submission in cluster mode ensures
> > consistent execution results.
> >
> > Users start different Livy interpreters through different Livy clients,
> > and the livy-submit type directly launches SparkYarnApp, as shown in
> > Figure 1:
> >
> > [image: image2019-4-23 11_9_49.png]
> > Figure 1
> >
> > Submitting a Spark job through the Livy client involves these steps:
> > submit command line, parse parameters, separate variant params, start
> > interpreter, load hiveconf, execute code, poll for result, print
> > result, as shown in Figure 2:
> >
> > [image: 屏幕快照 2019-04-29 11.59.20.png]
> > Figure 2
> >
> >    - Submit command line: on the command line, use abbreviations to set
> >    common Spark or Livy params, use --conf to set customized Spark
> >    params or Livy rsc params, and use --hiveconf to set Hive params.
> >    You can pass --help for user guidance.
> >    - Parse parameters: use a parser like spark-submit's to parse the
> >    command line and initialize the SparkSubmitArgument class.
> >    - Separate variant params: separate the Spark config, Livy config,
> >    and Hive config in SparkSubmitArgument.
> >    - Start interpreter: start LivyInterpreter in the Livy client and,
> >    at the same time, send a REST request to create a session on the
> >    Livy server. LivyInterpreter controls this session through the
> >    sessionId returned in the REST API response.
> >    - Load hiveconf: if the script contains hiveconf settings, use the
> >    SET command to activate them after LivyInterpreter has started.
> >    - Execute code: users can use the -e or -f param to make the client
> >    parse code line by line, or enter code interactively. All code is
> >    sent in a REST request; the Livy server receives the request and
> >    starts a statement. During execution, the Livy client gets
> >    sparkUiUrl from the Livy session info, fetches job or stage progress
> >    from the Spark UI, and prints it to the console.
> >    - Poll for result: the Livy client uses the sessionId and
> >    statementId to construct REST requests to Livy and fetches the
> >    execution result until the statement completes.
> >    - Print result: when the statement reaches the finished state, the
> >    Livy client prints the statement's result field to the console.
> >    livy-submit does not print the result, only the progress.
> >
> > The Livy client has the following advantages over the Spark client:
> >
> >    - Compared with the Spark client, the Livy client almost never needs
> >    to be updated, which spares users the upgrades that wear on their
> >    patience. The backend Livy server can change Spark dependencies in
> >    real time, and users do not need to know about it.
> >    - The Livy client is more lightweight than the Spark client and has
> >    no dependencies. Moreover, Spark jobs running in cluster mode do not
> >    occupy memory or compute resources on the local machine.
> >    - All jobs submitted by the Livy client run in cluster mode, which
> >    makes troubleshooting more convenient.
> >
> > For more information about the Livy client, please see the Livy client
> > design doc
> > <https://docs.google.com/document/d/1Sc-EHLBhLhmqVn7kQqUexxZ1vwEomb8lW-Vvlpj2Gmc/edit?usp=sharing>
> > or discuss with us in the issue
> > <https://issues.apache.org/jira/browse/LIVY-596>.

--
Best Regards

Jeff Zhang
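
For readers following the thread, the "parse the parameters one by one, convert them into a map" step from the quoted proposal might look roughly like the sketch below. It is only an illustration, not the code from the design doc: the function name parse_submit_args and the flag-to-field table are hypothetical, and the target field names are the ones accepted by Livy's POST /batches request body (className, conf, numExecutors, and so on). No request is actually sent here; the resulting map is what a client could serialize and post to the Livy server.

```python
import json

# Hypothetical mapping from spark-submit style flags to fields of the
# Livy POST /batches request body; the proposed client may map them differently.
FLAG_TO_FIELD = {
    "--class": "className",
    "--name": "name",
    "--queue": "queue",
    "--driver-memory": "driverMemory",
    "--executor-memory": "executorMemory",
    "--num-executors": "numExecutors",
    "--executor-cores": "executorCores",
}
INT_FIELDS = {"numExecutors", "executorCores"}


def parse_submit_args(argv):
    """Split a spark-submit style argument list into a request map and Hive settings."""
    body = {"conf": {}}
    hive_conf = {}
    i = 0
    while i < len(argv):
        arg = argv[i]
        if arg in ("--conf", "--hiveconf"):
            # Both flags take a single key=value token.
            key, _, value = argv[i + 1].partition("=")
            target = body["conf"] if arg == "--conf" else hive_conf
            target[key] = value
            i += 2
        elif arg in FLAG_TO_FIELD:
            field = FLAG_TO_FIELD[arg]
            value = argv[i + 1]
            body[field] = int(value) if field in INT_FIELDS else value
            i += 2
        else:
            # The first non-flag token is the application file;
            # everything after it is passed to the application.
            body["file"] = arg
            body["args"] = argv[i + 1:]
            break
    return body, hive_conf


if __name__ == "__main__":
    body, hive_conf = parse_submit_args(
        ["--class", "org.example.Main", "--num-executors", "4",
         "--conf", "spark.ui.enabled=false", "app.jar", "input", "output"]
    )
    print(json.dumps(body, indent=2))
```

Keeping the Hive settings in a separate map mirrors the "separate variant params" step in the proposal, where they are applied later with SET rather than sent as Spark configuration.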