new user to livy/spark, basic use case questions

Decker, Seth Andrew Wed, 16 May 2018 07:17:24 -0700

Hello,

I'm new to Livy and Spark and have a few questions about how to use them properly.


I want to use Spark both for interactive scripting and for running a set of 
predefined algorithms/applications with data and parameters that I pass in. I'm 
looking at using Livy to interface with Spark over REST, but I'm not sure 
whether I can handle things the way I want (or whether that's the intended way 
to use them). I can pass a Python script into Spark through Livy, which is great.
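
For reference, here is a rough sketch of what I mean by passing Python code 
through Livy's REST API: create an interactive session, wait for it to become 
idle, then post a statement. The host and port (localhost:8998) and the helper 
names are my own assumptions for illustration, not anything prescribed by Livy.

```python
import json
import time
import urllib.request

LIVY_URL = "http://localhost:8998"  # assumption: Livy reachable locally on port 8998


def session_payload(kind="pyspark"):
    """Body for POST /sessions: asks Livy to start an interactive session."""
    return {"kind": kind}


def statement_payload(code):
    """Body for POST /sessions/{id}/statements: the code to run in the session."""
    return {"code": code}


def call_json(url, payload=None):
    """GET the URL, or POST the payload as JSON if one is given; return decoded JSON."""
    data = None if payload is None else json.dumps(payload).encode("utf-8")
    request = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def run_snippet(code, base_url=LIVY_URL, poll_seconds=2.0):
    """Start a session, wait until it is idle, and submit one statement.

    Returns the statement record Livy sends back (it may still be running).
    """
    session = call_json(base_url + "/sessions", session_payload())
    session_url = "{}/sessions/{}".format(base_url, session["id"])
    # Poll the session state until Livy reports it is ready for statements.
    while call_json(session_url)["state"] != "idle":
        time.sleep(poll_seconds)
    return call_json(session_url + "/statements", statement_payload(code))
```

With a Livy server running, something like run_snippet("1 + 1") would submit 
the code. Nothing here handles a session that ends up in an error state, so 
treat it as a starting point only.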

Is the intended way to get results to save them to HDFS, a database, or another 
data store and then read them back in? I noticed that with the Java/Python 
clients you can create your job through Livy and get the results back in the 
Livy HTTP response, which seems much simpler, but I have some concerns about 
taking that path.
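
To make the "results in the HTTP message" path concrete, this is a small sketch 
of reading a result out of the JSON that Livy returns for a statement. The 
sample record below is hand-written for illustration (not captured from a real 
run), and extract_result is a hypothetical helper of mine.

```python
def extract_result(statement):
    """Pull the plain-text result out of a Livy statement record.

    Returns None while the statement has no output yet, raises if the
    statement failed, and otherwise returns the "text/plain" value.
    """
    output = statement.get("output")
    if statement.get("state") != "available" or output is None:
        return None  # still waiting or running
    if output.get("status") == "error":
        raise RuntimeError(
            "{}: {}".format(output.get("ename"), output.get("evalue"))
        )
    return output["data"].get("text/plain")


# Hand-written sample shaped like a finished statement record.
sample_statement = {
    "id": 0,
    "state": "available",
    "output": {
        "status": "ok",
        "execution_count": 0,
        "data": {"text/plain": "2"},
    },
}

result = extract_result(sample_statement)
```

Since the value comes back as text, my guess is that this suits small results, 
and that larger output would still be written to HDFS or a database.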

My first issue is that I don't necessarily need the client to know the job. I'd 
rather the job be saved to HDFS in Hadoop and have Livy just tell Spark to run 
it with given parameters/input. Is this doable with the client, or should I 
stick with the HTTP API?

If so, is it also possible to run a Python script in Spark when it is called 
via the Java client? From my sleuthing on the GitHub page, it looks like you 
upload/run/submit jars in Java and .py files in Python, and I would probably 
want to run both (such as TensorFlow .py scripts or custom Java code). Is there 
a way to run both from the same client?
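
For comparison, this is how I understand the plain HTTP route would look for 
jobs already stored on HDFS: the body of a POST to /batches names the file, its 
arguments, and (for a jar) the main class. The HDFS paths, arguments, and class 
name below are all made up for illustration, and batch_payload is my own helper.

```python
import json


def batch_payload(file, args=(), class_name=None):
    """Body for POST /batches: run a file that already lives on the cluster.

    class_name is only set for jars, where Spark needs the main class.
    """
    payload = {"file": file, "args": list(args)}
    if class_name is not None:
        payload["className"] = class_name
    return payload


# Hypothetical Python job stored on HDFS, run with two arguments.
py_job = batch_payload("hdfs:///jobs/train_model.py", ["--epochs", "10"])

# Hypothetical jar stored on HDFS, run with a main class and one argument.
jar_job = batch_payload(
    "hdfs:///jobs/custom-algo.jar",
    ["hdfs:///data/input.csv"],
    class_name="com.example.CustomAlgo",
)

# The JSON strings that would be sent as the request bodies.
py_body = json.dumps(py_job)
jar_body = json.dumps(jar_job)
```

If this is right, the caller only needs to know a path and some arguments, and 
the same request shape would cover both .py scripts and jars from any language 
that can send HTTP.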

Thanks,
Seth Decker
