I want to ask about something related to this. Does anyone know if there is, or will be, a command-line equivalent of the spark-shell client for the Livy Spark server or any other Spark job server? The reason I am asking is that spark-shell does not handle multiple users on the same server well. Since a Spark job server can create a separate "session" for each user, it would be great if this were possible.
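To make the idea concrete, here is a rough sketch of what such a client could look like. It is not an existing tool. The names LivyShell, http_json, connect, run and disconnect are made up for illustration, and the endpoints (POST /sessions, GET /sessions/{id}, POST and GET under /sessions/{id}/statements) reflect my understanding of the Livy REST API, so they may differ between versions.

```python
import json
import time
import urllib.request


def http_json(method, url, payload=None):
    """Send one JSON request and return the decoded JSON response."""
    body = json.dumps(payload).encode("utf-8") if payload is not None else None
    req = urllib.request.Request(
        url, data=body, method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        raw = resp.read()
        return json.loads(raw.decode("utf-8")) if raw else {}


class LivyShell:
    """Hypothetical client: one Livy session per user, code sent over HTTP."""

    def __init__(self, base_url, http=http_json, poll_seconds=1.0):
        self.base_url = base_url.rstrip("/")
        self.http = http              # injectable so it can be faked in tests
        self.poll_seconds = poll_seconds
        self.session_id = None

    def _session_url(self):
        if self.session_id is None:
            raise RuntimeError("not connected to a session")
        return "%s/sessions/%d" % (self.base_url, self.session_id)

    def connect(self, session_id=None):
        """Attach to an existing session, or create a new Scala session."""
        if session_id is None:
            created = self.http("POST", self.base_url + "/sessions",
                                {"kind": "spark"})
            session_id = created["id"]
        self.session_id = session_id
        # Wait until the session reports that it can accept statements.
        while True:
            state = self.http("GET", self._session_url())["state"]
            if state == "idle":
                return session_id
            if state in ("error", "dead", "killed", "shutting_down"):
                raise RuntimeError("session %d is %s" % (session_id, state))
            time.sleep(self.poll_seconds)

    def run(self, code):
        """Submit one statement and poll until its output is available."""
        url = self._session_url() + "/statements"
        stmt = self.http("POST", url, {"code": code})
        while stmt["state"] in ("waiting", "running"):
            time.sleep(self.poll_seconds)
            stmt = self.http("GET", "%s/%d" % (url, stmt["id"]))
        output = stmt.get("output") or {}
        if output.get("status") == "ok":
            return output["data"].get("text/plain", "")
        return "%s: %s" % (output.get("ename", stmt["state"]),
                           output.get("evalue", ""))

    def disconnect(self):
        """Forget the session locally; it keeps running on the server."""
        self.session_id = None
```

Wrapping run() in a loop that reads lines with input() would give a spark-shell-like prompt, for example after LivyShell("http://localhost:8998").connect() (the host is an assumption). Calling connect() again later with a saved session id is what would give the screen-like reattach behaviour.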
Another person in the Livy users group pointed out some advantages.

I think the use case makes complete sense for a number of reasons:

1. You wouldn't need an installation of Spark and configs on the gateway machine
2. Since Livy is over HTTP, it'd be easier to run spark-shell in front of a firewall
3. Can "connect/disconnect" to sessions similar to screen in Linux

Thanks,
Ben

> On Mar 2, 2016, at 1:11 PM, Guru Medasani <gdm...@gmail.com> wrote:
>
> Hi Yanlin,
>
> This is a fairly new effort and is not officially released/supported by
> Cloudera yet. I believe those numbers will be out once it is released.
>
> Guru Medasani
> gdm...@gmail.com
>
>> On Mar 2, 2016, at 10:40 AM, yanlin wang <yanl...@me.com> wrote:
>>
>> Did any one use Livy in real world high concurrency web app? I think it uses
>> spark submit command line to create job... How about job server or notebook
>> comparing with Livy?
>>
>> Thx,
>> Yanlin
>>
>> Sent from my iPhone
>>
>> On Mar 2, 2016, at 6:24 AM, Guru Medasani <gdm...@gmail.com> wrote:
>>
>>> Hi Don,
>>>
>>> Here is another REST interface for interacting with Spark from anywhere.
>>>
>>> https://github.com/cloudera/livy
>>>
>>> Here is an example to estimate PI using Spark from Python using requests
>>> library.
>>>
>>> >>> data = {
>>> ...   'code': textwrap.dedent("""\
>>> ...      val NUM_SAMPLES = 100000;
>>> ...      val count = sc.parallelize(1 to NUM_SAMPLES).map { i =>
>>> ...        val x = Math.random();
>>> ...        val y = Math.random();
>>> ...        if (x*x + y*y < 1) 1 else 0
>>> ...      }.reduce(_ + _);
>>> ...      println(\"Pi is roughly \" + 4.0 * count / NUM_SAMPLES)
>>> ...      """)
>>> ... }
>>> >>> r = requests.post(statements_url, data=json.dumps(data), headers=headers)
>>> >>> pprint.pprint(r.json())
>>> {u'id': 1,
>>>  u'output': {u'data': {u'text/plain': u'Pi is roughly 3.14004\nNUM_SAMPLES: Int = 100000\ncount: Int = 78501'},
>>>              u'execution_count': 1,
>>>              u'status': u'ok'},
>>>  u'state': u'available'}
>>>
>>> Guru Medasani
>>> gdm...@gmail.com
>>>
>>>> On Mar 2, 2016, at 7:47 AM, Todd Nist <tsind...@gmail.com> wrote:
>>>>
>>>> Have you looked at Apache Toree, http://toree.apache.org/. This was
>>>> formerly the Spark-Kernel from IBM but contributed to apache.
>>>>
>>>> https://github.com/apache/incubator-toree
>>>>
>>>> You can find a good overview on the spark-kernel here:
>>>> http://www.spark.tc/how-to-enable-interactive-applications-against-apache-spark/
>>>>
>>>> Not sure if that is of value to you or not.
>>>>
>>>> HTH.
>>>>
>>>> -Todd
>>>>
>>>> On Tue, Mar 1, 2016 at 7:30 PM, Don Drake <dondr...@gmail.com> wrote:
>>>> I'm interested in building a REST service that utilizes a Spark SQL
>>>> Context to return records from a DataFrame (or IndexedRDD?) and even
>>>> add/update records.
>>>>
>>>> This will be a simple REST API, with only a few end-points. I found this
>>>> example:
>>>>
>>>> https://github.com/alexmasselot/spark-play-activator
>>>>
>>>> which looks close to what I am interested in doing.
>>>>
>>>> Are there any other ideas or options if I want to run this in a YARN
>>>> cluster?
>>>>
>>>> Thanks.
>>>>
>>>> -Don
>>>>
>>>> --
>>>> Donald Drake
>>>> Drake Consulting
>>>> http://www.drakeconsulting.com/
>>>> https://twitter.com/dondrake <http://www.maillaunder.com/>
>>>> 800-733-2143