Ah, thanks for the clarification. I also had a feeling that what I asked might be too good to be true :D Anyway, I have a better understanding of how Livy works now. I appreciate your help; have a good day!
Regards, David

2017-12-11 14:16 GMT+08:00 Jeff Zhang <[email protected]>:

> No, you can not refer to a variable in Scala when it is defined in Python. You
> need to register the table in Python, then get it from
> SparkSession/SparkContext. The following is what you can do.
>
> Python:
>
> df_in_pyspark = spark.read.json("examples/src/main/resources/people.json")
> df_in_pyspark.registerTempTable("mytable")
>
> Scala:
>
> val df_in_pyspark = spark.table("mytable")
> val dfInScala: DataFrame = df_in_pyspark.where("age > 35")
>
> David Hu <[email protected]> wrote on Mon, Dec 11, 2017 at 12:09 PM:
>
>> Hi Jeff,
>>
>> That's great to know! I've heard of Zeppelin and sort of know what it
>> does, but I haven't had a chance to use it myself. To confirm that what
>> you are saying matches my understanding, I'd like to walk through a scenario.
>>
>> I first send a POST request to /sessions/1/statements with kind as
>> 'pyspark' and code as the following:
>>
>> df_in_pyspark = spark.read.json("examples/src/main/resources/people.json")
>>
>> The above code defines a dataframe var `df_in_pyspark` in Python code, and
>> it will be used in the second POST request to /sessions/1/statements, whose
>> 'kind' is 'spark' (Scala), with the following code:
>>
>> val dfInScala: DataFrame = df_in_pyspark.where("age > 35")
>>
>> So basically you are saying that the above code would run without any
>> issues, is that correct? If so, I assume it also applies to other types of
>> vars like Estimator/Model/Pipeline? Then how about methods? Is it OK if I
>> define a method in Scala and later use it in Python/R code, and vice versa?
>>
>> Sorry for so many questions, but if I knew the answer I would be much
>> more assured about upgrading to the latest HDP and enabling this awesome
>> feature. Thanks!
>>
>> Regards, David
>>
>> 2017-12-11 11:07 GMT+08:00 Jeff Zhang <[email protected]>:
>>
>>> You can use a dataframe in Scala if this dataframe is registered in
>>> Python, because they share the same SparkContext.
>>> I believe Livy can meet your requirement. If you know Zeppelin, the
>>> behavior of Livy now is very similar to Zeppelin, where you can run one
>>> paragraph in Scala and another paragraph in Python or R. They run in
>>> the same Spark application and are able to share data via the SparkContext.
>>>
>>> David Hu <[email protected]> wrote on Mon, Dec 11, 2017 at 10:44 AM:
>>>
>>>> Hi Jeff & Saisai,
>>>>
>>>> Thank you so much for the explanations; they are very helpful. Also,
>>>> sorry for not replying in time.
>>>>
>>>> I have read all the links you provided, and the impression I got is
>>>> that, correct me if I am wrong, this feature would not allow different
>>>> session kinds to interact with each other? What I mean is, if I ran one
>>>> Scala kind and one Python kind in the same context, then short of some
>>>> kind of persistence it won't be possible to refer to a dataframe variable
>>>> in Python code that was defined in Scala, right?
>>>>
>>>> The goal I want to achieve is to mix different languages together and
>>>> run them as one integrated Spark job, within which vars/methods defined
>>>> in one language can be referred to/used in another, because our users
>>>> might have different programming backgrounds. It might sound silly, but I
>>>> am keen to know if that's possible under the current Livy infrastructure.
>>>> I'd appreciate it if anyone could answer. Thanks in advance!
>>>>
>>>> Regards, Dawei
>>>>
>>>> 2017-12-04 8:30 GMT+08:00 Saisai Shao <[email protected]>:
>>>>
>>>>> This feature is targeted for the Livy 0.5.0 community version, but we
>>>>> have already back-ported it in HDP 2.6.3, so you can try this feature
>>>>> in HDP 2.6.3.
>>>>>
>>>>> You can check this doc
>>>>> (https://github.com/apache/incubator-livy/blob/master/docs/rest-api.md)
>>>>> to see the API difference for this feature.
>>>>> >>>>> 2017-12-03 9:55 GMT+08:00 Jeff Zhang <[email protected]>: >>>>> >>>>>> >>>>>> It is implemented in https://issues.apache.org/jira/browse/LIVY-194 >>>>>> >>>>>> But not release in apache version, HDP backport it in their >>>>>> distribution >>>>>> >>>>>> >>>>>> >>>>>> 胡大为(David) <[email protected]>于2017年12月2日周六 上午10:58写道: >>>>>> >>>>>>> I forgot to add the link reference and here it is. >>>>>>> >>>>>>> https://hortonworks.com/blog/hdp-2-6-3-dataplane-service/ >>>>>>> >>>>>>> Regards, Dawei >>>>>>> >>>>>>> On 2 Dec 2017, at 8:24 AM, 胡大为(David) <[email protected]> wrote: >>>>>>> >>>>>>> >>>>>>> Hi all, >>>>>>> >>>>>>> I was reading the HDP 2.6.3 release notes and it mentions that Livy >>>>>>> service is able to multiple programming languages in the same Spark >>>>>>> context, but I went through all the Livy document and examples I can >>>>>>> find >>>>>>> but so far haven’t found out how to get it work. Currently I am using >>>>>>> the >>>>>>> latest Livy 0.4 to submit Scala code only and it would be awesome to >>>>>>> mix it >>>>>>> with Python or R code in the same session. Much appreciate it anyone >>>>>>> could >>>>>>> give me some clue about this. >>>>>>> >>>>>>> Thanks in advance and have a good day :) >>>>>>> >>>>>>> Regards, Dawei >>>>>>> >>>>>>> >>>>>>> >>>>> >>>> >>

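P.P.S. On the API difference Saisai mentions: as I read the linked rest-api doc, the shared-session feature (Livy 0.5.0 / HDP 2.6.3) makes the session-level kind optional and lets each statement carry its own kind instead. A sketch of how the request bodies differ — the code strings here are hypothetical examples, not from the thread:

```python
import json

# Livy 0.4 style: the kind is fixed when the session is created, so every
# statement in that session runs in the same language.
old_session = {"kind": "spark"}      # POST /sessions
old_statement = {"code": "1 + 1"}    # POST /sessions/{id}/statements

# Livy 0.5 shared-session style: omit the session-level kind and tag each
# statement with its own language instead.
new_session = {}                     # POST /sessions (no kind)
new_statements = [
    {"kind": "pyspark", "code": "df = spark.range(10)"},
    {"kind": "spark",   "code": "val n = spark.range(10).count()"},
    {"kind": "sparkr",  "code": "df <- createDataFrame(mtcars)"},
]

print(json.dumps(new_statements[0]))
```

All three statements would run against the same Spark application, which is what allows the temp-table trick above to bridge them.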