No, you cannot refer to a variable in Scala when it is defined in Python. You
need to register the table in Python, then get it from the shared
SparkSession/SparkContext. The following is what you can do.

Python:

    df_in_pyspark = spark.read.json("examples/src/main/resources/people.json")
    df_in_pyspark.registerTempTable("mytable")

Scala:

    import org.apache.spark.sql.DataFrame

    val df_in_pyspark = spark.table("mytable")
    val dfInScala: DataFrame = df_in_pyspark.where("age > 35")
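
To tie this back to Livy, here is a minimal sketch of how the two snippets
above could be submitted to one interactive session over the REST API. It
assumes a Livy server at http://localhost:8998 and per-statement 'kind'
support (Livy 0.5+ / the HDP 2.6.3 backport); error handling and result
polling are left out.

    # Minimal sketch: run a PySpark statement and a Scala statement in the
    # same Livy session so they share one SparkContext. Host, port and the
    # "shared" session kind are assumptions, not taken from this thread.
    import json
    import time
    import requests

    LIVY = "http://localhost:8998"
    headers = {"Content-Type": "application/json"}

    # Create one interactive session that all languages will share.
    session = requests.post(LIVY + "/sessions",
                            data=json.dumps({"kind": "shared"}),
                            headers=headers).json()
    session_url = "{}/sessions/{}".format(LIVY, session["id"])

    # Wait until the session is ready to accept statements.
    while requests.get(session_url, headers=headers).json()["state"] != "idle":
        time.sleep(1)

    # Statement 1 (kind=pyspark): build the dataframe and register the table.
    pyspark_code = (
        'df_in_pyspark = spark.read.json("examples/src/main/resources/people.json")\n'
        'df_in_pyspark.registerTempTable("mytable")'
    )
    requests.post(session_url + "/statements",
                  data=json.dumps({"kind": "pyspark", "code": pyspark_code}),
                  headers=headers)

    # Statement 2 (kind=spark): the Scala code sees "mytable" because both
    # statements run in the same SparkContext; Livy executes them in order.
    scala_code = (
        'val df_in_pyspark = spark.table("mytable")\n'
        'val dfInScala = df_in_pyspark.where("age > 35")\n'
        'dfInScala.count()'
    )
    requests.post(session_url + "/statements",
                  data=json.dumps({"kind": "spark", "code": scala_code}),
                  headers=headers)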



On Mon, Dec 11, 2017 at 12:09 PM, David Hu <hood...@gmail.com> wrote:

> Hi Jeff,
>
> That's great to know! I've heard of Zeppelin and sort of know what it
> does, but I haven't had a chance to use it myself. So to confirm that what
> you are saying matches my understanding, I'd like to walk through a scenario.
>
> I first send a POST request to /sessions/1/statements with kind as
> 'pyspark' and code as the following:
>
> df_in_pyspark = spark.read.json("examples/src/main/resources/people.json")
>
> The above code defines a dataframe variable `df_in_pyspark` in Python, and
> it will be used in the second POST request to /sessions/1/statements, whose
> 'kind' is 'spark' (Scala), with the following code:
>
> val dfInScala: DataFrame = df_in_pyspark.where("age > 35")
>
> So basically you were saying that the above code would run without any 
> issues, is that correct? If so, I assume it also applies to other types of 
> vars like Estimator/Model/Pipeline? Then how about methods? Is it ok if I 
> define a method in Scala and later use it in Python/R code and vice versa?
>
> Sorry for so many questions, but if I could know the answer I would feel much
> more assured about upgrading to the latest HDP and enabling this awesome feature. Thanks!
>
> Regards, David
>
> 2017-12-11 11:07 GMT+08:00 Jeff Zhang <zjf...@gmail.com>:
>
>>
>> You can use a dataframe in Scala if this dataframe is registered in Python,
>> because they share the same SparkContext.
>>
>> I believe Livy can meet your requirement. If you know Zeppelin, the
>> behavior of Livy now is very similar to Zeppelin, where you can run one
>> paragraph in Scala and another paragraph in Python or R. They run in
>> the same Spark application and are able to share data via the SparkContext.
>>
>> On Mon, Dec 11, 2017 at 10:44 AM, David Hu <hood...@gmail.com> wrote:
>>
>>> Hi Jeff & Saisai,
>>>
>>> Thank you so much for the explanation and they are very helpful, also
>>> sorry for not replying in time.
>>>
>>> I had read all the links you provided, and the impression I got is that,
>>> correct me if I am wrong, this feature does not allow different
>>> session kinds to interact with each other? What I mean is, if I run one
>>> Scala kind and one Python kind in the same context, then without some kind
>>> of persistence it won't be possible to refer to a dataframe variable in
>>> Python code that was defined in Scala, right?
>>>
>>> The goal I want to achieve is to mix different languages together and
>>> run them as one integrated Spark job, within which vars/methods defined in one
>>> language can be referred to and used in another, because our users might have
>>> different programming backgrounds. It might sound silly, but I am keen to
>>> know if that's possible under the current Livy infrastructure. I'd appreciate
>>> it if anyone could answer. Thanks in advance!
>>>
>>> Regards, Dawei
>>>
>>> 2017-12-04 8:30 GMT+08:00 Saisai Shao <sai.sai.s...@gmail.com>:
>>>
>>>> This feature is targeted for the Livy 0.5.0 community release, but we
>>>> have already back-ported it to HDP 2.6.3, so you can try this feature in
>>>> HDP 2.6.3.
>>>>
>>>> You can check this doc (
>>>> https://github.com/apache/incubator-livy/blob/master/docs/rest-api.md)
>>>> to see the API difference for this feature.
>>>>
>>>> 2017-12-03 9:55 GMT+08:00 Jeff Zhang <zjf...@gmail.com>:
>>>>
>>>>>
>>>>> It is implemented in https://issues.apache.org/jira/browse/LIVY-194
>>>>>
>>>>> But it is not released in an Apache version yet; HDP back-ported it in
>>>>> their distribution.
>>>>>
>>>>>
>>>>>
>>>>> On Sat, Dec 2, 2017 at 10:58 AM, 胡大为(David) <hood...@gmail.com> wrote:
>>>>>
>>>>>> I forgot to add the link reference and here it is.
>>>>>>
>>>>>> https://hortonworks.com/blog/hdp-2-6-3-dataplane-service/
>>>>>>
>>>>>> Regards, Dawei
>>>>>>
>>>>>> On 2 Dec 2017, at 8:24 AM, 胡大为(David) <hood...@gmail.com> wrote:
>>>>>>
>>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I was reading the HDP 2.6.3 release notes, and they mention that the Livy
>>>>>> service is able to run multiple programming languages in the same Spark
>>>>>> context, but I went through all the Livy documentation and examples I could
>>>>>> find and so far haven't found out how to get it to work. Currently I am using
>>>>>> the latest Livy 0.4 to submit Scala code only, and it would be awesome to mix
>>>>>> it with Python or R code in the same session. I would much appreciate it if
>>>>>> anyone could give me some clue about this.
>>>>>>
>>>>>> Thanks in advance and have a good day :)
>>>>>>
>>>>>> Regards, Dawei
>>>>>>
>>>>>>
>>>>>>
>>>>
>>>
>
