Hi,
I think you can run the workflows you defined with just a "run" paragraph,
and I believe the view functionality is going to get better. :)
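
For context, a minimal sketch of what such a "run" paragraph could look like, using ZeppelinContext's z.run to trigger other paragraphs by index (assuming z.run is available in your build; the paragraph indices are hypothetical, and whether z.run waits for completion depends on the Zeppelin version):

%spark
// Hypothetical "run" paragraph: kick off the other paragraphs in order.
z.run(0)  // e.g. the Scala paragraph that registers the temp table
z.run(1)  // e.g. the PySpark paragraph that reads it back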

On Thursday, July 14, 2016, xiufeng liu <toxiuf...@gmail.com> wrote:

> It is easy to change the code. I did it myself and use it as an ETL tool.
> It is very powerful.
>
> Afancy
>
> On Wednesday, July 13, 2016, Ahmed Sobhi <ahmed.so...@gmail.com> wrote:
>
>> I think this PR addresses what I need. Case 2 seems to describe the issue
>> I'm having, if I'm reading it correctly.
>>
>> The proposed solution, however, is not that clear to me.
>>
>> Is it that you define workflows, where a workflow is a sequence of
>> (notebook, paragraph) pairs that are to be run in a specific order?
>> If that's the case, then this definitely solves my problem, but it's
>> really cumbersome from a usability point of view. I think a better
>> solution for my use case is simply an option to run all paragraphs in
>> the order they appear in the notebook, regardless of which interpreter
>> they use.
>>
>> On Wed, Jul 13, 2016 at 12:31 PM, Hyung Sung Shim <hss...@nflabs.com>
>> wrote:
>>
>>> Hi.
>>> Maybe https://github.com/apache/zeppelin/pull/1176 is related to what
>>> you want.
>>> Please check this PR.
>>>
>>> On Wednesday, July 13, 2016, xiufeng liu <toxiuf...@gmail.com> wrote:
>>>
>>>> You have to change the source code to add dependencies between running
>>>> paragraphs. I think it would be a really interesting feature; for
>>>> example, it could be used as an ETL tool. But, unfortunately, there is
>>>> no configuration option for it right now.
>>>>
>>>> /afancy
>>>>
>>>> On Wed, Jul 13, 2016 at 12:27 PM, Ahmed Sobhi <ahmed.so...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> I have been working on a large Spark Scala notebook. I recently had
>>>>> the requirement to produce graphs/plots from this data. Python and
>>>>> PySpark seemed like a natural fit, but since I've already invested a
>>>>> lot of time and effort into the Scala version, I want to restrict my
>>>>> usage of Python to just plotting.
>>>>>
>>>>> I found a good workflow where, in the Scala paragraphs, I can use
>>>>> *registerTempTable*, and in Python I can just use *sqlContext.table*
>>>>> to retrieve that table.
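>>>>>
>>>>> For reference, a minimal sketch of the two paragraphs (the table name
>>>>> "events" and the input path are hypothetical; this assumes the
>>>>> default Zeppelin setup where %spark and %pyspark share one
>>>>> SQLContext):
>>>>>
>>>>> %spark
>>>>> // Scala paragraph: build a DataFrame and expose it as a temp table.
>>>>> val df = sqlContext.read.json("/tmp/events.json")
>>>>> df.registerTempTable("events")
>>>>>
>>>>> %pyspark
>>>>> # PySpark paragraph: read back the table registered above.
>>>>> events = sqlContext.table("events")
>>>>> events.show()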
>>>>>
>>>>> The problem now is that if I try to run all paragraphs to get the
>>>>> notebook updated, the Python paragraphs fail because they run before
>>>>> the Scala ones, even though they are placed after them.
>>>>>
>>>>> It seems the behavior in Zeppelin is that it attempts to run
>>>>> paragraphs concurrently if they run on different interpreters, which
>>>>> might seem fine on the surface. But now that I want to introduce a
>>>>> dependency between Spark/PySpark paragraphs, is there any way to do
>>>>> that?
>>>>>
>>>>> --
>>>>> Cheers,
>>>>> Ahmed
>>>>>
>>>>
>>>>
>>
>>
>> --
>> Cheers,
>> Ahmed
>> http://bit.ly/ahmed_abtme <http://about.me/humanzz>
>>
>
