Question about how to update Zeppelin interpreters

2017-10-03 Thread Jeffrey Rodriguez
Hi folks,
   I would like to update the Zeppelin interpreter properties
programmatically. I found two ways:
1. Update conf/interpreter.json directly.
2. Use the interpreter REST API.

My question is: should users/developers update the interpreter configuration file directly?

The id values for interpreters look like "2CVTZCCU4", which seems to me not
very random and more like a signature or class hash.

Is the REST API the preferred way to maintain consistency?
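For illustration, a sketch of the REST route. The endpoint path and the property name come from Zeppelin's interpreter REST API as I understand it, and the setting id is just the example above; the exact request body depends on the Zeppelin version, so check GET /api/interpreter/setting first:

```shell
# All names here are assumptions: adjust ZEPPELIN_URL and SETTING_ID,
# and verify the body shape against GET /api/interpreter/setting.
ZEPPELIN_URL="http://localhost:8080"
SETTING_ID="2CVTZCCU4"

# Build the request body with the property to change.
cat > /tmp/spark-setting.json <<'EOF'
{"properties": {"zeppelin.spark.maxResult": "2000"}}
EOF

# Push it through the interpreter REST API (requires a running Zeppelin):
# curl -X PUT -H 'Content-Type: application/json' \
#      --data @/tmp/spark-setting.json \
#      "$ZEPPELIN_URL/api/interpreter/setting/$SETTING_ID"
echo "payload prepared for setting $SETTING_ID"
```

Unlike editing conf/interpreter.json on disk, the REST route takes effect without restarting Zeppelin, which is one argument for preferring it.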

Regards,
   Jeff Rodriguez


Re: How to execute spark-submit on Note

2017-10-03 Thread 小野圭二
Hi Dave,

Thank you for your suggestion.
It has worked as I expected so far.
I did not know "%sh" could be used like that.

Anyhow, I would like to explain why I want to execute "spark-submit"
in a note, to clear up your doubts.
Yes, I know the basic usage of Zeppelin, as you explained in your
reply, Dave.
What I am exploring now is the execution environment that Zeppelin can
provide. We were considering how to deliver our programs to users widely
after building them collaboratively on Zeppelin. In that case we might not
want to disclose our source code to them, but we do want to keep a fixed
execution environment to avoid unnecessary issues.
I have now succeeded with a script; next I will try to run a binary one.
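As a sketch, a %sh paragraph that submits a pre-built artifact could look like the following; SPARK_HOME and the jar path are hypothetical and must be adapted:

```shell
# Hypothetical locations: adjust SPARK_HOME and the artifact path.
SPARK_HOME=/opt/spark-2.2
APP=/opt/jobs/myapp.jar   # pre-built artifact, no source code shipped

if [ -x "$SPARK_HOME/bin/spark-submit" ]; then
  # runs as the zeppelin service user, with whatever permissions it has
  "$SPARK_HOME/bin/spark-submit" --master "local[*]" "$APP"
else
  echo "spark-submit not found under $SPARK_HOME"
fi
```

Shipping only the jar (or a .pyc) keeps the source private while the %sh paragraph pins the execution environment.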

That is the reason why I posted this question to the mailing list.
I have also asked about a similar but different solution in JIRA (#2721).

Once again, thank you, Dave.

-Keiji


2017-10-03 19:12 GMT+09:00 David Howell :

> Hi Keiji,
>
>
>
> In the paragraph you would write:
>
> %sh
>
> spark-submit myapp.jar ...
>
>
>
> The %sh interpreter is a shell, and runs as the zeppelin service user with
> whatever permissions it has. You can run any shell commands in it.
>
>
>
> Although, this is a fairly strange way to run zeppelin so I’m not really
> sure that is what you want.
>
>
>
> You can just use the %spark.pyspark interpreter and write your python
> spark code in there. The spark interpreters in Zeppelin already create the
> Spark Context for you, as well as sqlContext and spark session. These are
> available as sc, sqlContext and spark. If you have a program that is ready
> for spark submit, I would use some other tool to schedule and run it, like
> cron, oozie, NiFi, Luigi, Airflow etc. Or if you want to run manually just
> use spark submit from the shell directly or ssh.
>
>
>
>
>
> Dave
>
>
>
> *From: *小野圭二 
> *Sent: *Tuesday, 3 October 2017 8:43 PM
> *To: *users@zeppelin.apache.org
> *Subject: *Re: How to execute spark-submit on Note
>
>
> Thank you for your quick reply again, Jeff.
>
> Yes, I know the difference between the execution environments of "%sh" and
> ">spark-submit".
> My question was how to execute spark-submit through the shell interpreter;
> that is, I am looking for a way to execute a binary program from a Zeppelin
> note. For now this is limited to Spark.
>
> Zeppelin seems to have several ways to execute Spark code, such as
> spark.pyspark and spark.sql, so I was wondering how to do "spark-submit".
>
> I am sorry to take up your time, but I would appreciate it if you could see
> what I am asking and show me some tips.
>
> -Keiji
>
>
> 2017-10-03 18:30 GMT+09:00 Jeff Zhang :
>
>> %sh is shell interpreter, you can run spark-submit just as you run it in
>> shell terminal.
>>
>> 小野圭二 wrote on Tuesday, October 3, 2017 at 4:58 PM:
>>
>>> Thank you for your reply, Jeff
>>>
>>> "%sh" ?
>>> "sh" seems like request something execution code.
>>> I tried "%sh", then
>>>
>>> %sh 
>>>   %sh bash: : no permission
>>>
>>> I made binary file from .py to .pyc, but the answer was as same.
>>> I am sorry seems like doubting you, but Is "%sh" the resolution?
>>>
>>> -Keiji
>>>
>>> 2017-10-03 17:35 GMT+09:00 Jianfeng (Jeff) Zhang:
>>>

 I am surprised why would you use %spark-submit, there’s no document
 about %spark-submit.   If you want to use spark-submit in zeppelin, then
 you could use %sh


 Best Regard,
 Jeff Zhang


 From: 小野圭二 
 Reply-To: "users@zeppelin.apache.org" 
 Date: Tuesday, October 3, 2017 at 12:49 PM
 To: "users@zeppelin.apache.org" 
 Subject: How to execute spark-submit on Note

 Hi all,

 I searched this topic on the archive of ml, but still could not find
 out the solution clearly.
 So i have tried to post this again(maybe).

 I am using ver 0.8.0, and have installed spark 2.2 on the other path,
 just for checking my test program.
 Then i wrote a quite simple sample python code to check the how to.

 1. the code works fine on a note in Zeppelin
 2. the same code but added the initialize code for SparkContext in it
 works fine on the Spark by using 'spark-submit'.
 3. tried to execute "2" from a note in Zeppelin with the following
 script.
 yes, "spark" interpreter has been implemented in the note.
 then on the note,
 %spark-submit 
   -> interpreter not found error
 4.I have arranged 'SPARK_SUBMIT_OPTIONS' in zeppelin-env.sh order by
 the doc
 ex. export SPARK_SUBMIT_OPTIONS='--packages
 com.databricks:spark-csv_2.10:1.2.0'
 5. then running
  %spark-submit 
   -> interpreter not found error  (as same as "3")

 How can i use spark-submit 

Re: How to execute spark-submit on Note

2017-10-03 Thread 小野圭二
Thank you for your quick reply again, Jeff.

Yes, I know the difference between the execution environments of "%sh" and
">spark-submit".
My question was how to execute spark-submit through the shell interpreter;
that is, I am looking for a way to execute a binary program from a Zeppelin
note. For now this is limited to Spark.

Zeppelin seems to have several ways to execute Spark code, such as
spark.pyspark and spark.sql, so I was wondering how to do "spark-submit".

I am sorry to take up your time, but I would appreciate it if you could see
what I am asking and show me some tips.

-Keiji


2017-10-03 18:30 GMT+09:00 Jeff Zhang :

> %sh is shell interpreter, you can run spark-submit just as you run it in
> shell terminal.
>
> 小野圭二 wrote on Tuesday, October 3, 2017 at 4:58 PM:
>
>> Thank you for your reply, Jeff
>>
>> "%sh" ?
>> "sh" seems like request something execution code.
>> I tried "%sh", then
>>
>> %sh 
>>   %sh bash: : no permission
>>
>> I made binary file from .py to .pyc, but the answer was as same.
>> I am sorry seems like doubting you, but Is "%sh" the resolution?
>>
>> -Keiji
>>
>> 2017-10-03 17:35 GMT+09:00 Jianfeng (Jeff) Zhang:
>>
>>>
>>> I am surprised why would you use %spark-submit, there’s no document
>>> about %spark-submit.   If you want to use spark-submit in zeppelin, then
>>> you could use %sh
>>>
>>>
>>> Best Regard,
>>> Jeff Zhang
>>>
>>>
>>> From: 小野圭二 
>>> Reply-To: "users@zeppelin.apache.org" 
>>> Date: Tuesday, October 3, 2017 at 12:49 PM
>>> To: "users@zeppelin.apache.org" 
>>> Subject: How to execute spark-submit on Note
>>>
>>> Hi all,
>>>
>>> I searched this topic on the archive of ml, but still could not find out
>>> the solution clearly.
>>> So i have tried to post this again(maybe).
>>>
>>> I am using ver 0.8.0, and have installed spark 2.2 on the other path,
>>> just for checking my test program.
>>> Then i wrote a quite simple sample python code to check the how to.
>>>
>>> 1. the code works fine on a note in Zeppelin
>>> 2. the same code but added the initialize code for SparkContext in it
>>> works fine on the Spark by using 'spark-submit'.
>>> 3. tried to execute "2" from a note in Zeppelin with the following
>>> script.
>>> yes, "spark" interpreter has been implemented in the note.
>>> then on the note,
>>> %spark-submit 
>>>   -> interpreter not found error
>>> 4.I have arranged 'SPARK_SUBMIT_OPTIONS' in zeppelin-env.sh order by the
>>> doc
>>> ex. export SPARK_SUBMIT_OPTIONS='--packages
>>> com.databricks:spark-csv_2.10:1.2.0'
>>> 5. then running
>>>  %spark-submit 
>>>   -> interpreter not found error  (as same as "3")
>>>
>>> How can i use spark-submit from a note?
>>> Any advice thanks.
>>>
>>> -Keiji
>>>
>>
>>


Re: How to execute spark-submit on Note

2017-10-03 Thread Jeff Zhang
%sh is the shell interpreter; you can run spark-submit there just as you
would run it in a shell terminal.

小野圭二 wrote on Tuesday, October 3, 2017 at 4:58 PM:

> Thank you for your reply, Jeff
>
> "%sh" ?
> "sh" seems like request something execution code.
> I tried "%sh", then
>
> %sh 
>   %sh bash: : no permission
>
> I made binary file from .py to .pyc, but the answer was as same.
> I am sorry seems like doubting you, but Is "%sh" the resolution?
>
> -Keiji
>
> 2017-10-03 17:35 GMT+09:00 Jianfeng (Jeff) Zhang :
>
>>
>> I am surprised why would you use %spark-submit, there’s no document about
>> %spark-submit.   If you want to use spark-submit in zeppelin, then you
>> could use %sh
>>
>>
>> Best Regard,
>> Jeff Zhang
>>
>>
>> From: 小野圭二 
>> Reply-To: "users@zeppelin.apache.org" 
>> Date: Tuesday, October 3, 2017 at 12:49 PM
>> To: "users@zeppelin.apache.org" 
>> Subject: How to execute spark-submit on Note
>>
>> Hi all,
>>
>> I searched this topic on the archive of ml, but still could not find out
>> the solution clearly.
>> So i have tried to post this again(maybe).
>>
>> I am using ver 0.8.0, and have installed spark 2.2 on the other path,
>> just for checking my test program.
>> Then i wrote a quite simple sample python code to check the how to.
>>
>> 1. the code works fine on a note in Zeppelin
>> 2. the same code but added the initialize code for SparkContext in it
>> works fine on the Spark by using 'spark-submit'.
>> 3. tried to execute "2" from a note in Zeppelin with the following script.
>> yes, "spark" interpreter has been implemented in the note.
>> then on the note,
>> %spark-submit 
>>   -> interpreter not found error
>> 4.I have arranged 'SPARK_SUBMIT_OPTIONS' in zeppelin-env.sh order by the
>> doc
>> ex. export SPARK_SUBMIT_OPTIONS='--packages
>> com.databricks:spark-csv_2.10:1.2.0'
>> 5. then running
>>  %spark-submit 
>>   -> interpreter not found error  (as same as "3")
>>
>> How can i use spark-submit from a note?
>> Any advice thanks.
>>
>> -Keiji
>>
>
>


Re: How to execute spark-submit on Note

2017-10-03 Thread 小野圭二
Thank you for your reply, Jeff.

"%sh"?
"sh" seems to expect some executable code.
I tried "%sh", and got:

%sh 
  %sh bash: : no permission

I compiled the .py file into a .pyc binary, but the result was the same.
I am sorry if this seems like doubting you, but is "%sh" really the solution?
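For what it's worth, a "no permission" message from the shell usually means the file lacks the execute bit. A minimal demonstration with a stand-in script (the path is hypothetical):

```shell
# Start clean, then create a stand-in script (path is hypothetical).
rm -f /tmp/myjob.sh
cat > /tmp/myjob.sh <<'EOF'
#!/bin/sh
echo "job ran"
EOF

# Without the execute bit the shell refuses to run it:
/tmp/myjob.sh 2>/dev/null || echo "not executable yet"

# Grant execute permission, then it runs:
chmod +x /tmp/myjob.sh
/tmp/myjob.sh
```

Alternatively, invoking the interpreter explicitly (e.g. `python myapp.pyc`) does not require the execute bit at all.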

-Keiji

2017-10-03 17:35 GMT+09:00 Jianfeng (Jeff) Zhang :

>
> I am surprised why would you use %spark-submit, there’s no document about
> %spark-submit.   If you want to use spark-submit in zeppelin, then you
> could use %sh
>
>
> Best Regard,
> Jeff Zhang
>
>
> From: 小野圭二 
> Reply-To: "users@zeppelin.apache.org" 
> Date: Tuesday, October 3, 2017 at 12:49 PM
> To: "users@zeppelin.apache.org" 
> Subject: How to execute spark-submit on Note
>
> Hi all,
>
> I searched this topic on the archive of ml, but still could not find out
> the solution clearly.
> So i have tried to post this again(maybe).
>
> I am using ver 0.8.0, and have installed spark 2.2 on the other path, just
> for checking my test program.
> Then i wrote a quite simple sample python code to check the how to.
>
> 1. the code works fine on a note in Zeppelin
> 2. the same code but added the initialize code for SparkContext in it
> works fine on the Spark by using 'spark-submit'.
> 3. tried to execute "2" from a note in Zeppelin with the following script.
> yes, "spark" interpreter has been implemented in the note.
> then on the note,
> %spark-submit 
>   -> interpreter not found error
> 4.I have arranged 'SPARK_SUBMIT_OPTIONS' in zeppelin-env.sh order by the
> doc
> ex. export SPARK_SUBMIT_OPTIONS='--packages
> com.databricks:spark-csv_2.10:1.2.0'
> 5. then running
>  %spark-submit 
>   -> interpreter not found error  (as same as "3")
>
> How can i use spark-submit from a note?
> Any advice thanks.
>
> -Keiji
>


Re: How to execute spark-submit on Note

2017-10-03 Thread Jianfeng (Jeff) Zhang

I am surprised that you would use %spark-submit; there is no documentation
for %spark-submit. If you want to use spark-submit in Zeppelin, you can
use %sh.


Best Regards,
Jeff Zhang


From: 小野圭二
Reply-To: "users@zeppelin.apache.org"
Date: Tuesday, October 3, 2017 at 12:49 PM
To: "users@zeppelin.apache.org"
Subject: How to execute spark-submit on Note

Hi all,

I searched for this topic in the mailing list archive, but still could not
find a clear solution.
So I am trying to post this again (maybe).

I am using version 0.8.0 and have installed Spark 2.2 at a separate path,
just for checking my test program.
Then I wrote a quite simple sample Python program to check how to do this.

1. The code works fine in a note in Zeppelin.
2. The same code, with initialization code for the SparkContext added,
works fine on Spark using 'spark-submit'.
3. I tried to execute "2" from a note in Zeppelin with the following script
(yes, the "spark" interpreter is bound to the note):
%spark-submit 
  -> interpreter not found error
4. I set 'SPARK_SUBMIT_OPTIONS' in zeppelin-env.sh as described in the doc,
e.g. export SPARK_SUBMIT_OPTIONS='--packages
com.databricks:spark-csv_2.10:1.2.0'
5. Then, running
 %spark-submit 
  -> interpreter not found error (same as "3")

How can i use spark-submit from a note?
Any advice thanks.

-Keiji


RE: Is any limitation of maximum interpreter processes?

2017-10-03 Thread Belousov Maksim Eduardovich
> Which interpreter is pending ?
There comes a time when no paragraph with any interpreter will run; they
remain in the 'Pending' state.
We use local Spark instances in the Spark interpreter.

The logs contain no errors.


Maksim Belousov
Architect
Reporting and Data Marts Department
Data Warehousing and Reporting Division
Tel.: +7 495 648-10-00, ext. 2271

From: Jianfeng (Jeff) Zhang [mailto:jzh...@hortonworks.com]
Sent: Tuesday, October 03, 2017 2:01 AM
To: users@zeppelin.apache.org
Subject: Re: Is any limitation of maximum interpreter processes?


Which interpreter is pending? It is possible that the Spark interpreter is
pending due to YARN resource capacity if you run it in yarn-client mode.

If it is pending, check the log first.
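A sketch of checking those logs from the shell; the install path is hypothetical, so adjust ZEPPELIN_HOME for your deployment:

```shell
# Hypothetical install path: adjust ZEPPELIN_HOME for your deployment.
ZEPPELIN_HOME=/opt/zeppelin
LOG_DIR="$ZEPPELIN_HOME/logs"

if [ -d "$LOG_DIR" ]; then
  # one log file per interpreter process; look at the newest ones first
  ls -t "$LOG_DIR" | head -n 5
  grep -l ERROR "$LOG_DIR"/*.log 2>/dev/null || echo "no ERROR lines found"
else
  echo "no log directory at $LOG_DIR"
fi
```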



Best Regards,
Jeff Zhang


From: Belousov Maksim Eduardovich
Reply-To: "users@zeppelin.apache.org"
Date: Monday, October 2, 2017 at 9:26 PM
To: "users@zeppelin.apache.org"
Subject: Is any limitation of maximum interpreter processes?
Subject: Is any limitation of maximum interpreter processes?

Hello, users!

Our analysts run notes with these interpreters: markdown, one or two jdbc,
and pyspark. The interpreters are instantiated Per User in an isolated
process and Per Note in an isolated process.

The analysts complain that sometimes paragraphs are not processed and stay
in the 'Pending' status.
We noticed that this happens when the number of started interpreter
processes is about 90-100.
If an admin restarts one of the popular interpreters (which kills some
interpreter processes), the paragraphs become 'Running'.

We cannot see any load on the Zeppelin server while paragraphs are pending;
RAM is sufficient and iowait is ~0.
Also, we cannot find any parameter controlling the maximum number of
interpreter processes.
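A back-of-envelope sketch of why the process count climbs (all numbers are hypothetical): with both "Per User, isolated" and "Per Note, isolated", each (user, note, interpreter) combination gets its own process:

```shell
# All numbers are hypothetical; plug in your own.
USERS=10
NOTES_PER_USER=3
INTERPRETERS_PER_NOTE=3   # e.g. markdown, jdbc, pyspark
echo $((USERS * NOTES_PER_USER * INTERPRETERS_PER_NOTE))   # prints 90
```

Each of those is a separate JVM, so around 90-100 of them can hit OS-level limits (max user processes, open files, memory) even while the Zeppelin server itself looks idle; checking `ulimit -u` for the zeppelin user may be worthwhile.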


Has anyone faced the same problem? How can it be solved?


Thanks,


Maksim Belousov