Re: how to setup notebook storage path

2020-06-16 Thread Manuel Sopena Ballesteros
Ideally, each user should use the /home//zeppelin/notebooks folder.

is there a way to do this?

thank you



Re: how to setup notebook storage path

2020-06-16 Thread Manuel Sopena Ballesteros
thank you Jeff,


do we need to put the full path?


From: Jeff Zhang 
Sent: Tuesday, 16 June 2020 4:49:49 PM
To: users
Subject: Re: how to setup notebook storage path

zeppelin.notebook.dir in zeppelin-site.xml is the notebook location for 
VFSNotebookRepo
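As a sketch of what that looks like in zeppelin-site.xml (the storage class name is the usual fully-qualified name for VFSNotebookRepo; the directory value below is purely illustrative):

```xml
<!-- Switch notebook storage to the local-filesystem repo -->
<property>
  <name>zeppelin.notebook.storage</name>
  <value>org.apache.zeppelin.notebook.repo.VFSNotebookRepo</value>
</property>

<!-- Where VFSNotebookRepo keeps the notes (illustrative path) -->
<property>
  <name>zeppelin.notebook.dir</name>
  <value>/opt/zeppelin/notebook</value>
</property>
```

If zeppelin.notebook.dir is left unset, the stock configuration points it at a notebook/ directory relative to the Zeppelin installation, if memory serves.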

Manuel Sopena Ballesteros 
mailto:manuel...@garvan.org.au>> 于2020年6月16日周二 
下午2:43写道:

Dear Zeppelin community,


I am using Zeppelin 0.8.0 deployed by HDP/Ambari; by default it uses 
FileSystemNotebookRepo as notebook storage with path /user/.


I would like to change it to VFSNotebookRepo instead of Hadoop.


I can change zeppelin.notebook.storage in the zeppelin-site configuration file, 
so my questions are:


  *   What is the default location where Zeppelin will store the notebooks 
under VFSNotebookRepo?
  *   How can I specify the location of the notebooks?

Thank you very much
Manuel


NOTICE
Please consider the environment before printing this email. This message and 
any attachments are intended for the addressee named and may contain legally 
privileged/confidential/copyright information. If you are not the intended 
recipient, you should not read, use, disclose, copy or distribute this 
communication. If you have received this message in error please notify us at 
once by return email and then delete both messages. We accept no liability for 
the distribution of viruses or similar in electronic communications. This 
notice should not be removed.


--
Best Regards

Jeff Zhang


how to setup path to store notes

2020-06-16 Thread Manuel Sopena Ballesteros
Dear Zeppelin community,


I am using Zeppelin 0.8.0 deployed by HDP/Ambari; by default it uses 
FileSystemNotebookRepo as notebook storage with path /user/.


I would like to change it to VFSNotebookRepo instead of Hadoop.


I can change zeppelin.notebook.storage in the zeppelin-site configuration file, 
so my questions are:


  *   What is the default location where Zeppelin will store the notebooks 
under VFSNotebookRepo?
  *   How can I specify the location of the notebooks?

Thank you very much
Manuel



how to setup notebook storage path

2020-06-15 Thread Manuel Sopena Ballesteros
Dear Zeppelin community,


I am using Zeppelin 0.8.0 deployed by HDP/Ambari; by default it uses 
FileSystemNotebookRepo as notebook storage with path /user/.


I would like to change it to VFSNotebookRepo instead of Hadoop.


I can change zeppelin.notebook.storage in the zeppelin-site configuration file, 
so my questions are:


  *   What is the default location where Zeppelin will store the notebooks 
under VFSNotebookRepo?
  *   How can I specify the location of the notebooks?

Thank you very much
Manuel




Re: Zeppelin context crashing

2020-05-19 Thread Manuel Sopena Ballesteros
I'm using 0.8.0.


It works for spark2.pyspark and spark2.r; so far it only fails in spark2.scala.


From: Jeff Zhang 
Sent: Wednesday, 20 May 2020 11:57:22 AM
To: users
Subject: Re: Zeppelin context crashing

Which version of Zeppelin are you using? I remember this is a bug in 0.8, but 
it is fixed in 0.8.2.


Re: Zeppelin context crashing

2020-05-19 Thread Manuel Sopena Ballesteros
This is what I can see in the Zeppelin logs:


DEBUG [2020-05-20 11:25:01,509] ({Exec Stream Pumper} 
RemoteInterpreterManagedProcess.java[processLine]:298) - 20/05/20 11:25:01 INFO 
Client: Application report for application_1587693971329_0042 (state: RUNNING)

 INFO [2020-05-20 11:25:01,753] ({pool-2-thread-74} 
SchedulerFactory.java[jobStarted]:109) - Job 20160223-144701_1698149301 started 
by scheduler 
org.apache.zeppelin.interpreter.remote.RemoteInterpreter-anaconda3:mansop:-shared_session

 INFO [2020-05-20 11:25:01,754] ({pool-2-thread-74} Paragraph.java[jobRun]:380) 
- Run paragraph [paragraph_id: 20160223-144701_1698149301, interpreter: 
anaconda3.spark, note_id: 2BWJFTXKJ, user: mansop]

DEBUG [2020-05-20 11:25:01,754] ({pool-2-thread-74} Paragraph.java[jobRun]:433) 
- RUN : z.input("name", "sun")

DEBUG [2020-05-20 11:25:01,754] ({pool-2-thread-74} 
RemoteInterpreter.java[interpret]:207) - st:

z.input("name", "sun")

DEBUG [2020-05-20 11:25:01,758] ({Thread-1602} 
RemoteInterpreterEventPoller.java[run]:114) - Receive message from 
RemoteInterpreter Process: RemoteInterpreterEvent(type:OUTPUT_UPDATE_ALL, 
data:{"messages":[],"noteId":"2BWJFTXKJ","paragraphId":"20160223-144701_1698149301"})

DEBUG [2020-05-20 11:25:01,768] ({Thread-1602} 
RemoteInterpreterEventPoller.java[run]:114) - Receive message from 
RemoteInterpreter Process: RemoteInterpreterEvent(type:META_INFOS, 
data:{"message":"Spark UI enabled","url":"http://zeta-6-13-mlx.mlx:39578"})

DEBUG [2020-05-20 11:25:01,770] ({Thread-1602} 
RemoteInterpreterEventPoller.java[run]:114) - Receive message from 
RemoteInterpreter Process: RemoteInterpreterEvent(type:OUTPUT_UPDATE_ALL, 
data:{"messages":[],"noteId":"2BWJFTXKJ","paragraphId":"20160223-144701_1698149301"})

DEBUG [2020-05-20 11:25:01,795] ({Thread-1602} 
RemoteInterpreterEventPoller.java[run]:114) - Receive message from 
RemoteInterpreter Process: RemoteInterpreterEvent(type:OUTPUT_UPDATE, 
data:{"data":"","index":"0","noteId":"2BWJFTXKJ","paragraphId":"20160223-144701_1698149301","type":"TEXT"})

DEBUG [2020-05-20 11:25:01,796] ({Thread-1602} 
RemoteInterpreterEventPoller.java[run]:114) - Receive message from 
RemoteInterpreter Process: RemoteInterpreterEvent(type:OUTPUT_APPEND, 
data:{"data":"\u003cconsole\u003e:24: error: not found: value 
z\n","index":"0","noteId":"2BWJFTXKJ","paragraphId":"20160223-144701_1698149301"})

DEBUG [2020-05-20 11:25:01,796] ({pool-26-thread-1} 
AppendOutputRunner.java[run]:91) - Processing time for append-output took 0 
milliseconds

DEBUG [2020-05-20 11:25:01,796] ({pool-26-thread-1} 
AppendOutputRunner.java[run]:107) - Processing size for append-output is 40 
characters

DEBUG [2020-05-20 11:25:01,796] ({Thread-1602} 
RemoteInterpreterEventPoller.java[run]:114) - Receive message from 
RemoteInterpreter Process: RemoteInterpreterEvent(type:OUTPUT_APPEND, 
data:{"data":"   z.input(\"name\", 
\"sun\")\n","index":"0","noteId":"2BWJFTXKJ","paragraphId":"20160223-144701_1698149301"})

DEBUG [2020-05-20 11:25:01,796] ({Thread-1602} 
RemoteInterpreterEventPoller.java[run]:114) - Receive message from 
RemoteInterpreter Process: RemoteInterpreterEvent(type:OUTPUT_APPEND, 
data:{"data":"   
^\n","index":"0","noteId":"2BWJFTXKJ","paragraphId":"20160223-144701_1698149301"})

DEBUG [2020-05-20 11:25:01,801] ({pool-2-thread-74} 
RemoteScheduler.java[run]:328) - Job Error, 20160223-144701_1698149301, null

DEBUG [2020-05-20 11:25:01,896] ({pool-26-thread-1} 
AppendOutputRunner.java[run]:91) - Processing time for append-output took 0 
milliseconds

DEBUG [2020-05-20 11:25:01,896] ({pool-26-thread-1} 
AppendOutputRunner.java[run]:107) - Processing size for append-output is 39 
characters

 INFO [2020-05-20 11:25:01,911] ({pool-2-thread-74} 
SchedulerFactory.java[jobFinished]:115) - Job 20160223-144701_1698149301 
finished by scheduler 
org.apache.zeppelin.interpreter.remote.RemoteInterpreter-anaconda3:mansop:-shared_session

DEBUG [2020-05-20 11:25:02,411] ({Exec Stream Pumper} 
RemoteInterpreterManagedProcess.java[processLine]:298) - 20/05/20 11:25:02 INFO 
Client: Application report for application_1587693971329_0043 (state: RUNNING)





Zeppelin context crashing

2020-05-19 Thread Manuel Sopena Ballesteros
Dear Zeppelin community,

For some reason my Zeppelin is not aware of the Zeppelin context

paragraph

%spark2.spark

z.input("name", "sun")



output


:24: error: not found: value z
   z.input("name", "sun")
   ^


Any thoughts?


thank you very much

Manuel



predefined notes to new users

2020-05-17 Thread Manuel Sopena Ballesteros
Dear Zeppelin community,


We are using Zeppelin through Hortonworks Data Platform. We realised that 
Zeppelin provides a set of predefined tutorial notes (e.g. Getting Started / 
Apache Spark in 5 Minutes) that are available to all new users.


We would like to:

 - Delete those notes.

 - Create new notes as tutorials and make them available to all new users.


How can we do that?


thank you very much

Manuel



Re: error restarting interpreter if shiro [url] /api/interpreter/** = authc is commented

2020-04-29 Thread Manuel Sopena Ballesteros
thank you this works like a charm!


From: meilfo...@gmx.net 
Sent: Wednesday, 29 April 2020 4:14:41 PM
To: users@zeppelin.apache.org
Cc: users
Subject: Aw: error restarting interpreter if shiro [url] /api/interpreter/** = 
authc is commented


Hi,

try this:

#Will allow all authenticated user to restart Interpreters
/api/interpreter/setting/restart/** = authc
#Will only allow the role "admin" to access/change Interpreter settings
/api/interpreter/** = authc, roles[admin]


Also change the interpreter mode to perUser (or perNote) and isolated, because 
otherwise, if userA restarts an interpreter, it will impact userB's interpreter 
instance (that instance is also gone). Also, sometimes (when the interpreter 
crashed before, e.g. because the Spark YARN app ran out of memory) you need to 
click "Restart Interpreter" twice: you get an error on the first attempt, but 
the second attempt/click will work.

Regards,
Tom



error restarting interpreter if shiro [url] /api/interpreter/** = authc is commented

2020-04-28 Thread Manuel Sopena Ballesteros
I have restricted access to the interpreter configuration page by editing the 
shiro [urls] section as follows:


[urls]
# This section is used for url-based security.
# You can secure interpreter, configuration and credential information by urls. 
Comment or uncomment the below urls that you want to hide.
# anon means the access is anonymous.
# authc means Form based Auth Security
# To enforce security, comment the line below and uncomment the next one
/api/version = anon
/api/interpreter/** = authc, roles[admin]
#/api/interpreter/** = authc
/api/configurations/** = authc, roles[admin]
/api/credential/** = authc, roles[admin]
#/** = anon
/** = authc


I keep getting "Error restart interpreter." when I try to restart the 
interpreter...


How can I fix this so I can restart the interpreter while access to the 
interpreter configuration section remains disallowed?


thank you





how to speedup AD authentication

2019-11-20 Thread Manuel Sopena Ballesteros
Hi,

Sometimes when a user tries to log in (to Zeppelin) it takes a few minutes... 
is there a way to speed this up?

Thank you

Manuel Sopena Ballesteros

Big Data Engineer | Kinghorn Centre for Clinical Genomics


a: 384 Victoria Street, Darlinghurst NSW 2010
p: +61 2 9355 5760  |  +61 4 12 123 123
e: manuel...@garvan.org.au<mailto:manuel...@garvan.org.au>

Like us on Facebook<http://www.facebook.com/garvaninstitute> | Follow us on 
Twitter<http://twitter.com/GarvanInstitute> and 
LinkedIn<http://www.linkedin.com/company/garvan-institute-of-medical-research>



RE: Hiding shiro.ini and other sensitive files from end users

2019-11-20 Thread Manuel Sopena Ballesteros
Have you set up impersonation in the Spark interpreter?

Manuel

From: Tony Primerano [mailto:primer...@tonycode.com]
Sent: Thursday, November 21, 2019 12:35 PM
To: users@zeppelin.apache.org
Subject: Re: Hiding shiro.ini and other sensitive files from end users

I am currently running in spark stand-alone mode.
On Wed, Nov 20, 2019, 6:25 PM Manuel Sopena Ballesteros 
mailto:manuel...@garvan.org.au>> wrote:
Hi Tony,

Are you running a yarn cluster?

thanks

Manuel

From: Tony Primerano 
[mailto:primer...@tonycode.com<mailto:primer...@tonycode.com>]
Sent: Thursday, November 21, 2019 9:08 AM
To: users@zeppelin.apache.org<mailto:users@zeppelin.apache.org>
Subject: Hiding shiro.ini and other sensitive files from end users

Is there a recommended way to hide secrets contained in shiro.ini and other 
files?

I made my shell interpreter run as a different user to prevent access to 
configuration files, but from a Python interpreter you can run shell commands 
as the Zeppelin process user.
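The shell escape Tony describes can be sketched like this (a minimal illustration, not Zeppelin API code; the shiro.ini path in the comment is hypothetical):

```python
import subprocess

# Every notebook user's Python paragraph runs inside the same interpreter
# process, so a shell escape executes as the Zeppelin service account:
os_user = subprocess.check_output(["id", "-un"]).decode().strip()
print("paragraph runs as OS user:", os_user)

# That account's file permissions apply, so any configuration file readable
# by the service account is exposed, e.g. (hypothetical path):
#   open("/etc/zeppelin/conf/shiro.ini").read()
```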

Is there a way to prevent this?

Thanks
Tony




RE: restrict interpreters to users

2019-11-19 Thread Manuel Sopena Ballesteros
Rather than an exception, I get an HTTP ERROR 503 when I hardcode a user in 
the shiro config.


Manuel

From: Manuel Sopena Ballesteros
Sent: Wednesday, November 20, 2019 11:37 AM
To: users@zeppelin.apache.org
Subject: RE: restrict interpreters to users

Unfortunately, Zeppelin will throw an exception if I change the [users] section 
in the shiro configuration.
I guess this is because I am using AD integration, hence local users are not 
allowed?

Please advise

Manuel

From: iamabug [mailto:18133622...@163.com]
Sent: Tuesday, November 19, 2019 4:54 PM
To: users@zeppelin.apache.org
Subject: Re: restrict interpreters to users

I think you misconfigured the [roles] and [users] sections.

Suppose you want mansop to be an admin and alice to be a plain user without 
access to the `interpreter` menu; you can try this:

[users]
mansop = password_for_mansop, admin
alice = password_for_alice

[roles]
role1 = *
role2 = *
role3 = *
admin = *

Note that alice has neither the admin role nor any other special role, so she 
can only use basic features.

I think the [roles] section should map role names to their permissions, but I 
am not aware of any specific permissions; the documentation needs to provide 
more details. Just to be clear, if the configuration above is used, role1, 
role2 and role3 have the same permissions as admin does.

Please let me know if it works.


On 11/19/2019 13:17,Manuel Sopena 
Ballesteros<mailto:manuel...@garvan.org.au> wrote:
We are using shiro to authenticate against Active Directory.

I changed the shiro configuration like this

[roles]
role1 = *
role2 = *
role3 = *
admin = mansop

however, users other than mansop can still see and edit interpreters.
NOTE: mansop is an AD login

I would like to restrict users from editing or viewing interpreters.

Any thoughts?

Thank you

Manuel

From: iamabug [mailto:18133622...@163.com<mailto:18133622...@163.com>]
Sent: Tuesday, November 19, 2019 12:31 PM
To: users@zeppelin.apache.org<mailto:users@zeppelin.apache.org>
Subject: Re:restrict interpreters to users


By `by default`, do you mean anonymous login?

If yes, enabling Shiro authentication can change this. Please refer to 
https://zeppelin.apache.org/docs/0.8.2/setup/security/shiro_authentication.html

On 11/19/2019 09:28,Manuel Sopena 
Ballesteros<mailto:manuel...@garvan.org.au> wrote:
Dear Zeppelin community,

By default, interpreter configuration can be changed by any user. Is there a 
way to avoid this? I would like to hide some interpreters so people can’t 
change them.

Thank you very much

Manuel Sopena Ballesteros




RE: restrict interpreters to users

2019-11-19 Thread Manuel Sopena Ballesteros
Unfortunately, Zeppelin throws an exception if I change the [users] section in
the shiro configuration.
I guess this is because I am using AD integration, so local users are not
allowed?

Please advise

Manuel

From: iamabug [mailto:18133622...@163.com]
Sent: Tuesday, November 19, 2019 4:54 PM
To: users@zeppelin.apache.org
Subject: Re: restrict interpreters to users

I think you have misconfigured the [roles] and [users] sections.

Suppose you want mansop to be an admin and alice to be a plain user without 
access to `interpreter` menu, you can try this:

[users]
mansop = password_for_mansop, admin
alice = password_for_alice

[roles]
role1 = *
role2 = *
role3 = *
admin = *

note that alice has neither the admin role nor any other special role, so she 
can only use basic features.

I think the [roles] section should map role names to their permissions, but I 
am not aware of any specific permissions, and the documentation needs to 
provide more details. Just to be clear: with the configuration above, role1, 
role2, and role3 have the same permissions as admin.

Please let me know if it works.
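As a side note, in my understanding the role definitions alone do not hide the Interpreter page; what usually restricts it is the [urls] section of shiro.ini, which maps roles onto Zeppelin's REST endpoints. A sketch along the lines of the 0.8 Shiro documentation (adapt the paths and role name to your setup):

```ini
[urls]
# keep interpreter, credential and configuration endpoints admin-only
/api/interpreter/** = authc, roles[admin]
/api/credential/** = authc, roles[admin]
/api/configurations/** = authc, roles[admin]
# everything else just requires login
/** = authc
```

With this in place, users without the admin role should get a 403 when opening the Interpreter menu.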


On 11/19/2019 13:17,Manuel Sopena 
Ballesteros<mailto:manuel...@garvan.org.au> wrote:
We are using shiro to authenticate against Active Directory.

I changed the shiro configuration like this

[roles]
role1 = *
role2 = *
role3 = *
admin = mansop

however, users other than mansop can still see and edit interpreters.
NOTE: mansop is an AD login

I would like to restrict users from editing or viewing interpreters.

Any thoughts?

Thank you

Manuel

From: iamabug [mailto:18133622...@163.com<mailto:18133622...@163.com>]
Sent: Tuesday, November 19, 2019 12:31 PM
To: users@zeppelin.apache.org<mailto:users@zeppelin.apache.org>
Subject: Re:restrict interpreters to users


By `by default`, do you mean anonymous login?

If so, enabling Shiro authentication can change this. Please refer to 
https://zeppelin.apache.org/docs/0.8.2/setup/security/shiro_authentication.html

On 11/19/2019 09:28,Manuel Sopena 
Ballesteros<mailto:manuel...@garvan.org.au> wrote:
Dear Zeppelin community,

By default interpreters configuration can be changed by any user. Is there a 
way to avoid this? I would like to hide some interpreters so people can’t 
change them.

Thank you very much

Manuel Sopena Ballesteros




RE: Re:restrict interpreters to users

2019-11-18 Thread Manuel Sopena Ballesteros
We are using shiro to authenticate against Active Directory.

I changed the shiro configuration like this

[roles]
role1 = *
role2 = *
role3 = *
admin = mansop

however, users other than mansop can still see and edit interpreters.
NOTE: mansop is an AD login

I would like to restrict users from editing or viewing interpreters.

Any thoughts?

Thank you

Manuel

From: iamabug [mailto:18133622...@163.com]
Sent: Tuesday, November 19, 2019 12:31 PM
To: users@zeppelin.apache.org
Subject: Re:restrict interpreters to users


By `by default`, do you mean anonymous login?

If so, enabling Shiro authentication can change this. Please refer to 
https://zeppelin.apache.org/docs/0.8.2/setup/security/shiro_authentication.html

On 11/19/2019 09:28,Manuel Sopena 
Ballesteros<mailto:manuel...@garvan.org.au> wrote:
Dear Zeppelin community,

By default interpreters configuration can be changed by any user. Is there a 
way to avoid this? I would like to hide some interpreters so people can’t 
change them.

Thank you very much

Manuel Sopena Ballesteros




restrict interpreters to users

2019-11-18 Thread Manuel Sopena Ballesteros
Dear Zeppelin community,

By default interpreters configuration can be changed by any user. Is there a 
way to avoid this? I would like to hide some interpreters so people can't 
change them.

Thank you very much

Manuel Sopena Ballesteros




RE: send parameters to pyspark

2019-11-14 Thread Manuel Sopena Ballesteros
Thank you very much, that worked

What about passing the --conf flag to pyspark?

Manuel
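(In Zeppelin's interpreter settings, each `--conf key=value` flag generally maps to an interpreter property of the same name, so the pyspark command quoted below would translate to roughly these properties on the spark interpreter — a sketch based only on the flags in this thread:)

```properties
spark.jars                     /share/ClusterShare/anaconda3/envs/python37/lib/python3.7/site-packages/hail/hail-all-spark.jar
spark.driver.extraClassPath    /share/ClusterShare/anaconda3/envs/python37/lib/python3.7/site-packages/hail/hail-all-spark.jar
spark.executor.extraClassPath  /share/ClusterShare/anaconda3/envs/python37/lib/python3.7/site-packages/hail/hail-all-spark.jar
spark.serializer               org.apache.spark.serializer.KryoSerializer
spark.kryo.registrator         is.hail.kryo.HailKryoRegistrator
```

Each property is added in the interpreter edit page; the interpreter has to be restarted for them to take effect.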

From: Jeff Zhang [mailto:zjf...@gmail.com]
Sent: Friday, November 15, 2019 12:35 PM
To: users
Subject: Re: send parameters to pyspark

you can set property spark.jars

Manuel Sopena Ballesteros 
mailto:manuel...@garvan.org.au>> 于2019年11月15日周五 
上午9:30写道:
Dear zeppelin community,

I need to send some parameters to pyspark so it can find extra jars.

This is an example of the parameters I need to send to pyspark:

pyspark \
  --jars 
/share/ClusterShare/anaconda3/envs/python37/lib/python3.7/site-packages/hail/hail-all-spark.jar
 \
  --conf 
spark.driver.extraClassPath=/share/ClusterShare/anaconda3/envs/python37/lib/python3.7/site-packages/hail/hail-all-spark.jar
 \
  --conf 
spark.executor.extraClassPath=/share/ClusterShare/anaconda3/envs/python37/lib/python3.7/site-packages/hail/hail-all-spark.jar
 \
  --conf spark.serializer=org.apache.spark.serializer.KryoSerializer \
  --conf spark.kryo.registrator=is.hail.kryo.HailKryoRegistrator

How could I configure my spark interpreter to do this?

Thank you very much


--
Best Regards

Jeff Zhang


send parameters to pyspark

2019-11-14 Thread Manuel Sopena Ballesteros
Dear zeppelin community,

I need to send some parameters to pyspark so it can find extra jars.

This is an example of the parameters I need to send to pyspark:

pyspark \
  --jars 
/share/ClusterShare/anaconda3/envs/python37/lib/python3.7/site-packages/hail/hail-all-spark.jar
 \
  --conf 
spark.driver.extraClassPath=/share/ClusterShare/anaconda3/envs/python37/lib/python3.7/site-packages/hail/hail-all-spark.jar
 \
  --conf 
spark.executor.extraClassPath=/share/ClusterShare/anaconda3/envs/python37/lib/python3.7/site-packages/hail/hail-all-spark.jar
 \
  --conf spark.serializer=org.apache.spark.serializer.KryoSerializer \
  --conf spark.kryo.registrator=is.hail.kryo.HailKryoRegistrator

How could I configure my spark interpreter to do this?

Thank you very much


RE: spark r interpreter resets working directory

2019-11-12 Thread Manuel Sopena Ballesteros
Sorry, I got confused with the terminology (I meant paragraph instead of note)

My interpreter is configured per user + isolated, which means the same 
interpreter (JVM) process for the same user.

First paragraph

%anaconda3.r

setwd("/home/mansop")
getwd()

output:
[1] "/home/mansop"

Second paragraph
%anaconda3.r

getwd()

output:
[1] 
"/d0/hadoop/yarn/local/usercache/mansop/appcache/application_1572410115474_0106/container_e16_1572410115474_0106_01_01"

Why doesn't R carry the working directory over to the second paragraph, even 
though both paragraphs run in the same interpreter process?

Thank you

Manuel

From: Manuel Sopena Ballesteros [mailto:manuel...@garvan.org.au]
Sent: Wednesday, November 13, 2019 2:32 PM
To: users@zeppelin.apache.org
Subject: spark r interpreter resets working directory

Dear Zeppelin community,

I am testing spark r interpreter and realised it does not keep the working 
directory across notes.
[screenshot attachment not included in the archive]

What is the reason behind this behavior?

Thank you very much



RE: spark r interpreter resets working directory

2019-11-12 Thread Manuel Sopena Ballesteros
OK, what should I do in order to reuse variables across different notes?

Manuel

From: Jeff Zhang [mailto:zjf...@gmail.com]
Sent: Wednesday, November 13, 2019 4:57 PM
To: users
Subject: Re: spark r interpreter resets working directory

In that case, each user uses a different interpreter process. In your second 
note, the current working directory is the YARN container location, which is 
expected.


Manuel Sopena Ballesteros 
mailto:manuel...@garvan.org.au>> 于2019年11月13日周三 
下午1:50写道:
Yarn cluster using impersonate (per user + isolated)

I guess that means each note uses a different interpreter?

Manuel

From: Jeff Zhang [mailto:zjf...@gmail.com<mailto:zjf...@gmail.com>]
Sent: Wednesday, November 13, 2019 2:35 PM
To: users
Subject: Re: spark r interpreter resets working directory

Do your different notes share the same interpreter? I suspect you are using 
per-note isolated or scoped mode.

It looks like you are using local or yarn-client mode for the first note, but 
yarn-cluster mode for the second note.

Manuel Sopena Ballesteros 
mailto:manuel...@garvan.org.au>> 于2019年11月13日周三 
上午11:31写道:
Dear Zeppelin community,

I am testing spark r interpreter and realised it does not keep the working 
directory across notes.
[screenshot attachment not included in the archive]

What is the reason behind this behavior?

Thank you very much



--
Best Regards

Jeff Zhang


--
Best Regards

Jeff Zhang


RE: spark r interpreter resets working directory

2019-11-12 Thread Manuel Sopena Ballesteros
Yarn cluster using impersonate (per user + isolated)

I guess that means each note uses a different interpreter?

Manuel

From: Jeff Zhang [mailto:zjf...@gmail.com]
Sent: Wednesday, November 13, 2019 2:35 PM
To: users
Subject: Re: spark r interpreter resets working directory

Do your different notes share the same interpreter? I suspect you are using 
per-note isolated or scoped mode.

It looks like you are using local or yarn-client mode for the first note, but 
yarn-cluster mode for the second note.

Manuel Sopena Ballesteros 
mailto:manuel...@garvan.org.au>> 于2019年11月13日周三 
上午11:31写道:
Dear Zeppelin community,

I am testing spark r interpreter and realised it does not keep the working 
directory across notes.
[screenshot attachment not included in the archive]

What is the reason behind this behavior?

Thank you very much



--
Best Regards

Jeff Zhang


spark r interpreter resets working directory

2019-11-12 Thread Manuel Sopena Ballesteros
Dear Zeppelin community,

I am testing spark r interpreter and realised it does not keep the working 
directory across notes.
[screenshot attachment not included in the archive]

What is the reason behind this behavior?

Thank you very much



python interpreter not installing (directory already exists)

2019-11-12 Thread Manuel Sopena Ballesteros
Hi,

For some reason the python interpreter is missing from the interpreter list, 
so I am trying to reinstall it.

$ sudo /usr/hdp/3.1.0.0-78/zeppelin/bin/install-interpreter.sh -n python
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=512m; 
support was removed in 8.0
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:/usr/hdp/3.1.0.0-78/zeppelin/lib/interpreter/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/usr/hdp/3.1.0.0-78/zeppelin/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/usr/hdp/3.1.0.0-78/zeppelin/lib/slf4j-simple-1.6.4.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Directory /usr/hdp/3.1.0.0-78/zeppelin/interpreter/python already exists

Skipped

Question: is it ok to delete /usr/hdp/3.1.0.0-78/zeppelin/interpreter/python 
and reinstall?
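One cautious approach (a sketch, not an official recommendation): move the directory aside rather than deleting it, then re-run the installer. The block below rehearses the pattern in a scratch directory; for the real install, substitute /usr/hdp/3.1.0.0-78/zeppelin and run install-interpreter.sh with sudo.

```shell
# Rehearse the move-aside-then-reinstall pattern in a scratch directory.
ZEPPELIN_HOME=$(mktemp -d)                    # stand-in for the real Zeppelin home
mkdir -p "$ZEPPELIN_HOME/interpreter/python"  # the "already exists" directory

# 1. keep a backup instead of deleting outright
mv "$ZEPPELIN_HOME/interpreter/python" "$ZEPPELIN_HOME/interpreter/python.bak"

# 2. re-run the installer; simulated here with mkdir, for real use:
#      sudo ./bin/install-interpreter.sh -n python
mkdir -p "$ZEPPELIN_HOME/interpreter/python"

ls "$ZEPPELIN_HOME/interpreter"
rm -rf "$ZEPPELIN_HOME"
```

If the reinstall works, the .bak copy can be removed; if not, moving it back restores the previous state.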

Thank you



RE: can't plot

2019-10-31 Thread Manuel Sopena Ballesteros
:37,711] ({pool-6-thread-2} 
Interpreter.java[getProperty]:222) - key: zeppelin.interpreter.localRepo, 
value: /usr/hdp/current/zeppelin-server/local-repo/mansop

So I am confused: the log says the ipython prerequisites are met, but Zeppelin 
still fails to start the ipython interpreter.

So what is involved, from Zeppelin's point of view, in starting the ipython 
interpreter?

Manuel

From: Jeff Zhang [mailto:zjf...@gmail.com]
Sent: Wednesday, October 30, 2019 5:10 PM
To: users
Subject: Re: can't plot

It might be due other reason, you can set the interpreter log level to be DEBUG 
to get more info.

Add following into log4j.properties

log4j.logger.org.apache.zeppelin.interpreter=DEBUG


Manuel Sopena Ballesteros 
mailto:manuel...@garvan.org.au>> 于2019年10月30日周三 
下午1:51写道:
Ok,

One more question, I am getting an error when I force ipython

%mansop.ipyspark

print("Hello world!")

java.io.IOException: Fail to launch IPython Kernel in 30 seconds at 
org.apache.zeppelin.python.IPythonInterpreter.launchIPythonKernel(IPythonInterpreter.java:297)
 at 
org.apache.zeppelin.python.IPythonInterpreter.open(IPythonInterpreter.java:154) 
at 
org.apache.zeppelin.spark.IPySparkInterpreter.open(IPySparkInterpreter.java:66) 
at 
org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
 at 
org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:617)
 at org.apache.zeppelin.scheduler.Job.run(Job.java:188) at 
org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:140) at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at 
java.util.concurrent.FutureTask.run(FutureTask.java:266) at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
 at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
 at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
at java.lang.Thread.run(Thread.java:745)


https://zeppelin.apache.org/docs/0.8.0/interpreter/python.html#ipython-support

both grpcio and jupyter are installed

any idea?
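A quick way to narrow this down outside Zeppelin is to probe for the modules directly with the same Python the interpreter uses. This is a sketch: grpcio and jupyter are the documented 0.8 prerequisites, while jupyter_client and ipykernel are my assumption about the pieces the kernel launch actually exercises.

```python
import importlib.util

# Probe each module without importing it; find_spec returns None if missing.
required = ["grpc", "jupyter_client", "ipykernel"]
status = {mod: importlib.util.find_spec(mod) is not None for mod in required}
for mod, ok in status.items():
    print("%s: %s" % (mod, "ok" if ok else "MISSING"))
```

Run it with the interpreter's zeppelin.pyspark.python binary; if anything prints MISSING there, the 30-second kernel launch timeout is the likely symptom.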

Manuel

From: Jeff Zhang [mailto:zjf...@gmail.com<mailto:zjf...@gmail.com>]
Sent: Wednesday, October 30, 2019 12:53 PM
To: users
Subject: Re: can't plot

Based on the error message, you are still using python instead of ipython. It 
is hard to tell what's wrong.

One suggestion is to try 0.8.2 which is the latest release.



Manuel Sopena Ballesteros 
mailto:manuel...@garvan.org.au>> 于2019年10月30日周三 
上午9:47写道:
Didn’t like %matplotlib inline

Traceback (most recent call last):
  File 
"/d1/hadoop/yarn/local/usercache/mansop/appcache/application_1570749574365_0083/container_e15_1570749574365_0083_01_01/tmp/zeppelin_pyspark-2736590645623350055.py",
 line 364, in <module>
    code = compile('\n'.join(stmts), '<stdin>', 'exec', ast.PyCF_ONLY_AST, 1)
  File "<stdin>", line 1
    %matplotlib inline
    ^
SyntaxError: invalid syntax

Manuel

From: Jeff Zhang [mailto:zjf...@gmail.com<mailto:zjf...@gmail.com>]
Sent: Wednesday, October 30, 2019 12:43 PM
To: users
Subject: Re: can't plot

Try this

%pyspark

%matplotlib inline
import matplotlib.pyplot as plt

plt.figure()

plt.plot([1, 2, 3])

Manuel Sopena Ballesteros 
mailto:manuel...@garvan.org.au>> 于2019年10月30日周三 
上午9:39写道:
Another example:

%pyspark

import matplotlib.pyplot as plt

plt.plot([1, 2, 3])
z.show(plt)
plt.close()



According to documentation
https://zeppelin.apache.org/docs/0.8.0/interpreter/python.html#matplotlib-integration

Am I right assuming that I can use z.show in %pyspark?

Thank you

Manuel

From: Manuel Sopena Ballesteros 
[mailto:manuel...@garvan.org.au<mailto:manuel...@garvan.org.au>]
Sent: Wednesday, October 30, 2019 12:12 PM
To: users@zeppelin.apache.org<mailto:users@zeppelin.apache.org>
Subject: can't plot

Dear Zeppelin user community,

I am running Zeppelin 0.8.0 and I am not able to print a plot using pyspark 
interpreter:

This is my notebook:

%pyspark

import matplotlib.pyplot as plt

plt.figure()

plt.plot([1, 2, 3])

And this is the output:

[<matplotlib.lines.Line2D object at 0x...>]

Any idea?
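(For context: `%matplotlib inline` is an IPython magic, so it can only work once the ipython interpreter starts; under the plain python interpreter the figure has to be rendered explicitly. A standalone sketch, runnable outside Zeppelin, showing the same plot rendered through the headless Agg backend:)

```python
import io
import matplotlib
matplotlib.use("Agg")            # headless backend; no display needed
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([1, 2, 3])
buf = io.BytesIO()
fig.savefig(buf, format="png")   # render explicitly instead of %matplotlib inline
plt.close(fig)
png = buf.getvalue()             # PNG bytes ready to display
```

Inside a %pyspark paragraph the 0.8.0 docs suggest the equivalent is calling plt.show() (or z.show(plt)) after plotting, rather than relying on the inline magic.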

RE: can't plot

2019-10-29 Thread Manuel Sopena Ballesteros
Ok,

One more question, I am getting an error when I force ipython

%mansop.ipyspark

print("Hello world!")

java.io.IOException: Fail to launch IPython Kernel in 30 seconds at 
org.apache.zeppelin.python.IPythonInterpreter.launchIPythonKernel(IPythonInterpreter.java:297)
 at 
org.apache.zeppelin.python.IPythonInterpreter.open(IPythonInterpreter.java:154) 
at 
org.apache.zeppelin.spark.IPySparkInterpreter.open(IPySparkInterpreter.java:66) 
at 
org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
 at 
org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:617)
 at org.apache.zeppelin.scheduler.Job.run(Job.java:188) at 
org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:140) at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at 
java.util.concurrent.FutureTask.run(FutureTask.java:266) at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
 at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
 at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
at java.lang.Thread.run(Thread.java:745)


https://zeppelin.apache.org/docs/0.8.0/interpreter/python.html#ipython-support

both grpcio and jupyter are installed

any idea?

Manuel

From: Jeff Zhang [mailto:zjf...@gmail.com]
Sent: Wednesday, October 30, 2019 12:53 PM
To: users
Subject: Re: can't plot

Based on the error message, you are still using python instead of ipython. It 
is hard to tell what's wrong.

One suggestion is to try 0.8.2 which is the latest release.



Manuel Sopena Ballesteros 
mailto:manuel...@garvan.org.au>> 于2019年10月30日周三 
上午9:47写道:
Didn’t like %matplotlib inline

Traceback (most recent call last):
  File 
"/d1/hadoop/yarn/local/usercache/mansop/appcache/application_1570749574365_0083/container_e15_1570749574365_0083_01_01/tmp/zeppelin_pyspark-2736590645623350055.py",
 line 364, in <module>
    code = compile('\n'.join(stmts), '<stdin>', 'exec', ast.PyCF_ONLY_AST, 1)
  File "<stdin>", line 1
    %matplotlib inline
    ^
SyntaxError: invalid syntax

Manuel

From: Jeff Zhang [mailto:zjf...@gmail.com<mailto:zjf...@gmail.com>]
Sent: Wednesday, October 30, 2019 12:43 PM
To: users
Subject: Re: can't plot

Try this

%pyspark

%matplotlib inline
import matplotlib.pyplot as plt

plt.figure()

plt.plot([1, 2, 3])

Manuel Sopena Ballesteros 
mailto:manuel...@garvan.org.au>> 于2019年10月30日周三 
上午9:39写道:
Another example:

%pyspark

import matplotlib.pyplot as plt

plt.plot([1, 2, 3])
z.show(plt)
plt.close()



According to documentation
https://zeppelin.apache.org/docs/0.8.0/interpreter/python.html#matplotlib-integration

Am I right assuming that I can use z.show in %pyspark?

Thank you

Manuel

From: Manuel Sopena Ballesteros 
[mailto:manuel...@garvan.org.au<mailto:manuel...@garvan.org.au>]
Sent: Wednesday, October 30, 2019 12:12 PM
To: users@zeppelin.apache.org<mailto:users@zeppelin.apache.org>
Subject: can't plot

Dear Zeppelin user community,

I am running Zeppelin 0.8.0 and I am not able to print a plot using pyspark 
interpreter:

This is my notebook:

%pyspark

import matplotlib.pyplot as plt

plt.figure()

plt.plot([1, 2, 3])

And this is the output:

[<matplotlib.lines.Line2D object at 0x...>]

Any idea?


--
Best Regards

Jeff Zhang
NOTICE
Please consider the environment before printing this email. This message and 
any attachments are intended for the addressee named and may contain legally 
privileged/confidential/copyright information. If you are not the intended 
recipient, you should not read, use, disclose, c

RE: can't plot

2019-10-29 Thread Manuel Sopena Ballesteros
It didn’t like `%matplotlib inline`:

Traceback (most recent call last):
  File "/d1/hadoop/yarn/local/usercache/mansop/appcache/application_1570749574365_0083/container_e15_1570749574365_0083_01_01/tmp/zeppelin_pyspark-2736590645623350055.py", line 364, in <module>
    code = compile('\n'.join(stmts), '<stdin>', 'exec', ast.PyCF_ONLY_AST, 1)
  File "<stdin>", line 1
    %matplotlib inline
    ^
SyntaxError: invalid syntax

Manuel

From: Jeff Zhang [mailto:zjf...@gmail.com]
Sent: Wednesday, October 30, 2019 12:43 PM
To: users
Subject: Re: can't plot

Try this

%pyspark

%matplotlib inline
import matplotlib.pyplot as plt

plt.figure()

plt.plot([1, 2, 3])
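If `%matplotlib inline` is rejected with a SyntaxError (the magic needs an IPython-backed interpreter, which the plain %pyspark interpreter in Zeppelin 0.8 may not provide), one fallback is to force the non-interactive Agg backend before pyplot is first imported and hand the figure to Zeppelin's `z.show()`. The sketch below is illustrative and not from the thread; `z` only exists inside a Zeppelin note:

```python
# Fallback when the "inline" magic is unavailable: select the
# non-interactive Agg backend before pyplot is first imported.
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt

fig = plt.figure()
plt.plot([1, 2, 3])
# In a Zeppelin note you would now render the figure with:
#   z.show(plt)   # z is Zeppelin's context object, not defined here
```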

Manuel Sopena Ballesteros <manuel...@garvan.org.au> wrote on Wed, 30 Oct 2019 at 9:39 AM:
Another example:

%pyspark

import matplotlib.pyplot as plt

plt.plot([1, 2, 3])
z.show(plt)
plt.close()



According to the documentation:
https://zeppelin.apache.org/docs/0.8.0/interpreter/python.html#matplotlib-integration

Am I right in assuming that I can use z.show in %pyspark?

Thank you

Manuel

From: Manuel Sopena Ballesteros [mailto:manuel...@garvan.org.au]
Sent: Wednesday, October 30, 2019 12:12 PM
To: users@zeppelin.apache.org
Subject: can't plot

Dear Zeppelin user community,

I am running Zeppelin 0.8.0 and I am not able to display a plot using the pyspark
interpreter:

This is my notebook:

%pyspark

import matplotlib.pyplot as plt

plt.figure()

plt.plot([1, 2, 3])

And this is the output:

[]

Any idea?
NOTICE
Please consider the environment before printing this email. This message and 
any attachments are intended for the addressee named and may contain legally 
privileged/confidential/copyright information. If you are not the intended 
recipient, you should not read, use, disclose, copy or distribute this 
communication. If you have received this message in error please notify us at 
once by return email and then delete both messages. We accept no liability for 
the distribution of viruses or similar in electronic communications. This 
notice should not be removed.


--
Best Regards

Jeff Zhang


RE: can't plot

2019-10-29 Thread Manuel Sopena Ballesteros
Another example:

%pyspark

import matplotlib.pyplot as plt

plt.plot([1, 2, 3])
z.show(plt)
plt.close()



According to the documentation:
https://zeppelin.apache.org/docs/0.8.0/interpreter/python.html#matplotlib-integration

Am I right in assuming that I can use z.show in %pyspark?

Thank you

Manuel

From: Manuel Sopena Ballesteros [mailto:manuel...@garvan.org.au]
Sent: Wednesday, October 30, 2019 12:12 PM
To: users@zeppelin.apache.org
Subject: can't plot

Dear Zeppelin user community,

I am running Zeppelin 0.8.0 and I am not able to display a plot using the pyspark
interpreter:

This is my notebook:

%pyspark

import matplotlib.pyplot as plt

plt.figure()

plt.plot([1, 2, 3])

And this is the output:

[]

Any idea?


can't plot

2019-10-29 Thread Manuel Sopena Ballesteros
Dear Zeppelin user community,

I am running Zeppelin 0.8.0 and I am not able to display a plot using the pyspark
interpreter:

This is my notebook:

%pyspark

import matplotlib.pyplot as plt

plt.figure()

plt.plot([1, 2, 3])

And this is the output:

[]

Any idea?


RE: error starting interpreter in yarn cluster mode

2019-10-17 Thread Manuel Sopena Ballesteros
Zeppelin version is 0.8.0

No changes to the source code, but this Zeppelin was installed by HDP.

Manuel

From: Jeff Zhang [mailto:zjf...@gmail.com]
Sent: Friday, October 18, 2019 5:48 PM
To: users
Subject: Re: error starting interpreter in yarn cluster mode

The error seems a little weird. What version of Zeppelin do you use? Did you
make any changes to the source code?

Manuel Sopena Ballesteros <manuel...@garvan.org.au> wrote on Fri, 18 Oct 2019 at 2:36 PM:
Dear Zeppelin community,

I am running the script below in Zeppelin yarn-cluster mode:

%pyspark
print("Hello world!")

output:

<console>:5: error: object zeppelin is not a member of package org.apache
       var value: org.apache.zeppelin.spark.SparkZeppelinContext = _
                              ^
<console>:6: error: object zeppelin is not a member of package org.apache
       def set(x: Any) = value = x.asInstanceOf[org.apache.zeppelin.spark.SparkZeppelinContext]
                                                ^
/d1/hadoop/yarn/local/usercache/mansop/appcache/application_1570749574365_0038/container_e15_1570749574365_0038_01_01/tmp/zeppelin_pyspark-5060717441683949247.py:179: UserWarning: Unable to load inline matplotlib backend, falling back to Agg
  warnings.warn("Unable to load inline matplotlib backend, "
Hello world!

Any idea why am I getting these errors?

Thank you


--
Best Regards

Jeff Zhang


error starting interpreter in yarn cluster mode

2019-10-17 Thread Manuel Sopena Ballesteros
Dear Zeppelin community,

I am running the script below in Zeppelin yarn-cluster mode:

%pyspark
print("Hello world!")

output:

<console>:5: error: object zeppelin is not a member of package org.apache
       var value: org.apache.zeppelin.spark.SparkZeppelinContext = _
                              ^
<console>:6: error: object zeppelin is not a member of package org.apache
       def set(x: Any) = value = x.asInstanceOf[org.apache.zeppelin.spark.SparkZeppelinContext]
                                                ^
/d1/hadoop/yarn/local/usercache/mansop/appcache/application_1570749574365_0038/container_e15_1570749574365_0038_01_01/tmp/zeppelin_pyspark-5060717441683949247.py:179: UserWarning: Unable to load inline matplotlib backend, falling back to Agg
  warnings.warn("Unable to load inline matplotlib backend, "
Hello world!

Any idea why am I getting these errors?

Thank you


RE: thrift.transport.TTransportException

2019-10-08 Thread Manuel Sopena Ballesteros
Hi Jeff,

Sorry for the late response.

I ran yarn-cluster mode with this setup

%spark2.conf

master yarn
spark.submit.deployMode cluster
zeppelin.pyspark.python /home/mansop/anaconda2/bin/python
spark.driver.memory 10g

I added `log4j.logger.org.apache.zeppelin.interpreter=DEBUG` to the
`log4j_yarn_cluster.properties` file, but nothing has changed; in fact the
`zeppelin-interpreter-spark2-mansop-root-zama-mlx.mlx.log` file is not updated
after running my notes

This code works

%pyspark

print("Hello world!")

However this one does not work:

%pyspark

a = "bigword"
aList = []
for i in range(1000):
aList.append(i**i*a)
#print aList

for word in aList:
print word
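As a side observation (mine, not a confirmed diagnosis): `i**i*a` repeats the string `i**i` times, so this loop tries to materialize astronomically large strings and paragraph output, which on its own can exhaust memory and kill the interpreter process. A bounded variant that keeps the same shape but caps the repeat count (the cap is my addition) would be:

```python
a = "bigword"
a_list = []
for i in range(1000):
    # cap the repeat count: i**i grows far beyond available memory
    a_list.append(min(i, 5) * a)

# print only a small sample instead of the full list
for word in a_list[:3]:
    print(word)
```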

which means I am still getting org.apache.thrift.transport.TTransportException 
at 
org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
 at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)

and spark logs says:
ERROR [2019-10-09 12:15:16,454] ({SIGTERM handler} 
SignalUtils.scala[apply$mcZ$sp]:43) - RECEIVED SIGNAL TERM
…
ERROR [2019-10-09 12:15:16,609] ({Reporter} Logging.scala[logError]:91) - 
Exception from Reporter thread.
org.apache.hadoop.yarn.exceptions.ApplicationAttemptNotFoundException: 
Application attempt appattempt_1570490897819_0013_01 doesn't exist in 
ApplicationMasterService cache.

Any idea?

Manuel

From: Jeff Zhang [mailto:zjf...@gmail.com]
Sent: Friday, October 4, 2019 5:12 PM
To: users
Subject: Re: thrift.transport.TTransportException

Then it looks like something is wrong with the python process. Do you run it in
yarn-cluster mode or yarn-client mode?
Try to add the following line to log4j.properties for yarn-client mode or 
log4j_yarn_cluster.properties for yarn-cluster mode

log4j.logger.org.apache.zeppelin.interpreter=DEBUG

Then try it again; this time you will get more log info. I suspect the python
process fails to start.
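For reference, the change Jeff describes is a one-line addition to the interpreter's log4j configuration (file names as in a stock Zeppelin 0.8 conf/ directory; an HDP layout may place them elsewhere):

```properties
# conf/log4j.properties               (yarn-client mode)
# conf/log4j_yarn_cluster.properties  (yarn-cluster mode)
log4j.logger.org.apache.zeppelin.interpreter=DEBUG
```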




Manuel Sopena Ballesteros <manuel...@garvan.org.au> wrote on Fri, 4 Oct 2019 at 9:09 AM:
Sorry for the late response,

Yes, I have successfully run a few simple Scala snippets using the %spark interpreter in
Zeppelin.

What should I do next?

Manuel

From: Jeff Zhang [mailto:zjf...@gmail.com]
Sent: Tuesday, October 1, 2019 5:44 PM
To: users
Subject: Re: thrift.transport.TTransportException

It looks like you are using pyspark; could you try just starting the Scala Spark
interpreter via `%spark`? First let's figure out whether it is related to
pyspark.



Manuel Sopena Ballesteros <manuel...@garvan.org.au> wrote on Tue, 1 Oct 2019 at 3:29 PM:
Dear Zeppelin community,

I would like to ask for advice regarding an error I am having with thrift.

I am getting quite a lot of these errors while running my notebooks:

org.apache.thrift.transport.TTransportException at 
org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
 at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86) at 
org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429) at 
org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318) at 
org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219)
 at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:77) at 
org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.recv_interpret(RemoteInterpreterService.java:274)
 at 
org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.interpret(RemoteInterpreterService.java:258)
 at 
org.apache.zeppelin.interpreter.remote.RemoteInterpreter$4.call(RemoteInterpreter.java:233)
 at 
org.apache.zeppelin.interpreter.remote.RemoteInterpreter$4.call(RemoteInterpreter.java:229)
 at 
org.apache.zeppelin.interpreter.remote.RemoteInterpreterProcess.callRemoteFunction(RemoteInterpreterProcess.java:135)
 at 
org.apache.zeppelin.interpreter.remote.RemoteInterpreter.interpret(RemoteInterpreter.java:228)
 at org.apache.zeppelin.notebook.Paragraph.jobRun(Paragraph.java:437) at 
org.apache.zeppelin.scheduler.Job.run(Job.java:188) at 
org.apache.zeppelin.scheduler.RemoteScheduler$JobRunner.run(RemoteScheduler.java:307)
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at 
java.util.concurrent.FutureTask.run(FutureTask.java:266) at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
 at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
 at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
at java.lang.Thread.run(Thread.java:745)

And this is the Spark driver application logs:
…
===
YARN executor launch context:
  env:
CLASSPATH -> 
{{PWD}}{{PWD}}/__spark_con

RE: thrift.transport.TTransportException

2019-10-03 Thread Manuel Sopena Ballesteros
Sorry for the late response,

Yes, I have successfully run a few simple Scala snippets using the %spark interpreter in
Zeppelin.

What should I do next?

Manuel

From: Jeff Zhang [mailto:zjf...@gmail.com]
Sent: Tuesday, October 1, 2019 5:44 PM
To: users
Subject: Re: thrift.transport.TTransportException

It looks like you are using pyspark; could you try just starting the Scala Spark
interpreter via `%spark`? First let's figure out whether it is related to
pyspark.



Manuel Sopena Ballesteros <manuel...@garvan.org.au> wrote on Tue, 1 Oct 2019 at 3:29 PM:
Dear Zeppelin community,

I would like to ask for advice regarding an error I am having with thrift.

I am getting quite a lot of these errors while running my notebooks:

org.apache.thrift.transport.TTransportException at 
org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
 at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86) at 
org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429) at 
org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318) at 
org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219)
 at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:77) at 
org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.recv_interpret(RemoteInterpreterService.java:274)
 at 
org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.interpret(RemoteInterpreterService.java:258)
 at 
org.apache.zeppelin.interpreter.remote.RemoteInterpreter$4.call(RemoteInterpreter.java:233)
 at 
org.apache.zeppelin.interpreter.remote.RemoteInterpreter$4.call(RemoteInterpreter.java:229)
 at 
org.apache.zeppelin.interpreter.remote.RemoteInterpreterProcess.callRemoteFunction(RemoteInterpreterProcess.java:135)
 at 
org.apache.zeppelin.interpreter.remote.RemoteInterpreter.interpret(RemoteInterpreter.java:228)
 at org.apache.zeppelin.notebook.Paragraph.jobRun(Paragraph.java:437) at 
org.apache.zeppelin.scheduler.Job.run(Job.java:188) at 
org.apache.zeppelin.scheduler.RemoteScheduler$JobRunner.run(RemoteScheduler.java:307)
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at 
java.util.concurrent.FutureTask.run(FutureTask.java:266) at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
 at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
 at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
at java.lang.Thread.run(Thread.java:745)

And this is the Spark driver application logs:
…
===
YARN executor launch context:
  env:
CLASSPATH -> 
{{PWD}}{{PWD}}/__spark_conf__{{PWD}}/__spark_libs__/*$HADOOP_CONF_DIR/usr/hdp/3.1.0.0-78/hadoop/*/usr/hdp/3.1.0.0-78/hadoop/lib/*/usr/hdp/current/hadoop-hdfs-client/*/usr/hdp/current/hadoop-hdfs-client/lib/*/usr/hdp/current/hadoop-yarn-client/*/usr/hdp/current/hadoop-yarn-client/lib/*$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:$PWD/mr-framework/hadoop/share/hadoop/tools/lib/*:/usr/hdp/3.1.0.0-78/hadoop/lib/hadoop-lzo-0.6.0.3.1.0.0-78.jar:/etc/hadoop/conf/secure{{PWD}}/__spark_conf__/__hadoop_conf__
SPARK_YARN_STAGING_DIR -> 
hdfs://gl-hdp-ctrl01-mlx.mlx:8020/user/mansop/.sparkStaging/application_1568954689585_0052
SPARK_USER -> mansop
PYTHONPATH -> 
/usr/hdp/current/spark2-client/python/lib/py4j-0.10.7-src.zip:/usr/hdp/current/spark2-client/python/:{{PWD}}/pyspark.zip{{PWD}}/py4j-0.10.7-src.zip

  command:

LD_LIBRARY_PATH="/usr/hdp/current/hadoop-client/lib/native:/usr/hdp/current/hadoop-client/lib/native/Linux-amd64-64:$LD_LIBRARY_PATH"
 \
  {{JAVA_HOME}}/bin/java \
  -server \
  -Xmx1024m \
  '-XX:+UseNUMA' \
  -Djava.io.tmpdir={{PWD}}/tmp \
  '-Dspark.history.ui.port=18081' \
  -Dspark.yarn.app.container.log.dir= \
  -XX:OnOutOfMemoryError='kill %p' \
  org.apache.spark.executor.CoarseGrainedExecutorBackend \
  --driver-url \
  spark://coarsegrainedschedu...@r640-1-12-mlx.mlx:35602 \
  --executor-id \
   \
  --hostname \
   \
  --cores \
  1 \
  --app-id \
  application_1568954689585_0052 \
  --user-class-path \
  file:$PWD/__app__.jar \
  1>/stdout \
  2>/stderr

  resources:
__app__.jar -> resource { scheme: "hdfs&

thrift.transport.TTransportException

2019-10-01 Thread Manuel Sopena Ballesteros
Dear Zeppelin community,

I would like to ask for advice regarding an error I am having with thrift.

I am getting quite a lot of these errors while running my notebooks:

org.apache.thrift.transport.TTransportException at 
org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
 at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86) at 
org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429) at 
org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318) at 
org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219)
 at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:77) at 
org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.recv_interpret(RemoteInterpreterService.java:274)
 at 
org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.interpret(RemoteInterpreterService.java:258)
 at 
org.apache.zeppelin.interpreter.remote.RemoteInterpreter$4.call(RemoteInterpreter.java:233)
 at 
org.apache.zeppelin.interpreter.remote.RemoteInterpreter$4.call(RemoteInterpreter.java:229)
 at 
org.apache.zeppelin.interpreter.remote.RemoteInterpreterProcess.callRemoteFunction(RemoteInterpreterProcess.java:135)
 at 
org.apache.zeppelin.interpreter.remote.RemoteInterpreter.interpret(RemoteInterpreter.java:228)
 at org.apache.zeppelin.notebook.Paragraph.jobRun(Paragraph.java:437) at 
org.apache.zeppelin.scheduler.Job.run(Job.java:188) at 
org.apache.zeppelin.scheduler.RemoteScheduler$JobRunner.run(RemoteScheduler.java:307)
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at 
java.util.concurrent.FutureTask.run(FutureTask.java:266) at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
 at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
 at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
at java.lang.Thread.run(Thread.java:745)

And this is the Spark driver application logs:
...
===
YARN executor launch context:
  env:
CLASSPATH -> 
{{PWD}}{{PWD}}/__spark_conf__{{PWD}}/__spark_libs__/*$HADOOP_CONF_DIR/usr/hdp/3.1.0.0-78/hadoop/*/usr/hdp/3.1.0.0-78/hadoop/lib/*/usr/hdp/current/hadoop-hdfs-client/*/usr/hdp/current/hadoop-hdfs-client/lib/*/usr/hdp/current/hadoop-yarn-client/*/usr/hdp/current/hadoop-yarn-client/lib/*$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:$PWD/mr-framework/hadoop/share/hadoop/tools/lib/*:/usr/hdp/3.1.0.0-78/hadoop/lib/hadoop-lzo-0.6.0.3.1.0.0-78.jar:/etc/hadoop/conf/secure{{PWD}}/__spark_conf__/__hadoop_conf__
SPARK_YARN_STAGING_DIR -> 
hdfs://gl-hdp-ctrl01-mlx.mlx:8020/user/mansop/.sparkStaging/application_1568954689585_0052
SPARK_USER -> mansop
PYTHONPATH -> 
/usr/hdp/current/spark2-client/python/lib/py4j-0.10.7-src.zip:/usr/hdp/current/spark2-client/python/:{{PWD}}/pyspark.zip{{PWD}}/py4j-0.10.7-src.zip

  command:

LD_LIBRARY_PATH="/usr/hdp/current/hadoop-client/lib/native:/usr/hdp/current/hadoop-client/lib/native/Linux-amd64-64:$LD_LIBRARY_PATH"
 \
  {{JAVA_HOME}}/bin/java \
  -server \
  -Xmx1024m \
  '-XX:+UseNUMA' \
  -Djava.io.tmpdir={{PWD}}/tmp \
  '-Dspark.history.ui.port=18081' \
  -Dspark.yarn.app.container.log.dir= \
  -XX:OnOutOfMemoryError='kill %p' \
  org.apache.spark.executor.CoarseGrainedExecutorBackend \
  --driver-url \
  spark://coarsegrainedschedu...@r640-1-12-mlx.mlx:35602 \
  --executor-id \
   \
  --hostname \
   \
  --cores \
  1 \
  --app-id \
  application_1568954689585_0052 \
  --user-class-path \
  file:$PWD/__app__.jar \
  1>/stdout \
  2>/stderr

  resources:
__app__.jar -> resource { scheme: "hdfs" host: "gl-hdp-ctrl01-mlx.mlx" 
port: 8020 file: 
"/user/mansop/.sparkStaging/application_1568954689585_0052/spark-interpreter-0.8.0.3.1.0.0-78.jar"
 } size: 20433040 timestamp: 1569804142906 type: FILE visibility: PRIVATE
__spark_conf__ -> resource { scheme: "hdfs" host: "gl-hdp-ctrl01-mlx.mlx" 
port: 8020 file: 
"/user/mansop/.sparkStaging/application_1568954689585_0052/__spark_conf__.zip" 
} size: 277725 timestamp: 1569804143239 type: ARCHIVE visibility: PRIVATE
sparkr -> resource { scheme: "hdfs" host: "gl-hdp-ctrl01-mlx.mlx" port: 
8020 file: 
"/user/mansop/.sparkStaging/application_1568954689585_0052/sparkr.zip" } size:

conda interpreter

2019-09-03 Thread Manuel Sopena Ballesteros
Dear Zeppelin user community,

I have a situation where I can't install R packages through Zeppelin:

1.   R expects interactive feedback, such as choosing a repository or
agreeing to compile and install a package from source.

2.   I need to be able to create multiple environments to keep different versions of
python and R for each project.

For 1) I don't think Zeppelin provides capabilities for user interaction. Am I
right in assuming this?
For 2) How should I manage this? The documentation says I can use conda, but that
only works for python... what if I want to run my environment on
Spark? What would you recommend?

Thank you very much



interactive notebook

2019-09-02 Thread Manuel Sopena Ballesteros
Dear Zeppelin community,

I am trying to install the following library

[inline screenshot: R prompting for input while running install.packages('Seurat')]

However, when I run the command above, `install.packages('Seurat')`, in a Zeppelin
notebook, it freezes; I guess because R is waiting for the user to select an option.

I know this is a silly example but this issue may happen in other situations.

Is there a way I can setup zeppelin to run in interactive mode?

Thank you very much


RE: conda and pyspark interpreter

2019-08-21 Thread Manuel Sopena Ballesteros
This relates to the python interpreter; how would it work if I need to use
pyspark?

Manuel

From: Jeff Zhang [mailto:zjf...@gmail.com]
Sent: Thursday, August 22, 2019 12:01 PM
To: users
Subject: Re: conda and pyspark interpreter

See the Zeppelin doc:
http://zeppelin.apache.org/docs/0.8.0/interpreter/python.html#conda


Manuel Sopena Ballesteros <manuel...@garvan.org.au> wrote on Thu, 22 Aug 2019 at 9:57 AM:
Hi,

Is there a way to integrate conda with the pyspark interpreter so users can create,
list, and activate environments?

Thank you very much

Manuel



--
Best Regards

Jeff Zhang


conda and pyspark interpreter

2019-08-21 Thread Manuel Sopena Ballesteros
Hi,

Is there a way to integrate conda with the pyspark interpreter so users can create,
list, and activate environments?

Thank you very much

Manuel



spark interpreter "master" parameter always resets to yarn-client after restarting zeppelin

2019-08-19 Thread Manuel Sopena Ballesteros
Dear Zeppelin user community,

I have a Zeppelin installation with Spark integration, and the "master"
parameter in the Spark interpreter configuration always resets its value from
"yarn" to "yarn-client" after a Zeppelin service reboot.

How can I stop that?

Thank you



python virtual environment on spark interpreter

2019-08-19 Thread Manuel Sopena Ballesteros
Dear Zeppelin user community,

I have a Zeppelin installation connected to a Spark cluster. I set up Zeppelin
to submit jobs in yarn-cluster mode, and impersonation is also enabled. Now I
would like to use a python virtual environment instead of the system one.
Is there a way to specify the python parameter in the Spark interpreter
settings so that it points to a folder under each user's home (e.g.
/home/{user_home}/python_virt_env/python) instead of a system path?

If not how should I achieve what I want?

Thank you

Manuel
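As far as I know, Zeppelin 0.8 does not expand placeholders such as 
{user_home} in interpreter properties. With impersonation enabled, though, the 
interpreter process runs as the notebook user, so a small wrapper script can 
resolve each user's home at launch time. This is only a sketch, assuming every 
user keeps a virtualenv at ~/python_virt_env; the wrapper path is arbitrary:

```shell
# Generate a wrapper that picks the per-user virtualenv python at launch.
# /tmp/per-user-python.sh and ~/python_virt_env are example paths.
cat > /tmp/per-user-python.sh <<'EOF'
#!/bin/sh
# $HOME belongs to the impersonated user when the interpreter starts.
exec "$HOME/python_virt_env/bin/python" "$@"
EOF
chmod +x /tmp/per-user-python.sh
```

Then point `zeppelin.pyspark.python` (and PYSPARK_PYTHON, for yarn-cluster 
mode) at the wrapper's path.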


users need to install their own python and R libraries

2019-08-13 Thread Manuel Sopena Ballesteros
Dear Zeppelin user community,

I am trying to set up Python and R to submit jobs through the Spark cluster. 
This is already done, but now I need to enable the users to install their own 
libraries.

I was thinking of asking the users to set up conda in their home directories 
and modifying `zeppelin.pyspark.python` to the full conda Python path. Users 
should then be able to choose either Python 2 or 3 using the `generic 
configuration interpreter`.

Is this the right way of doing what I am trying to do?

Thank you very much






RE: multiple interpreters for spark python2 and 3

2019-08-12 Thread Manuel Sopena Ballesteros
Hi,

Do I need to create two spark interpreter groups, or can I just create a new 
py3spark interpreter inside the existing spark interpreter group, as in the 
example below?

…
  {
"group": "spark",
"name": "pyspark",
"className": "org.apache.zeppelin.spark.PySparkInterpreter",
"properties": {
  "zeppelin.pyspark.python": {
"envName": "PYSPARK_PYTHON",
"propertyName": null,
"defaultValue": "python",
"description": "Python command to run pyspark with",
"type": "string"
  },
  "zeppelin.pyspark.useIPython": {
"envName": null,
"propertyName": "zeppelin.pyspark.useIPython",
"defaultValue": true,
"description": "whether use IPython when it is available",
"type": "checkbox"
  }
},
"editor": {
  "language": "python",
  "editOnDblClick": false,
  "completionKey": "TAB",
  "completionSupport": true
}
  },
  {
"group": "spark",
"name": "py3spark",
"className": "org.apache.zeppelin.spark.PySparkInterpreter",
"properties": {
  "zeppelin.py3spark.python": {
"envName": "PYSPARK_PYTHON",
"propertyName": null,
"defaultValue": "python3.6",
"description": "Python3.6 command to run pyspark with",
"type": "string"
  },
  "zeppelin.pyspark.useIPython": {
"envName": null,
"propertyName": "zeppelin.pyspark.useIPython",
"defaultValue": true,
"description": "whether use IPython when it is available",
    "type": "checkbox"
  }
},
"editor": {
  "language": "python",
  "editOnDblClick": false,
  "completionKey": "TAB",
  "completionSupport": true
}
  },
…
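One caveat with the sketch above: as far as I know, PySparkInterpreter reads 
the fixed property key `zeppelin.pyspark.python`, so a renamed key such as 
`zeppelin.py3spark.python` would most likely be ignored. A second entry in the 
same group can instead keep the standard key with a different default value:

```json
{
  "group": "spark",
  "name": "py3spark",
  "className": "org.apache.zeppelin.spark.PySparkInterpreter",
  "properties": {
    "zeppelin.pyspark.python": {
      "envName": "PYSPARK_PYTHON",
      "propertyName": null,
      "defaultValue": "python3.6",
      "description": "Python command to run pyspark with",
      "type": "string"
    }
  }
}
```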

Thank you

Manuel

From: Jeff Zhang [mailto:zjf...@gmail.com]
Sent: Monday, August 12, 2019 5:46 PM
To: users
Subject: Re: multiple interpreters for spark python2 and 3

Two approaches:
1. Create two spark interpreters, one with python2 and another with python3.
2. Use the generic configuration interpreter:
https://medium.com/@zjffdu/zeppelin-0-8-0-new-features-ea53e8810235
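For reference, with the generic configuration interpreter a note can set Spark 
properties in its first paragraph before the interpreter starts; the Python 
path below is an example, not a default:

```
%spark.conf

PYSPARK_PYTHON /home/alice/miniconda3/envs/py3/bin/python
```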

Manuel Sopena Ballesteros <manuel...@garvan.org.au> wrote on Mon, 12 Aug 2019 
at 15:41:

Dear Zeppelin community,

I have a zeppelin installation and a spark cluster. I need to provide options 
for users to run either python2 or 3 code using pyspark. At the moment the only 
way of doing this is by editing the spark interpreter and changing the 
`zeppelin.pyspark.python` from python to python3.6.
Is there a way to duplicate the spark interpreter, one copy with python2 and 
the other with python3, so I can choose which one to use without leaving the 
notebook?

Thank you



--
Best Regards

Jeff Zhang


spark jobs in spark history

2019-08-12 Thread Manuel Sopena Ballesteros
Dear Zeppelin community,

I have a Zeppelin installation connected to Spark. I noticed that Zeppelin runs 
a Spark job when it starts, but I can't see the individual jobs submitted 
through Zeppelin notebooks.

Is this the expected behavior by design? Is there a way to see the different 
submissions from Zeppelin notebooks in the Spark history server?

Thank you very much
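A note on the behavior, in case it helps: paragraph runs execute as jobs inside 
Zeppelin's single long-lived Spark application, so they appear under that one 
application rather than as separate submissions, and the history server only 
records the application if event logging is enabled. A minimal sketch of the 
relevant Spark properties (the HDFS path is an example):

```
spark.eventLog.enabled  true
spark.eventLog.dir      hdfs:///spark-history
```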


multiple interpreters for spark python2 and 3

2019-08-12 Thread Manuel Sopena Ballesteros

Dear Zeppelin community,

I have a zeppelin installation and a spark cluster. I need to provide options 
for users to run either python2 or 3 code using pyspark. At the moment the only 
way of doing this is by editing the spark interpreter and changing the 
`zeppelin.pyspark.python` from python to python3.6.
Is there a way to duplicate the spark interpreter, one copy with python2 and 
the other with python3, so I can choose which one to use without leaving the 
notebook?

Thank you



RE: can't use @spark2.r interpreter

2019-06-27 Thread Manuel Sopena Ballesteros
correct

Manuel

From: Jeff Zhang [mailto:zjf...@gmail.com]
Sent: Friday, June 28, 2019 12:41 PM
To: users
Subject: Re: can't use @spark2.r interpreter

Are you using HDP ?

Manuel Sopena Ballesteros <manuel...@garvan.org.au> wrote on Fri, 28 Jun 2019 
at 10:32:
Dear Zeppelin community,

I am trying to set up the Spark R interpreter in Zeppelin, however I can't make 
it work.

This is my notebook:
%spark2.r

1 + 1

And this is the output:
Error in dev.control(displaylist = if (record) "enable" else "inhibit"): 
dev.control() called without an open graphics device

Any idea?

Thank you

Manuel Sopena Ballesteros

Big Data Engineer | Kinghorn Centre for Clinical Genomics


a: 384 Victoria Street, Darlinghurst NSW 2010
p: +61 2 9355 5760  |  +61 4 12 123 123
e: manuel...@garvan.org.au<mailto:manuel...@garvan.org.au>

Like us on Facebook<http://www.facebook.com/garvaninstitute> | Follow us on 
Twitter<http://twitter.com/GarvanInstitute> and 
LinkedIn<http://www.linkedin.com/company/garvan-institute-of-medical-research>



--
Best Regards

Jeff Zhang


can't use @spark2.r interpreter

2019-06-27 Thread Manuel Sopena Ballesteros
Dear Zeppelin community,

I am trying to set up the Spark R interpreter in Zeppelin, however I can't make 
it work.

This is my notebook:
%spark2.r

1 + 1

And this is the output:
Error in dev.control(displaylist = if (record) "enable" else "inhibit"): 
dev.control() called without an open graphics device

Any idea?

Thank you

Manuel Sopena Ballesteros

Big Data Engineer | Kinghorn Centre for Clinical Genomics


a: 384 Victoria Street, Darlinghurst NSW 2010
p: +61 2 9355 5760  |  +61 4 12 123 123
e: manuel...@garvan.org.au<mailto:manuel...@garvan.org.au>

Like us on Facebook<http://www.facebook.com/garvaninstitute> | Follow us on 
Twitter<http://twitter.com/GarvanInstitute> and 
LinkedIn<http://www.linkedin.com/company/garvan-institute-of-medical-research>



error trying to run r script in spark2.r interpreter

2019-06-19 Thread Manuel Sopena Ballesteros
java:174)
  at org.apache.zeppelin.spark.SparkRInterpreter.open(SparkRInterpreter.java:106)
  at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
  at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:617)
  at org.apache.zeppelin.scheduler.Job.run(Job.java:188)
  at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:140)
  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
  at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
  at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
  at java.lang.Thread.run(Thread.java:745)

any advice?

Manuel Sopena Ballesteros

Big Data Engineer | Kinghorn Centre for Clinical Genomics


a: 384 Victoria Street, Darlinghurst NSW 2010
p: +61 2 9355 5760  |  +61 4 12 123 123
e: manuel...@garvan.org.au<mailto:manuel...@garvan.org.au>

Like us on Facebook<http://www.facebook.com/garvaninstitute> | Follow us on 
Twitter<http://twitter.com/GarvanInstitute> and 
LinkedIn<http://www.linkedin.com/company/garvan-institute-of-medical-research>



RE: python interpreter not working

2019-06-04 Thread Manuel Sopena Ballesteros
Same error without impersonation (interpreter instantiated "globally" in a 
"shared" process).

[root@gl-hdp-ctrl01 zeppelin]# python -V
Python 2.7.5

Thank you

Manuel

From: Jeff Zhang [mailto:zjf...@gmail.com]
Sent: Wednesday, June 5, 2019 12:49 PM
To: users
Subject: Re: python interpreter not working

Which zeppelin version do you use ? Does it work without impersonation ?

Manuel Sopena Ballesteros <manuel...@garvan.org.au> wrote on Wed, 5 Jun 2019 
at 10:38:
Dear Zeppelin community,

I am trying to set up the python interpreter. Installation is successful, 
however I can't get any Python code to run.
This is what I can see from the logs:

INFO [2019-06-05 12:35:07,788] ({pool-2-thread-2} 
SchedulerFactory.java[jobStarted]:109) - Job 20190605-122140_1966429456 started 
by scheduler 
org.apache.zeppelin.interpreter.remote.RemoteInterpreter-python:mansop:-shared_session
INFO [2019-06-05 12:35:07,789] ({pool-2-thread-2} Paragraph.java[jobRun]:380) - 
Run paragraph [paragraph_id: 20190605-122140_1966429456, interpreter: python, 
note_id: 2EBKSAFA9, user: mansop]
WARN [2019-06-05 12:35:17,799] ({pool-2-thread-2} 
NotebookServer.java[afterStatusChange]:2302) - Job 20190605-122140_1966429456 
is finished, status: ERROR, exception: null, result: %text python is not 
responding
INFO [2019-06-05 12:35:17,841] ({pool-2-thread-2} 
SchedulerFactory.java[jobFinished]:115) - Job 20190605-122140_1966429456 
finished by scheduler 
org.apache.zeppelin.interpreter.remote.RemoteInterpreter-python:mansop:-shared_session

My python interpreter is setup with impersonation

Any thoughts?

Thank you very much


--
Best Regards

Jeff Zhang


python interpreter not working

2019-06-04 Thread Manuel Sopena Ballesteros
Dear Zeppelin community,

I am trying to set up the python interpreter. Installation is successful, 
however I can't get any Python code to run.
This is what I can see from the logs:

INFO [2019-06-05 12:35:07,788] ({pool-2-thread-2} 
SchedulerFactory.java[jobStarted]:109) - Job 20190605-122140_1966429456 started 
by scheduler 
org.apache.zeppelin.interpreter.remote.RemoteInterpreter-python:mansop:-shared_session
INFO [2019-06-05 12:35:07,789] ({pool-2-thread-2} Paragraph.java[jobRun]:380) - 
Run paragraph [paragraph_id: 20190605-122140_1966429456, interpreter: python, 
note_id: 2EBKSAFA9, user: mansop]
WARN [2019-06-05 12:35:17,799] ({pool-2-thread-2} 
NotebookServer.java[afterStatusChange]:2302) - Job 20190605-122140_1966429456 
is finished, status: ERROR, exception: null, result: %text python is not 
responding
INFO [2019-06-05 12:35:17,841] ({pool-2-thread-2} 
SchedulerFactory.java[jobFinished]:115) - Job 20190605-122140_1966429456 
finished by scheduler 
org.apache.zeppelin.interpreter.remote.RemoteInterpreter-python:mansop:-shared_session

My python interpreter is setup with impersonation

Any thoughts?

Thank you very much


RE: how to load pandas into pyspark (centos 6 with python 2.6)

2018-06-11 Thread Manuel Sopena Ballesteros
Ok, this is what I am getting

$/tmp/pythonvenv/bin/pip install pandas

The directory '/home/zeppelin/.cache/pip/http' or its parent directory is not 
owned by the current user and the cache has been disabled. Please check the 
permissions and owner of that directory. If executing pip with sudo, you may 
want sudo's -H flag.
pip is configured with locations that require TLS/SSL, however the ssl module 
in Python is not available.
The directory '/home/zeppelin/.cache/pip' or its parent directory is not owned 
by the current user and caching wheels has been disabled. check the permissions 
and owner of that directory. If executing pip with sudo, you may want sudo's -H 
flag.
Collecting pandas
  Retrying (Retry(total=4, connect=None, read=None, redirect=None, 
status=None)) after connection broken by 'SSLError("Can't connect to HTTPS URL 
because the SSL module is not available.",)': /simple/pandas/
  Retrying (Retry(total=3, connect=None, read=None, redirect=None, 
status=None)) after connection broken by 'SSLError("Can't connect to HTTPS URL 
because the SSL module is not available.",)': /simple/pandas/
  Retrying (Retry(total=2, connect=None, read=None, redirect=None, 
status=None)) after connection broken by 'SSLError("Can't connect to HTTPS URL 
because the SSL module is not available.",)': /simple/pandas/
  Retrying (Retry(total=1, connect=None, read=None, redirect=None, 
status=None)) after connection broken by 'SSLError("Can't connect to HTTPS URL 
because the SSL module is not available.",)': /simple/pandas/
  Retrying (Retry(total=0, connect=None, read=None, redirect=None, 
status=None)) after connection broken by 'SSLError("Can't connect to HTTPS URL 
because the SSL module is not available.",)': /simple/pandas/
  Could not find a version that satisfies the requirement pandas (from 
versions: )
No matching distribution found for pandas
  Could not fetch URL https://pypi.python.org/simple/pandas/: There was a 
problem confirming the ssl certificate: 
HTTPSConnectionPool(host='pypi.python.org', port=443): Max retries exceeded 
with url: /simple/pandas/ (Caused by SSLError("Can't connect to HTTPS URL 
because the SSL module is not available.",)) - skipping

Manuel
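The SSLError above usually means the Python used to create the virtualenv was 
compiled without the _ssl module, typically because the OpenSSL headers were 
missing at build time. A hedged sketch of a rebuild on CentOS 6, assuming the 
source tree mentioned earlier in the thread at /tmp/Python-3.6.5:

```shell
# Sketch: rebuild Python with SSL support so pip can reach PyPI.
yum install -y openssl-devel
cd /tmp/Python-3.6.5
./configure --with-ensurepip=install
make
make altinstall   # installs python3.6 without touching the system python
# then recreate /tmp/pythonvenv from the rebuilt interpreter
```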

From: Jeff Zhang [mailto:zjf...@gmail.com]
Sent: Friday, June 8, 2018 2:54 PM
To: users@zeppelin.apache.org
Subject: Re: how to load pandas into pyspark (centos 6 with python 2.6)


Just find pip in your python 3.6 folder, and run pip using full path. e.g.

/tmp/Python-3.6.5/pip install pandas

Manuel Sopena Ballesteros <manuel...@garvan.org.au> wrote on Fri, 8 Jun 2018 
at 12:47:
Sorry for the stupid question

How can I use pip? Zeppelin will run pip through the shell interpreter, but my 
system's global Python is 2.6…



thanks

Manuel

From: Jeff Zhang [mailto:zjf...@gmail.com<mailto:zjf...@gmail.com>]
Sent: Friday, June 8, 2018 1:45 PM

To: users@zeppelin.apache.org<mailto:users@zeppelin.apache.org>
Subject: Re: how to load pandas into pyspark (centos 6 with python 2.6)


pip should be available under your python3.6.5, you can use that to install 
pandas


Manuel Sopena Ballesteros <manuel...@garvan.org.au> wrote on Fri, 8 Jun 2018 
at 11:40:
Hi Jeff,

Thank you very much for your quick response. My zeppelin is deployed using HDP 
(hortonworks platform) so I already have spark/yarn integration and I am using 
zeppelin.pyspark.python to tell pyspark to run python 3.6:

zeppelin.pyspark.python --> /tmp/Python-3.6.5/python

I do have root access to the machine but OS is centos 6 (python system 
environment is 2.6) hence pip is not available

Thank you

Manuel

From: Jeff Zhang [mailto:zjf...@gmail.com<mailto:zjf...@gmail.com>]
Sent: Friday, June 8, 2018 11:47 AM
To: users@zeppelin.apache.org<mailto:users@zeppelin.apache.org>
Subject: Re: how to load pandas into pyspark (centos 6 with python 2.6)


First, I would suggest you use Python 2.7 or Python 3.x, because Spark 2.x has 
dropped support for Python 2.6.
Second, you need to configure PYSPARK_PYTHON in the Spark interpreter settings 
to point to the Python that you installed. (I don't know what you mean by 
saying you can't install pandas system wide; do you mean you are not root and 
don't have permission to install Python packages?)



Manuel Sopena Ballesteros <manuel...@garvan.org.au> wrote on Fri, 8 Jun 2018 
at 09:26:
Dear Zeppelin community,

I am trying to load pandas into my zeppelin %spark2.pyspark interpreter. The 
system I am using is centos 6 with python 2.6 so I can’t install pandas system 
wide through pip as suggested in the documentation.

What can I do if I want to add modules into the %spark2.pyspark interpreter?

Thank you very much

Manuel Sopena Ballesteros | Big data Engineer
Garvan Institute of Medical Research
The

RE: how to load pandas into pyspark (centos 6 with python 2.6)

2018-06-07 Thread Manuel Sopena Ballesteros
Sorry for the stupid question

How can I use pip? Zeppelin will run pip through the shell interpreter, but my 
system's global Python is 2.6…



thanks

Manuel

From: Jeff Zhang [mailto:zjf...@gmail.com]
Sent: Friday, June 8, 2018 1:45 PM
To: users@zeppelin.apache.org
Subject: Re: how to load pandas into pyspark (centos 6 with python 2.6)


pip should be available under your python3.6.5, you can use that to install 
pandas


Manuel Sopena Ballesteros <manuel...@garvan.org.au> wrote on Fri, 8 Jun 2018 
at 11:40:
Hi Jeff,

Thank you very much for your quick response. My zeppelin is deployed using HDP 
(hortonworks platform) so I already have spark/yarn integration and I am using 
zeppelin.pyspark.python to tell pyspark to run python 3.6:

zeppelin.pyspark.python --> /tmp/Python-3.6.5/python

I do have root access to the machine but OS is centos 6 (python system 
environment is 2.6) hence pip is not available

Thank you

Manuel

From: Jeff Zhang [mailto:zjf...@gmail.com<mailto:zjf...@gmail.com>]
Sent: Friday, June 8, 2018 11:47 AM
To: users@zeppelin.apache.org<mailto:users@zeppelin.apache.org>
Subject: Re: how to load pandas into pyspark (centos 6 with python 2.6)


First, I would suggest you use Python 2.7 or Python 3.x, because Spark 2.x has 
dropped support for Python 2.6.
Second, you need to configure PYSPARK_PYTHON in the Spark interpreter settings 
to point to the Python that you installed. (I don't know what you mean by 
saying you can't install pandas system wide; do you mean you are not root and 
don't have permission to install Python packages?)



Manuel Sopena Ballesteros <manuel...@garvan.org.au> wrote on Fri, 8 Jun 2018 
at 09:26:
Dear Zeppelin community,

I am trying to load pandas into my zeppelin %spark2.pyspark interpreter. The 
system I am using is centos 6 with python 2.6 so I can’t install pandas system 
wide through pip as suggested in the documentation.

What can I do if I want to add modules into the %spark2.pyspark interpreter?

Thank you very much

Manuel Sopena Ballesteros | Big data Engineer
Garvan Institute of Medical Research
The Kinghorn Cancer Centre, 370 Victoria Street, Darlinghurst, NSW 
2010<https://maps.google.com/?q=370+Victoria+Street,+Darlinghurst,+NSW+2010&entry=gmail&source=g>
T: + 61 (0)2 9355 5760 | F: +61 (0)2 9295 
8507 | E: 
manuel...@garvan.org.au<mailto:manuel...@garvan.org.au>



RE: how to load pandas into pyspark (centos 6 with python 2.6)

2018-06-07 Thread Manuel Sopena Ballesteros
Hi Jeff,

Thank you very much for your quick response. My zeppelin is deployed using HDP 
(hortonworks platform) so I already have spark/yarn integration and I am using 
zeppelin.pyspark.python to tell pyspark to run python 3.6:

zeppelin.pyspark.python --> /tmp/Python-3.6.5/python

I do have root access to the machine but OS is centos 6 (python system 
environment is 2.6) hence pip is not available

Thank you

Manuel

From: Jeff Zhang [mailto:zjf...@gmail.com]
Sent: Friday, June 8, 2018 11:47 AM
To: users@zeppelin.apache.org
Subject: Re: how to load pandas into pyspark (centos 6 with python 2.6)


First, I would suggest you use Python 2.7 or Python 3.x, because Spark 2.x has 
dropped support for Python 2.6.
Second, you need to configure PYSPARK_PYTHON in the Spark interpreter settings 
to point to the Python that you installed. (I don't know what you mean by 
saying you can't install pandas system wide; do you mean you are not root and 
don't have permission to install Python packages?)



Manuel Sopena Ballesteros <manuel...@garvan.org.au> wrote on Fri, 8 Jun 2018 
at 09:26:
Dear Zeppelin community,

I am trying to load pandas into my zeppelin %spark2.pyspark interpreter. The 
system I am using is centos 6 with python 2.6 so I can’t install pandas system 
wide through pip as suggested in the documentation.

What can I do if I want to add modules into the %spark2.pyspark interpreter?

Thank you very much

Manuel Sopena Ballesteros | Big data Engineer
Garvan Institute of Medical Research
The Kinghorn Cancer Centre, 370 Victoria Street, Darlinghurst, NSW 
2010<https://maps.google.com/?q=370+Victoria+Street,+Darlinghurst,+NSW+2010&entry=gmail&source=g>
T: + 61 (0)2 9355 5760 | F: +61 (0)2 9295 
8507 | E: 
manuel...@garvan.org.au<mailto:manuel...@garvan.org.au>



how to load pandas into pyspark (centos 6 with python 2.6)

2018-06-07 Thread Manuel Sopena Ballesteros
Dear Zeppelin community,

I am trying to load pandas into my zeppelin %spark2.pyspark interpreter. The 
system I am using is centos 6 with python 2.6 so I can't install pandas system 
wide through pip as suggested in the documentation.

What can I do if I want to add modules into the %spark2.pyspark interpreter?

Thank you very much

Manuel Sopena Ballesteros | Big data Engineer
Garvan Institute of Medical Research
The Kinghorn Cancer Centre, 370 Victoria Street, Darlinghurst, NSW 2010
T: + 61 (0)2 9355 5760 | F: +61 (0)2 9295 8507 | E: 
manuel...@garvan.org.au<mailto:manuel...@garvan.org.au>
