[jira] [Commented] (TOREE-438) CLONE - How to support Spark on Yarn model?

2017-09-12 Thread Ribamar Santarosa (JIRA)

[ 
https://issues.apache.org/jira/browse/TOREE-438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16163072#comment-16163072
 ] 

Ribamar Santarosa commented on TOREE-438:
-

There is a failure on the CI that doesn't look related to that patch:

{code}
failed to register layer: Error processing tar file(exit status 1): write /opt/conda/envs/python2/lib/python2.7/site-packages/Cython/Compiler/Code.so: no space left on device
{code}

Unless the paths written for those 2 env vars are so big that they are 
consuming all the storage! =)

> CLONE - How to support Spark on Yarn model?
> ---
>
> Key: TOREE-438
> URL: https://issues.apache.org/jira/browse/TOREE-438
> Project: TOREE
> Issue Type: Bug
> Reporter: Ribamar Santarosa
>
> It looks like the TOREE-97 issue (support for Spark on Yarn) was closed 
> without a definitive solution (or something went wrong on the way). Toree 
> does support it, but it won't work unless the user manually adds the 
> {{HADOOP_CONF_DIR}} env var to their kernel.json definition. Without that 
> env var, Spark doesn't know what to do with the option {{--master=yarn}} 
> (set in {{__TOREE_SPARK_OPTS__}}). It would be desirable to have it set by 
> default.
> This is probably not the nicest way to solve the problem, because it just 
> hard-codes more vars into the JSON file. Ideally there would be an interface 
> to add or remove env vars from those files. However, {{HADOOP_CONF_DIR}} and 
> {{SPARK_CONF_DIR}} look basic enough to be exported. Even for a Spark 
> Standalone deployment, {{HADOOP_CONF_DIR}} won't hurt. So, here go our 2 
> cents to improve the situation a bit.
> I cloned TOREE-97 into TOREE-438 to sign this issue.
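
The manual workaround described in the issue, adding the env vars to the "env" section of kernel.json by hand, could be scripted roughly as follows. This is only a sketch and not the change made in the proposed patch: the helper name add_env_vars, the run.sh path, and the /etc/hadoop/conf and /etc/spark/conf directories are assumptions for illustration.

```python
import json


def add_env_vars(kernel_spec, extra_env):
    """Return a copy of a kernel.json dict with extra_env merged into "env".

    Values already present in the spec are kept, so a user's own setting
    is never overwritten by a default.
    """
    updated = dict(kernel_spec)
    env = dict(updated.get("env", {}))
    for name, value in extra_env.items():
        env.setdefault(name, value)
    updated["env"] = env
    return updated


# Hypothetical kernel spec; the paths below are placeholders, not Toree defaults.
spec = {
    "display_name": "Toree on YARN",
    "language": "scala",
    "argv": ["/path/to/toree/bin/run.sh", "--profile", "{connection_file}"],
    "env": {"__TOREE_SPARK_OPTS__": "--master=yarn"},
}

patched = add_env_vars(
    spec,
    {
        "HADOOP_CONF_DIR": "/etc/hadoop/conf",  # assumed location
        "SPARK_CONF_DIR": "/etc/spark/conf",    # assumed location
    },
)
print(json.dumps(patched, indent=2))
```

Using setdefault rather than plain assignment is a deliberate choice here: it lets defaults be added without clobbering a value the user has already configured.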



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (TOREE-438) CLONE - How to support Spark on Yarn model?

2017-09-12 Thread Ribamar Santarosa (JIRA)

 [ 
https://issues.apache.org/jira/browse/TOREE-438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ribamar Santarosa updated TOREE-438:

Description: 
It looks like the TOREE-97 issue (support for Spark on Yarn) was closed 
without a definitive solution (or something went wrong on the way). Toree does 
support it, but it won't work unless the user manually adds the 
{{HADOOP_CONF_DIR}} env var to their kernel.json definition. Without that env 
var, Spark doesn't know what to do with the option {{--master=yarn}} (set in 
{{__TOREE_SPARK_OPTS__}}). It would be desirable to have it set by default.
This is probably not the nicest way to solve the problem, because it just 
hard-codes more vars into the JSON file. Ideally there would be an interface to 
add or remove env vars from those files. However, {{HADOOP_CONF_DIR}} and 
{{SPARK_CONF_DIR}} look basic enough to be exported. Even for a Spark 
Standalone deployment, {{HADOOP_CONF_DIR}} won't hurt. So, here go our 2 cents 
to improve the situation a bit.

I cloned TOREE-97 into TOREE-438 to sign this issue.

  was:
It looks like the TOREE-97 issue -- support for Spark Yarn was closed without 
definitive solution (or something went wrong on the way). Toree does support 
it, but it won't work if a user don't add manually in their kernel.json 
definition, the env vars for {{HADOOP_CONF_DIR}}. Without that env var, Spark 
doesn't know what to do with the option {{--master=yarn}} (set in 
{{__TOREE_SPARK_OPTS__}}). It would be desirable to have it by default.
Probably this is not the nicest way to solve the problem, because it  just hard 
codes more vars into the JSON file -- ideally it would be nice to have an 
interface to add or remove env vars from those files, however, 
{{HADOOP_CONF_DIR}}  and {{SPARK_CONF_DIR}}  look basic to be exported. Even 
for an Spark Standalone deployment, {{HADOOP_CONF_DIR}} won't hurt.  So, here 
it goes our 2 cents to improve a bit the situation.

I cloned the TOREE-97 into TOREE-438 to sign this issue. 







[jira] [Updated] (TOREE-438) CLONE - How to support Spark on Yarn model?

2017-09-12 Thread Ribamar Santarosa (JIRA)

 [ 
https://issues.apache.org/jira/browse/TOREE-438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ribamar Santarosa updated TOREE-438:

Description: 
It looks like the TOREE-97 issue -- support for Spark Yarn was closed without 
definitive solution (or something went wrong on the way). Toree does support 
it, but it won't work if a user don't add manually in their kernel.json 
definition, the env vars for {{HADOOP_CONF_DIR}}. Without that env var, Spark 
doesn't know what to do with the option {{--master=yarn}} (set in 
{{__TOREE_SPARK_OPTS__}}). It would be desirable to have it by default.
Probably this is not the nicest way to solve the problem, because it  just hard 
codes more vars into the JSON file -- ideally it would be nice to have an 
interface to add or remove env vars from those files, however, 
{{HADOOP_CONF_DIR}}  and {{SPARK_CONF_DIR}}  look basic to be exported. Even 
for an Spark Standalone deployment, {{HADOOP_CONF_DIR}} won't hurt.  So, here 
it goes our 2 cents to improve a bit the situation.

I cloned the TOREE-97 into TOREE-438 to sign this issue. 

  was:
It looks like the TOREE-97 issue -- support for Spark Yarn was closed without 
definitive solution (or something went wrong on the way). Toree does support 
it, but it won't work if a user don't add manually in their kernel.json 
definition, the env vars for {{HADOOP_CONF_DIR}}. Without that env var, Spark 
doesn't know what to do with the option {{--master=yarn}} (set in 
{{__TOREE_SPARK_OPTS__}}). It would be desirable to have it by default, and 
this patch provides this functionality. 

Probably this is not the nicest way to solve the problem, because it  just hard 
codes more vars into the JSON file -- ideally it would be nice to have an 
interface to add or remove env vars from those files, however, 
{{HADOOP_CONF_DIR}}  and {{SPARK_CONF_DIR}}  look basic to be exported. Even 
for an Spark Standalone deployment, {{HADOOP_CONF_DIR}} won't hurt.  So, here 
it goes our 2 cents to improve a bit the situation.

I cloned the TOREE-97 into TOREE-438 to sign this issue. 







[jira] [Updated] (TOREE-438) CLONE - How to support Spark on Yarn model?

2017-09-12 Thread Ribamar Santarosa (JIRA)

 [ 
https://issues.apache.org/jira/browse/TOREE-438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ribamar Santarosa updated TOREE-438:

Description: 
It looks like the TOREE-97 issue -- support for Spark Yarn was closed without 
definitive solution (or something went wrong on the way). Toree does support 
it, but it won't work if a user don't add manually in their kernel.json 
definition, the env vars for {{HADOOP_CONF_DIR}}. Without that env var, Spark 
doesn't know what to do with the option {{--master=yarn}} (set in 
{{__TOREE_SPARK_OPTS__}}). It would be desirable to have it by default, and 
this patch provides this functionality. 

Probably this is not the nicest way to solve the problem, because it  just hard 
codes more vars into the JSON file -- ideally it would be nice to have an 
interface to add or remove env vars from those files, however, 
{{HADOOP_CONF_DIR}}  and {{SPARK_CONF_DIR}}  look basic to be exported. Even 
for an Spark Standalone deployment, {{HADOOP_CONF_DIR}} won't hurt.  So, here 
it goes our 2 cents to improve a bit the situation.

I cloned the TOREE-97 into TOREE-438 to sign this issue. 

  was:
It looks like the TOREE-97 issue -- support for Spark Yarn was closed without 
definitive solution (or something went wrong on the way). Toree does support 
it, but it won't work if a user don't add manually in their kernel.json 
definition, the env vars for `HADOOP_CONF_DIR`. Without that env var, Spark 
doesn't know what to do with the option `--master=yarn` (set in 
`__TOREE_SPARK_OPTS__`). It would be desirable to have it by default, and this 
patch provides this functionality. 

Probably this is not the nicest way to solve the problem, because it  just hard 
codes more vars into the JSON file -- ideally it would be nice to have an 
interface to add or remove env vars from those files, however, 
`HADOOP_CONF_DIR`  and `SPARK_CONF_DIR`  look basic to be exported. Even for an 
Spark Standalone deployment, `HADOOP_CONF_DIR` won't hurt.  So, here it goes 
our 2 cents to improve a bit the situation.

I cloned the TOREE-97 into TOREE-438 to sign this issue. 







[jira] [Commented] (TOREE-438) CLONE - How to support Spark on Yarn model?

2017-09-12 Thread Ribamar Santarosa (JIRA)

[ 
https://issues.apache.org/jira/browse/TOREE-438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16162904#comment-16162904
 ] 

Ribamar Santarosa commented on TOREE-438:
-

https://github.com/apache/incubator-toree/pull/141 as proposed solution.






[jira] [Updated] (TOREE-438) CLONE - How to support Spark on Yarn model?

2017-09-12 Thread Ribamar Santarosa (JIRA)

 [ 
https://issues.apache.org/jira/browse/TOREE-438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ribamar Santarosa updated TOREE-438:

Description: 
It looks like the TOREE-97 issue -- support for Spark Yarn was closed without 
definitive solution (or something went wrong on the way). Toree does support 
it, but it won't work if a user don't add manually in their kernel.json 
definition, the env vars for `HADOOP_CONF_DIR`. Without that env var, Spark 
doesn't know what to do with the option `--master=yarn` (set in 
`__TOREE_SPARK_OPTS__`). It would be desirable to have it by default, and this 
patch provides this functionality. 

Probably this is not the nicest way to solve the problem, because it  just hard 
codes more vars into the JSON file -- ideally it would be nice to have an 
interface to add or remove env vars from those files, however, 
`HADOOP_CONF_DIR`  and `SPARK_CONF_DIR`  look basic to be exported. Even for an 
Spark Standalone deployment, `HADOOP_CONF_DIR` won't hurt.  So, here it goes 
our 2 cents to improve a bit the situation.

I cloned the TOREE-97 into TOREE-438 to sign this issue. 

  was:
Hi all,
I am now testing spark-kernel with the released IPython 3.0 in Spark on Yarn 
mode. My kernel.json looks like this:
{code}
{
  "display_name": "SparkOnYarn",
  "language": "scala",
  "argv": [
    "/root/local/bin/sparkkernel",
    "--master",
    "yarn-client",
    "--profile",
    "{connection_file}"
  ],
  "codemirror_mode": "scala"
}
{code}
However, the kernel cannot be started.







[jira] [Created] (TOREE-438) CLONE - How to support Spark on Yarn model?

2017-09-12 Thread Ribamar Santarosa (JIRA)
Ribamar Santarosa created TOREE-438:
---

 Summary: CLONE - How to support Spark on Yarn model?
 Key: TOREE-438
 URL: https://issues.apache.org/jira/browse/TOREE-438
 Project: TOREE
  Issue Type: Bug
Reporter: Ribamar Santarosa


Hi all,
I am now testing spark-kernel with the released IPython 3.0 in Spark on Yarn 
mode. My kernel.json looks like this:
{code}
{
  "display_name": "SparkOnYarn",
  "language": "scala",
  "argv": [
    "/root/local/bin/sparkkernel",
    "--master",
    "yarn-client",
    "--profile",
    "{connection_file}"
  ],
  "codemirror_mode": "scala"
}
{code}
However, the kernel cannot be started.
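
One possible cause, and the one later discussed on this issue, is that the spec requests a YARN master without pointing Spark at the Hadoop client configuration. A small check along these lines could flag that before the kernel is launched. It is a sketch under assumptions: the function names are hypothetical, only the "argv" and "env" sections of kernel.json are inspected, and variables inherited from the Jupyter server's own environment are not seen.

```python
def find_master(tokens):
    """Return the value passed to --master in a list of CLI tokens, or None."""
    for i, tok in enumerate(tokens):
        if tok.startswith("--master="):
            return tok.split("=", 1)[1]
        if tok == "--master" and i + 1 < len(tokens):
            return tokens[i + 1]
    return None


def yarn_config_warnings(kernel_spec):
    """List problems that may stop a YARN-backed kernel from starting."""
    env = kernel_spec.get("env", {})
    # Spark options may appear in argv or in __TOREE_SPARK_OPTS__.
    tokens = list(kernel_spec.get("argv", []))
    tokens += env.get("__TOREE_SPARK_OPTS__", "").split()

    master = find_master(tokens)
    if master is None or not master.startswith("yarn"):
        return []
    # Spark on YARN locates the cluster through one of these two variables.
    if "HADOOP_CONF_DIR" in env or "YARN_CONF_DIR" in env:
        return []
    return [
        "master is '%s' but neither HADOOP_CONF_DIR nor YARN_CONF_DIR "
        "is set in the kernel spec env" % master
    ]


# The spec from the report above, as a Python dict.
reported_spec = {
    "display_name": "SparkOnYarn",
    "language": "scala",
    "argv": [
        "/root/local/bin/sparkkernel",
        "--master",
        "yarn-client",
        "--profile",
        "{connection_file}",
    ],
    "codemirror_mode": "scala",
}

for warning in yarn_config_warnings(reported_spec):
    print(warning)
```

An empty result does not prove the kernel will start; it only rules out this one misconfiguration.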





[jira] [Commented] (TOREE-427) Unable to use display() or magics in Toree-PySpark

2017-09-05 Thread Ribamar Santarosa (JIRA)

[ 
https://issues.apache.org/jira/browse/TOREE-427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16153622#comment-16153622
 ] 

Ribamar Santarosa commented on TOREE-427:
-

[~markgritter] can you use magics with the Scala kernel instead? 

For me, the magics are available with the Scala kernel, but not with the 
PySpark one.

> Unable to use display() or magics in Toree-PySpark
> --
>
> Key: TOREE-427
> URL: https://issues.apache.org/jira/browse/TOREE-427
> Project: TOREE
> Issue Type: Bug
> Components: Kernel
> Affects Versions: 0.2.0
> Environment: Linux
> Reporter: Mark Gritter
> Priority: Minor
>
> I am trying to use PySpark in a Jupyter notebook using the Toree PySpark 
> kernel.  Plain-text interaction is working fine.  When I try to include HTML 
> or graphical output, though, I get the following:
> display(sc.getConf().getAll())
> Name: org.apache.toree.interpreter.broker.BrokerException
> Message: Traceback (most recent call last):
>   File "/tmp/kernel-PySpark-7b121ed7-79a8-4222-bf15-733286d125d2/pyspark_runner.py", line 189, in 
>     eval(compiled_code)
>   File "", line 1, in 
>   File "/usr/local/lib/python2.7/dist-packages/IPython/core/display.py", line 300, in display
>     format = InteractiveShell.instance().display_formatter.format
>   File "/usr/local/lib/python2.7/dist-packages/traitlets/config/configurable.py", line 412, in instance
>     inst = cls(*args, **kwargs)
>   File "/usr/local/lib/python2.7/dist-packages/IPython/core/interactiveshell.py", line 500, in __init__
>     self.init_io()
>   File "/usr/local/lib/python2.7/dist-packages/IPython/core/interactiveshell.py", line 660, in init_io
>     io.stdout = io.IOStream(sys.stdout)
>   File "/usr/local/lib/python2.7/dist-packages/IPython/utils/io.py", line 34, in __init__
>     raise ValueError("fallback required, but not specified")
> ValueError: fallback required, but not specified
> StackTrace: 
> org.apache.toree.interpreter.broker.BrokerState$$anonfun$markFailure$1.apply(BrokerState.scala:163)
> org.apache.toree.interpreter.broker.BrokerState$$anonfun$markFailure$1.apply(BrokerState.scala:163)
> scala.Option.foreach(Option.scala:257)
> org.apache.toree.interpreter.broker.BrokerState.markFailure(BrokerState.scala:162)
> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> java.lang.reflect.Method.invoke(Method.java:498)
> py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
> py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
> py4j.Gateway.invoke(Gateway.java:280)
> py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
> py4j.commands.CallCommand.execute(CallCommand.java:79)
> py4j.GatewayConnection.run(GatewayConnection.java:214)
> java.lang.Thread.run(Thread.java:748)
> I tried using magics to display raw HTML (following the example here: 
> https://github.com/apache/incubator-toree/blob/master/etc/examples/notebooks/magic-tutorial.ipynb)
>  and get the following backtrace:
> %%dataframe
> Name: org.apache.toree.interpreter.broker.BrokerException
> Message: Traceback (most recent call last):
>   File "/tmp/kernel-PySpark-7b121ed7-79a8-4222-bf15-733286d125d2/pyspark_runner.py", line 189, in 
>     eval(compiled_code)
>   File "", line 1, in 
>   File "/tmp/kernel-PySpark-7b121ed7-79a8-4222-bf15-733286d125d2/pyspark_runner.py", line 107, in __getattr__
>     return self._jvm_kernel.__getattribute__(name)
> AttributeError: 'JavaObject' object has no attribute 'magics'
> StackTrace: 
> org.apache.toree.interpreter.broker.BrokerState$$anonfun$markFailure$1.apply(BrokerState.scala:163)
> org.apache.toree.interpreter.broker.BrokerState$$anonfun$markFailure$1.apply(BrokerState.scala:163)
> scala.Option.foreach(Option.scala:257)
> org.apache.toree.interpreter.broker.BrokerState.markFailure(BrokerState.scala:162)
> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> java.lang.reflect.Method.invoke(Method.java:498)
> py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
> py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
> py4j.Gateway.invoke(Gateway.java:280)
> py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
> py4j.commands.CallCommand.execute(CallCommand.java:79)
> py4j.GatewayConnection.run(GatewayConnection.java:214)
> java.lang.Thread.run(Thread.java:748)
> VERSION: 0.2.0.dev1-incubating
> COMMIT: 9b577f19df83


