[jira] [Commented] (TOREE-438) CLONE - How to support Spark on Yarn model?
[ https://issues.apache.org/jira/browse/TOREE-438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16163072#comment-16163072 ]

Ribamar Santarosa commented on TOREE-438:
-----------------------------------------

There is a failure on the CI that doesn't look related to that patch:
{code}
failed to register layer: Error processing tar file(exit status 1): write /opt/conda/envs/python2/lib/python2.7/site-packages/Cython/Compiler/Code.so: no space left on device
{code}
Unless writing the paths of those 2 env vars is so big that it is consuming all the storage! =)

> CLONE - How to support Spark on Yarn model?
> -------------------------------------------
>
>                 Key: TOREE-438
>                 URL: https://issues.apache.org/jira/browse/TOREE-438
>             Project: TOREE
>          Issue Type: Bug
>            Reporter: Ribamar Santarosa
>
> It looks like the TOREE-97 issue -- support for Spark Yarn was closed without
> definitive solution (or something went wrong on the way). Toree does support
> it, but it won't work if a user doesn't add manually in their kernel.json
> definition, the env vars for {{HADOOP_CONF_DIR}}. Without that env var, Spark
> doesn't know what to do with the option {{--master=yarn}} (set in
> {{__TOREE_SPARK_OPTS__}}). It would be desirable to have it by default.
> Probably this is not the nicest way to solve the problem, because it just
> hard codes more vars into the JSON file -- ideally it would be nice to have
> an interface to add or remove env vars from those files, however,
> {{HADOOP_CONF_DIR}} and {{SPARK_CONF_DIR}} look basic to be exported. Even
> for an Spark Standalone deployment, {{HADOOP_CONF_DIR}} won't hurt. So, here
> it goes our 2 cents to improve a bit the situation.
> I cloned the TOREE-97 into TOREE-438 to sign this issue.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
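As a rough illustration of the manual workaround the description mentions, the sketch below adds the two variables to the `env` block of a kernel.json. The function name `add_conf_dirs` and any paths passed to it are hypothetical. This is not the code from the proposed patch, only an assumption about what editing the file by hand amounts to.

```python
import json
from pathlib import Path


def add_conf_dirs(kernel_json_path, hadoop_conf_dir, spark_conf_dir):
    """Add HADOOP_CONF_DIR and SPARK_CONF_DIR to the "env" block of a kernel.json.

    Hypothetical helper for illustration; not part of Toree.
    """
    path = Path(kernel_json_path)
    spec = json.loads(path.read_text())

    # Create the "env" block if the kernel spec does not have one yet.
    env = spec.setdefault("env", {})

    # Keep any value the user has already set by hand.
    env.setdefault("HADOOP_CONF_DIR", hadoop_conf_dir)
    env.setdefault("SPARK_CONF_DIR", spark_conf_dir)

    path.write_text(json.dumps(spec, indent=2) + "\n")
    return spec
```

A call might look like `add_conf_dirs("kernel.json", "/etc/hadoop/conf", "/etc/spark/conf")`, where both directories are placeholder values that depend on the local installation.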
[jira] [Updated] (TOREE-438) CLONE - How to support Spark on Yarn model?
[ https://issues.apache.org/jira/browse/TOREE-438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ribamar Santarosa updated TOREE-438:

Description:
It looks like the TOREE-97 issue -- support for Spark Yarn was closed without definitive solution (or something went wrong on the way). Toree does support it, but it won't work if a user doesn't add manually in their kernel.json definition, the env vars for {{HADOOP_CONF_DIR}}. Without that env var, Spark doesn't know what to do with the option {{--master=yarn}} (set in {{__TOREE_SPARK_OPTS__}}). It would be desirable to have it by default.
Probably this is not the nicest way to solve the problem, because it just hard codes more vars into the JSON file -- ideally it would be nice to have an interface to add or remove env vars from those files, however, {{HADOOP_CONF_DIR}} and {{SPARK_CONF_DIR}} look basic to be exported. Even for an Spark Standalone deployment, {{HADOOP_CONF_DIR}} won't hurt. So, here it goes our 2 cents to improve a bit the situation.
I cloned the TOREE-97 into TOREE-438 to sign this issue.

was:
It looks like the TOREE-97 issue -- support for Spark Yarn was closed without definitive solution (or something went wrong on the way). Toree does support it, but it won't work if a user don't add manually in their kernel.json definition, the env vars for {{HADOOP_CONF_DIR}}. Without that env var, Spark doesn't know what to do with the option {{--master=yarn}} (set in {{__TOREE_SPARK_OPTS__}}). It would be desirable to have it by default.
Probably this is not the nicest way to solve the problem, because it just hard codes more vars into the JSON file -- ideally it would be nice to have an interface to add or remove env vars from those files, however, {{HADOOP_CONF_DIR}} and {{SPARK_CONF_DIR}} look basic to be exported. Even for an Spark Standalone deployment, {{HADOOP_CONF_DIR}} won't hurt. So, here it goes our 2 cents to improve a bit the situation.
I cloned the TOREE-97 into TOREE-438 to sign this issue.
[jira] [Updated] (TOREE-438) CLONE - How to support Spark on Yarn model?
[ https://issues.apache.org/jira/browse/TOREE-438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ribamar Santarosa updated TOREE-438:

Description:
It looks like the TOREE-97 issue -- support for Spark Yarn was closed without definitive solution (or something went wrong on the way). Toree does support it, but it won't work if a user don't add manually in their kernel.json definition, the env vars for {{HADOOP_CONF_DIR}}. Without that env var, Spark doesn't know what to do with the option {{--master=yarn}} (set in {{__TOREE_SPARK_OPTS__}}). It would be desirable to have it by default.
Probably this is not the nicest way to solve the problem, because it just hard codes more vars into the JSON file -- ideally it would be nice to have an interface to add or remove env vars from those files, however, {{HADOOP_CONF_DIR}} and {{SPARK_CONF_DIR}} look basic to be exported. Even for an Spark Standalone deployment, {{HADOOP_CONF_DIR}} won't hurt. So, here it goes our 2 cents to improve a bit the situation.
I cloned the TOREE-97 into TOREE-438 to sign this issue.

was:
It looks like the TOREE-97 issue -- support for Spark Yarn was closed without definitive solution (or something went wrong on the way). Toree does support it, but it won't work if a user don't add manually in their kernel.json definition, the env vars for {{HADOOP_CONF_DIR}}. Without that env var, Spark doesn't know what to do with the option {{--master=yarn}} (set in {{__TOREE_SPARK_OPTS__}}). It would be desirable to have it by default, and this patch provides this functionality.
Probably this is not the nicest way to solve the problem, because it just hard codes more vars into the JSON file -- ideally it would be nice to have an interface to add or remove env vars from those files, however, {{HADOOP_CONF_DIR}} and {{SPARK_CONF_DIR}} look basic to be exported. Even for an Spark Standalone deployment, {{HADOOP_CONF_DIR}} won't hurt. So, here it goes our 2 cents to improve a bit the situation.
I cloned the TOREE-97 into TOREE-438 to sign this issue.
[jira] [Updated] (TOREE-438) CLONE - How to support Spark on Yarn model?
[ https://issues.apache.org/jira/browse/TOREE-438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ribamar Santarosa updated TOREE-438:

Description:
It looks like the TOREE-97 issue -- support for Spark Yarn was closed without definitive solution (or something went wrong on the way). Toree does support it, but it won't work if a user don't add manually in their kernel.json definition, the env vars for {{HADOOP_CONF_DIR}}. Without that env var, Spark doesn't know what to do with the option {{--master=yarn}} (set in {{__TOREE_SPARK_OPTS__}}). It would be desirable to have it by default, and this patch provides this functionality.
Probably this is not the nicest way to solve the problem, because it just hard codes more vars into the JSON file -- ideally it would be nice to have an interface to add or remove env vars from those files, however, {{HADOOP_CONF_DIR}} and {{SPARK_CONF_DIR}} look basic to be exported. Even for an Spark Standalone deployment, {{HADOOP_CONF_DIR}} won't hurt. So, here it goes our 2 cents to improve a bit the situation.
I cloned the TOREE-97 into TOREE-438 to sign this issue.

was:
It looks like the TOREE-97 issue -- support for Spark Yarn was closed without definitive solution (or something went wrong on the way). Toree does support it, but it won't work if a user don't add manually in their kernel.json definition, the env vars for `HADOOP_CONF_DIR`. Without that env var, Spark doesn't know what to do with the option `--master=yarn` (set in `__TOREE_SPARK_OPTS__`). It would be desirable to have it by default, and this patch provides this functionality.
Probably this is not the nicest way to solve the problem, because it just hard codes more vars into the JSON file -- ideally it would be nice to have an interface to add or remove env vars from those files, however, `HADOOP_CONF_DIR` and `SPARK_CONF_DIR` look basic to be exported. Even for an Spark Standalone deployment, `HADOOP_CONF_DIR` won't hurt. So, here it goes our 2 cents to improve a bit the situation.
I cloned the TOREE-97 into TOREE-438 to sign this issue.
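The failure condition described above (a YARN master requested in `__TOREE_SPARK_OPTS__` with no Hadoop configuration directory exported) can be checked before a kernel is launched. The following is a minimal sketch under stated assumptions: the helper name `missing_yarn_conf` is hypothetical, the kernel spec is assumed to be already loaded as a dict, and `YARN_CONF_DIR` is accepted as an alternative because Spark also honours it for the `yarn` master.

```python
import shlex


def missing_yarn_conf(spec):
    """Return True if the kernel spec asks for a YARN master but exports
    neither HADOOP_CONF_DIR nor YARN_CONF_DIR in its "env" block.

    Hypothetical helper for illustration; not part of Toree.
    """
    env = spec.get("env", {})
    opts = shlex.split(env.get("__TOREE_SPARK_OPTS__", ""))

    uses_yarn = False
    for i, tok in enumerate(opts):
        if tok.startswith("--master="):
            master = tok.split("=", 1)[1]      # form: --master=yarn
        elif tok == "--master" and i + 1 < len(opts):
            master = opts[i + 1]               # form: --master yarn
        else:
            continue
        # Covers "yarn" as well as the older "yarn-client"/"yarn-cluster".
        uses_yarn = master.startswith("yarn")

    has_conf = any(env.get(name) for name in ("HADOOP_CONF_DIR", "YARN_CONF_DIR"))
    return uses_yarn and not has_conf
```

A check like this only reports the problem; choosing the right directory still depends on the cluster installation.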
[jira] [Commented] (TOREE-438) CLONE - How to support Spark on Yarn model?
[ https://issues.apache.org/jira/browse/TOREE-438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16162904#comment-16162904 ]

Ribamar Santarosa commented on TOREE-438:
-----------------------------------------

https://github.com/apache/incubator-toree/pull/141 as proposed solution.
[jira] [Updated] (TOREE-438) CLONE - How to support Spark on Yarn model?
[ https://issues.apache.org/jira/browse/TOREE-438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ribamar Santarosa updated TOREE-438:

Description:
It looks like the TOREE-97 issue -- support for Spark Yarn was closed without definitive solution (or something went wrong on the way). Toree does support it, but it won't work if a user don't add manually in their kernel.json definition, the env vars for `HADOOP_CONF_DIR`. Without that env var, Spark doesn't know what to do with the option `--master=yarn` (set in `__TOREE_SPARK_OPTS__`). It would be desirable to have it by default, and this patch provides this functionality.
Probably this is not the nicest way to solve the problem, because it just hard codes more vars into the JSON file -- ideally it would be nice to have an interface to add or remove env vars from those files, however, `HADOOP_CONF_DIR` and `SPARK_CONF_DIR` look basic to be exported. Even for an Spark Standalone deployment, `HADOOP_CONF_DIR` won't hurt. So, here it goes our 2 cents to improve a bit the situation.
I cloned the TOREE-97 into TOREE-438 to sign this issue.

was:
Hi all,
I am now testing spark-kernel with the released IPython 3.0 and the Spark on Yarn model. My kernel.json is as below:
{code}
{
  "display_name": "SparkOnYarn",
  "language": "scala",
  "argv": [
    "/root/local/bin/sparkkernel",
    "--master",
    "yarn-client",
    "--profile",
    "{connection_file}"
  ],
  "codemirror_mode": "scala"
}
{code}
However, the kernel cannot be started.
[jira] [Created] (TOREE-438) CLONE - How to support Spark on Yarn model?
Ribamar Santarosa created TOREE-438:
---------------------------------------

             Summary: CLONE - How to support Spark on Yarn model?
                 Key: TOREE-438
                 URL: https://issues.apache.org/jira/browse/TOREE-438
             Project: TOREE
          Issue Type: Bug
            Reporter: Ribamar Santarosa

Hi all,
I am now testing spark-kernel with the released IPython 3.0 and the Spark on Yarn model. My kernel.json is as below:
{code}
{
  "display_name": "SparkOnYarn",
  "language": "scala",
  "argv": [
    "/root/local/bin/sparkkernel",
    "--master",
    "yarn-client",
    "--profile",
    "{connection_file}"
  ],
  "codemirror_mode": "scala"
}
{code}
However, the kernel cannot be started.
[jira] [Commented] (TOREE-427) Unable to use display() or magics in Toree-PySpark
[ https://issues.apache.org/jira/browse/TOREE-427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16153622#comment-16153622 ]

Ribamar Santarosa commented on TOREE-427:
-----------------------------------------

[~markgritter] can you use magics using the scala kernel instead? For me, the magics are available with the scala kernel, but not with the pyspark one.

> Unable to use display() or magics in Toree-PySpark
> --------------------------------------------------
>
>                 Key: TOREE-427
>                 URL: https://issues.apache.org/jira/browse/TOREE-427
>             Project: TOREE
>          Issue Type: Bug
>          Components: Kernel
>    Affects Versions: 0.2.0
>         Environment: Linux
>            Reporter: Mark Gritter
>            Priority: Minor
>
> I am trying to use PySpark in a Jupyter notebook using PySpark. Plain-text
> interaction is working fine. When I tried to include HTML or graphical
> output, though I get the following:
> display(sc.getConf().getAll())
> Name: org.apache.toree.interpreter.broker.BrokerException
> Message: Traceback (most recent call last):
>   File "/tmp/kernel-PySpark-7b121ed7-79a8-4222-bf15-733286d125d2/pyspark_runner.py", line 189, in
>     eval(compiled_code)
>   File "", line 1, in
>   File "/usr/local/lib/python2.7/dist-packages/IPython/core/display.py", line 300, in display
>     format = InteractiveShell.instance().display_formatter.format
>   File "/usr/local/lib/python2.7/dist-packages/traitlets/config/configurable.py", line 412, in instance
>     inst = cls(*args, **kwargs)
>   File "/usr/local/lib/python2.7/dist-packages/IPython/core/interactiveshell.py", line 500, in __init__
>     self.init_io()
>   File "/usr/local/lib/python2.7/dist-packages/IPython/core/interactiveshell.py", line 660, in init_io
>     io.stdout = io.IOStream(sys.stdout)
>   File "/usr/local/lib/python2.7/dist-packages/IPython/utils/io.py", line 34, in __init__
>     raise ValueError("fallback required, but not specified")
> ValueError: fallback required, but not specified
> StackTrace:
> org.apache.toree.interpreter.broker.BrokerState$$anonfun$markFailure$1.apply(BrokerState.scala:163)
> org.apache.toree.interpreter.broker.BrokerState$$anonfun$markFailure$1.apply(BrokerState.scala:163)
> scala.Option.foreach(Option.scala:257)
> org.apache.toree.interpreter.broker.BrokerState.markFailure(BrokerState.scala:162)
> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> java.lang.reflect.Method.invoke(Method.java:498)
> py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
> py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
> py4j.Gateway.invoke(Gateway.java:280)
> py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
> py4j.commands.CallCommand.execute(CallCommand.java:79)
> py4j.GatewayConnection.run(GatewayConnection.java:214)
> java.lang.Thread.run(Thread.java:748)
> I tried using magics to display raw HTML (following the example here:
> https://github.com/apache/incubator-toree/blob/master/etc/examples/notebooks/magic-tutorial.ipynb)
> and get the following backtrace:
> %%dataframe
> Name: org.apache.toree.interpreter.broker.BrokerException
> Message: Traceback (most recent call last):
>   File "/tmp/kernel-PySpark-7b121ed7-79a8-4222-bf15-733286d125d2/pyspark_runner.py", line 189, in
>     eval(compiled_code)
>   File "", line 1, in
>   File "/tmp/kernel-PySpark-7b121ed7-79a8-4222-bf15-733286d125d2/pyspark_runner.py", line 107, in __getattr__
>     return self._jvm_kernel.__getattribute__(name)
> AttributeError: 'JavaObject' object has no attribute 'magics'
> StackTrace:
> org.apache.toree.interpreter.broker.BrokerState$$anonfun$markFailure$1.apply(BrokerState.scala:163)
> org.apache.toree.interpreter.broker.BrokerState$$anonfun$markFailure$1.apply(BrokerState.scala:163)
> scala.Option.foreach(Option.scala:257)
> org.apache.toree.interpreter.broker.BrokerState.markFailure(BrokerState.scala:162)
> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> java.lang.reflect.Method.invoke(Method.java:498)
> py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
> py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
> py4j.Gateway.invoke(Gateway.java:280)
> py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
> py4j.commands.CallCommand.execute(CallCommand.java:79)
> py4j.GatewayConnection.run(GatewayConnection.java:214)
> java.lang.Thread.run(Thread.java:748)
> VERSION: 0.2.0.dev1-incubating
> COMMIT: 9b577f19df83

--
This message was sent by Atlassia