[GitHub] incubator-zeppelin pull request: R Interpreter for Zeppelin

Leemoonsoo Sat, 14 Nov 2015 17:56:36 -0800

Github user Leemoonsoo commented on the pull request:

    https://github.com/apache/incubator-zeppelin/pull/208#issuecomment-156769919
  
    > I am not using rscala.
    >
    > It is not possible to use the SparkR connection in the way you describe. 
I did look into this early on. There are numerous packages for interfacing the 
jvm and R. None of them use two-way connections.
    > 
    > The python-spark and zeppelin integrations you describe leverage an 
external dependency. There is no comparable package available for R that has a 
compatible license.
    
    Correction: rscala -> forked-rscala.
    
    You don't need to find any other external package for a two-way connection.
    You can make two-way `R<->JVM` invocations without any external dependency
    by using the same technique as PySparkInterpreter.
    
    PySpark implements a one-way `Python->JVM` connection, similar to SparkR.
    PySparkInterpreter leverages this one-way connection and successfully makes
    `JVM->Python` invocations without any external dependency.
    
    I guess RInterpreter can do the same with SparkR. It'll be able to make
    `JVM->R` invocations without any external dependency or socket connection.
    That'll simplify the code base. More precisely, it would remove more than
    1000 lines of code.
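
The technique described above can be sketched roughly as follows. This is a
minimal, single-process illustration and not Zeppelin's or SparkR's actual
code: a Python thread stands in for the guest process (Python or R), a plain
object stands in for the JVM-side interpreter, and every name here
(`InterpreterBridge`, `get_statement`, `set_result`, `guest_loop`) is
hypothetical. The point it tries to show is that the guest only ever calls
into the host, yet the host can still get statements run in the guest,
because the guest blocks waiting for work and then reports the result back.

```python
import queue
import threading


class InterpreterBridge:
    """Stand-in for the host-side (JVM) interpreter object. Hypothetical."""

    def __init__(self):
        self._requests = queue.Queue()
        self._responses = queue.Queue()

    # --- called by the host ---
    def interpret(self, statement, timeout=5.0):
        """Queue a statement for the guest and wait for its result."""
        self._requests.put(statement)
        return self._responses.get(timeout=timeout)

    def shutdown(self):
        """Ask the guest loop to stop by queueing a sentinel."""
        self._requests.put(None)

    # --- called by the guest over its existing one-way connection ---
    def get_statement(self):
        """Block until the host has something to run."""
        return self._requests.get()

    def set_result(self, output, error):
        """Report the outcome of the last statement back to the host."""
        self._responses.put((output, error))


def guest_loop(bridge):
    """Guest side: poll the host for statements and evaluate them."""
    env = {}
    while True:
        statement = bridge.get_statement()
        if statement is None:  # sentinel from shutdown()
            break
        try:
            bridge.set_result(eval(statement, env), None)
        except Exception as exc:
            bridge.set_result(None, repr(exc))


bridge = InterpreterBridge()
worker = threading.Thread(target=guest_loop, args=(bridge,), daemon=True)
worker.start()

result = bridge.interpret("1 + 2")   # host -> guest, using only guest -> host calls
failure = bridge.interpret("1 / 0")  # errors travel back the same way

bridge.shutdown()
worker.join(timeout=5.0)
```

In a real interpreter the two sides would be separate processes and the
guest's calls would travel over the connection it already has to the JVM;
the queues here only model the blocking hand-off.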
    
    
    > As I understand, you don't use R, so it may seem strange to have a 
separate interpreter rather than a function. That's understandable.
    > 
    > The distinction between the r-repl and knitr interpreters makes perfect 
sense for people who are coming from R. The repl and knitr handle code, and 
errors, and output, in fundamentally different ways.
    > 
    > They have different capabilities. It is not possible, consistent with the 
zeppelin architecture, to put both capabilities into a single interpreter 
without making the use of that interpreter very unintuitive for someone coming 
from an R background.
    > 
    > The knit2html() command is something no R user would ever use when making 
use of R. It is perhaps best thought of as part of the "R operating system."
    
    
    
    Yes, I'm not familiar with R, so please convince me. What the KnitR
    Interpreter does is basically
    
    ```
         rContext.set(".zeppknitrinput", st.split("\n"))
         rContext.eval(".knitout <- knit2html(text=.zeppknitrinput, envir = rzeppelin:::.zeppenv)")
         rContext.getS0(".knitout")
    ```
    
    and the basic usage I found on the KnitR website is
    
    ```
    library(knitr)
    ?knit
    knit(input)
    ```
    
    So, to me, it's hard to imagine why a function like `z.knite(input)` would
    not make sense.
    If you have use cases, please share.
    
    By the way, KnitR is GPL licensed. I don't think Zeppelin can have a
    feature that depends on GPL-licensed code.
    
    
    > That's really fine, but in my view this is the lowest-priority possible 
item.
    
    License and copyright problems are among the high-priority items in the
    Zeppelin project.
    
    
    
    > The highest priority is the travis build problems. Travis consistently 
fails building parts other than rzeppelin.
    
    The latest exception from your CI build is
    
    ```
    Caused by: java.lang.OutOfMemoryError: PermGen space
        at org.apache.zeppelin.rinterpreter.RSparkTest$$anonfun$3.apply$mcV$sp(RSparkTest.scala:51)
    An exception or error caused a run to abort. This may have been caused by a problematic custom reporter.
    Exception in thread "ScalaTest-main" 
    Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "ScalaTest-main"
    ```
    
    Please try increasing the PermGen memory option in the test.
    
    
    > My users have had a long stream of issues getting Spark to work through 
zeppelin. They get reported to me as rzeppelin issues, but have all turned out 
to be issues in the way zeppelin and spark interface, e.g., with conflicts 
between SPARK_HOME and spark.home. rzeppelin needs to be consistent with the 
rest of the Zeppelin architecture in that regard. This is not something I can 
fix because I don't own that code.
    
    And https://issues.apache.org/jira/browse/ZEPPELIN-421 will address the
    removal of the spark.home property in the Zeppelin setting window. Until
    then, you can simply avoid setting spark.home.
    
    And technically you don't own the Zeppelin code, the ASF does. But nothing
    stops you from fixing the problem.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---
