[jira] [Created] (ZEPPELIN-5988) Bokeh output in IPySpark is not in correct format
Axel Van Damme created ZEPPELIN-5988:

Summary: Bokeh output in IPySpark is not in correct format
Key: ZEPPELIN-5988
URL: https://issues.apache.org/jira/browse/ZEPPELIN-5988
Project: Zeppelin
Issue Type: Bug
Components: pySpark
Affects Versions: 0.10.1
Environment: Latest Apache Zeppelin compiled from sources
Reporter: Axel Van Damme

I'm using the latest Zeppelin version compiled from source (version 0.11.0-SNAPSHOT).

I'm facing the issue described in https://issues.apache.org/jira/browse/ZEPPELIN-4771

When running simple Bokeh code with the %ipyspark (or even the %python.ipython) interpreter, the text output is not correctly formatted. You can reproduce it with this example, which contains a deliberate typo (ey instead of y):

%ipyspark
import numpy as np
from bokeh.plotting import figure, show, output_notebook

output_notebook()

x = np.linspace(-6, 6, 500)
y = 8*np.sin(x)*np.sinc(x)

p = figure(width=800, height=300, title="", tools="",
           toolbar_location=None, match_aspect=True)

p.line(x, ey, color="navy", alpha=0.4, line_width=4)

show(p)

The output is then scrambled like this:

Loading BokehJS ...
[0;31m---[0m [0;31mNameError[0m Traceback (most recent call last) Cell [0;32mIn[40], line 12[0m [1;32m 7[0m y [38;5;241m=[39m [38;5;241m8[39m[38;5;241m*[39mnp[38;5;241m.[39msin(x)[38;5;241m*[39mnp[38;5;241m.[39msinc(x) [1;32m 9[0m p [38;5;241m=[39m figure(width[38;5;241m=[39m[38;5;241m800[39m, height[38;5;241m=[39m[38;5;241m300[39m, title[38;5;241m=[39m[38;5;124m"[39m[38;5;124m"[39m, tools[38;5;241m=[39m[38;5;124m"[39m[38;5;124m"[39m, [1;32m 10[0m toolbar_location[38;5;241m=[39m[38;5;28;01mNone[39;00m, match_aspect[38;5;241m=[39m[38;5;28;01mTrue[39;00m) [0;32m---> 12[0m p[38;5;241m.[39mline(x, [43mey[49m, color[38;5;241m=[39m[38;5;124m"[39m[38;5;124mnavy[39m[38;5;124m"[39m, alpha[38;5;241m=[39m[38;5;241m0.4[39m, line_width[38;5;241m=[39m[38;5;241m4[39m) [1;32m 14[0m show(p) [0;31mNameError[0m: name 'ey' is not defined -- This message was sent by Atlassian Jira (v8.20.10#820010)
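The bracketed fragments in the ZEPPELIN-5988 output above (`[0;31m`, `[38;5;241m`, `[0m` and so on) have the shape of ANSI SGR color sequences printed as literal text; the leading ESC byte may simply not be visible in this archive. Under that assumption, a minimal Python sketch for recovering the readable traceback could look like the following. The function name `strip_sgr` and the `esc_lost` flag are hypothetical helpers for illustration, not part of Zeppelin, IPython, or Bokeh.

```python
import re

# An SGR ("Select Graphic Rendition") sequence has the form ESC [ <params> m.
SGR_WITH_ESC = re.compile(r"\x1b\[[0-9;]*m")
# Looser variant for text where the ESC byte has already been lost.
# It can also remove legitimate text shaped like "[1m", so use it with care.
SGR_WITHOUT_ESC = re.compile(r"\x1b?\[[0-9;]*m")


def strip_sgr(text: str, esc_lost: bool = False) -> str:
    """Remove ANSI SGR color sequences from text (hypothetical helper)."""
    pattern = SGR_WITHOUT_ESC if esc_lost else SGR_WITH_ESC
    return pattern.sub("", text)


# Last line of the traceback as it appears in the report.
sample = "[0;31mNameError[0m: name 'ey' is not defined"
print(strip_sgr(sample, esc_lost=True))
```

Index expressions such as `In[40]` are left alone because they do not end in `m`. This only cleans up text after the fact; it does not address why the interpreter output is not rendered in the first place.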
[jira] [Created] (ZEPPELIN-5437) Search box: org.apache.lucene.index.IndexNotFoundException
Axel Van Damme created ZEPPELIN-5437:

Summary: Search box: org.apache.lucene.index.IndexNotFoundException
Key: ZEPPELIN-5437
URL: https://issues.apache.org/jira/browse/ZEPPELIN-5437
Project: Zeppelin
Issue Type: Bug
Components: GUI
Affects Versions: 0.10.0
Environment: Latest Zeppelin code built from source on 30/06/21. Zeppelin started within a Docker container.
Reporter: Axel Van Damme
Attachments: zeppelin--zeppelin.log

With the latest code available to date, we are facing an issue with the search box at the top right of the Zeppelin UI. Whatever is entered in the search box, the result is always:

{{We couldn’t find any notebook matching ...}}

Looking at zeppelin--zeppelin.log, we can see that an exception is raised:

{code:java}
 INFO [2021-07-01 08:53:51,709] ({qtp931675031-131} NotebookRestApi.java[search]:1066) - Searching notes for: VARYING_NOISE_DS
 INFO [2021-07-01 08:53:51,709] ({qtp931675031-131} NotebookRestApi.java[search]:1066) - Searching notes for: VARYING_NOISE_DS
ERROR [2021-07-01 08:53:51,712] ({qtp931675031-131} LuceneSearch.java[query]:135) - Failed to open index dir MMapDirectory@/tmp/zeppelin-index lockFactory=org.apache.lucene.store.NativeFSLockFactory@640a8f93, make sure indexing finished OK
org.apache.lucene.index.IndexNotFoundException: no segments* file found in MMapDirectory@/tmp/zeppelin-index lockFactory=org.apache.lucene.store.NativeFSLockFactory@640a8f93: files: [write.lock]
	at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:726)
	...
{code}

The full trace is attached (zeppelin--zeppelin.log); the issue can be seen at line 545 of that file.

It seems that some index file is missing. Is there a way to force reindexing of the notes?

Many thanks
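The exception in ZEPPELIN-5437 reports that no `segments*` file exists in the index directory and that the only file present is `write.lock`. A small Python sketch for checking that condition from outside Zeppelin is shown below. The function name `index_looks_initialized` is hypothetical, and the check only mirrors the wording of the error message; it is not how Lucene itself validates an index.

```python
from pathlib import Path


def index_looks_initialized(index_dir: str) -> bool:
    """Return True if index_dir contains at least one file named segments*.

    Hypothetical helper: it only looks at file names, as the exception
    message in the report does, and never opens the index.
    """
    directory = Path(index_dir)
    if not directory.is_dir():
        return False
    return any(entry.name.startswith("segments") for entry in directory.iterdir())


# Path taken from the log line in the report.
print(index_looks_initialized("/tmp/zeppelin-index"))
```

A result of False matches the situation in the log, where the directory exists but holds only the lock file.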
[jira] [Created] (ZEPPELIN-5208) Reload Notes from storage doesn't work
Axel Van Damme created ZEPPELIN-5208:

Summary: Reload Notes from storage doesn't work
Key: ZEPPELIN-5208
URL: https://issues.apache.org/jira/browse/ZEPPELIN-5208
Project: Zeppelin
Issue Type: Bug
Components: zeppelin-web
Affects Versions: 0.9.0
Reporter: Axel Van Damme

Hello,

I'm using the latest official Zeppelin release ([http://www.apache.org/dyn/closer.cgi/zeppelin/zeppelin-0.9.0/zeppelin-0.9.0-bin-netinst.tgz]).

It seems that the "Reload notes from storage" button on the Zeppelin landing page doesn't work anymore. When notes are changed on disk, clicking the reload button doesn't do anything (nothing appears in the logs when I click it).

In our case, notes are shared between different Zeppelin instances, so when one instance changes a note we need to refresh the other instances from disk so they see the changes. We rely on this feature quite heavily.

Can you please have a look?

Thanks
[jira] [Created] (ZEPPELIN-4966) Unable to build Zeppelin for Spark 3.0.0
Axel Van Damme created ZEPPELIN-4966:

Summary: Unable to build Zeppelin for Spark 3.0.0
Key: ZEPPELIN-4966
URL: https://issues.apache.org/jira/browse/ZEPPELIN-4966
Project: Zeppelin
Issue Type: Bug
Components: build
Affects Versions: 0.9.0
Reporter: Axel Van Damme
Attachments: zeppelin_compilation.log

I'm trying to build the latest Zeppelin for Spark 3.0.0 by cloning [https://github.com/apache/zeppelin.git].

When I try to change the pom configuration to Scala 2.12, I hit the following:

{code:java}
root@2a861ad3cd6f:/mnt/data/avan# /var/zeppelin/dev/change_scala_version.sh 2.12
Invalid Scala version: 2.12. Valid versions: 2.10 2.11
{code}

I tried modifying zeppelin/dev/change_scala_version.sh, setting the following:

{code:java}
VALID_VERSIONS=( 2.10 2.11 2.12 )
{code}

Then I launched the following command:

{code:java}
mvn clean package -Pbuild-distr -DskipTests -Pspark-3.0 -Phadoop-2.7 -Pr -Pscala-2.12

[ERROR] Failed to execute goal org.scala-tools:maven-scala-plugin:2.15.2:testCompile (test-compile) on project zeppelin-display: wrap: org.apache.commons.exec.ExecuteException: Process exited with an error: 240(Exit value: 240) -> [Help 1]
{code}

See the full logs in the attached file (zeppelin_compilation.log).
[jira] [Created] (ZEPPELIN-4612) Optimize Notebooks loading
Axel Van Damme created ZEPPELIN-4612:

Summary: Optimize Notebooks loading
Key: ZEPPELIN-4612
URL: https://issues.apache.org/jira/browse/ZEPPELIN-4612
Project: Zeppelin
Issue Type: Improvement
Components: NotebookRepo
Affects Versions: 0.9.0
Environment:

Number of Notebooks in NotebookRepo:

{code:java}
root@zeppelin:/opt# find NotebookRepo/ -type f -not -path "NotebookRepo/.git/*" | wc -l
1524
{code}

Disk space used by the NotebookRepo:

{code:java}
root@zeppelin:/opt# du -skh NotebookRepo/
2.7G    NotebookRepo/
{code}

Environment variable:

{code:java}
root@zeppelin:/opt/zeppelin/logs# echo $ZEPPELIN_MEM
-Xmx8192m -XX:MaxPermSize=1024m
{code}

Memory used before first login in Zeppelin UI (298.7MiB):

{code:java}
CONTAINER ID   NAME                                        CPU %   MEM USAGE / LIMIT   MEM %    NET I/O           BLOCK I/O     PIDS
38bf46661dc9   avan_zeppelin.1.ow2ckxn7pghvgk3osewrtr16j   0.09%   298.7MiB / 16GiB    1.82%    2.14kB / 820B     0B / 0B       65
{code}

Memory used after first login in Zeppelin UI (5.491GiB):

{code:java}
CONTAINER ID   NAME                                        CPU %   MEM USAGE / LIMIT   MEM %    NET I/O           BLOCK I/O     PIDS
38bf46661dc9   avan_zeppelin.1.ow2ckxn7pghvgk3osewrtr16j   0.12%   5.491GiB / 16GiB    34.32%   11.9kB / 40.2kB   0B / 73.7kB
{code}

Logs of the first login are attached (zeppelin--zeppelin.log); the login phase occurred between 2020-02-13 08:23:11 and 2020-02-13 08:23:53.

Reporter: Axel Van Damme
Attachments: zeppelin--zeppelin.log

Our current Notebook base contains more than 1500 Notebooks.

In Zeppelin 0.8.3 the solution was not ideal, because all Notebooks were loaded into memory at Zeppelin startup. The situation in Zeppelin 0.9 is worse, because the Notebooks are now loaded during the login phase. The end user has to wait a long time before getting into Zeppelin and usually thinks that Zeppelin is down.

Also, loading the entire Notebook base into memory is not scalable: as new Notebooks are created, we have to keep increasing the ZEPPELIN_MEM environment variable.

At the moment, to be able to log in to Zeppelin with our 1500 Notebooks, we set:

{code:java}
ZEPPELIN_MEM=-Xmx8192m -XX:MaxPermSize=1024m
{code}

This is a lot of memory that cannot be used for actual code processing.

The first login takes 42 seconds (see logs from 2020-02-13 08:23:11 to 2020-02-13 08:23:53).

Wouldn't it be possible to just walk through the directory structure of the NotebookRepo to display the Zeppelin welcome page with the tree structure?

This would be a great improvement and would make it possible to use Zeppelin at scale.
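The directory walk proposed in ZEPPELIN-4612 can be illustrated with a short Python sketch that lists note paths without opening or parsing any note file. This is only an illustration of the idea, not Zeppelin's implementation. The function name `list_note_paths` is hypothetical, and the `.zpln` suffix is an assumption about how note files are named; pass a different suffix if your repository differs.

```python
import os


def list_note_paths(repo_root: str, suffix: str = ".zpln") -> list:
    """Return note paths relative to repo_root, found without reading any file.

    Hypothetical helper; the default suffix is an assumption about note naming.
    """
    paths = []
    for dirpath, dirnames, filenames in os.walk(repo_root):
        # Prune Git metadata directories, similar in spirit to the
        # -not -path "NotebookRepo/.git/*" filter in the report.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            if name.endswith(suffix):
                full_path = os.path.join(dirpath, name)
                paths.append(os.path.relpath(full_path, repo_root))
    return sorted(paths)
```

Because only directory entries are read, the memory needed grows with the number of path strings rather than with the total size of the notes (2.7G in the environment described above). The relative paths are enough to render a folder tree; a note's content would be loaded only when it is opened.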
[jira] [Created] (ZEPPELIN-4598) upgrade-note.sh GitNotebookRepo cannot be cast to OldNotebookRepo
Axel Van Damme created ZEPPELIN-4598:

Summary: upgrade-note.sh GitNotebookRepo cannot be cast to OldNotebookRepo
Key: ZEPPELIN-4598
URL: https://issues.apache.org/jira/browse/ZEPPELIN-4598
Project: Zeppelin
Issue Type: Bug
Components: NotebookRepo
Affects Versions: 0.9.0
Reporter: Axel Van Damme

I'm trying to validate ZEPPELIN-4551.

I still cannot migrate Notebooks from version 0.8.2 to the 0.9 snapshot. I'm now encountering the following issue:

{code:java}
./upgrade-note.sh
 INFO [2020-02-07 08:57:30,469] ({main} ZeppelinConfiguration.java[create]:163) - Load configuration from file:/opt/zeppelin-0.9.0-SNAPSHOT/conf/zeppelin-site.xml
 INFO [2020-02-07 08:57:30,545] ({main} ZeppelinConfiguration.java[create]:171) - Server Host: 0.0.0.0
 INFO [2020-02-07 08:57:30,545] ({main} ZeppelinConfiguration.java[create]:173) - Server Port: 8080
 INFO [2020-02-07 08:57:30,545] ({main} ZeppelinConfiguration.java[create]:177) - Context Path: /
 INFO [2020-02-07 08:57:30,545] ({main} ZeppelinConfiguration.java[create]:178) - Zeppelin Version: 0.9.0-SNAPSHOT
 INFO [2020-02-07 08:57:30,549] ({main} PluginManager.java[loadNotebookRepo]:60) - Loading NotebookRepo Plugin: org.apache.zeppelin.notebook.repo.GitNotebookRepo
 INFO [2020-02-07 08:57:30,652] ({main} VFSNotebookRepo.java[setNotebookDirectory]:70) - Using notebookDir: /opt/NotebookRepo
 INFO [2020-02-07 08:57:30,824] ({main} GitNotebookRepo.java[init]:77) - Opening a git repo at '/opt/NotebookRepo'
 INFO [2020-02-07 08:57:30,981] ({main} PluginManager.java[loadOldNotebookRepo]:102) - Loading OldNotebookRepo Plugin: org.apache.zeppelin.notebook.repo.GitNotebookRepo
Exception in thread "main" java.lang.ClassCastException: org.apache.zeppelin.notebook.repo.GitNotebookRepo cannot be cast to org.apache.zeppelin.notebook.repo.OldNotebookRepo
	at org.apache.zeppelin.plugin.PluginManager.loadOldNotebookRepo(PluginManager.java:107)
	at org.apache.zeppelin.notebook.repo.NotebookRepoSync.convertNoteFiles(NotebookRepoSync.java:110)
	at org.apache.zeppelin.notebook.repo.UpgradeNoteFileTool.main(UpgradeNoteFileTool.java:42)
{code}

Can you please have a look?

Thanks
[jira] [Created] (ZEPPELIN-4579) Unable to build Zeppelin 0.9.0-SNAPSHOT
Axel Van Damme created ZEPPELIN-4579:

Summary: Unable to build Zeppelin 0.9.0-SNAPSHOT
Key: ZEPPELIN-4579
URL: https://issues.apache.org/jira/browse/ZEPPELIN-4579
Project: Zeppelin
Issue Type: Bug
Components: build
Affects Versions: 0.9.0
Environment: I build Zeppelin in a Dockerfile environment based on python:3.6-stretch as builder. The build command I use is the following:

{code:java}
mvn clean package -Pbuild-distr -DskipTests -Pspark-2.4 -Phadoop-2.7 -Pr -Pscala-2.11 -pl '!r'
{code}

All library dependencies should be installed correctly, since the build was successful two weeks ago.

Reporter: Axel Van Damme
Attachments: zeppelin_build_logs.txt

I'm trying to test the fix for ZEPPELIN-4551 but cannot build Zeppelin 0.9.0-SNAPSHOT anymore.

Two weeks ago I was able to build with exactly the same command, but today it fails at "Zeppelin: web angular Application".

I get many errors starting at line 22908 of the attached log file, but the build seems to continue after that. Then, at line 23593, it gets worse:

{code:java}
[ERROR] ERROR in app/interfaces/message-interceptor.ts:14:61 - error TS2307: Cannot find module '@zeppelin/sdk'.
{code}

Finally, at line 24095, the build stops with:

{code:java}
[ERROR] ERROR in app/app.module.ts(55,5): Error during template compile of 'AppModule'
{code}
[jira] [Created] (ZEPPELIN-4551) upgrade-note.sh Plugin GitNotebookRepo doesn't exist
Axel Van Damme created ZEPPELIN-4551:

Summary: upgrade-note.sh Plugin GitNotebookRepo doesn't exist
Key: ZEPPELIN-4551
URL: https://issues.apache.org/jira/browse/ZEPPELIN-4551
Project: Zeppelin
Issue Type: Bug
Components: NotebookRepo
Affects Versions: 0.9.0
Environment: I'm running Zeppelin in a container environment.
Reporter: Axel Van Damme

I want to upgrade from Zeppelin 0.8.2 to Zeppelin 0.9.0-snapshot.

I've compiled the latest source code:

mvn clean package -Pbuild-distr -DskipTests -Pspark-2.4 -Phadoop-2.7 -Pr -Pscala-2.11 -pl '!r'

Once installed, it starts fine, but upgrading the notebooks by executing bin/upgrade-note.sh -d fails. Here is the command output:

{code:java}
bin/upgrade-note.sh
 INFO [2020-01-09 17:26:11,237] ({main} ZeppelinConfiguration.java[create]:163) - Load configuration from file:/opt/zeppelin-0.9.0-SNAPSHOT/conf/zeppelin-site.xml
 INFO [2020-01-09 17:26:11,317] ({main} ZeppelinConfiguration.java[create]:171) - Server Host: 0.0.0.0
 INFO [2020-01-09 17:26:11,318] ({main} ZeppelinConfiguration.java[create]:173) - Server Port: 8080
 INFO [2020-01-09 17:26:11,318] ({main} ZeppelinConfiguration.java[create]:177) - Context Path: /
 INFO [2020-01-09 17:26:11,318] ({main} ZeppelinConfiguration.java[create]:178) - Zeppelin Version: 0.9.0-SNAPSHOT
 INFO [2020-01-09 17:26:11,321] ({main} PluginManager.java[loadNotebookRepo]:60) - Loading NotebookRepo Plugin: org.apache.zeppelin.notebook.repo.GitNotebookRepo
 INFO [2020-01-09 17:26:11,438] ({main} VFSNotebookRepo.java[setNotebookDirectory]:70) - Using notebookDir: /opt/NotebookRepo
 INFO [2020-01-09 17:26:11,570] ({main} GitNotebookRepo.java[init]:77) - Opening a git repo at '/opt/NotebookRepo'
 INFO [2020-01-09 17:26:11,726] ({main} PluginManager.java[loadOldNotebookRepo]:102) - Loading OldNotebookRepo Plugin: org.apache.zeppelin.notebook.repo.GitNotebookRepo
 WARN [2020-01-09 17:26:11,726] ({main} PluginManager.java[getPluginClassLoader]:181) - PluginFolder /opt/zeppelin/plugins/NotebookRepo/GitNotebookRepo doesn't exist or is not a directory
Exception in thread "main" java.lang.NullPointerException
	at org.apache.zeppelin.notebook.repo.NotebookRepoSync.convertNoteFiles(NotebookRepoSync.java:111)
	at org.apache.zeppelin.notebook.repo.UpgradeNoteFileTool.main(UpgradeNoteFileTool.java:42)
{code}

I've checked the content of the folder:

{code:java}
root@zeppelin:/opt/zeppelin/plugins/NotebookRepo# ls -ltr
total 28
drwxr-xr-x 2 root root 4096 Jan 9 13:47 ZeppelinHubRepo
drwxr-xr-x 2 root root 4096 Jan 9 13:47 S3NotebookRepo
drwxr-xr-x 2 root root 4096 Jan 9 13:47 MongoNotebookRepo
drwxr-xr-x 2 root root 4096 Jan 9 13:47 GitHubNotebookRepo
drwxr-xr-x 2 root root 4096 Jan 9 13:47 GCSNotebookRepo
drwxr-xr-x 2 root root 4096 Jan 9 13:47 FileSystemNotebookRepo
drwxr-xr-x 2 root root 4096 Jan 9 13:47 AzureNotebookRepo
{code}

Indeed, there is no GitNotebookRepo folder.

I think there is an issue in loadOldNotebookRepo, while loadNotebookRepo seems to be fine.
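The manual `ls` check in ZEPPELIN-4551 can be repeated quickly before running the upgrade script. The Python sketch below reports which expected plugin folders are absent. The function name `missing_plugin_dirs` is hypothetical and nothing here is part of Zeppelin; the path and folder name in the usage line are the ones quoted in the WARN message.

```python
from pathlib import Path


def missing_plugin_dirs(plugins_root: str, expected: list) -> list:
    """Return the names in expected that are not directories under plugins_root.

    Hypothetical helper that mirrors the "doesn't exist or is not a directory"
    condition from the WARN line in the report.
    """
    root = Path(plugins_root)
    return [name for name in expected if not (root / name).is_dir()]


# Folder name taken from the WARN message in the report.
print(missing_plugin_dirs("/opt/zeppelin/plugins/NotebookRepo", ["GitNotebookRepo"]))
```

A non-empty result corresponds to the situation in the report, where the folder lookup fails just before the NullPointerException.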