This is an automated email from the ASF dual-hosted git repository.
philo pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/incubator-gluten.git
The following commit(s) were added to refs/heads/main by this push:
new 74f3061d8 [GLUTEN-5708][VL] Minor wording polishing for NewToGluten.md (#5707)
74f3061d8 is described below
commit 74f3061d88ce3155daca921c23266801533e7710
Author: James Xu <[email protected]>
AuthorDate: Mon May 13 13:38:28 2024 +0800
[GLUTEN-5708][VL] Minor wording polishing for NewToGluten.md (#5707)
---
docs/developers/NewToGluten.md | 62 +++++++++++++++++++-----------------------
1 file changed, 28 insertions(+), 34 deletions(-)
diff --git a/docs/developers/NewToGluten.md b/docs/developers/NewToGluten.md
index 1eb21d1e6..a397003ad 100644
--- a/docs/developers/NewToGluten.md
+++ b/docs/developers/NewToGluten.md
@@ -6,22 +6,20 @@ parent: Developer Overview
---
Help users to debug and test with gluten.
-For intel internal developer, you could refer to internal wiki [New Employee Guide](https://wiki.ith.intel.com/display/HPDA/New+Employee+Guide) to get more information such as proxy settings,
-Gluten has cpp code and java/scala code, we can use some useful IDE to read and debug.
-
# Environment
Now gluten supports Ubuntu20.04, Ubuntu22.04, centos8, centos7 and macOS.
-## Openjdk8
+## OpenJDK 8
-### Environment setting
+### Environment Setting
-For root user, the environment variables file is `/etc/profile`, it will make effect for all the users.
+For root user, the environment variables file is `/etc/profile`, it will take effect for all the users.
For other user, you can set in `~/.bashrc`.
-### Guide for ubuntu
+### Guide for Ubuntu
+
The default JDK version in ubuntu is java11, we need to set to java8.
```bash
@@ -43,9 +41,9 @@ export PATH="$PATH:$JAVA_HOME/bin"
> Must set PATH with double quote in ubuntu.
-## Openjdk17
+## OpenJDK 17
-By defaults, Gluten compiles package using JDK8. Add maven profile `-Pjava-17` changing to use JDK17, and please make sure your JAVA_HOME points to jdk17.
+By default, Gluten compiles package using JDK8. Enable maven profile by `-Pjava-17` to use JDK17, and please make sure your JAVA_HOME points to jdk17.
Apache Spark and Arrow requires setting java args `-Dio.netty.tryReflectionSetAccessible=true`, see [SPARK-29924](https://issues.apache.org/jira/browse/SPARK-29924) and [ARROW-6206](https://issues.apache.org/jira/browse/ARROW-6206).
So please add following configs in `spark-defaults.conf`:
@@ -78,31 +76,20 @@ If you need to debug the tests in <gluten>/gluten-ut, You need to compile java c
# Java/scala code development with Intellij
-## Linux intellij local debug
+## Linux IntelliJ local debug
-Install the linux intellij version, and debug code locally.
+Install the Linux IntelliJ version, and debug code locally.
- Ask your linux maintainer to install the desktop, and then restart the server.
- If you use Moba-XTerm to connect linux server, you don't need to install x11 server, If not (e.g. putty), please follow this guide: [X11 Forwarding: Setup Instructions for Linux and Mac](https://www.businessnewsdaily.com/11035-how-to-use-x11-forwarding.html)
-- Download [intellij linux community version](https://www.jetbrains.com/idea/download/?fromIDE=#section=linux) to linux server
+- Download [IntelliJ Linux community version](https://www.jetbrains.com/idea/download/?fromIDE=#section=linux) to Linux server
- Start Idea, `bash <idea_dir>/idea.sh`
-Notes: Sometimes, your desktop may stop accidently, left idea running.
-
-```bash
-root@xx2:~bash idea-IC-221.5787.30/bin/idea.sh
-Already running
-root@xx2:~ps ux | grep intellij
-root@xx2:kill -9 <pid>
-```
-
-And then restart idea.
+## Windows/macOS IntelliJ remote debug
-## Windows/Mac intellij remote debug
-
-If you have Ultimate intellij, you can try to debug remotely.
+If you have IntelliJ Ultimate Edition, you can debug Gluten code remotely.
## Set up gluten project
@@ -113,8 +100,8 @@ If you have Ultimate intellij, you can try to debug remotely.
## Java/Scala code style
-Intellij IDE supports importing settings for Java/Scala code style. You can import [intellij-codestyle.xml](../../dev/intellij-codestyle.xml) to your IDE.
-See [Intellij guide](https://www.jetbrains.com/help/idea/configuring-code-style.html#import-code-style).
+IntelliJ supports importing settings for Java/Scala code style. You can import [intellij-codestyle.xml](../../dev/intellij-codestyle.xml) to your IDE.
+See [IntelliJ guide](https://www.jetbrains.com/help/idea/configuring-code-style.html#import-code-style).
To generate a fix for Java/Scala code style, you can run one or more of the below commands according to the code modules involved in your PR.
@@ -161,7 +148,7 @@ VSCode support 2 ways to set user setting.
### Build by vscode
-VSCode will try to compile the debug version in <gluten_home>/build.
+VSCode will try to compile using debug mode in <gluten_home>/build.
And we need to compile velox debug mode before, if you have compiled velox release mode, you just need to do.
```bash
@@ -259,14 +246,15 @@ Then you can create breakpoint and debug in `Run and Debug` section.
### Velox debug
-For some velox tests such as `ParquetReaderTest`, tests need to read the parquet file in `<velox_home>/velox/dwio/parquet/tests/examples`, you should let the screen on `ParquetReaderTest.cpp`, then click `Start Debuging`, otherwise you will raise No such file or directory exception
+For some velox tests such as `ParquetReaderTest`, tests need to read the parquet file in `<velox_home>/velox/dwio/parquet/tests/examples`,
+you should let the screen on `ParquetReaderTest.cpp`, then click `Start Debuging`, otherwise `No such file or directory` exception will be raised.
-## Usefule notes
+## Useful notes
-### Upgrade vscode
+### Do not upgrade vscode
No need to upgrade vscode version, if upgraded, will download linux server again, switch update mode to off
-Search `update` in Manage->Settings to turn off update mode
+Search `update` in Manage->Settings to turn off update mode.
### Colour setting
@@ -299,7 +287,7 @@ Set config in `settings.json`
If exists multiple clang-format version, formatOnSave may not take effect, specify the default formatter
Search `default formatter` in `Settings`, select Clang-Format.
-If your formatOnSave still make no effect, you can use shortcut `SHIFT+ALT+F` to format one file mannually.
+If your formatOnSave still make no effect, you can use shortcut `SHIFT+ALT+F` to format one file manually.
# Debug cpp code with coredump
@@ -370,7 +358,9 @@ wait to attach....
```
# Debug Memory leak
+
## Arrow memory allocator leak
+
If you receive error message like
```bash
@@ -378,6 +368,7 @@ If you receive error message like
24/04/18 08:15:38 WARN ArrowBufferAllocators$ArrowBufferAllocatorManager: Leaked allocator stack Allocator(ROOT) 0/191/319/9223372036854775807 (res/actual/peak/limit)
```
You can open the Arrow allocator debug config by add VP option `-Darrow.memory.debug.allocator=true`, then you can get more details like
+
```bash
child allocators: 0
ledgers: 7
@@ -403,9 +394,12 @@ child allocators: 0
at org.apache.gluten.utils.IteratorCompleter.hasNext(Iterators.scala:69)
at org.apache.spark.memory.SparkMemoryUtil$UnsafeItr.hasNext(SparkMemoryUtil.scala:246)
```
+
## CPP code memory leak
+
Sometimes you cannot get the coredump symbols, if you debug memory leak, you can write googletest to use valgrind to detect
-```
+
+```bash
apt install valgrind
valgrind --leak-check=yes ./exec_backend_test
```