Re: [PR] [SPARK-47829] Text Datasource supports Zstd compression codec [spark]
XorSum commented on PR #46026: URL: https://github.com/apache/spark/pull/46026#issuecomment-2054066305 @dongjoon-hyun. Thank you. I've edited the description according to your comments. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Re: [PR] [SPARK-47829] Text Datasource supports Zstd compression codec [spark]
dongjoon-hyun commented on PR #46026: URL: https://github.com/apache/spark/pull/46026#issuecomment-2054061782

1. Could you put that info into the PR description first to get further reviews?
2. In general, we need both `spark.driver.extraLibraryPath` and `spark.executor.extraLibraryPath`, don't we? Please mention both.
3. In your example, `/path/to/hadoop-3.4.0/lib/native/lib/native` looks like a typo. Please correct it by removing one of the duplicated `lib/native` segments.

```
$ tree hadoop-3.4.0/lib
hadoop-3.4.0/lib
└── native
    ├── examples
    │   ├── pipes-sort
    │   ├── wordcount-nopipe
    │   ├── wordcount-part
    │   └── wordcount-simple
    ├── libhadoop.a
    ├── libhadoop.so -> libhadoop.so.1.0.0
    ├── libhadoop.so.1.0.0
    ├── libhadooppipes.a
    ├── libhadooputils.a
    ├── libhdfs.a
    ├── libhdfs.so -> libhdfs.so.0.0.0
    ├── libhdfs.so.0.0.0
    ├── libhdfspp.a
    ├── libhdfspp.so -> libhdfspp.so.0.1.0
    ├── libhdfspp.so.0.1.0
    ├── libnativetask.a
    ├── libnativetask.so -> libnativetask.so.1.0.0
    └── libnativetask.so.1.0.0

3 directories, 18 files
```
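For illustration, a minimal sketch of a launch command that sets both properties to the same directory. `NATIVE_DIR` and the helper `native_lib_conf` are hypothetical names used only here, and the path is a placeholder. The script prints the command rather than running it, so it can be reviewed first.

```shell
#!/bin/sh
# Sketch only: NATIVE_DIR is a placeholder, not a path taken from the PR.
NATIVE_DIR="${NATIVE_DIR:-/path/to/hadoop-3.4.0/lib/native}"

# Emit one key=value pair per line so that the driver and the executors
# load native libraries from the same directory.
native_lib_conf() {
  printf '%s\n' \
    "spark.driver.extraLibraryPath=$1" \
    "spark.executor.extraLibraryPath=$1"
}

# Print the full spark-sql command instead of executing it.
printf './bin/spark-sql --master %s' "'local[*]'"
native_lib_conf "$NATIVE_DIR" | while IFS= read -r kv; do
  printf ' --conf %s' "$kv"
done
printf '\n'
```

In `local[*]` mode the executor setting is probably redundant, since everything runs inside the driver JVM, but listing both keeps the instructions valid for cluster deployments.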
Re: [PR] [SPARK-47829] Text Datasource supports Zstd compression codec [spark]
XorSum commented on PR #46026: URL: https://github.com/apache/spark/pull/46026#issuecomment-2053870360

Manual test steps that avoid the "native zStandard library not available" error:

1. Use an x86_64 Linux machine.
2. Download and extract [hadoop-3.4.0.tar.gz](https://dlcdn.apache.org/hadoop/common/hadoop-3.4.0/hadoop-3.4.0.tar.gz). The native libraries are in `hadoop-3.4.0/lib/native`.
3. Start spark-sql with the `extraLibraryPath` conf:

```
./bin/spark-sql \
  --master local[*] \
  --conf spark.driver.extraLibraryPath=/path/to/hadoop-3.4.0/lib/native/lib/native
```

However, I don't know how to obtain and configure the Hadoop native library during unit tests, or how to bundle it into the Spark distribution. Could you give me some advice or tips?
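A quick check before step 3 can catch a mistyped directory. The sketch below assumes that `libhadoop.so` sits directly inside the native directory, as it does in the extracted 3.4.0 tarball. `check_native_dir` and `NATIVE_DIR` are hypothetical names. The check only confirms that the file exists, not that the library was built with zstd support.

```shell
#!/bin/sh
# Return success only if the directory directly contains libhadoop.so.
check_native_dir() {
  if [ -e "$1/libhadoop.so" ]; then
    return 0
  fi
  echo "libhadoop.so not found in: $1" >&2
  return 1
}

# Placeholder path; replace it with the real extraction location.
NATIVE_DIR="${NATIVE_DIR:-/path/to/hadoop-3.4.0/lib/native}"
check_native_dir "$NATIVE_DIR" || echo "fix the path before starting spark-sql" >&2
```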
Re: [PR] [SPARK-47829] Text Datasource supports Zstd compression codec [spark]
dongjoon-hyun commented on PR #46026: URL: https://github.com/apache/spark/pull/46026#issuecomment-2053821293 Gentle ping, @XorSum.
Re: [PR] [SPARK-47829] Text Datasource supports Zstd compression codec [spark]
yaooqinn commented on PR #46026: URL: https://github.com/apache/spark/pull/46026#issuecomment-2051928714 Is the Hadoop native zstd library still missing?