Re: [PR] [SPARK-47829] Text Datasource supports Zstd compression codec [spark]

2024-04-14 Thread via GitHub


XorSum commented on PR #46026:
URL: https://github.com/apache/spark/pull/46026#issuecomment-2054066305

   @dongjoon-hyun. Thank you. I've edited the description according to your 
comments.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



Re: [PR] [SPARK-47829] Text Datasource supports Zstd compression codec [spark]

2024-04-14 Thread via GitHub


dongjoon-hyun commented on PR #46026:
URL: https://github.com/apache/spark/pull/46026#issuecomment-2054061782

   1. Could you put that info into the PR description first to get further reviews?
   2. We need both `spark.driver.extraLibraryPath` and `spark.executor.extraLibraryPath` in general, don't we? Please mention both.
   3. In your example, `/path/to/hadoop-3.4.0/lib/native/lib/native` looks like a typo. Please correct it by removing one of the duplicated `lib/native` segments.
   ```
   $ tree hadoop-3.4.0/lib
   hadoop-3.4.0/lib
   └── native
       ├── examples
       │   ├── pipes-sort
       │   ├── wordcount-nopipe
       │   ├── wordcount-part
       │   └── wordcount-simple
       ├── libhadoop.a
       ├── libhadoop.so -> libhadoop.so.1.0.0
       ├── libhadoop.so.1.0.0
       ├── libhadooppipes.a
       ├── libhadooputils.a
       ├── libhdfs.a
       ├── libhdfs.so -> libhdfs.so.0.0.0
       ├── libhdfs.so.0.0.0
       ├── libhdfspp.a
       ├── libhdfspp.so -> libhdfspp.so.0.1.0
       ├── libhdfspp.so.0.1.0
       ├── libnativetask.a
       ├── libnativetask.so -> libnativetask.so.1.0.0
       └── libnativetask.so.1.0.0
   
   3 directories, 18 files
   ```
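
For illustration, points 2 and 3 together could look like the sketch below. It only prints a candidate command rather than launching Spark. The helper name `print_spark_sql_cmd` and the example directory are hypothetical; the two `extraLibraryPath` settings are the ones named above.

```shell
# Hypothetical helper: prints a spark-sql invocation that sets both the
# driver and the executor native library path to the same directory.
print_spark_sql_cmd() {
  native_dir="$1"
  # Each of these lines ends with a backslash so the output can be pasted as-is.
  printf '%s \\\n' \
    "./bin/spark-sql" \
    "  --master 'local[*]'" \
    "  --conf spark.driver.extraLibraryPath=$native_dir"
  # Last line of the command: no trailing backslash.
  printf '%s\n' "  --conf spark.executor.extraLibraryPath=$native_dir"
}

# Assumption: the Hadoop tarball was extracted to /path/to/hadoop-3.4.0.
print_spark_sql_cmd "/path/to/hadoop-3.4.0/lib/native"
```

Printing the command first makes it easy to confirm that the path contains a single `lib/native` segment before running it.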





Re: [PR] [SPARK-47829] Text Datasource supports Zstd compression codec [spark]

2024-04-13 Thread via GitHub


XorSum commented on PR #46026:
URL: https://github.com/apache/spark/pull/46026#issuecomment-2053870360

   Manual test steps that avoid the "native zStandard library not available" error:
   1. Use an x86_64 Linux machine.
   2. Download and extract [hadoop-3.4.0.tar.gz](https://dlcdn.apache.org/hadoop/common/hadoop-3.4.0/hadoop-3.4.0.tar.gz). The native libraries are in `hadoop-3.4.0/lib/native`.
   3. Start spark-sql with the `extraLibraryPath` conf:
   ```
   ./bin/spark-sql \
     --master local[*] \
     --conf spark.driver.extraLibraryPath=/path/to/hadoop-3.4.0/lib/native/lib/native
   ```
   
   However, I don't know how to obtain and configure the Hadoop native library during unit tests, or how to bundle it into the Spark distribution. Could you give me some advice or tips?
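
As a pre-flight check for the manual steps above, a small sketch like the following could confirm that the extracted directory really contains `libhadoop.so` before Spark is started. The helper name `has_libhadoop` and the `HADOOP_HOME` default are assumptions for illustration. This does not address the unit-test or bundling question, and it does not prove that zstd support is usable; Hadoop's own `hadoop checknative -a` command reports which native codecs can actually be loaded.

```shell
# Hypothetical helper: succeeds only if the given directory contains libhadoop.so.
has_libhadoop() {
  [ -e "$1/libhadoop.so" ]
}

# Assumption: HADOOP_HOME points at the extracted hadoop-3.4.0 directory.
NATIVE_DIR="${HADOOP_HOME:-/path/to/hadoop-3.4.0}/lib/native"

if has_libhadoop "$NATIVE_DIR"; then
  echo "found libhadoop.so in $NATIVE_DIR"
else
  echo "libhadoop.so not found in $NATIVE_DIR" >&2
fi
```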





Re: [PR] [SPARK-47829] Text Datasource supports Zstd compression codec [spark]

2024-04-13 Thread via GitHub


dongjoon-hyun commented on PR #46026:
URL: https://github.com/apache/spark/pull/46026#issuecomment-2053821293

   Gentle ping, @XorSum.





Re: [PR] [SPARK-47829] Text Datasource supports Zstd compression codec [spark]

2024-04-12 Thread via GitHub


yaooqinn commented on PR #46026:
URL: https://github.com/apache/spark/pull/46026#issuecomment-2051928714

   Is the Hadoop native zstd library still missing?

