jinchengchenghh commented on PR #5447:
URL: https://github.com/apache/incubator-gluten/pull/5447#issuecomment-2099651706

   After enabling HDFS in Arrow, I can successfully read a CSV file from HDFS.
   ```
   scala> val filePath = "/input/student.csv"
   filePath: String = /input/student.csv
   
   scala> val df = spark.read.format("csv").option("header", "true").load(filePath)
   E0508 10:25:42.841267 3005137 Exceptions.h:69] Line: /mnt/DP_disk1/code/incubator-gluten/ep/build-velox/build/velox_ep/velox/exec/Task.cpp:1850, Function:terminate, Expression:  Cancelled, Source: RUNTIME, ErrorCode: INVALID_STATE
   df: org.apache.spark.sql.DataFrame = [Name: string, Language: string]
   
   scala>
        | df.show()
   +-----+--------+
   | Name|Language|
   +-----+--------+
   | Juno|    Java|
   |Peter|  Python|
   |Celin|     C++|
   +-----+--------+
   
   
   scala> print(df.queryExecution.executedPlan)
   *(1) ColumnarToRow
   +- ArrowFileScan arrowcsv [Name#17,Language#18] Batched: true, DataFilters: [], Format: org.apache.gluten.datasource.ArrowCSVFileFormat@485f3327, Location: InMemoryFileIndex(1 paths)[hdfs://0.0.0.0:9000/input/student.csv], PartitionFilters: [], PushedFilters: [], ReadSchema: struct<Name:string,Language:string>
   
   
   ```
   My local protobuf version is shown below. I'm not sure whether the issue is related to the protobuf version, @liujiayi771.
   ```
   root@sr249:/mnt/DP_disk2/tpcds/scripts# protoc --version
   libprotoc 3.21.4
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

