doupache commented on PR #12497:
URL: https://github.com/apache/datafusion/pull/12497#issuecomment-2362691758

   Thanks @austin362667  and @alamb. 
   
   I have updated the PR and learned some Cargo tips from @austin362667.
   Using a debug build during development is much faster.
   
   
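   ```sh
   # 1. Build the benchmarks (cargo uses the debug profile by default)
   cd benchmarks && cargo build

   # 2. Convert the IMDB data to Parquet
   cargo run --bin imdb -- convert --input ./data/imdb/ --output ./data/imdb/ --format parquet
   ```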
   
   
   I also tested all 21 Parquet files, as in the following example.
   
   ```sql 
   -- Create the table
   CREATE EXTERNAL TABLE name (
       id INTEGER NOT NULL PRIMARY KEY,
       name STRING NOT NULL,
       imdb_index STRING,
       imdb_id INTEGER,
       gender STRING,
       name_pcode_cf STRING,
       name_pcode_nf STRING,
       surname_pcode STRING,
       md5sum STRING
   )
   STORED AS PARQUET
   LOCATION '../benchmarks/data/imdb/temp/name.parquet';
   
   -- Read a few rows to verify the file
   SELECT * FROM name LIMIT 5;
   ```
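
   Repeating the check above for every converted file could be scripted. The following is only a rough sketch, not what was actually run for this PR: `emit_smoke_queries` is a hypothetical helper name, and it assumes `datafusion-cli` can query a Parquet file directly by its quoted path, so no `CREATE EXTERNAL TABLE` is needed for a quick read check.

   ```sh
   # Hypothetical helper: print one smoke-test query per Parquet file in a directory.
   # Assumption: datafusion-cli accepts a quoted file path in place of a table name.
   emit_smoke_queries() {
       dir="$1"
       for f in "$dir"/*.parquet; do
           # Skip the unexpanded glob when the directory has no Parquet files.
           [ -e "$f" ] || continue
           printf "SELECT * FROM '%s' LIMIT 5;\n" "$f"
       done
   }

   # Possible usage (adjust the directory to wherever the Parquet files were written):
   #   emit_smoke_queries ./data/imdb > smoke.sql
   #   datafusion-cli -f smoke.sql
   ```

   This only confirms that each file can be read; it does not check the column types against the schema the way the explicit `CREATE EXTERNAL TABLE` statement does.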
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

