[GitHub] [arrow-datafusion] ismail opened a new issue, #5657: Request for documentation for compressed CSV/JSON support

via GitHub Mon, 20 Mar 2023 13:00:54 -0700


ismail opened a new issue, #5657:
URL: https://github.com/apache/arrow-datafusion/issues/5657


   Hi,
   
   Support for compressed CSV/JSON was added in
   https://github.com/apache/arrow-datafusion/commit/b8a3a78f833fae8faace8d7542a1fb3d7a497b6a
   and I am trying to use it in a sample:
   
   ```
   use datafusion::prelude::*;
   use datafusion::datasource::file_format::file_type::FileCompressionType;
   
   #[tokio::main]
   async fn main() -> datafusion::error::Result<()> {
       let ctx = SessionContext::new();
       let csv_options = CsvReadOptions::default()
           .has_header(true)
           .file_compression_type(FileCompressionType::BZIP2);
       let df = ctx.read_csv("summary.csv.bz2", csv_options).await?;
       let df = df
           .filter(col("status").eq(lit("OK")))?
           .select_columns(&["name", "id"])?;
   
       df.show().await?;
       Ok(())
   }
   ```
   
   This results in:
   
   ```
   Error: SchemaError(FieldNotFound { field: Column { relation: None, name: "status" }, valid_fields: [] })
   ```
   
   The code works fine on the uncompressed CSV. Since this feature is not
   documented, I am wondering whether I am using it incorrectly. I would
   appreciate it if the documentation included a usage example.

