zmrdltl commented on issue #5410:
URL: https://github.com/apache/arrow-rs/issues/5410#issuecomment-1962309045

   
   To handle a parquet file with SQL statements, the data first has to be read
   into a gluesql schema. When writing the parquet file, that schema is
   converted to parquet's enum `SchemaType`, and the gluesql data type of each
   field is converted to the corresponding parquet data type. Writing is
   currently done through `SerializedColumnWriter`.
   Matching on a `ColumnWriter` variant for every value causes a lot of
   duplication:
   
   ```rust
   (Value::Null, ColumnWriter::Int32ColumnWriter(ref mut typed)) => {
       typed.write_batch(&[], Some(&[0]), None).map_storage_err()?;
   }
   (Value::Null, ColumnWriter::Int64ColumnWriter(ref mut typed)) => {
       typed.write_batch(&[], Some(&[0]), None).map_storage_err()?;
   }
   (Value::I8(val), ColumnWriter::Int32ColumnWriter(ref mut typed)) => {
       typed.write_batch(&[val as i32], Some(&[1]), None).map_storage_err()?;
   }
   (Value::Date(d), ColumnWriter::Int32ColumnWriter(ref mut typed)) => {
       ..
       typed.write_batch(&[days_since_epoch], Some(&[1]), None).map_storage_err()?;
   }
   (Value::U8(val), ColumnWriter::Int32ColumnWriter(ref mut typed)) => {
       typed.write_batch(&[val as i32], Some(&[1]), None).map_storage_err()?;
   }
   ```
   While refactoring the code above, I tried to write a generic function, but
   I could not define it without importing `ColumnValueEncoder`, which is why
   I asked about it. Since `ColumnWriterImpl` is a type rather than a trait,
   it seems difficult to use it as a bound. If so, what structure should I
   use instead?
   ```rust
   use parquet::column::writer::{encoder::ColumnValueEncoder, GenericColumnWriter};

   fn write_null<T, E>(typed: &mut GenericColumnWriter<'_, E>) -> Result<(), Error>
   where
       E: ColumnValueEncoder<T>,
   {
       typed.write_batch(&[], Some(&[0]), None).map_storage_err()
   }

   fn write_column<T, E>(
       value: Value,
       col_writer: &mut GenericColumnWriter<'_, E>,
   ) -> Result<(), Error>
   where
       E: ColumnValueEncoder<T>,
   {
       match value {
           Value::Null => write_null(col_writer),
           _ => write_value(col_writer, value),
       }
   }
   ```
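One possible direction, sketched here without the parquet crate so that it compiles on its own, is to make the helpers generic over a data-type marker instead of over the encoder. Everything in this block (`PhysicalType`, `TypedWriter`, the reduced `ColumnWriter` and `Value` enums, `write_one`) is a hypothetical stand-in, not the crate's API. The sketch assumes that the real typed writer can be named as `ColumnWriterImpl<'_, T>` with a `T: DataType` bound, which would need to be checked against the crate version in use.

```rust
// Hypothetical stand-in for a data-type marker trait (plays the role that
// `DataType` is assumed to play in the parquet crate).
trait PhysicalType {
    type Native: Copy;
}

struct Int32Type;
struct Int64Type;

impl PhysicalType for Int32Type {
    type Native = i32;
}
impl PhysicalType for Int64Type {
    type Native = i64;
}

// Hypothetical typed writer, generic over the marker. It only records what
// it is given so the behavior can be inspected.
struct TypedWriter<T: PhysicalType> {
    values: Vec<T::Native>,
    def_levels: Vec<i16>,
}

impl<T: PhysicalType> TypedWriter<T> {
    fn new() -> Self {
        Self { values: Vec::new(), def_levels: Vec::new() }
    }

    // Same argument shape as the `write_batch` calls in the question.
    fn write_batch(
        &mut self,
        values: &[T::Native],
        def_levels: Option<&[i16]>,
        _rep_levels: Option<&[i16]>,
    ) -> Result<usize, String> {
        self.values.extend_from_slice(values);
        if let Some(levels) = def_levels {
            self.def_levels.extend_from_slice(levels);
        }
        Ok(values.len())
    }
}

// Reduced stand-ins for the two enums matched on in the question.
enum ColumnWriter {
    Int32(TypedWriter<Int32Type>),
    Int64(TypedWriter<Int64Type>),
}

enum Value {
    Null,
    I8(i8),
    U8(u8),
    I64(i64),
}

// One generic helper replaces every per-variant null arm body.
fn write_null<T: PhysicalType>(typed: &mut TypedWriter<T>) -> Result<(), String> {
    typed.write_batch(&[], Some(&[0]), None).map(|_| ())
}

// One generic helper for a single non-null value.
fn write_one<T: PhysicalType>(typed: &mut TypedWriter<T>, v: T::Native) -> Result<(), String> {
    typed.write_batch(&[v], Some(&[1]), None).map(|_| ())
}

fn write_column(value: Value, writer: &mut ColumnWriter) -> Result<(), String> {
    match (value, writer) {
        (Value::Null, ColumnWriter::Int32(w)) => write_null(w),
        (Value::Null, ColumnWriter::Int64(w)) => write_null(w),
        (Value::I8(v), ColumnWriter::Int32(w)) => write_one(w, i32::from(v)),
        (Value::U8(v), ColumnWriter::Int32(w)) => write_one(w, i32::from(v)),
        (Value::I64(v), ColumnWriter::Int64(w)) => write_one(w, v),
        _ => Err("unsupported value/column combination".to_string()),
    }
}
```

The match still needs one arm per (value, writer variant) pair, because the enum has to be unpacked somewhere, but each arm shrinks to a single call and the `write_batch` details live in one place. Calling `write_column(Value::I8(-3), &mut writer)` on an `Int32` writer records the value with definition level 1, while `Value::Null` records only definition level 0.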
   


