haraldnh commented on issue #49368:
URL: https://github.com/apache/arrow/issues/49368#issuecomment-3944277483

   It's also the other way around: it works up to and including 20.0.0, where 
the fields are 'string'.  From 21.0.0 onwards, the fields are 'string_view' and 
I have not been able to make things work.
   
   I'm reading a Delta Lake table partitioned on the fields 'date' (formatted 
like "2026-02-23") and 'time' (formatted like "12:30").  I then produce a 
partition filter like:
   ```python
   from datetime import datetime


   def _make_single_day_partition_filter(
       floor: datetime, ceil: datetime, partition_scheme: str
   ) -> list[tuple[str, str, str]]:
       # Always restrict to the day that contains `floor`.
       pfilter: list[tuple[str, str, str]] = [
           ('date', '=', f'{floor.year}-{floor.month:02d}-{floor.day:02d}')
       ]
       # Sub-day schemes also bound the 'time' partition: [floor, ceil).
       if partition_scheme in ('5min', 'hour'):
           pfilter.append(('time', '>=', f'{floor.hour:02d}-{floor.minute:02d}'))
           pfilter.append(('time', '<', f'{ceil.hour:02d}-{ceil.minute:02d}'))
       return pfilter
   ```
   
   I then pass this to `dataset = deltalake.to_pyarrow_dataset(pfilter)` and 
call `data = dataset.to_table(columns)`.  From the error, it seems to be the 
partition-filter comparison that crashes, but it only fails in `to_table()`.
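
The two calls described above could look roughly like the sketch below. It assumes the object called `deltalake` in the text is a `deltalake.DeltaTable` instance. The function name `read_partition` and its parameters are hypothetical, introduced only for this illustration.

```python
from typing import Any


def read_partition(
    table_uri: str, pfilter: list[tuple[str, str, Any]], columns: list[str]
):
    # Imported lazily so this sketch can be loaded without deltalake installed.
    from deltalake import DeltaTable

    table = DeltaTable(table_uri)
    # The partition filter is handed over when the dataset is built.
    dataset = table.to_pyarrow_dataset(partitions=pfilter)
    # A pyarrow dataset is lazy: no data is read until to_table() is called.
    return dataset.to_table(columns=columns)
```

Because the dataset is lazy, an error that only appears in `to_table()` can still stem from something set up earlier, such as the partition filter.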

