kmitchener opened a new issue, #2963:
URL: https://github.com/apache/arrow-datafusion/issues/2963
**Describe the bug**
Given some AWS cost parquet files (generated by AWS cost and usage
reporting), a query executed against those files panics.
Reproduced from datafusion-cli, using latest master as of the time of this bug report:
```sql
CREATE EXTERNAL TABLE costs
STORED AS PARQUET
PARTITIONED BY (year, month)
LOCATION 'C:/tmp/aws-costs/cur/cost-and-usage/cost-and-usage';
SELECT line_item_resource_id, sum(line_item_unblended_cost)
FROM costs
WHERE resource_tags_user_application = 'FERDA'
GROUP BY 1;
```
```pre
thread 'thread 'thread 'thread 'tokio-runtime-workerthread 'thread 'thread
'tokio-runtime-workertokio-runtime-worker' panicked at
'tokio-runtime-workertokio-runtime-workertokio-runtime-workerthread '' panicked
at 'thread 'range end index 105 out of range for slice of length
104tokio-runtime-worker' panicked at 'thread '' panicked at '' panicked at ''
panicked at 'tokio-runtime-workerrange end index 113 out of range for slice of
length 96tokio-runtime-worker', ' panicked at 'range end index 89 out of range
for slice of length 80' panicked at 'range end index 97 out of range for slice
of length 96range end index 113 out of range for slice of length 80range end
index 111 out of range for slice of length 72range end index 89 out of range
for slice of length 80', ', library\core\src\slice\index.rsrange end index 111
out of range for slice of length 72', tokio-runtime-worker', ' panicked at '',
',
library\core\src\slice\index.rslibrary\core\src\slice\index.rslibrary\core\src\slice\index
.rs', :' panicked at 'range end index 89 out of range for slice of length
80library\core\src\slice\index.rslibrary\core\src\slice\index.rslibrary\core\src\slice\index.rs:::library\core\src\slice\index.rs73range
end index 113 out of range for slice of length 112', :::737373::',
library\core\src\slice\index.rs737373:::735library\core\src\slice\index.rs::::555:
:73555
573:
:55
ArrowError(ExternalError(Execution("Join Error: task 764 panicked")))
```
The problem seems specific to the `line_item_resource_id` field: I was able
to group by other fields without issue.
**To Reproduce**
I don't think I should upload our AWS cost files for general consumption,
so I'm open to suggestions on how to create a reproducible test case.
**Expected behavior**
The query should not panic :)