kmitchener opened a new issue, #2963:
URL: https://github.com/apache/arrow-datafusion/issues/2963

   **Describe the bug**
   
   Running a query against AWS Costs Parquet files (generated by AWS Cost and 
Usage Reports) results in a panic.
   
   From datafusion-cli, using the latest master as of the time of this bug report:
   ```sql
   CREATE EXTERNAL TABLE costs 
       STORED AS PARQUET 
       PARTITIONED BY (year, month) 
       LOCATION 'C:/tmp/aws-costs/cur/cost-and-usage/cost-and-usage';
   
   SELECT line_item_resource_id, SUM(line_item_unblended_cost)
       FROM costs
       WHERE resource_tags_user_application = 'FERDA'
       GROUP BY 1;
   ```
   
   ```pre
   thread 'thread 'thread 'thread 'tokio-runtime-workerthread 'thread 'thread 
'tokio-runtime-workertokio-runtime-worker' panicked at 
'tokio-runtime-workertokio-runtime-workertokio-runtime-workerthread '' panicked 
at 'thread 'range end index 105 out of range for slice of length 
104tokio-runtime-worker' panicked at 'thread '' panicked at '' panicked at '' 
panicked at 'tokio-runtime-workerrange end index 113 out of range for slice of 
length 96tokio-runtime-worker', ' panicked at 'range end index 89 out of range 
for slice of length 80' panicked at 'range end index 97 out of range for slice 
of length 96range end index 113 out of range for slice of length 80range end 
index 111 out of range for slice of length 72range end index 89 out of range 
for slice of length 80', ', library\core\src\slice\index.rsrange end index 111 
out of range for slice of length 72', tokio-runtime-worker', ' panicked at '', 
', 
library\core\src\slice\index.rslibrary\core\src\slice\index.rslibrary\core\src\slice\index
 .rs', :' panicked at 'range end index 89 out of range for slice of length 
80library\core\src\slice\index.rslibrary\core\src\slice\index.rslibrary\core\src\slice\index.rs:::library\core\src\slice\index.rs73range
 end index 113 out of range for slice of length 112', :::737373::', 
library\core\src\slice\index.rs737373:::735library\core\src\slice\index.rs::::555:
   :73555
   
   
   573:
   
   
   
   :55
   
   ArrowError(ExternalError(Execution("Join Error: task 764 panicked")))
   ```
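   
   For context, "range end index N out of range for slice of length M" is the 
message Rust's standard library emits when a slice is indexed past its end. 
The sketch below is only an illustration of that bounds check, not DataFusion 
or parquet code; the helper name `checked_prefix` is hypothetical, and the 
lengths are taken from the first message in the output above.
   
   ```rust
   /// Returns the first `end` bytes of `buf`, or `None` if `end` is past the
   /// end of the buffer. (Hypothetical helper, for illustration only.)
   fn checked_prefix(buf: &[u8], end: usize) -> Option<&[u8]> {
       // `get` applies the same bounds check as `&buf[..end]`, but returns
       // `None` instead of panicking when the range is out of bounds.
       buf.get(..end)
   }

   fn main() {
       // A 104-byte buffer, matching "slice of length 104" in the report.
       let buf = vec![0u8; 104];

       // Asking for 105 bytes is out of range, so the checked form yields None.
       assert!(checked_prefix(&buf, 105).is_none());

       // Asking for exactly 104 bytes is fine.
       assert_eq!(checked_prefix(&buf, 104).map(|s| s.len()), Some(104));

       // The unchecked form below would panic instead of returning None:
       // let _ = &buf[..105];
   }
   ```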
   
   The problem seems specific to the `line_item_resource_id` field: I was able 
to group by other fields without issue.
   
   **To Reproduce**
   
   I don't think I should upload our AWS costs files for general consumption, 
so I'm open to suggestions on how to recreate a test case.
   
   **Expected behavior**
   
   not panic :)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
