[
https://issues.apache.org/jira/browse/DRILL-8037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17487370#comment-17487370
]
ASF GitHub Bot commented on DRILL-8037:
---------------------------------------
vdiravka commented on pull request #2364:
URL: https://github.com/apache/drill/pull/2364#issuecomment-1030429509
Hi @paul-rogers, I have rebased the branch onto master. In a separate
new commit I removed the hack that hid the schema change in
`HashAggTemplate` (one row is actually missing from the query result; the
test case just doesn't check for it).
Thanks for the explanation of how vectors work; it helped me. It is clear
now that the schema changes because `RepeatedMapVector`
[can't be obtained from the cache](https://github.com/apache/drill/blob/317f164791bbbe8f937eb452b49e92c34f1c0333/exec/java-exec/src/main/java/org/apache/drill/exec/physical/resultSet/impl/ColumnBuilder.java#L220):
```
// Don't get the map vector from the vector cache. Map vectors may
// have content that varies from batch to batch. Only the leaf
// vectors can be cached.
```
Obtaining the vector from the cache here leads to errors in this and other
test cases:
`mapVector = (RepeatedMapVector) parent.vectorCache().vectorFor(mapColSchema.schema());`
```
org.apache.drill.common.exceptions.UserRemoteException: EXECUTION_ERROR
ERROR: null
Read failed for reader: JsonBatchReader
....
Caused by: java.lang.AssertionError:
    at org.apache.drill.exec.physical.resultSet.impl.TupleState$MapState.addOutputColumn(TupleState.java:475)
    at org.apache.drill.exec.physical.resultSet.impl.ColumnState.buildOutput(ColumnState.java:321)
    at org.apache.drill.exec.physical.resultSet.impl.TupleState.updateOutput(TupleState.java:206)
    at org.apache.drill.exec.physical.resultSet.impl.TupleState.updateOutput(TupleState.java:217)
    at org.apache.drill.exec.physical.resultSet.impl.TupleState$RowState.updateOutput(TupleState.java:430)
    at org.apache.drill.exec.physical.resultSet.impl.ResultSetLoaderImpl.harvest(ResultSetLoaderImpl.java:716)
```
So it looks to me like we need to either implement schema-change support in
the hashAgg operator or obtain `RepeatedMapVector` from the cache. I lean
towards the latter. What do you think?
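
To make the trade-off concrete, here is a toy sketch in plain Java. It is not Drill code: the `Vector`, `VectorCache`, and `schemaChanged` names are hypothetical stand-ins, and it assumes that a downstream operator treats a replaced vector instance as a schema change. It only shows why a map vector that bypasses the cache gets a new identity in every batch while a cached leaf vector does not.

```java
import java.util.HashMap;
import java.util.Map;

public class VectorCacheSketch {

  /** Hypothetical stand-in for a value vector; only its identity matters here. */
  static class Vector {
    final String name;
    final boolean isMap;

    Vector(String name, boolean isMap) {
      this.name = name;
      this.isMap = isMap;
    }
  }

  /** Toy cache: always reuses leaf vectors by name; reuses maps only if asked to. */
  static class VectorCache {
    private final Map<String, Vector> cache = new HashMap<>();
    private final boolean cacheMaps;

    VectorCache(boolean cacheMaps) {
      this.cacheMaps = cacheMaps;
    }

    Vector vectorFor(String name, boolean isMap) {
      if (isMap && !cacheMaps) {
        // Mirrors the quoted comment: map vectors are not taken from the cache.
        return new Vector(name, true);
      }
      return cache.computeIfAbsent(name, n -> new Vector(n, isMap));
    }
  }

  /**
   * Requests the same column for two consecutive "batches" and reports
   * whether the vector instance differs between them (assumed here to be
   * what a downstream operator would see as a schema change).
   */
  static boolean schemaChanged(VectorCache cache, String name, boolean isMap) {
    Vector firstBatch = cache.vectorFor(name, isMap);
    Vector secondBatch = cache.vectorFor(name, isMap);
    return firstBatch != secondBatch;
  }

  public static void main(String[] args) {
    VectorCache leafOnly = new VectorCache(false);
    System.out.println("leaf changed: " + schemaChanged(leafOnly, "a", false)); // false
    System.out.println("map changed:  " + schemaChanged(leafOnly, "m", true));  // true

    VectorCache withMaps = new VectorCache(true);
    System.out.println("cached map changed: " + schemaChanged(withMaps, "m", true)); // false
  }
}
```

The sketch deliberately ignores the reason given in the quoted comment for not caching maps (their content may vary from batch to batch), which is exactly the part a real fix would have to handle.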
> Add V2 JSON Format Plugin based on EVF
> --------------------------------------
>
> Key: DRILL-8037
> URL: https://issues.apache.org/jira/browse/DRILL-8037
> Project: Apache Drill
> Issue Type: Sub-task
> Reporter: Vitalii Diravka
> Assignee: Vitalii Diravka
> Priority: Major
>
> This adds a new V2 beta JSON Format Plugin based on the "Extended Vector
> Framework".
> This is a follow-up to DRILL-6953 (which was closed with the decision to
> merge it in small pieces).
> So it is based on [https://github.com/apache/drill/pull/1913] and
> [https://github.com/paul-rogers/drill/tree/DRILL-6953-rev2] work.