[ 
https://issues.apache.org/jira/browse/DRILL-6373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16507697#comment-16507697
 ] 

ASF GitHub Bot commented on DRILL-6373:
---------------------------------------

paul-rogers commented on issue #1244: DRILL-6373: Refactor Result Set Loader 
for Union, List support
URL: https://github.com/apache/drill/pull/1244#issuecomment-396126712
 
 
   Thanks much @vrozov for the analysis. I must say I'm a bit stumped. Value 
vectors are clearly not designed for concurrent modification. That is not 
simply a code bug, it is a fundamental design decision. Somewhere in code or 
documentation I recall a statement that says that value vectors are meant to be 
created once (by a single thread), then be immutable thereafter.
   
   It should be perfectly fine for any number of readers, in separate threads, 
to access the vector once it has entered its immutable phase. But nothing 
in the vector design supports concurrent access while the vector is still mutable.
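   
   To make the lifecycle concrete, here is a minimal sketch of the 
"single writer, then immutable" contract. `SimpleIntVector`, `freeze()` and 
`VectorLifecycleDemo` are hypothetical names invented for illustration; they 
are not Drill's value vector API, and real vectors do not enforce the 
contract with checks like these. The sketch only shows the assumption: one 
thread builds the vector, then any number of threads may read it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for a value vector; NOT Drill's actual API.
final class SimpleIntVector {
    private int[] data = new int[4];
    private int count;
    // The creating thread is the only one allowed to mutate the vector.
    private final Thread writer = Thread.currentThread();
    // Becomes true once the vector enters its immutable phase.
    private volatile boolean frozen;

    // Mutable phase: writes are accepted only from the creating thread.
    void add(int value) {
        if (frozen) {
            throw new IllegalStateException("vector is immutable");
        }
        if (Thread.currentThread() != writer) {
            throw new IllegalStateException("write from a non-owner thread");
        }
        if (count == data.length) {
            data = Arrays.copyOf(data, count * 2);
        }
        data[count++] = value;
    }

    // Ends the mutable phase; the volatile write publishes data and count.
    void freeze() {
        frozen = true;
    }

    // Immutable phase: any thread may read.
    int size() {
        if (!frozen) {
            throw new IllegalStateException("vector is still mutable");
        }
        return count;
    }

    int get(int index) {
        if (!frozen) {
            throw new IllegalStateException("vector is still mutable");
        }
        if (index < 0 || index >= count) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return data[index];
    }
}

final class VectorLifecycleDemo {
    // Sums the frozen vector from several reader threads at once and
    // returns one total per reader.
    static long[] sumConcurrently(SimpleIntVector vector, int readers) {
        ExecutorService pool = Executors.newFixedThreadPool(readers);
        try {
            Callable<Long> task = () -> {
                long sum = 0;
                for (int i = 0; i < vector.size(); i++) {
                    sum += vector.get(i);
                }
                return sum;
            };
            List<Future<Long>> futures = new ArrayList<>();
            for (int r = 0; r < readers; r++) {
                futures.add(pool.submit(task));
            }
            long[] sums = new long[readers];
            for (int r = 0; r < readers; r++) {
                sums[r] = futures.get(r).get();
            }
            return sums;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

   In this sketch, two threads calling `add()` on the same instance would 
fail fast instead of silently corrupting state, which is the kind of 
violation the stack trace seems to point at.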
   
   What is going on in this use case to cause concurrent modification? Is that 
a "bug" or a "feature"? In the stack trace you provided, both threads are 
creating a new vector, which should not cause a conflict. If, however, they are 
modifying the same record batch, then we are violating a design assumption 
that, like vectors, batches are immutable once created, and that each batch is 
mutated by a single thread.
   
   The one other possibility is that some code has a bug and is modifying 
the immutable schema when it should be modifying the mutable one (if working 
with two vectors), but I'm not sure how that could happen, since code that adds 
fields is not aware of other vectors. Also, AFAIK, while I did change some code 
to keep metadata in sync (the design of `MaterializedField` really works only 
for simple vectors; it is a muddle for complex vectors such as maps), the 
changes only apply to the mutable stage of a vector's lifecycle.
   
   Thoughts?



> Refactor the Result Set Loader to prepare for Union, List support
> -----------------------------------------------------------------
>
>                 Key: DRILL-6373
>                 URL: https://issues.apache.org/jira/browse/DRILL-6373
>             Project: Apache Drill
>          Issue Type: Improvement
>    Affects Versions: 1.13.0
>            Reporter: Paul Rogers
>            Assignee: Paul Rogers
>            Priority: Major
>             Fix For: 1.14.0
>
>
> As the next step in merging the "batch sizing" enhancements, refactor the 
> {{ResultSetLoader}} and related classes to prepare for Union and List 
> support. This fix follows the refactoring of the column accessors for the 
> same purpose. Actual Union and List support is to follow in a separate PR.



