[GitHub] drill pull request: DRILL-3987: Componentize Drill, extracting vec...

cwestin Wed, 11 Nov 2015 15:59:31 -0800

Github user cwestin commented on the pull request:

    https://github.com/apache/drill/pull/250#issuecomment-155948627
  
    On 2.2 above: This description of events is not entirely accurate. We have 
never held up releases for the allocator work. Not once, never mind numerous 
occasions. For two releases during which it was not ready, I constructed 
special extract patches that included bug fixes only, because we believed it
    would help the project. I did ask for people to hold off commits exactly 
once, for 24 hours. And once I found a problem (the rebase didn't go well), I 
took the hold off after about 18 hours.
    
    As for merging patches in the order they are ready: the original allocator 
patch was ready about 3 months ago now. All the tests passed. The problem was 
that you refused to review it because it was too large. It was 213 files, less 
than half the size of this patch. Much like this one, there were "virtually no 
functional changes in this patch and its purpose is extremely narrow in scope." 
A large number of files had minor fixes that removed warnings, because these 
were helpful in finding bugs (biggest example: making a number of things 
AutoCloseables because then the compiler tells you when you leak resources -- 
this found a number of problems). The core changes are in about a dozen files, 
mainly the rewrite of the allocator itself.
    
    You asked, and I pointed you at those core changes. I heard nothing back 
for a while, until you eventually asked that I break it up into smaller patches 
-- but still no comments on the allocator itself. Because of the layers of 
changes and dependencies, that was painful, but I did break things up. The 
first four were ready for review in mid-August. We exchanged messages and I 
told you where they were; When I came back from vacation in September, they had 
still not been reviewed. I made a stink at MapR, and some people finally 
reviewed them. I continued making patches (several more) which I continued to 
have a hard time getting people to review. When you finally did look at one, 
you immediately rejected it just because it didn't have its own JIRA. I 
submitted a JIRA for it within an hour of that, and it, like the others, 
continued to be ignored. I finally badgered more folks at MapR to review them.
    
    Exactly once (not "numerous" times), at the time we ran the performance 
regression (a magical thing that only folks at MapR can do), we found that 
there is a performance penalty. I wouldn't call it a "concurrency design 
challenge." Yes, when you synchronize variables to get transacted behavior 
(memory allocation is really a balance transfer), instead of just depending on 
volatile variables that can change under your feet if you refer to them more 
than once, things are going to be slower. But you can do more.
    
    In order to avoid this continual rebasing to track the project, I went back 
and made it possible for the new and old allocators to co-exist, so that the 
code would at least be committed while the performance penalty could be 
addressed (in order to avoid more rebasing in the future). This would also 
allow the use of the new allocator when assertions are enabled (or another 
debug flag), so that we could find any new problems; many of my rebases took 
time because I kept finding new problems introduced by ongoing work because the 
new allocator is much stricter than the old one.
    
    You insisted that if others could just look at it, you were sure that a 
faster solution would be found; you asked for pointers to the code. I gave them 
to you. I never heard anything back. In any case, it is set up to only be used 
in debug mode for the moment, and can stay that way until someone can find a 
way to make it faster (and I explained the multiple implementations I tried 
before I finally got it down to the ~10% performance penalty it currently 
carries)
    
    I continued breaking things down in an attempt to get them in. We are 
finally down to the last two patches.
    
    Now I see a larger patch with the same characteristics, and it feels like 
it is being bulldozed through, even though there aren't clear benefits to 
having it. Your diagrams doesn't justify it, nor explain why it will be better. 
The new allocator, even if only used in debug mode for now, will help engineers 
avoid creating new leaks that the old allocator doesn't detect. Given some 
other suggestions I've seen to change the vector class interfaces, it seems 
like they're not stable enough to be broken out into a separate project yet; 
that will complicate changes.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request: DRILL-3987: Componentize Drill, extracting vec...

Reply via email to