capistrant edited a comment on pull request #10464:
URL: https://github.com/apache/druid/pull/10464#issuecomment-721306987


   > Hi @a2l007 and @capistrant, thanks for the nice PR and review! We usually 
treat silence as acceptance, so thanks for merging the PR too. However, for 
this particular PR, I think it needs to get "Design Review" which needs 3 +1s 
from committers (including the author himself) because it changes some 
user-facing behavior 
(https://github.com/apache/druid/blob/master/dev/committer-instructions.md). We 
have this policy because it's hard to undo once the changed user-facing 
behavior appears in some release. 
   
   Apologies for the pre-mature merge. I will keep this in mind in future to 
make sure I loop in more committers during review process for something client 
facing like this.
   
   > When we do design review, we usually have one committer reviewing the 
whole PR line by line, so that others can focus on only the design part. From 
this aspect, I quickly reviewed only the design part. I have 3 comments so far.
   > 
   > * The purpose of the PR sounds good to me because the timeout error 
reporting was inconsistent before. However this change is incompatible which 
means it can break user applications working based on the previous behavior. 
Someone can argue that no one could do it because the previous behavior was 
inconsistent. IMO, this change looks worth since the previous timeout error 
handling was easy to break, but should be called out in the release notes.
   
   Good point. How does something like this work for release notes? Does the 
patch author write up release notes for their change that the release manager 
will use later on?
   
   > * Looking at the code, [the status code is defined as 
504](https://github.com/apache/druid/pull/10464/files#diff-4bc40182ca7938163c8f2b71355cb5849920526319bc56f83dd55e542f3a4323R42)
 which means gateway timeout. I'm wondering 408 request timeout is more 
appropriate in this case.
   
   Could you add more to your thoughts that 408 may be a better error code? I 
thought that 504 sounded ideal since the Broker API is kind of a proxy to the 
druid query engine. The engine did not produce a response in time for the 
Broker to return to client so it responds with 504. I also worry about the 
implications of using a 408 code in relation to the potential for middleware 
auto-retrying the request when the end client may not want that (something like 
a browser doing auto retries N times before actually surfacing that 408 to 
client if it continues). And I've always had the perception of 5XX being a 
server error and 4XX being more of a client error, like the client made a 
mistake. But in this case the server isn't coming up with a response in time.
   
   > * All query errors should be documented. Check out 
https://druid.apache.org/docs/latest/querying/querying.html#query-errors.
   
   shoot I had a feeling there was a doc like this out there, @a2l007 do you 
want to start a follow on PR for this? I suppose it may be a part of a larger 
follow on with whatever else we uncover in discussions with Jihoon.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to