John Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17144 )

Change subject: IMPALA-10551: Add result sink support for external frontends
......................................................................


Patch Set 6:

(7 comments)

Thanks Joe, I think the suggested review comments has made this a better patch 
overall.

http://gerrit.cloudera.org:8080/#/c/17144/3/be/src/runtime/coordinator.h
File be/src/runtime/coordinator.h:

http://gerrit.cloudera.org:8080/#/c/17144/3/be/src/runtime/coordinator.h@308
PS3, Line 308:   /// Non-null if and only if the query produces results for the 
client; i.e. is of
             :   /// TStmtType::QUERY. Coordinator uses these to pull results 
from plan tree and return
             :   /// them to the client in GetNext(), and also to access the 
fragment instance's runtime
             :   /// state.
             :   ///
             :   /// Result rows are materialized by this fragment instance in 
its own thread. They are
             :   /// materialized into a QueryResultSet provided to the 
coordinator during GetNext().
             :   ///
             :   /// Owned by the QueryState. Set in Exec().
             :   FragmentInstanceState* coord_instance_ = nullptr;
> The way we are using coord_instance_ for a query with a result sink doesn't
I've added a method to query-exec-params and call it in a few places to 
determine if there is a result sink in coordinator.cc - not sure if that is the 
best way.


http://gerrit.cloudera.org:8080/#/c/17144/3/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/17144/3/be/src/runtime/coordinator.cc@193
PS3, Line 193:   // set coord_instance_ and coord_sink_
             :   if (exec_params_.GetCoordFragment() != nullptr) {
             :     // this blocks until all fragment instances have finished 
their Prepare phase
             :     Status query_status = 
query_state_->GetFInstanceState(query_id(), &coord_instance_);
             :     if (!query_status.ok()) return UpdateExecState(query_status, 
nullptr, FLAGS_hostname);
             :     // We expected this query to have a coordinator instance.
             :     DCHECK(coord_instance_ != nullptr);
             :     // When GetFInstanceState() returns the coordinator 
instance, the Prepare phase is
             :     // done and the FragmentInstanceState's root sink will be 
set up.
             :     coord_sink_ = coord_instance_->GetRootSink();
             :     DCHECK(coord_sink_ != nullptr);
             :   }
             :   return Status::OK();
             : }
             :
> As mentioned in coordinator.h, I think coord_instance_ should be nullptr fo
Done


http://gerrit.cloudera.org:8080/#/c/17144/3/be/src/runtime/coordinator.cc@798
PS3, Line 798:   // staging directory.
> I would prefer to review this when I can see how it is being called.
The newest review has the call.


http://gerrit.cloudera.org:8080/#/c/17144/3/be/src/runtime/coordinator.cc@805
PS3, Line 805:   DCHECK(query_ctx().__isset.desc_tbl_serialized);
> If this returns an error (which it should if the query is not successful),
This code is based on FinalizeHdfsDml which follows a similar pattern. So I 
wonder if something else ends up cleaning it up in the error case.


http://gerrit.cloudera.org:8080/#/c/17144/3/be/src/runtime/coordinator.cc@894
PS3, Line 894:         && query_state_->query_options().spool_query_results
             :         && 
query_state_->query_options().spool_all_results_for_retries) {
             :       // Wait until the BufferedPlanRootSink spooled all results 
or any errors stopping
             :       // it, e.g. batch queue full, cancellation or failures.
             :       auto sink = 
static_cast<BufferedPlanRootSink*>(coord_sink_);
             :       if (sink->WaitForAllResultsSpooled()) {
             :         VLOG_QUERY << "Cannot spool all results in the allocated 
result spooling space."
             :             " Query retry will be skipped if any results have 
been returned.";
             :       }
             :
> If the below if statement does not apply to queries with result sinks, then
Done


http://gerrit.cloudera.org:8080/#/c/17144/3/be/src/runtime/coordinator.cc@905
PS3, Line 905:   }
> I think it would make sense to treat a query with a result sink like a DML
Done


http://gerrit.cloudera.org:8080/#/c/17144/3/be/src/runtime/coordinator.cc@940
PS3, Line 940:
> If a query with a result sink goes through the current DML path, then it wo
Done



--
To view, visit http://gerrit.cloudera.org:8080/17144
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I024bf41d77bb81f1ab0debdbd31ec3687c83f072
Gerrit-Change-Number: 17144
Gerrit-PatchSet: 6
Gerrit-Owner: John Sherman <j...@cloudera.com>
Gerrit-Reviewer: Aman Sinha <amsi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <joemcdonn...@cloudera.com>
Gerrit-Reviewer: John Sherman <j...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tmarsh...@cloudera.com>
Gerrit-Comment-Date: Thu, 11 Mar 2021 02:24:00 +0000
Gerrit-HasComments: Yes

Reply via email to