Michael Smith has posted comments on this change. ( http://gerrit.cloudera.org:8080/21168 )
Change subject: IMPALA-12920: Support ai_generate_text built-in function for OpenAI's chat completion API
......................................................................

Patch Set 6:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/21168/6/be/src/exprs/ai-functions-ir.cc
File be/src/exprs/ai-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/21168/6/be/src/exprs/ai-functions-ir.cc@101
PS6, Line 101: const rapidjson::Value& firstChoice = document[OPEN_AI_RESPONSE_FIELD_CHOICES][0];
Theoretically you could set the 'n' parameter, which would return multiple choices. This function doesn't support more than one response choice; we should mention that somewhere in the documentation. We could potentially return an error when parsing params below.

http://gerrit.cloudera.org:8080/#/c/21168/6/be/src/exprs/ai-functions-ir.cc@128
PS6, Line 128: string endpoint_str(FLAGS_ai_endpoint);
This makes an unnecessary copy; can we use string_view instead? std::string_view (https://en.cppreference.com/w/cpp/string/basic_string_view) was added in C++17, which we now use. Alternatively, the value could be initialized from FLAGS_ai_endpoint only in an else clause, since it's going to be used as a 'const string&' for curl.PostToURL.

http://gerrit.cloudera.org:8080/#/c/21168/6/be/src/exprs/scalar-expr-evaluator.cc
File be/src/exprs/scalar-expr-evaluator.cc:

http://gerrit.cloudera.org:8080/#/c/21168/6/be/src/exprs/scalar-expr-evaluator.cc@453
PS6, Line 453: AiFunctions::AiGenerateText(nullptr, StringVal::null(), StringVal::null(),
Presumably this results in an error because 'prompt' is a null string, but it might make sense to use the dry_run=true version to be safe.

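To illustrate the Line 128 comment in ai-functions-ir.cc above, here is a rough sketch of the two options (string_view, or touching the flag only in an else clause). It is not taken from the patch: kAiEndpointFlag stands in for FLAGS_ai_endpoint, PostToUrl stands in for curl.PostToURL, and SelectEndpoint and PostWithOverride are hypothetical helper names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Stand-in for the gflags-defined FLAGS_ai_endpoint (placeholder value, an assumption).
static const std::string kAiEndpointFlag = "https://example.invalid/v1/chat";

// Hypothetical stand-in for a consumer that takes 'const string&', like curl.PostToURL.
// It only reports the URL length so the sketch stays self-contained.
static std::size_t PostToUrl(const std::string& url) { return url.size(); }

// Option 1: choose the endpoint through string_view, so no string is copied
// while deciding between the per-call override and the flag value.
// The returned view is only valid while both underlying buffers are alive.
static std::string_view SelectEndpoint(std::string_view override_endpoint) {
  return override_endpoint.empty() ? std::string_view(kAiEndpointFlag)
                                   : override_endpoint;
}

// Option 2: build a std::string only when an override is supplied; in the
// else branch the flag is passed directly as 'const string&' with no copy.
static std::size_t PostWithOverride(const char* override_ptr, std::size_t override_len) {
  if (override_ptr != nullptr && override_len > 0) {
    const std::string endpoint_str(override_ptr, override_len);
    return PostToUrl(endpoint_str);
  } else {
    return PostToUrl(kAiEndpointFlag);
  }
}
```

Option 2 is probably the smaller change if the consumer keeps its 'const string&' signature, since a string_view would have to be converted back to a std::string before the call anyway.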
http://gerrit.cloudera.org:8080/#/c/21168/6/be/src/runtime/exec-env.cc
File be/src/runtime/exec-env.cc:

http://gerrit.cloudera.org:8080/#/c/21168/6/be/src/runtime/exec-env.cc@a213
PS6, Line 213:
nit: unnecessary whitespace change; spacing here is pretty arbitrary

http://gerrit.cloudera.org:8080/#/c/21168/6/be/src/runtime/exec-env.cc@528
PS6, Line 528: AiFunctions::set_api_key(api_key);
Is this safe to permanently cache? I guess this comes from a site file, so it probably can't be dynamically updated.

http://gerrit.cloudera.org:8080/#/c/21168/6/be/src/runtime/exec-env.cc@535
PS6, Line 535: LOG(ERROR) << "Config 'ai_endpoint' (" << FLAGS_ai_endpoint << ") is invalid"
These don't cause anything immediately to fail. What's the rationale for not failing startup on invalid config? Could they be implemented via DEFINE_validator?

http://gerrit.cloudera.org:8080/#/c/21168/6/be/src/udf/udf.h
File be/src/udf/udf.h:

http://gerrit.cloudera.org:8080/#/c/21168/6/be/src/udf/udf.h@a742
PS6, Line 742:
nit: unnecessary whitespace change, although I think the new form is a little more consistent with the rest of our code.

--
To view, visit http://gerrit.cloudera.org:8080/21168
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id4446957f6030bab1f985fdd69185c3da07d7c4b
Gerrit-Change-Number: 21168
Gerrit-PatchSet: 6
Gerrit-Owner: Abhishek Rawat <[email protected]>
Gerrit-Reviewer: Abhishek Rawat <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Michael Smith <[email protected]>
Gerrit-Reviewer: Yida Wu <[email protected]>
Gerrit-Comment-Date: Wed, 03 Apr 2024 22:39:51 +0000
Gerrit-HasComments: Yes
