Amogh Margoor has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17389 )

Change subject: IMPALA-10680: Replace StringToFloatInternal using 
fast_double_parser library
......................................................................


Patch Set 6:

> (4 comments)
 >
 > It is great to know that Impala can achieve a 926 MB/s conversion
 > rate, and very tempting to get the best from fast_double_parser() :-)
 >
 > The key is not to populate a new std::string when the original
 > input already conforms to the requirements of the library (a
 > well-formed, null-terminated string, available in constant time
 > via string::c_str()), which should be true in most cases.
 >
 > Throughout the Impala code base, I was able to find only the
 > following call that converts a string to a double, which makes
 > the above idea feasible.
 >
 > static bool ParseProbability(const string& prob_str, bool* should_execute) {
 >   StringParser::ParseResult parse_result;
 >   double probability = StringParser::StringToFloat<double>(
 >       prob_str.c_str(), prob_str.size(), &parse_result);
 >   if (parse_result != StringParser::PARSE_SUCCESS ||
 >       probability < 0.0 || probability > 1.0) {
 >     return false;
 >   }
 >   // +1L ensures probability of 0.0 and 1.0 work as expected.
 >   *should_execute = rand() < probability * (RAND_MAX + 1L);
 >   return true;
 > }

Hi Qifan, sorry for the late reply. The other important code path that can
pass non-null-terminated strings is the cast, e.g. 'select cast("0.454"
as double)' or 'select cast(x as double) from foo'. This path goes
through CastFunctions::CastToDoubleVal, which is generated by this macro:
#define CAST_FROM_STRING(num_type, native_type, string_parser_fn) \
  num_type CastFunctions::CastTo##num_type( \
      FunctionContext* ctx, const StringVal& val) { \
    if (val.is_null) return num_type::null(); \
    StringParser::ParseResult result; \
    num_type ret; \
    ret.val = StringParser::string_parser_fn<native_type>( \
        reinterpret_cast<char*>(val.ptr), val.len, &result); \
    if (UNLIKELY(result != StringParser::PARSE_SUCCESS)) \
      return num_type::null(); \
    return ret; \
  }

This code is likely to be exercised frequently, depending on how heavily
clients use casts. But your point is valid: a well-formed, null-terminated
string needs no extra processing and should be passed directly to the
library function.
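
To make the two paths concrete, here is a minimal sketch, not the code in
the patch. The names ParseNullTerminated and ParseDouble and the 64-byte
stack buffer are hypothetical, and std::strtod stands in for the library
call only so that the sketch compiles without fast_double_parser. A
std::string input is passed through as-is, while a (ptr, len) input such as
StringVal is copied only to add the terminator:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <string>

// Hypothetical stand-in for the library parser. Like fast_double_parser, it
// requires a null-terminated input; std::strtod is used purely for illustration.
static bool ParseNullTerminated(const char* s, size_t len, double* out) {
  assert(out != nullptr);
  if (len == 0) return false;
  char* end = nullptr;
  *out = std::strtod(s, &end);
  // Succeed only if the parser consumed exactly the bytes we were given.
  return end == s + len;
}

// Fast path: std::string guarantees a terminator via c_str(), so no copy.
bool ParseDouble(const std::string& s, double* out) {
  return ParseNullTerminated(s.c_str(), s.size(), out);
}

// Slow path: (ptr, len) input, e.g. StringVal, may point into the middle of a
// larger buffer, so ptr[len] cannot be assumed to be '\0'. Copy and terminate.
bool ParseDouble(const char* ptr, size_t len, double* out) {
  if (len == 0) return false;
  constexpr size_t kStackBuf = 64;  // assumed size; most numeric literals fit
  char buf[kStackBuf];
  if (len < kStackBuf) {
    std::memcpy(buf, ptr, len);
    buf[len] = '\0';
    return ParseNullTerminated(buf, len, out);
  }
  std::string copy(ptr, len);  // rare: very long input, fall back to the heap
  return ParseNullTerminated(copy.c_str(), len, out);
}
```

With this split, ParseProbability would take the first overload and the
cast macro the second, so only the cast path pays for the copy.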


--
To view, visit http://gerrit.cloudera.org:8080/17389
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic105ad38a2fcbf2fb4e8ae8af6d9a8e251a9c141
Gerrit-Change-Number: 17389
Gerrit-PatchSet: 6
Gerrit-Owner: Amogh Margoor <[email protected]>
Gerrit-Reviewer: Amogh Margoor <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Qifan Chen <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Comment-Date: Mon, 05 Jul 2021 14:34:48 +0000
Gerrit-HasComments: No
