Qifan Chen has posted comments on this change. (
http://gerrit.cloudera.org:8080/17389 )
Change subject: IMPALA-10680: Replace StringToFloatInternal using
fast_double_parser library
......................................................................
Patch Set 6:
(4 comments)
It is great to know that Impala can achieve 926 MB/s conversion rate and very
attempting to get the best from fast_double_parser():-)
The key is not to populate a new std::string when the original input conforms
to the requirements of the library (well formed null-terminated string via
string::c_str() in constant speed), which should be true in most cases.
Throughout the code base of Impala, I was able to find only the following call
that needs the service of converting string to double which makes the above
idea feasible.
346 static bool ParseProbability(const string& prob_str, bool* should_execute)
{
347 StringParser::ParseResult parse_result;
348 double probability = StringParser::StringToFloat<double>(
349 prob_str.c_str(), prob_str.size(), &parse_result);
350 if (parse_result != StringParser::PARSE_SUCCESS ||
351 probability < 0.0 || probability > 1.0) {
352 return false;
353 }
354 // +1L ensures probability of 0.0 and 1.0 work as expected.
355 *should_execute = rand() < probability * (RAND_MAX + 1L);
356 return true;
357 }
http://gerrit.cloudera.org:8080/#/c/17389/6/be/src/util/string-parser.h
File be/src/util/string-parser.h:
http://gerrit.cloudera.org:8080/#/c/17389/6/be/src/util/string-parser.h@463
PS6, Line 463: const char* s, int len
If we can change it to std::string(), we can take advantage of the
fast_double_parser to the fullest.
http://gerrit.cloudera.org:8080/#/c/17389/6/be/src/util/string-parser.h@498
PS6, Line 498: StringToFloatPreprocess(s, i, len);
We can pass the input std::string as 's' and get rid off 'len',
On return, if processed_str.length() is 0, then the original input confirms to
fast_double_parser() and we can make use of the original directly.
http://gerrit.cloudera.org:8080/#/c/17389/6/be/src/util/string-parser.h@538
PS6, Line 538: const char* s, int offset,
: int len
The new signature of this function becomes
static inline std::string StringToFloatPreprocess(const string& s, int offset)
http://gerrit.cloudera.org:8080/#/c/17389/6/be/src/util/string-parser.h@550
PS6, Line 550:
Here before copying the test of the string, make the following check and return
an empty string object to signal that the original input is good to go with
fast_double_parser.
if (i==offset && res.length() == 0) {
return string();
}
--
To view, visit http://gerrit.cloudera.org:8080/17389
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic105ad38a2fcbf2fb4e8ae8af6d9a8e251a9c141
Gerrit-Change-Number: 17389
Gerrit-PatchSet: 6
Gerrit-Owner: Amogh Margoor <[email protected]>
Gerrit-Reviewer: Amogh Margoor <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Qifan Chen <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Comment-Date: Thu, 03 Jun 2021 17:55:42 +0000
Gerrit-HasComments: Yes