[Impala-ASF-CR] IMPALA-10375: Lock down which filesystems use the file handle cache

2021-04-11 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16856 )

Change subject: IMPALA-10375: Lock down which filesystems use the file handle 
cache
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/16856/1/be/src/runtime/io/scan-range.cc
File be/src/runtime/io/scan-range.cc:

http://gerrit.cloudera.org:8080/#/c/16856/1/be/src/runtime/io/scan-range.cc@200
PS1, Line 200:   if (is_file_handle_caching_enabled() && 
filesystem_supports_handle_caching(file()) &&
> We might need to be a bit careful about passing from frontend when it comes
Now more places (e.g. spilling to remote fs like s3) depend on the path prefix 
matching. So after we get HDFS-15289, we should change these as well:
https://github.com/apache/impala/blob/1dcd59638c340e2e1d167f7e8b21019e897893a9/be/src/util/hdfs-util.cc#L95-L118


http://gerrit.cloudera.org:8080/#/c/16856/1/be/src/runtime/io/scan-range.cc@201
PS1, Line 201: expected_local_
Can we simply change this to this?

 (expected_local_ && disk_id_ < io_mgr_->num_local_disks())



--
To view, visit http://gerrit.cloudera.org:8080/16856
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I136c3da9d19590cdbe8623d22480b8dd07192ce3
Gerrit-Change-Number: 16856
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 12 Apr 2021 01:43:54 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10647 Improve always-true min/max filter handling in coordinator

2021-04-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17252 )

Change subject: IMPALA-10647 Improve always-true min/max filter handling in 
coordinator
..


Patch Set 16:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8545/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17252
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I326317833979efcbe02ce6c95ad80133dd5c7964
Gerrit-Change-Number: 17252
Gerrit-PatchSet: 16
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Mon, 12 Apr 2021 01:23:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10647 Improve always-true min/max filter handling in coordinator

2021-04-11 Thread Qifan Chen (Code Review)
Qifan Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17252 )

Change subject: IMPALA-10647 Improve always-true min/max filter handling in 
coordinator
..


Patch Set 16:

> Will add the new tests tomorrow. Thanks all for the great comments.


--
To view, visit http://gerrit.cloudera.org:8080/17252
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I326317833979efcbe02ce6c95ad80133dd5c7964
Gerrit-Change-Number: 17252
Gerrit-PatchSet: 16
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Mon, 12 Apr 2021 01:23:31 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10647 Improve always-true min/max filter handling in coordinator

2021-04-11 Thread Qifan Chen (Code Review)
Qifan Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17252 )

Change subject: IMPALA-10647 Improve always-true min/max filter handling in 
coordinator
..


Patch Set 16:

Will add the new tests tomorrow. Thanks all for great comments.


--
To view, visit http://gerrit.cloudera.org:8080/17252
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I326317833979efcbe02ce6c95ad80133dd5c7964
Gerrit-Change-Number: 17252
Gerrit-PatchSet: 16
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Mon, 12 Apr 2021 01:23:04 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10647 Improve always-true min/max filter handling in coordinator

2021-04-11 Thread Qifan Chen (Code Review)
Qifan Chen has uploaded a new patch set (#16). ( 
http://gerrit.cloudera.org:8080/17252 )

Change subject: IMPALA-10647 Improve always-true min/max filter handling in 
coordinator
..

IMPALA-10647 Improve always-true min/max filter handling in coordinator

The change improves how a coordinator behaves when a just
arriving min/max filter is always true. A new member
'always_true_filter_received_' is introduced to record such a fact
and used when display the min and max column in
"Filter routing table" and "Final filter table" in profile. These
two columns now display the following possible values.

  1. 'PartialUpdates' - The min and the max are partially updated;
  2. 'AlwaysTrue' - One received filter is AlwaysTrue;
  3. 'AlwaysFalse'- No filter is received or all received
filters are empty;
  4. 'Real values'- The final accumulated min/max from all
received filters.

A second change introduced is to record, in scan node, the
arrival time of min/max filters (as a timestamp since the system
is rebooted obtained by calling MonotonicMillis()). A timestamp
of similar nature is recorded for hdfs parquet scanners when a
row group is processed. By comparing these two timestamps, one
can easily diagnose issues related to late arrival of min/max
filters.

Testing:
  1. Ran unit tests;
  2. Ran core tests.

Change-Id: I326317833979efcbe02ce6c95ad80133dd5c7964
---
M be/src/exec/parquet/hdfs-parquet-scanner.cc
M be/src/exec/scan-node.cc
M be/src/runtime/coordinator-filter-state.h
M be/src/runtime/coordinator.cc
M be/src/util/min-max-filter.cc
M be/src/util/min-max-filter.h
6 files changed, 122 insertions(+), 26 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/17252/16
--
To view, visit http://gerrit.cloudera.org:8080/17252
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I326317833979efcbe02ce6c95ad80133dd5c7964
Gerrit-Change-Number: 17252
Gerrit-PatchSet: 16
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 


[Impala-ASF-CR] IMPALA-10619: Minor refactoring of analytic function methods

2021-04-11 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17237 )

Change subject: IMPALA-10619: Minor refactoring of analytic function methods
..


Patch Set 3: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17237/1/fe/src/main/java/org/apache/impala/analysis/AnalyticExpr.java
File fe/src/main/java/org/apache/impala/analysis/AnalyticExpr.java:

http://gerrit.cloudera.org:8080/#/c/17237/1/fe/src/main/java/org/apache/impala/analysis/AnalyticExpr.java@833
PS1, Line 833:   protected FunctionCallExpr 
createRewrittenFunction(FunctionName funcName,
> Sorry, was distracted by some other issues.  I did look into wrapping those
Sure. Don't want to block this. I'm ok with this since I think the external 
frontend can override the standardize() method and add checks at the end to 
make sure the fnCall_ is in the desired class (e.g. ExternalFEFunctionCallExpr).



--
To view, visit http://gerrit.cloudera.org:8080/17237
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I39e4268c0c5500f09acf98357a80763c28f615c2
Gerrit-Change-Number: 17237
Gerrit-PatchSet: 3
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Mon, 12 Apr 2021 00:18:17 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10350 Fix precision loss while converting DecimalValue to double.

2021-04-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17303 )

Change subject: IMPALA-10350 Fix precision loss while converting DecimalValue 
to double.
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8544/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17303
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I56f0652cb8f81a491b87d9b108a94c00ae6c99a1
Gerrit-Change-Number: 17303
Gerrit-PatchSet: 3
Gerrit-Owner: Amogh Margoor 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Sun, 11 Apr 2021 20:28:49 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10350 Fix precision loss while converting DecimalValue to double.

2021-04-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17303 )

Change subject: IMPALA-10350 Fix precision loss while converting DecimalValue 
to double.
..


Patch Set 3:

(20 comments)

http://gerrit.cloudera.org:8080/#/c/17303/3/be/src/thirdparty/fast_double_parser/fast_double_parser.h
File be/src/thirdparty/fast_double_parser/fast_double_parser.h:

http://gerrit.cloudera.org:8080/#/c/17303/3/be/src/thirdparty/fast_double_parser/fast_double_parser.h@13
PS3, Line 13: #if (defined(sun) || defined(__sun))
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/17303/3/be/src/thirdparty/fast_double_parser/fast_double_parser.h@17
PS3, Line 17: #if defined(__CYGWIN__) || defined(__MINGW32__) || 
defined(__MINGW64__)
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/17303/3/be/src/thirdparty/fast_double_parser/fast_double_parser.h@22
PS3, Line 22:  * Determining whether we should import xlocale.h or not is
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/17303/3/be/src/thirdparty/fast_double_parser/fast_double_parser.h@25
PS3, Line 25: #if defined(FAST_DOUBLE_PARSER_SOLARIS) || 
defined(FAST_DOUBLE_PARSER_CYGWIN)
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/17303/3/be/src/thirdparty/fast_double_parser/fast_double_parser.h@67
PS3, Line 67: #endif //  defined(FAST_DOUBLE_PARSER_SOLARIS) || 
defined(FAST_DOUBLE_PARSER_CYGWIN)
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/17303/3/be/src/thirdparty/fast_double_parser/fast_double_parser.h@84
PS3, Line 84:  * However, we have that
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/17303/3/be/src/thirdparty/fast_double_parser/fast_double_parser.h@86
PS3, Line 86:  * Thus it is possible for a number of the form w * 10^-342 where
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/17303/3/be/src/thirdparty/fast_double_parser/fast_double_parser.h@94
PS3, Line 94:  * Any number of form w * 10^309 where w>= 1 is going to be
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/17303/3/be/src/thirdparty/fast_double_parser/fast_double_parser.h@153
PS3, Line 153: // credit: 
https://stackoverflow.com/questions/28868367/getting-the-high-part-of-64-bit-integer-multiplication
line too long (110 > 90)


http://gerrit.cloudera.org:8080/#/c/17303/3/be/src/thirdparty/fast_double_parser/fast_double_parser.h@154
PS3, Line 154: really_inline uint64_t Emulate64x64to128(uint64_t& r_hi, const 
uint64_t x, const uint64_t y) {
line too long (94 > 90)


http://gerrit.cloudera.org:8080/#/c/17303/3/be/src/thirdparty/fast_double_parser/fast_double_parser.h@159
PS3, Line 159: 
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/17303/3/be/src/thirdparty/fast_double_parser/fast_double_parser.h@224
PS3, Line 224:  */
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/17303/3/be/src/thirdparty/fast_double_parser/fast_double_parser.h@958
PS3, Line 958:
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/17303/3/be/src/thirdparty/fast_double_parser/fast_double_parser.h@960
PS3, Line 960:   // The exponent is 1024 + 63 + power
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/17303/3/be/src/thirdparty/fast_double_parser/fast_double_parser.h@976
PS3, Line 976:   // The 65536 is (1<<16) and corresponds to
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/17303/3/be/src/thirdparty/fast_double_parser/fast_double_parser.h@979
PS3, Line 979:   // ((152170 * power ) >> 16) is equal to
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/17303/3/be/src/thirdparty/fast_double_parser/fast_double_parser.h@980
PS3, Line 980:   // floor(log(5**power)/log(2))
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/17303/3/be/src/thirdparty/fast_double_parser/fast_double_parser.h@982
PS3, Line 982:   // Note that this is not magic: 152170/(1<<16) is
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/17303/3/be/src/thirdparty/fast_double_parser/fast_double_parser.h@984
PS3, Line 984:   // The 1<<16 value is a power of two; we could use a
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/17303/3/be/src/thirdparty/fast_double_parser/fast_double_parser.h@1097
PS3, Line 1097: #if defined(FAST_DOUBLE_PARSER_SOLARIS) || 
defined(FAST_DOUBLE_PARSER_CYGWIN)
line has trailing whitespace



--
To view, visit http://gerrit.cloudera.org:8080/17303
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I56f0652cb8f81a491b87d9b108a94c00ae6c99a1
Gerrit-Change-Number: 17303
Gerrit-PatchSet: 3
Gerrit-Owner: Amogh Margoor 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Sun, 11 Apr 2021 20:08:27 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10350 Fix precision loss while converting DecimalValue to double.

2021-04-11 Thread Amogh Margoor (Code Review)
Amogh Margoor has uploaded a new patch set (#3). ( 
http://gerrit.cloudera.org:8080/17303 )

Change subject: IMPALA-10350 Fix precision loss while converting DecimalValue 
to double.
..

IMPALA-10350 Fix precision loss while converting DecimalValue to double.

Original approach to convert DecimalValue(internal representation of decimals) 
was not accurate.
It was:
   static_cast(value_) / pow(10.0, scale).
However only integers from −2^53 to 2^53 can be represented accurately by
double precision without any loss.
Hence, it would not work for numbers like -0.43149576573887316.
For DecimalValue representing -0.43149576573887316, value_ would be
-43149576573887316 and scale would be 17. As value_ < -2^53, result would
not be accurate. In newer approach we are using third party library
https://github.com/lemire/fast_double_parser, which handles above scenario
in a performant manner.

Change-Id: I56f0652cb8f81a491b87d9b108a94c00ae6c99a1
---
M be/src/runtime/decimal-value.inline.h
A be/src/thirdparty/fast_double_parser/LICENSE
A be/src/thirdparty/fast_double_parser/LICENSE.BSL
A be/src/thirdparty/fast_double_parser/README.md
A be/src/thirdparty/fast_double_parser/fast_double_parser.h
M bin/rat_exclude_files.txt
M testdata/workloads/functional-query/queries/QueryTest/values.test
7 files changed, 1,541 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/03/17303/3
--
To view, visit http://gerrit.cloudera.org:8080/17303
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I56f0652cb8f81a491b87d9b108a94c00ae6c99a1
Gerrit-Change-Number: 17303
Gerrit-PatchSet: 3
Gerrit-Owner: Amogh Margoor 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-10647 Improve always-true min/max filter handling in coordinator

2021-04-11 Thread Fang-Yu Rao (Code Review)
Fang-Yu Rao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17252 )

Change subject: IMPALA-10647 Improve always-true min/max filter handling in 
coordinator
..


Patch Set 15:

(4 comments)

Hi Qifan, I only have some minor comments on this patch. Thank you very much 
for working on this!

http://gerrit.cloudera.org:8080/#/c/17252/15//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17252/15//COMMIT_MSG@23
PS15, Line 23: arrival time of min/max filters
Is it true that the logic added in ScanNode::WaitForRuntimeFilters() also 
applies for Bloom filters? Namely, the logged 'end' over there records the time 
when we are done waiting for all filters in 'filter_ctxs_' whether or not there 
might be some Bloom filters.

If the logic added also applies for Bloom filters, is there any particular 
reason why we do not tackle the case when a runtime filter is a Bloom filter in 
this patch?

I will also paste the questions above at ScanNode::WaitForRuntimeFilters() for 
easy reference.


http://gerrit.cloudera.org:8080/#/c/17252/15//COMMIT_MSG@27
PS15, Line 27: relate
nit: related


http://gerrit.cloudera.org:8080/#/c/17252/15/be/src/exec/scan-node.cc
File be/src/exec/scan-node.cc:

http://gerrit.cloudera.org:8080/#/c/17252/15/be/src/exec/scan-node.cc@239
PS15, Line 239: end
Is it true that the logic added here also applies for Bloom filters? Namely, 
the logged 'end' here records the time when we are done waiting for all filters 
in 'filter_ctxs_' whether or not there might be some Bloom filters.

If the logic added also applies for Bloom filters, is there any particular 
reason why we do not tackle the case when a runtime filter is a Bloom filter in 
this patch?


http://gerrit.cloudera.org:8080/#/c/17252/15/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/17252/15/be/src/runtime/coordinator.cc@652
PS15, Line 652:   // Also add the min/max value for partitioned joins, when 
all updates are available
  :   // or the filter is disabled due to being always true.
I was wondering whether the comment would be clearer if we rephrase this 
sentence as the following although it seems a bit verbose. Please also let me 
know if my understanding is correct. Thanks!

For partitioned joins, add the actual min/max values when all updates are 
available and the filter is neither always true nor always false. Add 
"AlwasyTrue" if at least one received filter is always true. Add "AlwaysFalse" 
when the aggregated filter is always false after all updates are received. All 
other cases are considered "PartialUpdates".



--
To view, visit http://gerrit.cloudera.org:8080/17252
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I326317833979efcbe02ce6c95ad80133dd5c7964
Gerrit-Change-Number: 17252
Gerrit-PatchSet: 15
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Sun, 11 Apr 2021 19:04:02 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10350 Fix precision loss while converting DecimalValue to double.

2021-04-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17303 )

Change subject: IMPALA-10350 Fix precision loss while converting DecimalValue 
to double.
..


Patch Set 2:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/8543/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/17303
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I56f0652cb8f81a491b87d9b108a94c00ae6c99a1
Gerrit-Change-Number: 17303
Gerrit-PatchSet: 2
Gerrit-Owner: Amogh Margoor 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Sun, 11 Apr 2021 18:39:01 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10350 Fix precision loss while converting DecimalValue to double.

2021-04-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17303 )

Change subject: IMPALA-10350 Fix precision loss while converting DecimalValue 
to double.
..


Patch Set 2:

(20 comments)

http://gerrit.cloudera.org:8080/#/c/17303/2/be/src/thirdparty/fast_double_parser/fast_double_parser.h
File be/src/thirdparty/fast_double_parser/fast_double_parser.h:

http://gerrit.cloudera.org:8080/#/c/17303/2/be/src/thirdparty/fast_double_parser/fast_double_parser.h@13
PS2, Line 13: #if (defined(sun) || defined(__sun))
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/17303/2/be/src/thirdparty/fast_double_parser/fast_double_parser.h@17
PS2, Line 17: #if defined(__CYGWIN__) || defined(__MINGW32__) || 
defined(__MINGW64__)
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/17303/2/be/src/thirdparty/fast_double_parser/fast_double_parser.h@22
PS2, Line 22:  * Determining whether we should import xlocale.h or not is
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/17303/2/be/src/thirdparty/fast_double_parser/fast_double_parser.h@25
PS2, Line 25: #if defined(FAST_DOUBLE_PARSER_SOLARIS) || 
defined(FAST_DOUBLE_PARSER_CYGWIN)
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/17303/2/be/src/thirdparty/fast_double_parser/fast_double_parser.h@67
PS2, Line 67: #endif //  defined(FAST_DOUBLE_PARSER_SOLARIS) || 
defined(FAST_DOUBLE_PARSER_CYGWIN)
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/17303/2/be/src/thirdparty/fast_double_parser/fast_double_parser.h@84
PS2, Line 84:  * However, we have that
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/17303/2/be/src/thirdparty/fast_double_parser/fast_double_parser.h@86
PS2, Line 86:  * Thus it is possible for a number of the form w * 10^-342 where
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/17303/2/be/src/thirdparty/fast_double_parser/fast_double_parser.h@94
PS2, Line 94:  * Any number of form w * 10^309 where w>= 1 is going to be
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/17303/2/be/src/thirdparty/fast_double_parser/fast_double_parser.h@153
PS2, Line 153: // credit: 
https://stackoverflow.com/questions/28868367/getting-the-high-part-of-64-bit-integer-multiplication
line too long (110 > 90)


http://gerrit.cloudera.org:8080/#/c/17303/2/be/src/thirdparty/fast_double_parser/fast_double_parser.h@154
PS2, Line 154: really_inline uint64_t Emulate64x64to128(uint64_t& r_hi, const 
uint64_t x, const uint64_t y) {
line too long (94 > 90)


http://gerrit.cloudera.org:8080/#/c/17303/2/be/src/thirdparty/fast_double_parser/fast_double_parser.h@159
PS2, Line 159: 
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/17303/2/be/src/thirdparty/fast_double_parser/fast_double_parser.h@224
PS2, Line 224:  */
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/17303/2/be/src/thirdparty/fast_double_parser/fast_double_parser.h@958
PS2, Line 958:
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/17303/2/be/src/thirdparty/fast_double_parser/fast_double_parser.h@960
PS2, Line 960:   // The exponent is 1024 + 63 + power
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/17303/2/be/src/thirdparty/fast_double_parser/fast_double_parser.h@976
PS2, Line 976:   // The 65536 is (1<<16) and corresponds to
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/17303/2/be/src/thirdparty/fast_double_parser/fast_double_parser.h@979
PS2, Line 979:   // ((152170 * power ) >> 16) is equal to
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/17303/2/be/src/thirdparty/fast_double_parser/fast_double_parser.h@980
PS2, Line 980:   // floor(log(5**power)/log(2))
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/17303/2/be/src/thirdparty/fast_double_parser/fast_double_parser.h@982
PS2, Line 982:   // Note that this is not magic: 152170/(1<<16) is
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/17303/2/be/src/thirdparty/fast_double_parser/fast_double_parser.h@984
PS2, Line 984:   // The 1<<16 value is a power of two; we could use a
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/17303/2/be/src/thirdparty/fast_double_parser/fast_double_parser.h@1097
PS2, Line 1097: #if defined(FAST_DOUBLE_PARSER_SOLARIS) || 
defined(FAST_DOUBLE_PARSER_CYGWIN)
line has trailing whitespace



--
To view, visit http://gerrit.cloudera.org:8080/17303
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I56f0652cb8f81a491b87d9b108a94c00ae6c99a1
Gerrit-Change-Number: 17303
Gerrit-PatchSet: 2
Gerrit-Owner: Amogh Margoor 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Sun, 11 Apr 2021 18:19:27 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10350 Fix precision loss while converting DecimalValue to double.

2021-04-11 Thread Amogh Margoor (Code Review)
Amogh Margoor has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/17303 )

Change subject: IMPALA-10350 Fix precision loss while converting DecimalValue 
to double.
..

IMPALA-10350 Fix precision loss while converting DecimalValue to double.

Original approach to convert DecimalValue(internal representation of decimals) 
was not accurate.
It was:
   static_cast(value_) / pow(10.0, scale).
However only integers from −2^53 to 2^53 can be represented accurately by
double precision without any loss.
Hence, it would not work for numbers like -0.43149576573887316.
For DecimalValue representing -0.43149576573887316, value_ would be
-43149576573887316 and scale would be 17. As value_ < -2^53, result would
not be accurate. In newer approach we are using third party library
https://github.com/lemire/fast_double_parser, which handles above scenario
in a performant manner.

Change-Id: I56f0652cb8f81a491b87d9b108a94c00ae6c99a1
---
M be/src/runtime/decimal-value.inline.h
A be/src/thirdparty/fast_double_parser/LICENSE
A be/src/thirdparty/fast_double_parser/LICENSE.BSL
A be/src/thirdparty/fast_double_parser/README.md
A be/src/thirdparty/fast_double_parser/fast_double_parser.h
M testdata/workloads/functional-query/queries/QueryTest/values.test
6 files changed, 1,540 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/03/17303/2
--
To view, visit http://gerrit.cloudera.org:8080/17303
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I56f0652cb8f81a491b87d9b108a94c00ae6c99a1
Gerrit-Change-Number: 17303
Gerrit-PatchSet: 2
Gerrit-Owner: Amogh Margoor 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-10350 Fix precision loss while converting DecimalValue to double.

2021-04-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17303 )

Change subject: IMPALA-10350 Fix precision loss while converting DecimalValue 
to double.
..


Patch Set 1:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/8542/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/17303
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I56f0652cb8f81a491b87d9b108a94c00ae6c99a1
Gerrit-Change-Number: 17303
Gerrit-PatchSet: 1
Gerrit-Owner: Amogh Margoor 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Sun, 11 Apr 2021 17:58:25 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10350 Fix precision loss while converting DecimalValue to double.

2021-04-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17303 )

Change subject: IMPALA-10350 Fix precision loss while converting DecimalValue 
to double.
..


Patch Set 1:

(30 comments)

http://gerrit.cloudera.org:8080/#/c/17303/1/be/src/runtime/decimal-value.inline.h
File be/src/runtime/decimal-value.inline.h:

http://gerrit.cloudera.org:8080/#/c/17303/1/be/src/runtime/decimal-value.inline.h@727
PS1, Line 727:   // Original approach was to use:
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/17303/1/be/src/runtime/decimal-value.inline.h@728
PS1, Line 728:   // static_cast(value_) / pow(10.0, scale).
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/17303/1/be/src/runtime/decimal-value.inline.h@729
PS1, Line 729:   // However only integers from −2^53 to 2^53 can be represented 
accurately by
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/17303/1/be/src/runtime/decimal-value.inline.h@730
PS1, Line 730:   // double precision without any loss.
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/17303/1/be/src/runtime/decimal-value.inline.h@731
PS1, Line 731:   // Hence, it would not work for numbers like 
-0.43149576573887316.
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/17303/1/be/src/runtime/decimal-value.inline.h@734
PS1, Line 734:   // not be accurate. In newer approach we are using
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/17303/1/be/src/runtime/decimal-value.inline.h@742
PS1, Line 742:   // It expects value to be positive even for negative numbers.
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/17303/1/be/src/runtime/decimal-value.inline.h@744
PS1, Line 744:   double result = fast_double_parser::compute_float_64((-scale),
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/17303/1/be/src/runtime/decimal-value.inline.h@747
PS1, Line 747:   // Fallback to original approach. This is not always accurate 
as described above.
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/17303/1/be/src/runtime/decimal-value.inline.h@749
PS1, Line 749:   // double using std:strtod. However std::strtod is atleast 4X 
slower
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/17303/1/be/src/thirdparty/fast_double_parser/fast_double_parser.h
File be/src/thirdparty/fast_double_parser/fast_double_parser.h:

http://gerrit.cloudera.org:8080/#/c/17303/1/be/src/thirdparty/fast_double_parser/fast_double_parser.h@13
PS1, Line 13: #if (defined(sun) || defined(__sun))
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/17303/1/be/src/thirdparty/fast_double_parser/fast_double_parser.h@17
PS1, Line 17: #if defined(__CYGWIN__) || defined(__MINGW32__) || 
defined(__MINGW64__)
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/17303/1/be/src/thirdparty/fast_double_parser/fast_double_parser.h@22
PS1, Line 22:  * Determining whether we should import xlocale.h or not is
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/17303/1/be/src/thirdparty/fast_double_parser/fast_double_parser.h@25
PS1, Line 25: #if defined(FAST_DOUBLE_PARSER_SOLARIS) || 
defined(FAST_DOUBLE_PARSER_CYGWIN)
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/17303/1/be/src/thirdparty/fast_double_parser/fast_double_parser.h@67
PS1, Line 67: #endif //  defined(FAST_DOUBLE_PARSER_SOLARIS) || 
defined(FAST_DOUBLE_PARSER_CYGWIN)
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/17303/1/be/src/thirdparty/fast_double_parser/fast_double_parser.h@84
PS1, Line 84:  * However, we have that
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/17303/1/be/src/thirdparty/fast_double_parser/fast_double_parser.h@86
PS1, Line 86:  * Thus it is possible for a number of the form w * 10^-342 where
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/17303/1/be/src/thirdparty/fast_double_parser/fast_double_parser.h@94
PS1, Line 94:  * Any number of form w * 10^309 where w>= 1 is going to be
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/17303/1/be/src/thirdparty/fast_double_parser/fast_double_parser.h@153
PS1, Line 153: // credit: 
https://stackoverflow.com/questions/28868367/getting-the-high-part-of-64-bit-integer-multiplication
line too long (110 > 90)


http://gerrit.cloudera.org:8080/#/c/17303/1/be/src/thirdparty/fast_double_parser/fast_double_parser.h@154
PS1, Line 154: really_inline uint64_t Emulate64x64to128(uint64_t& r_hi, const 
uint64_t x, const uint64_t y) {
line too long (94 > 90)


http://gerrit.cloudera.org:8080/#/c/17303/1/be/src/thirdparty/fast_double_parser/fast_double_parser.h@159
PS1, Line 159:
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/17303/1/be/src/thirdparty/fast_double_parser/fast_double_parser.h@224
PS1, Line 224:  */
line

[Impala-ASF-CR] IMPALA-10350 Fix precision loss while converting DecimalValue to double.

2021-04-11 Thread Amogh Margoor (Code Review)
Amogh Margoor has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/17303


Change subject: IMPALA-10350 Fix precision loss while converting DecimalValue 
to double.
..

IMPALA-10350 Fix precision loss while converting DecimalValue to double.

Original approach to convert DecimalValue(internal representation of decimals) 
was not accurate.
It was:
   static_cast(value_) / pow(10.0, scale).
However only integers from −2^53 to 2^53 can be represented accurately by
double precision without any loss.
Hence, it would not work for numbers like -0.43149576573887316.
For DecimalValue representing -0.43149576573887316, value_ would be
-43149576573887316 and scale would be 17. As value_ < -2^53, result would
not be accurate. In newer approach we are using third party library
https://github.com/lemire/fast_double_parser, which handles above scenario
in a performant manner.

Change-Id: I56f0652cb8f81a491b87d9b108a94c00ae6c99a1
---
M be/src/runtime/decimal-value.inline.h
A be/src/thirdparty/fast_double_parser/LICENSE
A be/src/thirdparty/fast_double_parser/LICENSE.BSL
A be/src/thirdparty/fast_double_parser/README.md
A be/src/thirdparty/fast_double_parser/fast_double_parser.h
M testdata/workloads/functional-query/queries/QueryTest/values.test
6 files changed, 1,540 insertions(+), 1 deletion(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/03/17303/1
--
To view, visit http://gerrit.cloudera.org:8080/17303
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I56f0652cb8f81a491b87d9b108a94c00ae6c99a1
Gerrit-Change-Number: 17303
Gerrit-PatchSet: 1
Gerrit-Owner: Amogh Margoor