[kudu-CR] KUDU-1291. Efficiently support predicates on non-prefix key components

2018-08-13 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10983 )

Change subject: KUDU-1291. Efficiently support predicates on non-prefix key 
components
..


Patch Set 14:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10983/12/src/kudu/tablet/cfile_set.cc
File src/kudu/tablet/cfile_set.cc:

http://gerrit.cloudera.org:8080/#/c/10983/12/src/kudu/tablet/cfile_set.cc@444
PS12, Line 444: col_id ==
> Yes, if the column is not found in the schema. Not quite sure whether to co
If that's something we expect to happen, I would simply disable the skip scan 
and break out of the cycle, adding a warning log message about wrong column 
name in the predicate.



--
To view, visit http://gerrit.cloudera.org:8080/10983
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I230cd5a288e28ace796b352a603e0d1bcc1e4e0f
Gerrit-Change-Number: 10983
Gerrit-PatchSet: 14
Gerrit-Owner: Anupama Gupta 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Anupama Gupta 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Comment-Date: Tue, 14 Aug 2018 05:52:36 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1291. Efficiently support predicates on non-prefix key components

2018-08-13 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10983 )

Change subject: KUDU-1291. Efficiently support predicates on non-prefix key 
components
..


Patch Set 14:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/10983/12/src/kudu/tablet/cfile_set.cc
File src/kudu/tablet/cfile_set.cc:

http://gerrit.cloudera.org:8080/#/c/10983/12/src/kudu/tablet/cfile_set.cc@462
PS12, Line 462:
  : // Store the predicate column id.
  : skip_scan_predicate_column_id_ = min_col_id;
  :
  : // Store the predicate value.
  : skip_scan_predicate_value_ = pred_value;
  :
  : // Store the cutoff on the number of skip scan seeks.
  :
> Alexey, please let me know if you feel that this change should be reverted.
I think it's better to keep it as Andrew requested.  After some consideration I 
realized that's clearer because it's moved out of the cycle at least.


http://gerrit.cloudera.org:8080/#/c/10983/14/src/kudu/tablet/cfile_set.cc
File src/kudu/tablet/cfile_set.cc:

http://gerrit.cloudera.org:8080/#/c/10983/14/src/kudu/tablet/cfile_set.cc@414
PS14, Line 414: int num_key_cols
nit: since this is not to change in the code below, add 'const'


http://gerrit.cloudera.org:8080/#/c/10983/14/src/kudu/tablet/cfile_set.cc@576
PS14, Line 576: (
style nit: keep the brace with the name of the method


http://gerrit.cloudera.org:8080/#/c/10983/12/src/kudu/tablet/index_skipscan-test.cc
File src/kudu/tablet/index_skipscan-test.cc:

http://gerrit.cloudera.org:8080/#/c/10983/12/src/kudu/tablet/index_skipscan-test.cc@321
PS12, Line 321: auto pred_p1 = 
ColumnPredicate::Equality(schema_.column(0), _p1);
  : auto pred_p2 = 
ColumnPredicate::Equality(schema_.column(1), _p2);
  : spec.AddPredicate(pred_p1);
  : spec.AddPredicate(pred_p2);
  : vector results;
  : NO_FATALS(ScanTablet(, , "Exact match on 
P1 and P2"));
  : EXPECT_EQ(1, results.size());
  :   }
  :   break;
  :
  : c
> Changed to a more uniform layout. Please let me know if this looks good.
thanks, lgtm


http://gerrit.cloudera.org:8080/#/c/10983/14/src/kudu/tablet/index_skipscan-test.cc
File src/kudu/tablet/index_skipscan-test.cc:

PS14:
This seems to be a good set of unit tests regarding testing scan using 
different combinations primary key columns.

How about adding a simple unit test to verify that skip scan is 
enabled/disabled in cfileset iterator?  I.e., I was thinking to verify 
functionality of  CFileSet::Iterator::TryEnableSkipScan().

What do you think?


http://gerrit.cloudera.org:8080/#/c/10983/14/src/kudu/tablet/index_skipscan-test.cc@189
PS14, Line 189: const int32_t kNumDistinctP1 = 2;
  : const int16_t kNumDistinctP2 = 10;
  : const int kNumDistinctP3 = 5;
  : const int8_t kNumDistinctP4 = 1;
  : const int8_t kNumDistinctP5 = 20;
nit: all there might be constexpr:

constexpr int32_t kNumDistinctP1 = 2;
...



--
To view, visit http://gerrit.cloudera.org:8080/10983
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I230cd5a288e28ace796b352a603e0d1bcc1e4e0f
Gerrit-Change-Number: 10983
Gerrit-PatchSet: 14
Gerrit-Owner: Anupama Gupta 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Anupama Gupta 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Comment-Date: Tue, 14 Aug 2018 05:47:54 +
Gerrit-HasComments: Yes


[kudu-CR] WIP [master] replica selection honors placement policy

2018-08-13 Thread Alexey Serbin (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11207

to look at the new patch set (#2).

Change subject: WIP [master] replica selection honors placement policy
..

WIP [master] replica selection honors placement policy

This patch introduces placement policy into the catalog manager's
replica selection process.  The replica selection logic
is factored out into the new PlacementPolicy class.

Added few test scenarios for the new functionality.

WIP: re-base the changes against a real changelist instead of
 [no review] Base to unblock threads of location awareness work

WIP: add more tests, covering already existing functionality as well

Change-Id: I4169098abf17d5591d4c1675561afc15b5477fcd
---
M src/kudu/master/CMakeLists.txt
M src/kudu/master/catalog_manager.cc
M src/kudu/master/catalog_manager.h
A src/kudu/master/placement_policy-test.cc
A src/kudu/master/placement_policy.cc
A src/kudu/master/placement_policy.h
M src/kudu/master/ts_descriptor.cc
M src/kudu/master/ts_descriptor.h
M src/kudu/master/ts_manager.cc
M src/kudu/master/ts_manager.h
10 files changed, 812 insertions(+), 188 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/07/11207/2
--
To view, visit http://gerrit.cloudera.org:8080/11207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4169098abf17d5591d4c1675561afc15b5477fcd
Gerrit-Change-Number: 11207
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins


[kudu-CR] WIP [master] replica selection honors placement policy

2018-08-13 Thread Alexey Serbin (Code Review)
Alexey Serbin has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/11207


Change subject: WIP [master] replica selection honors placement policy
..

WIP [master] replica selection honors placement policy

WIP: some integration tests are failing -- need to fix

Change-Id: I4169098abf17d5591d4c1675561afc15b5477fcd
---
M src/kudu/master/CMakeLists.txt
M src/kudu/master/catalog_manager.cc
M src/kudu/master/catalog_manager.h
A src/kudu/master/placement_policy-test.cc
A src/kudu/master/placement_policy.cc
A src/kudu/master/placement_policy.h
M src/kudu/master/ts_descriptor.cc
M src/kudu/master/ts_descriptor.h
M src/kudu/master/ts_manager.cc
M src/kudu/master/ts_manager.h
10 files changed, 812 insertions(+), 199 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/07/11207/1
--
To view, visit http://gerrit.cloudera.org:8080/11207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I4169098abf17d5591d4c1675561afc15b5477fcd
Gerrit-Change-Number: 11207
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin 


[kudu-CR] [hms] disable direct SQL in MiniHms

2018-08-13 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11198 )

Change subject: [hms] disable direct SQL in MiniHms
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/11198/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11198/1//COMMIT_MSG@7
PS1, Line 7: [hms] disable direct SQL in MiniHms
Why was direct SQL enabled in the first place? What advantages does it confer, 
if any?


http://gerrit.cloudera.org:8080/#/c/11198/1//COMMIT_MSG@12
PS1, Line 12: runing
running



--
To view, visit http://gerrit.cloudera.org:8080/11198
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If48fb2551fae739e2f1548bb5ca2187364064703
Gerrit-Change-Number: 11198
Gerrit-PatchSet: 1
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Comment-Date: Mon, 13 Aug 2018 22:34:21 +
Gerrit-HasComments: Yes


[kudu-CR] Add delete external catalogs flag to table delete tool

2018-08-13 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11197 )

Change subject: Add delete_external_catalogs flag to table delete tool
..


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/11197/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11197/1//COMMIT_MSG@9
PS1, Line 9: it might be necessary to
   : introduce a flag to control the table deletion in the Kudu catalog
   : independently of the HMS.
Not really understanding this: this commit introduces this flag, so how could 
it "might be necessary" to introduce it in the future?


http://gerrit.cloudera.org:8080/#/c/11197/1/src/kudu/client/client.h
File src/kudu/client/client.h:

http://gerrit.cloudera.org:8080/#/c/11197/1/src/kudu/client/client.h@539
PS1, Line 539:   /// Delete/drop a table with option.
Can you move this so it's next to DeleteTable?


http://gerrit.cloudera.org:8080/#/c/11197/1/src/kudu/tools/tool_action_table.cc
File src/kudu/tools/tool_action_table.cc:

http://gerrit.cloudera.org:8080/#/c/11197/1/src/kudu/tools/tool_action_table.cc@39
PS1, Line 39: DEFINE_bool(alter_external_catalogs, true,
: "Whether to alter external catalogs, such as the Hive 
Metastore, "
: "when renaming a table.");
: DEFINE_bool(delete_external_catalogs, true,
: "Whether to delete external catalogs, such as the 
Hive Metastore, "
: "when deleting a table.");
These only exist to facilitate "upgrade" workflows that take you from no HMS 
integration to HMS integration, right? If so, perhaps we should warn users 
against changing the values of these parameters?



--
To view, visit http://gerrit.cloudera.org:8080/11197
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0a128fb53c974a5c839786204d56408681b434e8
Gerrit-Change-Number: 11197
Gerrit-PatchSet: 1
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Comment-Date: Mon, 13 Aug 2018 22:33:46 +
Gerrit-HasComments: Yes


[kudu-CR] [build] Move default sanitizer options into build from shell scripts

2018-08-13 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11176 )

Change subject: [build] Move default sanitizer options into build from shell 
scripts
..


Patch Set 6:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/11176/6//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11176/6//COMMIT_MSG@9
PS6, Line 9: supressions
suppressions


http://gerrit.cloudera.org:8080/#/c/11176/6/src/kudu/util/CMakeLists.txt
File src/kudu/util/CMakeLists.txt:

http://gerrit.cloudera.org:8080/#/c/11176/6/src/kudu/util/CMakeLists.txt@327
PS6, Line 327: kudu_sanitizer_options)
Nit: should precede kudu_util.


http://gerrit.cloudera.org:8080/#/c/11176/6/src/kudu/util/sanitizer_options.cc
File src/kudu/util/sanitizer_options.cc:

http://gerrit.cloudera.org:8080/#/c/11176/6/src/kudu/util/sanitizer_options.cc@35
PS6, Line 35:   return kAsanDefaultOptions;
Nit: could we inline kAsanDefaultOptions directly into the return statement? As 
is you have to follow from the variable definition to the return statement in 
__asan_default_options; inlining would eliminate that following.

Same for the other options/suppressions below.



-- 
To view, visit http://gerrit.cloudera.org:8080/11176
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9eab2d3ef0b107ae9a4e971bf8a6514bf425f645
Gerrit-Change-Number: 11176
Gerrit-PatchSet: 6
Gerrit-Owner: Grant Henke 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Mon, 13 Aug 2018 22:28:06 +
Gerrit-HasComments: Yes


[kudu-CR] [WIP] [tools] KUDU-2461 Add election metrics to ksck

2018-08-13 Thread Attila Bukor (Code Review)
Attila Bukor has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10588 )

Change subject: [WIP] [tools] KUDU-2461 Add election metrics to ksck
..


Patch Set 5:

tagged it as WIP as I think I should write tests for this, but before doing so, 
I wanted to ask if this is a good approach.

Having an RPC to list all metrics seems like an overkill to me, so for now I 
implemented one for consensus metrics.

Will, Todd, Mike, what do you think?


--
To view, visit http://gerrit.cloudera.org:8080/10588
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I16fa6f32e6df7698365e908c7b48e63b0e52c745
Gerrit-Change-Number: 10588
Gerrit-PatchSet: 5
Gerrit-Owner: Attila Bukor 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley 
Gerrit-Comment-Date: Mon, 13 Aug 2018 22:24:37 +
Gerrit-HasComments: No


[kudu-CR] [WIP] [tools] KUDU-2461 Add election metrics to ksck

2018-08-13 Thread Attila Bukor (Code Review)
Hello Will Berkeley, Mike Percy, Kudu Jenkins, Todd Lipcon,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10588

to look at the new patch set (#5).

Change subject: [WIP] [tools] KUDU-2461 Add election metrics to ksck
..

[WIP] [tools] KUDU-2461 Add election metrics to ksck

KUDU-2287 added two new metrics:
* failed_elections_since_stable_leader
* time_since_last_leader_heartbeat

This commit adds two ksck configuration options:

* failed_elections_warn_limit (default 3)
* millis_since_heartbeat_warn_limit (default 5000)

If any of them is reached, a warning is added to the ksck output:

Replica 39d3ec2c1f9541e18162a344645a4cdc of
e996c2436f3149eba8f7207bb6281a0d has not heard from the leader in 275991
ms and has called 33 failed election(s) since then.

Change-Id: I16fa6f32e6df7698365e908c7b48e63b0e52c745
---
M src/kudu/consensus/consensus.proto
M src/kudu/consensus/metadata.proto
M src/kudu/consensus/raft_consensus.cc
M src/kudu/consensus/raft_consensus.h
M src/kudu/tools/ksck-test.cc
M src/kudu/tools/ksck.cc
M src/kudu/tools/ksck.h
M src/kudu/tools/ksck_remote.cc
M src/kudu/tools/ksck_remote.h
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/tablet_service.h
11 files changed, 147 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/88/10588/5
--
To view, visit http://gerrit.cloudera.org:8080/10588
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I16fa6f32e6df7698365e908c7b48e63b0e52c745
Gerrit-Change-Number: 10588
Gerrit-PatchSet: 5
Gerrit-Owner: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley 


[kudu-CR] [build] Move default sanitizer options into build from shell scripts

2018-08-13 Thread Grant Henke (Code Review)
Hello Kudu Jenkins, Adar Dembo, Todd Lipcon,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11176

to look at the new patch set (#6).

Change subject: [build] Move default sanitizer options into build from shell 
scripts
..

[build] Move default sanitizer options into build from shell scripts

This moves the sanitizer options and supressions
from the build scripts using
TSAN_OPTIONS/LSAN_OPTIONS/etc into the build
and linked at build time.

This ensures the same configurations are used for
C++ and Java tests.

Change-Id: I9eab2d3ef0b107ae9a4e971bf8a6514bf425f645
---
M README.adoc
M build-support/dist_test.py
M build-support/jenkins/build-and-test.sh
D build-support/lsan-suppressions.txt
M build-support/run-test.sh
M build-support/run_dist_test.py
D build-support/tsan-suppressions.txt
M src/kudu/util/CMakeLists.txt
A src/kudu/util/sanitizer_options.cc
9 files changed, 205 insertions(+), 181 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/76/11176/6
--
To view, visit http://gerrit.cloudera.org:8080/11176
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9eab2d3ef0b107ae9a4e971bf8a6514bf425f645
Gerrit-Change-Number: 11176
Gerrit-PatchSet: 6
Gerrit-Owner: Grant Henke 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] [build] Move default sanitizer options into build from shell scripts

2018-08-13 Thread Grant Henke (Code Review)
Grant Henke has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11176 )

Change subject: [build] Move default sanitizer options into build from shell 
scripts
..


Patch Set 5:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/11176/5//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11176/5//COMMIT_MSG@9
PS5, Line 9: This moves the sanitzer options and supressions
> Nit: sanitizer
Done


http://gerrit.cloudera.org:8080/#/c/11176/4/src/kudu/sanitizers/sanitizer_options.cc
File src/kudu/sanitizers/sanitizer_options.cc:

http://gerrit.cloudera.org:8080/#/c/11176/4/src/kudu/sanitizers/sanitizer_options.cc@57
PS4, Line 57: "history_size=7 "
> Just to clarify, what's the exact behavior when the user sets TSAN_OPTIONS?
These are treated as defaults that the environment variables can override. The 
right most flag is the final value used.

Here is where the defaults and then env flags are processed:
https://github.com/llvm-mirror/compiler-rt/blob/96ccf181c8c1bf446bae8fc738b3dc8da5a65cfe/lib/tsan/rtl/tsan_flags.cc#L89-L96

Here is where the duplicate flags are handled:
https://github.com/llvm-mirror/compiler-rt/blob/master/lib/sanitizer_common/sanitizer_flag_parser.cc#L148-L152


http://gerrit.cloudera.org:8080/#/c/11176/5/src/kudu/sanitizers/sanitizer_options.cc
File src/kudu/sanitizers/sanitizer_options.cc:

http://gerrit.cloudera.org:8080/#/c/11176/5/src/kudu/sanitizers/sanitizer_options.cc@19
PS5, Line 19: #if defined(ADDRESS_SANITIZER) || defined(LEAK_SANITIZER) ||  \
: defined(MEMORY_SANITIZER) || defined(THREAD_SANITIZER) || \
: defined(UNDEFINED_SANITIZER)
> I don't see why the definition of SANITIZER_HOOK_ATTRIBUTE needs to be cond
Done


http://gerrit.cloudera.org:8080/#/c/11176/5/src/kudu/sanitizers/sanitizer_options.cc@30
PS5, Line 30:
> Nit: remove this.
Done


http://gerrit.cloudera.org:8080/#/c/11176/5/src/kudu/sanitizers/sanitizer_options.cc@45
PS5, Line 45: defined(OS_LINUX)
> I don't think this is ever set by the build.
oops, meant to remove this.


http://gerrit.cloudera.org:8080/#/c/11176/5/src/kudu/sanitizers/sanitizer_options.cc@54
PS5, Line 54: // TODO: " external_symbolizer_path=" + 
env['ASAN_SYMBOLIZER_PATH']
> So what does this TODO mean? Can this patch be merged without it? What happ
The llvm-symbolizer needs to be passed in order to support suppressions and 
improved stack traces on sanitizer errors. I am not sure the best way to 
default to the one in thirdparty from here. Perhaps we can't and I just need to 
use the env variables we were using before for the symbolizer.

See "external_symbolizer_path" here:
https://github.com/google/sanitizers/wiki/ThreadSanitizerFlags

See ASAN_SYMBOLIZER_PATH here: 
https://clang.llvm.org/docs/AddressSanitizer.html#symbolizing-the-reports


http://gerrit.cloudera.org:8080/#/c/11176/5/src/kudu/sanitizers/sanitizer_options.cc@64
PS5, Line 64: extern char kTSanDefaultSuppressions[];
> Can we avoid this by moving __tsan_default_suppressions into tsan_suppressi
I can just move it all into this file.


http://gerrit.cloudera.org:8080/#/c/11176/5/src/kudu/sanitizers/tsan_suppressions.cc
File src/kudu/sanitizers/tsan_suppressions.cc:

http://gerrit.cloudera.org:8080/#/c/11176/5/src/kudu/sanitizers/tsan_suppressions.cc@21
PS5, Line 21: // Please make sure the code below declares a single string 
variable
: // kTSanDefaultSuppressions which contains TSan suppressions 
delimited by
: // newlines.
> Not really understanding this either; the structure of kTsanDefaultSuppress
This syntax weirdness is carry over from the examples I used. I will remove all 
of it.


http://gerrit.cloudera.org:8080/#/c/11176/5/src/kudu/sanitizers/tsan_suppressions.cc@120
PS5, Line 120: ;  // Please keep this semicolon.
> Not sure I get the comment; without the semicolon this is a syntax error, n
Done



--
To view, visit http://gerrit.cloudera.org:8080/11176
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9eab2d3ef0b107ae9a4e971bf8a6514bf425f645
Gerrit-Change-Number: 11176
Gerrit-PatchSet: 5
Gerrit-Owner: Grant Henke 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Mon, 13 Aug 2018 17:43:06 +
Gerrit-HasComments: Yes


[kudu-CR] Supporting Spark streaming DataFrame in KuduContext.

2018-08-13 Thread Attila Piros (Code Review)
Attila Piros has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11199 )

Change subject: Supporting Spark streaming DataFrame in KuduContext.
..


Patch Set 4:

> Could you add tests to demonstrate and prove the functionality this
 > change is supporting? Also to ensure it doesn't break with future
 > changes.

Unfortunately no idea how. If you check my kudu custom sink repo:

 > Could you add tests to demonstrate and prove the functionality this
 > change is supporting? Also to ensure it doesn't break with future
 > changes.

Unfortunately it is not easy: here there is custom sink using the KuduContext 
in streaming: https://github.com/attilapiros/kudu_custom_sink but in our case 
we would need the Spark streaming test in memory solution too.


--
To view, visit http://gerrit.cloudera.org:8080/11199
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iead04539d3514920a5d6803c34715e5686124572
Gerrit-Change-Number: 11199
Gerrit-PatchSet: 4
Gerrit-Owner: Attila Piros 
Gerrit-Reviewer: Attila Piros 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Comment-Date: Mon, 13 Aug 2018 17:01:05 +
Gerrit-HasComments: No


[kudu-CR] Supporting Spark streaming DataFrame in KuduContext.

2018-08-13 Thread Grant Henke (Code Review)
Grant Henke has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11199 )

Change subject: Supporting Spark streaming DataFrame in KuduContext.
..


Patch Set 1:

Could you add tests to demonstrate and prove the functionality this change is 
supporting? Also to ensure it doesn't break with future changes.


--
To view, visit http://gerrit.cloudera.org:8080/11199
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iead04539d3514920a5d6803c34715e5686124572
Gerrit-Change-Number: 11199
Gerrit-PatchSet: 1
Gerrit-Owner: Attila Piros 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Comment-Date: Mon, 13 Aug 2018 15:26:16 +
Gerrit-HasComments: No


[kudu-CR] Using CatalystTypeConverters for mapping InternalRow to Row

2018-08-13 Thread Attila Piros (Code Review)
Attila Piros has abandoned this change. ( http://gerrit.cloudera.org:8080/11200 
)

Change subject: Using CatalystTypeConverters for mapping InternalRow to Row
..


Abandoned
--
To view, visit http://gerrit.cloudera.org:8080/11200
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: abandon
Gerrit-Change-Id: I56ece18cda64547d0e8238ed2e791599d8ad0655
Gerrit-Change-Number: 11200
Gerrit-PatchSet: 1
Gerrit-Owner: Attila Piros 
Gerrit-Reviewer: Kudu Jenkins


[kudu-CR] Supporting Spark streaming DataFrame in KuduContext.

2018-08-13 Thread Attila Piros (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11199

to look at the new patch set (#4).

Change subject: Supporting Spark streaming DataFrame in KuduContext.
..

Supporting Spark streaming DataFrame in KuduContext.

KUDU-2539: Supporting Spark streaming DataFrame in KuduContext.

This solution follows the way how other sinks ie. KafkaSink
is implemented, for details see
https://github.com/apache/spark/blob/master/external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaWriter.scala#L87

Where on the DataFrame a queryExecution.toRdd.foreachPartition
is called to access the InternalRows which mapped to Rows by Catalyst
converters.

Change-Id: Iead04539d3514920a5d6803c34715e5686124572
---
M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduContext.scala
1 file changed, 9 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/99/11199/4
--
To view, visit http://gerrit.cloudera.org:8080/11199
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iead04539d3514920a5d6803c34715e5686124572
Gerrit-Change-Number: 11199
Gerrit-PatchSet: 4
Gerrit-Owner: Attila Piros 
Gerrit-Reviewer: Kudu Jenkins


[kudu-CR] Using CatalystTypeConverters for mapping InternalRow to Row

2018-08-13 Thread Attila Piros (Code Review)
Attila Piros has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/11200


Change subject: Using CatalystTypeConverters for mapping InternalRow to Row
..

Using CatalystTypeConverters for mapping InternalRow to Row

Change-Id: I56ece18cda64547d0e8238ed2e791599d8ad0655
---
M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduContext.scala
1 file changed, 10 insertions(+), 9 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/00/11200/1
--
To view, visit http://gerrit.cloudera.org:8080/11200
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I56ece18cda64547d0e8238ed2e791599d8ad0655
Gerrit-Change-Number: 11200
Gerrit-PatchSet: 1
Gerrit-Owner: Attila Piros 


[kudu-CR] Supporting Spark streaming DataFrame in KuduContext.

2018-08-13 Thread Attila Piros (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11199

to look at the new patch set (#2).

Change subject: Supporting Spark streaming DataFrame in KuduContext.
..

Supporting Spark streaming DataFrame in KuduContext.

KUDU-2539: Supporting Spark streaming DataFrame in KuduContext.

This solution follows the way how other sinks ie. KafkaSink
is implemented, see
https://github.com/apache/spark/blob/master/external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaWriter.scala#L87

Where on the DataFrame a queryExecution.toRdd.foreachPartition
is called to access the InternalRows.

Change-Id: Iead04539d3514920a5d6803c34715e5686124572
---
M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduContext.scala
1 file changed, 13 insertions(+), 8 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/99/11199/2
--
To view, visit http://gerrit.cloudera.org:8080/11199
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iead04539d3514920a5d6803c34715e5686124572
Gerrit-Change-Number: 11199
Gerrit-PatchSet: 2
Gerrit-Owner: Attila Piros 
Gerrit-Reviewer: Kudu Jenkins


[kudu-CR] Supporting Spark streaming DataFrame in KuduContext.

2018-08-13 Thread Attila Piros (Code Review)
Attila Piros has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/11199


Change subject: Supporting Spark streaming DataFrame in KuduContext.
..

Supporting Spark streaming DataFrame in KuduContext.

KUDU-2539.

Change-Id: Iead04539d3514920a5d6803c34715e5686124572
---
M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduContext.scala
1 file changed, 13 insertions(+), 8 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/99/11199/1
--
To view, visit http://gerrit.cloudera.org:8080/11199
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Iead04539d3514920a5d6803c34715e5686124572
Gerrit-Change-Number: 11199
Gerrit-PatchSet: 1
Gerrit-Owner: Attila Piros 


[kudu-CR] [KUDU-2521] Java Implementation for BloomFilter

2018-08-13 Thread Anonymous Coward (Code Review)
jinxing6...@126.com has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11077 )

Change subject: [KUDU-2521] Java Implementation for BloomFilter
..


Patch Set 2:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/11077/1/java/kudu-client/src/main/java/org/apache/kudu/util/BloomFilter.java
File java/kudu-client/src/main/java/org/apache/kudu/util/BloomFilter.java:

http://gerrit.cloudera.org:8080/#/c/11077/1/java/kudu-client/src/main/java/org/apache/kudu/util/BloomFilter.java@22
PS1, Line 22: public class BloomFilter {
> Is this meant to be part of the public API or an internal-facing class? I'm
True, I updated and added some annotation.


http://gerrit.cloudera.org:8080/#/c/11077/1/java/kudu-client/src/main/java/org/apache/kudu/util/BloomFilter.java@40
PS1, Line 40: genBloomKeyProbe
> this pattern seems like it will create a new object for each lookup in the
True,  I updated in current change.


http://gerrit.cloudera.org:8080/#/c/11077/1/java/kudu-client/src/main/java/org/apache/kudu/util/BloomFilter.java@86
PS1, Line 86: CharsetUtil
> I think we can use StandardCharsets from java 7 here
Updated


http://gerrit.cloudera.org:8080/#/c/11077/1/java/kudu-client/src/main/java/org/apache/kudu/util/BloomFilter.java@139
PS1, Line 139: CityHash
> I'm curious why the decision to use CityHash here? We are using murmur hash
Yes, in Kudu kernel, we are using Murmur2 for partitioning and CityHash for 
blooms on PK lookups. The hashing added in this change is only used for bloom 
filter. So I think it's necessary to make the hashing consistent with the 
blooms on PK lookups in TServer.
I was also looking forward to use the bloom filters to eliminate entire 
partitions. But it was hard. In scenario of Join, the small table used to 
generate BloomFilter is not always small enough, thus hard to eliminate the 
entire partition.


http://gerrit.cloudera.org:8080/#/c/11077/3/java/kudu-client/src/main/java/org/apache/kudu/util/BloomFilter.java
File java/kudu-client/src/main/java/org/apache/kudu/util/BloomFilter.java:

http://gerrit.cloudera.org:8080/#/c/11077/3/java/kudu-client/src/main/java/org/apache/kudu/util/BloomFilter.java@28
PS3, Line 28:   public BloomFilter(BloomFilterSizing sizing) {
Added some annotations


http://gerrit.cloudera.org:8080/#/c/11077/3/java/kudu-client/src/main/java/org/apache/kudu/util/BloomFilter.java@169
PS3, Line 169:
Yes, in Kudu kernel, we are using Murmur2 for partitioning and CityHash for 
blooms on PK lookups. The hashing added in this change is only used for bloom 
filter. So I think it's necessary to make the hashing consistent with the 
blooms on PK lookups in TServer.
I was also looking forward to use the bloom filters to eliminate entire 
partitions. But it was hard. In scenario of Join, the small table used to 
generate BloomFilter is not always small enough, thus hard to eliminate the 
entire partition.



--
To view, visit http://gerrit.cloudera.org:8080/11077
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I50c5dce4d24707c068c5ab94e5d311b12b3251b8
Gerrit-Change-Number: 11077
Gerrit-PatchSet: 2
Gerrit-Owner: jinxing6...@126.com
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: jinxing6...@126.com
Gerrit-Comment-Date: Mon, 13 Aug 2018 09:59:24 +
Gerrit-HasComments: Yes


[kudu-CR](gh-pages) [site] Add http to https redirect

2018-08-13 Thread Attila Bukor (Code Review)
Attila Bukor has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11162 )

Change subject: [site] Add http to https redirect
..


Patch Set 1:

> Patch Set 1: Code-Review-1
>
> I know I was the one that suggested this approach, however after doing more 
> research I think it's the wrong way to do it. There is a good article by 
> Google about how to enable HTTPS @ 
> https://developers.google.com/web/fundamentals/security/encrypt-in-transit/enable-https
>
> In the section of that article titled "Turn on Strict Transport Security and 
> secure cookies" they recommend using HSTS 
>  instead of a 
> 301 redirect, and in the section "Redirect HTTP to HTTPS" they recommend 
> using https://...;> tags in the page header to 
> gently redirect search engines to using https.

The HSTS approach is definitely something I'd like to look into, but as it's 
something that's hard to go back once enabled (we need to wait until max-age 
secnods), I thought this should be added as a next step once we are reasonably 
certain it's working fine for everyone (we can ask around on Slack and the 
mailing lists maybe).

As for the canonical link approach, it's nice to have for search engines, but 
it doesn't actually redirect, so 301 still should be used here. The Google 
article you linked seems inconsistent about this point too, maybe it has been 
updated at some point, but they failed to update all references:

"This defeats attacks such as SSL Stripping, and also avoids the round-trip 
cost of the 301 redirect that we enabled in Redirect HTTP to HTTPS." <- it 
talks about 301 redirect instead of the canonical link tag.

The HSTS RFC itself says a permanent redirect such as response code 301 SHOULD 
be used when the request is made over a non-secure HTTP transport[1]. It also 
says the STS header MUST NOT be included on such requests, only on HTTPS.

>
> Sending a hard 301 redirect from http seems undesirable because it locks out 
> clients that can't speak https, whereas the above approaches are both 
> backward-and forward-compatible.

You're right that it might lock out some clients with the current settings. 
According to Qualys SSL Labs tests, there's a protocol mismatch with IE 10 and 
earlier, Android 4.3 and earlier, and Safari 6.0.4 and earlier[2]. Do we have 
any data on whether we have any visitors using any of these browsers? I'd 
assume it would be near impossible to use any modern websites with them, so I'm 
not sure whether we need to take them into consideration. I imagine enabling 
old cipher suites/protocols they understand would result in a very low grade on 
this report.

Anyway, if we want to make kudu.apache.org accessible on these old browsers, I 
believe TLS 1.0, and RSA with 3DES should be enabled instead of allowing 
everyone else to access the page over non-secure HTTP.

[1] https://tools.ietf.org/html/rfc6797#section-7.2
[2] 
https://www.ssllabs.com/ssltest/analyze.html?d=kudu.apache.org=40.79.78.1


--
To view, visit http://gerrit.cloudera.org:8080/11162
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: gh-pages
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic5a060a419466ec4b16840347d387262ca8a4199
Gerrit-Change-Number: 11162
Gerrit-PatchSet: 1
Gerrit-Owner: Attila Bukor 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Mon, 13 Aug 2018 09:43:01 +
Gerrit-HasComments: No