[Impala-ASF-CR] IMPALA-7043: HBase split failure should not fail dataload

2018-05-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10437 )

Change subject: IMPALA-7043: HBase split failure should not fail dataload
..


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2491/


--
To view, visit http://gerrit.cloudera.org:8080/10437
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7497fe8c9f1655a34b2743462d8b7248eb94554e
Gerrit-Change-Number: 10437
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Thu, 17 May 2018 05:49:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7043: HBase split failure should not fail dataload

2018-05-16 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10437 )

Change subject: IMPALA-7043: HBase split failure should not fail dataload
..


Patch Set 1:

> Could you link the JIRA between this one and the "flaky test" JIRA
 > we have for this? (Alternately, mention both JIRAs in the commit
 > message.)
 >
 > I think this is tactically the right thing to do: the failure isn't
 > hiding because it's in the test anyway. Thanks for tackling it.

Added a link between this and IMPALA-6776. Also added a link to this JIRA from 
the JIRA we've been using to track this.


--
To view, visit http://gerrit.cloudera.org:8080/10437
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7497fe8c9f1655a34b2743462d8b7248eb94554e
Gerrit-Change-Number: 10437
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Thu, 17 May 2018 05:49:24 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7017: deflake/fix test catalog restart test

2018-05-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10397 )

Change subject: IMPALA-7017: deflake/fix test_catalog_restart test
..


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2490/


--
To view, visit http://gerrit.cloudera.org:8080/10397
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iab6edb01f0bd7f5408cfef28fd05fdc95fb78469
Gerrit-Change-Number: 10397
Gerrit-PatchSet: 1
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Comment-Date: Thu, 17 May 2018 05:36:29 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7043: HBase split failure should not fail dataload

2018-05-16 Thread Philip Zeyliger (Code Review)
Philip Zeyliger has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10437 )

Change subject: IMPALA-7043: HBase split failure should not fail dataload
..


Patch Set 1: Code-Review+2

Could you link the JIRA between this one and the "flaky test" JIRA we have for 
this? (Alternately, mention both JIRAs in the commit message.)

I think this is tactically the right thing to do: the failure isn't hiding 
because it's in the test anyway. Thanks for tackling it.


--
To view, visit http://gerrit.cloudera.org:8080/10437
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7497fe8c9f1655a34b2743462d8b7248eb94554e
Gerrit-Change-Number: 10437
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Thu, 17 May 2018 02:48:04 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7000: [DOCS] Correct info about dedicated executors

2018-05-16 Thread Alex Rodoni (Code Review)
Alex Rodoni has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10357 )

Change subject: IMPALA-7000: [DOCS] Correct info about dedicated executors
..


Patch Set 1:

Hi Tim,
It looks like Juan is not available to review this change. Could you recommend 
someone in your team to review and approve this change?
Thank you!


--
To view, visit http://gerrit.cloudera.org:8080/10357
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4b7e6c57188a41a45d5813882b6dbc37cf47cf1f
Gerrit-Change-Number: 10357
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Juan Yu 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 17 May 2018 01:32:02 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6827: [DOCS] Updated the download link for the tutorial data

2018-05-16 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10393 )

Change subject: IMPALA-6827: [DOCS] Updated the download link for the tutorial 
data
..


Patch Set 6:

> And I believe the changes were merged in another review.

Thanks for letting me know.

> Can we just leave them here for now?

No.

The changes already went in here: https://gerrit.cloudera.org/#/c/10401/

Since this patch is trying to re-introduce them, a conflict will arise when the 
merge is attempted. Strictly speaking, Gerrit will not be able to rebase before 
cherry picking. In any case, it would be impossible for this commit to land 
as-is. You'll have to remove the impala_create_table and impala_kudu changes.

I suggest you "git fetch asf-gerrit" and "git rebase -i asf-gerrit/master". 
During the rebase you can fix the conflicts. What you ultimately want is to 
have a new commit that doesn't introduce any changes to impala_kudu.xml or 
impala_create_table.xml.


--
To view, visit http://gerrit.cloudera.org:8080/10393
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6823d1688169e0a6f09d5b552026bc18a3770828
Gerrit-Change-Number: 10393
Gerrit-PatchSet: 6
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Harsh J 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Brown 
Gerrit-Comment-Date: Thu, 17 May 2018 00:47:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5737: Tighten minicluster memory limit

2018-05-16 Thread Philip Zeyliger (Code Review)
Philip Zeyliger has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10277 )

Change subject: IMPALA-5737: Tighten minicluster memory limit
..


Patch Set 2:

(6 comments)

Sorry for the delay; I missed that you'd updated this. Feel free to ping me 
sooner in the future!

My stylistic nit is that it would be better if the variables had the unit in 
their names, so renaming everything to ..._MB.

http://gerrit.cloudera.org:8080/#/c/10277/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/10277/2//COMMIT_MSG@9
PS2, Line 9: This patch limits minicluster memory based on the memory avaliable.
available


http://gerrit.cloudera.org:8080/#/c/10277/2/bin/impala-config.sh
File bin/impala-config.sh:

http://gerrit.cloudera.org:8080/#/c/10277/2/bin/impala-config.sh@634
PS2, Line 634: if [ -z "${MINI_CLUSTER_MEM_LIMIT+x}" ]; then
What's the "+x" doing here? I looked up what this was doing, but it may be 
worthwhile to add a comment like:
# If MINI_CLUSTER_MEM_LIMIT isn't set, configure it based on system memory.

In other places, we divide this sort of thing into a X_OVERRIDE and X variable 
to avoid issues when jumping between source directories, but it probably 
doesn't matter too much here.

Please consider renaming this to MINI_CLUSTER_MEM_LIMIT_MB so that it's obvious 
what unit we're talking about.


http://gerrit.cloudera.org:8080/#/c/10277/2/bin/impala-config.sh@636
PS2, Line 636: $(($(cat /proc/meminfo | grep -oP 
"MemAvailable:[[:space:]]*\K\d*")/1024))
I did not know about \K!


http://gerrit.cloudera.org:8080/#/c/10277/2/bin/impala-config.sh@639
PS2, Line 639: CGROUP_LIMIT=$(cat 
/sys/fs/cgroup/memory/memory.limit_in_bytes)
This is the current process's cgroup? Pointer to how this works?

Technically cgroups could be mounted in a different location; what environments 
does this work in?


http://gerrit.cloudera.org:8080/#/c/10277/2/bin/impala-config.sh@654
PS2, Line 654:   export IMPALAD_MEM_LIMIT=6144
Ideally all these names have units like _MB appended to them.


http://gerrit.cloudera.org:8080/#/c/10277/2/bin/start-impala-cluster.py
File bin/start-impala-cluster.py:

http://gerrit.cloudera.org:8080/#/c/10277/2/bin/start-impala-cluster.py@232
PS2, Line 232:   env_mem_limit = os.environ.get('IMPALAD_MEM_LIMIT')
It's somewhat problematic that we now have multiple ways to do:

export 
TEST_START_CLUSTER_ARGS="--impalad_args=--mem_limit=$IMPALAD_MEM_LIMIT_BYTES"

On balance, I think this is fine. Variables like TEST_START_CLUSTER_ARGS are 
really arrays, and it's hard to figure out a good way to consolidate them.



--
To view, visit http://gerrit.cloudera.org:8080/10277
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If8240551e726c6da546a926a1ce3444f41ef87fe
Gerrit-Change-Number: 10277
Gerrit-PatchSet: 2
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Comment-Date: Thu, 17 May 2018 00:41:00 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6827: [DOCS] Updated the download link for the tutorial data

2018-05-16 Thread Alex Rodoni (Code Review)
Alex Rodoni has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10393 )

Change subject: IMPALA-6827: [DOCS] Updated the download link for the tutorial 
data
..


Patch Set 4:

> (3 comments)
 >
 > > I accidentally committed from a wrong branch and combined 2
 > > changes, here.
 > >
 > > Let me know if you want me to generate a html for kudu_tutorial
 > for
 > > review.
 >
 > I didn't see anything about kudu_tutorial here, just something
 > about Kudu Tablet server replicas. Is that content relevant for
 > both 3.x and 2.x? If it isn't, you need to separate out the
 > commits, because the large change to impala_tutorial.xml can
 > definitely go into 2.x

impala_kudu.xml and impala_create_table.xml have the approved changes to go 
into all branches. And I believe the changes were merged in another review. Can 
we just leave them here for now?


--
To view, visit http://gerrit.cloudera.org:8080/10393
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6823d1688169e0a6f09d5b552026bc18a3770828
Gerrit-Change-Number: 10393
Gerrit-PatchSet: 4
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Harsh J 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Brown 
Gerrit-Comment-Date: Thu, 17 May 2018 00:32:17 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6827: [DOCS] Updated the download link for the tutorial data

2018-05-16 Thread Alex Rodoni (Code Review)
Alex Rodoni has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10393 )

Change subject: IMPALA-6827: [DOCS] Updated the download link for the tutorial 
data
..


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/10393/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/10393/4//COMMIT_MSG@7
PS4, Line 7: IMPALA-6827: [DOCS] Updated the download link for the tutorial data
> Commit message has duplicate info. Please clean up the duplication.
Done


http://gerrit.cloudera.org:8080/#/c/10393/4/docs/topics/impala_tutorial.xml
File docs/topics/impala_tutorial.xml:

http://gerrit.cloudera.org:8080/#/c/10393/4/docs/topics/impala_tutorial.xml@1665
PS4, Line 1665: fi
> Was this the actual output or was it a mispaste?
Fixed



--
To view, visit http://gerrit.cloudera.org:8080/10393
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6823d1688169e0a6f09d5b552026bc18a3770828
Gerrit-Change-Number: 10393
Gerrit-PatchSet: 4
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Harsh J 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Brown 
Gerrit-Comment-Date: Thu, 17 May 2018 00:30:24 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6827: [DOCS] Updated the download link for the tutorial data

2018-05-16 Thread Alex Rodoni (Code Review)
Hello Michael Brown, Jim Apple, Harsh J, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10393

to look at the new patch set (#6).

Change subject: IMPALA-6827: [DOCS] Updated the download link for the tutorial 
data
..

IMPALA-6827: [DOCS] Updated the download link for the tutorial data

Updated the link to download the Parquet airline files for tutorial.

Change-Id: I6823d1688169e0a6f09d5b552026bc18a3770828
---
M docs/topics/impala_create_table.xml
M docs/topics/impala_kudu.xml
M docs/topics/impala_tutorial.xml
3 files changed, 488 insertions(+), 643 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/93/10393/6
--
To view, visit http://gerrit.cloudera.org:8080/10393
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6823d1688169e0a6f09d5b552026bc18a3770828
Gerrit-Change-Number: 10393
Gerrit-PatchSet: 6
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Harsh J 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Brown 


[Impala-ASF-CR] IMPALA-6827: [DOCS] Updated the download link for the tutorial data

2018-05-16 Thread Alex Rodoni (Code Review)
Hello Michael Brown, Jim Apple, Harsh J, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10393

to look at the new patch set (#5).

Change subject: IMPALA-6827: [DOCS] Updated the download link for the tutorial 
data
..

IMPALA-6827: [DOCS] Updated the download link for the tutorial data

Updated the link to download the Parquet airline files for tutorial.

Change-Id: I6823d1688169e0a6f09d5b552026bc18a3770828
---
M docs/topics/impala_create_table.xml
M docs/topics/impala_kudu.xml
M docs/topics/impala_tutorial.xml
3 files changed, 488 insertions(+), 643 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/93/10393/5
--
To view, visit http://gerrit.cloudera.org:8080/10393
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6823d1688169e0a6f09d5b552026bc18a3770828
Gerrit-Change-Number: 10393
Gerrit-PatchSet: 5
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Harsh J 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Brown 


[Impala-ASF-CR] IMPALA-6827: [DOCS] Updated the download link for the tutorial data

2018-05-16 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10393 )

Change subject: IMPALA-6827: [DOCS] Updated the download link for the tutorial 
data
..


Patch Set 4:

(3 comments)

> I accidentally committed from a wrong branch and combined 2
 > changes, here.
 >
 > Let me know if you want me to generate a html for kudu_tutorial for
 > review.

I didn't see anything about kudu_tutorial here, just something about Kudu 
Tablet server replicas. Is that content relevant for both 3.x and 2.x? If it 
isn't, you need to separate out the commits, because the large change to 
impala_tutorial.xml can definitely go into 2.x

http://gerrit.cloudera.org:8080/#/c/10393/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/10393/4//COMMIT_MSG@7
PS4, Line 7: IMPALA-6827: [DOCS] Updated the download link for the tutorial data
Commit message has duplicate info. Please clean up the duplication.


http://gerrit.cloudera.org:8080/#/c/10393/4/docs/topics/impala_tutorial.xml
File docs/topics/impala_tutorial.xml:

http://gerrit.cloudera.org:8080/#/c/10393/4/docs/topics/impala_tutorial.xml@1665
PS4, Line 1665: fi
Was this the actual output or was it a mispaste?


http://gerrit.cloudera.org:8080/#/c/10393/4/docs/topics/impala_tutorial.xml@1823
PS4, Line 1823: GROUOP
Typo



-- 
To view, visit http://gerrit.cloudera.org:8080/10393
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6823d1688169e0a6f09d5b552026bc18a3770828
Gerrit-Change-Number: 10393
Gerrit-PatchSet: 4
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Harsh J 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Brown 
Gerrit-Comment-Date: Thu, 17 May 2018 00:22:47 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6827: [DOCS] Updated the download link for the tutorial data

2018-05-16 Thread Alex Rodoni (Code Review)
Alex Rodoni has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10393 )

Change subject: IMPALA-6827: [DOCS] Updated the download link for the tutorial 
data
..


Patch Set 4:

I accidentally committed from a wrong branch and combined 2 changes, here.

Let me know if you want me to generate a html for kudu_tutorial for review.


--
To view, visit http://gerrit.cloudera.org:8080/10393
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6823d1688169e0a6f09d5b552026bc18a3770828
Gerrit-Change-Number: 10393
Gerrit-PatchSet: 4
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Harsh J 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Brown 
Gerrit-Comment-Date: Thu, 17 May 2018 00:02:54 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6827: [DOCS] Updated the download link for the tutorial data

2018-05-16 Thread Alex Rodoni (Code Review)
Hello Michael Brown, Jim Apple, Harsh J, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10393

to look at the new patch set (#4).

Change subject: IMPALA-6827: [DOCS] Updated the download link for the tutorial 
data
..

IMPALA-6827: [DOCS] Updated the download link for the tutorial data

Updated the link to download the Parquet airline files for tutorial.

Change-Id: I6823d1688169e0a6f09d5b552026bc18a3770828

IMPALA-6827: [DOCS] Updated the download link for the tutorial data

Updated the link to download the Parquet airline files for tutorial.

Change-Id: I6823d1688169e0a6f09d5b552026bc18a3770828
---
M docs/topics/impala_create_table.xml
M docs/topics/impala_kudu.xml
M docs/topics/impala_tutorial.xml
3 files changed, 488 insertions(+), 643 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/93/10393/4
--
To view, visit http://gerrit.cloudera.org:8080/10393
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6823d1688169e0a6f09d5b552026bc18a3770828
Gerrit-Change-Number: 10393
Gerrit-PatchSet: 4
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Harsh J 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Brown 


[Impala-ASF-CR] IMPALA-7017: deflake/fix test catalog restart test

2018-05-16 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10397 )

Change subject: IMPALA-7017: deflake/fix test_catalog_restart test
..


Patch Set 1: Code-Review+2

This looks good to me.


--
To view, visit http://gerrit.cloudera.org:8080/10397
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iab6edb01f0bd7f5408cfef28fd05fdc95fb78469
Gerrit-Change-Number: 10397
Gerrit-PatchSet: 1
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Comment-Date: Wed, 16 May 2018 23:56:19 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6827: [DOCS] Updated the download link for the tutorial data

2018-05-16 Thread Alex Rodoni (Code Review)
Hello Michael Brown, Jim Apple, Harsh J, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10393

to look at the new patch set (#3).

Change subject: IMPALA-6827: [DOCS] Updated the download link for the tutorial 
data
..

IMPALA-6827: [DOCS] Updated the download link for the tutorial data

Updated the link to download the Parquet airline files for tutorial.

Change-Id: I6823d1688169e0a6f09d5b552026bc18a3770828
---
M docs/topics/impala_tutorial.xml
1 file changed, 3 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/93/10393/3
--
To view, visit http://gerrit.cloudera.org:8080/10393
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6823d1688169e0a6f09d5b552026bc18a3770828
Gerrit-Change-Number: 10393
Gerrit-PatchSet: 3
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Harsh J 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Brown 


[Impala-ASF-CR] IMPALA-6827: [DOCS] Updated the download link for the tutorial data

2018-05-16 Thread Alex Rodoni (Code Review)
Alex Rodoni has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10393 )

Change subject: IMPALA-6827: [DOCS] Updated the download link for the tutorial 
data
..


Patch Set 1:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/10393/1/docs/topics/impala_tutorial.xml
File docs/topics/impala_tutorial.xml:

http://gerrit.cloudera.org:8080/#/c/10393/1/docs/topics/impala_tutorial.xml@1611
PS1, Line 1611: 
airlines_parquet/93459d994898a9ba-77674173b331fa9a_2073981944_data.0.parq
> File names are different now.
Done


http://gerrit.cloudera.org:8080/#/c/10393/1/docs/topics/impala_tutorial.xml@1622
PS1, Line 1622: $ du -kch *.parq
> The output below is now different.
Done


http://gerrit.cloudera.org:8080/#/c/10393/1/docs/topics/impala_tutorial.xml@1655
PS1, Line 1655: -rw-r--r--   3 jrussell supergroup  265107489 2015-08-12 17:18 
/user/impala/staging/airlines/93459d994898a9ba-77674173b331fa96_2118228804_data.0.parq
> This output will be different.
Done


http://gerrit.cloudera.org:8080/#/c/10393/1/docs/topics/impala_tutorial.xml@1689
PS1, Line 1689: WARNINGS: Impala does not have READ_WRITE access to path 
'hdfs://demo_host.example.com:8020/user/impala/staging'
> When I ran a similar command, I didn't get this warning.
I got the warning in the nightly cluster.


http://gerrit.cloudera.org:8080/#/c/10393/1/docs/topics/impala_tutorial.xml@1691
PS1, Line 1691:
> It's kind of a shame the tutorial does not urge the reader to run COMPUTE S
Will add it in the next iteration of updates.


http://gerrit.cloudera.org:8080/#/c/10393/1/docs/topics/impala_tutorial.xml@1722
PS1, Line 1722: 93459d994898a9ba
> Output will be different here and elsewhere.
Done


http://gerrit.cloudera.org:8080/#/c/10393/1/docs/topics/impala_tutorial.xml@1735
PS1, Line 1735: inferred from: optional int32 year
> I get a different output here. All the comments say "Inferred from Parquet
Done


http://gerrit.cloudera.org:8080/#/c/10393/1/docs/topics/impala_tutorial.xml@1851
PS1, Line 1851: [localhost:21000] > select ndv(carrier), 
ndv(flight_num), ndv(tail_num),
  :   >   ndv(origin), ndv(dest) from 
airlines_external;
  : 
+--+-+---+-+---+
  : | ndv(carrier) | ndv(flight_num) | ndv(tail_num) | ndv(origin) 
| ndv(dest) |
  : 
+--+-+---+-+---+
  : | 29   | 9086| 3 | 340 
| 347   |
  : 
+--+-+---+-+---+
> Awesome. I get slightly different numbers here!
I got these numbers.


http://gerrit.cloudera.org:8080/#/c/10393/1/docs/topics/impala_tutorial.xml@2191
PS1, Line 2191:   > STORED AS PARQUET
  :   > PARTITIONED BY (year INT);
> I had to reverse these two to make it work. The docs agree.
Done


http://gerrit.cloudera.org:8080/#/c/10393/1/docs/topics/impala_tutorial.xml@2268
PS1, Line 2268: | 1987  | 1311826   | 1  | 9.32MB   | NOT CACHED   | NOT 
CACHED | PARQUET | true
> This output is different.
Done



--
To view, visit http://gerrit.cloudera.org:8080/10393
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6823d1688169e0a6f09d5b552026bc18a3770828
Gerrit-Change-Number: 10393
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Harsh J 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Brown 
Gerrit-Comment-Date: Wed, 16 May 2018 23:47:22 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5168: Codegen HASH PARTITIONED KrpcDataStreamSender::Send()

2018-05-16 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10421 )

Change subject: IMPALA-5168: Codegen HASH_PARTITIONED 
KrpcDataStreamSender::Send()
..


Patch Set 1:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/10421/1/be/src/runtime/krpc-data-stream-sender.cc
File be/src/runtime/krpc-data-stream-sender.cc:

http://gerrit.cloudera.org:8080/#/c/10421/1/be/src/runtime/krpc-data-stream-sender.cc@718
PS1, Line 718:  // Load the expression evaluator for the i-th partition 
expression
 : llvm::Function* get_expr_eval_fn =
 : 
codegen->GetFunction(IRFunction::KRPC_DSS_GET_PART_EXPR_EVAL, false);
 : DCHECK(get_expr_eval_fn != nullptr);
 : llvm::Value* expr_eval_arg =
 : builder.CreateCall(get_expr_eval_fn, {this_arg, 
codegen->GetI32Constant(i)});
can we use Builder.CreateExtractValue() and get rid of this function call.
(Looks like something that the llvm optimizer could have inlined but didn't 
according to the sample generated code above)


http://gerrit.cloudera.org:8080/#/c/10421/1/be/src/runtime/krpc-data-stream-sender.cc@803
PS1, Line 803: partition_type_ == TPartitionType::HASH_PARTITIONED
nit: Invert condition to avoid indentation.

if (partition_type_ != TPartitionType::HASH_PARTITIONED) {
const string& msg = Substitute("not $0",
partition_type_ == TPartitionType::KUDU ? "supported" : "needed");
profile()->AddCodegenMsg(false, msg, sender_name);
return;
}

llvm::Function* hash_row_fn;
codegen_status = CodegenHashRow(codegen, _row_fn);
.
.
.


http://gerrit.cloudera.org:8080/#/c/10421/1/be/src/runtime/krpc-data-stream-sender.cc@818
PS1, Line 818: HashRows
nit: HashRow()


http://gerrit.cloudera.org:8080/#/c/10421/1/be/src/runtime/krpc-data-stream-sender.cc@820
PS1, Line 820: KrpcDataStreamSender7HashRowEPNS_8TupleRowE
nit: maybe use a named const like
const char* HASH_ROW_SYMBOL


http://gerrit.cloudera.org:8080/#/c/10421/1/testdata/workloads/functional-query/queries/QueryTest/disable-codegen.test
File testdata/workloads/functional-query/queries/QueryTest/disable-codegen.test:

http://gerrit.cloudera.org:8080/#/c/10421/1/testdata/workloads/functional-query/queries/QueryTest/disable-codegen.test@61
PS1, Line 61:  QUERY
add cases for "codegen not supported/needed"



--
To view, visit http://gerrit.cloudera.org:8080/10421
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1c44cc9312c062cc7a5a4ac9156ceaa31fb887ff
Gerrit-Change-Number: 10421
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Comment-Date: Wed, 16 May 2018 23:42:33 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7023: Wait for fragments to finish for test insert.py

2018-05-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/10426 )

Change subject: IMPALA-7023: Wait for fragments to finish for test_insert.py
..

IMPALA-7023: Wait for fragments to finish for test_insert.py

The arrangement of tests in test_insert.py changed with
IMPALA-7010, splitting out the memory limit tests into
test_insert_mem_limit(). On exhaustive, the combination
of test dimensions means test_insert_mem_limit() executes
11 different combinations. Each of these statements can
use a large amount of memory and this is not cleaned
up immediately. This has been causing
test_insert_overwrite(), which immediately follows
test_insert_mem_limit(), to hit the process memory limit.

This changes test_insert_mem_limit() to make it wait
for its fragments to finish.

Change-Id: I5642e9cb32dd02afd74dde7e0d3b31bddbee3ccd
Reviewed-on: http://gerrit.cloudera.org:8080/10426
Reviewed-by: Philip Zeyliger 
Tested-by: Impala Public Jenkins 
---
M tests/query_test/test_insert.py
1 file changed, 8 insertions(+), 0 deletions(-)

Approvals:
  Philip Zeyliger: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/10426
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I5642e9cb32dd02afd74dde7e0d3b31bddbee3ccd
Gerrit-Change-Number: 10426
Gerrit-PatchSet: 2
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 


[Impala-ASF-CR] IMPALA-7023: Wait for fragments to finish for test insert.py

2018-05-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10426 )

Change subject: IMPALA-7023: Wait for fragments to finish for test_insert.py
..


Patch Set 1: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/10426
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5642e9cb32dd02afd74dde7e0d3b31bddbee3ccd
Gerrit-Change-Number: 10426
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Wed, 16 May 2018 23:33:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5737: Tighten minicluster memory limit

2018-05-16 Thread Tianyi Wang (Code Review)
Tianyi Wang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10277 )

Change subject: IMPALA-5737: Tighten minicluster memory limit
..


Patch Set 2:

Ping


--
To view, visit http://gerrit.cloudera.org:8080/10277
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If8240551e726c6da546a926a1ce3444f41ef87fe
Gerrit-Change-Number: 10277
Gerrit-PatchSet: 2
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Comment-Date: Wed, 16 May 2018 23:06:29 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7043: HBase split failure should not fail dataload

2018-05-16 Thread Joe McDonnell (Code Review)
Joe McDonnell has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/10437


Change subject: IMPALA-7043: HBase split failure should not fail dataload
..

IMPALA-7043: HBase split failure should not fail dataload

HBase splitting can fail due to changes in HBase code. It
is useful to still do tests even if HBase splitting failed.
As it is today, buildall.sh will abort if
create-load-data.sh's invocation of split-hbase.sh fails.
No tests run, even though the HBase splitting affects only
a small portion of our tests.

This changes create-load-data.sh to keep going with
dataload if HBase splitting fails. It outputs the same
errors to the log as it would before this change.
It adds a message to explain that it is ignoring
the failure and there may be related test failures.

Change-Id: I7497fe8c9f1655a34b2743462d8b7248eb94554e
---
M testdata/bin/create-load-data.sh
1 file changed, 5 insertions(+), 1 deletion(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/37/10437/1
--
To view, visit http://gerrit.cloudera.org:8080/10437
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I7497fe8c9f1655a34b2743462d8b7248eb94554e
Gerrit-Change-Number: 10437
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 


[Impala-ASF-CR] IMPALA-6034: Add Cpu and scanned bytes limits per query

2018-05-16 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10415 )

Change subject: IMPALA-6034: Add Cpu and scanned bytes limits per query
..


Patch Set 1:

(5 comments)

just a few more cleanup asks

http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/service/impala-server.cc@1015
PS1, Line 1015: now + (1000L)
maybe have a named constant for this or a default constructor specifically for 
resource expiration events with a comment that resources checks are done every 
second.


http://gerrit.cloudera.org:8080/#/c/10415/1/common/thrift/ImpalaInternalService.thrift
File common/thrift/ImpalaInternalService.thrift:

http://gerrit.cloudera.org:8080/#/c/10415/1/common/thrift/ImpalaInternalService.thrift@286
PS1, Line 286: Scan bytes limit, after which a query will be cancelled
nit: "See comment in ImpalaService.thrift."  would work too


http://gerrit.cloudera.org:8080/#/c/10415/1/common/thrift/ImpalaInternalService.thrift@289
PS1, Line 289: CPU time limit in seconds, after which a query will be cancelled
same here


http://gerrit.cloudera.org:8080/#/c/10415/1/tests/custom_cluster/test_query_expiration.py
File tests/custom_cluster/test_query_expiration.py:

http://gerrit.cloudera.org:8080/#/c/10415/1/tests/custom_cluster/test_query_expiration.py@179
PS1, Line 179: client.execute("SET MAX_CPU_TIME_S=1")
 : client.execute("SET MAX_SCAN_BYTES=10G")
can use this instead to set multiple options:
set_configuration(self, config_option_dict)


http://gerrit.cloudera.org:8080/#/c/10415/1/tests/custom_cluster/test_query_expiration.py@182
PS1, Line 182: client.execute("SET MAX_CPU_TIME_S=0")
 : client.execute("SET MAX_SCAN_BYTES=0")
similarly can use this instead:

clear_configuration()



--
To view, visit http://gerrit.cloudera.org:8080/10415
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c6015e21da684bb9f33e236d71309dd4c178a20
Gerrit-Change-Number: 10415
Gerrit-PatchSet: 1
Gerrit-Owner: Mostafa Mokhtar 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 16 May 2018 21:07:40 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] Default ASAN options in codebase.

2018-05-16 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10404 )

Change subject: Default ASAN options in codebase.
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10404/2//COMMIT_MSG
Commit Message:

PS2:
Should we do this for the other sanitizers, too?

If so, should that be in this patch or another one?



--
To view, visit http://gerrit.cloudera.org:8080/10404
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3cbbd210c67750a48003f336bea1f3e1cb2d9e6b
Gerrit-Change-Number: 10404
Gerrit-PatchSet: 2
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 16 May 2018 20:54:53 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7019: Schedule EC as remote & disable failed tests

2018-05-16 Thread Tianyi Wang (Code Review)
Tianyi Wang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10413 )

Change subject: IMPALA-7019: Schedule EC as remote & disable failed tests
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10413/1/tests/custom_cluster/test_hdfs_fd_caching.py
File tests/custom_cluster/test_hdfs_fd_caching.py:

http://gerrit.cloudera.org:8080/#/c/10413/1/tests/custom_cluster/test_hdfs_fd_caching.py@29
PS1, Line 29: @SkipIfEC.remote_read
> Is this really expected to work? FD caching is only applied to local sc rea
Thanks for pointing it out. I changed it to SkipIfEC.remote_read



--
To view, visit http://gerrit.cloudera.org:8080/10413
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84
Gerrit-Change-Number: 10413
Gerrit-PatchSet: 2
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Comment-Date: Wed, 16 May 2018 20:17:06 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7019: Schedule EC as remote & disable failed tests

2018-05-16 Thread Tianyi Wang (Code Review)
Tianyi Wang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10413 )

Change subject: IMPALA-7019: Schedule EC as remote & disable failed tests
..


Patch Set 2:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/10413/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/10413/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@439
PS1, Line 439: // the block location API.
> Shouldn't this check be in L424 so that synthesizeFileMd is true for EC? Ar
- Files are enumerated at L430. L424 is in partition scope.
- In the latest patch set only the replica is substituted with a remote 
address. Block offset and length are preserved.
- Done


http://gerrit.cloudera.org:8080/#/c/10413/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@510
PS1, Line 510: continue;
> move check to L488?
Files are enumerated at L502. L488 is in partition scope.


http://gerrit.cloudera.org:8080/#/c/10413/1/tests/common/skip.py
File tests/common/skip.py:

http://gerrit.cloudera.org:8080/#/c/10413/1/tests/common/skip.py@149
PS1, Line 149:   remote_read = pytest.mark.skipif(IS_EC, reason="EC files are 
read remotely and "
> Any more concrete reason, or are there too many to list?
I haven't scrutinized all the failed tests yet. I think this list will grow 
later.


http://gerrit.cloudera.org:8080/#/c/10413/1/tests/query_test/test_mt_dop.py
File tests/query_test/test_mt_dop.py:

http://gerrit.cloudera.org:8080/#/c/10413/1/tests/query_test/test_mt_dop.py@101
PS1, Line 101:   # Impala scans fewer row groups than it should with erasure 
coding.
> fewer row groups
Done



--
To view, visit http://gerrit.cloudera.org:8080/10413
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84
Gerrit-Change-Number: 10413
Gerrit-PatchSet: 2
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Comment-Date: Wed, 16 May 2018 20:16:05 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7019: Schedule EC as remote & disable failed tests

2018-05-16 Thread Tianyi Wang (Code Review)
Tianyi Wang has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/10413 )

Change subject: IMPALA-7019: Schedule EC as remote & disable failed tests
..

IMPALA-7019: Schedule EC as remote & disable failed tests

This patch schedules HDFS EC files without considering locality. Failed
tests are disabled and a jenkins build should succeed with export
ERASURE_COINDG=true.

Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84
---
M common/fbs/CatalogObjects.fbs
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M tests/common/skip.py
M tests/custom_cluster/test_hdfs_fd_caching.py
M tests/query_test/test_hdfs_caching.py
M tests/query_test/test_insert.py
M tests/query_test/test_insert_parquet.py
M tests/query_test/test_mt_dop.py
M tests/query_test/test_nested_types.py
M tests/query_test/test_queries.py
M tests/query_test/test_query_mem_limit.py
M tests/query_test/test_scanners.py
M tests/util/filesystem_utils.py
15 files changed, 79 insertions(+), 17 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/13/10413/2
--
To view, visit http://gerrit.cloudera.org:8080/10413
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84
Gerrit-Change-Number: 10413
Gerrit-PatchSet: 2
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Alex Behm 


[Impala-ASF-CR] IMPALA-7023: Wait for fragments to finish for test insert.py

2018-05-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10426 )

Change subject: IMPALA-7023: Wait for fragments to finish for test_insert.py
..


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2489/


--
To view, visit http://gerrit.cloudera.org:8080/10426
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5642e9cb32dd02afd74dde7e0d3b31bddbee3ccd
Gerrit-Change-Number: 10426
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Wed, 16 May 2018 20:14:07 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6995: avoid DCHECK in TimestampParse::Parse()

2018-05-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/10349 )

Change subject: IMPALA-6995: avoid DCHECK in TimestampParse::Parse()
..

IMPALA-6995: avoid DCHECK in TimestampParse::Parse()

The bug was that the string's length is checked before trimming leading
and trailing spaces instead of afterwards. The bug has been present for
a long time but couldn't hit a DCHECK until recently.

Testing:
Added some backend tests that reproduce the crash.

Change-Id: I02a18ffd8893fe74f5830144300f745ce31477b1
Reviewed-on: http://gerrit.cloudera.org:8080/10349
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins 
---
M be/src/exprs/expr-test.cc
M be/src/runtime/timestamp-parse-util.cc
2 files changed, 56 insertions(+), 51 deletions(-)

Approvals:
  Tim Armstrong: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/10349
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I02a18ffd8893fe74f5830144300f745ce31477b1
Gerrit-Change-Number: 10349
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kim Jin Chul 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6995: avoid DCHECK in TimestampParse::Parse()

2018-05-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10349 )

Change subject: IMPALA-6995: avoid DCHECK in TimestampParse::Parse()
..


Patch Set 4: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/10349
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I02a18ffd8893fe74f5830144300f745ce31477b1
Gerrit-Change-Number: 10349
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kim Jin Chul 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 16 May 2018 20:02:19 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6034: Add Cpu and scanned bytes limits per query

2018-05-16 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10415 )

Change subject: IMPALA-6034: Add Cpu and scanned bytes limits per query
..


Patch Set 1:

(29 comments)

I have one major concern about the coordinator locking that we need to fix then 
a bunch of cleanup asks.

http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator-backend-state.h
File be/src/runtime/coordinator-backend-state.h:

http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator-backend-state.h@115
PS1, Line 115: Cpu
Maybe clarify:

  /// Return total user CPU consumption across all backends.


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator-backend-state.h@118
PS1, Line 118:   /// Return total sys Cpu.
/// Return total system CPU consumption across all backends.


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator-backend-state.h@121
PS1, Line 121:   /// Return total scanned bytes.
/// Return total scanned bytes (from HDFS or HDFS-compatible filesystem) across 
all backends.


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator-backend-state.h@124
PS1, Line 124:   void GetBackendResourceUsage(int64_t& user_cpu, int64_t& 
sys_cpu, int64_t& scanned_bytes,
It would be more readable if this returned a struct. 4 output arguments with 
the same type makes it easy to mess up argument order.


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator-backend-state.h@125
PS1, Line 125: int64_t& peak_mem
nit: use pointers for output parameters


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator-backend-state.h@206
PS1, Line 206: /// total scan ranges complete across all scan nodes
maybe add "for this backend"


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator-backend-state.h@209
PS1, Line 209: /// total user cpu consumed
maybe add "for this backend"


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator-backend-state.h@213
PS1, Line 213: int64_t cpu_sys_ = 0;
maybe add "for this backend"


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator-backend-state.h@281
PS1, Line 281:   /// total scan ranges complete across all scan nodes
maybe add "across all backends. Updated by AggregateBackendStats()."


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator-backend-state.h@284
PS1, Line 284:   /// total user cpu consumed
maybe add "across all backends. Updated by AggregateBackendStats()."


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator-backend-state.h@287
PS1, Line 287:   /// total system cpu consumed
maybe add "across all backends. Updated by AggregateBackendStats()."


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator.h
File be/src/runtime/coordinator.h:

http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator.h@169
PS1, Line 169: int64_t&
nit: use pointer for output parameters


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator.h@169
PS1, Line 169: void
Maybe return a struct instead of multiple output arguments?


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator.cc@721
PS1, Line 721:
 :   int64_t total_cpu =0, total_scanned_bytes =0, backend_user_cpu 
= 0,
 :   backend_sys_cpu = 0,backend_bytes_read = 0, peak_memory = 
0;
nit: various whitespace inconsistencies


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator.cc@726
PS1, Line 726:
nit: don't need some of this vertical whitespace


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator.cc@751
PS1, Line 751:   query_profile_->AddInfoString("Total CPU time", 
cpu_total_info.str());
nit: capitalisation is a bit inconsistent - I think everything should be title 
case, i.e. "Read", "Time"


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator.cc@755
PS1, Line 755:   query_profile_->AddInfoString("Per Node User time", 
cpu_user_info.str());
Any reason why we don't show the aggregate user and system CPU time across all 
backends? I guess people will look at the top-level number then only drill down 
in rare cases?


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator.cc@802
PS1, Line 802: total_user_cpu += backend_state->GetUserCpu();
Should we modify the BackendState functions so that we can get all three values 
with a single lock acquisition, e.g. just have it return a 
BackendResourceUtilization struct with all three elements?


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/service/impala-server.h
File be/src/service/impala-server.h:

http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/service/impala-server.h@1041
PS1, Line 1041: int64_t max_cpu_time_ns;
I think we 

[Impala-ASF-CR] IMPALA-7023: Wait for fragments to finish for test insert.py

2018-05-16 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10426 )

Change subject: IMPALA-7023: Wait for fragments to finish for test_insert.py
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10426/1/tests/query_test/test_insert.py
File tests/query_test/test_insert.py:

http://gerrit.cloudera.org:8080/#/c/10426/1/tests/query_test/test_insert.py@135
PS1, Line 135:   v.wait_for_metric("impala-server.num-fragments-in-flight", 
0, timeout=60)
> This is the 8th invocation of this line. Do we want to add
I've been able to run one exhaustive run with this along with some local tests, 
and I think I want to have a few more exhaustive runs verifying this fix before 
I refactor the code. I will file a follow-up JIRA.



--
To view, visit http://gerrit.cloudera.org:8080/10426
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5642e9cb32dd02afd74dde7e0d3b31bddbee3ccd
Gerrit-Change-Number: 10426
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Wed, 16 May 2018 18:53:39 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7035: Configure jceks.key.serialFilter for KMS.

2018-05-16 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10418 )

Change subject: IMPALA-7035: Configure jceks.key.serialFilter for KMS.
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10418/1/testdata/cluster/node_templates/cdh6/etc/init.d/kms
File testdata/cluster/node_templates/cdh6/etc/init.d/kms:

http://gerrit.cloudera.org:8080/#/c/10418/1/testdata/cluster/node_templates/cdh6/etc/init.d/kms@32
PS1, Line 32: 
-Djceks.key.serialFilter=org.apache.hadoop.crypto.key.JavaKeyStoreProvider*"
> Extra Java system properties are definitely kosher.
That makes sense. Can you add a comment that you are omitting some of the 
defaults because kms doesn't need them?

Do we need something equivalent in the CDH5 version? 
(testdata/cluster/node_templates/cdh5/etc/init.d/kms)



--
To view, visit http://gerrit.cloudera.org:8080/10418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2d21c9cce3b91e8fd8b2b4f1cda75e3958c977d5
Gerrit-Change-Number: 10418
Gerrit-PatchSet: 1
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Wed, 16 May 2018 18:38:48 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7035: Configure jceks.key.serialFilter for KMS.

2018-05-16 Thread Philip Zeyliger (Code Review)
Philip Zeyliger has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10418 )

Change subject: IMPALA-7035: Configure jceks.key.serialFilter for KMS.
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10418/1/testdata/cluster/node_templates/cdh6/etc/init.d/kms
File testdata/cluster/node_templates/cdh6/etc/init.d/kms:

http://gerrit.cloudera.org:8080/#/c/10418/1/testdata/cluster/node_templates/cdh6/etc/init.d/kms@32
PS1, Line 32: 
-Djceks.key.serialFilter=org.apache.hadoop.crypto.key.JavaKeyStoreProvider*"
> I assume older JVMs ignore this?
Extra Java system properties are definitely kosher.

I started out with including everything, but I realized that, as far as I can 
tell, KMS only uses its own key, which makes sense. This is scoped to one 
daemon.

The upstream Hadoop proposal currently has the long list, but I don't think 
it's necessary, so decided to shorten it. There are also quoting problems with 
"!*" and inner classes ($Foo). To be honest, I didn't test all the possible 
variants, but this one seems to work consistently.



--
To view, visit http://gerrit.cloudera.org:8080/10418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2d21c9cce3b91e8fd8b2b4f1cda75e3958c977d5
Gerrit-Change-Number: 10418
Gerrit-PatchSet: 1
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Wed, 16 May 2018 18:26:56 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7035: Configure jceks.key.serialFilter for KMS.

2018-05-16 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10418 )

Change subject: IMPALA-7035: Configure jceks.key.serialFilter for KMS.
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10418/1/testdata/cluster/node_templates/cdh6/etc/init.d/kms
File testdata/cluster/node_templates/cdh6/etc/init.d/kms:

http://gerrit.cloudera.org:8080/#/c/10418/1/testdata/cluster/node_templates/cdh6/etc/init.d/kms@32
PS1, Line 32: 
-Djceks.key.serialFilter=org.apache.hadoop.crypto.key.JavaKeyStoreProvider*"
I assume older JVMs ignore this?

Also, the description for the default jceks.key.serialFilter is:
"""
The default pattern allows java.lang.Enum, java.security.KeyRep, 
java.security.KeyRep$Type, and javax.crypto.spec.SecretKeySpec but rejects all 
the others.
"""
Do we need any of these others?

>From here:
http://www.oracle.com/technetwork/java/javase/8u171-relnotes-430.html



--
To view, visit http://gerrit.cloudera.org:8080/10418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2d21c9cce3b91e8fd8b2b4f1cda75e3958c977d5
Gerrit-Change-Number: 10418
Gerrit-PatchSet: 1
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Comment-Date: Wed, 16 May 2018 18:19:00 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7023: Wait for fragments to finish for test insert.py

2018-05-16 Thread Philip Zeyliger (Code Review)
Philip Zeyliger has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10426 )

Change subject: IMPALA-7023: Wait for fragments to finish for test_insert.py
..


Patch Set 1: Code-Review+2

(1 comment)

This seems helpful to stabilize builds. If you want to take on the refactor, it 
seems pretty easy here, but we could do it separately (or punt).

http://gerrit.cloudera.org:8080/#/c/10426/1/tests/query_test/test_insert.py
File tests/query_test/test_insert.py:

http://gerrit.cloudera.org:8080/#/c/10426/1/tests/query_test/test_insert.py@135
PS1, Line 135:   v.wait_for_metric("impala-server.num-fragments-in-flight", 
0, timeout=60)
This is the 8th invocation of this line. Do we want to add

@wait_for_fragments_in_flight

or

self.wait_for_fragments_in_flight_to_finish()

as either a helper or an annotation?



--
To view, visit http://gerrit.cloudera.org:8080/10426
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5642e9cb32dd02afd74dde7e0d3b31bddbee3ccd
Gerrit-Change-Number: 10426
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Wed, 16 May 2018 17:54:33 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6070: Adding ASAN, --tail to test-with-docker.

2018-05-16 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10319 )

Change subject: IMPALA-6070: Adding ASAN, --tail to test-with-docker.
..


Patch Set 3: Code-Review+2

(1 comment)

Looks good to me.

http://gerrit.cloudera.org:8080/#/c/10319/3/docker/entrypoint.sh
File docker/entrypoint.sh:

http://gerrit.cloudera.org:8080/#/c/10319/3/docker/entrypoint.sh@264
PS3, Line 264: -notests
> Line 269, I believe.
Oh, got it. CMake updates the Makefiles.



--
To view, visit http://gerrit.cloudera.org:8080/10319
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I51451cdf1352fc0f9516d729b9a77700488d993f
Gerrit-Change-Number: 10319
Gerrit-PatchSet: 3
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Wed, 16 May 2018 17:53:52 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7023: Wait for fragments to finish for test insert.py

2018-05-16 Thread Joe McDonnell (Code Review)
Joe McDonnell has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/10426


Change subject: IMPALA-7023: Wait for fragments to finish for test_insert.py
..

IMPALA-7023: Wait for fragments to finish for test_insert.py

The arrangement of tests in test_insert.py changed with
IMPALA-7010, splitting out the memory limit tests into
test_insert_mem_limit(). On exhaustive, the combination
of test dimensions means test_insert_mem_limit() executes
11 different combinations. Each of these statements can
use a large amount of memory and this is not cleaned
up immediately. This has been causing
test_insert_overwrite(), which immediately follows
test_insert_mem_limit(), to hit the process memory limit.

This changes test_insert_mem_limit() to make it wait
for its fragments to finish.

Change-Id: I5642e9cb32dd02afd74dde7e0d3b31bddbee3ccd
---
M tests/query_test/test_insert.py
1 file changed, 8 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/26/10426/1
--
To view, visit http://gerrit.cloudera.org:8080/10426
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I5642e9cb32dd02afd74dde7e0d3b31bddbee3ccd
Gerrit-Change-Number: 10426
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 


[Impala-ASF-CR] IMPALA-6070: Adding ASAN, --tail to test-with-docker.

2018-05-16 Thread Philip Zeyliger (Code Review)
Philip Zeyliger has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10319 )

Change subject: IMPALA-6070: Adding ASAN, --tail to test-with-docker.
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10319/3/docker/entrypoint.sh
File docker/entrypoint.sh:

http://gerrit.cloudera.org:8080/#/c/10319/3/docker/entrypoint.sh@264
PS3, Line 264: -notests
> Maybe I'm missing something, where do we build the backend tests with ASAN?
Line 269, I believe.



--
To view, visit http://gerrit.cloudera.org:8080/10319
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I51451cdf1352fc0f9516d729b9a77700488d993f
Gerrit-Change-Number: 10319
Gerrit-PatchSet: 3
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Wed, 16 May 2018 17:49:40 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6070: Adding ASAN, --tail to test-with-docker.

2018-05-16 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10319 )

Change subject: IMPALA-6070: Adding ASAN, --tail to test-with-docker.
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10319/3/docker/entrypoint.sh
File docker/entrypoint.sh:

http://gerrit.cloudera.org:8080/#/c/10319/3/docker/entrypoint.sh@264
PS3, Line 264: -notests
Maybe I'm missing something, where do we build the backend tests with ASAN?



--
To view, visit http://gerrit.cloudera.org:8080/10319
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I51451cdf1352fc0f9516d729b9a77700488d993f
Gerrit-Change-Number: 10319
Gerrit-PatchSet: 3
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Wed, 16 May 2018 17:37:43 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5842: Write page index in Parquet files

2018-05-16 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9693 )

Change subject: IMPALA-5842: Write page index in Parquet files
..


Patch Set 19: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/9693
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icbacf7fe3b7672e3ce719261ecef445b16f8dec9
Gerrit-Change-Number: 9693
Gerrit-PatchSet: 19
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Anonymous Coward #248
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 16 May 2018 17:01:50 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5216: Make admission control queuing async

2018-05-16 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10060 )

Change subject: IMPALA-5216: Make admission control queuing async
..


Patch Set 7:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/client-request-state.h
File be/src/service/client-request-state.h:

http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/client-request-state.h@172
PS7, Line 172:   Coordinator* coord() const {
 : return coord_exec_started_.Load() ? coord_.get() : nullptr;
 :   }
> TLDR: I am not sure, if we can entirely get rid of most of the confusing us
Okay, let's defer this until later when we think through the full CRS states. 
We may also want to change the interfaces in a different way at that time.



--
To view, visit http://gerrit.cloudera.org:8080/10060
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I989cf5b259afb8f5bc5c35590c94961c81ce88bf
Gerrit-Change-Number: 10060
Gerrit-PatchSet: 7
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 16 May 2018 16:56:12 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5842: Write page index in Parquet files

2018-05-16 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9693 )

Change subject: IMPALA-5842: Write page index in Parquet files
..


Patch Set 19:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9693/18/tests/util/get_parquet_metadata.py
File tests/util/get_parquet_metadata.py:

http://gerrit.cloudera.org:8080/#/c/9693/18/tests/util/get_parquet_metadata.py@137
PS18, Line 137:  alr
> update comment to include that it needs to be an open file handle.
Updated the comment.



--
To view, visit http://gerrit.cloudera.org:8080/9693
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icbacf7fe3b7672e3ce719261ecef445b16f8dec9
Gerrit-Change-Number: 9693
Gerrit-PatchSet: 19
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Anonymous Coward #248
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 16 May 2018 16:54:47 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5842: Write page index in Parquet files

2018-05-16 Thread Zoltan Borok-Nagy (Code Review)
Hello Lars Volker, Anonymous Coward #248, Tim Armstrong, Csaba Ringhofer,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9693

to look at the new patch set (#19).

Change subject: IMPALA-5842: Write page index in Parquet files
..

IMPALA-5842: Write page index in Parquet files

This commit builds on the previous work of
Pooja Nilangekar: https://gerrit.cloudera.org/#/c/7464/

The commit implements the write path of PARQUET-922:
"Add column indexes to parquet.thrift". As specified in the
parquet-format, Impala writes the page indexes just before
the footer. This allows much more efficient page filtering
than using the same information from the 'statistics' field
of DataPageHeader.

I updated Pooja's python tests as well.

Change-Id: Icbacf7fe3b7672e3ce719261ecef445b16f8dec9
---
M be/src/exec/hdfs-parquet-table-writer.cc
M be/src/exec/hdfs-parquet-table-writer.h
M be/src/exec/parquet-column-stats.h
M be/src/exec/parquet-column-stats.inline.h
M be/src/util/CMakeLists.txt
A be/src/util/string-util-test.cc
A be/src/util/string-util.cc
A be/src/util/string-util.h
M common/thrift/parquet.thrift
M testdata/bin/load-dependent-tables.sql
M tests/query_test/test_chars.py
A tests/query_test/test_parquet_page_index.py
M tests/util/get_parquet_metadata.py
13 files changed, 995 insertions(+), 108 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/93/9693/19
--
To view, visit http://gerrit.cloudera.org:8080/9693
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Icbacf7fe3b7672e3ce719261ecef445b16f8dec9
Gerrit-Change-Number: 9693
Gerrit-PatchSet: 19
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Anonymous Coward #248
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-6995: avoid DCHECK in TimestampParse::Parse()

2018-05-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10349 )

Change subject: IMPALA-6995: avoid DCHECK in TimestampParse::Parse()
..


Patch Set 4:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2487/


--
To view, visit http://gerrit.cloudera.org:8080/10349
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I02a18ffd8893fe74f5830144300f745ce31477b1
Gerrit-Change-Number: 10349
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kim Jin Chul 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 16 May 2018 16:39:06 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5842: Write page index in Parquet files

2018-05-16 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9693 )

Change subject: IMPALA-5842: Write page index in Parquet files
..


Patch Set 18: Code-Review+2

(2 comments)

LGTM, had one remaining nit.

http://gerrit.cloudera.org:8080/#/c/9693/17/tests/query_test/test_parquet_page_index.py
File tests/query_test/test_parquet_page_index.py:

http://gerrit.cloudera.org:8080/#/c/9693/17/tests/query_test/test_parquet_page_index.py@179
PS17, Line 179:
> The Python sorted() function also has a reversed flag with the same meaning
Thanks for the explanation. Lets keep it as is then.


http://gerrit.cloudera.org:8080/#/c/9693/18/tests/util/get_parquet_metadata.py
File tests/util/get_parquet_metadata.py:

http://gerrit.cloudera.org:8080/#/c/9693/18/tests/util/get_parquet_metadata.py@137
PS18, Line 137: file
update comment to include that it needs to be an open file handle.



--
To view, visit http://gerrit.cloudera.org:8080/9693
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icbacf7fe3b7672e3ce719261ecef445b16f8dec9
Gerrit-Change-Number: 9693
Gerrit-PatchSet: 18
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Anonymous Coward #248
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 16 May 2018 16:39:16 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6995: avoid DCHECK in TimestampParse::Parse()

2018-05-16 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10349 )

Change subject: IMPALA-6995: avoid DCHECK in TimestampParse::Parse()
..


Patch Set 4: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/10349
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I02a18ffd8893fe74f5830144300f745ce31477b1
Gerrit-Change-Number: 10349
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Kim Jin Chul 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 16 May 2018 16:38:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] Default ASAN options in codebase.

2018-05-16 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10404 )

Change subject: Default ASAN options in codebase.
..


Patch Set 2: Code-Review+2

Looks good - can you run an ASAN build just to double-check that it still works 
as before.


--
To view, visit http://gerrit.cloudera.org:8080/10404
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3cbbd210c67750a48003f336bea1f3e1cb2d9e6b
Gerrit-Change-Number: 10404
Gerrit-PatchSet: 2
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 16 May 2018 16:25:15 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3307: Add support for IANA time-zone db

2018-05-16 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9986 )

Change subject: IMPALA-3307: Add support for IANA time-zone db
..


Patch Set 7:

(15 comments)

This is going to be a big improvement. Did a pass, mainly had comments about 
clarifying internal interfaces.

http://gerrit.cloudera.org:8080/#/c/9986/7/be/src/exec/data-source-scan-node.h
File be/src/exec/data-source-scan-node.h:

http://gerrit.cloudera.org:8080/#/c/9986/7/be/src/exec/data-source-scan-node.h@101
PS7, Line 101:   /// local time-zone for materializing 'TYPE_TIMESTAMP' slots.
Can local_tz be NULL? Maybe make it const& if not.


http://gerrit.cloudera.org:8080/#/c/9986/7/be/src/exec/parquet-column-readers.cc
File be/src/exec/parquet-column-readers.cc:

http://gerrit.cloudera.org:8080/#/c/9986/7/be/src/exec/parquet-column-readers.cc@603
PS7, Line 603:   if (dst_ts->HasDateAndTime()) 
dst_ts->UtcToLocal(parent_->state_->local_time_zone());
Would it make sense to cache the timezone locally in the ScalarColumnReader? 
That would save at least 2 pointer indirections per value, which could be 
meaningful in this part of the code.


http://gerrit.cloudera.org:8080/#/c/9986/7/be/src/exprs/decimal-operators.h
File be/src/exprs/decimal-operators.h:

http://gerrit.cloudera.org:8080/#/c/9986/7/be/src/exprs/decimal-operators.h@172
PS7, Line 172:   const T& decimal_value, int scale, bool round, const 
Timezone* local_tz);
Is it nullable?


http://gerrit.cloudera.org:8080/#/c/9986/7/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

http://gerrit.cloudera.org:8080/#/c/9986/7/be/src/exprs/expr-test.cc@161
PS7, Line 161:   const Timezone *new_tz_;
nit: Timezone* new_tz_


http://gerrit.cloudera.org:8080/#/c/9986/7/be/src/exprs/timezone_db.cc
File be/src/exprs/timezone_db.cc:

http://gerrit.cloudera.org:8080/#/c/9986/7/be/src/exprs/timezone_db.cc@48
PS7, Line 48: "HDFS/S3A/ADLS path to load IANA time-zone database from.");
nn]


http://gerrit.cloudera.org:8080/#/c/9986/7/be/src/runtime/runtime-state.h
File be/src/runtime/runtime-state.h:

http://gerrit.cloudera.org:8080/#/c/9986/7/be/src/runtime/runtime-state.h@317
PS7, Line 317:   /// Query-global timezone used as local timezone when 
executing the query.
Can this be NULL? Would be good to document. We should maybe return a const& 
above if it can't be NULL so that it's self-documenting.


http://gerrit.cloudera.org:8080/#/c/9986/7/be/src/runtime/runtime-state.cc
File be/src/runtime/runtime-state.cc:

http://gerrit.cloudera.org:8080/#/c/9986/7/be/src/runtime/runtime-state.cc@131
PS7, Line 131: LIKELY
LIKELY won't make a measurable difference outside of perf-critical code, I find 
it adds noise in cases like this. The codebase isn't very consistent about it 
but I'm trying to stop the pattern from spreading :)


http://gerrit.cloudera.org:8080/#/c/9986/7/be/src/runtime/runtime-state.cc@134
PS7, Line 134: LOG(ERROR) << "Failed to find local timezone " << 
query_ctx().local_time_zone
I think this should be a WARNING-level log. We should reserve ERROR for really 
severe errors, whereas this might flood logs.

I think we should also add a warning to the query warnings so that it's 
surfaced to the user, not just the admin.


http://gerrit.cloudera.org:8080/#/c/9986/7/be/src/runtime/timestamp-value.h
File be/src/runtime/timestamp-value.h:

http://gerrit.cloudera.org:8080/#/c/9986/7/be/src/runtime/timestamp-value.h@98
PS7, Line 98:   static TimestampValue FromUnixTime(time_t unix_time, const 
Timezone* local_tz) {
Here and below it isn't clear if local_tz is allowed to be NULL. If it can be 
NULL, can we extend comments to explain what happens if that case. If it can't, 
we could make it self-documenting by making it a const& instead of a const*.


http://gerrit.cloudera.org:8080/#/c/9986/7/be/src/runtime/timestamp-value.cc
File be/src/runtime/timestamp-value.cc:

http://gerrit.cloudera.org:8080/#/c/9986/7/be/src/runtime/timestamp-value.cc@100
PS7, Line 100: CheckIfDateOutOfRange
Maybe IsDateOutOfRange(). With "Check" it isn't clear whether returning true 
means that it's out of range or if it means that the timestamp passed the check.


http://gerrit.cloudera.org:8080/#/c/9986/7/be/src/util/time.h
File be/src/util/time.h:

http://gerrit.cloudera.org:8080/#/c/9986/7/be/src/util/time.h@122
PS7, Line 122: Timezone
Maybe const& if it's not allowed to be NULL.


http://gerrit.cloudera.org:8080/#/c/9986/7/cmake_modules/FindCctz.cmake
File cmake_modules/FindCctz.cmake:

http://gerrit.cloudera.org:8080/#/c/9986/7/cmake_modules/FindCctz.cmake@27
PS7, Line 27:   $ENV{IMPALA_HOME}/thirdparty/cctz-$ENV{IMPALA_CCTZ_VERSION}/src)
We can get rid of the thirdparty/ stuff. That's just left over from when Impala 
stored vendored versions of these dependencies in thirdpart/


http://gerrit.cloudera.org:8080/#/c/9986/7/common/thrift/ImpalaInternalService.thrift
File common/thrift/ImpalaInternalService.thrift:


[Impala-ASF-CR] Default ASAN options in codebase.

2018-05-16 Thread Philip Zeyliger (Code Review)
Hello Tim Armstrong, Joe McDonnell,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10404

to look at the new patch set (#2).

Change subject: Default ASAN options in codebase.
..

Default ASAN options in codebase.

When running tests with ASAN, you need to set ASAN_OPTIONS explicitly,
to avoid various failures. In particular, backend tests fail
complaining about memory leaks and tests that use the parquet-reader
binary complain similarly. It turns out that we can shove
the default options into our code base directly, avoiding
the need for users to set it explicitly.

Change-Id: I3cbbd210c67750a48003f336bea1f3e1cb2d9e6b
---
M be/src/common/init.cc
M bin/run-backend-tests.sh
M bin/start-catalogd.sh
M bin/start-impalad.sh
M bin/start-statestored.sh
5 files changed, 8 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/04/10404/2
--
To view, visit http://gerrit.cloudera.org:8080/10404
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3cbbd210c67750a48003f336bea1f3e1cb2d9e6b
Gerrit-Change-Number: 10404
Gerrit-PatchSet: 2
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6070: Adding ASAN, --tail to test-with-docker.

2018-05-16 Thread Philip Zeyliger (Code Review)
Hello Lars Volker, Joe McDonnell,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10319

to look at the new patch set (#3).

Change subject: IMPALA-6070: Adding ASAN, --tail to test-with-docker.
..

IMPALA-6070: Adding ASAN, --tail to test-with-docker.

* Adds -ASAN suites to test-with-docker.
* Adds --tail flag, which starts a tail subprocess. This
  isn't pretty (there's potential for overlap), but it's a dead simple
  way to keep an eye on what's going on.
* Fixes a bug wherein I could call "docker rm " twice
  simultaneously, which would make Docker fail the second call,
  and then fail the related "docker rmi". It's better to serialize,
  and I did that with a simple lock.

Change-Id: I51451cdf1352fc0f9516d729b9a77700488d993f
---
M docker/entrypoint.sh
M docker/test-with-docker.py
2 files changed, 48 insertions(+), 7 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/10319/3
--
To view, visit http://gerrit.cloudera.org:8080/10319
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I51451cdf1352fc0f9516d729b9a77700488d993f
Gerrit-Change-Number: 10319
Gerrit-PatchSet: 3
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Philip Zeyliger 


[Impala-ASF-CR] IMPALA-6995: avoid DCHECK in TimestampParse::Parse()

2018-05-16 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10349 )

Change subject: IMPALA-6995: avoid DCHECK in TimestampParse::Parse()
..


Patch Set 3: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/10349
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I02a18ffd8893fe74f5830144300f745ce31477b1
Gerrit-Change-Number: 10349
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Kim Jin Chul 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 16 May 2018 16:03:19 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6983: stress: don't write a null runtime profile

2018-05-16 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10381 )

Change subject: IMPALA-6983: stress: don't write a null runtime profile
..


Patch Set 3: Verified+1

False positive failure due to mistakenly reading Junit XML from docs source. 
All tests actually passed.


--
To view, visit http://gerrit.cloudera.org:8080/10381
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic8299a8a97ad1f2bd1f2927e3111db8df1d3a3e5
Gerrit-Change-Number: 10381
Gerrit-PatchSet: 3
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Nithya Janarthanan 
Gerrit-Comment-Date: Wed, 16 May 2018 15:20:25 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6983: stress: don't write a null runtime profile

2018-05-16 Thread Michael Brown (Code Review)
Michael Brown has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/10381 )

Change subject: IMPALA-6983: stress: don't write a null runtime profile
..

IMPALA-6983: stress: don't write a null runtime profile

This patch fixes a problem in which the writing of profiles always
assumed profiles would be collected during binary. That's not the case
when queries are too expensive to run. The simple fix is to just check
for None and not perform the write.

Change-Id: Ic8299a8a97ad1f2bd1f2927e3111db8df1d3a3e5
Reviewed-on: http://gerrit.cloudera.org:8080/10381
Reviewed-by: Michael Brown 
Tested-by: Michael Brown 
---
M tests/stress/concurrent_select.py
1 file changed, 3 insertions(+), 0 deletions(-)

Approvals:
  Michael Brown: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/10381
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ic8299a8a97ad1f2bd1f2927e3111db8df1d3a3e5
Gerrit-Change-Number: 10381
Gerrit-PatchSet: 4
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Nithya Janarthanan 


[Impala-ASF-CR] IMPALA-6983: stress: don't write a null runtime profile

2018-05-16 Thread Michael Brown (Code Review)
Michael Brown has removed a vote on this change.

Change subject: IMPALA-6983: stress: don't write a null runtime profile
..


Removed Verified-1 by Impala Public Jenkins 
--
To view, visit http://gerrit.cloudera.org:8080/10381
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: deleteVote
Gerrit-Change-Id: Ic8299a8a97ad1f2bd1f2927e3111db8df1d3a3e5
Gerrit-Change-Number: 10381
Gerrit-PatchSet: 3
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Nithya Janarthanan 


[Impala-ASF-CR] IMPALA-5706: Spilling sort optimisations

2018-05-16 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9943 )

Change subject: IMPALA-5706: Spilling sort optimisations
..


Patch Set 8:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9943/8/be/src/runtime/sorter.cc
File be/src/runtime/sorter.cc:

http://gerrit.cloudera.org:8080/#/c/9943/8/be/src/runtime/sorter.cc@1676
PS8, Line 1676:   num_required_buffers += var_len_pages_found ? 2 : 1;
> As far as I see in Run::Init() it already knows if the run has var_len_slot
It's possible to have var-len slots but no var-len pages if all strings are 
NULL or empty. I've run into bugs with that edge case a few times before.



--
To view, visit http://gerrit.cloudera.org:8080/9943
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I74857c1694802e81f1cfc765d2b4e8bc644387f9
Gerrit-Change-Number: 9943
Gerrit-PatchSet: 8
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 16 May 2018 15:00:26 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3307: Add support for IANA time-zone db

2018-05-16 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9986 )

Change subject: IMPALA-3307: Add support for IANA time-zone db
..


Patch Set 7: Code-Review+1

(4 comments)

Thanks Attila for addressing my comments! I'm fine with the change.

http://gerrit.cloudera.org:8080/#/c/9986/7/be/src/exprs/timezone_db-test.cc
File be/src/exprs/timezone_db-test.cc:

http://gerrit.cloudera.org:8080/#/c/9986/7/be/src/exprs/timezone_db-test.cc@119
PS7, Line 119:   TestInvalidTimezoneAbbrevName("pST");
Four of these are already tested by TimezoneDbNamesTest. No need to test the 
here as well, I think.


http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timezone_db.h
File be/src/exprs/timezone_db.h:

http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timezone_db.h@1
PS4, Line 1: // Licensed to the Apache Software Foundation (ASF) under one
> Not sure what happened there. I probably inadvertently executed some myster
:D


http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timezone_db.cc
File be/src/exprs/timezone_db.cc:

http://gerrit.cloudera.org:8080/#/c/9986/4/be/src/exprs/timezone_db.cc@365
PS4, Line 365:   RETURN_IF_ERROR(
> LoadZoneInfoFromHdfs() uses cctz::load_time_zone() to load time-zone files
Thx!


http://gerrit.cloudera.org:8080/#/c/9986/5/be/src/runtime/runtime-state.cc
File be/src/runtime/runtime-state.cc:

http://gerrit.cloudera.org:8080/#/c/9986/5/be/src/runtime/runtime-state.cc@136
PS5, Line 136: local_time_zone_ = ::GetUtcTimezone();
> True, but I wanted to be more explicit here.
thanks for the explanation.



--
To view, visit http://gerrit.cloudera.org:8080/9986
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I93c1fbffe81f067919706e30db0a34d0e58e7e77
Gerrit-Change-Number: 9986
Gerrit-PatchSet: 7
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 16 May 2018 14:43:24 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7003: Deflake erasure coding data loading

2018-05-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10362 )

Change subject: IMPALA-7003: Deflake erasure coding data loading
..


Patch Set 4:

This change did not cherrypick successfully into branch 2.x. To resolve this, 
please do the cherry-pick manually and submit it to Gerrit at refs/for/2.x or 
add an exception to the branch 2.x copy of bin/ignored_commits.json. Thanks, 
your friendly bot at https://jenkins.impala.io/job/cherrypick-2.x-and-test/503/ 
.


--
To view, visit http://gerrit.cloudera.org:8080/10362
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I219106cd3ec7ffab7a834700f2a722b165e5f66c
Gerrit-Change-Number: 10362
Gerrit-PatchSet: 4
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Comment-Date: Wed, 16 May 2018 13:55:46 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5706: Spilling sort optimisations

2018-05-16 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9943 )

Change subject: IMPALA-5706: Spilling sort optimisations
..


Patch Set 10:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/9943/8/be/src/runtime/sorter.h
File be/src/runtime/sorter.h:

http://gerrit.cloudera.org:8080/#/c/9943/8/be/src/runtime/sorter.h@193
PS8, Line 193:   /// function calculates the maximum number of runs that can be 
taken care of during the
> this sentence seems garbled. "can be taken care of"?
Done


http://gerrit.cloudera.org:8080/#/c/9943/8/be/src/runtime/sorter.h@199
PS8, Line 199:   /// round of merging. Returns at most MaxRunsInNextMerge(), so 
the Sorter will have
> Can you mention the invariant that we have enough reservation for this numb
Done


http://gerrit.cloudera.org:8080/#/c/9943/8/be/src/runtime/sorter.cc
File be/src/runtime/sorter.cc:

http://gerrit.cloudera.org:8080/#/c/9943/8/be/src/runtime/sorter.cc@1676
PS8, Line 1676:   num_required_buffers += var_len_pages_found ? 2 : 1;
> I think we need two buffers regardless if there are var-len slots, since Ru
As far as I see in Run::Init() it already knows if the run has var_len_slots 
(from TupleDescriptor given as constructor param) and add a page for that only 
in case it does.


http://gerrit.cloudera.org:8080/#/c/9943/8/be/src/runtime/sorter.cc@1679
PS8, Line 1679:  i -
> nit: i - 1
Done


http://gerrit.cloudera.org:8080/#/c/9943/8/tests/query_test/test_sort.py
File tests/query_test/test_sort.py:

http://gerrit.cloudera.org:8080/#/c/9943/8/tests/query_test/test_sort.py@104
PS8, Line 104: orders o2 on (o1.o_orderkey = o2.o_orderkey) order by 
o1.o_orderdate limit 10"""
> Should we also set num_nodes=1 here? Otherwise it might not be robust when
Good point. Done


http://gerrit.cloudera.org:8080/#/c/9943/10/tests/query_test/test_sort.py
File tests/query_test/test_sort.py:

http://gerrit.cloudera.org:8080/#/c/9943/10/tests/query_test/test_sort.py@142
PS10, Line 142: exec_option['buffer_pool_limit'] = "22m"
Had to change this as the minimum mem requirement increased for this query. I 
guess that is because I pulled in another change with the rebase that affected 
this. As a result, the total merges done by the first sort decreased.



--
To view, visit http://gerrit.cloudera.org:8080/9943
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I74857c1694802e81f1cfc765d2b4e8bc644387f9
Gerrit-Change-Number: 9943
Gerrit-PatchSet: 10
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 16 May 2018 13:29:18 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5706: Spilling sort optimisations

2018-05-16 Thread Gabor Kaszab (Code Review)
Hello Tim Armstrong, Csaba Ringhofer,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9943

to look at the new patch set (#10).

Change subject: IMPALA-5706: Spilling sort optimisations
..

IMPALA-5706: Spilling sort optimisations

This patch covers multiple changes with the purpose of optimizing
spilling sort mechanism:
  - Remove the hard-coded maximum limit of buffers that can be used
for merging the sorted runs. Instead this number is calculated
based on the available memory through buffer pool.
  - The already sorted runs are distributed more optimally between
the last intermediate merge and the final merge to avoid that a
heavy intermediate merge is followed by a light final merge.
  - Right before starting the merging phase Sorter tries to allocate
additional memory through the buffer pool.
  - An output run is not allocated anymore for the final merge.

Note, double-buffering the runs during a merge was also planned with
this patch. However, performance testing showed that except some
exotic queries with unreasonably small amount of buffer pool memory
available double-buffering doesn't add to the overall performance.
It's basically because the half of the available buffers have to be
sacrificed to do double-buffering and as a result the merge tree can
get deeper. In addition the amount of I/O wait time is not reaching
the level where double-buffering could countervail the reduced number
of runs during a particular merge.

Performance measurements were made during manual testing to verify
that this is in fact an optimization:
  - In case doing a sort on top of a join when working with a
restricted amount of memory then the Sort node successfully
allocates additional memory right before the merging phase. This
is feasible because once Join finishes sending new input data and
calls InputDone() then it releases memory that can be picked up
by the Sorter. This results in shallower merging trees (more runs
grabbed for a merge).
  - On a multi-node cluster I verified that in cases when at least one
merging step is done then this change reduces the execution time
for sorts.
  - The more merging steps are done the bigger the performance gain is
compared to the baseline.

Change-Id: I74857c1694802e81f1cfc765d2b4e8bc644387f9
---
M be/src/runtime/sorter.cc
M be/src/runtime/sorter.h
M tests/query_test/test_sort.py
3 files changed, 191 insertions(+), 114 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/43/9943/10
--
To view, visit http://gerrit.cloudera.org:8080/9943
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I74857c1694802e81f1cfc765d2b4e8bc644387f9
Gerrit-Change-Number: 9943
Gerrit-PatchSet: 10
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR](2.x) IMPALA-7033/IMPALA-7030: Backout suspected change leading to crash

2018-05-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10423 )

Change subject: IMPALA-7033/IMPALA-7030: Backout suspected change leading to 
crash
..


Patch Set 1: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/10423
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: comment
Gerrit-Change-Id: Idc63006e6e04130b2873a6a9730e434c563327c5
Gerrit-Change-Number: 10423
Gerrit-PatchSet: 1
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 16 May 2018 09:25:30 +
Gerrit-HasComments: No


[Impala-ASF-CR](2.x) IMPALA-7033/IMPALA-7030: Backout suspected change leading to crash

2018-05-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/10423 )

Change subject: IMPALA-7033/IMPALA-7030: Backout suspected change leading to 
crash
..

IMPALA-7033/IMPALA-7030: Backout suspected change leading to crash

Revert "IMPALA-5384, part 2: Simplify Coordinator locking and clarify state"

This reverts commit 6ca87e46736a1e591ed7d7d5fee05b4b4d2fbb50.

Change-Id: Idc63006e6e04130b2873a6a9730e434c563327c5
Reviewed-on: http://gerrit.cloudera.org:8080/10412
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins 
Reviewed-on: http://gerrit.cloudera.org:8080/10423
Reviewed-by: Dan Hecht 
---
M be/src/runtime/coordinator-backend-state.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/service/client-request-state.cc
M be/src/service/impala-server.h
M be/src/util/counting-barrier.h
6 files changed, 391 insertions(+), 391 deletions(-)

Approvals:
  Dan Hecht: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/10423
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: merged
Gerrit-Change-Id: Idc63006e6e04130b2873a6a9730e434c563327c5
Gerrit-Change-Number: 10423
Gerrit-PatchSet: 2
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6995: avoid DCHECK in TimestampParse::Parse()

2018-05-16 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10349 )

Change subject: IMPALA-6995: avoid DCHECK in TimestampParse::Parse()
..


Patch Set 3: Code-Review+1

Thank for responding to my comments! I'm fine with the changes.


--
To view, visit http://gerrit.cloudera.org:8080/10349
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I02a18ffd8893fe74f5830144300f745ce31477b1
Gerrit-Change-Number: 10349
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Kim Jin Chul 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 16 May 2018 07:04:04 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5931: Generates scan ranges in planner for s3/adls

2018-05-16 Thread Vuk Ercegovac (Code Review)
Vuk Ercegovac has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8523 )

Change subject: IMPALA-5931: Generates scan ranges in planner for s3/adls
..


Patch Set 12:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8523/12//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8523/12//COMMIT_MSG@13
PS12, Line 13: includes S3, ADLS, and local fs.
> Does this change the dependency on fs.s3a.block.size?
This change does not attempt to change the scan ranges that are produced for 
these file systems. Currently, we rely on the filestatus blocksize (see L177 of 
HdfsPartition) when synthesizing blocks to store in the catalog. This patch 
shifts that synthesis to the scheduler, so that its generated for each use 
instead of stored in memory. The block parameter used for this synthesis is set 
in this change on L784 of HdfsScanNode. Instead of filestatus blocksize, it 
uses the filesystem's default block size, which I think is the same thing for 
these file-systems (at least, from what I could tell from hadoop.fs.FileSystem).



--
To view, visit http://gerrit.cloudera.org:8080/8523
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5
Gerrit-Change-Number: 8523
Gerrit-PatchSet: 12
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 16 May 2018 06:04:52 +
Gerrit-HasComments: Yes