[jira] [Commented] (FLINK-17309) TPC-DS fail to run data generator

2020-05-07 Thread Dawid Wysakowicz (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17101687#comment-17101687
 ] 

Dawid Wysakowicz commented on FLINK-17309:
--

I assigned this issue to you [~Leonard Xu]

> TPC-DS fail to run data generator
> -
>
> Key: FLINK-17309
> URL: https://issues.apache.org/jira/browse/FLINK-17309
> Project: Flink
> Issue Type: Bug
> Components: Table SQL / Planner
> Affects Versions: 1.11.0
> Reporter: Dawid Wysakowicz
> Assignee: Leonard Xu
> Priority: Critical
> Labels: pull-request-available, test-stability
>
> {code}
> [INFO] Download data generator success.
> [INFO] 15:53:41 Generating TPC-DS qualification data, this need several 
> minutes, please wait...
> ./dsdgen_linux: line 1: 500:: command not found
> [FAIL] Test script contains errors.
> {code}
> https://dev.azure.com/rmetzger/Flink/_build/results?buildId=7849=logs=c88eea3b-64a0-564d-0031-9fdcd7b8abee=1e2bbe5b-4657-50be-1f07-d84bfce5b1f5



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-17309) TPC-DS fail to run data generator

2020-05-07 Thread Leonard Xu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17101494#comment-17101494
 ] 

Leonard Xu commented on FLINK-17309:


Hi, [~dwysakowicz]

Could you please assign this ticket to me? I have also updated the PR.






[jira] [Commented] (FLINK-17309) TPC-DS fail to run data generator

2020-05-06 Thread Robert Metzger (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17101395#comment-17101395
 ] 

Robert Metzger commented on FLINK-17309:


Thanks a lot for validating the change. In my opinion, we can polish & merge 
the PR.






[jira] [Commented] (FLINK-17309) TPC-DS fail to run data generator

2020-05-05 Thread Leonard Xu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17100426#comment-17100426
 ] 

Leonard Xu commented on FLINK-17309:


Hi, all 

I prepared a PR that adds validation and retry logic for the case where the 
md5sum of the generator does not match. I triggered the build many times over 
the past few days and found only one failure, caused by a network issue as we 
suspected [1].

I also manually checked all AZP [2] runs from the past week (AZP numbers 4.30.1 
to 5.6.1) and found no TPC-DS failure caused by the data generator.

I think it's time to decide whether the PR works. I can polish the PR soon 
if we reach a consensus.

What do you think? [~rmetzger] [~dwysakowicz] [~lzljs3620320]

[1][https://github.com/apache/flink/pull/11867#issuecomment-618413808]

[2][https://dev.azure.com/apache-flink/apache-flink/_build?definitionId=2&_a=summary]
 






[jira] [Commented] (FLINK-17309) TPC-DS fail to run data generator

2020-04-22 Thread Leonard Xu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17089390#comment-17089390
 ] 

Leonard Xu commented on FLINK-17309:


[~rmetzger] 

Thanks for your reply. What I'm considering is similar to your idea: I'm using 
'-x' to debug, and I suspect network issues too. My initial idea to avoid this 
is to check the md5sum of the binary before execution and retry the download 
several times.
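
A minimal sketch of the check-and-retry idea, for illustration only. It is not the code from the PR: the function names {{verify_md5}} and {{fetch_with_retry}}, the argument order, and the warning message are all hypothetical. The download command is passed in as arguments, so any tool can be used.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; these names do not come from data_generator.sh.

# Succeeds when the md5sum of file $1 equals the expected hex digest $2.
verify_md5() {
  local file="$1" expected="$2" actual
  [ -f "$file" ] || return 1
  actual="$(md5sum "$file" | awk '{print $1}')"
  [ "$actual" = "$expected" ]
}

# Runs the download command (arguments 4 and up) until file $1 matches
# digest $2, giving up after $3 attempts.
fetch_with_retry() {
  local file="$1" expected="$2" max_attempts="$3"
  shift 3
  local attempt
  for ((attempt = 1; attempt <= max_attempts; attempt++)); do
    if "$@" && verify_md5 "$file" "$expected"; then
      return 0
    fi
    echo "[WARN] attempt ${attempt}/${max_attempts}: download or md5sum check failed for ${file}" >&2
    rm -f "$file"   # discard the bad copy before retrying
  done
  return 1
}
```

A call could look like {{fetch_with_retry dsdgen_linux "$EXPECTED_MD5" 3 curl -fsSL -o dsdgen_linux "$GENERATOR_URL"}}, where {{EXPECTED_MD5}} and {{GENERATOR_URL}} are placeholders that would have to be supplied by the script.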






[jira] [Commented] (FLINK-17309) TPC-DS fail to run data generator

2020-04-22 Thread Robert Metzger (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17089373#comment-17089373
 ] 

Robert Metzger commented on FLINK-17309:


The only explanation I can come up with is that the binary is corrupted, due to 
network issues?
We could print the md5sum of the binary before execution to rule that out? 
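
One way this diagnostic could look, as a rough sketch: print the md5sum and check whether the file starts with the ELF magic bytes, so that a corrupted download is visible in the CI log. The function names {{is_elf}} and {{describe_binary}} and the log wording are made up for illustration and are not part of the existing script.

```shell
#!/usr/bin/env bash
# Hypothetical diagnostics; names and log wording are illustrative only.

# Succeeds when file $1 starts with the ELF magic bytes 0x7f 'E' 'L' 'F'.
is_elf() {
  [ "$(head -c 4 "$1" | od -An -tx1 | tr -d ' \n')" = "7f454c46" ]
}

# Prints the md5sum of file $1 and whether it looks like an ELF executable.
describe_binary() {
  local file="$1"
  echo "[INFO] md5sum of ${file}: $(md5sum "$file" | awk '{print $1}')"
  if is_elf "$file"; then
    echo "[INFO] ${file} starts with the ELF magic bytes."
  else
    # A text payload (for example an error page) would show up here.
    echo "[WARN] ${file} is not an ELF binary; first line: $(head -n 1 "$file")"
  fi
}
```

Calling {{describe_binary dsdgen_linux}} right before the generator is started would put both pieces of information next to the failure in the log.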






[jira] [Commented] (FLINK-17309) TPC-DS fail to run data generator

2020-04-22 Thread Robert Metzger (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17089353#comment-17089353
 ] 

Robert Metzger commented on FLINK-17309:


Regarding (1): Except for the e2e tests, all the compile and test jobs are 
executed in this docker container {{rmetzger/flink-ci:ubuntu-amd64-bcef226}}. I 
run the tests inside the container to make sure we always have the same 
environment, and we can reproduce issues locally.
(2) I guess you are talking about this line here: 
https://github.com/apache/flink/blob/master/flink-end-to-end-tests/flink-tpcds-test/tpcds-tool/data_generator.sh#L79
What you can do is execute the script (on Azure) with "set -x" to get debugging 
information. I don't think the chmod call fails. I think it is the 
{{./dsdgen_linux}} call that fails, because the error message looks like this:
{code}
./dsdgen_linux: line 1: 500:: command not found
{code}
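
The shape of that message can be reproduced locally. When bash is asked to run a file that the kernel rejects as an executable (no valid binary format and no shebang line), it falls back to reading the file as a shell script. A text file whose first line starts with {{500:}} then fails in the same way. The file content below is an assumption chosen for illustration; the thread does not establish what the downloaded file actually contained, and {{reproduce_error}} is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Reproduces the error shape with a fake "binary" that is really a text file.
# The payload is an assumed example, not the actual corrupted download.
reproduce_error() {
  local dir
  dir="$(mktemp -d)"
  printf '500: Internal Server Error\n' > "${dir}/dsdgen_linux"
  chmod +x "${dir}/dsdgen_linux"
  # The kernel refuses to exec the file, so bash interprets it as a script
  # and tries to run "500:" as a command.
  (cd "$dir" && ./dsdgen_linux 2>&1)
  local status=$?
  rm -rf "$dir"
  return "$status"
}
```

Running {{reproduce_error}} under bash prints {{./dsdgen_linux: line 1: 500:: command not found}} and returns exit status 127, which matches the log line above and is consistent with the executable having been replaced by text.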







[jira] [Commented] (FLINK-17309) TPC-DS fail to run data generator

2020-04-21 Thread Leonard Xu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17089261#comment-17089261
 ] 

Leonard Xu commented on FLINK-17309:


I checked that the related scripts haven't been modified, and looked through 
some material to find the root cause, but did not find a reasonable answer. 

Hi [~chesnay], could you help answer these two questions? I'm not familiar 
with Azure Pipelines (many thanks).

(1) These test failures seem to be random. Is the bash environment on Azure's 
machines the same across pipelines? 

(2) `dsdgen_linux` needs execute permission, so the script runs `chmod +x` 
before executing it. Is it possible that the `chmod +x` step fails?






[jira] [Commented] (FLINK-17309) TPC-DS fail to run data generator

2020-04-21 Thread Leonard Xu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17089209#comment-17089209
 ] 

Leonard Xu commented on FLINK-17309:


It may be caused by the scripts; let me track this down. [~dwysakowicz], could 
you assign the ticket to me?






[jira] [Commented] (FLINK-17309) TPC-DS fail to run data generator

2020-04-21 Thread Jingsong Lee (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17089205#comment-17089205
 ] 

Jingsong Lee commented on FLINK-17309:
--

CC: [~Leonard Xu]






[jira] [Commented] (FLINK-17309) TPC-DS fail to run data generator

2020-04-21 Thread Dawid Wysakowicz (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17088834#comment-17088834
 ] 

Dawid Wysakowicz commented on FLINK-17309:
--

Another instance: 
https://dev.azure.com/rmetzger/Flink/_build/results?buildId=7850=logs=c88eea3b-64a0-564d-0031-9fdcd7b8abee=1e2bbe5b-4657-50be-1f07-d84bfce5b1f5



