[jira] [Commented] (HUDI-312) Investigate recent flaky CI runs

2019-10-31 Thread Vinoth Chandar (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16964154#comment-16964154
 ] 

Vinoth Chandar commented on HUDI-312:
-

> The embedded timeline server has been enabled for some time now, so it is
> still not clear whether it is causing the hang.

Given it's a feature we recommend to users, I would suggest not disabling it 
right away. 

@uditme This is where we are now. 

Next steps could be: 
 * Dump out the logs as the command makes progress (currently they are printed 
only after the command succeeds) and see where the hang is.
 * Try force-killing the JVM after that point and see if it at least exits. (I 
tried adding System.exit(0) as the last line of DeltaStreamer::main and it did 
not do the trick.)
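
The first step above could be sketched roughly as follows. This is a minimal 
Python illustration, not the Hudi integration-test harness: stream_command is 
a hypothetical helper that echoes each line of a child process's output with 
an elapsed-time stamp as soon as it is produced, and force-kills the process 
if it outlives a deadline, so the last line printed shows roughly where the 
hang is.

```python
import subprocess
import sys
import threading
import time


def stream_command(cmd, timeout_s, out=sys.stdout):
    """Run cmd, echo its output line by line as it arrives, and
    force-kill it if it is still running after timeout_s seconds.

    Returns (exit_code, lines, timed_out). Hypothetical helper for
    illustration only.
    """
    start = time.monotonic()
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # interleave stderr with stdout
        text=True,
        bufsize=1,  # line-buffered on our side of the pipe
    )
    timed_out = threading.Event()

    def kill_on_deadline():
        # Only kill (and report a timeout) if the child is still alive.
        if proc.poll() is None:
            timed_out.set()
            proc.kill()  # SIGKILL on POSIX: the child cannot ignore it

    timer = threading.Timer(timeout_s, kill_on_deadline)
    timer.start()
    lines = []
    try:
        for line in proc.stdout:
            line = line.rstrip("\n")
            lines.append(line)
            elapsed = time.monotonic() - start
            print(f"[{elapsed:7.1f}s] {line}", file=out, flush=True)
        exit_code = proc.wait()
    finally:
        timer.cancel()
    return exit_code, lines, timed_out.is_set()
```

If the command spawns grandchildren that inherit the pipe, the read loop can 
keep blocking until they exit too, so this sketch only covers the simple case 
of a single child process.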

> Investigate recent flaky CI runs
> --------------------------------
>
> Key: HUDI-312
> URL: https://issues.apache.org/jira/browse/HUDI-312
> Project: Apache Hudi (incubating)
> Issue Type: New Feature
> Components: Testing
> Reporter: Vinoth Chandar
> Assignee: leesf
> Priority: Major
> Labels: pull-request-available
> Fix For: 0.5.1
>
> Attachments: Builds - apache_incubator-hudi - Travis CI.pdf
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Master used to be solid green. Noticing that nowadays PRs and even some 
> master merges fail with 
> - No output received for 10m
> - Exceeded runtime of 50m 
> - VM exit crash 
> We saw this earlier in the year as well. It was due to the Apache org queue 
> in Travis being busy/stressed. I think we should shadow Azure CI or Circle CI 
> in parallel and weed out code vs. environment issues.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HUDI-312) Investigate recent flaky CI runs

2019-10-30 Thread Balaji Varadarajan (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16963663#comment-16963663
 ] 

Balaji Varadarajan commented on HUDI-312:
-

I am able to consistently reproduce the integration test hang locally on my 
Mac laptop. I tried disabling the embedded timeline server and the tests seem 
to pass. I ran this only once, though. Opening a PR 
[https://github.com/apache/incubator-hudi/pull/989] to run the integration 
tests in Travis. The embedded timeline server has been enabled for some time 
now, so it is still not clear whether it is causing the hang. Even if this 
overcomes the hanging issue, we still need to get to the root of the issue.



[jira] [Commented] (HUDI-312) Investigate recent flaky CI runs

2019-10-30 Thread Vinoth Chandar (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16963610#comment-16963610
 ] 

Vinoth Chandar commented on HUDI-312:
-

I am surprised that you are unable to reproduce it. Balaji or I will take a 
stab at it. It seems a bit tricky, but it should be a recent regression.



[jira] [Commented] (HUDI-312) Investigate recent flaky CI runs

2019-10-29 Thread leesf (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16962467#comment-16962467
 ] 

leesf commented on HUDI-312:


Will provide a PR soon.
Of course, it makes sense to open a new issue to track the integ test hang. I 
have not been able to reproduce the integ test hang locally yet. :(



[jira] [Commented] (HUDI-312) Investigate recent flaky CI runs

2019-10-29 Thread Vinoth Chandar (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16961960#comment-16961960
 ] 

Vinoth Chandar commented on HUDI-312:
-

Maybe we should open a new issue for the case of the integ test hanging 
locally? Are you able to reproduce this as well, [~xleesf]? 



[jira] [Commented] (HUDI-312) Investigate recent flaky CI runs

2019-10-29 Thread Vinoth Chandar (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16961956#comment-16961956
 ] 

Vinoth Chandar commented on HUDI-312:
-

Please open a PR. Let's see how this goes. 





[jira] [Commented] (HUDI-312) Investigate recent flaky CI runs

2019-10-29 Thread Vinoth Chandar (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16961934#comment-16961934
 ] 

Vinoth Chandar commented on HUDI-312:
-

But does the test pass? I am wondering if there is a root cause we are 
missing. 
But I agree with you that fixes to stabilize things first would be good. 



[jira] [Commented] (HUDI-312) Investigate recent flaky CI runs

2019-10-29 Thread leesf (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16961772#comment-16961772
 ] 

leesf commented on HUDI-312:


On the no-output error, I changed the 
[travis.yaml|https://github.com/leesf/incubator-hudi/commit/dcc7e6a13b41dcf8e8df30eb8d5b64367b8feb06],
 borrowing from 
[here|https://github.com/cyberFund/cybernode-archive/commit/0dbb14c5169144b7535cbf3dc474916b93a64a5e],
 and ran it more than 5 times in 
[travis|https://travis-ci.org/leesf/incubator-hudi/jobs/604252022]; the 
no-output error did not occur again. I will keep restarting the Travis build 
and observe the results before sending a PR.
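
For readers unfamiliar with the "no output" failure: the CI service stops a 
job whose log stays silent for too long (10 minutes in the error quoted in 
this ticket). One generic workaround is a wrapper that prints a heartbeat 
line while the real command runs. The Python sketch below illustrates that 
general technique only; it is not a copy of the linked travis.yaml change, 
and run_with_heartbeat is a hypothetical name.

```python
import subprocess
import sys
import threading


def run_with_heartbeat(cmd, interval_s=60.0, out=sys.stdout):
    """Run cmd and print a heartbeat line every interval_s seconds
    until it exits, so the CI log never stays silent for long.

    Returns (exit_code, heartbeat_count). Hypothetical helper for
    illustration only.
    """
    done = threading.Event()
    beats = 0

    def heartbeat():
        nonlocal beats
        # Event.wait returns False on timeout and True once done is set.
        while not done.wait(interval_s):
            beats += 1
            print(f"[heartbeat {beats}] still running: {cmd[0]}",
                  file=out, flush=True)

    thread = threading.Thread(target=heartbeat, daemon=True)
    thread.start()
    try:
        # The child inherits our stdout/stderr, so its own output
        # still reaches the CI log directly.
        exit_code = subprocess.call(cmd)
    finally:
        done.set()
        thread.join()
    return exit_code, beats
```

Note that a heartbeat only hides the silence. If the command is genuinely 
hung, the job would still fail later on the overall runtime limit, so this 
does not replace finding the root cause.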



[jira] [Commented] (HUDI-312) Investigate recent flaky CI runs

2019-10-28 Thread Vinoth Chandar (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16961649#comment-16961649
 ] 

Vinoth Chandar commented on HUDI-312:
-

Right now, this ticket lacks a clear owner :). Feel free to pick it up if 
interested. On the no-output error: if it's from the integ test, then the 
thread on the mailing list could be useful for understanding it.



[jira] [Commented] (HUDI-312) Investigate recent flaky CI runs

2019-10-28 Thread leesf (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16961627#comment-16961627
 ] 

leesf commented on HUDI-312:


Another build error (No output has been received in the last 10m0s, this 
potentially indicates a stalled build or something wrong with the build 
itself.): https://travis-ci.org/leesf/incubator-hudi/jobs/604224429



[jira] [Commented] (HUDI-312) Investigate recent flaky CI runs

2019-10-28 Thread leesf (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16961205#comment-16961205
 ] 

leesf commented on HUDI-312:


Will send a PR later today.



[jira] [Commented] (HUDI-312) Investigate recent flaky CI runs

2019-10-28 Thread Vinoth Chandar (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16961179#comment-16961179
 ] 

Vinoth Chandar commented on HUDI-312:
-

Worth giving a shot. Can you please send a PR? 

I am still tracking down the docker/spark-submit hang. I have not gotten 
around to nailing it down. 



[jira] [Commented] (HUDI-312) Investigate recent flaky CI runs

2019-10-28 Thread leesf (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16961014#comment-16961014
 ] 

leesf commented on HUDI-312:


For VM crashes, the following information is useful:
https://stackoverflow.com/questions/23260057/the-forked-vm-terminated-without-saying-properly-goodbye-vm-crash-or-system-exi
I added -Xmx1024m -XX:MaxPermSize=256m to the surefire plugin configuration 
(https://github.com/leesf/incubator-hudi/commit/d2b26650fc921666761d816b721cea22307bf884)
 and built it many times in Travis 
(https://travis-ci.org/leesf/incubator-hudi/builds/603810854); the VM did not 
crash again. [~vinoth]



[jira] [Commented] (HUDI-312) Investigate recent flaky CI runs

2019-10-24 Thread vinoyang (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16959380#comment-16959380
 ] 

vinoyang commented on HUDI-312:
---

[~vinoth] yes, agree.



[jira] [Commented] (HUDI-312) Investigate recent flaky CI runs

2019-10-24 Thread Vinoth Chandar (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16959374#comment-16959374
 ] 

Vinoth Chandar commented on HUDI-312:
-

I think we could start by focusing on whether the integration test passes 
locally N times without hanging (two people have reported hangs so far). If 
it does hang, that would be problematic on Travis, causing VM timeouts. 

[~yanghua] Some of this is definitely Travis, especially the VM crashes. We 
have seen that in the past as well.
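
The "N times" check on the integration test could be scripted along these 
lines. This is only a sketch under assumptions: repeat_run is a hypothetical 
helper, the command to pass in is whatever launches the integration test 
locally, and a run is counted as hung simply because it exceeded a chosen 
time budget.

```python
import subprocess
import sys
from collections import Counter


def repeat_run(cmd, runs, timeout_s):
    """Run cmd `runs` times and tally passes, failures and hangs.

    A run counts as a hang when it exceeds timeout_s seconds.
    Hypothetical helper for illustration only.
    """
    tally = Counter(passed=0, failed=0, hung=0)
    for attempt in range(1, runs + 1):
        try:
            # subprocess.run kills the child and raises on timeout.
            result = subprocess.run(cmd, timeout=timeout_s)
            outcome = "passed" if result.returncode == 0 else "failed"
        except subprocess.TimeoutExpired:
            outcome = "hung"
        tally[outcome] += 1
        print(f"run {attempt}/{runs}: {outcome}", flush=True)
    return tally
```

A tally with any hung runs on a local machine would point at the code rather 
than at the CI environment.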



[jira] [Commented] (HUDI-312) Investigate recent flaky CI runs

2019-10-24 Thread leesf (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16959125#comment-16959125
 ] 

leesf commented on HUDI-312:


I will investigate the frequent VM exit crash error when I get some cycles.



[jira] [Commented] (HUDI-312) Investigate recent flaky CI runs

2019-10-24 Thread vinoyang (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16958656#comment-16958656
 ] 

vinoyang commented on HUDI-312:
---

Actually, based on my experience in the Flink community, Flink's Travis builds 
also often fail for unusual reasons, but not as frequently as Hudi's have 
recently. I have an unfounded guess: is it due to the instability of the test 
environment itself (since local tests often pass)?



[jira] [Commented] (HUDI-312) Investigate recent flaky CI runs

2019-10-23 Thread Balaji Varadarajan (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16958289#comment-16958289
 ] 

Balaji Varadarajan commented on HUDI-312:
-

Yeah, this looks like a recent issue. [~leesf] [~yanghua] [~nagarwal]: I am 
trying to get some tech debt related to timeline management finished up this 
week and would need one of you to take the lead on this. Let me know if you 
have cycles. 

Thanks,

Balaji.V



[jira] [Commented] (HUDI-312) Investigate recent flaky CI runs

2019-10-23 Thread Vinoth Chandar (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16957861#comment-16957861
 ] 

Vinoth Chandar commented on HUDI-312:
-

[~vbalaji] [~xleesf] [~yanghua] FYI 

If you look at the PDF, the flakiness is recent, and I can't think of any 
recent commit that could significantly affect CI this way. 
