I am able to run the tests successfully now. Thank you for digging into the
issue.

On Fri, Nov 1, 2019 at 11:04 PM [email protected] <[email protected]>
wrote:

> Agreed that we need to keep HDFS data transient across integration test
> runs. I have removed the volumes from the compose file and updated the PR
> https://github.com/apache/incubator-hudi/pull/989
> Hopefully, this should fix the flakiness.
> Balaji.V
>
> On Friday, November 1, 2019, 08:26:38 AM PDT, Vinoth Chandar <[email protected]> wrote:
>
> Update on this thread: there has been progress, and we have a few fixes
> being tested:
> https://github.com/vinothchandar/incubator-hudi/tree/hudi-312-flaky-tests
> https://github.com/apache/incubator-hudi/pull/989
>
> It boiled down to remnants from the previous run hanging around and
> causing invalid states. We also had a thread pool that wasn't closed
> upon such an unexpected error, causing the JVM to hang around.
> @Balaji Varadarajan I think it's best to rebuild and publish new images
> that use local storage for HDFS. WDYT?
>
> Also filed a few follow-ups: HUDI-322, HUDI-323
>
>
> On Sat, Oct 26, 2019 at 9:36 AM Vinoth Chandar <[email protected]> wrote:
>
> Disabling the UI is not doing the trick. I think it gets stuck while starting
> up (and not while exiting, as I incorrectly assumed before).
>
> On Fri, Oct 25, 2019 at 9:00 AM Vinoth Chandar <[email protected]> wrote:
>
> Could we disable the UI and try again? It's either the Jetty threads or the
> two HDFS threads that are hanging on. I cannot understand why the JVM wouldn't
> exit otherwise.
> On Fri, Oct 25, 2019 at 5:27 AM Bhavani Sudha <[email protected]>
> wrote:
>
> https://gist.github.com/bhasudha/5aac43d93a942f68bcab413a26229292
> Took a thread dump. It seems the Jetty threads are not shutting down. I don't
> see any pending Hudi/Spark-related activity. The only threads in the
> RUNNABLE state are the Jetty ones.
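
The check described above (which threads are still RUNNABLE in the dump) can be sketched as a small filter. This is only a minimal sketch, not what was actually run: the function name runnable_threads is hypothetical, and it assumes the dump uses the usual jstack text layout, where a line starting with the quoted thread name is followed by a "java.lang.Thread.State:" line.

```shell
# Hypothetical helper: read a jstack-style thread dump on stdin and print
# the name of every thread whose state is RUNNABLE, one per line.
runnable_threads() {
  awk '
    # A thread header line starts with the quoted thread name.
    /^"/ { split($0, parts, "\""); name = parts[2] }
    # The state line follows the header; print the remembered name.
    /java\.lang\.Thread\.State: RUNNABLE/ { print name }
  '
}
```

Assuming jstack is available inside the container and <pid> is the process id of the hung JVM, something like `docker exec adhoc-1 jstack <pid> | runnable_threads` would list the threads worth looking at first.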
>
> On Fri, Oct 25, 2019 at 1:54 AM Pratyaksh Sharma <[email protected]>
> wrote:
>
> > Hi Vinoth,
> >
> > > can you try
> > > - Do : docker ps -a and make sure there are no lingering containers.
> > > - if so, run : cd docker; ./stop_demo.sh
> > > - cd ..
> > > - mvn clean verify -DskipUTs=true -B
> >
> > I ran the above 3 times. It succeeded twice, but once it hit the
> > same errors I listed in my previous mail.
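
To tell an intermittent failure from a consistent one, the steps above can be repeated in a loop. The following is a minimal sketch, assuming it is run from the repository root; the function names, the default of 3 runs, and the integ-run-N.log file names are made up for illustration and are not part of the project.

```shell
# Stop the demo containers if `docker ps -a` still lists any container.
cleanup_containers() {
  if [ -n "$(docker ps -a -q)" ]; then
    (cd docker && ./stop_demo.sh)
  fi
}

# Run the integration tests N times (default 3, an arbitrary choice),
# keep one log per run, and print a summary.
# Returns non-zero if any run failed.
repeat_integ_tests() {
  runs="${1:-3}"
  failures=0
  i=1
  while [ "$i" -le "$runs" ]; do
    cleanup_containers
    # Same Maven command as in the steps above; output goes to a per-run log.
    if mvn clean verify -DskipUTs=true -B > "integ-run-$i.log" 2>&1; then
      echo "run $i: ok"
    else
      echo "run $i: FAILED"
      failures=$((failures + 1))
    fi
    i=$((i + 1))
  done
  echo "$failures of $runs runs failed"
  [ "$failures" -eq 0 ]
}
```

Calling `repeat_integ_tests 3` mirrors the three attempts mentioned here. Note that a run that hangs will block the loop; putting the GNU coreutils `timeout` command in front of mvn is one way around that.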
> >
> > On Fri, Oct 25, 2019 at 8:26 AM Vinoth Chandar <
> > [email protected]> wrote:
> >
> > > Got the integ test to hang once, at the same spot as Pratyaksh mentioned.
> > > So it would be a good candidate to drill into.
> > >
> > > @nishith in this state, the containers are all open. So you could just hop
> > > in and take a stack trace to see what's going on.
> > >
> > >
> > > On Thu, Oct 24, 2019 at 9:14 AM Nishith <[email protected]> wrote:
> > >
> > > > I’m going to look into the flaky tests on Travis sometime today.
> > > >
> > > > -Nishith
> > > >
> > > > Sent from my iPhone
> > > >
> > > > > On Oct 23, 2019, at 10:23 PM, Vinoth Chandar <[email protected]> wrote:
> > > > >
> > > > > Just to make sure we are on the same page,
> > > > >
> > > > > can you try
> > > > > > - Do : docker ps -a and make sure there are no lingering containers.
> > > > > - if so, run : cd docker; ./stop_demo.sh
> > > > > - cd ..
> > > > > - mvn clean verify -DskipUTs=true -B
> > > > >
> > > > > > and this always gets stuck? The failures on CI seem to be random timeouts,
> > > > > > not very related to this.
> > > > > >
> > > > > > FWIW I ran the above 3 times without glitches so far. So if you can
> > > > > > confirm, it'll help.
> > > > >
> > > > > >> On Wed, Oct 23, 2019 at 7:04 AM Vinoth Chandar <[email protected]> wrote:
> > > > > >>
> > > > > >> I saw someone else share the same experience. Can't think of anything that
> > > > > >> could have caused this to become flaky recently.
> > > > > >> I already created https://issues.apache.org/jira/browse/HUDI-312
> > > > > >> <https://issues.apache.org/jira/browse/HUDI-312?filter=12347468&jql=project%20%3D%20HUDI%20AND%20fixVersion%20%3D%200.5.1%20AND%20(status%20%3D%20Open%20OR%20status%20%3D%20%22In%20Progress%22)%20ORDER%20BY%20assignee%20ASC>
> > > > > >> to look into some flakiness on travis.
> > > > > >>
> > > > > >> any volunteers to drive this? (I am in the middle of fleshing out an RFC)
> > > > > >>
> > > > > >> On Wed, Oct 23, 2019 at 6:43 AM Pratyaksh Sharma <[email protected]> wrote:
> > > > >>
> > > > >>> It gets stuck forever while running the following -
> > > > >>>
> > > > > >>> Container : /adhoc-1, Running command :spark-submit --class
> > > > > >>> org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer
> > > > > >>> /var/hoodie/ws/docker/hoodie/hadoop/hive_base/target/hoodie-utilities.jar
> > > > > >>> --storage-type MERGE_ON_READ  --source-class
> > > > > >>> org.apache.hudi.utilities.sources.JsonDFSSource --source-ordering-field ts
> > > > > >>> --target-base-path /user/hive/warehouse/stock_ticks_mor --target-table
> > > > > >>> stock_ticks_mor --props /var/demo/config/dfs-source.properties
> > > > > >>> --schemaprovider-class
> > > > > >>> org.apache.hudi.utilities.schema.FilebasedSchemaProvider
> > > > > >>> --disable-compaction  --enable-hive-sync  --hoodie-conf
> > > > > >>> hoodie.datasource.hive_sync.jdbcurl=jdbc:hive2://hiveserver:10000
> > > > > >>> --hoodie-conf hoodie.datasource.hive_sync.username=hive --hoodie-conf
> > > > > >>> hoodie.datasource.hive_sync.password=hive  --hoodie-conf
> > > > > >>> hoodie.datasource.hive_sync.partition_fields=dt  --hoodie-conf
> > > > > >>> hoodie.datasource.hive_sync.database=default  --hoodie-conf
> > > > > >>> hoodie.datasource.hive_sync.table=stock_ticks_mor
> > > > > >>>
> > > > > >>> On Wed, Oct 23, 2019 at 7:02 PM Pratyaksh Sharma <[email protected]> wrote:
> > > > >>>
> > > > >>>> Hi,
> > > > >>>>
> > > > > >>>> I am facing errors when running the integration tests using the script
> > > > > >>>> travis_run_tests.sh; it also takes a long time, or rather gets stuck.
> > > > > >>>> If I run them like normal JUnit tests, they work fine.
> > > > > >>>>
> > > > > >>>> Random transient errors also occur sometimes, but these are the most
> > > > > >>>> frequent ones:
> > > > >>>>
> > > > > >>>> [ERROR] Tests run: 3, Failures: 3, Errors: 0, Skipped: 0, Time elapsed: 345.207 s <<< FAILURE! - in org.apache.hudi.integ.ITTestHoodieSanity
> > > > > >>>> [ERROR] testRunHoodieJavaAppOnSinglePartitionKeyCOWTable(org.apache.hudi.integ.ITTestHoodieSanity) Time elapsed: 129.227 s  <<< FAILURE!
> > > > > >>>> java.lang.AssertionError: Expecting 100 rows to be present in the new table expected:<100> but was:<200>
> > > > > >>>> at org.apache.hudi.integ.ITTestHoodieSanity.testRunHoodieJavaAppOnCOWTable(ITTestHoodieSanity.java:115)
> > > > > >>>> at org.apache.hudi.integ.ITTestHoodieSanity.testRunHoodieJavaAppOnSinglePartitionKeyCOWTable(ITTestHoodieSanity.java:42)
> > > > > >>>>
> > > > > >>>> [ERROR] testRunHoodieJavaAppOnMultiPartitionKeysCOWTable(org.apache.hudi.integ.ITTestHoodieSanity) Time elapsed: 108.146 s  <<< FAILURE!
> > > > > >>>> java.lang.AssertionError: Expecting 100 rows to be present in the new table expected:<100> but was:<200>
> > > > > >>>> at org.apache.hudi.integ.ITTestHoodieSanity.testRunHoodieJavaAppOnCOWTable(ITTestHoodieSanity.java:115)
> > > > > >>>> at org.apache.hudi.integ.ITTestHoodieSanity.testRunHoodieJavaAppOnMultiPartitionKeysCOWTable(ITTestHoodieSanity.java:54)
> > > > > >>>>
> > > > > >>>> [ERROR] testRunHoodieJavaAppOnNonPartitionedCOWTable(org.apache.hudi.integ.ITTestHoodieSanity) Time elapsed: 107.63 s  <<< FAILURE!
> > > > > >>>> java.lang.AssertionError: Expecting 100 rows to be present in the new table expected:<100> but was:<200>
> > > > > >>>> at org.apache.hudi.integ.ITTestHoodieSanity.testRunHoodieJavaAppOnCOWTable(ITTestHoodieSanity.java:115)
> > > > > >>>> at org.apache.hudi.integ.ITTestHoodieSanity.testRunHoodieJavaAppOnNonPartitionedCOWTable(ITTestHoodieSanity.java:66)
> > > > >>>>
> > > > >>>> Has anybody else faced similar issues?
> > > > >>>>
> > > > >>>>
> > > > >>>>
> > > > >>>
> > > > >>
> > > >
> > >
> >
>
>
>
>
