Re: Unable to run hudi-cli integration tests

2020-05-19 Thread Pratyaksh Sharma
Hi hddong,

Thank you for your help. It looks like the brew installation of Spark was the
issue. I set up Spark on my machine using the Spark binaries instead, and it
runs fine now.
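
For anyone hitting the same problem, a rough sketch of the setup that works; the download URL is illustrative (pick a mirror from the Spark downloads page), and the `libexec` note reflects Homebrew's usual Cellar layout:

```shell
# Download and unpack the official Spark 2.4.5 binaries.
curl -LO https://archive.apache.org/dist/spark/spark-2.4.5/spark-2.4.5-bin-hadoop2.7.tgz
tar -xzf spark-2.4.5-bin-hadoop2.7.tgz -C /opt

# Point SPARK_HOME at the unpacked distribution. Note that with a Homebrew
# install, the actual distribution lives one level deeper than the Cellar
# version directory, under .../apache-spark/2.4.5/libexec.
export SPARK_HOME=/opt/spark-2.4.5-bin-hadoop2.7
export PATH="$SPARK_HOME/bin:$PATH"
```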



Re: Unable to run hudi-cli integration tests

2020-05-18 Thread Pratyaksh Sharma
Hi hddong,

The test in question in my error log (org.apache.hudi.cli.integ.
ITTestRepairsCommand.testDeduplicateWithReal) passes when run on our Travis
CI, so the problem must be with my local setup.



Re: Unable to run hudi-cli integration tests

2020-05-18 Thread hddong
Hi,

I have tried Docker before; it usually uses `execStartCmd` to execute commands
directly. For hudi-cli, however, we need to execute commands in interactive
mode, which is a bit different.
If there is a way to do that, running in Docker would be better.
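
To illustrate the difference (the container name and CLI path below are hypothetical):

```shell
# Non-interactive: run one command in the container and exit. This is what
# execStartCmd-style invocations map to, and it works for most tests.
docker exec hudi-cli-container /bin/bash -c "ls /opt/hudi"

# Interactive: hudi-cli is a shell that reads commands from stdin, so the
# equivalent needs a TTY and attached stdin (-it), which is harder to drive
# programmatically from a test harness.
docker exec -it hudi-cli-container /opt/hudi/hudi-cli/hudi-cli.sh
```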

@Shiyan Your command failed because the Spark task failed; I guess you need a
tmp folder. Run `mkdir /tmp/spark-events/` if you have not changed the Spark
config.
You'd better take a look at the detailed error log (above the assertion error).
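
A minimal sketch of that workaround; `/tmp/spark-events` is Spark's default value for `spark.eventLog.dir`, so this assumes you have not overridden it:

```shell
# When spark.eventLog.enabled=true, Spark writes event logs to
# /tmp/spark-events by default and the application fails to start if the
# directory is missing. Create it (idempotently) before re-running the test.
mkdir -p /tmp/spark-events

# Confirm the directory exists.
test -d /tmp/spark-events && echo "spark-events dir ready"
```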

@Pratyaksh Yes, it looks like dedup runs but does not actually do anything.
Could that be caused by your code changes?
Can you try running the test on the master branch and check whether the exception still occurs?



Re: Unable to run hudi-cli integration tests

2020-05-17 Thread Pratyaksh Sharma
Hi,

For me the test also runs, but looking at the error, it seems no deduplication
work is actually done, which is strange. Here is the error:

[ERROR] Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time
elapsed: 8.425 s <<< FAILURE! - in org.apache.hudi.cli.integ.ITTestRepairsCommand

[ERROR]
org.apache.hudi.cli.integ.ITTestRepairsCommand.testDeduplicateWithReal
Time elapsed: 7.588 s  <<< FAILURE!

org.opentest4j.AssertionFailedError: expected: <200> but was: <210>

at
org.apache.hudi.cli.integ.ITTestRepairsCommand.testDeduplicateWithReal(ITTestRepairsCommand.java:254)

210 records are present initially as well, so effectively the test runs
without doing anything. There is no error other than the one above.

I feel the integration tests for hudi-cli should also run in Docker like the
other integration tests, rather than against a local Spark installation.
That would help ensure such issues do not come up in the future. Thoughts?



Re: Unable to run hudi-cli integration tests

2020-05-17 Thread hddong
Hi Pratyaksh,

Does it throw the same exception? And can you check whether sparkLauncher
throws the same exception? Most of the time, ITTest failures are caused by the
config of the local Spark.
I got this exception before, but it ran successfully after `mvn clean
package ...`.

Regards
hddong

>


Re: Unable to run hudi-cli integration tests

2020-05-17 Thread Shiyan Xu
Hi Pratyaksh,

I have the same setup as yours. I would normally clean up my local
dependencies first and rebuild:

mvn dependency:purge-local-repository

mvn clean install -DskipTests -DskipITs

mvn -Dtest=ITTestRepairsCommand#testDeduplicateWithReal
-DfailIfNoTests=false test

Though I was able to run the test, it failed to pass... sharing this to see
whether others are hitting it too :)

[ERROR] Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed:
6.822 s <<< FAILURE! - in org.apache.hudi.cli.integ.ITTestRepairsCommand
[ERROR]
org.apache.hudi.cli.integ.ITTestRepairsCommand.testDeduplicateWithReal
 Time elapsed: 6.014 s  <<< FAILURE!
org.opentest4j.AssertionFailedError: expected:  but was: 
at
org.apache.hudi.cli.integ.ITTestRepairsCommand.testDeduplicateWithReal(ITTestRepairsCommand.java:171)

If it's the same for others, we'd need to investigate why CI is passing.
(hope it's just my setup)

My local setup
➜ mvn -v
Apache Maven 3.6.3 (cecedd343002696d0abb50b32b541b8a6ba2883f)
Maven home: /usr/local/Cellar/maven/3.6.3_1/libexec
Java version: 1.8.0_242, vendor: AdoptOpenJDK, runtime:
/Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home/jre
Default locale: en_US, platform encoding: UTF-8
OS name: "mac os x", version: "10.15.4", arch: "x86_64", family: "mac"

➜ scala -version
Scala code runner version 2.11.12 -- Copyright 2002-2017, LAMP/EPFL



Re: Unable to run hudi-cli integration tests

2020-05-17 Thread Pratyaksh Sharma
Hi hddong,

Strange, but nothing seems to work for me. I tried doing `mvn clean` and then
running the Travis tests. I also tried running `mvn clean package
-DskipTests -DskipITs -Pspark-shade-unbundle-avro` first and then running the
test with `mvn -Dtest=ITTestRepairsCommand#testDeduplicateWithReal
-DfailIfNoTests=false test`. Neither worked. I have a Spark installation and
I am setting SPARK_HOME to /usr/local/Cellar/apache-spark/2.4.5.

>


Re: Unable to run hudi-cli integration tests

2020-05-16 Thread hddong
Hi Pratyaksh,

run_travis_tests.sh does not run `mvn clean`; you can try running `mvn clean`
manually before the integration tests.

BTW, if you use IDEA, you can run
`mvn clean package -DskipTests -DskipITs -Pspark-shade-unbundle-avro` first,
and then run the integration test in IDEA just like a unit test.

One thing to note: you need a runnable Spark, and SPARK_HOME should be set in
your environment.
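
Putting those steps together, the sequence looks roughly like this (the SPARK_HOME path is an example; point it at your own Spark installation):

```shell
# 1. Make sure a runnable Spark is available and SPARK_HOME is exported.
export SPARK_HOME=/path/to/spark-2.4.5-bin-hadoop2.7

# 2. Build once, skipping both unit and integration tests.
mvn clean package -DskipTests -DskipITs -Pspark-shade-unbundle-avro

# 3. Run a single integration test from the command line
#    (or run it from IDEA like a unit test).
mvn -Dtest=ITTestRepairsCommand#testDeduplicateWithReal -DfailIfNoTests=false test
```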

Regards
hddong