[jira] [Commented] (HUDI-781) Re-design test utilities

2020-06-11 Thread Vinoth Chandar (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17133539#comment-17133539
 ] 

Vinoth Chandar commented on HUDI-781:
-

One thing to be careful about when deciding between mocks and functional tests 
is whether we are actually replacing the functional test, since that is the 
only case in which the test time will decrease.

I would not index too much on the time reduction alone. The functional tests 
are more comprehensive for the most part and have served us well for a long 
time. So we should only replace a functional test when we have good confidence 
that the coverage it provides can be achieved solely by a mocked unit test. 
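
The trade-off above can be illustrated with a minimal sketch. All names here 
(`CommitLog`, `LatestCommitFinder`, `StubCommitLog`) are hypothetical and are 
not Hudi classes, and the stub is hand-written rather than taken from any 
mocking library. The point is only that a stubbed test covers the selection 
logic and nothing else.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical dependency that would normally need a file system or engine. */
interface CommitLog {
  List<String> completedCommits();
}

/** Hypothetical pure-logic class under test. */
final class LatestCommitFinder {
  private final CommitLog log;

  LatestCommitFinder(CommitLog log) {
    this.log = log;
  }

  /** Returns the lexicographically greatest commit time, or null if none. */
  String latest() {
    String best = null;
    for (String commit : log.completedCommits()) {
      if (best == null || commit.compareTo(best) > 0) {
        best = commit;
      }
    }
    return best;
  }
}

/** Hand-written stub: returns canned data and counts how often it is called. */
final class StubCommitLog implements CommitLog {
  private final List<String> commits;
  int calls;

  StubCommitLog(List<String> commits) {
    this.commits = new ArrayList<>(commits);
  }

  @Override
  public List<String> completedCommits() {
    calls++;
    return commits;
  }
}
```

A test built on `StubCommitLog` runs in milliseconds, but it says nothing about 
how commits are actually read from storage. It can only replace a functional 
test if that part is covered elsewhere.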

> Re-design test utilities
> 
>
> Key: HUDI-781
> URL: https://issues.apache.org/jira/browse/HUDI-781
> Project: Apache Hudi
> Issue Type: Test
> Components: Testing
> Reporter: Raymond Xu
> Priority: Major
>
> Test utility classes are to be re-designed with considerations like
>  * Use more mockings
>  * Reduce spark context setup
>  * Improve/clean up data generator
> An RFC would be preferred for illustrating the design work.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HUDI-781) Re-design test utilities

2020-06-09 Thread Nishith Agarwal (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17130049#comment-17130049
 ] 

Nishith Agarwal commented on HUDI-781:
--

[~pwason] Can you help with #2? As we discussed, mocks can help reduce the 
build time, especially for client tests. 


[jira] [Commented] (HUDI-781) Re-design test utilities

2020-06-09 Thread Raymond Xu (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17129642#comment-17129642
 ] 

Raymond Xu commented on HUDI-781:
-

[~vinoth] Makes sense. I've paused #1 since it approaches the problem from a 
different angle. I've talked to [~garyli1019], who tried to eliminate the 
leaks, but it turned out to be difficult, probably because too many resources 
are initialized. We will definitely keep watching these issues and fix them 
whenever we can.


[jira] [Commented] (HUDI-781) Re-design test utilities

2020-06-09 Thread Vinoth Chandar (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17129248#comment-17129248
 ] 

Vinoth Chandar commented on HUDI-781:
-

Sounds good overall. I would suggest we get a head start on #5 though, since 
faster, more stable tests will likely help us iterate faster on the test 
redesign project as well.


[jira] [Commented] (HUDI-781) Re-design test utilities

2020-06-08 Thread Raymond Xu (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17128778#comment-17128778
 ] 

Raymond Xu commented on HUDI-781:
-

[~yanghua] [~nishith29] [~garyli1019]

Here is an execution plan for the subtasks:
 * To begin with, I'm trying to finish subtask #1 as it can be a quick win. As 
shown in [https://github.com/apache/hudi/pull/1619#issuecomment-627610722], we 
can reduce CI time by 10+ min by simply splitting the test tasks.
 * In parallel we can start #3. The proposed `hudi-testutils` module is to 
encompass all `testutils` from each module, which makes the test dependencies 
clearer. It will also clean up some misplaced tests found during the package 
restructuring. 
 ** org.apache.hudi.execution.TestBoundedInMemoryQueue in `hudi-client` should 
be put in `hudi-common` (due to client test harness dependency)
 ** org.apache.hudi.utilities.inline.fs.TestParquetInLining in `hudi-utilities` 
should be put in `hudi-common` (due to data generator dependency)
 * Once a minimum setup of `hudi-testutils` is done, we can start #4.
 ** Implement a shared spark session provider there
 ** Use the shared spark session provider for test suites, which group 
functional tests with similar setup/teardown logic (we may need to figure out 
the JUnit 5 equivalent of JUnit 4 test suites with Rule / ClassRule)
 ** By adopting the new provider class in functional tests one by one, we 
should start seeing reduced test time in the hudi-client module and others
 * #2 and #5 can be done in parallel.

Each subtask has its own detailed points in its ticket. Please review this 
rough plan and give feedback accordingly. Thanks!
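
As a rough illustration of the shared-provider idea in #4, here is a minimal 
sketch using only the JDK. `SharedResourceProvider` and `FakeSession` are 
hypothetical names, and `FakeSession` merely stands in for an expensive 
resource such as a Spark session; this is not the design proposed in the 
subtask ticket.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Hypothetical stand-in for an expensive resource such as a Spark session. */
final class FakeSession {
  /** Counts constructions so tests can verify the instance is reused. */
  static final AtomicInteger CREATED = new AtomicInteger();
  private boolean stopped;

  FakeSession() {
    CREATED.incrementAndGet();
  }

  void stop() {
    stopped = true;
  }

  boolean isStopped() {
    return stopped;
  }
}

/** Lazily creates one instance and hands the same one to every caller. */
final class SharedResourceProvider<T> {
  private final Supplier<T> factory;
  private final Consumer<T> closer;
  private T instance;

  SharedResourceProvider(Supplier<T> factory, Consumer<T> closer) {
    this.factory = factory;
    this.closer = closer;
  }

  /** Creates the resource on first use, then returns the cached instance. */
  synchronized T get() {
    if (instance == null) {
      instance = factory.get();
    }
    return instance;
  }

  /** Releases the resource once, e.g. after the whole suite has run. */
  synchronized void shutdown() {
    if (instance != null) {
      closer.accept(instance);
      instance = null;
    }
  }
}
```

In a JUnit 5 suite, such a provider could be held in a static field and shut 
down in an `@AfterAll` method, so that tests with similar setup/teardown pay 
the startup cost only once.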


[jira] [Commented] (HUDI-781) Re-design test utilities

2020-04-18 Thread Raymond Xu (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17086739#comment-17086739
 ] 

Raymond Xu commented on HUDI-781:
-

[~vinoth] [~yanghua] Thank you for the feedback. I created 
https://jira.apache.org/jira/browse/HUDI-811 for this task, which will be 
started after migrating everything to the JUnit 5 APIs.

> Re-design test utilities
> 
>
> Key: HUDI-781
> URL: https://issues.apache.org/jira/browse/HUDI-781
> Project: Apache Hudi (incubating)
> Issue Type: Improvement
> Components: Testing
> Reporter: Raymond Xu
> Priority: Major
>
> Test utility classes are to be re-designed with considerations like
>  * Use more mockings
>  * Reduce spark context setup
>  * Improve/clean up data generator
> An RFC would be preferred for illustrating the design work.



[jira] [Commented] (HUDI-781) Re-design test utilities

2020-04-14 Thread vinoyang (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17083724#comment-17083724
 ] 

vinoyang commented on HUDI-781:
---

+1 to distinguishing the pure unit tests from the engine-binding tests. This 
will make it easier to unify more in the future.


[jira] [Commented] (HUDI-781) Re-design test utilities

2020-04-14 Thread Vinoth Chandar (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17083624#comment-17083624
 ] 

Vinoth Chandar commented on HUDI-781:
-

+1. I would also add having the unit tests mirror the package structure.

What you'll find is that we have mostly functional-style tests (that's where 
you invest your time if it's scarce; I would probably make the same choice if 
I were in those shoes again :)).


[jira] [Commented] (HUDI-781) Re-design test utilities

2020-04-14 Thread Raymond Xu (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17083451#comment-17083451
 ] 

Raymond Xu commented on HUDI-781:
-

[~vinothchandar] [~yanghua] Before we consider re-designing any test 
utilities, I think it will be easier to just move classes and categorize them. 
I'm thinking of adding these 2 packages under each module:

* functional/ contains test cases that require spark context and local servers
* testutils/ contains all test utilities for that module

All other test packages should only contain pure-logic unit tests.

As for the RFC, I guess it's not really necessary at the moment for the 
restructuring. WDYT?
