Github user JoshRosen commented on the pull request:
https://github.com/apache/spark/pull/5694#issuecomment-111625224
After thinking about it a bit more, I think this PR's test-triggering logic
could be significantly easier to understand if we rewrote it in terms of a
job / dependency graph abstraction.
At a high level, we have Spark modules / components which
1. are affected by file changes (e.g. a module is associated with a set of
source files, so changes to those files change the module),
2. contain a set of tests to run, which are triggered via Maven, SBT, or
Python / R scripts, and
3. depend on other modules and have dependent modules: if we change core,
then every downstream module's tests should be run (see the sketch below).
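For illustration, here's a rough sketch of what such a module description
might look like; all of the names here (`Module`, `source_file_prefixes`,
`sbt_test_goals`, etc.) are hypothetical and not existing code in this PR:

```python
class Module(object):
    """Hypothetical description of one Spark module for test triggering.

    name                 -- short identifier, e.g. "core" or "pyspark"
    source_file_prefixes -- file path prefixes whose changes affect this module
    sbt_test_goals       -- SBT test goals to run when this module is triggered
    python_test_goals    -- Python / R test scripts to run, if any
    dependencies         -- upstream Module objects this module depends on
    """
    def __init__(self, name, source_file_prefixes, sbt_test_goals=(),
                 python_test_goals=(), dependencies=()):
        self.name = name
        self.source_file_prefixes = set(source_file_prefixes)
        self.sbt_test_goals = list(sbt_test_goals)
        self.python_test_goals = list(python_test_goals)
        self.dependencies = list(dependencies)

    def contains_file(self, filename):
        """Return True if a changed file belongs to this module."""
        return any(filename.startswith(p) for p in self.source_file_prefixes)
```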
Right now, the per-module logic is spread across a few different places: we
have one function that describes how to detect changes for all modules, another
function that (implicitly) deals with module dependencies, etc.
Instead, I propose that we introduce a class that describes a module, build a
dependency graph from instances of that class, and then phrase the "find
which tests to run" operations in terms of that graph. I think this will be
easier to understand and maintain.
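To make that concrete, here's a rough sketch of how the graph-based lookup
could work, assuming the hypothetical `Module` class above and an
`all_modules` registry (again, none of this is existing code, just one
possible shape for it):

```python
def modules_changed_by(changed_files, all_modules):
    """Map changed file paths to the set of directly affected modules."""
    return {m for m in all_modules
            for f in changed_files if m.contains_file(f)}


def transitive_dependents(modules, all_modules):
    """Expand a set of modules to include every downstream module."""
    to_test = set(modules)
    changed = True
    while changed:
        changed = False
        for m in all_modules:
            if m not in to_test and any(d in to_test for d in m.dependencies):
                to_test.add(m)
                changed = True
    return to_test


def tests_to_run(changed_files, all_modules):
    """Collect the test goals for every module affected by the change."""
    affected = transitive_dependents(
        modules_changed_by(changed_files, all_modules), all_modules)
    sbt_goals = [g for m in affected for g in m.sbt_test_goals]
    python_goals = [g for m in affected for g in m.python_test_goals]
    return sbt_goals, python_goals
```

With something like this, "changes to core trigger every downstream test"
falls out of the graph traversal rather than being encoded in several
separate functions.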