Github user vanzin commented on a diff in the pull request:
https://github.com/apache/spark/pull/718#discussion_r13404251
--- Diff: core/src/main/scala/org/apache/spark/deploy/history/HistoryServer.scala ---
@@ -290,8 +217,88 @@ object HistoryServer {
}
}
+ private def parse(args: List[String]): Unit = {
+ args match {
+ case ("--dir" | "-d") :: value :: tail =>
+ set("fs.logDirectory", value)
+ parse(tail)
+
+ case ("--port" | "-p") :: value :: tail =>
+ set("ui.port", value)
+ parse(tail)
+
+ case ("-D") :: opt :: value :: tail =>
+ set(opt, value)
+ parse(tail)
+
+ case ("--help" | "-h") :: tail =>
+ printUsageAndExit(0)
+
+ case Nil =>
+
+ case _ =>
+ printUsageAndExit(1)
+ }
+ }
+
+ private def set(name: String, value: String) = {
+ conf.set("spark.history." + name, value)
+ }
+
+ private def printUsageAndExit(exitCode: Int) {
+ System.err.println(
+ """
+ |Usage: HistoryServer [options]
+ |
+ |Options are set by passing "-D option value" command line arguments to the class.
+ |Command line options will override the Spark configuration file and system properties.
+ |History Server options are always available; additional options depend on the provider.
+ |
+ |History Server options:
+ |
+ | ui.port              Port where server will listen for connections (default 18080)
+ | ui.acls.enable       Whether to enable view acls for all applications (default false)
+ | provider             Name of history provider class (defaults to file system-based provider)
+ |
+ |FsHistoryProvider options:
+ |
+ | fs.logDirectory      Directory where app logs are stored (required)
+ | fs.updateInterval    How often to reload log data from storage (seconds, default 10)
+ |""".stripMargin)
+ System.exit(exitCode)
+ }
+
}
+private[spark] abstract class ApplicationHistoryProvider {
+
+ /**
+ * This method should return a list of applications available for the history server to
+ * show. The listing is assumed to be in descending time order.
+ *
+ * An adjusted offset should be returned if the app list has changed and the request
+ * references an invalid start offset. Otherwise, the provided offset should be returned.
+ *
+ * @param offset Starting offset for returned objects.
+ * @param count Max number of objects to return.
+ * @return 3-tuple (requested app list, adjusted offset, count of all available apps)
+ */
+ def getListing(offset: Int, count: Int): (Seq[ApplicationHistoryInfo], Int, Int)
--- End diff ---
I'm not sure I totally follow. If you mean that both methods return the same
type (ApplicationHistoryInfo), I agree that's a little confusing; it could be
easily changed, though. I can clarify what the ordering means, pending the
rest of the discussion.
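For what it's worth, the adjusted-offset contract described in the doc comment could be sketched like this (an illustrative sketch only, with made-up names like `InMemoryHistoryProvider` and an in-memory `allApps` list, not the real provider code):

```scala
// Sketch of the getListing contract: if the requested offset no longer
// points inside the listing (apps were added or removed), clamp it to the
// nearest valid value and return the adjusted offset so the caller can
// resynchronize. All names here are hypothetical.
case class ApplicationHistoryInfo(id: String, name: String, endTime: Long)

class InMemoryHistoryProvider(allApps: Seq[ApplicationHistoryInfo]) {
  // Returns (requested page, adjusted offset, total app count).
  def getListing(offset: Int, count: Int): (Seq[ApplicationHistoryInfo], Int, Int) = {
    val total = allApps.size
    val adjusted =
      if (total == 0) 0
      else math.max(0, math.min(offset, total - 1))
    (allApps.slice(adjusted, adjusted + count), adjusted, total)
  }
}
```

A caller that paged past the end would then see the returned offset jump back to the last valid entry instead of getting an error or an empty page.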
Regarding predicates, I was deliberately avoiding going down that path. I
don't see a good way to have both the FS-based backend and a future Yarn
backend cleanly share the same predicate language. Doing so would mean
creating some Spark-specific language for that and parsing / translating it
into what Yarn understands, or supporting a subset of Yarn's parameters in
the non-Yarn backend.
I'm not sure that's the best path forward (although I understand not
everybody uses Yarn, let alone the latest and greatest version). In my view,
when using Yarn as the backend, the user would navigate the listing using
Yarn's UI, which would link to the Spark history server for rendering
individual applications. The SHS listing page would just be a simple fallback
if someone ends up there; it wouldn't provide many features, and the
offset / limit parameters would be the bare minimum needed to let the server
scale without flooding the client with a huge HTML page.
But if you think that the SHS should be enhanced in the future to support
these kinds of predicates, offset / limits can be built into that language.
BTW, this is what the Yarn API exposes as far as predicates go:

    @PathParam("entityType") String entityType,
    @QueryParam("primaryFilter") String primaryFilter,
    @QueryParam("secondaryFilter") String secondaryFilter,
    @QueryParam("windowStart") String windowStart,
    @QueryParam("windowEnd") String windowEnd,
    @QueryParam("fromId") String fromId,
    @QueryParam("fromTs") String fromTs,
    @QueryParam("limit") String limit,
    @QueryParam("fields") String fields
I think you could map "offset" to the windowing parameters, although
documentation there is still kinda lacking. Anyway, to avoid going down a rat
hole here, I'll just pull the offset / limit parameters into the UI layer for
now.
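For illustration, one naive way the offset / count contract could be layered on top of a window-style API like Yarn's (everything here is hypothetical; `fetch` stands in for an actual timeline API call taking a fromId and a limit):

```scala
// Hypothetical translation of an offset/count request onto window-style
// parameters (fromId + limit). Since a windowed API has no numeric offset,
// this naive mapping over-fetches `offset + count` entries from the start
// of the window and discards the skipped prefix. Fine for small offsets,
// wasteful for large ones. All names here are illustrative.
case class TimelineEntity(id: String)

class YarnWindowedListing(fetch: (Option[String], Int) => Seq[TimelineEntity]) {
  def getListing(offset: Int, count: Int): Seq[TimelineEntity] =
    fetch(None, offset + count).drop(offset)
}
```

A smarter mapping would remember the last entity id of the previous page and pass it as fromId, avoiding the over-fetch, but that requires stateful paging rather than a stateless offset.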