Github user JoshRosen commented on the pull request:
https://github.com/apache/spark/pull/3121#issuecomment-61935501
To be more concrete, I'm suggesting something like this:
```scala
import scala.collection.mutable

import org.apache.spark.sql.{SchemaRDD, SQLContext}

/** Row type backing the `testData` table (top-level, as in the existing TestData.scala). */
case class TestData(key: Int, value: String)

object TestData {

  /**
   * Initialize TestData using the given SQLContext. This will re-create
   * all SchemaRDDs and tables using that context.
   */
  def init(sqlContext: SQLContext) {
    initMethods.foreach(m => m(sqlContext))
  }

  /** A sequence of functions that are invoked when `init()` is called. */
  private val initMethods = mutable.Buffer[SQLContext => Unit]()

  /**
   * Register a block of code to be called when TestData is initialized
   * with a new SQLContext.
   */
  private def onInit(block: SQLContext => Unit) {
    initMethods += block
  }

  def testData = _testData
  private var _testData: SchemaRDD = null
  onInit { sqlContext =>
    // Bring the implicit RDD-to-SchemaRDD conversion into scope.
    import sqlContext.createSchemaRDD
    _testData = sqlContext.sparkContext.parallelize(
      (1 to 100).map(i => TestData(i, i.toString))).toSchemaRDD
    testData.registerTempTable("testData")
  }

  case class LargeAndSmallInts(a: Int, b: Int)
  def largeAndSmallInts = _largeAndSmallInts
  private var _largeAndSmallInts: SchemaRDD = null
  onInit { sqlContext =>
    ...
  }

  [...]
}
```
This whole `onInit` thing is just a way to co-locate each field, its case class,
and its initialization code. From a client's perspective, the public `val`s
become getter `def`s, but everything else stays the same.
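
For example, a test suite could then rebuild the shared tables against its own context before running queries. A minimal sketch, assuming ScalaTest's `FunSuite` with `BeforeAndAfterAll` and a local `SparkContext` (the suite name and assertions are hypothetical, just to show the calling convention):
```scala
import org.apache.spark.SparkContext
import org.apache.spark.sql.SQLContext
import org.scalatest.{BeforeAndAfterAll, FunSuite}

// Hypothetical suite, for illustration only.
class ExampleQuerySuite extends FunSuite with BeforeAndAfterAll {
  private val sqlContext = new SQLContext(new SparkContext("local", "ExampleQuerySuite"))

  override def beforeAll() {
    // Runs every registered onInit block: re-creates all SchemaRDDs
    // and re-registers all temp tables against this context.
    TestData.init(sqlContext)
  }

  test("testData is registered for this context") {
    assert(TestData.testData.count() === 100)
    assert(sqlContext.sql("SELECT count(*) FROM testData").collect().head.getLong(0) === 100)
  }
}
```
Since `init()` replays every registered block, the getters always return SchemaRDDs created with the most recently supplied context.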