Approach 2 is definitely better :) Can you tell us more about the use case why you want to do this?
TD

On Wed, Apr 8, 2015 at 1:44 AM, Emre Sevinc <emre.sev...@gmail.com> wrote:
> Hello,
>
> This is about SPARK-3276, and I want to make MIN_REMEMBER_DURATION (which is
> now a constant) a variable (configurable, with a default value). Before
> spending effort on developing something and creating a pull request, I
> wanted to consult with the core developers to see which approach makes the
> most sense and has the higher probability of being accepted.
>
> The constant MIN_REMEMBER_DURATION can be seen at:
>
> https://github.com/apache/spark/blob/master/streaming/src/main/scala/org/apache/spark/streaming/dstream/FileInputDStream.scala#L338
>
> It is marked as a private member of the private[streaming] object
> FileInputDStream.
>
> Approach 1: Make MIN_REMEMBER_DURATION a variable, with a new name of
> minRememberDuration, and then add a new fileStream method to
> JavaStreamingContext.scala:
>
> https://github.com/apache/spark/blob/master/streaming/src/main/scala/org/apache/spark/streaming/api/java/JavaStreamingContext.scala
>
> such that the new fileStream method accepts a new parameter, e.g.
> minRememberDuration: Int (in seconds), and then uses this value to set the
> private minRememberDuration.
>
> Approach 2: Create a new, public Spark configuration property, e.g. named
> spark.rememberDuration.min (with a default value of 60 seconds), and then
> set the private variable minRememberDuration to the value of this Spark
> property.
>
> Approach 1 would mean adding a new method to the public API; Approach 2
> would mean creating a new public Spark property. Right now, Approach 2
> seems more straightforward and simpler to me, but nevertheless I wanted to
> have the opinions of other developers who know the internals of Spark
> better than I do.
>
> Kind regards,
> Emre Sevinç
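
For reference, Approach 2 boils down to a configuration lookup with a fallback default, along the lines of Spark's SparkConf.getInt(key, defaultValue). The sketch below is hypothetical and self-contained: it mimics the property store with a plain Map rather than depending on Spark, and it uses the example property name spark.rememberDuration.min from the proposal above, which is not (yet) a real Spark property.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, self-contained sketch of Approach 2: read the minimum
// remember duration from a configuration property, falling back to a
// 60-second default when the property is unset. In real Spark code this
// would be SparkConf.getInt(...) instead of a Map.
public class MinRememberDurationSketch {
    private final Map<String, String> props = new HashMap<>();

    // stand-in for SparkConf.set(key, value)
    public void set(String key, String value) {
        props.put(key, value);
    }

    // stand-in for SparkConf.getInt(key, defaultValue)
    public int getInt(String key, int defaultValue) {
        String v = props.get(key);
        return v == null ? defaultValue : Integer.parseInt(v);
    }

    public static void main(String[] args) {
        MinRememberDurationSketch conf = new MinRememberDurationSketch();

        // property unset: falls back to the proposed 60-second default
        System.out.println(conf.getInt("spark.rememberDuration.min", 60));

        // property explicitly configured by the user
        conf.set("spark.rememberDuration.min", "120");
        System.out.println(conf.getInt("spark.rememberDuration.min", 60));
    }
}
```

The appeal of this approach, as noted above, is that no public API surface changes: users who never set the property see exactly the current behavior.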