[ https://issues.apache.org/jira/browse/SPARK-3276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14483174#comment-14483174 ]
Emre Sevinç edited comment on SPARK-3276 at 4/7/15 2:36 PM:
------------------------------------------------------------

[~srowen] would it be fine if I added a public API method to the FileInputDStream class that takes a single parameter (a duration) and sets {{MIN_REMEMBER_DURATION}} to that value? That would of course also mean changing MIN_REMEMBER_DURATION from a constant into a variable, with a default value of 1 minute (the currently hard-coded value).

Or, as an alternative that achieves a similar effect: create another Spark configuration property (with a default value of 1 minute) and refactor the code so that the new {{minRememberDuration}} variable takes its value from that property.

Right now, I have no idea which of the two approaches above is more meaningful / idiomatic. Any comments?

was (Author: emres):
[~srowen] would it be fine if I added a public API method on FileInputDStream class that takes a single parameter (duration) and sets the value of MIN_REMEMBER_DURATION to that value? And of course, at the same time changing MIN_REMEMBER_DURATION from a constant into a variable, with a default value of 1 minute (that is the currently hard-coded value).

> Provide a API to specify MIN_REMEMBER_DURATION for files to consider as input in streaming
> ------------------------------------------------------------------------------------------
>
>                 Key: SPARK-3276
>                 URL: https://issues.apache.org/jira/browse/SPARK-3276
>             Project: Spark
>          Issue Type: Improvement
>          Components: Streaming
>    Affects Versions: 1.2.0
>            Reporter: Jack Hu
>            Priority: Minor
>
> Currently, StreamingContext offers only one API, textFileStream, for creating a text file DStream, and it always ignores old files. Sometimes the old files are still useful.
> Need an API to let the user choose whether the old files should be ignored or not.
> The API currently in StreamingContext:
> def textFileStream(directory: String): DStream[String] = {
>   fileStream[LongWritable, Text, TextInputFormat](directory).map(_._2.toString)
> }
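
The two options raised in the comment (a public setter versus a configuration property) can be sketched side by side. This is a minimal, self-contained Scala sketch and not Spark's actual code: Conf, FileStreamSketch, setMinRememberDuration, isWithinRememberWindow and the property key "spark.streaming.minRememberDuration" are all hypothetical names introduced for illustration, and the age check is a simplified assumption about how the remember window would be applied to a file's modification time.

```scala
import scala.concurrent.duration._

// Hypothetical stand-in for a Spark configuration object: a plain key/value map.
final case class Conf(settings: Map[String, String] = Map.empty) {
  def get(key: String, default: String): String = settings.getOrElse(key, default)
}

object FileStreamSketch {
  // Hypothetical property name; the thread does not fix a key.
  val MinRememberKey = "spark.streaming.minRememberDuration"
  // 1 minute, the hard-coded value mentioned in the comment, expressed in seconds.
  val DefaultSeconds = "60"
}

// Hypothetical stand-in for FileInputDStream; only the remember-duration logic is modelled.
class FileStreamSketch(conf: Conf) {
  // Option 2: the former constant becomes a variable initialised from configuration.
  private var minRememberDuration: FiniteDuration =
    conf.get(FileStreamSketch.MinRememberKey, FileStreamSketch.DefaultSeconds).toLong.seconds

  // Option 1: a public setter taking a single duration parameter.
  def setMinRememberDuration(d: FiniteDuration): this.type = {
    require(d > Duration.Zero, "remember duration must be positive")
    minRememberDuration = d
    this
  }

  def currentMinRememberDuration: FiniteDuration = minRememberDuration

  // Simplified assumption: a file is considered if it was modified within the window.
  def isWithinRememberWindow(modTimeMs: Long, nowMs: Long): Boolean =
    nowMs - modTimeMs <= minRememberDuration.toMillis
}
```

With the setter, the value is chosen per stream in code; with the property, it is chosen once per application without touching the stream API. The sketch supports both at the same time only to show them together, a real change would presumably pick one.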