Re: Re: Re: How to change output mode to Update
Sorry, my mistake -- I gave the wrong id. Here is the correct one:
https://issues.apache.org/jira/browse/SPARK-15183

On Wed, May 18, 2016 at 11:19 AM, Todd wrote:

> Hi Sachin,
>
> Could you please give the url of jira-15146? Thanks!
>
> At 2016-05-18 13:33:47, "Sachin Aggarwal" wrote:
>
> Hi, there is some code I have added in jira-15146; please have a look at
> it. I have not finished it. You can use the same code in your example for
> now.
>
> On 18-May-2016 10:46 AM, "Saisai Shao" wrote:
>
>> > .mode(SaveMode.Overwrite)
>>
>> From my understanding, mode is not supported in a continuous query:
>>
>> def mode(saveMode: SaveMode): DataFrameWriter = {
>>   // mode() is used for non-continuous queries
>>   // outputMode() is used for continuous queries
>>   assertNotStreaming("mode() can only be called on non-continuous queries")
>>   this.mode = saveMode
>>   this
>> }
>>
>> On Wed, May 18, 2016 at 12:25 PM, Todd wrote:
>>
>>> Thanks Ted.
>>>
>>> I didn't try, but I think SaveMode and OutputMode are different things.
>>> Currently, the Spark code contains two output modes, Append and Update.
>>> Append is the default mode, but it looks like there is no way to change
>>> to Update.
>>>
>>> Take a look at DataFrameWriter#startQuery.
>>>
>>> Thanks.
>>>
>>> At 2016-05-18 12:10:11, "Ted Yu" wrote:
>>>
>>> Have you tried adding:
>>>
>>> .mode(SaveMode.Overwrite)
>>>
>>> On Tue, May 17, 2016 at 8:55 PM, Todd wrote:
>>>
>>> scala> records.groupBy("name").count().write.trigger(ProcessingTime("30 seconds")).option("checkpointLocation", "file:///home/hadoop/jsoncheckpoint").startStream("file:///home/hadoop/jsonresult")
>>> org.apache.spark.sql.AnalysisException: Aggregations are not supported
>>> on streaming DataFrames/Datasets in Append output mode.
>>> Consider changing output mode to Update.;
>>>   at org.apache.spark.sql.catalyst.analysis.UnsupportedOperationChecker$.org$apache$spark$sql$catalyst$analysis$UnsupportedOperationChecker$$throwError(UnsupportedOperationChecker.scala:142)
>>>   at org.apache.spark.sql.catalyst.analysis.UnsupportedOperationChecker$$anonfun$checkForStreaming$1.apply(UnsupportedOperationChecker.scala:59)
>>>   at org.apache.spark.sql.catalyst.analysis.UnsupportedOperationChecker$$anonfun$checkForStreaming$1.apply(UnsupportedOperationChecker.scala:46)
>>>   at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:125)
>>>   at org.apache.spark.sql.catalyst.analysis.UnsupportedOperationChecker$.checkForStreaming(UnsupportedOperationChecker.scala:46)
>>>   at org.apache.spark.sql.ContinuousQueryManager.startQuery(ContinuousQueryManager.scala:190)
>>>   at org.apache.spark.sql.DataFrameWriter.startStream(DataFrameWriter.scala:351)
>>>   at org.apache.spark.sql.DataFrameWriter.startStream(DataFrameWriter.scala:279)
>>>
>>> I briefly went through the Spark code; it looks like there is no way to
>>> change the output mode to Update?

--
Thanks & Regards
Sachin Aggarwal
7760502772
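The AnalysisException quoted above comes from the Append/Update distinction. As a rough illustration of that distinction only (plain Python, no Spark; the function and data here are invented for the sketch): Update mode re-emits an aggregate row whenever its value changes in a trigger, while Append may emit a row only once it can never change again -- which, for a running count over an unbounded stream, is never.

```python
from collections import Counter

def run_update_mode(batches):
    """Per trigger, emit only the (key, count) rows whose aggregate changed.

    This simulates Update output mode for a streaming groupBy("name").count():
    the running counts live across triggers, and each trigger re-emits the
    keys that were touched. Append mode could emit nothing here, because a
    running count over an unbounded stream is never final.
    """
    counts = Counter()
    emissions = []
    for batch in batches:
        counts.update(batch)
        emissions.append({key: counts[key] for key in set(batch)})
    return emissions

# Trigger 1 re-emits both keys; trigger 2 re-emits only "bob" (now counted twice).
print(run_update_mode([["alice", "bob", "alice"], ["bob"]]))
```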
Re: Re: How to change output mode to Update
Hi, there is some code I have added in jira-15146; please have a look at
it. I have not finished it. You can use the same code in your example for
now.

On 18-May-2016 10:46 AM, "Saisai Shao" wrote:

> > .mode(SaveMode.Overwrite)
>
> From my understanding, mode is not supported in a continuous query:
>
> def mode(saveMode: SaveMode): DataFrameWriter = {
>   // mode() is used for non-continuous queries
>   // outputMode() is used for continuous queries
>   assertNotStreaming("mode() can only be called on non-continuous queries")
>   this.mode = saveMode
>   this
> }
>
> On Wed, May 18, 2016 at 12:25 PM, Todd wrote:
>
>> Thanks Ted.
>>
>> I didn't try, but I think SaveMode and OutputMode are different things.
>> Currently, the Spark code contains two output modes, Append and Update.
>> Append is the default mode, but it looks like there is no way to change
>> to Update.
>>
>> Take a look at DataFrameWriter#startQuery.
>>
>> Thanks.
>>
>> At 2016-05-18 12:10:11, "Ted Yu" wrote:
>>
>> Have you tried adding:
>>
>> .mode(SaveMode.Overwrite)
>>
>> On Tue, May 17, 2016 at 8:55 PM, Todd wrote:
>>
>>> scala> records.groupBy("name").count().write.trigger(ProcessingTime("30 seconds")).option("checkpointLocation", "file:///home/hadoop/jsoncheckpoint").startStream("file:///home/hadoop/jsonresult")
>>> org.apache.spark.sql.AnalysisException: Aggregations are not supported
>>> on streaming DataFrames/Datasets in Append output mode.
>>> Consider changing output mode to Update.;
>>>   at org.apache.spark.sql.catalyst.analysis.UnsupportedOperationChecker$.org$apache$spark$sql$catalyst$analysis$UnsupportedOperationChecker$$throwError(UnsupportedOperationChecker.scala:142)
>>>   at org.apache.spark.sql.catalyst.analysis.UnsupportedOperationChecker$$anonfun$checkForStreaming$1.apply(UnsupportedOperationChecker.scala:59)
>>>   at org.apache.spark.sql.catalyst.analysis.UnsupportedOperationChecker$$anonfun$checkForStreaming$1.apply(UnsupportedOperationChecker.scala:46)
>>>   at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:125)
>>>   at org.apache.spark.sql.catalyst.analysis.UnsupportedOperationChecker$.checkForStreaming(UnsupportedOperationChecker.scala:46)
>>>   at org.apache.spark.sql.ContinuousQueryManager.startQuery(ContinuousQueryManager.scala:190)
>>>   at org.apache.spark.sql.DataFrameWriter.startStream(DataFrameWriter.scala:351)
>>>   at org.apache.spark.sql.DataFrameWriter.startStream(DataFrameWriter.scala:279)
>>>
>>> I briefly went through the Spark code; it looks like there is no way to
>>> change the output mode to Update?
Re: Re: How to change output mode to Update
> .mode(SaveMode.Overwrite)

From my understanding, mode is not supported in a continuous query:

def mode(saveMode: SaveMode): DataFrameWriter = {
  // mode() is used for non-continuous queries
  // outputMode() is used for continuous queries
  assertNotStreaming("mode() can only be called on non-continuous queries")
  this.mode = saveMode
  this
}

On Wed, May 18, 2016 at 12:25 PM, Todd wrote:

> Thanks Ted.
>
> I didn't try, but I think SaveMode and OutputMode are different things.
> Currently, the Spark code contains two output modes, Append and Update.
> Append is the default mode, but it looks like there is no way to change
> to Update.
>
> Take a look at DataFrameWriter#startQuery.
>
> Thanks.
>
> At 2016-05-18 12:10:11, "Ted Yu" wrote:
>
> Have you tried adding:
>
> .mode(SaveMode.Overwrite)
>
> On Tue, May 17, 2016 at 8:55 PM, Todd wrote:
>
>> scala> records.groupBy("name").count().write.trigger(ProcessingTime("30 seconds")).option("checkpointLocation", "file:///home/hadoop/jsoncheckpoint").startStream("file:///home/hadoop/jsonresult")
>> org.apache.spark.sql.AnalysisException: Aggregations are not supported
>> on streaming DataFrames/Datasets in Append output mode.
>> Consider changing output mode to Update.;
>>   at org.apache.spark.sql.catalyst.analysis.UnsupportedOperationChecker$.org$apache$spark$sql$catalyst$analysis$UnsupportedOperationChecker$$throwError(UnsupportedOperationChecker.scala:142)
>>   at org.apache.spark.sql.catalyst.analysis.UnsupportedOperationChecker$$anonfun$checkForStreaming$1.apply(UnsupportedOperationChecker.scala:59)
>>   at org.apache.spark.sql.catalyst.analysis.UnsupportedOperationChecker$$anonfun$checkForStreaming$1.apply(UnsupportedOperationChecker.scala:46)
>>   at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:125)
>>   at org.apache.spark.sql.catalyst.analysis.UnsupportedOperationChecker$.checkForStreaming(UnsupportedOperationChecker.scala:46)
>>   at org.apache.spark.sql.ContinuousQueryManager.startQuery(ContinuousQueryManager.scala:190)
>>   at org.apache.spark.sql.DataFrameWriter.startStream(DataFrameWriter.scala:351)
>>   at org.apache.spark.sql.DataFrameWriter.startStream(DataFrameWriter.scala:279)
>>
>> I briefly went through the Spark code; it looks like there is no way to
>> change the output mode to Update?
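The quoted mode()/outputMode() guard boils down to one pattern: a batch-only method checks whether the writer is bound to a streaming query and fails fast, while the streaming-side setter stays available. A minimal plain-Python sketch of that pattern (class and attribute names here are invented for illustration; this is not Spark's API):

```python
class AnalysisError(Exception):
    """Stands in for org.apache.spark.sql.AnalysisException."""

class Writer:
    """Toy writer mirroring the guard in the quoted DataFrameWriter.mode()."""

    def __init__(self, streaming):
        self.streaming = streaming  # True when bound to a continuous query
        self.save_mode = None       # batch-only setting (SaveMode)
        self.out_mode = "append"    # streaming-only setting (OutputMode)

    def _assert_not_streaming(self, msg):
        if self.streaming:
            raise AnalysisError(msg)

    def mode(self, save_mode):
        # mode() is used for non-continuous (batch) queries only
        self._assert_not_streaming("mode() can only be called on non-continuous queries")
        self.save_mode = save_mode
        return self

    def output_mode(self, mode):
        # the streaming-side counterpart: set Append vs Update
        self.out_mode = mode
        return self

# A batch writer accepts mode(); a streaming writer rejects it, as in the thread.
Writer(streaming=False).mode("overwrite")
try:
    Writer(streaming=True).mode("overwrite")
except AnalysisError as err:
    print(err)
Writer(streaming=True).output_mode("update")
```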