Re: Re: Re: How to change output mode to Update

2016-05-18 Thread Sachin Aggarwal
Sorry, my mistake: I gave the wrong ID.

Here is the correct one:
https://issues.apache.org/jira/browse/SPARK-15183

On Wed, May 18, 2016 at 11:19 AM, Todd  wrote:

> Hi Sachin,
>
> Could you please give the url of jira-15146? Thanks!
>
>
>
>
>
> At 2016-05-18 13:33:47, "Sachin Aggarwal" 
> wrote:
>
> Hi, I have added some code in jira-15146; please have a look at it. I have
> not finished it, but you can use the same code in your example for now.
> On 18-May-2016 10:46 AM, "Saisai Shao"  wrote:
>
>> > .mode(SaveMode.Overwrite)
>>
>> From my understanding, mode() is not supported in continuous queries.
>>
>> def mode(saveMode: SaveMode): DataFrameWriter = {
>>   // mode() is used for non-continuous queries
>>   // outputMode() is used for continuous queries
>>   assertNotStreaming("mode() can only be called on non-continuous queries")
>>   this.mode = saveMode
>>   this
>> }
>>
>>
>> On Wed, May 18, 2016 at 12:25 PM, Todd  wrote:
>>
>>> Thanks Ted.
>>>
>>> I didn't try it, but I think SaveMode and OutputMode are different things.
>>> Currently, the Spark code contains two output modes, Append and Update.
>>> Append is the default mode, but it looks like there is no way to change to
>>> Update.
>>>
>>> Take a look at DataFrameWriter#startQuery
>>>
>>> Thanks.
>>>
>>>
>>>
>>>
>>>
>>>
>>> At 2016-05-18 12:10:11, "Ted Yu"  wrote:
>>>
>>> Have you tried adding:
>>>
>>> .mode(SaveMode.Overwrite)
>>>
>>> On Tue, May 17, 2016 at 8:55 PM, Todd  wrote:
>>>
>>>> scala> records.groupBy("name").count().write.trigger(ProcessingTime("30
>>>> seconds")).option("checkpointLocation",
>>>> "file:///home/hadoop/jsoncheckpoint").startStream("file:///home/hadoop/jsonresult")
>>>> org.apache.spark.sql.AnalysisException: Aggregations are not supported
>>>> on streaming DataFrames/Datasets in Append output mode. Consider changing
>>>> output mode to Update.;
>>>>   at
>>>> org.apache.spark.sql.catalyst.analysis.UnsupportedOperationChecker$.org$apache$spark$sql$catalyst$analysis$UnsupportedOperationChecker$$throwError(UnsupportedOperationChecker.scala:142)
>>>>   at
>>>> org.apache.spark.sql.catalyst.analysis.UnsupportedOperationChecker$$anonfun$checkForStreaming$1.apply(UnsupportedOperationChecker.scala:59)
>>>>   at
>>>> org.apache.spark.sql.catalyst.analysis.UnsupportedOperationChecker$$anonfun$checkForStreaming$1.apply(UnsupportedOperationChecker.scala:46)
>>>>   at
>>>> org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:125)
>>>>   at
>>>> org.apache.spark.sql.catalyst.analysis.UnsupportedOperationChecker$.checkForStreaming(UnsupportedOperationChecker.scala:46)
>>>>   at
>>>> org.apache.spark.sql.ContinuousQueryManager.startQuery(ContinuousQueryManager.scala:190)
>>>>   at
>>>> org.apache.spark.sql.DataFrameWriter.startStream(DataFrameWriter.scala:351)
>>>>   at
>>>> org.apache.spark.sql.DataFrameWriter.startStream(DataFrameWriter.scala:279)
>>>>
>>>>
>>>> I briefly read through the Spark code, and it looks like there is no way
>>>> to change the output mode to Update?
>>>>
>>>
>>>
>>


-- 

Thanks & Regards

Sachin Aggarwal
7760502772


Re: Re: How to change output mode to Update

2016-05-17 Thread Sachin Aggarwal
Hi, I have added some code in jira-15146; please have a look at it. I have
not finished it, but you can use the same code in your example for now.
On 18-May-2016 10:46 AM, "Saisai Shao"  wrote:

> > .mode(SaveMode.Overwrite)
>
> From my understanding, mode() is not supported in continuous queries.
>
> def mode(saveMode: SaveMode): DataFrameWriter = {
>   // mode() is used for non-continuous queries
>   // outputMode() is used for continuous queries
>   assertNotStreaming("mode() can only be called on non-continuous queries")
>   this.mode = saveMode
>   this
> }
>
>
> On Wed, May 18, 2016 at 12:25 PM, Todd  wrote:
>
>> Thanks Ted.
>>
>> I didn't try it, but I think SaveMode and OutputMode are different things.
>> Currently, the Spark code contains two output modes, Append and Update.
>> Append is the default mode, but it looks like there is no way to change to
>> Update.
>>
>> Take a look at DataFrameWriter#startQuery
>>
>> Thanks.
>>
>>
>>
>>
>>
>>
>> At 2016-05-18 12:10:11, "Ted Yu"  wrote:
>>
>> Have you tried adding:
>>
>> .mode(SaveMode.Overwrite)
>>
>> On Tue, May 17, 2016 at 8:55 PM, Todd  wrote:
>>
>>> scala> records.groupBy("name").count().write.trigger(ProcessingTime("30
>>> seconds")).option("checkpointLocation",
>>> "file:///home/hadoop/jsoncheckpoint").startStream("file:///home/hadoop/jsonresult")
>>> org.apache.spark.sql.AnalysisException: Aggregations are not supported
>>> on streaming DataFrames/Datasets in Append output mode. Consider changing
>>> output mode to Update.;
>>>   at
>>> org.apache.spark.sql.catalyst.analysis.UnsupportedOperationChecker$.org$apache$spark$sql$catalyst$analysis$UnsupportedOperationChecker$$throwError(UnsupportedOperationChecker.scala:142)
>>>   at
>>> org.apache.spark.sql.catalyst.analysis.UnsupportedOperationChecker$$anonfun$checkForStreaming$1.apply(UnsupportedOperationChecker.scala:59)
>>>   at
>>> org.apache.spark.sql.catalyst.analysis.UnsupportedOperationChecker$$anonfun$checkForStreaming$1.apply(UnsupportedOperationChecker.scala:46)
>>>   at
>>> org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:125)
>>>   at
>>> org.apache.spark.sql.catalyst.analysis.UnsupportedOperationChecker$.checkForStreaming(UnsupportedOperationChecker.scala:46)
>>>   at
>>> org.apache.spark.sql.ContinuousQueryManager.startQuery(ContinuousQueryManager.scala:190)
>>>   at
>>> org.apache.spark.sql.DataFrameWriter.startStream(DataFrameWriter.scala:351)
>>>   at
>>> org.apache.spark.sql.DataFrameWriter.startStream(DataFrameWriter.scala:279)
>>>
>>>
>>> I briefly read through the Spark code, and it looks like there is no way
>>> to change the output mode to Update?
>>>
>>
>>
>


Re: Re: How to change output mode to Update

2016-05-17 Thread Saisai Shao
> .mode(SaveMode.Overwrite)

From my understanding, mode() is not supported in continuous queries.

def mode(saveMode: SaveMode): DataFrameWriter = {
  // mode() is used for non-continuous queries
  // outputMode() is used for continuous queries
  assertNotStreaming("mode() can only be called on non-continuous queries")
  this.mode = saveMode
  this
}
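
To make the guard above easier to follow, here is a minimal, Spark-free sketch of the same idea. Everything in it is hypothetical and only mirrors the two comments in the snippet: `ToyWriter`, its `isStreaming` flag, the stand-in `SaveMode` and `OutputMode` types, and the `assertStreaming` helper are illustrative names, not Spark classes or the contents of any patch. It also throws `IllegalStateException` for simplicity, which is an assumption and not what Spark does.

```scala
// Hypothetical stand-ins; these are NOT Spark's own SaveMode/OutputMode.
sealed trait SaveMode
object SaveMode {
  case object ErrorIfExists extends SaveMode
  case object Overwrite extends SaveMode
}

sealed trait OutputMode
object OutputMode {
  case object Append extends OutputMode // default, as described in the thread
  case object Update extends OutputMode
}

// Toy writer: isStreaming models whether the query is continuous.
final class ToyWriter(isStreaming: Boolean) {
  private var saveMode: SaveMode = SaveMode.ErrorIfExists
  private var output: OutputMode = OutputMode.Append

  // Rejects calls that only make sense for batch (non-continuous) writes.
  private def assertNotStreaming(msg: String): Unit =
    if (isStreaming) throw new IllegalStateException(msg)

  // Assumed mirror-image guard for calls that only make sense when streaming.
  private def assertStreaming(msg: String): Unit =
    if (!isStreaming) throw new IllegalStateException(msg)

  // mode() is used for non-continuous queries.
  def mode(m: SaveMode): ToyWriter = {
    assertNotStreaming("mode() can only be called on non-continuous queries")
    saveMode = m
    this
  }

  // outputMode() is used for continuous queries.
  def outputMode(m: OutputMode): ToyWriter = {
    assertStreaming("outputMode() can only be called on continuous queries")
    output = m
    this
  }

  def currentSaveMode: SaveMode = saveMode
  def currentOutputMode: OutputMode = output
}

object GuardSketch {
  def main(args: Array[String]): Unit = {
    val streaming = new ToyWriter(isStreaming = true)
    // Calling mode() on a streaming writer fails, like the suggestion in the thread.
    println(scala.util.Try(streaming.mode(SaveMode.Overwrite)).isFailure)
    // outputMode() is the call that applies to a streaming writer.
    println(streaming.outputMode(OutputMode.Update).currentOutputMode)
  }
}
```

In this sketch, adding `.mode(SaveMode.Overwrite)` to a streaming writer fails in the guard and leaves the output mode at Append, so it could not resolve the aggregation error.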


On Wed, May 18, 2016 at 12:25 PM, Todd  wrote:

> Thanks Ted.
>
> I didn't try it, but I think SaveMode and OutputMode are different things.
> Currently, the Spark code contains two output modes, Append and Update.
> Append is the default mode, but it looks like there is no way to change to
> Update.
>
> Take a look at DataFrameWriter#startQuery
>
> Thanks.
>
>
>
>
>
>
> At 2016-05-18 12:10:11, "Ted Yu"  wrote:
>
> Have you tried adding:
>
> .mode(SaveMode.Overwrite)
>
> On Tue, May 17, 2016 at 8:55 PM, Todd  wrote:
>
>> scala> records.groupBy("name").count().write.trigger(ProcessingTime("30
>> seconds")).option("checkpointLocation",
>> "file:///home/hadoop/jsoncheckpoint").startStream("file:///home/hadoop/jsonresult")
>> org.apache.spark.sql.AnalysisException: Aggregations are not supported on
>> streaming DataFrames/Datasets in Append output mode. Consider changing
>> output mode to Update.;
>>   at
>> org.apache.spark.sql.catalyst.analysis.UnsupportedOperationChecker$.org$apache$spark$sql$catalyst$analysis$UnsupportedOperationChecker$$throwError(UnsupportedOperationChecker.scala:142)
>>   at
>> org.apache.spark.sql.catalyst.analysis.UnsupportedOperationChecker$$anonfun$checkForStreaming$1.apply(UnsupportedOperationChecker.scala:59)
>>   at
>> org.apache.spark.sql.catalyst.analysis.UnsupportedOperationChecker$$anonfun$checkForStreaming$1.apply(UnsupportedOperationChecker.scala:46)
>>   at
>> org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:125)
>>   at
>> org.apache.spark.sql.catalyst.analysis.UnsupportedOperationChecker$.checkForStreaming(UnsupportedOperationChecker.scala:46)
>>   at
>> org.apache.spark.sql.ContinuousQueryManager.startQuery(ContinuousQueryManager.scala:190)
>>   at
>> org.apache.spark.sql.DataFrameWriter.startStream(DataFrameWriter.scala:351)
>>   at
>> org.apache.spark.sql.DataFrameWriter.startStream(DataFrameWriter.scala:279)
>>
>>
>> I briefly read through the Spark code, and it looks like there is no way
>> to change the output mode to Update?
>>
>
>