Re: Continuous Processing mode behaves differently from Batch mode

2018-05-17 Thread Yuta Morisawa

Thank you for your reply.

I checked the web UI and found that the total number of tasks is 10.
So I changed the number of cores from 1 to 10, and now it works well.

But I still haven't figured out what is happening.

My assumption is that each job consists of 10 tasks by default and that each
task occupies 1 core.

So, in my case, assigning only 1 core caused the issue.
In other words, Continuous mode needs at least 10 cores.

Is that right?
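
For illustration, the assumption above can be modelled with a toy Python
simulation. This is only a sketch, not Spark code: the function name
`simulate`, the round-robin split of records across partitions, and the rule
that the first `cores` tasks are the ones that start are all assumptions made
for the example. It models the behaviour the Spark documentation describes for
continuous processing: one long-running task per source partition, each
holding a core for the lifetime of the query.

```python
def simulate(records, num_partitions, cores):
    """Toy model: one long-running task per partition, one core per task."""
    # Assumption: records are spread round-robin over the partitions.
    partitions = [records[i::num_partitions] for i in range(num_partitions)]
    # Long-running tasks never release their core, so only the first
    # `cores` tasks ever start; the remaining tasks wait indefinitely.
    running = partitions[:min(cores, num_partitions)]
    # Only records that belong to a running task reach the output.
    return [record for partition in running for record in partition]


# The same 10 lines sent 10 times: 100 records in total.
lines = [f"line-{i % 10}" for i in range(100)]

print(len(simulate(lines, num_partitions=10, cores=1)))   # 10
print(len(simulate(lines, num_partitions=10, cores=10)))  # 100
```

Under this model, 1 core with 10 partitions yields 10 of the 100 lines, which
would match the observed output without any deduplication taking place.
Whether that is what actually happened here is not confirmed in this thread.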


Regards,
Yuta

On 2018/05/16 15:24, Shixiong(Ryan) Zhu wrote:
> One possible case is you don't have enough resources to launch all tasks
> for your continuous processing query. Could you check the Spark UI and
> see if all tasks are running rather than waiting for resources?

Re: Continuous Processing mode behaves differently from Batch mode

2018-05-15 Thread Shixiong(Ryan) Zhu
One possible case is you don't have enough resources to launch all tasks
for your continuous processing query. Could you check the Spark UI and see
if all tasks are running rather than waiting for resources?

Best Regards,
Shixiong Zhu
Databricks Inc.
shixi...@databricks.com

databricks.com



Continuous Processing mode behaves differently from Batch mode

2018-05-15 Thread Yuta Morisawa

Hi all

I am using Structured Streaming in Continuous Processing mode and I
ran into an odd problem.

My code is very simple, almost identical to the sample code in the
documentation:

https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html#continuous-processing

When I send the same text data ten times (for example, a 10-line text),
the result in Batch mode has 100 lines.

But in Continuous Processing mode the result has only 10 lines.
It appears that duplicated lines are removed.

The only difference between the two versions of the code is whether the
trigger method is called.

Why do these two versions behave differently?
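
For readers following along, the "with or without trigger" difference can be
sketched in PySpark roughly as follows. The original code was not posted, so
this is an assumption about its shape: the helper names `local_master` and
`start_console_query` are hypothetical, `df` stands for a streaming DataFrame
read from a source that supports continuous processing, and the console sink,
the "1 second" checkpoint interval and local mode are chosen only for
illustration.

```python
def local_master(num_tasks):
    """Local-mode master URL that provides one core per long-running task."""
    return f"local[{num_tasks}]"


def start_console_query(df, continuous):
    """Start a console-sink query on a streaming DataFrame `df`.

    With continuous=False no trigger is set, so Spark uses its default
    micro-batch execution. With continuous=True the query runs in
    Continuous Processing mode with a 1 second checkpoint interval.
    """
    writer = df.writeStream.format("console")
    if continuous:
        # The only difference between the two variants is this call.
        writer = writer.trigger(continuous="1 second")
    return writer.start()
```

A session built with `SparkSession.builder.master(local_master(10))` would
give a local run 10 cores, enough for a continuous query that needs 10 tasks.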


--
Regards,
Yuta


-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org