[GitHub] flink pull request: [FLINK-1945][py] Python Tests less verbose

2015-11-19 Thread zentol
Github user zentol commented on the pull request:

https://github.com/apache/flink/pull/1376#issuecomment-158015146
  
@tillrohrmann I've changed the type deduction test to a) no longer produce 
any output and b) actually fail with an exception if things go wrong.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1945][py] Python Tests less verbose

2015-11-19 Thread tillrohrmann
Github user tillrohrmann commented on the pull request:

https://github.com/apache/flink/pull/1376#issuecomment-158016205
  
Great @zentol. Good to merge then.




[jira] [Commented] (FLINK-1945) Make python tests less verbose

2015-11-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15013292#comment-15013292
 ] 

ASF GitHub Bot commented on FLINK-1945:
---

Github user tillrohrmann commented on the pull request:

https://github.com/apache/flink/pull/1376#issuecomment-158016205
  
Great @zentol. Good to merge then.


> Make python tests less verbose
> --
>
> Key: FLINK-1945
> URL: https://issues.apache.org/jira/browse/FLINK-1945
> Project: Flink
>  Issue Type: Improvement
>  Components: Python API, Tests
>Reporter: Till Rohrmann
>Assignee: Chesnay Schepler
>Priority: Minor
>
> Currently, the python tests print a lot of log messages to stdout. 
> Furthermore, there seem to be some println statements which clutter the 
> console output. I think that these log messages are not required for the 
> tests and thus should be suppressed. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3020) Local streaming execution: set number of task manager slots to the maximum parallelism

2015-11-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15013311#comment-15013311
 ] 

ASF GitHub Bot commented on FLINK-3020:
---

Github user mxm commented on the pull request:

https://github.com/apache/flink/pull/1360#issuecomment-158020011
  
I would also be in favor of changing the local execution to always use the 
maximum specified parallelism as the number of task slots. IMHO the current 
behavior is not intuitive. The default parallelism currently acts as a maximum 
parallelism in local execution.


> Local streaming execution: set number of task manager slots to the maximum 
> parallelism
> --
>
> Key: FLINK-3020
> URL: https://issues.apache.org/jira/browse/FLINK-3020
> Project: Flink
>  Issue Type: Bug
>  Components: Local Runtime
>Affects Versions: 0.10.0
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Minor
> Fix For: 1.0.0, 0.10.1
>
>
> Quite an inconvenience is the local execution configuration behavior. It sets 
> the number of task slots of the mini cluster to the default parallelism. This 
> causes problems if you use {{setParallelism(parallelism)}} on an operator and 
> set a parallelism larger than the default parallelism.
> {noformat}
> Caused by: 
> org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: 
> Not enough free slots available to run the job. You can decrease the operator 
> parallelism or increase the number of slots per TaskManager in the 
> configuration. Task to schedule: < Attempt #0 (Flat Map (9/100)) @ 
> (unassigned) - [SCHEDULED] > with groupID < fa7240ee1fed08bd7e6278899db3e838 
> > in sharing group < SlotSharingGroup [f3d578e9819be9c39ceee86cf5eb8c08, 
> 8fa330746efa1d034558146e4604d0b4, fa7240ee1fed08bd7e6278899db3e838] >. 
> Resources available to scheduler: Number of instances=1, total number of 
> slots=8, available slots=0
>   at 
> org.apache.flink.runtime.jobmanager.scheduler.Scheduler.scheduleTask(Scheduler.java:256)
>   at 
> org.apache.flink.runtime.jobmanager.scheduler.Scheduler.scheduleImmediately(Scheduler.java:131)
>   at 
> org.apache.flink.runtime.executiongraph.Execution.scheduleForExecution(Execution.java:298)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionVertex.scheduleForExecution(ExecutionVertex.java:458)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionJobVertex.scheduleAll(ExecutionJobVertex.java:322)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraph.scheduleForExecution(ExecutionGraph.java:686)
>   at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:982)
>   at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:962)
>   at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:962)
>   at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
>   at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
>   at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41)
>   at 
> akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:401)
>   at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>   at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>   ... 2 more
> {noformat}
> I propose to change this behavior to setting the number of task slots to the 
> maximum parallelism present in the user program.
> What do you think?
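The proposal boils down to sizing the mini cluster by the maximum over the default parallelism and every operator's parallelism. A minimal sketch of that rule, assuming hypothetical names (`LocalSlotSizing`, `requiredSlots`) that are not the actual Flink internals:

```java
// Sketch of the proposed sizing rule: the local mini cluster needs as many
// task slots as the largest parallelism appearing in the user program.
// Class and method names are illustrative only.
class LocalSlotSizing {
    static int requiredSlots(int defaultParallelism,
                             java.util.List<Integer> operatorParallelisms) {
        int max = defaultParallelism;
        for (int p : operatorParallelisms) {
            max = Math.max(max, p); // an operator may exceed the default
        }
        return max;
    }
}
```

With this rule, the failing example above (default parallelism 8, one operator at parallelism 100) would get 100 slots instead of 8.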





[GitHub] flink pull request: [FLINK-3020][streaming] set number of task slo...

2015-11-19 Thread mxm
Github user mxm commented on the pull request:

https://github.com/apache/flink/pull/1360#issuecomment-158020011
  
I would also be in favor of changing the local execution to always use the 
maximum specified parallelism as the number of task slots. IMHO the current 
behavior is not intuitive. The default parallelism currently acts as a maximum 
parallelism in local execution.




[jira] [Commented] (FLINK-1994) Add different gain calculation schemes to SGD

2015-11-19 Thread Till Rohrmann (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15013142#comment-15013142
 ] 

Till Rohrmann commented on FLINK-1994:
--

Hi Trevor,

would be great to add tests and docs; otherwise it can easily happen that
someone changes things without noticing that it also affects other parts.

What you can do, though, is open the PR and then add commits with the
documentation and tests to the same branch. That way we can already review
the gain implementation.

Cheers, Till



> Add different gain calculation schemes to SGD
> -
>
> Key: FLINK-1994
> URL: https://issues.apache.org/jira/browse/FLINK-1994
> Project: Flink
>  Issue Type: Improvement
>  Components: Machine Learning Library
>Reporter: Till Rohrmann
>Assignee: Trevor Grant
>Priority: Minor
>  Labels: ML, Starter
>
> The current SGD implementation uses the formula 
> {{stepsize/sqrt(iterationNumber)}} as the gain for the weight updates. It would 
> be good to make the gain calculation configurable and to provide different 
> strategies for that. For example:
> * stepsize/(1 + iterationNumber)
> * stepsize*(1 + regularization * stepsize * iterationNumber)^(-3/4)
> See also [1] on how to properly select the gains.
> Resources:
> [1] http://arxiv.org/pdf/1107.2490.pdf
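The gain schemes listed above can be sketched as pluggable strategies. The names below (`GainStrategy`, `SgdGains`) are assumptions for illustration, not the Flink ML API:

```java
// Illustrative sketch of configurable gain (learning-rate) strategies for SGD.
// Interface and constant names are hypothetical, not part of Flink ML.
interface GainStrategy {
    double gain(double stepsize, double regularization, int iteration);
}

class SgdGains {
    // current behavior: stepsize / sqrt(iterationNumber)
    static final GainStrategy INVERSE_SQRT =
        (s, r, t) -> s / Math.sqrt(t);

    // stepsize / (1 + iterationNumber)
    static final GainStrategy INVERSE_LINEAR =
        (s, r, t) -> s / (1 + t);

    // stepsize * (1 + regularization * stepsize * iterationNumber)^(-3/4)
    static final GainStrategy REGULARIZED =
        (s, r, t) -> s * Math.pow(1 + r * s * t, -0.75);
}
```

An SGD solver could then take a `GainStrategy` as a configuration parameter and evaluate it once per iteration.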





[GitHub] flink pull request: [FLINK-2955] Add operators description in Tabl...

2015-11-19 Thread aljoscha
Github user aljoscha commented on the pull request:

https://github.com/apache/flink/pull/1318#issuecomment-157998634
  
Thanks for adding this. :+1: 




[jira] [Commented] (FLINK-2955) Add operations introduction in Table API page.

2015-11-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15013205#comment-15013205
 ] 

ASF GitHub Bot commented on FLINK-2955:
---

Github user aljoscha commented on the pull request:

https://github.com/apache/flink/pull/1318#issuecomment-157998150
  
I'm merging


> Add operations introduction in Table API page.
> --
>
> Key: FLINK-2955
> URL: https://issues.apache.org/jira/browse/FLINK-2955
> Project: Flink
>  Issue Type: New Feature
>  Components: Documentation
>Reporter: Chengxiang Li
>Assignee: Chengxiang Li
>Priority: Minor
>
> On the Table API page, there is no formal introduction of the currently 
> supported operations; it would be nice to have one.





[jira] [Commented] (FLINK-2955) Add operations introduction in Table API page.

2015-11-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15013212#comment-15013212
 ] 

ASF GitHub Bot commented on FLINK-2955:
---

Github user aljoscha commented on the pull request:

https://github.com/apache/flink/pull/1318#issuecomment-157999295
  
Github seems to not want to close this. @ChengXiangLi Could you please do 
it?


> Add operations introduction in Table API page.
> --
>
> Key: FLINK-2955
> URL: https://issues.apache.org/jira/browse/FLINK-2955
> Project: Flink
>  Issue Type: New Feature
>  Components: Documentation
>Reporter: Chengxiang Li
>Assignee: Chengxiang Li
>Priority: Minor
> Fix For: 1.0.0
>
>
> On the Table API page, there is no formal introduction of the currently 
> supported operations; it would be nice to have one.





[GitHub] flink pull request: [FLINK-2955] Add operators description in Tabl...

2015-11-19 Thread aljoscha
Github user aljoscha commented on the pull request:

https://github.com/apache/flink/pull/1318#issuecomment-157999295
  
Github seems to not want to close this. @ChengXiangLi Could you please do 
it?




[GitHub] flink pull request: [FLINK-3002] Add Either type to the Java API

2015-11-19 Thread vasia
Github user vasia commented on the pull request:

https://github.com/apache/flink/pull/1371#issuecomment-158003742
  
Thanks! Updated. I'll open a JIRA for integrating it into the TypeExtractor.




[jira] [Commented] (FLINK-3002) Add an EitherType to the Java API

2015-11-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15013228#comment-15013228
 ] 

ASF GitHub Bot commented on FLINK-3002:
---

Github user vasia commented on the pull request:

https://github.com/apache/flink/pull/1371#issuecomment-158003742
  
Thanks! Updated. I'll open a JIRA for integrating it into the TypeExtractor.


> Add an EitherType to the Java API
> -
>
> Key: FLINK-3002
> URL: https://issues.apache.org/jira/browse/FLINK-3002
> Project: Flink
>  Issue Type: New Feature
>  Components: Java API
>Affects Versions: 0.10.0
>Reporter: Stephan Ewen
>Assignee: Vasia Kalavri
> Fix For: 1.0.0
>
>
> Either types are recurring patterns and should be serialized efficiently, so 
> it makes sense to add them to the core Java API.
> Since Java does not have such a type as of Java 8, we would need to add our 
> own version.
> The Scala API already handles the Scala Either type efficiently. I would not 
> use the Scala Either type in the Java API, since we are trying to get the 
> {{flink-java}} project "Scala free" for people that don't use Scala and do not 
> want to worry about Scala version matches and mismatches.
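A minimal sketch of what such a type could look like; the shape shown here is an assumption for illustration, not necessarily what was merged into Flink:

```java
// Minimal sketch of an Either type for the Java API: a value that is
// exactly one of Left or Right. Illustrative only.
abstract class Either<L, R> {
    abstract boolean isLeft();
    boolean isRight() { return !isLeft(); }
    abstract L left();
    abstract R right();

    static <L, R> Either<L, R> left(final L value) {
        return new Either<L, R>() {
            boolean isLeft() { return true; }
            L left() { return value; }
            R right() { throw new IllegalStateException("Left has no right value"); }
        };
    }

    static <L, R> Either<L, R> right(final R value) {
        return new Either<L, R>() {
            boolean isLeft() { return false; }
            L left() { throw new IllegalStateException("Right has no left value"); }
            R right() { return value; }
        };
    }
}
```

Having a concrete core type like this is what makes an efficient dedicated serializer (rather than generic Kryo fallback) possible.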





[jira] [Created] (FLINK-3046) Integrate the Either Java type with the TypeExtractor

2015-11-19 Thread Vasia Kalavri (JIRA)
Vasia Kalavri created FLINK-3046:


 Summary: Integrate the Either Java type with the TypeExtractor
 Key: FLINK-3046
 URL: https://issues.apache.org/jira/browse/FLINK-3046
 Project: Flink
  Issue Type: Improvement
  Components: Type Serialization System
Affects Versions: 1.0.0
Reporter: Vasia Kalavri


Integrate the Either Java type with the TypeExtractor, so that the APIs 
recognize the type and choose the type info properly.





[jira] [Commented] (FLINK-2115) TableAPI throws ExpressionException for "Dangling GroupBy operation"

2015-11-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15013254#comment-15013254
 ] 

ASF GitHub Bot commented on FLINK-2115:
---

Github user ChengXiangLi commented on the pull request:

https://github.com/apache/flink/pull/1377#issuecomment-158007073
  
OK, I think it could be added to the `Select` operator description on the Table 
API page; it would depend on FLINK-2955. It seems to take quite a long time to 
sync merged commits to GitHub, so I'm waiting for FLINK-2955 to get synced.


> TableAPI throws ExpressionException for "Dangling GroupBy operation"
> 
>
> Key: FLINK-2115
> URL: https://issues.apache.org/jira/browse/FLINK-2115
> Project: Flink
>  Issue Type: Bug
>  Components: Table API
>Affects Versions: 0.9
>Reporter: Fabian Hueske
>Assignee: Chengxiang Li
>
> The following program throws an ExpressionException due to a "Dangling 
> GroupBy operation".
> However, I think the program is semantically correct and should execute.
> {code}
> public static void main(String[] args) throws Exception {
>   ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
>   DataSet<Integer> data = env.fromElements(1,2,2,3,3,3,4,4,4,4);
>   DataSet<Tuple2<Integer, Integer>> tuples = data
>   .map(new MapFunction<Integer, Tuple2<Integer, Integer>>() {
> @Override
> public Tuple2<Integer, Integer> map(Integer i) throws Exception {
>   return new Tuple2<Integer, Integer>(i, i*2);
> }
>   });
>   TableEnvironment tEnv = new TableEnvironment();
>   Table t = tEnv.toTable(tuples).as("i, i2")
>   .groupBy("i, i2").select("i, i2")
>   .groupBy("i").select("i, i.count as cnt");
>   tEnv.toSet(t, Row.class).print();
> }
> {code}





[GitHub] flink pull request: [FLINK-3020][streaming] set number of task slo...

2015-11-19 Thread mxm
Github user mxm commented on the pull request:

https://github.com/apache/flink/pull/1360#issuecomment-158009498
  
True. That's how it is handled on the batch side. I'm not sure about this 
behavior, though. If a user sets a default parallelism but uses operators with 
`parallelism > defaultParallelism`, this would fail, right? The rationale behind 
this is probably to maximize the parallelism for all operators and not have 
operators with exceptionally high parallelism.




[jira] [Commented] (FLINK-3020) Local streaming execution: set number of task manager slots to the maximum parallelism

2015-11-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15013266#comment-15013266
 ] 

ASF GitHub Bot commented on FLINK-3020:
---

Github user mxm commented on the pull request:

https://github.com/apache/flink/pull/1360#issuecomment-158009498
  
True. That's how it is handled on the batch side. I'm not sure about this 
behavior, though. If a user sets a default parallelism but uses operators with 
`parallelism > defaultParallelism`, this would fail, right? The rationale behind 
this is probably to maximize the parallelism for all operators and not have 
operators with exceptionally high parallelism.


> Local streaming execution: set number of task manager slots to the maximum 
> parallelism
> --
>
> Key: FLINK-3020
> URL: https://issues.apache.org/jira/browse/FLINK-3020
> Project: Flink
>  Issue Type: Bug
>  Components: Local Runtime
>Affects Versions: 0.10.0
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Minor
> Fix For: 1.0.0, 0.10.1
>
>
> Quite an inconvenience is the local execution configuration behavior. It sets 
> the number of task slots of the mini cluster to the default parallelism. This 
> causes problems if you use {{setParallelism(parallelism)}} on an operator and 
> set a parallelism larger than the default parallelism.
> {noformat}
> Caused by: 
> org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: 
> Not enough free slots available to run the job. You can decrease the operator 
> parallelism or increase the number of slots per TaskManager in the 
> configuration. Task to schedule: < Attempt #0 (Flat Map (9/100)) @ 
> (unassigned) - [SCHEDULED] > with groupID < fa7240ee1fed08bd7e6278899db3e838 
> > in sharing group < SlotSharingGroup [f3d578e9819be9c39ceee86cf5eb8c08, 
> 8fa330746efa1d034558146e4604d0b4, fa7240ee1fed08bd7e6278899db3e838] >. 
> Resources available to scheduler: Number of instances=1, total number of 
> slots=8, available slots=0
>   at 
> org.apache.flink.runtime.jobmanager.scheduler.Scheduler.scheduleTask(Scheduler.java:256)
>   at 
> org.apache.flink.runtime.jobmanager.scheduler.Scheduler.scheduleImmediately(Scheduler.java:131)
>   at 
> org.apache.flink.runtime.executiongraph.Execution.scheduleForExecution(Execution.java:298)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionVertex.scheduleForExecution(ExecutionVertex.java:458)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionJobVertex.scheduleAll(ExecutionJobVertex.java:322)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraph.scheduleForExecution(ExecutionGraph.java:686)
>   at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:982)
>   at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:962)
>   at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:962)
>   at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
>   at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
>   at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41)
>   at 
> akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:401)
>   at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>   at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>   ... 2 more
> {noformat}
> I propose to change this behavior to setting the number of task slots to the 
> maximum parallelism present in the user program.
> What do you think?





[jira] [Closed] (FLINK-2249) ExecutionEnvironment: Ignore calls to execute() if no data sinks defined

2015-11-19 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels closed FLINK-2249.
-
Resolution: Won't Fix

> ExecutionEnvironment: Ignore calls to execute() if no data sinks defined
> 
>
> Key: FLINK-2249
> URL: https://issues.apache.org/jira/browse/FLINK-2249
> Project: Flink
>  Issue Type: Improvement
>  Components: Java API, Scala API
>Affects Versions: 0.9
>Reporter: Maximilian Michels
>Assignee: Chesnay Schepler
>
> The basic skeleton of a Flink program looks like this: 
> {code}
> ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
> // bootstrap DataSet
> DataSet<..> ds = env.fromElements(1,2,3,4);
> // perform transformations
> ..
> // define sinks, e.g.
> ds.writeToTextFile("/some/path");
> // execute
> env.execute()
> {code}
> The first thing users do is change {{ds.writeToTextFile("/some/path");}} into 
> {{ds.print();}}. But that fails with an Exception ("No new data sinks 
> defined...").
> In FLINK-2026 we made this exception message easier to understand. However, 
> users still don't understand what is happening, especially because they see 
> Flink executing and then failing.
> I propose to ignore calls to execute() when no sinks are defined. Instead, we 
> should just print a warning: "Detected call to execute without any data 
> sinks. Not executing."
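The proposed guard can be sketched as follows; `ExecuteGuard` and `shouldExecute` are hypothetical names for illustration, not the actual `ExecutionEnvironment` code:

```java
// Sketch of the proposal: execute() becomes a no-op with a warning when the
// program defines no data sinks, instead of throwing an exception.
// Illustrative only.
class ExecuteGuard {
    static boolean shouldExecute(int numberOfSinks) {
        if (numberOfSinks == 0) {
            System.err.println(
                "Detected call to execute without any data sinks. Not executing.");
            return false; // skip job submission entirely
        }
        return true;
    }
}
```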





[jira] [Created] (FLINK-3045) Properly expose the key of a kafka message

2015-11-19 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-3045:
-

 Summary: Properly expose the key of a kafka message
 Key: FLINK-3045
 URL: https://issues.apache.org/jira/browse/FLINK-3045
 Project: Flink
  Issue Type: Improvement
  Components: Kafka Connector
Reporter: Robert Metzger
Assignee: Robert Metzger
Priority: Critical


Currently, the {{flink-kafka-connector}} is not properly exposing the message 
key.






[jira] [Commented] (FLINK-2978) Integrate web submission interface into the new dashboard

2015-11-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15013215#comment-15013215
 ] 

ASF GitHub Bot commented on FLINK-2978:
---

Github user sachingoel0101 commented on a diff in the pull request:

https://github.com/apache/flink/pull/1338#discussion_r45317406
  
--- Diff: 
flink-core/src/main/java/org/apache/flink/configuration/ConfigConstants.java ---
@@ -635,8 +644,18 @@
 * The default number of archived jobs for the jobmanager
 */
public static final int DEFAULT_JOB_MANAGER_WEB_ARCHIVE_COUNT = 5;
-   
-   
+
+   /**
+* By default, submitting jobs from the web-frontend is allowed.
+*/
+   public static final boolean DEFAULT_JOB_MANAGER_WEB_SUBMISSION = true;
+
+   /**
+* Default directory for uploaded file storage for the Web frontend.
+*/
+   public static final String DEFAULT_JOB_MANAGER_WEB_UPLOAD_DIR =
+   (System.getProperty("java.io.tmpdir") == null ? "/tmp" 
: System.getProperty("java.io.tmpdir")) + "/webmonitor/";
--- End diff --

@rmetzger can we make a decision on this? I'm in favor of having a 
directory `/jars/` inside the `webRootDir` created by the Web Monitor. There is 
already a shutdown hook for removing it.
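The alternative discussed here, a `jars` directory under the Web Monitor's root directory, could look roughly like this; the `webRootDir` parameter and class name are assumptions for illustration, not the actual Web Monitor code:

```java
// Sketch of the suggested alternative: keep uploaded jars in a "jars"
// subdirectory of the web monitor's root dir, which is already cleaned up
// by an existing shutdown hook. Names are illustrative only.
class UploadDirSketch {
    static java.io.File jarUploadDir(java.io.File webRootDir) {
        java.io.File jarDir = new java.io.File(webRootDir, "jars");
        jarDir.mkdirs(); // lives under webRootDir, so the shutdown hook removes it
        return jarDir;
    }
}
```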


> Integrate web submission interface into the new dashboard
> -
>
> Key: FLINK-2978
> URL: https://issues.apache.org/jira/browse/FLINK-2978
> Project: Flink
>  Issue Type: New Feature
>  Components: Web Client, Webfrontend
>Reporter: Sachin Goel
>Assignee: Sachin Goel
>
> As discussed in 
> http://mail-archives.apache.org/mod_mbox/flink-dev/201511.mbox/%3CCAL3J2zQg6UBKNDnm=8tshpz6r4p2jvx7nrlom7caajrb9s6...@mail.gmail.com%3E,
>  we should integrate job submission from the web into the dashboard.





[GitHub] flink pull request: [FLINK-2978][web-dashboard][webclient] Integra...

2015-11-19 Thread sachingoel0101
Github user sachingoel0101 commented on a diff in the pull request:

https://github.com/apache/flink/pull/1338#discussion_r45317406
  
--- Diff: 
flink-core/src/main/java/org/apache/flink/configuration/ConfigConstants.java ---
@@ -635,8 +644,18 @@
 * The default number of archived jobs for the jobmanager
 */
public static final int DEFAULT_JOB_MANAGER_WEB_ARCHIVE_COUNT = 5;
-   
-   
+
+   /**
+* By default, submitting jobs from the web-frontend is allowed.
+*/
+   public static final boolean DEFAULT_JOB_MANAGER_WEB_SUBMISSION = true;
+
+   /**
+* Default directory for uploaded file storage for the Web frontend.
+*/
+   public static final String DEFAULT_JOB_MANAGER_WEB_UPLOAD_DIR =
+   (System.getProperty("java.io.tmpdir") == null ? "/tmp" 
: System.getProperty("java.io.tmpdir")) + "/webmonitor/";
--- End diff --

@rmetzger can we make a decision on this? I'm in favor of having a 
directory `/jars/` inside the `webRootDir` created by the Web Monitor. There is 
already a shutdown hook for removing it.




[jira] [Resolved] (FLINK-3040) Add docs describing how to configure State Backends

2015-11-19 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen resolved FLINK-3040.
-
Resolution: Fixed

Fixed in
  - 0.10.1 in db456a761480679f54136743237999049cb7476b
  - 1.0.0 in 8b086eb918c05f9ea8a2cc3840ca538788b39b01

> Add docs describing how to configure State Backends
> ---
>
> Key: FLINK-3040
> URL: https://issues.apache.org/jira/browse/FLINK-3040
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Affects Versions: 0.10.0
>Reporter: Stephan Ewen
>Assignee: Stephan Ewen
> Fix For: 1.0.0, 0.10.1
>
>






[jira] [Closed] (FLINK-3040) Add docs describing how to configure State Backends

2015-11-19 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen closed FLINK-3040.
---

> Add docs describing how to configure State Backends
> ---
>
> Key: FLINK-3040
> URL: https://issues.apache.org/jira/browse/FLINK-3040
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Affects Versions: 0.10.0
>Reporter: Stephan Ewen
>Assignee: Stephan Ewen
> Fix For: 1.0.0, 0.10.1
>
>






[jira] [Reopened] (FLINK-2249) ExecutionEnvironment: Ignore calls to execute() if no data sinks defined

2015-11-19 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels reopened FLINK-2249:
---

> ExecutionEnvironment: Ignore calls to execute() if no data sinks defined
> 
>
> Key: FLINK-2249
> URL: https://issues.apache.org/jira/browse/FLINK-2249
> Project: Flink
>  Issue Type: Improvement
>  Components: Java API, Scala API
>Affects Versions: 0.9
>Reporter: Maximilian Michels
>Assignee: Chesnay Schepler
>
> The basic skeleton of a Flink program looks like this: 
> {code}
> ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
> // bootstrap DataSet
> DataSet<..> ds = env.fromElements(1,2,3,4);
> // perform transformations
> ..
> // define sinks, e.g.
> ds.writeToTextFile("/some/path");
> // execute
> env.execute()
> {code}
> The first thing users do is change {{ds.writeToTextFile("/some/path");}} into 
> {{ds.print();}}. But that fails with an Exception ("No new data sinks 
> defined...").
> In FLINK-2026 we made this exception message easier to understand. However, 
> users still don't understand what is happening, especially because they see 
> Flink executing and then failing.
> I propose to ignore calls to execute() when no sinks are defined. Instead, we 
> should just print a warning: "Detected call to execute without any data 
> sinks. Not executing."





[jira] [Commented] (FLINK-2115) TableAPI throws ExpressionException for "Dangling GroupBy operation"

2015-11-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15013200#comment-15013200
 ] 

ASF GitHub Bot commented on FLINK-2115:
---

Github user aljoscha commented on the pull request:

https://github.com/apache/flink/pull/1377#issuecomment-157997049
  
OK, I see your point. We can go with the MySQL way then. Thanks for 
explaining it. :smile: 

Could you please update the documentation with a small paragraph that 
explains the behavior?


> TableAPI throws ExpressionException for "Dangling GroupBy operation"
> 
>
> Key: FLINK-2115
> URL: https://issues.apache.org/jira/browse/FLINK-2115
> Project: Flink
>  Issue Type: Bug
>  Components: Table API
>Affects Versions: 0.9
>Reporter: Fabian Hueske
>Assignee: Chengxiang Li
>
> The following program throws an ExpressionException due to a "Dangling 
> GroupBy operation".
> However, I think the program is semantically correct and should execute.
> {code}
> public static void main(String[] args) throws Exception {
>   ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
>   DataSet<Integer> data = env.fromElements(1,2,2,3,3,3,4,4,4,4);
>   DataSet<Tuple2<Integer, Integer>> tuples = data
>   .map(new MapFunction<Integer, Tuple2<Integer, Integer>>() {
> @Override
> public Tuple2<Integer, Integer> map(Integer i) throws Exception {
>   return new Tuple2<Integer, Integer>(i, i*2);
> }
>   });
>   TableEnvironment tEnv = new TableEnvironment();
>   Table t = tEnv.toTable(tuples).as("i, i2")
>   .groupBy("i, i2").select("i, i2")
>   .groupBy("i").select("i, i.count as cnt");
>   tEnv.toSet(t, Row.class).print();
> }
> {code}





[GitHub] flink pull request: [FLINK-2115] [Table API] support no aggregatio...

2015-11-19 Thread aljoscha
Github user aljoscha commented on the pull request:

https://github.com/apache/flink/pull/1377#issuecomment-157997049
  
OK, I see your point. We can go with the MySQL way then. Thanks for 
explaining it. :smile: 

Could you please update the documentation with a small paragraph that 
explains the behavior?




[GitHub] flink pull request: [FLINK-2955] Add operators description in Tabl...

2015-11-19 Thread aljoscha
Github user aljoscha commented on the pull request:

https://github.com/apache/flink/pull/1318#issuecomment-157998150
  
I'm merging




[jira] [Commented] (FLINK-2955) Add operations introduction in Table API page.

2015-11-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15013208#comment-15013208
 ] 

ASF GitHub Bot commented on FLINK-2955:
---

Github user aljoscha commented on the pull request:

https://github.com/apache/flink/pull/1318#issuecomment-157998634
  
Thanks for adding this. :+1: 


> Add operations introduction in Table API page.
> --
>
> Key: FLINK-2955
> URL: https://issues.apache.org/jira/browse/FLINK-2955
> Project: Flink
>  Issue Type: New Feature
>  Components: Documentation
>Reporter: Chengxiang Li
>Assignee: Chengxiang Li
>Priority: Minor
>
> On the Table API page, there is no formal introduction of the currently 
> supported operations; it would be nice to have one.





[jira] [Resolved] (FLINK-2955) Add operations introduction in Table API page.

2015-11-19 Thread Aljoscha Krettek (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aljoscha Krettek resolved FLINK-2955.
-
   Resolution: Fixed
Fix Version/s: 1.0.0

Resolved in 
https://github.com/apache/flink/commit/8325bc6a664ede74be02e7b4de145aff4bb82a43

> Add operations introduction in Table API page.
> --
>
> Key: FLINK-2955
> URL: https://issues.apache.org/jira/browse/FLINK-2955
> Project: Flink
>  Issue Type: New Feature
>  Components: Documentation
>Reporter: Chengxiang Li
>Assignee: Chengxiang Li
>Priority: Minor
> Fix For: 1.0.0
>
>
> On the Table API page, there is no formal introduction of the currently 
> supported operations; it would be nice to have one.





[GitHub] flink pull request: [FLINK-2955] Add operators description in Tabl...

2015-11-19 Thread ChengXiangLi
Github user ChengXiangLi closed the pull request at:

https://github.com/apache/flink/pull/1318




[jira] [Commented] (FLINK-2955) Add operations introduction in Table API page.

2015-11-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15013234#comment-15013234
 ] 

ASF GitHub Bot commented on FLINK-2955:
---

Github user ChengXiangLi closed the pull request at:

https://github.com/apache/flink/pull/1318


> Add operations introduction in Table API page.
> --
>
> Key: FLINK-2955
> URL: https://issues.apache.org/jira/browse/FLINK-2955
> Project: Flink
>  Issue Type: New Feature
>  Components: Documentation
>Reporter: Chengxiang Li
>Assignee: Chengxiang Li
>Priority: Minor
> Fix For: 1.0.0
>
>
> On the Table API page, there is no formal introduction of the currently 
> supported operations; it would be nice to have one.





[GitHub] flink pull request: fix pos assignment of instance

2015-11-19 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/1345




[jira] [Comment Edited] (FLINK-3001) Add Support for Java 8 Optional type

2015-11-19 Thread Hubert Czerpak (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15013511#comment-15013511
 ] 

Hubert Czerpak edited comment on FLINK-3001 at 11/19/15 5:50 PM:
-

Optionals shouldn't be stored at all. In Java 8, an Optional's purpose is to 
serve as a return value from functions. Whenever you want to use its value 
(e.g. store it in a variable) you need to unwrap it from the Optional. This may 
differ from optionals in other languages, but Oracle recommends avoiding 
identity-sensitive operations. E.g. when sorting, I think the sorted collection 
should hold the unwrapped values rather than the Optionals themselves.


was (Author: hefron):
Optionals shouldn't be stored at all. Its purpose in Java8 is to be used as 
return value from functions. But whenever you want to use its value (e.g. store 
in a variable) you need to extract it from the optional. It may be different 
than optionals in other languages but Oracle recommends avoiding 
identity-sensitive operations. E.g. in cases of sorting I think optionals 
shouldn't be on the sorted collection but their values instead (either null or 
value).

> Add Support for Java 8 Optional type
> 
>
> Key: FLINK-3001
> URL: https://issues.apache.org/jira/browse/FLINK-3001
> Project: Flink
>  Issue Type: New Feature
>  Components: Java API
>Affects Versions: 0.10.0
>Reporter: Stephan Ewen
>Priority: Minor
> Fix For: 1.0.0
>
>
> Using {{Optional<T>}} is a good way to handle nullable fields.
> The missing support for null fields in tuples can be easily handled by using 
> {{Optional<T>}} for nullable fields and {{T}} directly for non nullable 
> fields.
> That also retains best serialization efficiency.
> Since we cannot always assume the presence of {{Optional}} (only introduced 
> in Java8), the TypeExtractor needs to analyze and create that TypeInfo with 
> reflection.
> Furthermore, we need to add the OptionalTypeInfo to the flink-java8 project, 
> and people need to include flink-java8 in their project if they want to use 
> the Optional.
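A minimal illustration of the usage pattern discussed in the comments above: `Optional` as a return type, unwrapped at the call site so that variables and collections hold plain values. The `parse` helper is made up for this example and is not part of any Flink API.

```java
import java.util.Optional;

public class OptionalSketch {

    // Optional as a return value: present on success, empty on failure.
    static Optional<Integer> parse(String s) {
        try {
            return Optional.of(Integer.parseInt(s));
        } catch (NumberFormatException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // Unwrap at the call site; store the plain value, not the Optional.
        int value = parse("42").orElse(-1);
        int missing = parse("oops").orElse(-1);
        System.out.println(value);   // 42
        System.out.println(missing); // -1
    }
}
```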





[GitHub] flink pull request: Out-of-core state backend for JDBC databases

2015-11-19 Thread gyfora
Github user gyfora commented on a diff in the pull request:

https://github.com/apache/flink/pull/1305#discussion_r45377370
  
--- Diff: 
flink-runtime/src/main/java/org/apache/flink/runtime/state/StateBackend.java ---
@@ -64,33 +64,35 @@
/**
 * Closes the state backend, releasing all internal resources, but does 
not delete any persistent
 * checkpoint data.
-* 
+*
 * @throws Exception Exceptions can be forwarded and will be logged by 
the system
 */
public abstract void close() throws Exception;
-   
+
// 

//  key/value state
// 

 
/**
 * Creates a key/value state backed by this state backend.
-* 
+*
+* @param operatorId Unique id for the operator creating the state
+* @param stateName Name of the created state
 * @param keySerializer The serializer for the key.
 * @param valueSerializer The serializer for the value.
 * @param defaultValue The value that is returned when no other value 
has been associated with a key, yet.
 * @param <K> The type of the key.
 * @param <V> The type of the value.
-* 
+*
 * @return A new key/value state backed by this backend.
-* 
+*
 * @throws Exception Exceptions may occur during initialization of the 
state and should be forwarded.
 */
-   public abstract <K, V> KvState<K, V> createKvState(
+   public abstract <K, V> KvState<K, V> createKvState(int 
operatorId, String stateName,
--- End diff --

Let me first describe how sharding works, then I will give a concrete 
example. 
Key-value pairs are sharded by key, not by subtask. This means that each 
parallel subtask maintains a connection to all the shards and partitions the 
states before writing them to the appropriate shards according to the 
user-defined partitioner (in the backend config). This is much better than 
sharding by subtask because we can later change the parallelism of the job 
without affecting the state, and it also lets us define a more elaborate 
sharding strategy through the partitioner.

This means that when a kv state is created, we create a table for that 
kvstate in each shard. If we did it according to your suggestion, we would need 
to create numShards tables for each parallel instance (a total of p*ns) 
for each kvstate. Furthermore, this would make the fancy sharding useless 
because we could not change the job parallelism. So we need to make sure that 
parallel subtasks of a given operator write to the same state tables (so we 
only have ns tables regardless of the parallelism).

In order to do this we need something that uniquely identifies a given state 
in the streaming program (and parallel instances should have the same id).

The information required to create such a unique state id is an identifier 
for the operator that has the state, plus the name of the state. (The 
information obtained from the environment is not enough, because chained 
operators share the same environment; if they have conflicting state names, 
the id is not unique.) The only thing that identifies an operator in the 
logical streaming program is the operator id assigned by the jobgraphbuilder 
(that's the whole point of having it). 

An example job with p=2 and numshards = 3:

chained map -> filter, where both the mapper and the filter have a state named 
"count"; let's assume the mapper has opid 1 and the filter opid 2.

In this case the mapper would create 3 db tables (1 on each shard) with the 
same name kvstate_count_1_jobId. The filter would also create 3 tables with 
names: kvstate_count_2_jobId

All mapper instances would write to all three database shards, and the same 
goes for all the filters.

I hope you get what I am trying to say. 
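The naming and partitioning scheme described above can be sketched in a few lines of plain Java. This is an illustration only: `shardFor` and `tableName` are hypothetical helpers, not Flink's actual API, and the hash-based partitioner merely stands in for the user-defined one from the backend config.

```java
public class ShardingSketch {

    // Key -> shard mapping that is independent of the subtask index,
    // so the job parallelism can change without relocating state.
    static int shardFor(Object key, int numShards) {
        return Math.abs(key.hashCode() % numShards);
    }

    // One table per (state name, operator id) pair on every shard, so all
    // parallel subtasks of an operator write to the same ns tables.
    static String tableName(String stateName, int operatorId, String jobId) {
        return "kvstate_" + stateName + "_" + operatorId + "_" + jobId;
    }

    public static void main(String[] args) {
        // The example above: mapper (opid 1) and chained filter (opid 2),
        // both with a state named "count", numShards = 3, any parallelism p.
        System.out.println(tableName("count", 1, "jobId")); // mapper's table on each shard
        System.out.println(tableName("count", 2, "jobId")); // filter's table on each shard
        System.out.println(shardFor(17, 3)); // which of the 3 shards key 17 maps to
    }
}
```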




[jira] [Commented] (FLINK-3009) Cannot build docs with Jekyll 3.0.0

2015-11-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15014891#comment-15014891
 ] 

ASF GitHub Bot commented on FLINK-3009:
---

Github user hsaputra commented on the pull request:

https://github.com/apache/flink/pull/1363#issuecomment-158245010
  
LGTM 
+1
Will commit end of day if no one beats me


> Cannot build docs with Jekyll 3.0.0
> ---
>
> Key: FLINK-3009
> URL: https://issues.apache.org/jira/browse/FLINK-3009
> Project: Flink
>  Issue Type: Bug
>  Components: Build System
>Affects Versions: 0.10.0
>Reporter: Ufuk Celebi
>Priority: Minor
>
> Building the docs with the newly released Jekyll 3.0.0 fails:
> {code}
> Configuration file: /Users/ufuk/code/flink/docs/_config.yml
> /Users/ufuk/code/flink/docs/_plugins/removeDuplicateLicenseHeaders.rb:63:in 
> `<top (required)>': cannot load such file -- jekyll/post (LoadError)
>   from 
> /Users/ufuk/code/flink/docs/_plugins/removeDuplicateLicenseHeaders.rb:25:in 
> `<top (required)>'
>   from 
> /System/Library/Frameworks/Ruby.framework/Versions/2.0/usr/lib/ruby/2.0.0/rubygems/core_ext/kernel_require.rb:126:in
>  `require'
>   from 
> /System/Library/Frameworks/Ruby.framework/Versions/2.0/usr/lib/ruby/2.0.0/rubygems/core_ext/kernel_require.rb:126:in
>  `require'
>   from 
> /Library/Ruby/Gems/2.0.0/gems/jekyll-3.0.0/lib/jekyll/plugin_manager.rb:75:in 
> `block (2 levels) in require_plugin_files'
>   from 
> /Library/Ruby/Gems/2.0.0/gems/jekyll-3.0.0/lib/jekyll/plugin_manager.rb:74:in 
> `each'
>   from 
> /Library/Ruby/Gems/2.0.0/gems/jekyll-3.0.0/lib/jekyll/plugin_manager.rb:74:in 
> `block in require_plugin_files'
>   from 
> /Library/Ruby/Gems/2.0.0/gems/jekyll-3.0.0/lib/jekyll/plugin_manager.rb:73:in 
> `each'
>   from 
> /Library/Ruby/Gems/2.0.0/gems/jekyll-3.0.0/lib/jekyll/plugin_manager.rb:73:in 
> `require_plugin_files'
>   from 
> /Library/Ruby/Gems/2.0.0/gems/jekyll-3.0.0/lib/jekyll/plugin_manager.rb:18:in 
> `conscientious_require'
>   from 
> /Library/Ruby/Gems/2.0.0/gems/jekyll-3.0.0/lib/jekyll/site.rb:97:in `setup'
>   from 
> /Library/Ruby/Gems/2.0.0/gems/jekyll-3.0.0/lib/jekyll/site.rb:49:in 
> `initialize'
>   from 
> /Library/Ruby/Gems/2.0.0/gems/jekyll-3.0.0/lib/jekyll/commands/build.rb:30:in 
> `new'
>   from 
> /Library/Ruby/Gems/2.0.0/gems/jekyll-3.0.0/lib/jekyll/commands/build.rb:30:in 
> `process'
>   from 
> /Library/Ruby/Gems/2.0.0/gems/jekyll-3.0.0/lib/jekyll/commands/build.rb:18:in 
> `block (2 levels) in init_with_program'
>   from 
> /Library/Ruby/Gems/2.0.0/gems/mercenary-0.3.5/lib/mercenary/command.rb:220:in 
> `call'
>   from 
> /Library/Ruby/Gems/2.0.0/gems/mercenary-0.3.5/lib/mercenary/command.rb:220:in 
> `block in execute'
>   from 
> /Library/Ruby/Gems/2.0.0/gems/mercenary-0.3.5/lib/mercenary/command.rb:220:in 
> `each'
>   from 
> /Library/Ruby/Gems/2.0.0/gems/mercenary-0.3.5/lib/mercenary/command.rb:220:in 
> `execute'
>   from 
> /Library/Ruby/Gems/2.0.0/gems/mercenary-0.3.5/lib/mercenary/program.rb:42:in 
> `go'
>   from 
> /Library/Ruby/Gems/2.0.0/gems/mercenary-0.3.5/lib/mercenary.rb:19:in `program'
>   from /Library/Ruby/Gems/2.0.0/gems/jekyll-3.0.0/bin/jekyll:17:in `<top (required)>'
>   from /usr/local/bin/jekyll:23:in `load'
>   from /usr/local/bin/jekyll:23:in `<main>'
> {code}





[GitHub] flink pull request: [FLINK-3009] Add dockerized jekyll environment

2015-11-19 Thread hsaputra
Github user hsaputra commented on the pull request:

https://github.com/apache/flink/pull/1363#issuecomment-158245010
  
LGTM 
+1
Will commit end of day if no one beats me




[jira] [Resolved] (FLINK-3032) Flink does not start on Hadoop 2.7.1 (HDP), due to class conflict

2015-11-19 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger resolved FLINK-3032.
---
   Resolution: Fixed
Fix Version/s: 0.10.1
   1.0.0

Fixed for 0.10 branch: 
http://git-wip-us.apache.org/repos/asf/flink/commit/4bcc1547

> Flink does not start on Hadoop 2.7.1 (HDP), due to class conflict
> -
>
> Key: FLINK-3032
> URL: https://issues.apache.org/jira/browse/FLINK-3032
> Project: Flink
>  Issue Type: Bug
>  Components: Build System
>Affects Versions: 0.10.0, 1.0.0
>Reporter: Robert Metzger
>Assignee: Robert Metzger
>Priority: Critical
> Fix For: 1.0.0, 0.10.1
>
>
> Steps to reproduce:
> - Build flink {{mvn clean install -DskipTests 
> -Dhadoop.version=2.7.1.2.3.2.0-2950 -Pvendor-repos}}
> - Start it on a HDP 2.3.2 Hadoop as a yarn-session
> - Watch it fail with
> {code}
> 6:18:56,459 INFO  org.apache.flink.runtime.filecache.FileCache
>   - User file cache uses directory /hadoop/yarn/local/usercache/flink/appcache
> /application_1447687546708_0005/flink-dist-cache-f4710796-598c-4778-992c-5df000faffae
> 16:18:56,561 ERROR akka.actor.OneForOneStrategy   
>- exception during creation
> akka.actor.ActorInitializationException: exception during creation
> at akka.actor.ActorInitializationException$.apply(Actor.scala:164)
> at akka.actor.ActorCell.create(ActorCell.scala:596)
> at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:456)
> at akka.actor.ActorCell.systemInvoke(ActorCell.scala:478)
> at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:279)
> at akka.dispatch.Mailbox.run(Mailbox.scala:220)
> at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
> at 
> scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)
> at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346)
> at 
> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> at 
> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> Caused by: java.lang.reflect.InvocationTargetException
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
> Method)
> at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> at akka.util.Reflect$.instantiate(Reflect.scala:66)
> at akka.actor.ArgsReflectConstructor.produce(Props.scala:352)
> at akka.actor.Props.newActor(Props.scala:252)
> at akka.actor.ActorCell.newActor(ActorCell.scala:552)
> at akka.actor.ActorCell.create(ActorCell.scala:578)
> ... 10 more
> Caused by: java.lang.NoSuchMethodError: 
> com.fasterxml.jackson.core.JsonFactory.requiresPropertyOrdering()Z
> at 
> com.fasterxml.jackson.databind.ObjectMapper.<init>(ObjectMapper.java:458)
> at 
> com.fasterxml.jackson.databind.ObjectMapper.<init>(ObjectMapper.java:379)
> at 
> org.apache.flink.runtime.taskmanager.TaskManager.<init>(TaskManager.scala:153)
> at 
> org.apache.flink.yarn.YarnTaskManager.<init>(YarnTaskManager.scala:32)
> ... 19 more
> 16:18:56,564 ERROR org.apache.flink.runtime.taskmanager.TaskManager   
>- Actor akka://flink/user/taskmanager#-1189186354 terminated, stopping 
> process...
> {code}





[GitHub] flink pull request: [FLINK-2017] Add predefined required parameter...

2015-11-19 Thread mliesenberg
Github user mliesenberg commented on the pull request:

https://github.com/apache/flink/pull/1097#issuecomment-158164663
  
Oha. yes. Found the problem causing the exception, will fix it and make the 
tests more specific as to which exceptions they need to catch.

While I am at it, anything else which needs change or are we good to go?

I'll update the PR tomorrow.




[jira] [Commented] (FLINK-2017) Add predefined required parameters to ParameterTool

2015-11-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15014198#comment-15014198
 ] 

ASF GitHub Bot commented on FLINK-2017:
---

Github user mliesenberg commented on the pull request:

https://github.com/apache/flink/pull/1097#issuecomment-158164663
  
Oha. yes. Found the problem causing the exception, will fix it and make the 
tests more specific as to which exceptions they need to catch.

While I am at it, anything else which needs change or are we good to go?

I'll update the PR tomorrow.


> Add predefined required parameters to ParameterTool
> ---
>
> Key: FLINK-2017
> URL: https://issues.apache.org/jira/browse/FLINK-2017
> Project: Flink
>  Issue Type: Improvement
>Affects Versions: 0.9
>Reporter: Robert Metzger
>  Labels: starter
>
> In FLINK-1525 we've added the {{ParameterTool}}.
> During the PR review, there was a request for required parameters.
> This issue is about implementing a facility to define required parameters. 
> The tool should also be able to print a help menu with a list of all 
> parameters.
> This test case shows my initial ideas how to design the API
> {code}
>   @Test
>   public void requiredParameters() {
>   RequiredParameters required = new RequiredParameters();
>   Option input = required.add("input").alt("i").help("Path to 
> input file or directory"); // parameter with long and short variant
>   required.add("output"); // parameter only with long variant
>   Option parallelism = 
> required.add("parallelism").alt("p").type(Integer.class); // parameter with 
> type
>   Option spOption = 
> required.add("sourceParallelism").alt("sp").defaultValue(12).help("Number 
> specifying the number of parallel data source instances"); // parameter with 
> default value, specifying the type.
>   Option executionType = 
> required.add("executionType").alt("et").defaultValue("pipelined").choices("pipelined",
>  "batch");
>   ParameterUtil parameter = ParameterUtil.fromArgs(new 
> String[]{"-i", "someinput", "--output", "someout", "-p", "15"});
>   required.check(parameter);
>   required.printHelp();
>   required.checkAndPopulate(parameter);
>   String inputString = input.get();
>   int par = parallelism.getInteger();
>   String output = parameter.get("output");
>   int sourcePar = parameter.getInteger(spOption.getName());
>   }
> {code}





[jira] [Commented] (FLINK-3009) Cannot build docs with Jekyll 3.0.0

2015-11-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15014482#comment-15014482
 ] 

ASF GitHub Bot commented on FLINK-3009:
---

Github user jaoki commented on the pull request:

https://github.com/apache/flink/pull/1363#issuecomment-158213589
  
Does anyone want to commit this?


> Cannot build docs with Jekyll 3.0.0
> ---
>
> Key: FLINK-3009
> URL: https://issues.apache.org/jira/browse/FLINK-3009
> Project: Flink
>  Issue Type: Bug
>  Components: Build System
>Affects Versions: 0.10.0
>Reporter: Ufuk Celebi
>Priority: Minor
>
> Building the docs with the newly released Jekyll 3.0.0 fails:
> {code}
> Configuration file: /Users/ufuk/code/flink/docs/_config.yml
> /Users/ufuk/code/flink/docs/_plugins/removeDuplicateLicenseHeaders.rb:63:in 
> `<top (required)>': cannot load such file -- jekyll/post (LoadError)
>   from 
> /Users/ufuk/code/flink/docs/_plugins/removeDuplicateLicenseHeaders.rb:25:in 
> `<top (required)>'
>   from 
> /System/Library/Frameworks/Ruby.framework/Versions/2.0/usr/lib/ruby/2.0.0/rubygems/core_ext/kernel_require.rb:126:in
>  `require'
>   from 
> /System/Library/Frameworks/Ruby.framework/Versions/2.0/usr/lib/ruby/2.0.0/rubygems/core_ext/kernel_require.rb:126:in
>  `require'
>   from 
> /Library/Ruby/Gems/2.0.0/gems/jekyll-3.0.0/lib/jekyll/plugin_manager.rb:75:in 
> `block (2 levels) in require_plugin_files'
>   from 
> /Library/Ruby/Gems/2.0.0/gems/jekyll-3.0.0/lib/jekyll/plugin_manager.rb:74:in 
> `each'
>   from 
> /Library/Ruby/Gems/2.0.0/gems/jekyll-3.0.0/lib/jekyll/plugin_manager.rb:74:in 
> `block in require_plugin_files'
>   from 
> /Library/Ruby/Gems/2.0.0/gems/jekyll-3.0.0/lib/jekyll/plugin_manager.rb:73:in 
> `each'
>   from 
> /Library/Ruby/Gems/2.0.0/gems/jekyll-3.0.0/lib/jekyll/plugin_manager.rb:73:in 
> `require_plugin_files'
>   from 
> /Library/Ruby/Gems/2.0.0/gems/jekyll-3.0.0/lib/jekyll/plugin_manager.rb:18:in 
> `conscientious_require'
>   from 
> /Library/Ruby/Gems/2.0.0/gems/jekyll-3.0.0/lib/jekyll/site.rb:97:in `setup'
>   from 
> /Library/Ruby/Gems/2.0.0/gems/jekyll-3.0.0/lib/jekyll/site.rb:49:in 
> `initialize'
>   from 
> /Library/Ruby/Gems/2.0.0/gems/jekyll-3.0.0/lib/jekyll/commands/build.rb:30:in 
> `new'
>   from 
> /Library/Ruby/Gems/2.0.0/gems/jekyll-3.0.0/lib/jekyll/commands/build.rb:30:in 
> `process'
>   from 
> /Library/Ruby/Gems/2.0.0/gems/jekyll-3.0.0/lib/jekyll/commands/build.rb:18:in 
> `block (2 levels) in init_with_program'
>   from 
> /Library/Ruby/Gems/2.0.0/gems/mercenary-0.3.5/lib/mercenary/command.rb:220:in 
> `call'
>   from 
> /Library/Ruby/Gems/2.0.0/gems/mercenary-0.3.5/lib/mercenary/command.rb:220:in 
> `block in execute'
>   from 
> /Library/Ruby/Gems/2.0.0/gems/mercenary-0.3.5/lib/mercenary/command.rb:220:in 
> `each'
>   from 
> /Library/Ruby/Gems/2.0.0/gems/mercenary-0.3.5/lib/mercenary/command.rb:220:in 
> `execute'
>   from 
> /Library/Ruby/Gems/2.0.0/gems/mercenary-0.3.5/lib/mercenary/program.rb:42:in 
> `go'
>   from 
> /Library/Ruby/Gems/2.0.0/gems/mercenary-0.3.5/lib/mercenary.rb:19:in `program'
>   from /Library/Ruby/Gems/2.0.0/gems/jekyll-3.0.0/bin/jekyll:17:in `<top (required)>'
>   from /usr/local/bin/jekyll:23:in `load'
>   from /usr/local/bin/jekyll:23:in `<main>'
> {code}





[GitHub] flink pull request: [FLINK-3009] Add dockerized jekyll environment

2015-11-19 Thread jaoki
Github user jaoki commented on the pull request:

https://github.com/apache/flink/pull/1363#issuecomment-158213589
  
Does anyone want to commit this?




[jira] [Commented] (FLINK-3020) Local streaming execution: set number of task manager slots to the maximum parallelism

2015-11-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15013342#comment-15013342
 ] 

ASF GitHub Bot commented on FLINK-3020:
---

Github user mxm commented on the pull request:

https://github.com/apache/flink/pull/1360#issuecomment-158024467
  
Alright. I will push the original pull request version again which uses the 
max parallelism of all operators. Further, I will open a separate JIRA for the 
batch side change.


> Local streaming execution: set number of task manager slots to the maximum 
> parallelism
> --
>
> Key: FLINK-3020
> URL: https://issues.apache.org/jira/browse/FLINK-3020
> Project: Flink
>  Issue Type: Bug
>  Components: Local Runtime
>Affects Versions: 0.10.0
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Minor
> Fix For: 1.0.0, 0.10.1
>
>
> The local execution configuration behavior is quite an inconvenience. It sets 
> the number of task slots of the mini cluster to the default parallelism. This 
> causes problems if you use {{setParallelism(parallelism)}} on an operator and 
> set a parallelism larger than the default parallelism.
> {noformat}
> Caused by: 
> org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: 
> Not enough free slots available to run the job. You can decrease the operator 
> parallelism or increase the number of slots per TaskManager in the 
> configuration. Task to schedule: < Attempt #0 (Flat Map (9/100)) @ 
> (unassigned) - [SCHEDULED] > with groupID < fa7240ee1fed08bd7e6278899db3e838 
> > in sharing group < SlotSharingGroup [f3d578e9819be9c39ceee86cf5eb8c08, 
> 8fa330746efa1d034558146e4604d0b4, fa7240ee1fed08bd7e6278899db3e838] >. 
> Resources available to scheduler: Number of instances=1, total number of 
> slots=8, available slots=0
>   at 
> org.apache.flink.runtime.jobmanager.scheduler.Scheduler.scheduleTask(Scheduler.java:256)
>   at 
> org.apache.flink.runtime.jobmanager.scheduler.Scheduler.scheduleImmediately(Scheduler.java:131)
>   at 
> org.apache.flink.runtime.executiongraph.Execution.scheduleForExecution(Execution.java:298)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionVertex.scheduleForExecution(ExecutionVertex.java:458)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionJobVertex.scheduleAll(ExecutionJobVertex.java:322)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraph.scheduleForExecution(ExecutionGraph.java:686)
>   at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:982)
>   at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:962)
>   at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:962)
>   at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
>   at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
>   at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41)
>   at 
> akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:401)
>   at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>   at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>   ... 2 more
> {noformat}
> I propose to change this behavior to setting the number of task slots to the 
> maximum parallelism present in the user program.
> What do you think?
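The proposal boils down to sizing the local mini cluster by the widest operator rather than by the default parallelism. A hedged sketch of that sizing rule, where the hypothetical `requiredSlots` stands in for querying the job graph's maximum parallelism:

```java
import java.util.Arrays;

public class SlotSizingSketch {

    // A local mini cluster needs as many task slots as the widest operator.
    static int requiredSlots(int[] operatorParallelism) {
        return Arrays.stream(operatorParallelism).max().orElse(1);
    }

    public static void main(String[] args) {
        // Default parallelism 8, but one operator uses setParallelism(100):
        // sizing slots by the default (8) fails; sizing by the maximum works.
        System.out.println(requiredSlots(new int[]{8, 100, 9})); // 100
    }
}
```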





[GitHub] flink pull request: [FLINK-3020][streaming] set number of task slo...

2015-11-19 Thread tillrohrmann
Github user tillrohrmann commented on a diff in the pull request:

https://github.com/apache/flink/pull/1360#discussion_r45327211
  
--- Diff: 
flink-streaming-java/src/main/java/org/apache/flink/streaming/api/environment/LocalStreamEnvironment.java
 ---
@@ -91,7 +91,11 @@ public JobExecutionResult execute(String jobName) throws 
Exception {
configuration.addAll(jobGraph.getJobConfiguration());
 

configuration.setLong(ConfigConstants.TASK_MANAGER_MEMORY_SIZE_KEY, -1L);
-   
configuration.setInteger(ConfigConstants.TASK_MANAGER_NUM_TASK_SLOTS, 
getParallelism());
+
+   int parallelism = getParallelism() == defaultLocalParallelism ?
+   defaultLocalParallelism : 
jobGraph.getMaximumParallelism();
--- End diff --

`defaultLocalParallelism` seems to be the number of available processors. I 
think it is very unintuitive that the default parallelism is only taken if it 
equals the number of processors.

This means that, for an operator with a higher dop than the number of cores, 
the program will only fail if the user sets the default parallelism to that 
number of cores.




[GitHub] flink pull request: [FLINK-3020][streaming] set number of task slo...

2015-11-19 Thread mxm
Github user mxm commented on the pull request:

https://github.com/apache/flink/pull/1360#issuecomment-158024467
  
Alright. I will push the original pull request version again which uses the 
max parallelism of all operators. Further, I will open a separate JIRA for the 
batch side change.




[jira] [Commented] (FLINK-3020) Local streaming execution: set number of task manager slots to the maximum parallelism

2015-11-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15013341#comment-15013341
 ] 

ASF GitHub Bot commented on FLINK-3020:
---

Github user tillrohrmann commented on a diff in the pull request:

https://github.com/apache/flink/pull/1360#discussion_r45327211
  
--- Diff: 
flink-streaming-java/src/main/java/org/apache/flink/streaming/api/environment/LocalStreamEnvironment.java
 ---
@@ -91,7 +91,11 @@ public JobExecutionResult execute(String jobName) throws 
Exception {
configuration.addAll(jobGraph.getJobConfiguration());
 

configuration.setLong(ConfigConstants.TASK_MANAGER_MEMORY_SIZE_KEY, -1L);
-   
configuration.setInteger(ConfigConstants.TASK_MANAGER_NUM_TASK_SLOTS, 
getParallelism());
+
+   int parallelism = getParallelism() == defaultLocalParallelism ?
+   defaultLocalParallelism : 
jobGraph.getMaximumParallelism();
--- End diff --

`defaultLocalParallelism` seems to be the number of available processors. I 
find it very unintuitive that the default parallelism is only used if it 
equals the number of processors.

This means that for an operator with a higher dop than #cores, the program 
will only fail if the user sets the default parallelism to #cores.


> Local streaming execution: set number of task manager slots to the maximum 
> parallelism
> --
>
> Key: FLINK-3020
> URL: https://issues.apache.org/jira/browse/FLINK-3020
> Project: Flink
>  Issue Type: Bug
>  Components: Local Runtime
>Affects Versions: 0.10.0
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Minor
> Fix For: 1.0.0, 0.10.1
>
>
> Quite an inconvenience is the local execution configuration behavior. It sets 
> the number of task slots of the mini cluster to the default parallelism. This 
> causes problems if you use {{setParallelism(parallelism)}} on an operator and 
> set a parallelism larger than the default parallelism.
> {noformat}
> Caused by: 
> org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: 
> Not enough free slots available to run the job. You can decrease the operator 
> parallelism or increase the number of slots per TaskManager in the 
> configuration. Task to schedule: < Attempt #0 (Flat Map (9/100)) @ 
> (unassigned) - [SCHEDULED] > with groupID < fa7240ee1fed08bd7e6278899db3e838 
> > in sharing group < SlotSharingGroup [f3d578e9819be9c39ceee86cf5eb8c08, 
> 8fa330746efa1d034558146e4604d0b4, fa7240ee1fed08bd7e6278899db3e838] >. 
> Resources available to scheduler: Number of instances=1, total number of 
> slots=8, available slots=0
>   at 
> org.apache.flink.runtime.jobmanager.scheduler.Scheduler.scheduleTask(Scheduler.java:256)
>   at 
> org.apache.flink.runtime.jobmanager.scheduler.Scheduler.scheduleImmediately(Scheduler.java:131)
>   at 
> org.apache.flink.runtime.executiongraph.Execution.scheduleForExecution(Execution.java:298)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionVertex.scheduleForExecution(ExecutionVertex.java:458)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionJobVertex.scheduleAll(ExecutionJobVertex.java:322)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraph.scheduleForExecution(ExecutionGraph.java:686)
>   at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:982)
>   at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:962)
>   at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:962)
>   at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
>   at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
>   at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41)
>   at 
> akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:401)
>   at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>   at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>   ... 2 more
> {noformat}
> I propose to change this behavior to setting the number of task slots to the 
> maximum parallelism present in the user program.
> What do you think?
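The proposed behavior — size the local mini cluster by the most parallel operator — can be sketched in a few lines (illustrative names, not Flink's actual API):

```java
import java.util.List;

// Minimal model of the proposed slot-count logic: the local mini cluster
// gets as many task slots as the most parallel operator in the program,
// so operators with a parallelism above the default still get slots.
public class SlotCountSketch {

    public static int requiredTaskSlots(List<Integer> operatorParallelism) {
        // maximum parallelism present in the user program
        return operatorParallelism.stream()
                .mapToInt(Integer::intValue)
                .max()
                .orElse(1);
    }

    public static void main(String[] args) {
        // default parallelism 2, but one operator uses setParallelism(4)
        int slots = requiredTaskSlots(List.of(2, 4, 2));
        System.out.println(slots); // prints 4: enough slots for the dop-4 operator
    }
}
```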



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3044) In YARN mode, configure FsStateBackend by default.

2015-11-19 Thread Stephan Ewen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15013362#comment-15013362
 ] 

Stephan Ewen commented on FLINK-3044:
-

Can you elaborate? Why is it not respected, do the TaskManagers not get the 
proper config?

> In YARN mode, configure FsStateBackend by default.
> --
>
> Key: FLINK-3044
> URL: https://issues.apache.org/jira/browse/FLINK-3044
> Project: Flink
>  Issue Type: Improvement
>  Components: Streaming, YARN Client
>Affects Versions: 0.10.0
>Reporter: Stephan Ewen
> Fix For: 1.0.0
>
>
> When starting a YARN session, the default state backend should be set to the 
> {{FsStateBackend}}, with the checkpoint directory set to the working 
> directory for that YARN App.
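A sketch of what such a default could amount to in `flink-conf.yaml` terms; the key names follow the 0.10-era filesystem state backend configuration, and the checkpoint path is purely illustrative:

```yaml
# Hypothetical defaults injected by the YARN session when the user has
# not configured a state backend; the path below is illustrative only.
state.backend: filesystem
state.backend.fs.checkpointdir: hdfs:///user/<user>/.flink/<application-id>/checkpoints
```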





[GitHub] flink pull request: [FLINK-3002] Add Either type to the Java API

2015-11-19 Thread StephanEwen
Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/1371#issuecomment-158027623
  
Looks good, +1 to merge this




[jira] [Commented] (FLINK-3002) Add an EitherType to the Java API

2015-11-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15013380#comment-15013380
 ] 

ASF GitHub Bot commented on FLINK-3002:
---

Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/1371#issuecomment-158027623
  
Looks good, +1 to merge this


> Add an EitherType to the Java API
> -
>
> Key: FLINK-3002
> URL: https://issues.apache.org/jira/browse/FLINK-3002
> Project: Flink
>  Issue Type: New Feature
>  Components: Java API
>Affects Versions: 0.10.0
>Reporter: Stephan Ewen
>Assignee: Vasia Kalavri
> Fix For: 1.0.0
>
>
> Either types are recurring patterns and should be serialized efficiently, so 
> it makes sense to add them to the core Java API.
> Since Java does not have such a type as of Java 8, we would need to add our 
> own version.
> The Scala API already handles the Scala Either type efficiently. I would not 
> use the Scala Either type in the Java API, since we are trying to get the 
> {{flink-java}} project "Scala free" for people that don't use Scala and do 
> not want to worry about Scala version matches and mismatches.





[GitHub] flink pull request: [FLINK-3041]: Twitter Streaming Description se...

2015-11-19 Thread tillrohrmann
Github user tillrohrmann commented on the pull request:

https://github.com/apache/flink/pull/1379#issuecomment-158044491
  
LGTM. Will merge this PR. Thanks for your work @smarthi.




[jira] [Commented] (FLINK-3043) Kafka Connector description in Streaming API guide is wrong/outdated

2015-11-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15013457#comment-15013457
 ] 

ASF GitHub Bot commented on FLINK-3043:
---

Github user tillrohrmann commented on the pull request:

https://github.com/apache/flink/pull/1380#issuecomment-158045723
  
LGTM. Maybe we can also update the documentation for 0.10 if the changes 
apply to this version as well.


> Kafka Connector description in Streaming API guide is wrong/outdated
> 
>
> Key: FLINK-3043
> URL: https://issues.apache.org/jira/browse/FLINK-3043
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 0.10.0
>Reporter: Stephan Ewen
>Assignee: Stephan Ewen
> Fix For: 1.0.0, 0.10.1
>
>






[jira] [Commented] (FLINK-3041) Twitter Streaming Description section of Streaming Programming guide refers to an incorrect example 'TwitterLocal'

2015-11-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15013461#comment-15013461
 ] 

ASF GitHub Bot commented on FLINK-3041:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/1379


> Twitter Streaming Description section of Streaming Programming guide refers 
> to an incorrect example 'TwitterLocal'
> --
>
> Key: FLINK-3041
> URL: https://issues.apache.org/jira/browse/FLINK-3041
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation, Examples, Streaming
>Affects Versions: 0.10.0
>Reporter: Suneel Marthi
>Assignee: Suneel Marthi
>Priority: Minor
> Fix For: 0.10.1
>
>
> Twitter Streaming Description section of Streaming Programming guide refers 
> to an incorrect example 'TwitterLocal'; it should be 'TwitterStream'. Fix 
> other typos in the Twitter streaming description and clean up the code.





[jira] [Resolved] (FLINK-3041) Twitter Streaming Description section of Streaming Programming guide refers to an incorrect example 'TwitterLocal'

2015-11-19 Thread Till Rohrmann (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann resolved FLINK-3041.
--
Resolution: Fixed

Added via dc61c7a955f0192ca8c86470cd3e22eb1f5dc2dd

> Twitter Streaming Description section of Streaming Programming guide refers 
> to an incorrect example 'TwitterLocal'
> --
>
> Key: FLINK-3041
> URL: https://issues.apache.org/jira/browse/FLINK-3041
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation, Examples, Streaming
>Affects Versions: 0.10.0
>Reporter: Suneel Marthi
>Assignee: Suneel Marthi
>Priority: Minor
> Fix For: 0.10.1
>
>





[GitHub] flink pull request: [FLINK-3041]: Twitter Streaming Description se...

2015-11-19 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/1379




[GitHub] flink pull request: [FLINK-3011, 3019, 3028] Cancel jobs in RESTAR...

2015-11-19 Thread uce
Github user uce commented on the pull request:

https://github.com/apache/flink/pull/1369#issuecomment-158047558
  
I've updated the PR as you suggested and rebased on the current master. 
Waiting for Travis. After that, I think it is ready to be merged to 
`release-0.10` and `master`.




[jira] [Commented] (FLINK-3011) Cannot cancel failing/restarting streaming job from the command line

2015-11-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15013474#comment-15013474
 ] 

ASF GitHub Bot commented on FLINK-3011:
---

Github user uce commented on the pull request:

https://github.com/apache/flink/pull/1369#issuecomment-158047558
  
I've updated the PR as you suggested and rebased on the current master. 
Waiting for Travis. After that, I think it is ready to be merged to 
`release-0.10` and `master`.


> Cannot cancel failing/restarting streaming job from the command line
> 
>
> Key: FLINK-3011
> URL: https://issues.apache.org/jira/browse/FLINK-3011
> Project: Flink
>  Issue Type: Bug
>  Components: Command-line client
>Affects Versions: 0.10.0, 1.0.0
>Reporter: Gyula Fora
>Assignee: Ufuk Celebi
>Priority: Critical
>
> I cannot seem to be able to cancel a failing/restarting job from the command 
> line client. The job cannot be rescheduled so it keeps failing:
> The exception I get:
> 13:58:11,240 INFO  org.apache.flink.runtime.jobmanager.JobManager 
>- Status of job 0c895d22c632de5dfe16c42a9ba818d5 (player-id) changed to 
> RESTARTING.
> 13:58:25,234 INFO  org.apache.flink.runtime.jobmanager.JobManager 
>- Trying to cancel job with ID 0c895d22c632de5dfe16c42a9ba818d5.
> 13:58:25,561 WARN  akka.remote.ReliableDeliverySupervisor 
>- Association with remote system [akka.tcp://flink@127.0.0.1:42012] has 
> failed, address is now gated for [5000] ms. Reason is: [Disassociated].





[jira] [Commented] (FLINK-3020) Local streaming execution: set number of task manager slots to the maximum parallelism

2015-11-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15013477#comment-15013477
 ] 

ASF GitHub Bot commented on FLINK-3020:
---

Github user tillrohrmann commented on the pull request:

https://github.com/apache/flink/pull/1360#issuecomment-158047994
  
But this is only true if `taskManagerNumSlots` in `LocalExecutor` is 
left untouched. Currently this is the case, but it is not enforced.


> Local streaming execution: set number of task manager slots to the maximum 
> parallelism
> --
>
> Key: FLINK-3020
> URL: https://issues.apache.org/jira/browse/FLINK-3020
> Project: Flink
>  Issue Type: Bug
>  Components: Local Runtime
>Affects Versions: 0.10.0
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Minor
> Fix For: 1.0.0, 0.10.1
>
>





[GitHub] flink pull request: [FLINK-3020][streaming] set number of task slo...

2015-11-19 Thread tillrohrmann
Github user tillrohrmann commented on the pull request:

https://github.com/apache/flink/pull/1360#issuecomment-158047994
  
But this is only true if `taskManagerNumSlots` in `LocalExecutor` is 
left untouched. Currently this is the case, but it is not enforced.




[GitHub] flink pull request:

2015-11-19 Thread alexeyegorov
Github user alexeyegorov commented on the pull request:


https://github.com/apache/flink/commit/bf29de981c2bcd5cb5d33c68b158c95c8820f43d#commitcomment-14496327
  
@tillrohrmann @mxm thank you both. It's good to know where I can find the 
binaries I build on my own (I was looking for them in my target folder). I was 
also trying to find some of them on the website, but that is somewhat 
difficult: e.g. this 
[page](https://ci.apache.org/projects/flink/flink-docs-master/) states that a 
pre-built snapshot can be downloaded there, but what you actually find is the 
0.9.1 stable version.




[jira] [Commented] (FLINK-3020) Local streaming execution: set number of task manager slots to the maximum parallelism

2015-11-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15013328#comment-15013328
 ] 

ASF GitHub Bot commented on FLINK-3020:
---

Github user StephanEwen commented on a diff in the pull request:

https://github.com/apache/flink/pull/1360#discussion_r45326391
  
--- Diff: 
flink-streaming-java/src/main/java/org/apache/flink/streaming/api/environment/LocalStreamEnvironment.java
 ---
@@ -91,7 +91,11 @@ public JobExecutionResult execute(String jobName) throws 
Exception {
configuration.addAll(jobGraph.getJobConfiguration());
 

configuration.setLong(ConfigConstants.TASK_MANAGER_MEMORY_SIZE_KEY, -1L);
-   
configuration.setInteger(ConfigConstants.TASK_MANAGER_NUM_TASK_SLOTS, 
getParallelism());
+
+   int parallelism = getParallelism() == defaultLocalParallelism ?
+   defaultLocalParallelism : 
jobGraph.getMaximumParallelism();
--- End diff --

This currently means that in the default case, the slots do not respect the 
max parallel operator, correct?


> Local streaming execution: set number of task manager slots to the maximum 
> parallelism
> --
>
> Key: FLINK-3020
> URL: https://issues.apache.org/jira/browse/FLINK-3020
> Project: Flink
>  Issue Type: Bug
>  Components: Local Runtime
>Affects Versions: 0.10.0
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Minor
> Fix For: 1.0.0, 0.10.1
>
>





[GitHub] flink pull request: [FLINK-3020][streaming] set number of task slo...

2015-11-19 Thread StephanEwen
Github user StephanEwen commented on a diff in the pull request:

https://github.com/apache/flink/pull/1360#discussion_r45326391
  
--- Diff: 
flink-streaming-java/src/main/java/org/apache/flink/streaming/api/environment/LocalStreamEnvironment.java
 ---
@@ -91,7 +91,11 @@ public JobExecutionResult execute(String jobName) throws 
Exception {
configuration.addAll(jobGraph.getJobConfiguration());
 

configuration.setLong(ConfigConstants.TASK_MANAGER_MEMORY_SIZE_KEY, -1L);
-   
configuration.setInteger(ConfigConstants.TASK_MANAGER_NUM_TASK_SLOTS, 
getParallelism());
+
+   int parallelism = getParallelism() == defaultLocalParallelism ?
+   defaultLocalParallelism : 
jobGraph.getMaximumParallelism();
--- End diff --

This currently means that in the default case, the slots do not respect the 
max parallel operator, correct?




[jira] [Commented] (FLINK-3044) In YARN mode, configure FsStateBackend by default.

2015-11-19 Thread Stephan Ewen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15013395#comment-15013395
 ] 

Stephan Ewen commented on FLINK-3044:
-

Let's verify that. Would be surprising, as other config flags get picked up...

> In YARN mode, configure FsStateBackend by default.
> --
>
> Key: FLINK-3044
> URL: https://issues.apache.org/jira/browse/FLINK-3044
> Project: Flink
>  Issue Type: Improvement
>  Components: Streaming, YARN Client
>Affects Versions: 0.10.0
>Reporter: Stephan Ewen
> Fix For: 1.0.0
>
>





[jira] [Closed] (FLINK-3047) Local batch execution: set number of task manager slots to the maximum parallelism

2015-11-19 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels closed FLINK-3047.
-
Resolution: Not A Problem

> Local batch execution: set number of task manager slots to the maximum 
> parallelism
> --
>
> Key: FLINK-3047
> URL: https://issues.apache.org/jira/browse/FLINK-3047
> Project: Flink
>  Issue Type: Bug
>  Components: Batch, Local Runtime
>Affects Versions: 0.10.0
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Minor
> Fix For: 1.0.0, 0.10.1
>
>
> The number of task slots for local execution is determined by the maximum 
> parallelism found. However, if a default parallelism has been set, this 
> parallelism is used as the upper bound for the number of task slots.
> We should change this to always use the maximum parallelism as the number of 
> task slots. Otherwise, jobs that include operators with a parallelism higher 
> than the default parallelism fail to execute locally.
> For example, this fails
> {noformat}
> ExecutionEnvironment env = ..
> env.setParallelism(2);
> DataSet<Integer> set = env.fromElements(1, 2, 3, 4)
> .map(el -> el+1)
> .setParallelism(4);
> set.print();
> {noformat}





[GitHub] flink pull request: [FLINK-3020][streaming] set number of task slo...

2015-11-19 Thread mxm
Github user mxm commented on the pull request:

https://github.com/apache/flink/pull/1360#issuecomment-158032582
  
Just checked. The batch side always uses the maximum parallelism as the 
number of task slots (if they are not set explicitly). Till and I actually 
thought differently. So the proposed changes in this PR align with the batch 
behavior.




[jira] [Commented] (FLINK-3020) Local streaming execution: set number of task manager slots to the maximum parallelism

2015-11-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15013400#comment-15013400
 ] 

ASF GitHub Bot commented on FLINK-3020:
---

Github user mxm commented on the pull request:

https://github.com/apache/flink/pull/1360#issuecomment-158032582
  
Just checked. The batch side always uses the maximum parallelism as the 
number of task slots (if they are not set explicitly). Till and I actually 
thought differently. So the proposed changes in this PR align with the batch 
behavior.


> Local streaming execution: set number of task manager slots to the maximum 
> parallelism
> --
>
> Key: FLINK-3020
> URL: https://issues.apache.org/jira/browse/FLINK-3020
> Project: Flink
>  Issue Type: Bug
>  Components: Local Runtime
>Affects Versions: 0.10.0
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Minor
> Fix For: 1.0.0, 0.10.1
>
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-1945][py] Python Tests less verbose

2015-11-19 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/1376


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-1945) Make python tests less verbose

2015-11-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15013439#comment-15013439
 ] 

ASF GitHub Bot commented on FLINK-1945:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/1376


> Make python tests less verbose
> --
>
> Key: FLINK-1945
> URL: https://issues.apache.org/jira/browse/FLINK-1945
> Project: Flink
>  Issue Type: Improvement
>  Components: Python API, Tests
>Reporter: Till Rohrmann
>Assignee: Chesnay Schepler
>Priority: Minor
>
> Currently, the python tests print a lot of log messages to stdout. 
> Furthermore, there seem to be some println statements which clutter the 
> console output. I think that these log messages are not required for the 
> tests and thus should be suppressed. 
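As context for the suppression discussed above, one common way to keep Python test runs quiet is to raise the logging threshold and capture stdout for the duration of a test. This is an illustrative sketch only, not the approach taken in the linked PR:

```python
import io
import logging
from contextlib import redirect_stdout

def run_quietly(test_fn):
    """Run a test callable while silencing log records of WARNING and
    below and capturing anything printed to stdout."""
    logging.disable(logging.WARNING)  # drop DEBUG/INFO/WARNING records
    buffer = io.StringIO()
    try:
        with redirect_stdout(buffer):
            result = test_fn()
    finally:
        logging.disable(logging.NOTSET)  # restore normal logging
    return result, buffer.getvalue()

def chatty_test():
    print("progress: 50%")  # clutter that should not reach the console
    logging.getLogger(__name__).info("still running")
    return 42
```

The captured output can still be inspected (or printed) when a test fails, so the information is not lost, only hidden on the happy path.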



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (FLINK-1945) Make python tests less verbose

2015-11-19 Thread Chesnay Schepler (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler closed FLINK-1945.
---
Resolution: Fixed

> Make python tests less verbose
> --
>
> Key: FLINK-1945
> URL: https://issues.apache.org/jira/browse/FLINK-1945
> Project: Flink
>  Issue Type: Improvement
>  Components: Python API, Tests
>Reporter: Till Rohrmann
>Assignee: Chesnay Schepler
>Priority: Minor
>
> Currently, the python tests print a lot of log messages to stdout. 
> Furthermore there seems to be some println statements which clutter the 
> console output. I think that these log messages are not required for the 
> tests and thus should be suppressed. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2440) [py] Expand Environment feature coverage

2015-11-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15013447#comment-15013447
 ] 

ASF GitHub Bot commented on FLINK-2440:
---

GitHub user zentol opened a pull request:

https://github.com/apache/flink/pull/1383

[FLINK-2440][py] Expand Environment feature coverage

Small PR that
* renames setDegreeOfParallelism() to setParallelism()
* adds getDegreeOfParallelism(), set-/getNumberOfExecutionRetries()


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zentol/flink 2440_py_env

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/1383.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1383


commit 7bd79f6acb52465149aa7189c89f49662c1f666c
Author: zentol 
Date:   2015-11-19T12:30:48Z

[FLINK-2440][py] Expand Environment feature coverage




> [py] Expand Environment feature coverage
> 
>
> Key: FLINK-2440
> URL: https://issues.apache.org/jira/browse/FLINK-2440
> Project: Flink
>  Issue Type: Sub-task
>  Components: Python API
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>
> An upcoming commit of mine will add the following methods to the Python API's 
> Environment class:
> getParallelism
> set-/getNumberOfExecutionRetries
> Additionally, calls to the now deprecated java getDegreeOfParallelism were 
> changed, and the equivalent python method renamed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-3043] [docs] Fix description of Kafka C...

2015-11-19 Thread tillrohrmann
Github user tillrohrmann commented on the pull request:

https://github.com/apache/flink/pull/1380#issuecomment-158045723
  
LGTM. Maybe we can also update the documentation for 0.10 if the changes 
apply to this version as well.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-3002] Add Either type to the Java API

2015-11-19 Thread twalthr
Github user twalthr commented on the pull request:

https://github.com/apache/flink/pull/1371#issuecomment-158048821
  
@vasia I can implement the TypeExtractor support.

@StephanEwen Custom type integration interfaces sound very good. I can open 
an issue for that.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-3002) Add an EitherType to the Java API

2015-11-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15013480#comment-15013480
 ] 

ASF GitHub Bot commented on FLINK-3002:
---

Github user twalthr commented on the pull request:

https://github.com/apache/flink/pull/1371#issuecomment-158048821
  
@vasia I can implement the TypeExtractor support.

@StephanEwen Custom type integration interfaces sound very good. I can open 
an issue for that.


> Add an EitherType to the Java API
> -
>
> Key: FLINK-3002
> URL: https://issues.apache.org/jira/browse/FLINK-3002
> Project: Flink
>  Issue Type: New Feature
>  Components: Java API
>Affects Versions: 0.10.0
>Reporter: Stephan Ewen
>Assignee: Vasia Kalavri
> Fix For: 1.0.0
>
>
> Either types are recurring patterns and should be serialized efficiently, so 
> it makes sense to add them to the core Java API.
> Since Java does not have such a type as of Java 8, we would need to add our 
> own version.
> The Scala API handles the Scala Either Type already efficiently. I would not 
> use the Scala Either Type in the Java API, since we are trying to get the 
> {{flink-java}} project "Scala free" for people that don't use Scala and do not 
> want to worry about Scala version matches and mismatches.
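For readers unfamiliar with the pattern, a minimal sketch of what an Either type provides follows. This is illustrative only; the class discussed in the PR lives in `flink-java` and its exact API may differ:

```python
class Either:
    """Holds exactly one of two values: a Left (conventionally an
    error/alternative) or a Right (conventionally the result)."""

    def __init__(self, left=None, right=None, is_left=True):
        self._left, self._right, self._is_left = left, right, is_left

    @classmethod
    def Left(cls, value):
        return cls(left=value, is_left=True)

    @classmethod
    def Right(cls, value):
        return cls(right=value, is_left=False)

    def is_left(self):
        return self._is_left

    def left(self):
        if not self._is_left:
            raise ValueError("Right has no left value")
        return self._left

    def right(self):
        if self._is_left:
            raise ValueError("Left has no right value")
        return self._right
```

The point made in the thread is that such a type recurs often enough that serializing it efficiently in the core API is worthwhile, rather than forcing users to encode "one of two" with nullable fields.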



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2115) TableAPI throws ExpressionException for "Dangling GroupBy operation"

2015-11-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15013350#comment-15013350
 ] 

ASF GitHub Bot commented on FLINK-2115:
---

Github user tillrohrmann commented on the pull request:

https://github.com/apache/flink/pull/1377#issuecomment-158025348
  
@Li Chengxiang, you can always pull changes directly from the ASF git
repository: https://git-wip-us.apache.org/repos/asf/flink.git. Then you
don't have to wait for syncing with the github mirror.

On Thu, Nov 19, 2015 at 10:51 AM, Li Chengxiang 
wrote:

> Ok, I think it could be added in the Select operator description on the Table
> API page; it would depend on FLINK-2955. It seems to take quite a long time
> for merged commits to sync to GitHub, so I'm waiting for FLINK-2955 to get synced.
>
> —
> Reply to this email directly or view it on GitHub
> .
>



> TableAPI throws ExpressionException for "Dangling GroupBy operation"
> 
>
> Key: FLINK-2115
> URL: https://issues.apache.org/jira/browse/FLINK-2115
> Project: Flink
>  Issue Type: Bug
>  Components: Table API
>Affects Versions: 0.9
>Reporter: Fabian Hueske
>Assignee: Chengxiang Li
>
> The following program below throws an ExpressionException due to a "Dangling 
> GroupBy operation".
> However, I think the program is semantically correct and should execute.
> {code}
> public static void main(String[] args) throws Exception {
>   ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
>   DataSet<Integer> data = env.fromElements(1,2,2,3,3,3,4,4,4,4);
>   DataSet<Tuple2<Integer, Integer>> tuples = data
>   .map(new MapFunction<Integer, Tuple2<Integer, Integer>>() {
> @Override
> public Tuple2<Integer, Integer> map(Integer i) throws Exception {
>   return new Tuple2<Integer, Integer>(i, i*2);
> }
>   });
>   TableEnvironment tEnv = new TableEnvironment();
>   Table t = tEnv.toTable(tuples).as("i, i2")
>   .groupBy("i, i2").select("i, i2")
>   .groupBy("i").select("i, i.count as cnt");
>   tEnv.toSet(t, Row.class).print();
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-2115] [Table API] support no aggregatio...

2015-11-19 Thread tillrohrmann
Github user tillrohrmann commented on the pull request:

https://github.com/apache/flink/pull/1377#issuecomment-158025348
  
@Li Chengxiang, you can always pull changes directly from the ASF git
repository: https://git-wip-us.apache.org/repos/asf/flink.git. Then you
don't have to wait for syncing with the github mirror.

On Thu, Nov 19, 2015 at 10:51 AM, Li Chengxiang 
wrote:

> Ok, I think it could be added in the Select operator description on the Table
> API page; it would depend on FLINK-2955. It seems to take quite a long time
> for merged commits to sync to GitHub, so I'm waiting for FLINK-2955 to get synced.
>
> —
> Reply to this email directly or view it on GitHub
> .
>



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-3044) In YARN mode, configure FsStateBackend by default.

2015-11-19 Thread Aljoscha Krettek (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15013381#comment-15013381
 ] 

Aljoscha Krettek commented on FLINK-3044:
-

I recently ran a Job on GCE to test the state backend. If I remember correctly, 
I had to set the backend on the StreamExecutionEnv because the config in 
flink-conf.yaml was not respected.

> In YARN mode, configure FsStateBackend by default.
> --
>
> Key: FLINK-3044
> URL: https://issues.apache.org/jira/browse/FLINK-3044
> Project: Flink
>  Issue Type: Improvement
>  Components: Streaming, YARN Client
>Affects Versions: 0.10.0
>Reporter: Stephan Ewen
> Fix For: 1.0.0
>
>
> When starting a YARN session, the default state backend should be set to the 
> {{FsStateBackend}}, with the checkpoint directory set to the working 
> directory for that YARN App.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2732) Add access to the TaskManagers' log file and out file in the web dashboard.

2015-11-19 Thread Martin Liesenberg (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15013411#comment-15013411
 ] 

Martin Liesenberg commented on FLINK-2732:
--

I have two questions concerning the display and integration into the frontend:

In the `WebRuntimeMonitor` I can add handlers which serve the logs at a 
specified URL. As far as I can see the existing URLs follow a RESTful schema, 
so for the TaskManager logs that'd be "taskmanagers/:taskmanagerID:/[log | 
stdout]".
So my idea is to use the `StaticFileServerHandler` to serve the log files, 
which I'd modify to check for the above-mentioned URL and, if it matches, fetch the logs.
Does that sound ok to you?

Moreover, is there anything apart from adding two tabs for the log/stdout I 
need to add in the JavaScript? 

Best regards!
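A hedged sketch of the routing the comment proposes: the path pattern `taskmanagers/:taskmanagerID:/[log | stdout]` is taken from the comment above, but the helper below is hypothetical and not actual `WebRuntimeMonitor` code:

```python
import re

# Pattern for the proposed RESTful log endpoints:
#   /taskmanagers/<taskmanagerID>/log  and  /taskmanagers/<taskmanagerID>/stdout
TM_LOG_PATTERN = re.compile(r"^/taskmanagers/(?P<tm_id>[^/]+)/(?P<kind>log|stdout)$")

def route_taskmanager_file(path):
    """Return (taskmanagerID, 'log' | 'stdout') if the path matches the
    proposed schema, else None (meaning: fall through to other handlers)."""
    match = TM_LOG_PATTERN.match(path)
    if match is None:
        return None
    return match.group("tm_id"), match.group("kind")
```

A file-serving handler would only need such a match step before delegating to its normal static-file logic, which is why reusing the existing handler seems plausible.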

> Add access to the TaskManagers' log file and out file in the web dashboard.
> ---
>
> Key: FLINK-2732
> URL: https://issues.apache.org/jira/browse/FLINK-2732
> Project: Flink
>  Issue Type: Sub-task
>  Components: Webfrontend
>Affects Versions: 0.10.0
>Reporter: Stephan Ewen
>Assignee: Martin Liesenberg
> Fix For: 1.0.0
>
>
> Add access to the TaskManagers' log file and out file in the web dashboard.
> This needs addition on the server side, as the log files need to be 
> transferred   to the JobManager via the blob server.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-3002] Add Either type to the Java API

2015-11-19 Thread vasia
Github user vasia commented on the pull request:

https://github.com/apache/flink/pull/1371#issuecomment-158042526
  
Thanks. I'll squash the commits and merge later today.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-3002) Add an EitherType to the Java API

2015-11-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15013429#comment-15013429
 ] 

ASF GitHub Bot commented on FLINK-3002:
---

Github user vasia commented on the pull request:

https://github.com/apache/flink/pull/1371#issuecomment-158042526
  
Thanks. I'll squash the commits and merge later today.


> Add an EitherType to the Java API
> -
>
> Key: FLINK-3002
> URL: https://issues.apache.org/jira/browse/FLINK-3002
> Project: Flink
>  Issue Type: New Feature
>  Components: Java API
>Affects Versions: 0.10.0
>Reporter: Stephan Ewen
>Assignee: Vasia Kalavri
> Fix For: 1.0.0
>
>
> Either types are recurring patterns and should be serialized efficiently, so 
> it makes sense to add them to the core Java API.
> Since Java does not have such a type as of Java 8, we would need to add our 
> own version.
> The Scala API handles the Scala Either Type already efficiently. I would not 
> use the Scala Either Type in the Java API, since we are trying to get the 
> {{flink-java}} project "Scala free" for people that don't use Scala and do not 
> want to worry about Scala version matches and mismatches.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1945) Make python tests less verbose

2015-11-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15013432#comment-15013432
 ] 

ASF GitHub Bot commented on FLINK-1945:
---

Github user zentol commented on the pull request:

https://github.com/apache/flink/pull/1376#issuecomment-158042693
  
Merging this.


> Make python tests less verbose
> --
>
> Key: FLINK-1945
> URL: https://issues.apache.org/jira/browse/FLINK-1945
> Project: Flink
>  Issue Type: Improvement
>  Components: Python API, Tests
>Reporter: Till Rohrmann
>Assignee: Chesnay Schepler
>Priority: Minor
>
> Currently, the python tests print a lot of log messages to stdout. 
> Furthermore, there seem to be some println statements which clutter the 
> console output. I think that these log messages are not required for the 
> tests and thus should be suppressed. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2318) BroadcastVariable of unioned data set fails

2015-11-19 Thread Chesnay Schepler (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15013431#comment-15013431
 ] 

Chesnay Schepler commented on FLINK-2318:
-

I think I may have a fix for this issue; it can be found here:

https://github.com/zentol/flink/tree/2318_bc_union

> BroadcastVariable of unioned data set fails
> ---
>
> Key: FLINK-2318
> URL: https://issues.apache.org/jira/browse/FLINK-2318
> Project: Flink
>  Issue Type: Bug
>  Components: Distributed Runtime, Optimizer
>Affects Versions: 0.9
>Reporter: Fabian Hueske
>
> Using a unioned data set as broadcast variable such as this:
> {code}
> DataSet d1 = [...]
> DataSet d2 = [...]
> DataSet d3 = [...]
> d1
>   .map(new MyMapper())
>   .withBroadcastSet(d2.union(d3), "myBroadcast");
> {code}
> throws an exception at runtime:
> {code}
> java.lang.Exception: Call to registerInputOutput() of invokable failed
>   at org.apache.flink.runtime.taskmanager.Task.run(Task.java:504)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: Initializing the input streams failed 
> in Task MapPartition (MapPartition at 
> translatHashJoinAsMap(FlinkFlowStep.java:755)): Illegal input group size in 
> task configuration: -1
>   at 
> org.apache.flink.runtime.operators.RegularPactTask.registerInputOutput(RegularPactTask.java:246)
>   at org.apache.flink.runtime.taskmanager.Task.run(Task.java:501)
>   ... 1 more
> Caused by: java.lang.Exception: Illegal input group size in task 
> configuration: -1
>   at 
> org.apache.flink.runtime.operators.RegularPactTask.initBroadcastInputReaders(RegularPactTask.java:783)
>   at 
> org.apache.flink.runtime.operators.RegularPactTask.registerInputOutput(RegularPactTask.java:243)
>   ... 2 more
> {code}
> A simple workaround is to apply an identity mapper on the unioned data set.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-2440][py] Expand Environment feature co...

2015-11-19 Thread zentol
GitHub user zentol opened a pull request:

https://github.com/apache/flink/pull/1383

[FLINK-2440][py] Expand Environment feature coverage

Small PR that
* renames setDegreeOfParallelism() to setParallelism()
* adds getDegreeOfParallelism(), set-/getNumberOfExecutionRetries()


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zentol/flink 2440_py_env

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/1383.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1383


commit 7bd79f6acb52465149aa7189c89f49662c1f666c
Author: zentol 
Date:   2015-11-19T12:30:48Z

[FLINK-2440][py] Expand Environment feature coverage




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-3041) Twitter Streaming Description section of Streaming Programming guide refers to an incorrect example 'TwitterLocal'

2015-11-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15013445#comment-15013445
 ] 

ASF GitHub Bot commented on FLINK-3041:
---

Github user tillrohrmann commented on the pull request:

https://github.com/apache/flink/pull/1379#issuecomment-158044491
  
LGTM. Will merge this PR. Thanks for your work @smarthi.


> Twitter Streaming Description section of Streaming Programming guide refers 
> to an incorrect example 'TwitterLocal'
> --
>
> Key: FLINK-3041
> URL: https://issues.apache.org/jira/browse/FLINK-3041
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation, Examples, Streaming
>Affects Versions: 0.10.0
>Reporter: Suneel Marthi
>Assignee: Suneel Marthi
>Priority: Minor
> Fix For: 0.10.1
>
>
> Twitter Streaming Description section of Streaming Programming guide refers 
> to an incorrect example 'TwitterLocal', it should be 'TwitterStream'.  Fix 
> other typos in the Twitter streaming description and code cleanup.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-3047) Local batch execution: set number of task manager slots to the maximum parallelism

2015-11-19 Thread Maximilian Michels (JIRA)
Maximilian Michels created FLINK-3047:
-

 Summary: Local batch execution: set number of task manager slots 
to the maximum parallelism
 Key: FLINK-3047
 URL: https://issues.apache.org/jira/browse/FLINK-3047
 Project: Flink
  Issue Type: Bug
  Components: Batch, Local Runtime
Affects Versions: 0.10.0
Reporter: Maximilian Michels
Assignee: Maximilian Michels
Priority: Minor
 Fix For: 1.0.0, 0.10.1


The number of task slots for local execution is determined by the maximum 
parallelism found. However, if a default parallelism has been set, this 
parallelism is used as the upper bound for the number of task slots.

We should change this to always use the maximum parallelism as the number of 
task slots. Otherwise jobs which include operators with a parallelism higher 
than the default parallelism fail to execute locally.

For example, this fails

{noformat}
ExecutionEnvironment env = ..

env.setParallelism(2);

DataSet<Integer> set = env.fromElements(1,2,3,4)
.map(el -> el+1)
.setParallelism(4);

set.print();
{noformat}
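The proposed behavior amounts to sizing the local mini cluster by the largest parallelism found anywhere in the job instead of the default parallelism. A hypothetical sketch of that sizing rule (not Flink code):

```python
def local_task_slots(default_parallelism, operator_parallelisms):
    """Number of task slots the local mini cluster should offer.

    The old behavior capped slots at the default parallelism; the
    proposal is to use the maximum parallelism present in the program,
    so that operators configured via setParallelism() above the default
    can still be scheduled locally.
    """
    return max([default_parallelism] + list(operator_parallelisms))

# The failing example from the issue: default parallelism 2,
# one map operator explicitly set to parallelism 4.
```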



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3047) Local batch execution: set number of task manager slots to the maximum parallelism

2015-11-19 Thread Maximilian Michels (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15013402#comment-15013402
 ] 

Maximilian Michels commented on FLINK-3047:
---

This has already been fixed for batch execution.

> Local batch execution: set number of task manager slots to the maximum 
> parallelism
> --
>
> Key: FLINK-3047
> URL: https://issues.apache.org/jira/browse/FLINK-3047
> Project: Flink
>  Issue Type: Bug
>  Components: Batch, Local Runtime
>Affects Versions: 0.10.0
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Minor
> Fix For: 1.0.0, 0.10.1
>
>
> The number of task slots for local execution is determined by the maximum 
> parallelism found. However, if a default parallelism has been set, this 
> parallelism is used as the upper bound for the number of task slots.
> We should change this to always use the maximum parallelism as the number of 
> task slots. Otherwise jobs which include operators with a parallelism higher 
> than the default parallelism fail to execute locally.
> For example, this fails
> {noformat}
> ExecutionEnvironment env = ..
> env.setParallelism(2);
> DataSet<Integer> set = env.fromElements(1,2,3,4)
> .map(el -> el+1)
> .setParallelism(4);
> set.print();
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-1945][py] Python Tests less verbose

2015-11-19 Thread zentol
Github user zentol commented on the pull request:

https://github.com/apache/flink/pull/1376#issuecomment-158042693
  
Merging this.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-3043) Kafka Connector description in Streaming API guide is wrong/outdated

2015-11-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15013453#comment-15013453
 ] 

ASF GitHub Bot commented on FLINK-3043:
---

Github user tillrohrmann commented on a diff in the pull request:

https://github.com/apache/flink/pull/1380#discussion_r45335152
  
--- Diff: docs/apis/streaming_guide.md ---
@@ -3448,15 +3417,14 @@ Then, import the connector in your maven project:
 Note that the streaming connectors are currently not part of the binary 
distribution. See how to link with them for cluster execution 
[here](cluster_execution.html#linking-with-modules-not-contained-in-the-binary-distribution).
 
  Installing Apache Kafka
+
 * Follow the instructions from [Kafka's 
quickstart](https://kafka.apache.org/documentation.html#quickstart) to download 
the code and launch a server (launching a Zookeeper and a Kafka server is 
required every time before starting the application).
 * On 32 bit computers 
[this](http://stackoverflow.com/questions/22325364/unrecognized-vm-option-usecompressedoops-when-running-kafka-from-my-ubuntu-in)
 problem may occur.
 * If the Kafka and Zookeeper servers are running on a remote machine, then 
the `advertised.host.name` setting in the `config/server.properties` file must 
be set to the machine's IP address.
 
  Kafka Consumer
 
-The standard `FlinkKafkaConsumer082` is a Kafka consumer providing access 
to one topic.
-
-The following parameters have to be provided for the 
`FlinkKafkaConsumer082(...)` constructor:
+The standard `FlinkKafkaConsumer082` is a Kafka consumer providing access 
to one topic. It takes the following parameters to teh constructor:
--- End diff --

typo: teh => the


> Kafka Connector description in Streaming API guide is wrong/outdated
> 
>
> Key: FLINK-3043
> URL: https://issues.apache.org/jira/browse/FLINK-3043
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 0.10.0
>Reporter: Stephan Ewen
>Assignee: Stephan Ewen
> Fix For: 1.0.0, 0.10.1
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-3043] [docs] Fix description of Kafka C...

2015-11-19 Thread tillrohrmann
Github user tillrohrmann commented on a diff in the pull request:

https://github.com/apache/flink/pull/1380#discussion_r45335152
  
--- Diff: docs/apis/streaming_guide.md ---
@@ -3448,15 +3417,14 @@ Then, import the connector in your maven project:
 Note that the streaming connectors are currently not part of the binary 
distribution. See how to link with them for cluster execution 
[here](cluster_execution.html#linking-with-modules-not-contained-in-the-binary-distribution).
 
  Installing Apache Kafka
+
 * Follow the instructions from [Kafka's 
quickstart](https://kafka.apache.org/documentation.html#quickstart) to download 
the code and launch a server (launching a Zookeeper and a Kafka server is 
required every time before starting the application).
 * On 32 bit computers 
[this](http://stackoverflow.com/questions/22325364/unrecognized-vm-option-usecompressedoops-when-running-kafka-from-my-ubuntu-in)
 problem may occur.
 * If the Kafka and Zookeeper servers are running on a remote machine, then 
the `advertised.host.name` setting in the `config/server.properties` file must 
be set to the machine's IP address.
 
  Kafka Consumer
 
-The standard `FlinkKafkaConsumer082` is a Kafka consumer providing access 
to one topic.
-
-The following parameters have to be provided for the 
`FlinkKafkaConsumer082(...)` constructor:
+The standard `FlinkKafkaConsumer082` is a Kafka consumer providing access 
to one topic. It takes the following parameters to teh constructor:
--- End diff --

typo: teh => the


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-3020) Local streaming execution: set number of task manager slots to the maximum parallelism

2015-11-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15013309#comment-15013309
 ] 

ASF GitHub Bot commented on FLINK-3020:
---

Github user tillrohrmann commented on the pull request:

https://github.com/apache/flink/pull/1360#issuecomment-158019230
  
Yes, in such a case it would fail. Thinking about it, you're right that it 
would give a better user experience if the maximum degree of the job is taken 
instead of the default parallelism. Maybe we should change it then for the 
batch part as well.


> Local streaming execution: set number of task manager slots to the maximum 
> parallelism
> --
>
> Key: FLINK-3020
> URL: https://issues.apache.org/jira/browse/FLINK-3020
> Project: Flink
>  Issue Type: Bug
>  Components: Local Runtime
>Affects Versions: 0.10.0
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Minor
> Fix For: 1.0.0, 0.10.1
>
>
> The local execution configuration behavior is quite an inconvenience. It sets 
> the number of task slots of the mini cluster to the default parallelism. This 
> causes problems if you use {{setParallelism(parallelism)}} on an operator and 
> set a parallelism larger than the default parallelism.
> {noformat}
> Caused by: 
> org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: 
> Not enough free slots available to run the job. You can decrease the operator 
> parallelism or increase the number of slots per TaskManager in the 
> configuration. Task to schedule: < Attempt #0 (Flat Map (9/100)) @ 
> (unassigned) - [SCHEDULED] > with groupID < fa7240ee1fed08bd7e6278899db3e838 
> > in sharing group < SlotSharingGroup [f3d578e9819be9c39ceee86cf5eb8c08, 
> 8fa330746efa1d034558146e4604d0b4, fa7240ee1fed08bd7e6278899db3e838] >. 
> Resources available to scheduler: Number of instances=1, total number of 
> slots=8, available slots=0
>   at 
> org.apache.flink.runtime.jobmanager.scheduler.Scheduler.scheduleTask(Scheduler.java:256)
>   at 
> org.apache.flink.runtime.jobmanager.scheduler.Scheduler.scheduleImmediately(Scheduler.java:131)
>   at 
> org.apache.flink.runtime.executiongraph.Execution.scheduleForExecution(Execution.java:298)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionVertex.scheduleForExecution(ExecutionVertex.java:458)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionJobVertex.scheduleAll(ExecutionJobVertex.java:322)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraph.scheduleForExecution(ExecutionGraph.java:686)
>   at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:982)
>   at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:962)
>   at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:962)
>   at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
>   at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
>   at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41)
>   at 
> akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:401)
>   at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>   at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>   ... 2 more
> {noformat}
> I propose to change this behavior to setting the number of task slots to the 
> maximum parallelism present in the user program.
> What do you think?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-3020][streaming] set number of task slo...

2015-11-19 Thread tillrohrmann
Github user tillrohrmann commented on the pull request:

https://github.com/apache/flink/pull/1360#issuecomment-158019230
  
Yes, in such a case it would fail. Thinking about it, you're right that it 
would give a better user experience if the maximum degree of the job is taken 
instead of the default parallelism. Maybe we should change it then for the 
batch part as well.




[GitHub] flink pull request: Out-of-core state backend for JDBC databases

2015-11-19 Thread StephanEwen
Github user StephanEwen commented on a diff in the pull request:

https://github.com/apache/flink/pull/1305#discussion_r45351298
  
--- Diff: 
flink-contrib/flink-streaming-contrib/src/main/java/org/apache/flink/contrib/streaming/state/DbBackendConfig.java
 ---
@@ -0,0 +1,406 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.contrib.streaming.state;
+
+import java.io.Serializable;
+import java.sql.SQLException;
+import java.util.List;
+
+import org.apache.flink.contrib.streaming.state.ShardedConnection.Partitioner;
+
+import com.google.common.collect.Lists;
+
+/**
+ * 
+ * Configuration object for {@link DbStateBackend}, containing information 
to
+ * shard and connect to the databases that will store the state 
checkpoints.
+ *
+ */
+public class DbBackendConfig implements Serializable {
+
+   private static final long serialVersionUID = 1L;
+
+   // Database connection properties
+   private final String userName;
+   private final String userPassword;
+   private final List<String> shardUrls;
+
+   // JDBC Driver + DbAdapter information
+   private DbAdapter dbAdapter = new MySqlAdapter();
+   private String JDBCDriver = null;
+
+   private int maxNumberOfSqlRetries = 5;
+   private int sleepBetweenSqlRetries = 100;
+
+   // KvState properties
+   private int kvStateCacheSize = 1;
+   private int maxKvInsertBatchSize = 1000;
+   private float maxKvEvictFraction = 0.1f;
+   private int kvStateCompactionFreq = -1;
+
+   private Partitioner shardPartitioner;
+
+   /**
+* Creates a new sharded database state backend configuration with the 
given
+* parameters and default {@link MySqlAdapter}.
+* 
+* @param dbUserName
+*The username used to connect to the database at the given 
url.
+* @param dbUserPassword
+*The password used to connect to the database at the given 
url
+*and username.
+* @param dbShardUrls
+*The list of JDBC urls of the databases that will be used 
as
+*shards for the state backend. Sharding of the state will
+*happen based on the subtask index of the given task.
+*/
+   public DbBackendConfig(String dbUserName, String dbUserPassword, List<String> dbShardUrls) {
+   this.userName = dbUserName;
+   this.userPassword = dbUserPassword;
+   this.shardUrls = dbShardUrls;
+   }
+
+   /**
+* Creates a new database state backend configuration with the given
+* parameters and default {@link MySqlAdapter}.
+* 
+* @param dbUserName
+*The username used to connect to the database at the given 
url.
+* @param dbUserPassword
+*The password used to connect to the database at the given 
url
+*and username.
+* @param dbUrl
+*The JDBC url of the database for example
+*"jdbc:mysql://localhost:3306/flinkdb".
+*/
+   public DbBackendConfig(String dbUserName, String dbUserPassword, String 
dbUrl) {
+   this(dbUserName, dbUserPassword, Lists.newArrayList(dbUrl));
+   }
+
+   /**
+* The username used to connect to the database at the given urls.
+*/
+   public String getUserName() {
+   return userName;
+   }
+
+   /**
+* The password used to connect to the database at the given url and
+* username.
+*/
+   public String getUserPassword() {
+   return userPassword;
+   }
+
+   /**
+* Number of database shards defined.
+*/
+   public int getNumberOfShards() {
+   return shardUrls.size();
+   }
+
+   /**
+* Database shard urls as provided in the constructor.
+* 
+*/
+   public List<String> getShardUrls() {
+   return shardUrls;
+ 
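The javadoc above states that sharding happens based on the subtask index of the task. A minimal sketch of such a lookup, assuming a simple modulo mapping (the PR's real partitioning lives in `ShardedConnection` and may differ):

```java
import java.util.Arrays;
import java.util.List;

public class ShardLookup {

    /**
     * Picks the JDBC url for a given subtask, mirroring the documented
     * behavior: sharding based on the subtask index. The modulo keeps the
     * index valid even when there are more subtasks than shards.
     */
    static String shardForSubtask(List<String> shardUrls, int subtaskIndex) {
        return shardUrls.get(subtaskIndex % shardUrls.size());
    }

    public static void main(String[] args) {
        List<String> shards = Arrays.asList(
                "jdbc:mysql://db0:3306/flinkdb",
                "jdbc:mysql://db1:3306/flinkdb");
        System.out.println(shardForSubtask(shards, 0)); // prints the db0 url
        System.out.println(shardForSubtask(shards, 3)); // prints the db1 url (3 % 2 == 1)
    }
}
```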

[GitHub] flink pull request: Out-of-core state backend for JDBC databases

2015-11-19 Thread gyfora
Github user gyfora commented on a diff in the pull request:

https://github.com/apache/flink/pull/1305#discussion_r45351896
  
--- Diff: 
flink-contrib/flink-streaming-contrib/src/main/java/org/apache/flink/contrib/streaming/state/DbBackendConfig.java
 ---
@@ -0,0 +1,406 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.contrib.streaming.state;
+
+import java.io.Serializable;
+import java.sql.SQLException;
+import java.util.List;
+
+import org.apache.flink.contrib.streaming.state.ShardedConnection.Partitioner;
+
+import com.google.common.collect.Lists;
+
+/**
+ * 
+ * Configuration object for {@link DbStateBackend}, containing information 
to
+ * shard and connect to the databases that will store the state 
checkpoints.
+ *
+ */
+public class DbBackendConfig implements Serializable {
+
+   private static final long serialVersionUID = 1L;
+
+   // Database connection properties
+   private final String userName;
+   private final String userPassword;
+   private final List<String> shardUrls;
+
+   // JDBC Driver + DbAdapter information
+   private DbAdapter dbAdapter = new MySqlAdapter();
+   private String JDBCDriver = null;
+
+   private int maxNumberOfSqlRetries = 5;
+   private int sleepBetweenSqlRetries = 100;
+
+   // KvState properties
+   private int kvStateCacheSize = 1;
+   private int maxKvInsertBatchSize = 1000;
+   private float maxKvEvictFraction = 0.1f;
+   private int kvStateCompactionFreq = -1;
+
+   private Partitioner shardPartitioner;
+
+   /**
+* Creates a new sharded database state backend configuration with the 
given
+* parameters and default {@link MySqlAdapter}.
+* 
+* @param dbUserName
+*The username used to connect to the database at the given 
url.
+* @param dbUserPassword
+*The password used to connect to the database at the given 
url
+*and username.
+* @param dbShardUrls
+*The list of JDBC urls of the databases that will be used 
as
+*shards for the state backend. Sharding of the state will
+*happen based on the subtask index of the given task.
+*/
+   public DbBackendConfig(String dbUserName, String dbUserPassword, List<String> dbShardUrls) {
+   this.userName = dbUserName;
+   this.userPassword = dbUserPassword;
+   this.shardUrls = dbShardUrls;
+   }
+
+   /**
+* Creates a new database state backend configuration with the given
+* parameters and default {@link MySqlAdapter}.
+* 
+* @param dbUserName
+*The username used to connect to the database at the given 
url.
+* @param dbUserPassword
+*The password used to connect to the database at the given 
url
+*and username.
+* @param dbUrl
+*The JDBC url of the database for example
+*"jdbc:mysql://localhost:3306/flinkdb".
+*/
+   public DbBackendConfig(String dbUserName, String dbUserPassword, String 
dbUrl) {
+   this(dbUserName, dbUserPassword, Lists.newArrayList(dbUrl));
+   }
+
+   /**
+* The username used to connect to the database at the given urls.
+*/
+   public String getUserName() {
+   return userName;
+   }
+
+   /**
+* The password used to connect to the database at the given url and
+* username.
+*/
+   public String getUserPassword() {
+   return userPassword;
+   }
+
+   /**
+* Number of database shards defined.
+*/
+   public int getNumberOfShards() {
+   return shardUrls.size();
+   }
+
+   /**
+* Database shard urls as provided in the constructor.
+* 
+*/
+   public List<String> getShardUrls() {
+   return shardUrls;
+   }
 

[GitHub] flink pull request: [FLINK-3020][streaming] set number of task slo...

2015-11-19 Thread StephanEwen
Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/1360#issuecomment-158022328
  
+1 for taking the max parallelism of all operators


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-3020) Local streaming execution: set number of task manager slots to the maximum parallelism

2015-11-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15013327#comment-15013327
 ] 

ASF GitHub Bot commented on FLINK-3020:
---

Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/1360#issuecomment-158022328
  
+1 for taking the max parallelism of all operators


> Local streaming execution: set number of task manager slots to the maximum 
> parallelism
> --
>
> Key: FLINK-3020
> URL: https://issues.apache.org/jira/browse/FLINK-3020
> Project: Flink
>  Issue Type: Bug
>  Components: Local Runtime
>Affects Versions: 0.10.0
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Minor
> Fix For: 1.0.0, 0.10.1
>
>
> Quite an inconvenience is the local execution configuration behavior. It sets 
> the number of task slots of the mini cluster to the default parallelism. This 
> causes problem if you use {{setParallelism(parallelism)}} on an operator and 
> set a parallelism larger than the default parallelism.
> {noformat}
> Caused by: 
> org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: 
> Not enough free slots available to run the job. You can decrease the operator 
> parallelism or increase the number of slots per TaskManager in the 
> configuration. Task to schedule: < Attempt #0 (Flat Map (9/100)) @ 
> (unassigned) - [SCHEDULED] > with groupID < fa7240ee1fed08bd7e6278899db3e838 
> > in sharing group < SlotSharingGroup [f3d578e9819be9c39ceee86cf5eb8c08, 
> 8fa330746efa1d034558146e4604d0b4, fa7240ee1fed08bd7e6278899db3e838] >. 
> Resources available to scheduler: Number of instances=1, total number of 
> slots=8, available slots=0
>   at 
> org.apache.flink.runtime.jobmanager.scheduler.Scheduler.scheduleTask(Scheduler.java:256)
>   at 
> org.apache.flink.runtime.jobmanager.scheduler.Scheduler.scheduleImmediately(Scheduler.java:131)
>   at 
> org.apache.flink.runtime.executiongraph.Execution.scheduleForExecution(Execution.java:298)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionVertex.scheduleForExecution(ExecutionVertex.java:458)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionJobVertex.scheduleAll(ExecutionJobVertex.java:322)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraph.scheduleForExecution(ExecutionGraph.java:686)
>   at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:982)
>   at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:962)
>   at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:962)
>   at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
>   at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
>   at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41)
>   at 
> akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:401)
>   at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>   at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>   ... 2 more
> {noformat}
> I propose to change this behavior to setting the number of task slots to the 
> maximum parallelism present in the user program.
> What do you think?
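The proposed sizing rule, taking the largest parallelism used anywhere in the user program instead of the default parallelism, can be sketched as follows (a hypothetical standalone illustration, not Flink's actual scheduler code; the `Operator` class and `requiredSlots` helper here are assumptions):

```java
import java.util.Arrays;
import java.util.List;

public class SlotSizing {

    /** Hypothetical stand-in for an operator with an explicit parallelism. */
    static class Operator {
        final String name;
        final int parallelism;
        Operator(String name, int parallelism) {
            this.name = name;
            this.parallelism = parallelism;
        }
    }

    /**
     * Number of task slots the local mini cluster should offer: the maximum
     * parallelism of any operator, falling back to the default parallelism
     * when the program defines no operators.
     */
    static int requiredSlots(List<Operator> operators, int defaultParallelism) {
        return operators.stream()
                .mapToInt(op -> op.parallelism)
                .max()
                .orElse(defaultParallelism);
    }

    public static void main(String[] args) {
        // Mirrors the failure scenario above: default parallelism 8, but one
        // operator calls setParallelism(100). Sizing the mini cluster by the
        // default would give only 8 slots and trigger
        // NoResourceAvailableException; sizing by the maximum does not.
        List<Operator> job = Arrays.asList(
                new Operator("source", 1),
                new Operator("flatMap", 100),
                new Operator("sink", 8));
        System.out.println(requiredSlots(job, 8)); // prints 100
    }
}
```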





[jira] [Commented] (FLINK-2978) Integrate web submission interface into the new dashboard

2015-11-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15013761#comment-15013761
 ] 

ASF GitHub Bot commented on FLINK-2978:
---

Github user rmetzger commented on a diff in the pull request:

https://github.com/apache/flink/pull/1338#discussion_r45360635
  
--- Diff: 
flink-core/src/main/java/org/apache/flink/configuration/ConfigConstants.java ---
@@ -635,8 +644,18 @@
 * The default number of archived jobs for the jobmanager
 */
public static final int DEFAULT_JOB_MANAGER_WEB_ARCHIVE_COUNT = 5;
-   
-   
+
+   /**
+* By default, submitting jobs from the web-frontend is allowed.
+*/
+   public static final boolean DEFAULT_JOB_MANAGER_WEB_SUBMISSION = true;
+
+   /**
+* Default directory for uploaded file storage for the Web frontend.
+*/
+   public static final String DEFAULT_JOB_MANAGER_WEB_UPLOAD_DIR =
+           (System.getProperty("java.io.tmpdir") == null ? "/tmp" : System.getProperty("java.io.tmpdir")) + "/webmonitor/";
--- End diff --

I like your idea of putting the files into a `/jars/` directory in the webRootDir, so 
that the files are deleted on shutdown!
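A rough sketch of that idea with hypothetical names (the `jarUploadDir` helper and cleanup behavior are assumptions, not the actual web monitor code): resolve the upload directory as `jars/` beneath the web root, so that deleting the root on shutdown removes the uploads too.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

public class UploadDir {

    /**
     * Resolves the jar upload directory beneath the web monitor's root
     * directory; deleting the root on shutdown then removes uploads as well.
     */
    static File jarUploadDir(File webRootDir) throws IOException {
        File dir = new File(webRootDir, "jars");
        if (!dir.exists() && !dir.mkdirs()) {
            throw new IOException("Could not create upload directory " + dir);
        }
        return dir;
    }

    public static void main(String[] args) throws IOException {
        File root = Files.createTempDirectory("webmonitor").toFile();
        File jars = jarUploadDir(root);
        System.out.println(jars.getName()); // prints jars
        // Cleaning up the root directory removes the uploads with it.
        jars.delete();
        root.delete();
    }
}
```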


> Integrate web submission interface into the new dashboard
> -
>
> Key: FLINK-2978
> URL: https://issues.apache.org/jira/browse/FLINK-2978
> Project: Flink
>  Issue Type: New Feature
>  Components: Web Client, Webfrontend
>Reporter: Sachin Goel
>Assignee: Sachin Goel
>
> As discussed in 
> http://mail-archives.apache.org/mod_mbox/flink-dev/201511.mbox/%3CCAL3J2zQg6UBKNDnm=8tshpz6r4p2jvx7nrlom7caajrb9s6...@mail.gmail.com%3E,
>  we should integrate job submission from the web into the dashboard.





[GitHub] flink pull request: [FLINK-2913] Close of ObjectOutputStream shoul...

2015-11-19 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/1353




[jira] [Commented] (FLINK-2913) Close of ObjectOutputStream should be enclosed in finally block in FsStateBackend

2015-11-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15013767#comment-15013767
 ] 

ASF GitHub Bot commented on FLINK-2913:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/1353


> Close of ObjectOutputStream should be enclosed in finally block in 
> FsStateBackend
> -
>
> Key: FLINK-2913
> URL: https://issues.apache.org/jira/browse/FLINK-2913
> Project: Flink
>  Issue Type: Bug
>Reporter: Ted Yu
>Priority: Minor
>
> {code}
>   ObjectOutputStream os = new ObjectOutputStream(outStream);
>   os.writeObject(state);
>   os.close();
> {code}
> If IOException is thrown out of writeObject(), the close() call would be 
> skipped.
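A sketch of the fix (illustrative only, not the actual FsStateBackend code): try-with-resources closes the stream even when `writeObject()` throws, which is exactly what enclosing the close in a finally block achieves.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.io.Serializable;

public class SafeSerialize {

    /**
     * Serializes state into the given stream; try-with-resources guarantees
     * that the ObjectOutputStream is closed even if writeObject() throws,
     * unlike the original snippet where close() could be skipped.
     */
    static void writeState(OutputStream outStream, Serializable state) throws IOException {
        try (ObjectOutputStream os = new ObjectOutputStream(outStream)) {
            os.writeObject(state);
        } // os.close() runs here in all cases
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        writeState(buffer, "some state");
        System.out.println(buffer.size() > 0); // prints true
    }
}
```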





[jira] [Closed] (FLINK-3048) DataSinkTaskTest.testCancelDataSinkTask

2015-11-19 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen closed FLINK-3048.
---

> DataSinkTaskTest.testCancelDataSinkTask
> ---
>
> Key: FLINK-3048
> URL: https://issues.apache.org/jira/browse/FLINK-3048
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Reporter: Matthias J. Sax
>Assignee: Stephan Ewen
>Priority: Critical
>  Labels: test-stability
> Fix For: 1.0.0
>
>
> https://travis-ci.org/apache/flink/jobs/91941025
> {noformat}
> Tests run: 7, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 16.483 sec 
> <<< FAILURE! - in org.apache.flink.runtime.operators.DataSinkTaskTest
> testCancelDataSinkTask(org.apache.flink.runtime.operators.DataSinkTaskTest)  
> Time elapsed: 1.136 sec  <<< FAILURE!
> java.lang.AssertionError: Temp output file has not been removed
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertFalse(Assert.java:64)
>   at 
> org.apache.flink.runtime.operators.DataSinkTaskTest.testCancelDataSinkTask(DataSinkTaskTest.java:397)
> {noformat}





[jira] [Resolved] (FLINK-3048) DataSinkTaskTest.testCancelDataSinkTask

2015-11-19 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen resolved FLINK-3048.
-
   Resolution: Fixed
Fix Version/s: 1.0.0

Fixed via 93622001e499fa04bb5c4a63b1b3ed09b270f5b9

> DataSinkTaskTest.testCancelDataSinkTask
> ---
>
> Key: FLINK-3048
> URL: https://issues.apache.org/jira/browse/FLINK-3048
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Reporter: Matthias J. Sax
>Assignee: Stephan Ewen
>Priority: Critical
>  Labels: test-stability
> Fix For: 1.0.0
>
>
> https://travis-ci.org/apache/flink/jobs/91941025
> {noformat}
> Tests run: 7, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 16.483 sec 
> <<< FAILURE! - in org.apache.flink.runtime.operators.DataSinkTaskTest
> testCancelDataSinkTask(org.apache.flink.runtime.operators.DataSinkTaskTest)  
> Time elapsed: 1.136 sec  <<< FAILURE!
> java.lang.AssertionError: Temp output file has not been removed
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertFalse(Assert.java:64)
>   at 
> org.apache.flink.runtime.operators.DataSinkTaskTest.testCancelDataSinkTask(DataSinkTaskTest.java:397)
> {noformat}





[jira] [Resolved] (FLINK-2913) Close of ObjectOutputStream should be enclosed in finally block in FsStateBackend

2015-11-19 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen resolved FLINK-2913.
-
   Resolution: Fixed
 Assignee: Ted Yu
Fix Version/s: 1.0.0

Fixed via ff52d289113560273830421eceef82028d8bc99c

Thank you for the contribution!

> Close of ObjectOutputStream should be enclosed in finally block in 
> FsStateBackend
> -
>
> Key: FLINK-2913
> URL: https://issues.apache.org/jira/browse/FLINK-2913
> Project: Flink
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Minor
> Fix For: 1.0.0
>
>
> {code}
>   ObjectOutputStream os = new ObjectOutputStream(outStream);
>   os.writeObject(state);
>   os.close();
> {code}
> If IOException is thrown out of writeObject(), the close() call would be 
> skipped.





[jira] [Closed] (FLINK-2913) Close of ObjectOutputStream should be enclosed in finally block in FsStateBackend

2015-11-19 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen closed FLINK-2913.
---

> Close of ObjectOutputStream should be enclosed in finally block in 
> FsStateBackend
> -
>
> Key: FLINK-2913
> URL: https://issues.apache.org/jira/browse/FLINK-2913
> Project: Flink
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Minor
> Fix For: 1.0.0
>
>
> {code}
>   ObjectOutputStream os = new ObjectOutputStream(outStream);
>   os.writeObject(state);
>   os.close();
> {code}
> If IOException is thrown out of writeObject(), the close() call would be 
> skipped.





[GitHub] flink pull request: Out-of-core state backend for JDBC databases

2015-11-19 Thread StephanEwen
Github user StephanEwen commented on a diff in the pull request:

https://github.com/apache/flink/pull/1305#discussion_r45362408
  
--- Diff: 
flink-runtime/src/main/java/org/apache/flink/runtime/state/StateBackend.java ---
@@ -64,33 +64,35 @@
/**
 * Closes the state backend, releasing all internal resources, but does 
not delete any persistent
 * checkpoint data.
-* 
+*
 * @throws Exception Exceptions can be forwarded and will be logged by 
the system
 */
public abstract void close() throws Exception;
-   
+
// 

//  key/value state
// 

 
/**
 * Creates a key/value state backed by this state backend.
-* 
+*
+* @param operatorId Unique id for the operator creating the state
+* @param stateName Name of the created state
 * @param keySerializer The serializer for the key.
 * @param valueSerializer The serializer for the value.
 * @param defaultValue The value that is returned when no other value 
has been associated with a key, yet.
 * @param <K> The type of the key.
 * @param <V> The type of the value.
-* 
+*
 * @return A new key/value state backed by this backend.
-* 
+*
 * @throws Exception Exceptions may occur during initialization of the 
state and should be forwarded.
 */
-   public abstract <K, V> KvState<K, V> createKvState(
+   public abstract <K, V> KvState<K, V> createKvState(int operatorId, String stateName,
--- End diff --

I would like to get rid of this change and simply let the state backend 
create a UID for the state name.

This method is called once per proper creation of a state (so it should not 
need deterministic state naming). Recovery happens from the state handle, which 
can store all required info.
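Under that suggestion, a backend-generated UID could look roughly like this (a hypothetical sketch; the naming scheme is an assumption, not the PR's code):

```java
import java.util.UUID;

public class StateNaming {

    /**
     * A backend-generated unique name for a freshly created state: no
     * deterministic (operatorId, stateName) pair is needed, because recovery
     * goes through the state handle, which stores all required info.
     */
    static String newStateId() {
        return "kvstate-" + UUID.randomUUID();
    }

    public static void main(String[] args) {
        String a = newStateId();
        String b = newStateId();
        System.out.println(a.equals(b)); // prints false: each state gets a fresh id
    }
}
```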




[GitHub] flink pull request: Out-of-core state backend for JDBC databases

2015-11-19 Thread StephanEwen
Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/1305#issuecomment-158107515
  
I have a final comment inline. Otherwise, I think this is good to merge.



