[jira] [Commented] (FLINK-1390) java.lang.ClassCastException: X cannot be cast to X

2015-02-07 Thread Fabian Hueske (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14310629#comment-14310629
 ] 

Fabian Hueske commented on FLINK-1390:
--

The possibly related issue FLINK-1438 has been fixed.
Can we somehow check whether that also fixes this problem?

  java.lang.ClassCastException: X cannot be cast to X
 

 Key: FLINK-1390
 URL: https://issues.apache.org/jira/browse/FLINK-1390
 Project: Flink
  Issue Type: Bug
  Components: YARN Client
Affects Versions: 0.8
Reporter: Robert Metzger

 A user is affected by an issue which is probably caused by different 
 classloaders being used for loading user classes.
 Current state of investigation:
 - the error happens in YARN sessions (there is only a YARN environment 
 available)
 - the error doesn't happen the first time the job is executed; it 
 only happens on subsequent executions.
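
The "X cannot be cast to X" symptom can be reproduced outside of Flink. The following plain-Java sketch (all class names hypothetical, not from the ticket) defines the same class in two class loaders and shows that the resulting types are not cast-compatible:

```java
import java.io.ByteArrayOutputStream;
import java.io.InputStream;

public class ClassLoaderDemo {

    // A loader that defines Payload itself instead of delegating to its
    // parent, producing a second, incompatible copy of the class.
    public static class IsolatingLoader extends ClassLoader {
        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            if (name.equals(Payload.class.getName())) {
                try (InputStream in = getResourceAsStream(name.replace('.', '/') + ".class")) {
                    ByteArrayOutputStream buf = new ByteArrayOutputStream();
                    int b;
                    while ((b = in.read()) != -1) {
                        buf.write(b);
                    }
                    byte[] bytes = buf.toByteArray();
                    return defineClass(name, bytes, 0, bytes.length);
                } catch (Exception e) {
                    throw new ClassNotFoundException(name, e);
                }
            }
            return super.loadClass(name, resolve);
        }
    }

    public static class Payload {}

    public static void main(String[] args) throws Exception {
        // Payload as seen by the second loader: same name, different Class object.
        Object fromOtherLoader = new IsolatingLoader()
                .loadClass(Payload.class.getName())
                .getDeclaredConstructor()
                .newInstance();
        try {
            Payload p = (Payload) fromOtherLoader; // fails despite identical names
        } catch (ClassCastException e) {
            System.out.println("ClassCastException: " + e.getMessage());
        }
    }
}
```

If the user-code class is loaded once by the system class loader and once by a per-session class loader, any cast between the two copies fails exactly like this, even though both classes report the same name.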



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-1476) Flink VS Spark on loop test

2015-02-07 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske updated FLINK-1476:
-
Priority: Minor  (was: Critical)

 Flink VS Spark on loop test
 ---

 Key: FLINK-1476
 URL: https://issues.apache.org/jira/browse/FLINK-1476
 Project: Flink
  Issue Type: Test
Affects Versions: 0.7.0-incubating, 0.8
 Environment: 3 machines; each machine has 24 CPU cores, of which 
 16 are allocated for the tests. Memory: 3 * 32 GB
Reporter: xuhong
Priority: Minor

 In the last days, I did some tests on Flink and Spark. The results 
 show that Flink does better on many operations, such as GroupBy, Join, and 
 some complex jobs. But when I ran the KMeans, LinearRegression, and other loop 
 tests, I found that Flink is no better than Spark. I want to know 
 whether Flink is well suited for loop jobs compared to Spark.
 I added the code env.setDegreeOfParallelism(16) in each test to allocate the 
 same CPU cores as in the Spark tests.
 My English is not good, I hope you guys can understand me!
 The following is some config of my Flink:
 jobmanager.rpc.port: 6123
 jobmanager.heap.mb: 2048
 taskmanager.heap.mb: 2048
 taskmanager.numberOfTaskSlots: 24
 parallelization.degree.default: 72
 jobmanager.web.port: 8081
 webclient.port: 8085
 fs.overwrite-files: true
 taskmanager.memory.fraction: 0.8
 taskmanager.network.numberofBuffers: 7





[GitHub] flink pull request: [FLINK-1201] Add flink-gelly to flink-addons (...

2015-02-07 Thread rmetzger
Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/335#issuecomment-73366441
  
I'm really sorry that I've messed up this pull request by renaming 
flink-addons to flink-staging :(
I was doing it in a rush. Really sorry.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-1201) Graph API for Flink

2015-02-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14310746#comment-14310746
 ] 

ASF GitHub Bot commented on FLINK-1201:
---

Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/335#issuecomment-73366441
  
I'm really sorry that I've messed up this pull request by renaming 
flink-addons to flink-staging :(
I was doing it in a rush. Really sorry.


 Graph API for Flink 
 

 Key: FLINK-1201
 URL: https://issues.apache.org/jira/browse/FLINK-1201
 Project: Flink
  Issue Type: New Feature
Reporter: Kostas Tzoumas
Assignee: Vasia Kalavri

 This issue tracks the development of a Graph API/DSL for Flink.
 Until the code is pushed to the Flink repository, collaboration is happening 
 here: https://github.com/project-flink/flink-graph





[jira] [Commented] (FLINK-1438) ClassCastException for Custom InputSplit in local mode and invalid type code in distributed mode

2015-02-07 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14310891#comment-14310891
 ] 

Robert Metzger commented on FLINK-1438:
---

We should add this to release-0.8 as well, in my opinion.

 ClassCastException for Custom InputSplit in local mode and invalid type code 
 in distributed mode
 

 Key: FLINK-1438
 URL: https://issues.apache.org/jira/browse/FLINK-1438
 Project: Flink
  Issue Type: Bug
  Components: JobManager
Affects Versions: 0.8, 0.9
Reporter: Fabian Hueske
Assignee: Stephan Ewen
Priority: Minor
 Fix For: 0.9


 Jobs with custom InputSplits fail with a ClassCastException such as 
 {{org.apache.flink.examples.java.misc.CustomSplitTestJob$TestFileInputSplit 
 cannot be cast to 
 org.apache.flink.examples.java.misc.CustomSplitTestJob$TestFileInputSplit}} 
 if executed on a local setup. 
 This issue is probably related to different ClassLoaders being used by the 
 JobManager when InputSplits are generated and when they are handed to the 
 InputFormat by the TaskManager. Moving the class of the custom InputSplit 
 into the {{./lib}} folder and removing it from the job's jar makes the job work.
 To reproduce the bug, run the following job on a local setup. 
 {code}
 public class CustomSplitTestJob {

   public static void main(String[] args) throws Exception {
     ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
     DataSet<String> x = env.createInput(new TestFileInputFormat());
     x.print();
     env.execute();
   }

   public static class TestFileInputFormat implements InputFormat<String, TestFileInputSplit> {

     @Override
     public void configure(Configuration parameters) {
     }

     @Override
     public BaseStatistics getStatistics(BaseStatistics cachedStatistics) throws IOException {
       return null;
     }

     @Override
     public TestFileInputSplit[] createInputSplits(int minNumSplits) throws IOException {
       return new TestFileInputSplit[]{new TestFileInputSplit()};
     }

     @Override
     public InputSplitAssigner getInputSplitAssigner(TestFileInputSplit[] inputSplits) {
       return new LocatableInputSplitAssigner(inputSplits);
     }

     @Override
     public void open(TestFileInputSplit split) throws IOException {
     }

     @Override
     public boolean reachedEnd() throws IOException {
       return false;
     }

     @Override
     public String nextRecord(String reuse) throws IOException {
       return null;
     }

     @Override
     public void close() throws IOException {
     }
   }

   public static class TestFileInputSplit extends FileInputSplit {
   }
 }
 {code}
 The same happens in distributed mode just that Akka terminates the 
 transmission of the input split with a meaningless {{invalid type code: 00}}.





[jira] [Commented] (FLINK-1390) java.lang.ClassCastException: X cannot be cast to X

2015-02-07 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14310892#comment-14310892
 ] 

Robert Metzger commented on FLINK-1390:
---

I'll ask the user who reported the issue whether the change in 1438 resolved it.

  java.lang.ClassCastException: X cannot be cast to X
 

 Key: FLINK-1390
 URL: https://issues.apache.org/jira/browse/FLINK-1390
 Project: Flink
  Issue Type: Bug
  Components: YARN Client
Affects Versions: 0.8
Reporter: Robert Metzger

 A user is affected by an issue which is probably caused by different 
 classloaders being used for loading user classes.
 Current state of investigation:
 - the error happens in YARN sessions (there is only a YARN environment 
 available)
 - the error doesn't happen the first time the job is executed; it 
 only happens on subsequent executions.





[jira] [Commented] (FLINK-1201) Graph API for Flink

2015-02-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14310922#comment-14310922
 ] 

ASF GitHub Bot commented on FLINK-1201:
---

Github user andralungu commented on the pull request:

https://github.com/apache/flink/pull/335#issuecomment-73381932
  
I got the ACK from Apache. My ICLA has been filed :) 




 Graph API for Flink 
 

 Key: FLINK-1201
 URL: https://issues.apache.org/jira/browse/FLINK-1201
 Project: Flink
  Issue Type: New Feature
Reporter: Kostas Tzoumas
Assignee: Vasia Kalavri

 This issue tracks the development of a Graph API/DSL for Flink.
 Until the code is pushed to the Flink repository, collaboration is happening 
 here: https://github.com/project-flink/flink-graph





[jira] [Updated] (FLINK-100) Pact API Proposal: Add keyless CoGroup (send all to a single group)

2015-02-07 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske updated FLINK-100:

Priority: Minor  (was: Major)

 Pact API Proposal: Add keyless CoGroup (send all to a single group)
 ---

 Key: FLINK-100
 URL: https://issues.apache.org/jira/browse/FLINK-100
 Project: Flink
  Issue Type: Improvement
Reporter: GitHub Import
Priority: Minor
  Labels: github-import
 Fix For: pre-apache


 I propose to add a keyless version of CoGroup that groups both inputs in a 
 single group, analogous to the keyless Reducer version that was added in 
 https://github.com/dimalabs/ozone/pull/61
 ```
 CoGroupContract myCoGroup = CoGroupContract.builder(MyUdf.class)
 .input1(contractA)
 .input2(contractB)
 .build();
 ```
 I have a use case where I need to process the output of two contracts in a 
 single udf and I currently have to use the workaround to add a constant field 
 and use this as grouping key.
 Adding a keyless version would reduce the overhead (network traffic, 
 serialization and code-writing) and give the compiler additional knowledge 
 (The compiler knows that there will be only a single group and a single udf 
 call. If setAvgRecordsEmittedPerStubCall is set, it could infer the output 
 cardinality)
 Furthermore, I think this would be consistent, because CoGroup is like Reduce 
 for multiple inputs.
  Imported from GitHub 
 Url: https://github.com/stratosphere/stratosphere/issues/100
 Created by: [andrehacker|https://github.com/andrehacker]
 Labels: enhancement, 
 Created at: Sat Sep 14 23:15:59 CEST 2013
 State: open





[jira] [Commented] (FLINK-1219) Add support for Apache Tez as execution engine

2015-02-07 Thread Fabian Hueske (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14311019#comment-14311019
 ] 

Fabian Hueske commented on FLINK-1219:
--

Is this a duplicate of FLINK-972?

 Add support for Apache Tez as execution engine
 --

 Key: FLINK-1219
 URL: https://issues.apache.org/jira/browse/FLINK-1219
 Project: Flink
  Issue Type: New Feature
Reporter: Kostas Tzoumas
Assignee: Kostas Tzoumas

 This is an umbrella issue to track Apache Tez support.
 The goal is to be able to run unmodified Flink programs as Apache Tez jobs.





[jira] [Commented] (FLINK-1486) Add a string to the print method to identify output

2015-02-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14310983#comment-14310983
 ] 

ASF GitHub Bot commented on FLINK-1486:
---

Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/372#issuecomment-73387496
  
I think it would be nice to have some kind of hierarchical structure for the 
output, such as:
`$sinkName:$taskId > $outputValue`
That would give the name of the sink first, followed by the sink's task id, 
and finally, behind the `>` prompt, the actual output. Wouldn't that also make 
output parsing easier?

Looks good otherwise.


 Add a string to the print method to identify output
 ---

 Key: FLINK-1486
 URL: https://issues.apache.org/jira/browse/FLINK-1486
 Project: Flink
  Issue Type: Improvement
  Components: Local Runtime
Reporter: Max Michels
Assignee: Max Michels
Priority: Minor
  Labels: usability

 The output of the {{print}} method of {{DataSet}} is mainly used for debugging 
 purposes. Currently, it is difficult to identify the output.
 I would suggest adding another {{print(String str)}} method which allows the 
 user to supply a String to identify the output. This could be a prefix before 
 the actual output or a format string (which might be overkill).
 {code}
 DataSet<Integer> data = env.fromElements(1,2,3,4,5);
 {code}
 For example, {{data.print("MyDataSet: ")}} would print
 {noformat}
 MyDataSet: 1
 MyDataSet: 2
 ...
 {noformat}
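
 The proposed behavior can be sketched in plain Java (this is an illustrative stand-in, 
 not the actual {{DataSet}} implementation; the helper name {{format}} is hypothetical):

 {code}
 import java.util.Arrays;
 import java.util.List;
 import java.util.stream.Collectors;

 public class PrefixPrintSketch {

     // Stand-in for the proposed print(String): prepend a user-supplied
     // identifier to every record so that interleaved debug output of
     // several sinks can be told apart.
     public static List<String> format(String prefix, List<?> records) {
         return records.stream()
                 .map(r -> prefix + r)
                 .collect(Collectors.toList());
     }

     public static void main(String[] args) {
         for (String line : format("MyDataSet: ", Arrays.asList(1, 2, 3, 4, 5))) {
             System.out.println(line);
         }
     }
 }
 {code}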





[GitHub] flink pull request: [FLINK-1486] add print method for prefixing a ...

2015-02-07 Thread fhueske
Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/372#issuecomment-73387496
  
I think it would be nice to have some kind of hierarchical structure for the 
output, such as:
`$sinkName:$taskId > $outputValue`
That would give the name of the sink first, followed by the sink's task id, 
and finally, behind the `>` prompt, the actual output. Wouldn't that also make 
output parsing easier?

Looks good otherwise.




[jira] [Updated] (FLINK-1418) Make 'print()' output on the client command line, rather than on the task manager sysout

2015-02-07 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske updated FLINK-1418:
-
Priority: Minor  (was: Major)

 Make 'print()' output on the client command line, rather than on the task 
 manager sysout
 

 Key: FLINK-1418
 URL: https://issues.apache.org/jira/browse/FLINK-1418
 Project: Flink
  Issue Type: Improvement
  Components: Java API
Affects Versions: 0.9
Reporter: Stephan Ewen
Priority: Minor

 Right now, the {{print()}} command prints inside the data sinks where the 
 code runs. It should pull data back to the client and print it there.





[jira] [Updated] (FLINK-1414) Remove quickstart-*.sh from git source and put them to the website's svn

2015-02-07 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske updated FLINK-1414:
-
Priority: Minor  (was: Major)

 Remove quickstart-*.sh from git source and put them to the website's svn
 

 Key: FLINK-1414
 URL: https://issues.apache.org/jira/browse/FLINK-1414
 Project: Flink
  Issue Type: Task
  Components: Project Website
Reporter: Robert Metzger
Priority: Minor

 The quickstart-*.sh scripts are currently (for historic reasons) located in 
 the main source repo.
 They probably fit better into the homepage repository because they are 
 independent of the versions in the pom.xml files. 
 This also makes release maintenance easier.





[jira] [Updated] (FLINK-1321) New web interface, contains parts from WebInfoServer and WebClient

2015-02-07 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske updated FLINK-1321:
-
Priority: Minor  (was: Major)

 New web interface, contains parts from WebInfoServer and WebClient
 --

 Key: FLINK-1321
 URL: https://issues.apache.org/jira/browse/FLINK-1321
 Project: Flink
  Issue Type: New Feature
  Components: JobManager, Webfrontend
Reporter: Matthias Schumacher
Priority: Minor
 Fix For: 0.7.0-incubating


 The new webserver is based on the data from the runtime WebInfoServer and is 
 extended with the functionality and the graph view from the WebClient.





[jira] [Updated] (FLINK-1340) Project restructure

2015-02-07 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske updated FLINK-1340:
-
Priority: Minor  (was: Major)

 Project restructure
 ---

 Key: FLINK-1340
 URL: https://issues.apache.org/jira/browse/FLINK-1340
 Project: Flink
  Issue Type: Task
  Components: Java API, Scala API, Streaming
Affects Versions: 0.9
Reporter: Márton Balassi
Priority: Minor

 During a recent PR for the streaming Scala API [1], the issue of possibly 
 changing the project structure arose. From the discussion it seems to me that 
 we should address this as a separate issue. Things to note:
* According to Stephan, for the batch part there are discussions
 to combine the flink-core and flink-java projects, possibly also the 
 flink-scala project. We are starting to see too many interdependencies. [2]
* Streaming is currently under flink-addons, but we are positive that for 
 the next version we can come up with a fairly stable API if needed. We would 
 like to have it top level eventually.
* Minor issue to keep in mind: developing our projects with both Scala and 
 Java nature seems a bit flaky at the moment, at least for Eclipse. [3] 
 Proposed solutions are also there; just let us make sure to give new 
 developers a smooth experience with Flink.
 I personally like the following suggestion: [2]
 We could, in the next version, go for something like 
 - flink-core (core and batch, Java & Scala) 
 - flink-streaming (Java & Scala) 
 - flink-runtime 
 - ...
 Ufuk also +1'd this.
 [1] https://github.com/apache/incubator-flink/pull/275
 [2] https://github.com/apache/incubator-flink/pull/275#issuecomment-68049822
 [3] 
 http://mail-archives.apache.org/mod_mbox/incubator-flink-dev/201412.mbox/%3CCANC1h_tLtGeOxT-aaA5KR6V4m-Efz8fSN5yKcdX%2B7sjeTdFBEw%40mail.gmail.com%3E





[jira] [Commented] (FLINK-785) Add Chained operators for AllReduce and AllGroupReduce

2015-02-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14310996#comment-14310996
 ] 

ASF GitHub Bot commented on FLINK-785:
--

Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/370#issuecomment-73388209
  
Looks good in general. 
You need to make sure, though, that you obey the execution object re-use 
settings.
That basically means you need to pay attention to the objects that you give 
to and receive from the user code, and possibly copy them to new objects (see 
FLINK-1285 and @aljoscha's commits for details). 


 Add Chained operators for AllReduce and AllGroupReduce
 --

 Key: FLINK-785
 URL: https://issues.apache.org/jira/browse/FLINK-785
 Project: Flink
  Issue Type: Improvement
Reporter: GitHub Import
  Labels: github-import
 Fix For: pre-apache


 Because the operators `AllReduce` and `AllGroupReduce` are used both for the 
 pre-reduce (combiner side) and the final reduce, they would greatly benefit 
 from a chained version.
  Imported from GitHub 
 Url: https://github.com/stratosphere/stratosphere/issues/785
 Created by: [StephanEwen|https://github.com/StephanEwen]
 Labels: runtime, 
 Milestone: Release 0.6 (unplanned)
 Created at: Sun May 11 17:41:12 CEST 2014
 State: open





[jira] [Closed] (FLINK-98) Temp Files are not removed if job fails

2015-02-07 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-98?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske closed FLINK-98.
--
Resolution: Duplicate

Duplicate of FLINK-1483

 Temp Files are not removed if job fails
 ---

 Key: FLINK-98
 URL: https://issues.apache.org/jira/browse/FLINK-98
 Project: Flink
  Issue Type: Bug
  Components: Local Runtime, TaskManager
Affects Versions: 0.7.0-incubating
Reporter: Fabian Hueske
  Labels: github-import
 Fix For: pre-apache


 If a job fails, temp files such as sorted runs might not be removed.
  Imported from GitHub 
 Url: https://github.com/stratosphere/stratosphere/issues/98
 Created by: [fhueske|https://github.com/fhueske]
 Labels: bug, runtime, 
 Created at: Fri Sep 13 21:17:24 CEST 2013
 State: open





[jira] [Updated] (FLINK-1256) Use Kryo for user code object serialization

2015-02-07 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske updated FLINK-1256:
-
Priority: Minor  (was: Major)

 Use Kryo for user code object serialization
 ---

 Key: FLINK-1256
 URL: https://issues.apache.org/jira/browse/FLINK-1256
 Project: Flink
  Issue Type: Improvement
Reporter: Till Rohrmann
Priority: Minor

 Currently, Flink uses java serialization for user code objects (UDFs). This 
 limits the set of supported types which can be caught in the closure or are 
 contained as members in the user code object. Maybe we can add Kryo as a 
 second serialization mechanism which is used if the Java serialization fails. 
 That way we would support more complex user code objects which finally 
 improves the user experience.





[jira] [Updated] (FLINK-1278) Remove the Record special code paths

2015-02-07 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske updated FLINK-1278:
-
Priority: Minor  (was: Major)

 Remove the Record special code paths
 

 Key: FLINK-1278
 URL: https://issues.apache.org/jira/browse/FLINK-1278
 Project: Flink
  Issue Type: Bug
  Components: Local Runtime
Affects Versions: 0.8
Reporter: Stephan Ewen
Assignee: Kostas Tzoumas
Priority: Minor
 Fix For: 0.9


 There are some legacy Record code paths in the runtime which are often 
 not kept in sync and cause errors if people actually use Records.





[jira] [Updated] (FLINK-1180) Add support for Hadoop MapReduce.* API Mappers and Reducers

2015-02-07 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske updated FLINK-1180:
-
Priority: Minor  (was: Major)

 Add support for Hadoop MapReduce.* API Mappers and Reducers
 ---

 Key: FLINK-1180
 URL: https://issues.apache.org/jira/browse/FLINK-1180
 Project: Flink
  Issue Type: Task
  Components: Hadoop Compatibility
Affects Versions: 0.7.0-incubating
Reporter: Mohitdeep Singh
Assignee: Mohitdeep Singh
Priority: Minor
  Labels: hadoop

 Flink currently supports Hadoop's mapred-API Mapper and Reducer functions, but 
 not the mapreduce API. 
 Reference: email exchange on the Flink mailing list.
 ...Another option would be to extend the Hadoop Compatibility Layer. Right 
 now, we have wrappers for Hadoop's mapred-API function (Mapper, Reducer), but 
 not for the mapreduce-API functions [2]. Having wrappers for mapreduce-API 
 functions would also be cool. There is no JIRA for this issue yet. 





[jira] [Updated] (FLINK-1309) Upload Flink logo with Flink name

2015-02-07 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske updated FLINK-1309:
-
Priority: Minor  (was: Major)

 Upload Flink logo with Flink name
 -

 Key: FLINK-1309
 URL: https://issues.apache.org/jira/browse/FLINK-1309
 Project: Flink
  Issue Type: Improvement
  Components: Project Website
Reporter: Kostas Tzoumas
Assignee: Kostas Tzoumas
Priority: Minor

 The website only contains logos of the mascot.
 It would be good to have the mascot + the Flink name next to or under it so 
 that people can copy-paste it into their slides.





[jira] [Updated] (FLINK-1116) Packaged Scala Examples do not work due to missing test data

2015-02-07 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske updated FLINK-1116:
-
Priority: Minor  (was: Major)

 Packaged Scala Examples do not work due to missing test data
 

 Key: FLINK-1116
 URL: https://issues.apache.org/jira/browse/FLINK-1116
 Project: Flink
  Issue Type: Bug
  Components: Scala API
Reporter: Stephan Ewen
Priority: Minor

 The example data classes are in the Java examples project. The Maven jar 
 plugin cannot include them in the jars of the Scala examples, causing the 
 examples to fail with a ClassNotFoundException when starting the example job.
 For now, I disabled the Scala example jars from being built, because they do 
 not work anyway.





[GitHub] flink pull request: [FLINK-1179] Add button to JobManager web inte...

2015-02-07 Thread chiwanpark
GitHub user chiwanpark opened a pull request:

https://github.com/apache/flink/pull/374

[FLINK-1179] Add button to JobManager web interface to request stack trace 
of a TaskManager

This PR contains the following changes:
* Add public constructors to `org.apache.flink.runtime.instance.InstanceID` 
for sending an instance ID from the web interface to the job manager
* Add a helper method called `getRegisteredInstanceById(InstanceID)` to 
`org.apache.flink.runtime.instance.InstanceManager` for finding the Akka actor 
for an instance ID
* Add Akka messages called `RequestStackTrace`, `SendStackTrace` and 
`StackTrace`
* Modify the task manager page in the job manager's web interface to request 
and show the stack trace of a task manager

The following image is a screenshot of the job manager's web interface.

![screen shot 2015-02-08 at 3 49 51 
pm](https://cloud.githubusercontent.com/assets/1941681/6095765/9293e996-afaf-11e4-9e8e-4dcd69ce595b.png)


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/chiwanpark/flink FLINK-1179

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/374.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #374


commit 83ab236ce52dd8c3b0aa0b94ee8644a4de28e152
Author: Chiwan Park chiwanp...@icloud.com
Date:   2015-02-08T06:36:19Z

[FLINK-1179] [runtime] Add helper method for InstanceID

commit a2a0a0f8261851c93330417f3c60a16c5f1d2dd5
Author: Chiwan Park chiwanp...@icloud.com
Date:   2015-02-08T07:05:29Z

[FLINK-1179] Add internal API for obtaining StackTrace

commit 423e64ca4cc6ad4a9396e4418eab95ce5b81b219
Author: Chiwan Park chiwanp...@icloud.com
Date:   2015-02-08T07:10:03Z

[FLINK-1179] [jobmanager] Add button to JobManager web interface to request 
stack trace of a TaskManager






[GitHub] flink pull request: [FLINK-785] Chained AllReduce

2015-02-07 Thread fhueske
Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/370#issuecomment-73388209
  
Looks good in general. 
You need to make sure, though, that you obey the execution object re-use 
settings.
That basically means you need to pay attention to the objects that you give 
to and receive from the user code, and possibly copy them to new objects (see 
FLINK-1285 and @aljoscha's commits for details). 




[jira] [Updated] (FLINK-685) Add support for semi-joins

2015-02-07 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske updated FLINK-685:

Priority: Minor  (was: Major)

 Add support for semi-joins
 --

 Key: FLINK-685
 URL: https://issues.apache.org/jira/browse/FLINK-685
 Project: Flink
  Issue Type: New Feature
Reporter: GitHub Import
Priority: Minor
  Labels: github-import
 Fix For: pre-apache


 A semi-join is basically a join filter. One input is filtering and the 
 other one is filtered.
 A tuple of the filtered input is emitted exactly once if the filtering 
 input has one (or more) tuples with matching join keys. That means that the 
 output of a semi-join has the same type as the filtered input, and the 
 filtering input is completely discarded.
 In order to support semi-joins, we need to add an additional physical 
 execution strategy that ensures that a tuple of the filtered input is 
 emitted only once even if the filtering input has more than one tuple with 
 matching keys. Furthermore, a couple of optimizations compared to standard 
 joins are possible, such as storing only the keys, and not the full tuples, 
 of the filtering input in a hash table.
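
 The described semantics can be sketched in plain Java over collections (an 
 illustrative sketch only, not the proposed runtime strategy; all names are 
 hypothetical):

 ```
 import java.util.Arrays;
 import java.util.HashSet;
 import java.util.List;
 import java.util.Set;
 import java.util.function.Function;
 import java.util.stream.Collectors;

 public class SemiJoinSketch {

     // Emit each element of the filtered input at most once if the filtering
     // input contains a matching key; the filtering input itself is discarded.
     public static <T, K> List<T> semiJoin(List<T> filtered,
                                           Set<K> filteringKeys,
                                           Function<T, K> keyOf) {
         // Only the keys of the filtering input are kept (as a Set), not the
         // full tuples -- the hash-table optimization mentioned above.
         return filtered.stream()
                 .filter(t -> filteringKeys.contains(keyOf.apply(t)))
                 .collect(Collectors.toList());
     }

     public static void main(String[] args) {
         List<String> orders = Arrays.asList("a:1", "b:2", "a:3", "c:4");
         Set<String> activeKeys = new HashSet<>(Arrays.asList("a", "c"));
         // Keeps only orders whose key appears in activeKeys.
         System.out.println(semiJoin(orders, activeKeys, s -> s.split(":")[0]));
     }
 }
 ```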
  Imported from GitHub 
 Url: https://github.com/stratosphere/stratosphere/issues/685
 Created by: [fhueske|https://github.com/fhueske]
 Labels: enhancement, java api, runtime, 
 Milestone: Release 0.6 (unplanned)
 Created at: Mon Apr 14 12:05:29 CEST 2014
 State: open





[jira] [Updated] (FLINK-687) Add support for outer-joins

2015-02-07 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske updated FLINK-687:

Priority: Minor  (was: Major)

 Add support for outer-joins
 ---

 Key: FLINK-687
 URL: https://issues.apache.org/jira/browse/FLINK-687
 Project: Flink
  Issue Type: New Feature
Reporter: GitHub Import
Priority: Minor
  Labels: github-import
 Fix For: pre-apache


 There are three types of outer-joins:
 - left outer,
 - right outer, and
 - full outer
 joins.
 An outer-join does not filter out tuples of the outer side that do not find a 
 matching tuple on the other side. Instead, they are joined with a NULL value.
 Supporting outer-joins requires some modifications of the join execution 
 strategies.
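
 The left-outer case can be sketched in plain Java (illustrative only, not the 
 proposed execution strategy; all names are hypothetical):

 ```
 import java.util.ArrayList;
 import java.util.Arrays;
 import java.util.HashMap;
 import java.util.List;
 import java.util.Map;

 public class OuterJoinSketch {

     // Left outer join over simple (key, value) pairs: every left tuple is
     // emitted; a missing match on the right is represented by null instead
     // of the tuple being dropped as in an inner join.
     public static List<String> leftOuterJoin(List<String[]> left, Map<String, String> right) {
         List<String> out = new ArrayList<>();
         for (String[] pair : left) {
             String match = right.get(pair[0]); // null if no matching key
             out.add(pair[1] + "|" + match);
         }
         return out;
     }

     public static void main(String[] args) {
         List<String[]> left = Arrays.asList(
                 new String[]{"k1", "A"}, new String[]{"k2", "B"});
         Map<String, String> right = new HashMap<>();
         right.put("k1", "X");
         System.out.println(leftOuterJoin(left, right));
     }
 }
 ```

 Right outer is the mirror image, and full outer keeps unmatched tuples from both sides.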
  Imported from GitHub 
 Url: https://github.com/stratosphere/stratosphere/issues/687
 Created by: [fhueske|https://github.com/fhueske]
 Labels: enhancement, java api, runtime, 
 Created at: Mon Apr 14 12:09:00 CEST 2014
 State: open





[jira] [Updated] (FLINK-603) Add Spargel GAS pattern

2015-02-07 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske updated FLINK-603:

Priority: Minor  (was: Major)

 Add Spargel GAS pattern
 ---

 Key: FLINK-603
 URL: https://issues.apache.org/jira/browse/FLINK-603
 Project: Flink
  Issue Type: Improvement
Reporter: GitHub Import
Priority: Minor
  Labels: github-import
 Fix For: pre-apache


 GAS = Gather, Apply, Scatter
  Imported from GitHub 
 Url: https://github.com/stratosphere/stratosphere/issues/603
 Created by: [rmetzger|https://github.com/rmetzger]
 Labels: java api, 
 Created at: Tue Mar 18 16:57:29 CET 2014
 State: open





[jira] [Updated] (FLINK-788) Enable combiner for Reduce with Broadcast Variable

2015-02-07 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske updated FLINK-788:

Priority: Minor  (was: Major)

 Enable combiner for Reduce with Broadcast Variable
 --

 Key: FLINK-788
 URL: https://issues.apache.org/jira/browse/FLINK-788
 Project: Flink
  Issue Type: Improvement
Reporter: GitHub Import
Priority: Minor
  Labels: github-import
 Fix For: pre-apache


 If the `ReduceFunction` uses a broadcast variable, we currently cannot 
 execute the combiner side of the function. The reason is that the combiner 
 is inserted dynamically and has no access to the broadcast variable.
  Imported from GitHub 
 Url: https://github.com/stratosphere/stratosphere/issues/788
 Created by: [StephanEwen|https://github.com/StephanEwen]
 Labels: 
 Created at: Mon May 12 03:01:18 CEST 2014
 State: open


