[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings
[ https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504975#comment-14504975 ]

ASF GitHub Bot commented on FLINK-377:
--
Github user fhueske commented on the pull request:
https://github.com/apache/flink/pull/202#issuecomment-94799438

Indeed! Very happy to have this in :-)

Create a general purpose framework for language bindings

Key: FLINK-377
URL: https://issues.apache.org/jira/browse/FLINK-377
Project: Flink
Issue Type: Improvement
Reporter: GitHub Import
Assignee: Chesnay Schepler
Labels: github-import
Fix For: pre-apache

A general purpose API to run operators with arbitrary binaries. This will allow to run Stratosphere programs written in Python, JavaScript, Ruby, Go or whatever you like.

We suggest using Google Protocol Buffers for data serialization. This is the list of languages that currently support ProtoBuf: https://code.google.com/p/protobuf/wiki/ThirdPartyAddOns

Very early prototype with python: https://github.com/rmetzger/scratch/tree/learn-protobuf (basically testing protobuf)
For Ruby: https://github.com/infochimps-labs/wukong

Two new students working at Stratosphere (@skunert and @filiphaase) are working on this. The reference binding language will be for Python, but other bindings are very welcome. The best name for this so far is stratosphere-lang-bindings.

I created this issue to track the progress (and give everybody a chance to comment on this)

Imported from GitHub
Url: https://github.com/stratosphere/stratosphere/issues/377
Created by: [rmetzger|https://github.com/rmetzger]
Labels: enhancement,
Assignee: [filiphaase|https://github.com/filiphaase]
Created at: Tue Jan 07 19:47:20 CET 2014
State: open

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
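Note on the issue description: the idea of running an operator as an arbitrary binary can be illustrated with a small sketch. This is not the code from the pull request. It assumes a simple length-prefixed byte framing over stdin/stdout in place of Protocol Buffers, and the names `CHILD_OPERATOR`, `frame` and `run_external_map` are hypothetical. The external operator here is a Python one-liner script, but any executable that reads and writes the same framing could be substituted.

```python
import struct
import subprocess
import sys

# Stand-in for an operator written in another language (assumption for this
# sketch): it upper-cases every record it receives on stdin.
CHILD_OPERATOR = r"""
import struct, sys
inp, out = sys.stdin.buffer, sys.stdout.buffer
while True:
    header = inp.read(4)
    if len(header) < 4:          # EOF: no more records
        break
    (size,) = struct.unpack(">I", header)
    record = inp.read(size)
    result = record.upper()
    out.write(struct.pack(">I", len(result)) + result)
out.flush()
"""


def frame(record):
    """Prefix a record with its length as a 4-byte big-endian integer."""
    return struct.pack(">I", len(record)) + record


def run_external_map(command, records):
    """Send all records to an external process and collect its framed output."""
    payload = b"".join(frame(r) for r in records)
    completed = subprocess.run(
        command, input=payload, stdout=subprocess.PIPE, check=True
    )
    output, results, pos = completed.stdout, [], 0
    while pos < len(output):
        (size,) = struct.unpack_from(">I", output, pos)
        results.append(output[pos + 4:pos + 4 + size])
        pos += 4 + size
    return results


if __name__ == "__main__":
    print(run_external_map([sys.executable, "-c", CHILD_OPERATOR],
                           [b"flink", b"python"]))
```

The sketch sends the whole input in one batch for simplicity; a real binding would presumably stream records in both directions.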
[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings
[ https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505001#comment-14505001 ] ASF GitHub Bot commented on FLINK-377: -- Github user hsaputra commented on the pull request: https://github.com/apache/flink/pull/202#issuecomment-94809876 W00t! Nice to see this one get in.
[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings
[ https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504967#comment-14504967 ] ASF GitHub Bot commented on FLINK-377: -- Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/202
[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings
[ https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504834#comment-14504834 ] ASF GitHub Bot commented on FLINK-377: -- Github user zentol commented on the pull request: https://github.com/apache/flink/pull/202#issuecomment-94760254 @rmetzger Done. Unless you want me to merge commits as well.
[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings
[ https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504857#comment-14504857 ] ASF GitHub Bot commented on FLINK-377: -- Github user zentol commented on the pull request: https://github.com/apache/flink/pull/202#issuecomment-94771487 nah I'll do it.
[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings
[ https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504858#comment-14504858 ] ASF GitHub Bot commented on FLINK-377: -- Github user zentol commented on the pull request: https://github.com/apache/flink/pull/202#issuecomment-94774209 Done
[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings
[ https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504839#comment-14504839 ] ASF GitHub Bot commented on FLINK-377: -- Github user rmetzger commented on the pull request: https://github.com/apache/flink/pull/202#issuecomment-94761465 I can also do the squashing if you want ;)
[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings
[ https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504838#comment-14504838 ] ASF GitHub Bot commented on FLINK-377: -- Github user rmetzger commented on the pull request: https://github.com/apache/flink/pull/202#issuecomment-94761408 Thanks a lot. Can you squash the commits after mingliang's example into one commit, prefixed with FLINK-671? Then we'll have 4 commits for the change, which is totally okay given its size.
[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings
[ https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14502729#comment-14502729 ] ASF GitHub Bot commented on FLINK-377: -- Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/202#issuecomment-94443054 I just ran it on the cluster. Works like a charm. :smile: For word count, python takes 12 minutes, java about 2:40. But this should be expected, I guess. Good to merge now, in my opinion.
[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings
[ https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14502733#comment-14502733 ] ASF GitHub Bot commented on FLINK-377: -- Github user rmetzger commented on the pull request: https://github.com/apache/flink/pull/202#issuecomment-94443599 Can you give me a +1 in the discussion on the ML as well?
[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings
[ https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14502545#comment-14502545 ] ASF GitHub Bot commented on FLINK-377: -- Github user rmetzger commented on the pull request: https://github.com/apache/flink/pull/202#issuecomment-94401023 I wrote to the mailing list to discuss the state of this PR.
[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings
[ https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14502558#comment-14502558 ] ASF GitHub Bot commented on FLINK-377: -- Github user zentol commented on the pull request: https://github.com/apache/flink/pull/202#issuecomment-94403584 yes that is correct.
[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings
[ https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14502551#comment-14502551 ] ASF GitHub Bot commented on FLINK-377: -- Github user zentol commented on the pull request: https://github.com/apache/flink/pull/202#issuecomment-94402062 @aljoscha The timeout is removed. Data transfer is still done with mapped files; access to these files is synchronized using TCP. I'm not sure what you mean with your last sentence.
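Note: the arrangement described in this comment (payload in a memory-mapped file, TCP used only to signal when the file may be read or overwritten) could look roughly like the sketch below. It is an illustration under assumptions, not the code from the pull request: both sides run in one Python process, the signal format (a 4-byte size, a 1-byte acknowledgement, a negative size to finish) is invented for the example, and `BUFFER_SIZE`, `recv_exact`, `consumer` and `transfer` are hypothetical names.

```python
import mmap
import os
import socket
import struct
import tempfile
import threading

BUFFER_SIZE = 4096  # size of the shared mapped file (arbitrary for this sketch)


def recv_exact(sock, n):
    """Read exactly n bytes from the socket or raise if the peer closes."""
    data = b""
    while len(data) < n:
        chunk = sock.recv(n - len(data))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        data += chunk
    return data


def consumer(path, port, results):
    """Wait for a size signal over TCP, then read that many bytes from the map."""
    with socket.create_connection(("127.0.0.1", port)) as sock:
        with open(path, "r+b") as f, mmap.mmap(f.fileno(), BUFFER_SIZE) as buf:
            while True:
                (size,) = struct.unpack(">i", recv_exact(sock, 4))
                if size < 0:           # negative size: producer is finished
                    break
                results.append(buf[:size])
                sock.sendall(b"\x01")  # ack: the buffer may be overwritten


def transfer(chunks):
    """Pass each chunk through the mapped file, one signal round trip per chunk."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    with open(path, "wb") as f:
        f.truncate(BUFFER_SIZE)
    results = []
    server = socket.socket()
    server.bind(("127.0.0.1", 0))      # let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]
    worker = threading.Thread(target=consumer, args=(path, port, results))
    worker.start()
    conn, _ = server.accept()
    try:
        with open(path, "r+b") as f, mmap.mmap(f.fileno(), BUFFER_SIZE) as buf:
            for chunk in chunks:
                if len(chunk) > BUFFER_SIZE:
                    raise ValueError("chunk does not fit into the mapped file")
                buf[:len(chunk)] = chunk                     # data via the map
                conn.sendall(struct.pack(">i", len(chunk)))  # signal via TCP
                recv_exact(conn, 1)                          # wait for the ack
            conn.sendall(struct.pack(">i", -1))
        worker.join()
    finally:
        conn.close()
        server.close()
        os.remove(path)
    return results


if __name__ == "__main__":
    print(transfer([b"hello", b"world!"]))
```

Only a few bytes per chunk travel over the socket; the payload itself never leaves the mapped file, which is the point of combining the two mechanisms.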
[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings
[ https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14502544#comment-14502544 ] ASF GitHub Bot commented on FLINK-377: -- Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/202#issuecomment-94400959 I'll test it again on a cluster. Could you please elaborate a bit? Is the timeout still in? Communication is through TCP instead of the mapped files, but still with the same basic interface of writing basic values for communication?
[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings
[ https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14502534#comment-14502534 ] ASF GitHub Bot commented on FLINK-377: -- Github user zentol commented on the pull request: https://github.com/apache/flink/pull/202#issuecomment-94397131 All issues that I'm aware of are resolved.
[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings
[ https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14499568#comment-14499568 ] ASF GitHub Bot commented on FLINK-377: -- Github user rmetzger commented on the pull request: https://github.com/apache/flink/pull/202#issuecomment-93957146 What is the timeline for merging this?
[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings
[ https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14493894#comment-14493894 ] ASF GitHub Bot commented on FLINK-377: -- Github user zentol commented on the pull request: https://github.com/apache/flink/pull/202#issuecomment-92765309 Now uses TCP to exchange signals.
[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings
[ https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14485088#comment-14485088 ] ASF GitHub Bot commented on FLINK-377: -- Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/202#issuecomment-90883545 Yeah, it happens frequently that the Java side dies without calling `close()`, for example when the TaskManager encounters an unrecoverable error and terminates. +1 for having a safety net; a timeout sounds good. It can be a rather high timeout, though (on the order of several minutes). Ideally, the Python process would also be a child process of the JVM process, so that it is killed anyway by the kernel when the JVM process exits, as another safety net.
[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings
[ https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14485009#comment-14485009 ] ASF GitHub Bot commented on FLINK-377: -- Github user zentol commented on the pull request: https://github.com/apache/flink/pull/202#issuecomment-90865196 i guess it's nice to have in case the java side dies *somehow* without calling close() on the java function.
[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings
[ https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14484985#comment-14484985 ] ASF GitHub Bot commented on FLINK-377: -- Github user zentol commented on the pull request: https://github.com/apache/flink/pull/202#issuecomment-90859901 I can't say for sure whether the current timeout is enough; we don't have enough data for that. We could make it configurable, so that a user can simply increase it if the timeout fires without the job actually deadlocking.
[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings
[ https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14484980#comment-14484980 ] ASF GitHub Bot commented on FLINK-377: -- Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/202#issuecomment-90858070 What about the timeout? Are we confident that a longer timeout will be sufficient for a large number of possible jobs?
[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings
[ https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14484986#comment-14484986 ] ASF GitHub Bot commented on FLINK-377: -- Github user zentol commented on the pull request: https://github.com/apache/flink/pull/202#issuecomment-90860600 the whole timeout thing may actually be no longer required with the changes to the process termination.
[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings
[ https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14484996#comment-14484996 ] ASF GitHub Bot commented on FLINK-377: -- Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/202#issuecomment-90863375 processes are terminated when a user cancels a job? so is the timeout still required or not? :smile:
[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings
[ https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14485182#comment-14485182 ] ASF GitHub Bot commented on FLINK-377: -- Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/202#issuecomment-90900778 Yeah, and several minutes doesn't cut it, as we've seen. On a WordCount example on not very large data a 5 minute timeout was not enough.
[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings
[ https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14485248#comment-14485248 ] ASF GitHub Bot commented on FLINK-377: -- Github user zentol commented on the pull request: https://github.com/apache/flink/pull/202#issuecomment-90921879 The timeout measures how long Java or Python is stuck in a blocking UDP operation. This generally means how long it takes for the Python side to compute one chunk of data: the Java side sends a chunk and then waits for the next signal; if this takes too long, the timeout fires. It's quite fickle, to be honest, and without a regular heartbeat one can always construct scenarios where it will break the job. I *think* TCP would cover it. Thanks for the hint to remove the shutdown hook; addressed.
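To make the behaviour described in the comment above concrete, here is a minimal Python sketch of a blocking UDP receive guarded by a timeout. It is not the code from the pull request; `wait_for_signal`, the payload `b"next"`, and the timeout values are assumptions chosen for illustration. It shows why such a timeout is fickle: a timeout only says that nothing arrived in time, so a slow peer and a dead peer look identical to the receiver.

```python
import socket


def wait_for_signal(sock, timeout_seconds):
    """Block on a UDP receive; return the payload, or None on timeout.

    Hypothetical helper: it cannot tell a slow peer from a dead one.
    """
    sock.settimeout(timeout_seconds)
    try:
        data, _addr = sock.recvfrom(1024)
        return data
    except socket.timeout:
        return None


receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))  # let the OS pick a free port
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

# Nothing has been sent yet, so this wait runs into the timeout.
first = wait_for_signal(receiver, 0.2)

# Once the peer sends its signal, the same call returns the payload.
sender.sendto(b"next", receiver.getsockname())
second = wait_for_signal(receiver, 2.0)

sender.close()
receiver.close()
```

If computing one chunk legitimately takes longer than the configured timeout, the first case is indistinguishable from a crashed peer, which matches the WordCount observation earlier in the thread.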
[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings
[ https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14485197#comment-14485197 ] ASF GitHub Bot commented on FLINK-377: -- Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/202#issuecomment-90903241 What does the timeout measure? The time while no data was coming? Can the protocol send heartbeats if no data is available? Would it be simpler to switch to TCP as the coordinator protocol? It has keepalive messages integrated and it throws an error on the reader side if the connection is closed by the sender. Would that cover it? A shutdown hook is a good idea. Make sure you also remove the shutdown hook in `close()`, otherwise you have a resource leak in the JVM. Take a look at the `BlobServer` and its `shutdown()` method for an example of how to do this robustly.
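The TCP suggestion above can be illustrated with a small Python sketch on the loopback interface. It is an illustration only, not the pull request's code; `recv_exact` and the payload `b"chunk-done"` are hypothetical. Note that in Python an orderly close by the sender shows up on the reader as an empty read rather than an exception, and that TCP keepalive is off by default and has to be enabled per socket.

```python
import socket


def recv_exact(sock, n):
    """Read up to n bytes, stopping early if the peer closes (hypothetical helper)."""
    buf = b""
    while len(buf) < n:
        part = sock.recv(n - len(buf))
        if not part:  # empty read: the peer closed the connection
            break
        buf += part
    return buf


listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # let the OS pick a free port
listener.listen(1)

sender = socket.create_connection(listener.getsockname())
reader, _addr = listener.accept()

# Keepalive probes are opt-in; enable them on the reading side.
reader.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

sender.sendall(b"chunk-done")
signal = recv_exact(reader, len(b"chunk-done"))

# When the sender goes away, the reader learns about it immediately
# instead of waiting for a timeout to expire.
sender.close()
after_close = reader.recv(1024)

reader.close()
listener.close()
```

Compared with a UDP timeout, the reader here gets a definite signal that the other side is gone, so no guess about how long a chunk may take is needed.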
[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings
[ https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14355081#comment-14355081 ] ASF GitHub Bot commented on FLINK-377: -- Github user mxm commented on a diff in the pull request: https://github.com/apache/flink/pull/202#discussion_r26132770 --- Diff: flink-addons/flink-language-binding/src/main/python/org/apache/flink/languagebinding/api/python/dill/__diff.py --- @@ -0,0 +1,247 @@ + --- End diff -- The main problem here is that you prepended the BSD-licensed Dill code with the Apache license. Just remove the Apache license header and clearly separate your code from the Dill library. Then it should be no problem having the Apache and BSD license side by side. After all, they are very similar, except for additional patent grants that the Apache license provides. Alternatively, I'd suggest to use some sort of package management (e.g. `pip`) to install the library. Then we don't have to package it.
[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings
[ https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14354434#comment-14354434 ] ASF GitHub Bot commented on FLINK-377: -- Github user zentol commented on a diff in the pull request: https://github.com/apache/flink/pull/202#discussion_r26102129 --- Diff: flink-addons/flink-language-binding/src/main/python/org/apache/flink/languagebinding/api/python/dill/__diff.py --- @@ -0,0 +1,247 @@ + --- End diff -- license issue is still unresolved.
[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings
[ https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14346670#comment-14346670 ] ASF GitHub Bot commented on FLINK-377: -- Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/202#issuecomment-77127742 Thanks, I have two last requests, sorry for that. Could you rename flink-generic to flink-language-binding-generic? The problem is that the package name is now flink-generic; it pops up like this in Maven Central and so on, without the information that it is actually a sub-package of flink-language-binding. This could be quite confusing. In MapFunction.py and FilterFunction.py you use map() and filter() respectively. These operations are not lazy, i.e. map() first applies the user map-function to every element in the partition and only then collects the results. This can become a problem if the input is very big. Instead we should iterate over the iterator and output each element after mapping. This keeps memory consumption low. The same applies to filter().
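The request above (emit each element as soon as it has been mapped, instead of materializing the whole result first) can be sketched as follows. This is an illustration, not the code in MapFunction.py or FilterFunction.py; `ListCollector`, `map_streaming`, and `filter_streaming` are hypothetical names, and the collector simply stands in for whatever output mechanism the operator uses. For context, in Python 2 the built-in `map()` and `filter()` return fully built lists, which is what makes them eager there.

```python
class ListCollector(object):
    """Hypothetical stand-in for an operator's output collector."""

    def __init__(self):
        self.emitted = []

    def collect(self, value):
        self.emitted.append(value)


def map_streaming(func, iterator, collector):
    # Emit each mapped element immediately; only one input element
    # is held in memory at a time.
    for value in iterator:
        collector.collect(func(value))


def filter_streaming(predicate, iterator, collector):
    # Same idea for filter: test and emit element by element.
    for value in iterator:
        if predicate(value):
            collector.collect(value)


mapped = ListCollector()
map_streaming(lambda x: x * 2, iter(range(5)), mapped)

kept = ListCollector()
filter_streaming(lambda x: x % 2 == 0, iter(range(5)), kept)
```

The collector in this sketch stores results in a list only so they can be inspected; a real collector would forward each element downstream, so memory use would stay independent of the partition size.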
[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings
[ https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14345284#comment-14345284 ] ASF GitHub Bot commented on FLINK-377: -- Github user zentol commented on the pull request: https://github.com/apache/flink/pull/202#issuecomment-76979731 yes they are implemented as they are for performance reasons. the python cogroup grouping logic is actually a direct port of the SortMergeCoGroupIterator. it also makes things a bit simpler since you can work on the assumption that an operator's function is not called more than once. if, on the java side, hasNext() returns false we know that we processed all input data, something you usually can only say when close() was called. the coGroupPython* stuff is generic, will try to come up with a more suitable name.
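As background for the grouping logic mentioned above, the following is a minimal Python sketch of the general sort-merge co-group idea: two inputs already sorted by key are walked in lockstep, and for each key the matching records from both sides are handed out together. It is not a port of `SortMergeCoGroupIterator` and not the pull request's implementation; `sort_merge_cogroup` is a hypothetical name, and materializing each group as a list is a simplification made for readability.

```python
from itertools import groupby
from operator import itemgetter


def sort_merge_cogroup(left, right, key=itemgetter(0)):
    """Yield (key, left_records, right_records) for two key-sorted inputs."""
    left_groups = groupby(left, key)
    right_groups = groupby(right, key)
    l = next(left_groups, None)
    r = next(right_groups, None)
    while l is not None or r is not None:
        if r is None or (l is not None and l[0] < r[0]):
            # Key exists only on the left side.
            yield l[0], list(l[1]), []
            l = next(left_groups, None)
        elif l is None or r[0] < l[0]:
            # Key exists only on the right side.
            yield r[0], [], list(r[1])
            r = next(right_groups, None)
        else:
            # Same key on both sides: emit both groups together.
            yield l[0], list(l[1]), list(r[1])
            l = next(left_groups, None)
            r = next(right_groups, None)


left_input = [(1, "a"), (1, "b"), (3, "c")]
right_input = [(1, "x"), (2, "y")]
result = list(sort_merge_cogroup(left_input, right_input))
```

Because both inputs are consumed in a single forward pass, the operator function sees each key exactly once, which is in line with the assumption stated in the comment.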
[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings
[ https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14345296#comment-14345296 ] ASF GitHub Bot commented on FLINK-377: -- Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/202#issuecomment-76981299 You could call it CoGroupRaw, just an idea... Once that and the split into the python and generic part is done I vote for merging this. The API looks good and other stuff, such as getting rid of the type annotations can be worked on afterwards. I think it would be good to get people that are interested to try it out. Also, the code is very well commented and documented. :smile_cat:
[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings
[ https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14345656#comment-14345656 ] ASF GitHub Bot commented on FLINK-377: -- Github user zentol commented on the pull request: https://github.com/apache/flink/pull/202#issuecomment-77025149 renamed, rebased, re...structured!
[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings
[ https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14343302#comment-14343302 ] ASF GitHub Bot commented on FLINK-377: -- Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/202#issuecomment-76738603 I'm the next person to be looking at this. Hopefully we can merge it after I've looked at it. :smile: @zentol Do you want to keep it in the current location or do you want to move it to flink-python?
[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings
[ https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14317935#comment-14317935 ] ASF GitHub Bot commented on FLINK-377: -- Github user mxm commented on the pull request: https://github.com/apache/flink/pull/202#issuecomment-74046665 Thanks, works fine now. On workstations, you cannot assume that the hostname maps to localhost or to any address at all.
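The hostname problem discussed here can be sketched in a few lines of Python. This is a minimal illustration, not the code in the pull request: the helper name `resolve_local_address` and its `resolver` parameter are hypothetical, and falling back to the loopback address `127.0.0.1` is an assumption about what a local setup would want.

```python
import socket


def resolve_local_address(hostname=None, resolver=socket.getaddrinfo):
    """Return an IPv4 address for this machine, or loopback if none resolves.

    `resolver` is a hypothetical hook, injectable only so the fallback
    path can be exercised without touching the real name service.
    """
    if hostname is None:
        hostname = socket.gethostname()
    try:
        infos = resolver(hostname, None, socket.AF_INET, socket.SOCK_DGRAM)
    except socket.gaierror:
        # The hostname does not map to any address at all.
        return "127.0.0.1"
    if not infos:
        return "127.0.0.1"
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for AF_INET the sockaddr is an (ip, port) pair.
    return infos[0][4][0]
```

Even with such a fallback, both processes would still have to agree on the same address; as reported elsewhere in this thread, an address resolved from the hostname (such as 127.0.1.1) may differ from the one Java uses.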
[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings
[ https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14316632#comment-14316632 ] ASF GitHub Bot commented on FLINK-377: -- Github user zentol commented on the pull request: https://github.com/apache/flink/pull/202#issuecomment-73928499 Too bad, I'll revert the change. I assume that with your change to /etc/hosts it can now resolve your hostname to some address (this failed earlier with an exception), but the resulting address is not the same as the one Java uses.
[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings
[ https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14316728#comment-14316728 ] ASF GitHub Bot commented on FLINK-377: -- Github user zentol commented on the pull request: https://github.com/apache/flink/pull/202#issuecomment-73936205
In my case, it returns this:
```python
import socket
socket.gethostname()
# 'Linux'
socket.getaddrinfo(socket.gethostname(), None, socket.AF_INET, socket.SOCK_DGRAM)
# [(2, 2, 17, '', ('127.0.1.1', 0))]
```
127.0.1.1 is also present in my /etc/hosts.
I assume they are different because no error is printed. I tried several different approaches, and when I got no (Python) error and only the timeout, further investigation showed that the resolved IP did not match 127.0.1.1.
[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings
[ https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14314428#comment-14314428 ] ASF GitHub Bot commented on FLINK-377: -- Github user zentol commented on the pull request: https://github.com/apache/flink/pull/202#issuecomment-73731632 If you don't mind, try the example again and let it run. Both processes should time out after 5 minutes and throw exceptions, hopefully pointing to the origin of the lock.
[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings
[ https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14312202#comment-14312202 ] ASF GitHub Bot commented on FLINK-377: -- Github user zentol commented on the pull request: https://github.com/apache/flink/pull/202#issuecomment-73505531 @qmlmoon has provided TPCH Query 3 / 10 and WebLogAnalysis examples
[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings
[ https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14305977#comment-14305977 ] ASF GitHub Bot commented on FLINK-377: -- Github user fhueske commented on the pull request: https://github.com/apache/flink/pull/202#issuecomment-72941023 Got my triangle example working on a local setup as well. :-) Will play around more in the next few days.
[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings
[ https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14303164#comment-14303164 ] ASF GitHub Bot commented on FLINK-377: -- Github user fhueske commented on the pull request: https://github.com/apache/flink/pull/202#issuecomment-72640916
Just implemented the basic triangle enumeration job and figured out that this example is already included in this PR ;-) However, when trying to run both programs, I encountered two problems:
1. I had to manually create a file `/tmp/flink_data/output` and give it a certain size (100MB worked for me). I'm on OS X.
2. After I had that file, the ./pyflink2.sh command did not print any error message but did not respond either. Seemed kind of deadlocked.

The error message for the missing file was not really helpful and could be improved.
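A sanity check for the missing-file problem could look roughly like the following Python sketch. It is not taken from the pull request (where, according to another comment in this thread, the file creation is done from Java); the function name `ensure_sized_file` and the approach of pre-sizing the file with `truncate` are assumptions for illustration only.

```python
import os


def ensure_sized_file(path, size_bytes):
    """Create `path` with exactly `size_bytes` bytes, or fail with a clear message.

    Hypothetical helper: it makes the parent directory if needed and
    resizes the file, so a missing file is reported descriptively.
    """
    directory = os.path.dirname(path)
    try:
        if directory:
            os.makedirs(directory, exist_ok=True)
        # Append mode creates the file if it is missing without erasing it first.
        with open(path, "ab") as handle:
            handle.truncate(size_bytes)
    except OSError as err:
        raise RuntimeError(
            "could not prepare data file %r (%d bytes): %s" % (path, size_bytes, err)
        ) from err
    return os.path.getsize(path)
```

For the case reported above, a call such as `ensure_sized_file("/tmp/flink_data/output", 100 * 1024 * 1024)` would stand in for the manual step, and a failure would name the offending path instead of producing an unhelpful error.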
[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings
[ https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14303170#comment-14303170 ] ASF GitHub Bot commented on FLINK-377: -- Github user fhueske commented on the pull request: https://github.com/apache/flink/pull/202#issuecomment-72641447 btw. implementing the program felt quite good. Very nice API, IMO!
[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings
[ https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14304096#comment-14304096 ] ASF GitHub Bot commented on FLINK-377: -- Github user zentol commented on the pull request: https://github.com/apache/flink/pull/202#issuecomment-72744279
about error messages going to command-line: the only way i see for that to work is by wrapping the complete error message into an exception, since they do show up on the command-line.

wc deadlock: i just can't reproduce it. i tried small files (4 words) and went up to 750mb with dop=1. can you send me the test data you used?

@qmlmoon THANK YOU! that would have taken me ages to figure out. working on a fix right now.
[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings
[ https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14303266#comment-14303266 ] ASF GitHub Bot commented on FLINK-377: -- Github user zentol commented on the pull request: https://github.com/apache/flink/pull/202#issuecomment-72652471
hmm...you are now the second person to report that creating the tmp files does not work on OS X. i don't know why that doesn't work. the file creation is done from java, is there any magic required there? i can't debug OS X errors myself at the moment. all i can do on that front is add sanity checks for better error reporting.

the included triangle enumeration is kinda odd: even if it runs, the output is empty. I've already checked the implementation yesterday and it appears equal to the java counterpart. will give it another go.

the plan execution is one of the more fragile parts. generally, when the process exits with an error it is noticed. but if for example something is missing (like the call to execute) things just get stuck. this is due to the fact that information is only ever sent to java, but never received, a complete one-way street. since neither accumulators nor actions were supposed to be implemented anytime soon this seemed appropriate, but it seems that this requires a change already. some timeouts could be useful as well.

@fhueske Thanks for trying it out!
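The remark that some timeouts could be useful can be illustrated with a small Python sketch. It assumes a UDP socket is used for signalling between the two processes (the thread dump elsewhere in this thread shows the Java side waiting in `DatagramSocket.receive`); the function name `receive_or_fail` and the error text are hypothetical and not part of the pull request.

```python
import socket


def receive_or_fail(sock, timeout_seconds, max_bytes=1024):
    """Wait for one datagram; raise a descriptive error instead of blocking forever."""
    sock.settimeout(timeout_seconds)
    try:
        data, _sender = sock.recvfrom(max_bytes)
    except socket.timeout:
        # Nothing arrived in time: report it rather than hang silently.
        raise RuntimeError(
            "no message from the other process within %.1f s" % timeout_seconds
        )
    return data
```

With a check along these lines, a plan that never reaches the call to execute would probably surface as an error after the timeout rather than leaving both processes stuck.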
[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings
[ https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14304192#comment-14304192 ] ASF GitHub Bot commented on FLINK-377: -- Github user zentol commented on the pull request: https://github.com/apache/flink/pull/202#issuecomment-72752968 @qmlmoon sweet. @rmetzger errors should show up on the console now, and in the .out file, and i suppose by extension in the .log file as well.
[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings
[ https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14301892#comment-14301892 ] ASF GitHub Bot commented on FLINK-377: -- Github user rmetzger commented on a diff in the pull request: https://github.com/apache/flink/pull/202#discussion_r23956914
--- Diff: flink-addons/flink-language-binding/src/main/java/org/apache/flink/languagebinding/api/java/common/OperationInfo.java ---
@@ -0,0 +1,48 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements. See the NOTICE
+ * file distributed with this work for additional information regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the License); you may not use this file except in compliance with the
+ * License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
+ * an AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations under the License.
+ */
+package org.apache.flink.languagebinding.api.java.common;
+
+/**
+ * Container for all generic information related to operations. This class contains the absolute minimum fields that are
+ * required for all operations. This class should be extended to contain any additional fields required on a
+ * per-language basis.
+ */
+public abstract class OperationInfo {
+	public int parentID; //DataSet that an operation is applied on
+	public int otherID; //secondary DataSet
+	public int setID; //ID for new DataSet
+	public int[] keys1; //grouping keys
+	public int[] keys2; //grouping keys
+	public int[] projectionKeys1; //projection keys
+	public int[] projectionKeys2; //projection keys
+	public Object types; //an object that is of the same type as the output type
--- End diff --
Mh. I don't think that the `getProducedType()` is called after the local pre-flight phase has finished. So you can safely put the TypeInformation object into a `transient` field, so that the system doesn't try to serialize it.
[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings
[ https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14301888#comment-14301888 ]

ASF GitHub Bot commented on FLINK-377:
--

Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/202#issuecomment-72533930

Wordcount with built-in data works :+1: nice.

```
robert@robert-tower ...k-0.9-SNAPSHOT-bin/flink-0.9-SNAPSHOT (git)-[papipr] % ./bin/pyflink3.sh pyflink.py - /home/robert/flink-workdir/yarnLog /tmp/yarnPyWC
02/02/2015 21:20:34 Job execution switched to status RUNNING.
02/02/2015 21:20:34 DataSource (TextSource)(1/1) switched to SCHEDULED
02/02/2015 21:20:34 DataSource (TextSource)(1/1) switched to DEPLOYING
02/02/2015 21:20:34 DataSource (TextSource)(1/1) switched to RUNNING
02/02/2015 21:20:34 MapPartition (PythonFlatMap - PythonCombine)(1/1) switched to SCHEDULED
02/02/2015 21:20:34 MapPartition (PythonFlatMap - PythonCombine)(1/1) switched to DEPLOYING
02/02/2015 21:20:34 MapPartition (PythonFlatMap - PythonCombine)(1/1) switched to RUNNING
02/02/2015 21:20:34 DataSource (TextSource)(1/1) switched to FINISHED
```

I wanted to run wordcount locally on some serious data, but sadly it seems that the job somehow deadlocked.

```
MapPartition (PythonFlatMap - PythonCombine) (1/1) #85 daemon prio=5 os_prio=0 tid=0x01b59000 nid=0x855 runnable [0x7fd73b3f4000]
   java.lang.Thread.State: RUNNABLE
	at java.net.PlainDatagramSocketImpl.receive0(Native Method)
	- locked 0xfad36828 (a java.net.PlainDatagramSocketImpl)
	at java.net.AbstractPlainDatagramSocketImpl.receive(AbstractPlainDatagramSocketImpl.java:143)
	- locked 0xfad36828 (a java.net.PlainDatagramSocketImpl)
	at java.net.DatagramSocket.receive(DatagramSocket.java:781)
	- locked 0xfad367a0 (a java.net.DatagramPacket)
	- locked 0xfad367c8 (a java.net.DatagramSocket)
	at org.apache.flink.languagebinding.api.java.common.streaming.Streamer.streamBufferWithoutGroups(Streamer.java:172)
	at org.apache.flink.languagebinding.api.java.python.functions.PythonMapPartition.mapPartition(PythonMapPartition.java:55)
	at org.apache.flink.runtime.operators.MapPartitionDriver.run(MapPartitionDriver.java:98)
	at org.apache.flink.runtime.operators.RegularPactTask.run(RegularPactTask.java:496)
	at org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:360)
	at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:204)
	at java.lang.Thread.run(Thread.java:745)
```

```
Thread-23 #86 daemon prio=5 os_prio=0 tid=0x7fd634001800 nid=0x857 runnable [0x7fd73b4f4000]
   java.lang.Thread.State: RUNNABLE
	at java.io.FileInputStream.readBytes(Native Method)
	at java.io.FileInputStream.read(FileInputStream.java:234)
	at java.io.BufferedInputStream.read1(BufferedInputStream.java:284)
	at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
	- locked 0xfad3a440 (a java.lang.UNIXProcess$ProcessPipeInputStream)
	at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:284)
	at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:326)
	at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
	- locked 0xfad3e968 (a java.io.InputStreamReader)
	at java.io.InputStreamReader.read(InputStreamReader.java:184)
	at java.io.BufferedReader.fill(BufferedReader.java:161)
	at java.io.BufferedReader.readLine(BufferedReader.java:324)
	- locked 0xfad3e968 (a java.io.InputStreamReader)
	at java.io.BufferedReader.readLine(BufferedReader.java:389)
	at org.apache.flink.languagebinding.api.java.common.streaming.StreamPrinter.run(StreamPrinter.java:34)
```

I have to dig further to understand what's going on.

My understanding of the pull request right now is the following: I see that this change was a LOT of work and that there have been some iterations of improvement. What the code certainly needs are a few more developers. This will probably automatically lead to cleaner code, better code comments, better error handling and so on.

I'm still not convinced to merge the code in the state it's currently in. For that, I'm just facing too many issues right now. That the example in the documentation is broken is certainly not the dealbreaker here. The dealbreakers are issues like hard-to-find error messages or the problems I had with the wordcount (I don't know if it's the runtime or an issue of the Python code).

Please don't get my feedback here wrong. I appreciate
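
Editor's note on the first thread dump: the task thread sits in `DatagramSocket.receive`, which blocks until a packet arrives. Whether that is the real cause of the hang is not established anywhere in this thread. The sketch below is plain Python, not code from the pull request, and only shows the general idea of bounding a UDP receive with a timeout so that a missing packet surfaces as a detectable condition instead of an endless wait. The helper name `receive_with_timeout` is hypothetical; on the Java side the corresponding knob would be `DatagramSocket.setSoTimeout`.

```python
import socket


def receive_with_timeout(sock, timeout_seconds, bufsize=4096):
    """Wait for one datagram; return its payload, or None if nothing arrived in time."""
    sock.settimeout(timeout_seconds)
    try:
        data, _sender = sock.recvfrom(bufsize)
        return data
    except socket.timeout:
        # No packet within the deadline: the caller can log, retry, or abort.
        return None


if __name__ == "__main__":
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    print(receive_with_timeout(receiver, 0.2))
    receiver.close()
```

With a timeout in place, the waiting side can report which peer went silent, which would also help with the hard-to-find error messages mentioned above.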
[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings
[ https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14301925#comment-14301925 ]

ASF GitHub Bot commented on FLINK-377:
--

Github user zentol commented on a diff in the pull request:

https://github.com/apache/flink/pull/202#discussion_r23958310

--- Diff: flink-addons/flink-language-binding/src/main/java/org/apache/flink/languagebinding/api/java/common/OperationInfo.java ---
@@ -0,0 +1,48 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements. See the NOTICE
+ * file distributed with this work for additional information regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the
+ * License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
+ * an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations under the License.
+ */
+package org.apache.flink.languagebinding.api.java.common;
+
+/**
+ * Container for all generic information related to operations. This class contains the absolute minimum fields that are
+ * required for all operations. This class should be extended to contain any additional fields required on a
+ * per-language basis.
+ */
+public abstract class OperationInfo {
+	public int parentID; //DataSet that an operation is applied on
+	public int otherID; //secondary DataSet
+	public int setID; //ID for new DataSet
+	public int[] keys1; //grouping keys
+	public int[] keys2; //grouping keys
+	public int[] projectionKeys1; //projection keys
+	public int[] projectionKeys2; //projection keys
+	public Object types; //an object that is of the same type as the output type
--- End diff --

transient field is a good idea...
[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings
[ https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14301803#comment-14301803 ]

ASF GitHub Bot commented on FLINK-377:
--

Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/202#issuecomment-72526181

I've tested the changes again, because I would really like to merge them.

The bin/pyflink3.sh script only works when called from the flink root dir:

```
robert@robert-tower ...9-SNAPSHOT-bin/flink-0.9-SNAPSHOT/bin (git)-[papipr] % ./pyflink3.sh
Error: Jar file: 'lib/flink-language-binding-0.9-SNAPSHOT.jar' does not exist.
```

This issue will be fixed soon because the `bin/flink` client will print all errors immediately (instead of asking the user to put a `-v`). For now, you can maybe add the `-v` by default.

```
./bin/pyflink3.sh pyflink.py
Traceback (most recent call last):
  File "/tmp/flink_plan/plan.py", line 1, in <module>
    bullshit
NameError: name 'bullshit' is not defined
20:16:20,658 WARN org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Error: The main method caused an error.
For a more detailed error message use the vebose output option '-v'.
```

The Python PlanBuilder seems to insist on using HDFS, even though I'm testing the code locally:

```
robert@robert-tower ...k-0.9-SNAPSHOT-bin/flink-0.9-SNAPSHOT (git)-[papipr] % ./bin/pyflink3.sh pyflink.py
20:25:57,440 WARN org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Error: The main method caused an error.
org.apache.flink.client.program.ProgramInvocationException: The main method caused an error.
	at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:449)
	at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:350)
	at org.apache.flink.client.program.Client.run(Client.java:242)
	at org.apache.flink.client.CliFrontend.executeProgram(CliFrontend.java:389)
	at org.apache.flink.client.CliFrontend.run(CliFrontend.java:358)
	at org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:1068)
	at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1092)
Caused by: java.io.IOException: The given HDFS file URI (hdfs:/tmp/flink) did not describe the HDFS NameNode. The attempt to use a default HDFS configuration, as specified in the 'fs.hdfs.hdfsdefault' or 'fs.hdfs.hdfssite' config parameter failed due to the following problem: Either no default file system was registered, or the provided configuration contains no valid authority component (fs.default.name or fs.defaultFS) describing the (hdfs namenode) host and port.
	at org.apache.flink.runtime.fs.hdfs.HadoopFileSystem.initialize(HadoopFileSystem.java:287)
	at org.apache.flink.core.fs.FileSystem.get(FileSystem.java:261)
	at org.apache.flink.languagebinding.api.java.python.PythonPlanBinder.clearPath(PythonPlanBinder.java:135)
	at org.apache.flink.languagebinding.api.java.python.PythonPlanBinder.distributeFiles(PythonPlanBinder.java:153)
	at org.apache.flink.languagebinding.api.java.python.PythonPlanBinder.runPlan(PythonPlanBinder.java:101)
	at org.apache.flink.languagebinding.api.java.python.PythonPlanBinder.main(PythonPlanBinder.java:78)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:483)
	at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:434)
	... 6 more
```

Apparently, using `env.execute(local=True)` resolves the problem, but leads to a new one:

```
robert@robert-tower ...k-0.9-SNAPSHOT-bin/flink-0.9-SNAPSHOT (git)-[papipr] % ./bin/pyflink3.sh pyflink.py
02/02/2015 20:55:00 Job execution switched to status RUNNING.
02/02/2015 20:55:00 DataSource (ValueSource)(1/1) switched to SCHEDULED
02/02/2015 20:55:00 DataSource (ValueSource)(1/1) switched to DEPLOYING
02/02/2015 20:55:01 DataSource (ValueSource)(1/1) switched to RUNNING
02/02/2015 20:55:01 MapPartition (PythonFlatMap - PythonCombine)(1/1) switched to SCHEDULED
02/02/2015 20:55:01 MapPartition (PythonFlatMap - PythonCombine)(1/1) switched to DEPLOYING
02/02/2015 20:55:01 DataSource (ValueSource)(1/1) switched to FINISHED
02/02/2015 20:55:01
```
[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings
[ https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14302010#comment-14302010 ]

ASF GitHub Bot commented on FLINK-377:
--

Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/202#issuecomment-72547127

Asking others to implement the standard example programs has worked quite well to identify issues with new APIs. How about we look for people who try out the API by implementing one or two of these examples? This would also serve as a minimal in-code documentation... I will make a start and port the triangle enumeration job to the Python API.
[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings
[ https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14302302#comment-14302302 ]

ASF GitHub Bot commented on FLINK-377:
--

Github user zentol commented on the pull request:

https://github.com/apache/flink/pull/202#issuecomment-72559722

@rmetzger follow-up about HDFS usage when starting Flink in local mode: how can I determine how Flink was started from within a plan?
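
Editor's note: the `IOException` reported for this pull request rejects `hdfs:/tmp/flink` because the URI has a scheme but no authority component (no NameNode host and port). The sketch below does not answer the question of how a plan can tell how Flink was started. It only illustrates, with Python's standard `urllib.parse`, how such a URI splits into scheme, authority and path. The function names and the host `namenode:9000` are made up for the example.

```python
from urllib.parse import urlparse


def describe_uri(uri):
    """Split a file-system URI into the parts the error message talks about."""
    parts = urlparse(uri)
    return {"scheme": parts.scheme, "authority": parts.netloc, "path": parts.path}


def needs_default_namenode(uri):
    """True for HDFS URIs that carry no host:port, so a configured default is required."""
    parts = urlparse(uri)
    return parts.scheme == "hdfs" and parts.netloc == ""


if __name__ == "__main__":
    for uri in ("hdfs:/tmp/flink", "hdfs://namenode:9000/tmp/flink", "file:///tmp/flink"):
        print(uri, describe_uri(uri), needs_default_namenode(uri))
```

A `file:` URI never needs a NameNode, which is consistent with the observation that running in local mode avoided this particular error.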
[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings
[ https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14301840#comment-14301840 ]

ASF GitHub Bot commented on FLINK-377:
--

Github user zentol commented on a diff in the pull request:

https://github.com/apache/flink/pull/202#discussion_r23954389

--- Diff: flink-addons/flink-language-binding/src/main/java/org/apache/flink/languagebinding/api/java/common/OperationInfo.java ---
@@ -0,0 +1,48 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements. See the NOTICE
+ * file distributed with this work for additional information regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the
+ * License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
+ * an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations under the License.
+ */
+package org.apache.flink.languagebinding.api.java.common;
+
+/**
+ * Container for all generic information related to operations. This class contains the absolute minimum fields that are
+ * required for all operations. This class should be extended to contain any additional fields required on a
+ * per-language basis.
+ */
+public abstract class OperationInfo {
+	public int parentID; //DataSet that an operation is applied on
+	public int otherID; //secondary DataSet
+	public int setID; //ID for new DataSet
+	public int[] keys1; //grouping keys
+	public int[] keys2; //grouping keys
+	public int[] projectionKeys1; //projection keys
+	public int[] projectionKeys2; //projection keys
+	public Object types; //an object that is of the same type as the output type
--- End diff --

Yes, that would be nicer, but last time I tried that I got a NotSerializableException due to the TypeInformation.
[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings
[ https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14301734#comment-14301734 ]

ASF GitHub Bot commented on FLINK-377:
--

Github user rmetzger commented on a diff in the pull request:

https://github.com/apache/flink/pull/202#discussion_r23950076

--- Diff: docs/python_programming_guide.md ---
@@ -0,0 +1,600 @@
+---
+title: Python Programming Guide
+---
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied. See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+* This will be replaced by the TOC
+{:toc}
+
+
+<a href="#top"></a>
+
+Introduction
+
+
+Analysis programs in Flink are regular programs that implement transformations on data sets
+(e.g., filtering, mapping, joining, grouping). The data sets are initially created from certain
+sources (e.g., by reading files, or from collections). Results are returned via sinks, which may for
+example write the data to (distributed) files, or to standard output (for example the command line
+terminal). Flink programs run in a variety of contexts, standalone, or embedded in other programs.
+The execution can happen in a local JVM, or on clusters of many machines.
+
+In order to create your own Flink program, we encourage you to start with the
+[program skeleton](#program-skeleton) and gradually add your own
+[transformations](#transformations). The remaining sections act as references for additional
+operations and advanced features.
+
+
+Example Program
+---
+
+The following program is a complete, working example of WordCount. You can copy &amp; paste the code
+to run it locally.
+
+{% highlight python %}
+from flink.plan.Environment import get_environment
+from flink.plan.Constants import INT, STRING
+from flink.functions.GroupReduceFunction import GroupReduceFunction
+
+class Adder(GroupReduceFunction):
+  def reduce(self, iterator, collector):
+    count, word = iterator.next()
+    count += sum([x[0] for x in iterator])
+    collector.collect((count, word))
+
+if __name__ == "__main__":
+  env = get_environment()
+  data = env.from_elements("Who's there?",
+   "I think I hear them. Stand, ho! Who's there?")
+
+  data \
+    .flat_map(lambda x: x.lower().split(), (INT, STRING)) \
+    .group_by(1) \
+    .reduce_group(Adder(), (INT, STRING), combinable=True) \
+    .output()
+
+  env.execute()
+}
--- End diff --

I've copy-pasted the program as described in the documentation, but it doesn't run, most likely because of the `}` sign here.
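
Editor's note: to make the intent of the quoted WordCount example easier to follow, here is a plain-Python stand-in for the same dataflow (tokenize, group by word, sum the counts). It deliberately uses no Flink API; `tokenize`, `adder` and `word_count` are hypothetical names chosen for this sketch. The `adder` function mirrors the logic of the `Adder` class in the diff, except that it calls `next(iterator)`, since `iterator.next()` only exists on Python 2 iterators.

```python
from itertools import groupby


def tokenize(line):
    """Emit one (count, word) pair per whitespace-separated token."""
    return [(1, word) for word in line.lower().split()]


def adder(pairs):
    """Sum the counts of one group, mirroring the Adder logic in the diff."""
    iterator = iter(pairs)
    count, word = next(iterator)
    count += sum(x[0] for x in iterator)
    return (count, word)


def word_count(lines):
    """Group the (count, word) pairs by word and reduce each group."""
    pairs = [pair for line in lines for pair in tokenize(line)]
    pairs.sort(key=lambda pair: pair[1])  # groupby needs sorted input
    return [adder(group) for _word, group in groupby(pairs, key=lambda pair: pair[1])]


if __name__ == "__main__":
    for count, word in word_count(["Who's there?",
                                   "I think I hear them. Stand, ho! Who's there?"]):
        print(count, word)
```

Note that the stand-in emits `(1, word)` pairs before grouping, which is what the `(INT, STRING)` type hint and `group_by(1)` in the quoted example appear to expect.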
[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings
[ https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14295481#comment-14295481 ]

ASF GitHub Bot commented on FLINK-377:
--

Github user zentol commented on the pull request:

https://github.com/apache/flink/pull/202#issuecomment-71880245

I don't have a clue about all this licensing stuff. What I have seen though is that Spark uses a file that has a similar license as the dill library. (https://github.com/apache/spark/blob/master/python/pyspark/cloudpickle.py)
[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings
[ https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14295625#comment-14295625 ]

ASF GitHub Bot commented on FLINK-377:
--

Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/202#issuecomment-71893972

We definitely have to double check the license issues when merging this. I can do that next week.
[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings
[ https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14293521#comment-14293521 ]

ASF GitHub Bot commented on FLINK-377:
--

Github user dan-blanchard commented on the pull request:

https://github.com/apache/flink/pull/202#issuecomment-71648291

@rmetzger I'm really more curious than anything at this point. I recently worked on a fairly large Storm topology that has parts written in Java, Python, and Perl. As part of that I ended up taking over as the maintainer of IO::Storm, the Perl library for interfacing with Storm via their [Multilang protocol](https://storm.apache.org/documentation/Multilang-protocol.html). Multilang makes it incredibly easy to add support for other languages, so I just wanted to know if you guys were going for something that simple or not.
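
Editor's note: for readers unfamiliar with the Multilang style mentioned above, the sketch below shows the basic framing idea as the editor understands it: JSON messages exchanged over a text stream, each terminated by a line containing only `end`. It is a simplified illustration in plain Python, not Storm's or Flink's implementation; the function names and the sample message are made up.

```python
import io
import json


def write_message(stream, message):
    """Serialize one message as JSON followed by a line containing only 'end'."""
    stream.write(json.dumps(message))
    stream.write("\nend\n")


def read_message(stream):
    """Collect lines until the 'end' marker, then parse them as one JSON document."""
    lines = []
    for line in stream:
        line = line.rstrip("\n")
        if line == "end":
            return json.loads("\n".join(lines))
        lines.append(line)
    raise EOFError("stream closed before the end marker")


if __name__ == "__main__":
    pipe = io.StringIO()  # stands in for a child process's stdin/stdout
    write_message(pipe, {"command": "emit", "tuple": ["word", 1]})
    pipe.seek(0)
    print(read_message(pipe))
```

The appeal of such a text protocol is that any language with a JSON library and standard streams can take part; the trade-off against the binary transport discussed elsewhere in this thread is serialization overhead.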
[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings
[ https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14293217#comment-14293217 ]

ASF GitHub Bot commented on FLINK-377:
--

Github user zentol commented on the pull request:

https://github.com/apache/flink/pull/202#issuecomment-71611872

Tests run on travis (they don't right now because Fabian merged something that changes the CSVInputFormat constructor, which breaks stuff on my end) but see: https://travis-ci.org/zentol/incubator-flink/jobs/48334902 and search for "Running org.apache.flink.languagebinding.api.java.python.PythonPlanBinderTest".

Putting it under flink-python means splitting it from the generic interface, right? That would be necessary in the long run anyway, so I'm all for it.

@dan-blanchard the generic interface is not just for Python. It does reduce the amount of code you have to write in Java by a pretty high amount, but it sets up some requirements, most prominently support for binary data, memory-mapped files and sockets, though it would be possible to provide different options here. It is hard for me to assess how difficult that would be; the generic and Python parts were coded and evolved simultaneously, and when something didn't fit I could just change it to do so. I think it's very likely that when someone wants to add another language we'll have to revisit a few things, but it provides a good starting point.
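
Editor's note: the requirements listed above (binary data, memory-mapped files, sockets) can be pictured with a small, self-contained sketch. It uses only Python's standard `mmap`, `struct` and `tempfile` modules and a made-up record layout (4-byte big-endian length followed by the payload). It is not the wire format of the language binding; all names are hypothetical.

```python
import mmap
import struct
import tempfile

LENGTH_PREFIX = ">i"  # 4-byte big-endian length, an assumption for this sketch
PREFIX_SIZE = struct.calcsize(LENGTH_PREFIX)


def write_record(buf, offset, payload):
    """Write a length-prefixed binary record and return the offset after it."""
    struct.pack_into(LENGTH_PREFIX, buf, offset, len(payload))
    start = offset + PREFIX_SIZE
    buf[start:start + len(payload)] = payload
    return start + len(payload)


def read_record(buf, offset):
    """Read one length-prefixed record; return (payload, offset after it)."""
    (length,) = struct.unpack_from(LENGTH_PREFIX, buf, offset)
    start = offset + PREFIX_SIZE
    return bytes(buf[start:start + length]), start + length


def roundtrip(payloads, size=4096):
    """Write all payloads into a memory-mapped temp file, then read them back."""
    with tempfile.TemporaryFile() as backing:
        backing.truncate(size)  # the mapping needs a file of the full size
        with mmap.mmap(backing.fileno(), size) as buf:
            offset = 0
            for payload in payloads:
                offset = write_record(buf, offset, payload)
            records, offset = [], 0
            for _ in payloads:
                record, offset = read_record(buf, offset)
                records.append(record)
            return records


if __name__ == "__main__":
    print(roundtrip([b"hello", b"flink"]))
```

In a real binding the writer and the reader would be different processes mapping the same file, with a separate channel such as a socket signalling when a buffer is ready; that coordination is left out here.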
[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings
[ https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14293223#comment-14293223 ]

ASF GitHub Bot commented on FLINK-377:
--

Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/202#issuecomment-71612618

@dan-blanchard What non-JVM language are you looking for? Maybe we can do a little prototype with that language to see how well it works. Maybe you or somebody else from the community is interested in making the prototype production ready?
[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings
[ https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14294187#comment-14294187 ] ASF GitHub Bot commented on FLINK-377: -- Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/202#issuecomment-71727262 @dan-blanchard That is fine. Thank you for the pointer to Storm's multilang protocol. We'll have a look at it and see whether we can make something similar work with Flink.
[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings
[ https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14291766#comment-14291766 ] ASF GitHub Bot commented on FLINK-377: -- Github user zentol commented on the pull request: https://github.com/apache/flink/pull/202#issuecomment-71453267 I've rebased and updated this PR. Notable new stuff:
* hybrid mode removed
* documentation updated and integrated into the website
* **chaining** on the python side (map, flatmap, filter, combine)
* groupreduce/cogroup reworked - grouping done on python side
* iterators passed to UDFs are now iterable
* **lambda support**
* **test coverage** (works from IDE, maven and on travis)
[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings
[ https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14291803#comment-14291803 ] ASF GitHub Bot commented on FLINK-377: -- Github user uce commented on the pull request: https://github.com/apache/flink/pull/202#issuecomment-71458700 Wow, great news! :-) In general, I think we really have to do something about getting the changes in. The PR is growing faster than it's getting feedback. Has anybody looked into this and tried it out recently?
[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings
[ https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292761#comment-14292761 ] ASF GitHub Bot commented on FLINK-377: -- Github user dan-blanchard commented on the pull request: https://github.com/apache/flink/pull/202#issuecomment-71570805 I've only recently started looking at Flink, and the lack of support for non-JVM languages was a bit of a showstopper for me. That's one of the main reasons we use Storm. Anyway, is the idea here that this will just be for Python? Will it be simple for third parties to add support for other languages?
[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings
[ https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292551#comment-14292551 ] ASF GitHub Bot commented on FLINK-377: -- Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/202#issuecomment-71554357 Concerning the package / project structure: It is a bit humble and hidden as some language binding for Python. I think this deserves to be more prominently called the Python API. Should we put this under `flink-addons/flink-python`?
[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings
[ https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292592#comment-14292592 ] ASF GitHub Bot commented on FLINK-377: -- Github user tillrohrmann commented on the pull request: https://github.com/apache/flink/pull/202#issuecomment-71558532 I'm in favour of putting it more prominently under ```flink-addons/flink-python```.
[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings
[ https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292547#comment-14292547 ] ASF GitHub Bot commented on FLINK-377: -- Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/202#issuecomment-71553978 I agree, we should get this into the code. It is important that it builds out of the box and works well with Travis. I assume that works?