[ 
https://issues.apache.org/jira/browse/FLINK-3014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15035812#comment-15035812
 ] 

ASF GitHub Bot commented on FLINK-3014:
---------------------------------------

GitHub user zentol opened a pull request:

    https://github.com/apache/flink/pull/1432

    3057 py con

    This PR allows bidirectional data exchange during the plan phase.
    To do this, the following changes were made:
    * replaced OneWayMappedFileConnection with a simple TCP connection
    * added an iterator to the Python ExEnv to deserialize data during the
      plan phase
    * added a new Java Streamer class, providing the PlanBinder with a single
      access point to both send and receive data
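
The single access point over one TCP connection can be pictured roughly as follows. This is only a sketch under stated assumptions, not the code from the PR: the class name PlanStreamerSketch, its methods, and the use of writeUTF/readUTF as a wire format are all hypothetical, and a local thread stands in for the external Python process.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical illustration: one object owns the socket and exposes both directions.
class PlanStreamerSketch implements AutoCloseable {
    private final ServerSocket server;
    private Socket socket;
    private DataInputStream in;
    private DataOutputStream out;

    PlanStreamerSketch() throws IOException {
        // Port 0 lets the OS pick a free port; bind to loopback only.
        server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
    }

    int getPort() {
        return server.getLocalPort();
    }

    // Blocks until the peer process has connected.
    void open() throws IOException {
        socket = server.accept();
        in = new DataInputStream(socket.getInputStream());
        out = new DataOutputStream(socket.getOutputStream());
    }

    void sendRecord(String record) throws IOException {
        out.writeUTF(record); // assumed wire format, chosen for brevity
        out.flush();
    }

    String receiveRecord() throws IOException {
        return in.readUTF();
    }

    @Override
    public void close() throws IOException {
        if (socket != null) {
            socket.close();
        }
        server.close();
    }

    // Demo: a thread plays the external process and acknowledges one record.
    static String roundTrip(String message) throws Exception {
        try (PlanStreamerSketch streamer = new PlanStreamerSketch()) {
            int port = streamer.getPort();
            Thread peer = new Thread(() -> {
                try (Socket s = new Socket(InetAddress.getLoopbackAddress(), port);
                     DataInputStream peerIn = new DataInputStream(s.getInputStream());
                     DataOutputStream peerOut = new DataOutputStream(s.getOutputStream())) {
                    peerOut.writeUTF("ack:" + peerIn.readUTF());
                    peerOut.flush();
                } catch (IOException e) {
                    throw new RuntimeException(e);
                }
            });
            peer.start();
            streamer.open();
            streamer.sendRecord(message);
            String reply = streamer.receiveRecord();
            peer.join();
            return reply;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip("plan"));
    }
}
```

Unlike a one-way mapped file, the same connection carries data in both directions, so the caller only ever talks to one object.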
    
    A proof-of-concept JobExecutionResult was added as well, providing
    access to getNetRuntime().
    
    Most of the changes in this PR ended up being a refactoring of related
    classes. Previously, the data exchange code resided entirely within the
    PythonStreamer, Sender and Receiver classes, regardless of the phase
    (plan or runtime) in which they were used. These classes are now split
    by phase: PythonSender/-Receiver/-Streamer are used at runtime, and
    PythonPlanSender/-Receiver/-Streamer during the plan phase.
    Additionally, these classes were renamed to be more consistent and
    moved into separate packages.
    
    These changes are mostly about consistency. Process spawning is
    **always** handled within a Streamer class, which is also **always**
    used as the access point for both sending and receiving data. There is
    also no longer a (non-obvious) overlap in methods used in both phases,
    which frequently caused issues when these were modified.
    
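    This PR also includes a fix for FLINK-3014, switching from
    InputStream.read(), which may return after reading fewer bytes than
    requested, to DataInputStream.readFully(), which either fills the
    requested range or throws an EOFException.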
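
To illustrate why the switch matters, here is a minimal sketch of the short-read problem. It is not the Flink code: ShortReadSketch, TricklingStream and the helper methods are hypothetical names, and a big-endian 4-byte integer is assumed as the value being read.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;

class ShortReadSketch {

    // Hypothetical stream that delivers at most one byte per bulk read,
    // the way a pipe or socket may when data arrives slowly.
    static final class TricklingStream extends FilterInputStream {
        TricklingStream(InputStream in) {
            super(in);
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            return super.read(b, off, Math.min(len, 1));
        }
    }

    // Encodes an int as 4 big-endian bytes (assumed encoding).
    static byte[] frame(int value) {
        return ByteBuffer.allocate(4).putInt(value).array();
    }

    // Mirrors the reported pattern: the return value of read() is ignored,
    // so the buffer may be only partially filled when it is decoded.
    static int uncheckedReadInt(InputStream in) throws IOException {
        byte[] buffer = new byte[4];
        in.read(buffer, 0, 4);
        return ByteBuffer.wrap(buffer).getInt();
    }

    // readFully() keeps reading until all 4 bytes have arrived,
    // or throws EOFException if the stream ends first.
    static int fullReadInt(InputStream in) throws IOException {
        byte[] buffer = new byte[4];
        new DataInputStream(in).readFully(buffer, 0, 4);
        return ByteBuffer.wrap(buffer).getInt();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = frame(-2);
        System.out.println("unchecked read: "
                + uncheckedReadInt(new TricklingStream(new ByteArrayInputStream(data))));
        System.out.println("readFully:      "
                + fullReadInt(new TricklingStream(new ByteArrayInputStream(data))));
    }
}
```

With the trickling stream, the unchecked variant decodes a buffer holding only the first byte, so a comparison against -2 would silently fail, while the readFully() variant recovers the full value.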

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/zentol/flink 3057_py_con

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/1432.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1432
    
----
commit 6df41cd929836f5dfa73edfec88a946deb57e1b0
Author: zentol <[email protected]>
Date:   2015-11-22T15:23:29Z

    [FLINK-3057] Bidrectional plan connection

commit a91ea807235717738263eb84c3ae317f93caa7e4
Author: zentol <[email protected]>
Date:   2015-11-22T15:24:30Z

    [FLINK-3014] Replace InStr.read() calls with DataInStr.readFully()

----


> Return code from InputStream#read() should be checked in 
> PythonStreamer#sendBroadCastVariables
> ----------------------------------------------------------------------------------------------
>
>                 Key: FLINK-3014
>                 URL: https://issues.apache.org/jira/browse/FLINK-3014
>             Project: Flink
>          Issue Type: Bug
>            Reporter: Ted Yu
>            Assignee: Chesnay Schepler
>            Priority: Minor
>
> Here is the related code (there are 3 occurrences in the method):
> {code}
>       in.read(buffer, 0, 4);
>       checkForError();
> {code}
> java.io.InputStream.read(byte[], int, int) returns the number of bytes read.
> The return value should be checked because checkForError() does this:
> {code}
>     if (getInt(buffer, 0) == -2) {
>       try { //wait before terminating to ensure that the complete error message is printed
> {code}
> An incorrect result may be returned if fewer than 4 bytes of data are read, because checkForError() would then inspect a partially filled buffer.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
