Hi folks,

I am running the Python SDK PortableRunner against the Java Reference
Runner job server. It does not work, because the docker container fails
to start with this error message: "2018/11/16 21:38:55 Failed
to retrieve staged files: failed to retrieve pickled_main_session in 3
attempts: bad MD5 for /tmp/staged/pickled_main_session:
9g/EU11J0QTfwDVbpHQhAQ==, want ; bad MD5 for
/tmp/staged/pickled_main_session: 9g/EU11J0QTfwDVbpHQhAQ==, want ; bad MD5
for /tmp/staged/pickled_main_session: 9g/EU11J0QTfwDVbpHQhAQ==, want ; bad
MD5 for /tmp/staged/pickled_main_session: 9g/EU11J0QTfwDVbpHQhAQ==,
want ".  The code that produces this error message is here:
<https://github.com/apache/beam/blob/master/sdks/go/pkg/beam/artifact/materialize.go#L173>

The file pickled_main_session IS indeed staged, but for some unknown reason
the expected hash is an empty string. My hypothesis is that the job request
should have included a hash, but the Python side fails to set it, which
leads to the empty string.
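
To illustrate the hypothesis, here is a minimal Python sketch of the kind of
check that seems to be failing. It is not the actual Beam code: the function
names md5_base64 and check_artifact are hypothetical, and I am assuming the
hash in the log is a base64-encoded MD5 digest, which its 24-character,
"=="-padded form suggests. The point is only that an empty expected hash
can never match, no matter what was staged.

```python
import base64
import hashlib


def md5_base64(data: bytes) -> str:
    """Return the base64-encoded MD5 digest of data (assumed log format)."""
    return base64.b64encode(hashlib.md5(data).digest()).decode("ascii")


def check_artifact(data: bytes, expected_md5: str) -> None:
    """Hypothetical check: raise ValueError if the hashes differ."""
    actual = md5_base64(data)
    if actual != expected_md5:
        # Mirrors the shape of the message in the log: "bad MD5 ...: X, want Y"
        raise ValueError("bad MD5: %s, want %s" % (actual, expected_md5))


if __name__ == "__main__":
    payload = b"contents of pickled_main_session"  # placeholder bytes
    # Passes: the expected hash was computed from the same bytes.
    check_artifact(payload, md5_base64(payload))
    # Fails: an empty expected hash never equals a real digest.
    try:
        check_artifact(payload, "")
    except ValueError as err:
        print(err)
```

If this reading is right, a retry cannot help, since every attempt compares
against the same empty string.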

If the hypothesis above is correct, then my question is: where in the
Python SDK's job request code should the hash be set? A pointer to the
right place would be appreciated.

That said, I also saw that Ankur's recent PR#7049
<https://github.com/apache/beam/commit/1b241f9517342c73ed2f0a73251858ee67c7e191>
changes MD5 to SHA256, and that PR does not update anything in Java or
Python, which makes me unsure about the hypothesis above. What did
I miss? (Or is that something PR#7049 should have done?)

Suggestions appreciated.

Cheers,
-- 
================
Ruoyun  Huang
