[
https://issues.apache.org/jira/browse/BEAM-6106?focusedWorklogId=168566&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-168566
]
ASF GitHub Bot logged work on BEAM-6106:
----------------------------------------
Author: ASF GitHub Bot
Created on: 21/Nov/18 22:48
Start Date: 21/Nov/18 22:48
Worklog Time Spent: 10m
Work Description: aaltay closed pull request #7112: [BEAM-6106] Cherry
pick #7107 to release branch
URL: https://github.com/apache/beam/pull/7112
This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:
As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):
diff --git a/sdks/python/apache_beam/testing/synthetic_pipeline.py b/sdks/python/apache_beam/testing/synthetic_pipeline.py
index e19ec25790e..c76b9cd6904 100644
--- a/sdks/python/apache_beam/testing/synthetic_pipeline.py
+++ b/sdks/python/apache_beam/testing/synthetic_pipeline.py
@@ -71,7 +71,7 @@ def div_round_up(a, b):
def rotate_key(element):
"""Returns a new key-value pair of the same size but with a different key."""
(key, value) = element
- return key[-1:] + key[:-1], value
+ return key[-1] + key[:-1], value
class SyntheticStep(beam.DoFn):
diff --git a/sdks/python/apache_beam/testing/synthetic_pipeline_test.py b/sdks/python/apache_beam/testing/synthetic_pipeline_test.py
index e7865538412..fe5e94a78cd 100644
--- a/sdks/python/apache_beam/testing/synthetic_pipeline_test.py
+++ b/sdks/python/apache_beam/testing/synthetic_pipeline_test.py
@@ -125,7 +125,7 @@ def run_pipeline(self, barrier, writes_output=True):
if writes_output:
read_output = []
for file_name in glob.glob(output_location + '*'):
- with open(file_name, 'rb') as f:
+ with open(file_name, 'r') as f:
read_output.extend(f.read().splitlines())
self.assertEqual(10, len(read_output))
diff --git a/sdks/python/apache_beam/testing/test_utils.py b/sdks/python/apache_beam/testing/test_utils.py
index f9aa128d058..1f0e99eec24 100644
--- a/sdks/python/apache_beam/testing/test_utils.py
+++ b/sdks/python/apache_beam/testing/test_utils.py
@@ -75,14 +75,11 @@ def create_temp_file(self, suffix='', lines=None):
def compute_hash(content, hashing_alg=DEFAULT_HASHING_ALG):
- """Compute a hash value of a list of objects by hashing their string
- representations."""
- content = [str(x).encode('utf-8') if not isinstance(x, bytes) else x
- for x in content]
+ """Compute a hash value from a list of string."""
content.sort()
m = hashlib.new(hashing_alg)
for elem in content:
- m.update(elem)
+ m.update(str(elem).encode('utf-8'))
return m.hexdigest()
diff --git a/sdks/python/tox.ini b/sdks/python/tox.ini
index cabff08cf1d..09c794b16db 100644
--- a/sdks/python/tox.ini
+++ b/sdks/python/tox.ini
@@ -58,7 +58,7 @@ setenv =
BEAM_EXPERIMENTAL_PY3=1
RUN_SKIPPED_PY3_TESTS=0
modules =
- apache_beam.typehints,apache_beam.coders,apache_beam.options,apache_beam.tools,apache_beam.utils,apache_beam.internal,apache_beam.metrics,apache_beam.portability,apache_beam.pipeline_test,apache_beam.pvalue_test,apache_beam.runners,apache_beam.io.hadoopfilesystem_test,apache_beam.io.hdfs_integration_test,apache_beam.io.gcp.tests.utils_test,apache_beam.io.gcp.big_query_query_to_table_it_test,apache_beam.io.gcp.bigquery_io_read_it_test,apache_beam.io.gcp.bigquery_test,apache_beam.io.gcp.gcsfilesystem_test,apache_beam.io.gcp.gcsio_test,apache_beam.io.gcp.pubsub_integration_test,apache_beam.io.hdfs_integration_test,apache_beam.io.gcp.internal,apache_beam.io.filesystem_test,apache_beam.io.filesystems,apache_beam.io.range_trackers_test,apache_beam.io.sources_test,apache_beam.testing
+ apache_beam.typehints,apache_beam.coders,apache_beam.options,apache_beam.tools,apache_beam.utils,apache_beam.internal,apache_beam.metrics,apache_beam.portability,apache_beam.pipeline_test,apache_beam.pvalue_test,apache_beam.runners,apache_beam.io.hadoopfilesystem_test,apache_beam.io.hdfs_integration_test,apache_beam.io.gcp.tests.utils_test,apache_beam.io.gcp.big_query_query_to_table_it_test,apache_beam.io.gcp.bigquery_io_read_it_test,apache_beam.io.gcp.bigquery_test,apache_beam.io.gcp.gcsfilesystem_test,apache_beam.io.gcp.gcsio_test,apache_beam.io.gcp.pubsub_integration_test,apache_beam.io.hdfs_integration_test,apache_beam.io.gcp.internal,apache_beam.io.filesystem_test,apache_beam.io.filesystems,apache_beam.io.range_trackers_test,apache_beam.io.sources_test
commands =
python --version
pip --version
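
Editorial illustration (not part of the bot message): the rotate_key hunk above swaps a slice (key[-1:]) for an index (key[-1]). The sketch below shows how the two forms behave for str and bytes keys under Python 3. The helper names are hypothetical and exist only for this comparison; whether Beam's synthetic pipeline passes str or bytes keys at this point is not stated in the message.

```python
def rotate_with_slice(key):
    # key[-1:] is a one-element sequence of the same type as key,
    # so concatenation works for both str and bytes.
    return key[-1:] + key[:-1]


def rotate_with_index(key):
    # key[-1] is a one-character str for str keys, but an int for
    # bytes keys on Python 3, where int + bytes raises TypeError.
    return key[-1] + key[:-1]


print(rotate_with_slice('abc'))
print(rotate_with_index('abc'))
print(rotate_with_slice(b'abc'))

try:
    rotate_with_index(b'abc')
except TypeError as exc:
    print('indexing form fails on bytes under Python 3:', exc)
```

For str keys the two forms give the same result, so the difference only matters where keys are bytes on Python 3.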
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
Issue Time Tracking
-------------------
Worklog Id: (was: 168566)
Time Spent: 1h 20m (was: 1h 10m)
> test_bigquery_tornadoes_it fails due to a hash mismatch
> -------------------------------------------------------
>
> Key: BEAM-6106
> URL: https://issues.apache.org/jira/browse/BEAM-6106
> Project: Beam
> Issue Type: Bug
> Components: sdk-py-core
> Reporter: Chamikara Jayalath
> Assignee: Ahmet Altay
> Priority: Blocker
> Fix For: 2.10.0
>
> Time Spent: 1h 20m
> Remaining Estimate: 0h
>
> For example,
> [https://builds.apache.org/job/beam_PostCommit_Python_Verify/6626/console|https://builds.apache.org/job/beam_PostCommit_Python_Verify/6626/consoleFull]
>
> *09:32:19* Expected: (Test pipeline expected terminated in state: DONE and
> Expected checksum is 83789a7c1bca7959dcf23d3bc37e9204e594330f)
> *09:32:19* but: Expected checksum is 83789a7c1bca7959dcf23d3bc37e9204e594330f Actual
> checksum is d860e636050c559a16a791aff40d6ad809d4daf0
>
> Root cause seems to be a hash function change in
> https://github.com/apache/beam/pull/7029
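
Editorial illustration (not part of the bot message): the two versions of compute_hash in the diff differ in when elements are converted to strings. One version sorts the original objects and converts each to a string while hashing; the other converts everything to bytes first and then sorts the bytes. The sketch below shows that these orderings can yield different checksums for non-string elements. It is one possible mechanism, not a confirmed diagnosis of this issue. The function names are hypothetical, and 'sha1' is an assumed value for DEFAULT_HASHING_ALG (the 40-hex-digit checksums in the report are consistent with it).

```python
import hashlib

HASHING_ALG = 'sha1'  # assumption: stands in for DEFAULT_HASHING_ALG


def hash_sort_then_encode(content, hashing_alg=HASHING_ALG):
    # Sort the original objects, then hash their string representations.
    # sorted() is used here so the caller's list is not mutated
    # (the code in the diff calls content.sort() in place).
    m = hashlib.new(hashing_alg)
    for elem in sorted(content):
        m.update(str(elem).encode('utf-8'))
    return m.hexdigest()


def hash_encode_then_sort(content, hashing_alg=HASHING_ALG):
    # Convert every element to bytes first, then sort the bytes.
    encoded = [x if isinstance(x, bytes) else str(x).encode('utf-8')
               for x in content]
    m = hashlib.new(hashing_alg)
    for elem in sorted(encoded):
        m.update(elem)
    return m.hexdigest()


numbers = [10, 9]
# Numeric order is 9, 10; byte order of b'10', b'9' is b'10', b'9'.
print(hash_sort_then_encode(numbers) == hash_encode_then_sort(numbers))

strings = ['b', 'a']
# For plain ASCII strings both orderings agree.
print(hash_sort_then_encode(strings) == hash_encode_then_sort(strings))
```

A checksum that was recorded with one ordering would therefore not match output hashed with the other whenever the rows are not already strings.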
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)