[
https://issues.apache.org/jira/browse/AVRO-4199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ryan Skraba updated AVRO-4199:
------------------------------
Summary: [build][py] test_tether_word_count fails (was: [build][p]
test_tether_word_count fails)
> [build][py] test_tether_word_count fails
> ----------------------------------------
>
> Key: AVRO-4199
> URL: https://issues.apache.org/jira/browse/AVRO-4199
> Project: Apache Avro
> Issue Type: Bug
> Affects Versions: 1.11.5, 1.12.1
> Reporter: Ryan Skraba
> Priority: Major
>
> The
> [test_tether_word_count|https://github.com/apache/avro/blob/6429a1dfb246a54a276ff35b9544c34b22ea961c/lang/py/avro/test/test_tether_word_count.py#L168]
> currently fails, likely due to recent security fixes in the Java SDK.
> Notably:
> {code:java}
> 25/11/12 14:19:13 WARN tether.TetherMapRunner: Task failed
> java.lang.SecurityException: Forbidden org.apache.avro.ipc.HandshakeRequest!
> This class is not trusted to be included in Avro schemas. You may either use
> the system properties org.apache.avro.SERIALIZABLE_CLASSES and
> org.apache.avro.SERIALIZABLE_PACKAGES to set the comma separated list of the
> classes or packages you trust, or you can set them via the API (see
> org.apache.avro.util.ClassSecurityValidator). {code}
>
> *More log context:*
> {code:java}
> py310: install_deps> python -I -m pip install coverage python-snappy
> zstandard
> py310: install_package> python -I -m pip install --force-reinstall --no-deps
> /home/ryanskraba/avro/lang/py/.tox/.tmp/package/5/avro-1.13.0+snapshot.tar.gz
> py310: commands_pre[0]> mkdir -p avro/test/interop
> /home/ryanskraba/avro/lang/py/../../build/interop/data
> py310: commands_pre[1]> cp -r
> /home/ryanskraba/avro/lang/py/../../build/interop/data avro/test/interop
> py310: commands_pre[2]> coverage run -pm avro.test.gen_interop_data
> avro/interop.avsc avro/test/interop/data/py.avro
> py310: commands_pre[3]> cp -r avro/test/interop/data
> /home/ryanskraba/avro/lang/py/../../build/interop
> py310: commands[0]> coverage run -pm unittest discover --buffer --failfast
> /home/ryanskraba/avro/lang/py/avro/schema.py:1233: IgnoredLogicalType:
> Unknown unknown-logical-type, using string.
> warnings.warn(avro.errors.IgnoredLogicalType(f"Unknown {logical_type},
> using {type_}."))
> /home/ryanskraba/avro/lang/py/avro/schema.py:1229: IgnoredLogicalType:
> Logical type timestamp-millis requires literal type long, not string.
> warnings.warn(
> /home/ryanskraba/avro/lang/py/avro/schema.py:1233: IgnoredLogicalType:
> Unknown unknown-logical-type, using string.
> warnings.warn(avro.errors.IgnoredLogicalType(f"Unknown {logical_type},
> using {type_}."))
> /home/ryanskraba/avro/lang/py/avro/schema.py:1229: IgnoredLogicalType:
> Logical type timestamp-millis requires literal type long, not string.
> warnings.warn(
> /home/ryanskraba/avro/lang/py/avro/__main__.py:78:
> AvroWarning: There is no way to safely type check this.
> warnings.warn(avro.errors.AvroWarning("There is no way to safely type check
> this."))
> ......./home/ryanskraba/avro/lang/py/avro/datafile.py:172: AvroWarning:
> Writing binary data to a writer <_io.TextIOWrapper name='<stdout>' mode='w'
> encoding='utf-8'> that's opened for text
> warnings.warn(avro.errors.AvroWarning(f"Writing binary data to a writer
> {writer!r} that's opened for text"))
> 1.13.0+SNAPSHOT
> ..../home/ryanskraba/avro/lang/py/avro/datafile.py:172: AvroWarning: Writing
> binary data to a writer <_io.TextIOWrapper name='<stdout>' mode='w'
> encoding='utf-8'> that's opened for text
> warnings.warn(avro.errors.AvroWarning(f"Writing binary data to a writer
> {writer!r} that's opened for text"))
> ./home/ryanskraba/avro/lang/py/avro/datafile.py:172: AvroWarning: Writing
> binary data to a writer <_io.TextIOWrapper name='<stdout>' mode='w'
> encoding='utf-8'> that's opened for text
> warnings.warn(avro.errors.AvroWarning(f"Writing binary data to a writer
> {writer!r} that's opened for text"))
> mock_tether_parent: Launching Server on Port: 37659
> .MockParentResponder: Received 'configure'': inputPort=57031
> 127.0.0.1 - - [12/Nov/2025 14:19:07] "POST / HTTP/1.1" 200 -
> MockParentResponder: Received 'output'
> 127.0.0.1 - - [12/Nov/2025 14:19:07] "POST / HTTP/1.1" 200 -
> MockParentResponder: Received 'output'
> 127.0.0.1 - - [12/Nov/2025 14:19:07] "POST / HTTP/1.1" 200 -
> MockParentResponder: Received 'output'
> 127.0.0.1 - - [12/Nov/2025 14:19:07] "POST / HTTP/1.1" 200 -
> MockParentResponder: Received 'output'
> 127.0.0.1 - - [12/Nov/2025 14:19:07] "POST / HTTP/1.1" 200 -
> MockParentResponder: Received 'output'
> 127.0.0.1 - - [12/Nov/2025 14:19:07] "POST / HTTP/1.1" 200 -
> MockParentResponder: Received 'output'
> 127.0.0.1 - - [12/Nov/2025 14:19:07] "POST / HTTP/1.1" 200 -
> MockParentResponder: Received 'output'
> 127.0.0.1 - - [12/Nov/2025 14:19:07] "POST / HTTP/1.1" 200 -
> MockParentResponder: Received 'complete'
> 127.0.0.1 - - [12/Nov/2025 14:19:07] "POST / HTTP/1.1" 200 -
> MockParentResponder: Received 'status': message=Status message
> 127.0.0.1 - - [12/Nov/2025 14:19:07] "POST / HTTP/1.1" 200 -
> mock_tether_parent: Launching Server on Port: 38869
> .MockParentResponder: Received 'configure'': inputPort=36581
> 127.0.0.1 - - [12/Nov/2025 14:19:08] "POST / HTTP/1.1" 200 -
> MockParentResponder: Received 'output'
> 127.0.0.1 - - [12/Nov/2025 14:19:08] "POST / HTTP/1.1" 200 -
> MockParentResponder: Received 'output'
> 127.0.0.1 - - [12/Nov/2025 14:19:08] "POST / HTTP/1.1" 200 -
> MockParentResponder: Received 'output'
> 127.0.0.1 - - [12/Nov/2025 14:19:08] "POST / HTTP/1.1" 200 -
> MockParentResponder: Received 'output'
> 127.0.0.1 - - [12/Nov/2025 14:19:08] "POST / HTTP/1.1" 200 -
> MockParentResponder: Received 'output'
> 127.0.0.1 - - [12/Nov/2025 14:19:08] "POST / HTTP/1.1" 200 -
> MockParentResponder: Received 'output'
> 127.0.0.1 - - [12/Nov/2025 14:19:08] "POST / HTTP/1.1" 200 -
> MockParentResponder: Received 'output'
> 127.0.0.1 - - [12/Nov/2025 14:19:08] "POST / HTTP/1.1" 200 -
> MockParentResponder: Received 'complete'
> 127.0.0.1 - - [12/Nov/2025 14:19:08] "POST / HTTP/1.1" 200 -
> mock_tether_parent: Launching Server on Port: 46409
> .INFO:root:tether_task_runner.__main__: Task:
> avro.test.word_count_task.WordCountTask
> INFO:TetherTask:TetherTask.open: Opening connection to parent server on
> port=46409
> MockParentResponder: Received 'configure'': inputPort=37521
> 127.0.0.1 - - [12/Nov/2025 14:19:10] "POST / HTTP/1.1" 200 -
> .25/11/12 14:19:12 WARN util.NativeCodeLoader: Unable to load native-hadoop
> library for your platform... using builtin-java classes where applicable
> 25/11/12 14:19:13 WARN impl.MetricsSystemImpl: JobTracker metrics system
> already initialized!
> 25/11/12 14:19:13 WARN mapreduce.JobResourceUploader: Hadoop command-line
> option parsing not performed. Implement the Tool interface and execute your
> application with ToolRunner to remedy this.
> 25/11/12 14:19:13 WARN mapreduce.JobResourceUploader: No job jar file set.
> User classes may not be found. See Job or Job#setJar(String).
> 25/11/12 14:19:13 INFO mapred.FileInputFormat: Total input files to process :
> 1
> 25/11/12 14:19:13 INFO mapreduce.JobSubmitter: number of splits:1
> 25/11/12 14:19:13 INFO mapreduce.JobSubmitter: Submitting tokens for job:
> job_local1890703834_0001
> 25/11/12 14:19:13 INFO mapreduce.JobSubmitter: Executing with tokens: []
> 25/11/12 14:19:13 INFO mapreduce.Job: The url to track the job:
> http://localhost:8080/
> 25/11/12 14:19:13 INFO mapred.LocalJobRunner: OutputCommitter set in config
> null
> 25/11/12 14:19:13 INFO mapreduce.Job: Running job: job_local1890703834_0001
> 25/11/12 14:19:13 INFO mapred.LocalJobRunner: OutputCommitter is
> org.apache.hadoop.mapred.FileOutputCommitter
> 25/11/12 14:19:13 INFO output.FileOutputCommitter: File Output Committer
> Algorithm version is 2
> 25/11/12 14:19:13 INFO output.FileOutputCommitter: FileOutputCommitter skip
> cleanup _temporary folders under output directory:false, ignore cleanup
> failures: false
> 25/11/12 14:19:13 INFO mapred.LocalJobRunner: Waiting for map tasks
> 25/11/12 14:19:13 INFO mapred.LocalJobRunner: Starting task:
> attempt_local1890703834_0001_m_000000_0
> 25/11/12 14:19:13 INFO output.FileOutputCommitter: File Output Committer
> Algorithm version is 2
> 25/11/12 14:19:13 INFO output.FileOutputCommitter: FileOutputCommitter skip
> cleanup _temporary folders under output directory:false, ignore cleanup
> failures: false
> 25/11/12 14:19:13 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
> 25/11/12 14:19:13 INFO mapred.MapTask: Processing split:
> file:/tmp/test_tether_word_countasg2o6_g/in/lines.avro:0+195
> 25/11/12 14:19:13 INFO mapred.MapTask: numReduceTasks: 1
> 25/11/12 14:19:13 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
> 25/11/12 14:19:13 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
> 25/11/12 14:19:13 INFO mapred.MapTask: soft limit at 83886080
> 25/11/12 14:19:13 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
> 25/11/12 14:19:13 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
> 25/11/12 14:19:13 INFO mapred.MapTask: Map output collector class =
> org.apache.hadoop.mapred.MapTask$MapOutputBuffer
> 25/11/12 14:19:13 WARN tether.TetherMapRunner: Task failed
> java.lang.SecurityException: Forbidden org.apache.avro.ipc.HandshakeRequest!
> This class is not trusted to be included in Avro schemas. You may either use
> the system properties org.apache.avro.SERIALIZABLE_CLASSES and
> org.apache.avro.SERIALIZABLE_PACKAGES to set the comma separated list of the
> classes or packages you trust, or you can set them via the API (see
> org.apache.avro.util.ClassSecurityValidator).
> at
> org.apache.avro.util.ClassSecurityValidator$ClassSecurityPredicate.forbiddenClass(ClassSecurityValidator.java:106)
> at
> org.apache.avro.util.ClassSecurityValidator.validate(ClassSecurityValidator.java:60)
> at org.apache.avro.util.ClassUtils.forName(ClassUtils.java:99)
> at org.apache.avro.util.ClassUtils.forName(ClassUtils.java:72)
> at
> org.apache.avro.specific.SpecificData.lambda$getClass$2(SpecificData.java:392)
> at
> java.base/java.util.concurrent.ConcurrentHashMap.computeIfAbsent(ConcurrentHashMap.java:1708)
> at org.apache.avro.specific.SpecificData.getClass(SpecificData.java:390)
> at
> org.apache.avro.specific.SpecificDatumReader.setSchema(SpecificDatumReader.java:98)
> at
> org.apache.avro.specific.SpecificDatumReader.<init>(SpecificDatumReader.java:62)
> at org.apache.avro.ipc.Responder.<init>(Responder.java:204)
> at
> org.apache.avro.ipc.generic.GenericResponder.<init>(GenericResponder.java:45)
> at
> org.apache.avro.ipc.specific.SpecificResponder.<init>(SpecificResponder.java:55)
> at
> org.apache.avro.ipc.specific.SpecificResponder.<init>(SpecificResponder.java:51)
> at
> org.apache.avro.ipc.specific.SpecificResponder.<init>(SpecificResponder.java:43)
> at
> org.apache.avro.mapred.tether.TetheredProcess.<init>(TetheredProcess.java:93)
> at
> org.apache.avro.mapred.tether.TetherMapRunner.run(TetherMapRunner.java:52)
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:466)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:350)
> at
> org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:271)
> at
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:572)
> at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317)
> at
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
> at
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
> at java.base/java.lang.Thread.run(Thread.java:1583)
> 25/11/12 14:19:13 INFO mapred.LocalJobRunner: map task executor complete.
> 25/11/12 14:19:13 WARN mapred.LocalJobRunner: job_local1890703834_0001
> java.lang.Exception: java.lang.NullPointerException: Cannot read field
> "inputClient" because "this.process" is null
> at
> org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:492)
> at
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:552)
> Caused by: java.lang.NullPointerException: Cannot read field "inputClient"
> because "this.process" is null
> at
> org.apache.avro.mapred.tether.TetherMapRunner.run(TetherMapRunner.java:80)
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:466)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:350)
> at
> org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:271)
> at
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:572)
> at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317)
> at
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
> at
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
> at java.base/java.lang.Thread.run(Thread.java:1583)
> 25/11/12 14:19:14 INFO mapreduce.Job: Job job_local1890703834_0001 running in
> uber mode : false
> 25/11/12 14:19:14 INFO mapreduce.Job: map 0% reduce 0%
> 25/11/12 14:19:14 INFO mapreduce.Job: Job job_local1890703834_0001 failed
> with state FAILED due to: NA
> 25/11/12 14:19:14 INFO mapreduce.Job: Counters: 0
> Exception in thread "main" java.io.IOException: Job failed!
> at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:875)
> at org.apache.avro.mapred.tether.TetherJob.runJob(TetherJob.java:114)
> at org.apache.avro.tool.TetherTool.run(TetherTool.java:152)
> at org.apache.avro.tool.Main.run(Main.java:67)
> at org.apache.avro.tool.Main.main(Main.java:56)
> Stdout:
> Command:
> java -jar
> /home/ryanskraba/avro/lang/java/tools/target/avro-tools-1.13.0-SNAPSHOT.jar
> tether --protocol http --in /tmp/test_tether_word_countasg2o6_g/in --out
> /tmp/test_tether_word_countasg2o6_g/out --outschema
> /tmp/test_tether_word_countasg2o6_g/output.avsc --program
> /home/ryanskraba/avro/lang/py/.tox/py310/bin/python --exec_args -m
> avro.tether.tether_task_runner word_count_task.WordCountTask
> E
> ======================================================================
> ERROR: test_tether_word_count
> (avro.test.test_tether_word_count.TestTetherWordCount)
> Check that a tethered map-reduce job produces the output expected locally.
> ----------------------------------------------------------------------
> Traceback (most recent call last):
> File "/home/ryanskraba/avro/lang/py/avro/test/test_tether_word_count.py",
> line 168, in test_tether_word_count
> subprocess.check_call(args, env={"PYTHONPATH": _PYTHON_PATH, "PATH":
> os.environ["PATH"]})
> File "/usr/lib/python3.10/subprocess.py", line 369, in check_call
> raise CalledProcessError(retcode, cmd)
> subprocess.CalledProcessError: Command '('java', '-jar',
> '/home/ryanskraba/avro/lang/java/tools/target/avro-tools-1.13.0-SNAPSHOT.jar',
> 'tether', '--protocol', 'http', '--in',
> '/tmp/test_tether_word_countasg2o6_g/in', '--out',
> '/tmp/test_tether_word_countasg2o6_g/out', '--outschema',
> '/tmp/test_tether_word_countasg2o6_g/output.avsc', '--program',
> '/home/ryanskraba/avro/lang/py/.tox/py310/bin/python', '--exec_args', '-m
> avro.tether.tether_task_runner word_count_task.WordCountTask')' returned
> non-zero exit status 1.
> Stdout:
> Command:
> java -jar
> /home/ryanskraba/avro/lang/java/tools/target/avro-tools-1.13.0-SNAPSHOT.jar
> tether --protocol http --in /tmp/test_tether_word_countasg2o6_g/in --out
> /tmp/test_tether_word_countasg2o6_g/out --outschema
> /tmp/test_tether_word_countasg2o6_g/output.avsc --program
> /home/ryanskraba/avro/lang/py/.tox/py310/bin/python --exec_args -m
> avro.tether.tether_task_runner
> word_count_task.WordCountTask
> ----------------------------------------------------------------------
> Ran 559 tests in 10.491s
> FAILED (errors=1)
> py310: exit 1 (10.66 seconds) /home/ryanskraba/avro/lang/py> coverage run -pm
> unittest discover --buffer --failfast pid=25273
> {code}
>
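
A possible workaround, sketched under assumptions and not confirmed as the fix: the exception message names the system property org.apache.avro.SERIALIZABLE_PACKAGES, so the test could pass that property to the java command it launches. The helper name with_trusted_packages and the shortened jar path below are hypothetical. Whether trusting org.apache.avro.ipc is sufficient, and whether setting the property replaces the default trusted list, has not been verified here.

```python
import os

# Property name copied verbatim from the SecurityException message in the log.
TRUSTED_PACKAGES_PROPERTY = "org.apache.avro.SERIALIZABLE_PACKAGES"


def with_trusted_packages(java_args, packages):
    """Return a copy of a `java ...` argument tuple with a -D option added.

    Hypothetical helper: JVM options must come before `-jar`, so the option is
    inserted immediately after the java executable.
    """
    if not java_args or os.path.basename(java_args[0]) != "java":
        raise ValueError("expected an argument list starting with the java executable")
    option = f"-D{TRUSTED_PACKAGES_PROPERTY}={','.join(packages)}"
    return (java_args[0], option, *java_args[1:])


# Shortened stand-in for the avro-tools command line shown in the log.
args = ("java", "-jar", "avro-tools-1.13.0-SNAPSHOT.jar", "tether", "--protocol", "http")

# Assumption: HandshakeRequest lives in org.apache.avro.ipc, so that package is trusted.
patched = with_trusted_packages(args, ["org.apache.avro.ipc"])
print(" ".join(patched))
```

The resulting tuple could then be handed to subprocess.check_call in place of the original args. An alternative, also mentioned in the exception message, is to configure the trusted classes on the Java side through org.apache.avro.util.ClassSecurityValidator.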