Ryan Skraba created AVRO-4199:
---------------------------------
Summary: [build][p] test_tether_word_count fails
Key: AVRO-4199
URL: https://issues.apache.org/jira/browse/AVRO-4199
Project: Apache Avro
Issue Type: Bug
Affects Versions: 1.11.5, 1.12.1
Reporter: Ryan Skraba
The
[test_tether_word_count|https://github.com/apache/avro/blob/6429a1dfb246a54a276ff35b9544c34b22ea961c/lang/py/avro/test/test_tether_word_count.py#L168]
test currently fails, likely because of the recent security fixes in the Java
SDK: the tethered Java job rejects org.apache.avro.ipc.HandshakeRequest as an
untrusted class. Notably:
{code:java}
25/11/12 14:19:13 WARN tether.TetherMapRunner: Task failed
java.lang.SecurityException: Forbidden org.apache.avro.ipc.HandshakeRequest!
This class is not trusted to be included in Avro schemas. You may either use
the system properties org.apache.avro.SERIALIZABLE_CLASSES and
org.apache.avro.SERIALIZABLE_PACKAGES to set the comma separated list of the
classes or packages you trust, or you can set them via the API (see
org.apache.avro.util.ClassSecurityValidator).
{code}
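As a possible workaround until the proper fix is decided, the first suggestion in the error message (the system properties) could be applied to the command that the Python test launches. A minimal sketch, assuming that trusting the org.apache.avro.ipc package is acceptable for this test and that the property is picked up by the JVM running the local job. The helper name with_trusted_packages is hypothetical and not part of the Avro code base:

```python
# Hypothetical helper, not part of the Avro test suite: it only rewrites an
# argument list and does not launch anything.
def with_trusted_packages(args, packages):
    """Return a copy of a `java ...` argument list with the
    org.apache.avro.SERIALIZABLE_PACKAGES system property set."""
    if not args or args[0] != "java":
        raise ValueError("expected an argument list starting with 'java'")
    # The property takes a comma separated list, as stated in the error message.
    prop = "-Dorg.apache.avro.SERIALIZABLE_PACKAGES=" + ",".join(packages)
    # JVM options must precede -jar; anything after the jar is passed to the tool.
    return [args[0], prop, *args[1:]]


# Shortened version of the command from the failing test (paths elided).
args = ["java", "-jar", "avro-tools-1.13.0-SNAPSHOT.jar", "tether", "--protocol", "http"]
# Assumption: trusting the whole org.apache.avro.ipc package is acceptable here.
patched = with_trusted_packages(args, ["org.apache.avro.ipc"])
print(" ".join(patched))
```

The property is inserted directly after the executable because JVM options only take effect when they come before -jar. Whether this property extends or replaces the default list of trusted classes is not verified here, so the package list may need adjusting.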
*More log context:*
{code:java}
py310: install_deps> python -I -m pip install coverage python-snappy zstandard
py310: install_package> python -I -m pip install --force-reinstall --no-deps
/home/ryanskraba/avro/lang/py/.tox/.tmp/package/5/avro-1.13.0+snapshot.tar.gz
py310: commands_pre[0]> mkdir -p avro/test/interop
/home/ryanskraba/avro/lang/py/../../build/interop/data
py310: commands_pre[1]> cp -r
/home/ryanskraba/avro/lang/py/../../build/interop/data avro/test/interop
py310: commands_pre[2]> coverage run -pm avro.test.gen_interop_data
avro/interop.avsc avro/test/interop/data/py.avro
py310: commands_pre[3]> cp -r avro/test/interop/data
/home/ryanskraba/avro/lang/py/../../build/interop
py310: commands[0]> coverage run -pm unittest discover --buffer --failfast
/home/ryanskraba/avro/lang/py/avro/schema.py:1233: IgnoredLogicalType: Unknown
unknown-logical-type, using string.
warnings.warn(avro.errors.IgnoredLogicalType(f"Unknown {logical_type}, using
{type_}."))
/home/ryanskraba/avro/lang/py/avro/schema.py:1229: IgnoredLogicalType: Logical
type timestamp-millis requires literal type long, not string.
warnings.warn(
/home/ryanskraba/avro/lang/py/avro/schema.py:1233: IgnoredLogicalType: Unknown
unknown-logical-type, using string.
warnings.warn(avro.errors.IgnoredLogicalType(f"Unknown {logical_type}, using
{type_}."))
/home/ryanskraba/avro/lang/py/avro/schema.py:1229: IgnoredLogicalType: Logical
type timestamp-millis requires literal type long, not string.
warnings.warn(
............................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................../home/ryanskraba/avro/lang/py/avro/__main__.py:78:
AvroWarning: There is no way to safely type check this.
warnings.warn(avro.errors.AvroWarning("There is no way to safely type check
this."))
......./home/ryanskraba/avro/lang/py/avro/datafile.py:172: AvroWarning: Writing
binary data to a writer <_io.TextIOWrapper name='<stdout>' mode='w'
encoding='utf-8'> that's opened for text
warnings.warn(avro.errors.AvroWarning(f"Writing binary data to a writer
{writer!r} that's opened for text"))
1.13.0+SNAPSHOT
..../home/ryanskraba/avro/lang/py/avro/datafile.py:172: AvroWarning: Writing
binary data to a writer <_io.TextIOWrapper name='<stdout>' mode='w'
encoding='utf-8'> that's opened for text
warnings.warn(avro.errors.AvroWarning(f"Writing binary data to a writer
{writer!r} that's opened for text"))
./home/ryanskraba/avro/lang/py/avro/datafile.py:172: AvroWarning: Writing
binary data to a writer <_io.TextIOWrapper name='<stdout>' mode='w'
encoding='utf-8'> that's opened for text
warnings.warn(avro.errors.AvroWarning(f"Writing binary data to a writer
{writer!r} that's opened for text"))
mock_tether_parent: Launching Server on Port: 37659
.MockParentResponder: Received 'configure'': inputPort=57031
127.0.0.1 - - [12/Nov/2025 14:19:07] "POST / HTTP/1.1" 200 -
MockParentResponder: Received 'output'
127.0.0.1 - - [12/Nov/2025 14:19:07] "POST / HTTP/1.1" 200 -
MockParentResponder: Received 'output'
127.0.0.1 - - [12/Nov/2025 14:19:07] "POST / HTTP/1.1" 200 -
MockParentResponder: Received 'output'
127.0.0.1 - - [12/Nov/2025 14:19:07] "POST / HTTP/1.1" 200 -
MockParentResponder: Received 'output'
127.0.0.1 - - [12/Nov/2025 14:19:07] "POST / HTTP/1.1" 200 -
MockParentResponder: Received 'output'
127.0.0.1 - - [12/Nov/2025 14:19:07] "POST / HTTP/1.1" 200 -
MockParentResponder: Received 'output'
127.0.0.1 - - [12/Nov/2025 14:19:07] "POST / HTTP/1.1" 200 -
MockParentResponder: Received 'output'
127.0.0.1 - - [12/Nov/2025 14:19:07] "POST / HTTP/1.1" 200 -
MockParentResponder: Received 'complete'
127.0.0.1 - - [12/Nov/2025 14:19:07] "POST / HTTP/1.1" 200 -
MockParentResponder: Received 'status': message=Status message
127.0.0.1 - - [12/Nov/2025 14:19:07] "POST / HTTP/1.1" 200 -
mock_tether_parent: Launching Server on Port: 38869
.MockParentResponder: Received 'configure'': inputPort=36581
127.0.0.1 - - [12/Nov/2025 14:19:08] "POST / HTTP/1.1" 200 -
MockParentResponder: Received 'output'
127.0.0.1 - - [12/Nov/2025 14:19:08] "POST / HTTP/1.1" 200 -
MockParentResponder: Received 'output'
127.0.0.1 - - [12/Nov/2025 14:19:08] "POST / HTTP/1.1" 200 -
MockParentResponder: Received 'output'
127.0.0.1 - - [12/Nov/2025 14:19:08] "POST / HTTP/1.1" 200 -
MockParentResponder: Received 'output'
127.0.0.1 - - [12/Nov/2025 14:19:08] "POST / HTTP/1.1" 200 -
MockParentResponder: Received 'output'
127.0.0.1 - - [12/Nov/2025 14:19:08] "POST / HTTP/1.1" 200 -
MockParentResponder: Received 'output'
127.0.0.1 - - [12/Nov/2025 14:19:08] "POST / HTTP/1.1" 200 -
MockParentResponder: Received 'output'
127.0.0.1 - - [12/Nov/2025 14:19:08] "POST / HTTP/1.1" 200 -
MockParentResponder: Received 'complete'
127.0.0.1 - - [12/Nov/2025 14:19:08] "POST / HTTP/1.1" 200 -
mock_tether_parent: Launching Server on Port: 46409
.INFO:root:tether_task_runner.__main__: Task:
avro.test.word_count_task.WordCountTask
INFO:TetherTask:TetherTask.open: Opening connection to parent server on
port=46409
MockParentResponder: Received 'configure'': inputPort=37521
127.0.0.1 - - [12/Nov/2025 14:19:10] "POST / HTTP/1.1" 200 -
.25/11/12 14:19:12 WARN util.NativeCodeLoader: Unable to load native-hadoop
library for your platform... using builtin-java classes where applicable
25/11/12 14:19:13 WARN impl.MetricsSystemImpl: JobTracker metrics system
already initialized!
25/11/12 14:19:13 WARN mapreduce.JobResourceUploader: Hadoop command-line
option parsing not performed. Implement the Tool interface and execute your
application with ToolRunner to remedy this.
25/11/12 14:19:13 WARN mapreduce.JobResourceUploader: No job jar file set.
User classes may not be found. See Job or Job#setJar(String).
25/11/12 14:19:13 INFO mapred.FileInputFormat: Total input files to process : 1
25/11/12 14:19:13 INFO mapreduce.JobSubmitter: number of splits:1
25/11/12 14:19:13 INFO mapreduce.JobSubmitter: Submitting tokens for job:
job_local1890703834_0001
25/11/12 14:19:13 INFO mapreduce.JobSubmitter: Executing with tokens: []
25/11/12 14:19:13 INFO mapreduce.Job: The url to track the job:
http://localhost:8080/
25/11/12 14:19:13 INFO mapred.LocalJobRunner: OutputCommitter set in config null
25/11/12 14:19:13 INFO mapreduce.Job: Running job: job_local1890703834_0001
25/11/12 14:19:13 INFO mapred.LocalJobRunner: OutputCommitter is
org.apache.hadoop.mapred.FileOutputCommitter
25/11/12 14:19:13 INFO output.FileOutputCommitter: File Output Committer
Algorithm version is 2
25/11/12 14:19:13 INFO output.FileOutputCommitter: FileOutputCommitter skip
cleanup _temporary folders under output directory:false, ignore cleanup
failures: false
25/11/12 14:19:13 INFO mapred.LocalJobRunner: Waiting for map tasks
25/11/12 14:19:13 INFO mapred.LocalJobRunner: Starting task:
attempt_local1890703834_0001_m_000000_0
25/11/12 14:19:13 INFO output.FileOutputCommitter: File Output Committer
Algorithm version is 2
25/11/12 14:19:13 INFO output.FileOutputCommitter: FileOutputCommitter skip
cleanup _temporary folders under output directory:false, ignore cleanup
failures: false
25/11/12 14:19:13 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
25/11/12 14:19:13 INFO mapred.MapTask: Processing split:
file:/tmp/test_tether_word_countasg2o6_g/in/lines.avro:0+195
25/11/12 14:19:13 INFO mapred.MapTask: numReduceTasks: 1
25/11/12 14:19:13 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
25/11/12 14:19:13 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
25/11/12 14:19:13 INFO mapred.MapTask: soft limit at 83886080
25/11/12 14:19:13 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
25/11/12 14:19:13 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
25/11/12 14:19:13 INFO mapred.MapTask: Map output collector class =
org.apache.hadoop.mapred.MapTask$MapOutputBuffer
25/11/12 14:19:13 WARN tether.TetherMapRunner: Task failed
java.lang.SecurityException: Forbidden org.apache.avro.ipc.HandshakeRequest!
This class is not trusted to be included in Avro schemas. You may either use
the system properties org.apache.avro.SERIALIZABLE_CLASSES and
org.apache.avro.SERIALIZABLE_PACKAGES to set the comma separated list of the
classes or packages you trust, or you can set them via the API (see
org.apache.avro.util.ClassSecurityValidator).
at org.apache.avro.util.ClassSecurityValidator$ClassSecurityPredicate.forbiddenClass(ClassSecurityValidator.java:106)
at org.apache.avro.util.ClassSecurityValidator.validate(ClassSecurityValidator.java:60)
at org.apache.avro.util.ClassUtils.forName(ClassUtils.java:99)
at org.apache.avro.util.ClassUtils.forName(ClassUtils.java:72)
at org.apache.avro.specific.SpecificData.lambda$getClass$2(SpecificData.java:392)
at java.base/java.util.concurrent.ConcurrentHashMap.computeIfAbsent(ConcurrentHashMap.java:1708)
at org.apache.avro.specific.SpecificData.getClass(SpecificData.java:390)
at org.apache.avro.specific.SpecificDatumReader.setSchema(SpecificDatumReader.java:98)
at org.apache.avro.specific.SpecificDatumReader.<init>(SpecificDatumReader.java:62)
at org.apache.avro.ipc.Responder.<init>(Responder.java:204)
at org.apache.avro.ipc.generic.GenericResponder.<init>(GenericResponder.java:45)
at org.apache.avro.ipc.specific.SpecificResponder.<init>(SpecificResponder.java:55)
at org.apache.avro.ipc.specific.SpecificResponder.<init>(SpecificResponder.java:51)
at org.apache.avro.ipc.specific.SpecificResponder.<init>(SpecificResponder.java:43)
at org.apache.avro.mapred.tether.TetheredProcess.<init>(TetheredProcess.java:93)
at org.apache.avro.mapred.tether.TetherMapRunner.run(TetherMapRunner.java:52)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:466)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:350)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:271)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:572)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
at java.base/java.lang.Thread.run(Thread.java:1583)
25/11/12 14:19:13 INFO mapred.LocalJobRunner: map task executor complete.
25/11/12 14:19:13 WARN mapred.LocalJobRunner: job_local1890703834_0001
java.lang.Exception: java.lang.NullPointerException: Cannot read field "inputClient" because "this.process" is null
at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:492)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:552)
Caused by: java.lang.NullPointerException: Cannot read field "inputClient" because "this.process" is null
at org.apache.avro.mapred.tether.TetherMapRunner.run(TetherMapRunner.java:80)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:466)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:350)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:271)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:572)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
at java.base/java.lang.Thread.run(Thread.java:1583)
25/11/12 14:19:14 INFO mapreduce.Job: Job job_local1890703834_0001 running in
uber mode : false
25/11/12 14:19:14 INFO mapreduce.Job: map 0% reduce 0%
25/11/12 14:19:14 INFO mapreduce.Job: Job job_local1890703834_0001 failed with
state FAILED due to: NA
25/11/12 14:19:14 INFO mapreduce.Job: Counters: 0
Exception in thread "main" java.io.IOException: Job failed!
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:875)
at org.apache.avro.mapred.tether.TetherJob.runJob(TetherJob.java:114)
at org.apache.avro.tool.TetherTool.run(TetherTool.java:152)
at org.apache.avro.tool.Main.run(Main.java:67)
at org.apache.avro.tool.Main.main(Main.java:56)
Stdout:
Command:
java -jar
/home/ryanskraba/avro/lang/java/tools/target/avro-tools-1.13.0-SNAPSHOT.jar
tether --protocol http --in /tmp/test_tether_word_countasg2o6_g/in --out
/tmp/test_tether_word_countasg2o6_g/out --outschema
/tmp/test_tether_word_countasg2o6_g/output.avsc --program
/home/ryanskraba/avro/lang/py/.tox/py310/bin/python --exec_args -m
avro.tether.tether_task_runner word_count_task.WordCountTask
E
======================================================================
ERROR: test_tether_word_count
(avro.test.test_tether_word_count.TestTetherWordCount)
Check that a tethered map-reduce job produces the output expected locally.
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/ryanskraba/avro/lang/py/avro/test/test_tether_word_count.py",
line 168, in test_tether_word_count
subprocess.check_call(args, env={"PYTHONPATH": _PYTHON_PATH, "PATH":
os.environ["PATH"]})
File "/usr/lib/python3.10/subprocess.py", line 369, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '('java', '-jar',
'/home/ryanskraba/avro/lang/java/tools/target/avro-tools-1.13.0-SNAPSHOT.jar',
'tether', '--protocol', 'http', '--in',
'/tmp/test_tether_word_countasg2o6_g/in', '--out',
'/tmp/test_tether_word_countasg2o6_g/out', '--outschema',
'/tmp/test_tether_word_countasg2o6_g/output.avsc', '--program',
'/home/ryanskraba/avro/lang/py/.tox/py310/bin/python', '--exec_args', '-m
avro.tether.tether_task_runner word_count_task.WordCountTask')' returned
non-zero exit status 1.
Stdout:
Command:
java -jar
/home/ryanskraba/avro/lang/java/tools/target/avro-tools-1.13.0-SNAPSHOT.jar
tether --protocol http --in /tmp/test_tether_word_countasg2o6_g/in --out
/tmp/test_tether_word_countasg2o6_g/out --outschema
/tmp/test_tether_word_countasg2o6_g/output.avsc --program
/home/ryanskraba/avro/lang/py/.tox/py310/bin/python --exec_args -m
avro.tether.tether_task_runner
word_count_task.WordCountTask
----------------------------------------------------------------------
Ran 559 tests in 10.491s

FAILED (errors=1)
py310: exit 1 (10.66 seconds) /home/ryanskraba/avro/lang/py> coverage run -pm
unittest discover --buffer --failfast pid=25273
{code}