Ryan Skraba created AVRO-4199:
---------------------------------

             Summary: [build][p] test_tether_word_count fails
                 Key: AVRO-4199
                 URL: https://issues.apache.org/jira/browse/AVRO-4199
             Project: Apache Avro
          Issue Type: Bug
    Affects Versions: 1.11.5, 1.12.1
            Reporter: Ryan Skraba


The 
[test_tether_word_count|https://github.com/apache/avro/blob/6429a1dfb246a54a276ff35b9544c34b22ea961c/lang/py/avro/test/test_tether_word_count.py#L168]
 currently fails, likely due to recent security fixes in the Java SDK.


Notably:
{code:java}
25/11/12 14:19:13 WARN tether.TetherMapRunner: Task failed 
java.lang.SecurityException: Forbidden org.apache.avro.ipc.HandshakeRequest! 
This class is not trusted to be included in Avro schemas. You may either use 
the system properties org.apache.avro.SERIALIZABLE_CLASSES and 
org.apache.avro.SERIALIZABLE_PACKAGES to set the comma separated list of the 
classes or packages you trust, or you can set them via the API (see 
org.apache.avro.util.ClassSecurityValidator). {code}
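
The exception message names two system properties for declaring trusted classes or packages. As a rough, untested sketch of a possible workaround on the Python test side, the {{java}} command could be given a {{-D}} flag ahead of {{-jar}}. The helper name {{with_trusted_packages}} is hypothetical, and trusting the {{org.apache.avro.ipc}} package is an assumption drawn only from the forbidden class in the message. Whether setting the property replaces or extends the default trusted list is not confirmed here, and the real fix may belong in the Java tether code.

```python
# Property name is quoted from the SecurityException message above.
TRUST_PROPERTY = "org.apache.avro.SERIALIZABLE_PACKAGES"


def with_trusted_packages(args, packages):
    """Return a copy of a `java ...` command with the trust property set.

    Hypothetical helper: JVM options must come before `-jar`, so the -D
    flag is inserted directly after the `java` executable.
    """
    args = list(args)
    if not args:
        raise ValueError("empty command")
    flag = f"-D{TRUST_PROPERTY}={','.join(packages)}"
    return [args[0], flag, *args[1:]]


# Illustrative command, shaped like the failing one (paths shortened).
cmd = with_trusted_packages(
    ["java", "-jar", "avro-tools-1.13.0-SNAPSHOT.jar", "tether", "--protocol", "http"],
    ["org.apache.avro.ipc"],  # assumption: package of the forbidden HandshakeRequest
)
print(cmd)
```

The resulting list could be passed to {{subprocess.check_call}} in place of the original arguments, assuming the Hadoop local job runner runs in the same JVM so that the property is visible to the tether classes.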
 

*More log context:*
{code:java}
 py310: install_deps> python -I -m pip install coverage python-snappy zstandard
py310: install_package> python -I -m pip install --force-reinstall --no-deps 
/home/ryanskraba/avro/lang/py/.tox/.tmp/package/5/avro-1.13.0+snapshot.tar.gz
py310: commands_pre[0]> mkdir -p avro/test/interop 
/home/ryanskraba/avro/lang/py/../../build/interop/data
py310: commands_pre[1]> cp -r 
/home/ryanskraba/avro/lang/py/../../build/interop/data avro/test/interop
py310: commands_pre[2]> coverage run -pm avro.test.gen_interop_data 
avro/interop.avsc avro/test/interop/data/py.avro
py310: commands_pre[3]> cp -r avro/test/interop/data 
/home/ryanskraba/avro/lang/py/../../build/interop
py310: commands[0]> coverage run -pm unittest discover --buffer --failfast
/home/ryanskraba/avro/lang/py/avro/schema.py:1233: IgnoredLogicalType: Unknown 
unknown-logical-type, using string.
  warnings.warn(avro.errors.IgnoredLogicalType(f"Unknown {logical_type}, using 
{type_}."))
/home/ryanskraba/avro/lang/py/avro/schema.py:1229: IgnoredLogicalType: Logical 
type timestamp-millis requires literal type long, not string.
  warnings.warn(
/home/ryanskraba/avro/lang/py/avro/schema.py:1233: IgnoredLogicalType: Unknown 
unknown-logical-type, using string.
  warnings.warn(avro.errors.IgnoredLogicalType(f"Unknown {logical_type}, using 
{type_}."))
/home/ryanskraba/avro/lang/py/avro/schema.py:1229: IgnoredLogicalType: Logical 
type timestamp-millis requires literal type long, not string.
  warnings.warn(
............................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................../home/ryanskraba/avro/lang/py/avro/__main__.py:78:
 AvroWarning: There is no way to safely type check this.
  warnings.warn(avro.errors.AvroWarning("There is no way to safely type check 
this."))
......./home/ryanskraba/avro/lang/py/avro/datafile.py:172: AvroWarning: Writing 
binary data to a writer <_io.TextIOWrapper name='<stdout>' mode='w' 
encoding='utf-8'> that's opened for text
  warnings.warn(avro.errors.AvroWarning(f"Writing binary data to a writer 
{writer!r} that's opened for text"))
1.13.0+SNAPSHOT
..../home/ryanskraba/avro/lang/py/avro/datafile.py:172: AvroWarning: Writing 
binary data to a writer <_io.TextIOWrapper name='<stdout>' mode='w' 
encoding='utf-8'> that's opened for text
  warnings.warn(avro.errors.AvroWarning(f"Writing binary data to a writer 
{writer!r} that's opened for text"))
./home/ryanskraba/avro/lang/py/avro/datafile.py:172: AvroWarning: Writing 
binary data to a writer <_io.TextIOWrapper name='<stdout>' mode='w' 
encoding='utf-8'> that's opened for text
  warnings.warn(avro.errors.AvroWarning(f"Writing binary data to a writer 
{writer!r} that's opened for text"))
mock_tether_parent: Launching Server on Port: 37659
.MockParentResponder: Received 'configure'': inputPort=57031
127.0.0.1 - - [12/Nov/2025 14:19:07] "POST / HTTP/1.1" 200 -
MockParentResponder: Received 'output'
127.0.0.1 - - [12/Nov/2025 14:19:07] "POST / HTTP/1.1" 200 -
MockParentResponder: Received 'output'
127.0.0.1 - - [12/Nov/2025 14:19:07] "POST / HTTP/1.1" 200 -
MockParentResponder: Received 'output'
127.0.0.1 - - [12/Nov/2025 14:19:07] "POST / HTTP/1.1" 200 -
MockParentResponder: Received 'output'
127.0.0.1 - - [12/Nov/2025 14:19:07] "POST / HTTP/1.1" 200 -
MockParentResponder: Received 'output'
127.0.0.1 - - [12/Nov/2025 14:19:07] "POST / HTTP/1.1" 200 -
MockParentResponder: Received 'output'
127.0.0.1 - - [12/Nov/2025 14:19:07] "POST / HTTP/1.1" 200 -
MockParentResponder: Received 'output'
127.0.0.1 - - [12/Nov/2025 14:19:07] "POST / HTTP/1.1" 200 -
MockParentResponder: Received 'complete'
127.0.0.1 - - [12/Nov/2025 14:19:07] "POST / HTTP/1.1" 200 -
MockParentResponder: Received 'status': message=Status message
127.0.0.1 - - [12/Nov/2025 14:19:07] "POST / HTTP/1.1" 200 -
mock_tether_parent: Launching Server on Port: 38869
.MockParentResponder: Received 'configure'': inputPort=36581
127.0.0.1 - - [12/Nov/2025 14:19:08] "POST / HTTP/1.1" 200 -
MockParentResponder: Received 'output'
127.0.0.1 - - [12/Nov/2025 14:19:08] "POST / HTTP/1.1" 200 -
MockParentResponder: Received 'output'
127.0.0.1 - - [12/Nov/2025 14:19:08] "POST / HTTP/1.1" 200 -
MockParentResponder: Received 'output'
127.0.0.1 - - [12/Nov/2025 14:19:08] "POST / HTTP/1.1" 200 -
MockParentResponder: Received 'output'
127.0.0.1 - - [12/Nov/2025 14:19:08] "POST / HTTP/1.1" 200 -
MockParentResponder: Received 'output'
127.0.0.1 - - [12/Nov/2025 14:19:08] "POST / HTTP/1.1" 200 -
MockParentResponder: Received 'output'
127.0.0.1 - - [12/Nov/2025 14:19:08] "POST / HTTP/1.1" 200 -
MockParentResponder: Received 'output'
127.0.0.1 - - [12/Nov/2025 14:19:08] "POST / HTTP/1.1" 200 -
MockParentResponder: Received 'complete'
127.0.0.1 - - [12/Nov/2025 14:19:08] "POST / HTTP/1.1" 200 -
mock_tether_parent: Launching Server on Port: 46409
.INFO:root:tether_task_runner.__main__: Task: 
avro.test.word_count_task.WordCountTask
INFO:TetherTask:TetherTask.open: Opening connection to parent server on 
port=46409
MockParentResponder: Received 'configure'': inputPort=37521
127.0.0.1 - - [12/Nov/2025 14:19:10] "POST / HTTP/1.1" 200 -
.25/11/12 14:19:12 WARN util.NativeCodeLoader: Unable to load native-hadoop 
library for your platform... using builtin-java classes where applicable
25/11/12 14:19:13 WARN impl.MetricsSystemImpl: JobTracker metrics system 
already initialized!
25/11/12 14:19:13 WARN mapreduce.JobResourceUploader: Hadoop command-line 
option parsing not performed. Implement the Tool interface and execute your 
application with ToolRunner to remedy this.
25/11/12 14:19:13 WARN mapreduce.JobResourceUploader: No job jar file set.  
User classes may not be found. See Job or Job#setJar(String).
25/11/12 14:19:13 INFO mapred.FileInputFormat: Total input files to process : 1
25/11/12 14:19:13 INFO mapreduce.JobSubmitter: number of splits:1
25/11/12 14:19:13 INFO mapreduce.JobSubmitter: Submitting tokens for job: 
job_local1890703834_0001
25/11/12 14:19:13 INFO mapreduce.JobSubmitter: Executing with tokens: []
25/11/12 14:19:13 INFO mapreduce.Job: The url to track the job: 
http://localhost:8080/
25/11/12 14:19:13 INFO mapred.LocalJobRunner: OutputCommitter set in config null
25/11/12 14:19:13 INFO mapreduce.Job: Running job: job_local1890703834_0001
25/11/12 14:19:13 INFO mapred.LocalJobRunner: OutputCommitter is 
org.apache.hadoop.mapred.FileOutputCommitter
25/11/12 14:19:13 INFO output.FileOutputCommitter: File Output Committer 
Algorithm version is 2
25/11/12 14:19:13 INFO output.FileOutputCommitter: FileOutputCommitter skip 
cleanup _temporary folders under output directory:false, ignore cleanup 
failures: false
25/11/12 14:19:13 INFO mapred.LocalJobRunner: Waiting for map tasks
25/11/12 14:19:13 INFO mapred.LocalJobRunner: Starting task: 
attempt_local1890703834_0001_m_000000_0
25/11/12 14:19:13 INFO output.FileOutputCommitter: File Output Committer 
Algorithm version is 2
25/11/12 14:19:13 INFO output.FileOutputCommitter: FileOutputCommitter skip 
cleanup _temporary folders under output directory:false, ignore cleanup 
failures: false
25/11/12 14:19:13 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ]
25/11/12 14:19:13 INFO mapred.MapTask: Processing split: 
file:/tmp/test_tether_word_countasg2o6_g/in/lines.avro:0+195
25/11/12 14:19:13 INFO mapred.MapTask: numReduceTasks: 1
25/11/12 14:19:13 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
25/11/12 14:19:13 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
25/11/12 14:19:13 INFO mapred.MapTask: soft limit at 83886080
25/11/12 14:19:13 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
25/11/12 14:19:13 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
25/11/12 14:19:13 INFO mapred.MapTask: Map output collector class = 
org.apache.hadoop.mapred.MapTask$MapOutputBuffer
25/11/12 14:19:13 WARN tether.TetherMapRunner: Task failed
java.lang.SecurityException: Forbidden org.apache.avro.ipc.HandshakeRequest! 
This class is not trusted to be included in Avro schemas. You may either use 
the system properties org.apache.avro.SERIALIZABLE_CLASSES and 
org.apache.avro.SERIALIZABLE_PACKAGES to set the comma separated list of the 
classes or packages you trust, or you can set them via the API (see 
org.apache.avro.util.ClassSecurityValidator).
    at 
org.apache.avro.util.ClassSecurityValidator$ClassSecurityPredicate.forbiddenClass(ClassSecurityValidator.java:106)
    at 
org.apache.avro.util.ClassSecurityValidator.validate(ClassSecurityValidator.java:60)
    at org.apache.avro.util.ClassUtils.forName(ClassUtils.java:99)
    at org.apache.avro.util.ClassUtils.forName(ClassUtils.java:72)
    at 
org.apache.avro.specific.SpecificData.lambda$getClass$2(SpecificData.java:392)
    at 
java.base/java.util.concurrent.ConcurrentHashMap.computeIfAbsent(ConcurrentHashMap.java:1708)
    at org.apache.avro.specific.SpecificData.getClass(SpecificData.java:390)
    at 
org.apache.avro.specific.SpecificDatumReader.setSchema(SpecificDatumReader.java:98)
    at 
org.apache.avro.specific.SpecificDatumReader.<init>(SpecificDatumReader.java:62)
    at org.apache.avro.ipc.Responder.<init>(Responder.java:204)
    at 
org.apache.avro.ipc.generic.GenericResponder.<init>(GenericResponder.java:45)
    at 
org.apache.avro.ipc.specific.SpecificResponder.<init>(SpecificResponder.java:55)
    at 
org.apache.avro.ipc.specific.SpecificResponder.<init>(SpecificResponder.java:51)
    at 
org.apache.avro.ipc.specific.SpecificResponder.<init>(SpecificResponder.java:43)
    at 
org.apache.avro.mapred.tether.TetheredProcess.<init>(TetheredProcess.java:93)
    at 
org.apache.avro.mapred.tether.TetherMapRunner.run(TetherMapRunner.java:52)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:466)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:350)
    at 
org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:271)
    at 
java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:572)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317)
    at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
    at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
    at java.base/java.lang.Thread.run(Thread.java:1583)
25/11/12 14:19:13 INFO mapred.LocalJobRunner: map task executor complete.
25/11/12 14:19:13 WARN mapred.LocalJobRunner: job_local1890703834_0001
java.lang.Exception: java.lang.NullPointerException: Cannot read field 
"inputClient" because "this.process" is null
    at 
org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:492)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:552)
Caused by: java.lang.NullPointerException: Cannot read field "inputClient" 
because "this.process" is null
    at 
org.apache.avro.mapred.tether.TetherMapRunner.run(TetherMapRunner.java:80)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:466)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:350)
    at 
org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:271)
    at 
java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:572)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317)
    at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
    at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
    at java.base/java.lang.Thread.run(Thread.java:1583)
25/11/12 14:19:14 INFO mapreduce.Job: Job job_local1890703834_0001 running in 
uber mode : false
25/11/12 14:19:14 INFO mapreduce.Job:  map 0% reduce 0%
25/11/12 14:19:14 INFO mapreduce.Job: Job job_local1890703834_0001 failed with 
state FAILED due to: NA
25/11/12 14:19:14 INFO mapreduce.Job: Counters: 0
Exception in thread "main" java.io.IOException: Job failed!
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:875)
    at org.apache.avro.mapred.tether.TetherJob.runJob(TetherJob.java:114)
    at org.apache.avro.tool.TetherTool.run(TetherTool.java:152)
    at org.apache.avro.tool.Main.run(Main.java:67)
    at org.apache.avro.tool.Main.main(Main.java:56)

Stdout:
Command:
    java -jar 
/home/ryanskraba/avro/lang/java/tools/target/avro-tools-1.13.0-SNAPSHOT.jar 
tether --protocol http --in /tmp/test_tether_word_countasg2o6_g/in --out 
/tmp/test_tether_word_countasg2o6_g/out --outschema 
/tmp/test_tether_word_countasg2o6_g/output.avsc --program 
/home/ryanskraba/avro/lang/py/.tox/py310/bin/python --exec_args -m 
avro.tether.tether_task_runner word_count_task.WordCountTask
E
======================================================================
ERROR: test_tether_word_count 
(avro.test.test_tether_word_count.TestTetherWordCount)
Check that a tethered map-reduce job produces the output expected locally.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/ryanskraba/avro/lang/py/avro/test/test_tether_word_count.py", 
line 168, in test_tether_word_count
    subprocess.check_call(args, env={"PYTHONPATH": _PYTHON_PATH, "PATH": 
os.environ["PATH"]})
  File "/usr/lib/python3.10/subprocess.py", line 369, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '('java', '-jar', 
'/home/ryanskraba/avro/lang/java/tools/target/avro-tools-1.13.0-SNAPSHOT.jar', 
'tether', '--protocol', 'http', '--in', 
'/tmp/test_tether_word_countasg2o6_g/in', '--out', 
'/tmp/test_tether_word_countasg2o6_g/out', '--outschema', 
'/tmp/test_tether_word_countasg2o6_g/output.avsc', '--program', 
'/home/ryanskraba/avro/lang/py/.tox/py310/bin/python', '--exec_args', '-m 
avro.tether.tether_task_runner word_count_task.WordCountTask')' returned 
non-zero exit status 1.

Stdout:
Command:
    java -jar 
/home/ryanskraba/avro/lang/java/tools/target/avro-tools-1.13.0-SNAPSHOT.jar 
tether --protocol http --in /tmp/test_tether_word_countasg2o6_g/in --out 
/tmp/test_tether_word_countasg2o6_g/out --outschema 
/tmp/test_tether_word_countasg2o6_g/output.avsc --program 
/home/ryanskraba/avro/lang/py/.tox/py310/bin/python --exec_args -m 
avro.tether.tether_task_runner 
word_count_task.WordCountTask

----------------------------------------------------------------------
Ran 559 tests in 10.491s

FAILED (errors=1)
py310: exit 1 (10.66 seconds) /home/ryanskraba/avro/lang/py> coverage run -pm 
unittest discover --buffer --failfast pid=25273
{code}
 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
