[
https://issues.apache.org/jira/browse/FLINK-4795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15566022#comment-15566022
]
Yakov Goldberg edited comment on FLINK-4795 at 10/11/16 6:01 PM:
-----------------------------------------------------------------
{code}
Caused by: java.lang.RuntimeException: External process for task MapPartition
(PythonMap) terminated prematurely due to an error.
Traceback (most recent call last):
File
"/tmp/flink-dist-cache-f5206634-8cfa-466c-baf4-30eb37f8af10/a229b7e91bc650d230f62eaf039397ec/flink/plan.py",
line 527, in <module>
env.execute(local=True)
File
"/tmp/flink-dist-cache-f5206634-8cfa-466c-baf4-30eb37f8af10/a229b7e91bc650d230f62eaf039397ec/flink/flink/plan/Environment.py",
line 181, in execute
operator._go()
File
"/tmp/flink-dist-cache-f5206634-8cfa-466c-baf4-30eb37f8af10/a229b7e91bc650d230f62eaf039397ec/flink/flink/functions/Function.py",
line 64, in _go
self._run()
File
"/tmp/flink-dist-cache-f5206634-8cfa-466c-baf4-30eb37f8af10/a229b7e91bc650d230f62eaf039397ec/flink/flink/functions/MapFunction.py",
line 29, in _run
collector.collect(function(value))
File
"/tmp/flink-dist-cache-f5206634-8cfa-466c-baf4-30eb37f8af10/a229b7e91bc650d230f62eaf039397ec/flink/flink/plan/DataSet.py",
line 49, in map
return self.delim.join([self._map(field) for field in value])
File
"/tmp/flink-dist-cache-f5206634-8cfa-466c-baf4-30eb37f8af10/a229b7e91bc650d230f62eaf039397ec/flink/flink/plan/DataSet.py",
line 53, in _map
return "(" + b", ".join([self.map(x) for x in value]) + ")"
File
"/tmp/flink-dist-cache-f5206634-8cfa-466c-baf4-30eb37f8af10/a229b7e91bc650d230f62eaf039397ec/flink/flink/plan/DataSet.py",
line 49, in map
return self.delim.join([self._map(field) for field in value])
TypeError: 'int' object is not iterable
at
org.apache.flink.python.api.streaming.data.PythonStreamer.streamBufferWithoutGroups(PythonStreamer.java:268)
at
org.apache.flink.python.api.functions.PythonMapPartition.mapPartition(PythonMapPartition.java:54)
at
org.apache.flink.runtime.operators.MapPartitionDriver.run(MapPartitionDriver.java:98)
at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:480)
at
org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:345)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:559)
at java.lang.Thread.run(Thread.java:745)
{code}
was (Author: gyag):
Caused by: java.lang.RuntimeException: External process for task MapPartition
(PythonMap) terminated prematurely due to an error.
Traceback (most recent call last):
File
"/tmp/flink-dist-cache-f5206634-8cfa-466c-baf4-30eb37f8af10/a229b7e91bc650d230f62eaf039397ec/flink/plan.py",
line 527, in <module>
env.execute(local=True)
File
"/tmp/flink-dist-cache-f5206634-8cfa-466c-baf4-30eb37f8af10/a229b7e91bc650d230f62eaf039397ec/flink/flink/plan/Environment.py",
line 181, in execute
operator._go()
File
"/tmp/flink-dist-cache-f5206634-8cfa-466c-baf4-30eb37f8af10/a229b7e91bc650d230f62eaf039397ec/flink/flink/functions/Function.py",
line 64, in _go
self._run()
File
"/tmp/flink-dist-cache-f5206634-8cfa-466c-baf4-30eb37f8af10/a229b7e91bc650d230f62eaf039397ec/flink/flink/functions/MapFunction.py",
line 29, in _run
collector.collect(function(value))
File
"/tmp/flink-dist-cache-f5206634-8cfa-466c-baf4-30eb37f8af10/a229b7e91bc650d230f62eaf039397ec/flink/flink/plan/DataSet.py",
line 49, in map
return self.delim.join([self._map(field) for field in value])
File
"/tmp/flink-dist-cache-f5206634-8cfa-466c-baf4-30eb37f8af10/a229b7e91bc650d230f62eaf039397ec/flink/flink/plan/DataSet.py",
line 53, in _map
return "(" + b", ".join([self.map(x) for x in value]) + ")"
File
"/tmp/flink-dist-cache-f5206634-8cfa-466c-baf4-30eb37f8af10/a229b7e91bc650d230f62eaf039397ec/flink/flink/plan/DataSet.py",
line 49, in map
return self.delim.join([self._map(field) for field in value])
TypeError: 'int' object is not iterable
at
org.apache.flink.python.api.streaming.data.PythonStreamer.streamBufferWithoutGroups(PythonStreamer.java:268)
at
org.apache.flink.python.api.functions.PythonMapPartition.mapPartition(PythonMapPartition.java:54)
at
org.apache.flink.runtime.operators.MapPartitionDriver.run(MapPartitionDriver.java:98)
at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:480)
at
org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:345)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:559)
at java.lang.Thread.run(Thread.java:745)
> CsvStringify crashes in case of tuple in tuple, t.e. ("a", True, (1,5))
> -----------------------------------------------------------------------
>
> Key: FLINK-4795
> URL: https://issues.apache.org/jira/browse/FLINK-4795
> Project: Flink
> Issue Type: Bug
> Components: Python API
> Reporter: Yakov Goldberg
>
> CsvStringify crashes in case of tuple in tuple, t.e. ("a", True, (1,5))
> Looks like, mistyping in CsvStringify._map()
> {code}
> def _map(self, value):
> if isinstance(value, (tuple, list)):
> return "(" + b", ".join([self.map(x) for x in value]) + ")"
> else:
> return str(value)
> {code}
> self._map() should be called
> But this will affect write_csv() and read_csv().
> write_csv() will work automatically
> and read_csv() should be implemented to be able to read Tuple type.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)