[jira] [Issue Comment Deleted] (FLINK-20295) File Source lost data when reading from directories created by FileSystemTableSink with JSON format

Jingsong Lee (Jira) Mon, 23 Nov 2020 22:01:43 -0800


     [ 
https://issues.apache.org/jira/browse/FLINK-20295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Jingsong Lee updated FLINK-20295:
---------------------------------
    Comment: was deleted

(was:  
{code:java}
java.lang.IllegalStateException: MiniCluster is not yet running or has already 
been shut down.java.lang.IllegalStateException: MiniCluster is not yet running 
or has already been shut down. at 
org.apache.flink.util.Preconditions.checkState(Preconditions.java:198) 
~[classes/:?] at 
org.apache.flink.runtime.minicluster.MiniCluster.getDispatcherGatewayFuture(MiniCluster.java:706)
 ~[classes/:?] at 
org.apache.flink.runtime.minicluster.MiniCluster.runDispatcherCommand(MiniCluster.java:621)
 ~[classes/:?] at 
org.apache.flink.runtime.minicluster.MiniCluster.getJobStatus(MiniCluster.java:587)
 ~[classes/:?] at 
org.apache.flink.runtime.minicluster.MiniClusterJobClient.getJobStatus(MiniClusterJobClient.java:89)
 ~[classes/:?] at 
org.apache.flink.streaming.api.operators.collect.CollectResultFetcher.isJobTerminated(CollectResultFetcher.java:199)
 [classes/:?] at 
org.apache.flink.streaming.api.operators.collect.CollectResultFetcher.next(CollectResultFetcher.java:123)
 [classes/:?] at 
org.apache.flink.streaming.api.operators.collect.CollectResultIterator.nextResultFromFetcher(CollectResultIterator.java:103)
 [classes/:?] at 
org.apache.flink.streaming.api.operators.collect.CollectResultIterator.hasNext(CollectResultIterator.java:77)
 [classes/:?] at 
org.apache.flink.table.planner.sinks.SelectTableSinkBase$RowIteratorWrapper.hasNext(SelectTableSinkBase.java:115)
 [classes/:?] at 
org.apache.flink.table.api.internal.TableResultImpl$CloseableRowIteratorWrapper.hasNext(TableResultImpl.java:355)
 [classes/:?] at java.util.Iterator.forEachRemaining(Iterator.java:115) 
[?:1.8.0_152] at 
org.apache.flink.table.examples.java.connectors.JsonSource.main(JsonSource.java:62)
 [classes/:?]
{code}
It looks like there was something wrong that caused {{CollectResultFetcher}} to 
quit early. )

> File Source lost data when reading from directories created by 
> FileSystemTableSink with JSON format
> ---------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-20295
>                 URL: https://issues.apache.org/jira/browse/FLINK-20295
>             Project: Flink
>          Issue Type: Bug
>          Components: Connectors / FileSystem, Table SQL / Ecosystem
>            Reporter: Yun Gao
>            Priority: Critical
>             Fix For: 1.12.0
>
>         Attachments: compaction.tgz
>
>
> When testing the compaction functionality of the FileSystemTableSink, I found 
> that when using json format, the produced directories could not be read 
> correctly by the file source, namely only a part of records are read.
> By checking the produced directories, the number of the records in it is the 
> same as expected, thus it seems to be the issue of the source side.
>  
> The issue only exists for JSON format.
> The data is produced by 
> [FileCompactionTest|https://github.com/gaoyunhaii/flink1.12test/blob/main/src/main/java/FileCompactionTest.java]
>  and read by  
> [FileCompactionCheckTest|https://github.com/gaoyunhaii/flink1.12test/blob/main/src/main/java/FileCompactionCheckTest.java]
>  . An example directories tar file of 8000 records are also attached.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Issue Comment Deleted] (FLINK-20295) File Source lost data when reading from directories created by FileSystemTableSink with JSON format

Reply via email to