[ 
https://issues.apache.org/jira/browse/FLINK-31686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17709887#comment-17709887
 ] 

Jark Wu commented on FLINK-31686:
---------------------------------

I think you are right. The current implementation has some problems. The root 
cause is that {{DecodingFormat}} doesn't support {{copy()}}, so the same 
{{DecodingFormat}} instance is reused after a filter/projection is pushed down. 

Therefore, we first need to come up with a new API, 
{{DecodingFormat#copy()}}, which may need a public discussion. 
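
To make the reuse problem concrete, here is a minimal, self-contained Java 
sketch. {{SketchFormat}} and {{SketchSource}} are hypothetical stand-ins, not 
Flink classes, and the {{copy()}} on the format is only an assumption about 
what such an API might look like. It is not a proposal for the final design.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a format that records pushed-down filters.
// This is NOT Flink's DecodingFormat; copy() is assumed for illustration.
class SketchFormat {
    final List<String> pushedFilters = new ArrayList<>();

    void applyFilter(String predicate) {
        pushedFilters.add(predicate);
    }

    // Returns an independent instance that carries its own filter list.
    SketchFormat copy() {
        SketchFormat copied = new SketchFormat();
        copied.pushedFilters.addAll(pushedFilters);
        return copied;
    }
}

// Hypothetical stand-in for a table source that owns a format.
class SketchSource {
    final SketchFormat format;

    SketchSource(SketchFormat format) {
        this.format = format;
    }

    // Shallow copy: the new source shares the same format instance.
    SketchSource shallowCopy() {
        return new SketchSource(format);
    }

    // Deep copy: the new source gets its own format instance.
    SketchSource deepCopy() {
        return new SketchSource(format.copy());
    }
}

class ShallowVsDeepCopySketch {
    public static void main(String[] args) {
        // Shallow copy: a filter pushed into the copy leaks into the original.
        SketchSource original = new SketchSource(new SketchFormat());
        SketchSource filtered = original.shallowCopy();
        filtered.format.applyFilter("name = 'tanjialiang'");
        System.out.println("shallow: " + original.format.pushedFilters);
        // prints: shallow: [name = 'tanjialiang']

        // Deep copy: the original stays free of the pushed-down filter.
        SketchSource original2 = new SketchSource(new SketchFormat());
        SketchSource filtered2 = original2.deepCopy();
        filtered2.format.applyFilter("name = 'tanjialiang'");
        System.out.println("deep: " + original2.format.pushedFilters);
        // prints: deep: []
    }
}
```

In the shallow case the unfiltered scan would see the filter that was pushed 
down for the other scan, which matches the behaviour described in the issue. 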

What do you think [~luoyuxia] [~lincoln.86xy] [~twalthr]? 

> Filesystem connector should replace the shallow copy with deep copy
> -------------------------------------------------------------------
>
>                 Key: FLINK-31686
>                 URL: https://issues.apache.org/jira/browse/FLINK-31686
>             Project: Flink
>          Issue Type: Bug
>          Components: Connectors / FileSystem
>    Affects Versions: 1.16.1
>            Reporter: tanjialiang
>            Priority: Major
>         Attachments: image-2023-04-01-16-18-48-762.png, 
> image-2023-04-01-16-18-56-075.png
>
>
> Hi team, when I use the following SQL:
> {code:sql}
> CREATE TABLE student (
>     `id` STRING,
>     `name` STRING,
>     `age` INT
> ) WITH (
>   'connector' = 'filesystem',
>   'path' = '...',
>   'format' = 'orc'
> );
> select
>     t1.total,
>     t2.total
> from
>     (
>         select
>             count(*) as total,
>             1 as join_key
>         from student
>         where name = 'tanjialiang'
>     ) t1
>     LEFT JOIN (
>         select
>             count(*) as total,
>             1 as join_key
>         from student
>     ) t2 
>     ON t1.join_key = t2.join_key; {code}
>  
> it throws an error:
> !image-2023-04-01-16-18-48-762.png!
>  
> I tried to solve it, and I found that the filesystem connector's copy 
> function uses a shallow copy instead of a deep copy. As a result, all 
> queries on the same table source reuse the same bulkWriterFormat. My query 
> has a filter condition, which is pushed down into the bulkWriterFormat, so 
> the filter condition may be reused.
> The comments on the copy functions of DynamicTableSource and 
> DynamicTableSink ask for them to be implemented as a deep copy, but I found 
> that every connector implements them with a shallow copy. So I think the 
> filesystem connector is not the only one with this problem.
> !image-2023-04-01-16-18-56-075.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
