[jira] [Commented] (FLINK-17883) Unable to configure write mode for FileSystem() connector in PyFlink

2020-06-04 Thread Dian Fu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17126336#comment-17126336
 ] 

Dian Fu commented on FLINK-17883:
-

I have verified the following interface defined in StatementSet:
{code}
StatementSet addInsert(String targetPath, Table table, boolean overwrite);
{code}

Note: currently only FileSystemTableSink and HiveTableSink support `overwrite`.

> Unable to configure write mode for FileSystem() connector in PyFlink
>
> Key: FLINK-17883
> URL: https://issues.apache.org/jira/browse/FLINK-17883
> Project: Flink
> Issue Type: Improvement
> Components: API / Python
> Affects Versions: 1.10.1
> Reporter: Robert Metzger
> Assignee: Nicholas Jiang
> Priority: Major
>
> As a user of PyFlink, I'm getting the following exception:
> {code}
> File or directory /tmp/output already exists. Existing files and directories 
> are not overwritten in NO_OVERWRITE mode. Use OVERWRITE mode to overwrite 
> existing files and directories.
> {code}
> I would like to be able to configure writeMode = OVERWRITE for the FileSystem 
> connector.
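
For illustration, the behaviour described by the exception can be modelled in
plain Python. This is only a sketch of the NO_OVERWRITE / OVERWRITE semantics,
not Flink's implementation; the `WriteMode` enum and the `write_output` helper
are hypothetical names introduced here.

```python
import os
import tempfile
from enum import Enum


class WriteMode(Enum):
    # Hypothetical stand-in for the two modes named in the exception message.
    NO_OVERWRITE = "NO_OVERWRITE"
    OVERWRITE = "OVERWRITE"


def write_output(path, data, mode=WriteMode.NO_OVERWRITE):
    """Write `data` to `path`, refusing to replace an existing file unless
    the caller explicitly asks for OVERWRITE (illustrative only)."""
    if os.path.exists(path) and mode is WriteMode.NO_OVERWRITE:
        raise FileExistsError(
            "File or directory %s already exists. Existing files and "
            "directories are not overwritten in NO_OVERWRITE mode." % path
        )
    with open(path, "w") as f:
        f.write(data)


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "output")
    write_output(target, "first run\n")        # succeeds, the path is new
    try:
        write_output(target, "second run\n")   # default mode: rejected
    except FileExistsError as e:
        print("rejected:", e)
    write_output(target, "second run\n", WriteMode.OVERWRITE)  # replaces file
```

The default is the conservative mode, which matches what the report observes:
a second run against the same path fails unless overwrite is requested.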



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-17883) Unable to configure write mode for FileSystem() connector in PyFlink

2020-05-25 Thread Dian Fu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116362#comment-17116362
 ] 

Dian Fu commented on FLINK-17883:
-

Yes, I think so. We could close it after verifying that it works in 1.11. I'll 
verify it.


[jira] [Commented] (FLINK-17883) Unable to configure write mode for FileSystem() connector in PyFlink

2020-05-25 Thread Nicholas Jiang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116347#comment-17116347
 ] 

Nicholas Jiang commented on FLINK-17883:


[~dian.fu], as you mentioned, this is already supported via the INSERT 
OVERWRITE statement. Should this issue therefore be closed?


[jira] [Commented] (FLINK-17883) Unable to configure write mode for FileSystem() connector in PyFlink

2020-05-23 Thread Dian Fu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17114597#comment-17114597
 ] 

Dian Fu commented on FLINK-17883:
-

[~nicholasjiang] Yes, I think so. It should already be supported in 1.11.


[jira] [Commented] (FLINK-17883) Unable to configure write mode for FileSystem() connector in PyFlink

2020-05-22 Thread Nicholas Jiang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17114549#comment-17114549
 ] 

Nicholas Jiang commented on FLINK-17883:


[~dian.fu] [~lzljs3620320] [~jark] Based on your discussion, do you mean that 
users should use the INSERT OVERWRITE statement or addInsert instead of 
configuring the write mode for the FileSystem() connector in PyFlink?


[jira] [Commented] (FLINK-17883) Unable to configure write mode for FileSystem() connector in PyFlink

2020-05-22 Thread Dian Fu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17114479#comment-17114479
 ] 

Dian Fu commented on FLINK-17883:
-

[~jark] [~lzljs3620320] Thanks for your reply. It makes sense to me that the 
"insert overwrite" semantics should be specified in the DML instead of the DDL. 
I just noticed the following interface in StatementSet (since 1.11):
{code}
StatementSet addInsert(String targetPath, Table table, boolean overwrite);
{code}
It's also supported in the Python Table API. So I think users can specify 
"insert overwrite" semantics in both the Java and Python Table APIs via this 
interface since 1.11.
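
A minimal sketch of how this might look from Python, assuming PyFlink 1.11+
where `TableEnvironment.create_statement_set()` and
`StatementSet.add_insert(target_path, table, overwrite)` are available. The
helper name `insert_overwrite` is hypothetical, and the caller is assumed to
supply an existing table environment and table.

```python
def insert_overwrite(t_env, target_path, table):
    """Write `table` to the sink registered as `target_path`, replacing
    whatever the sink already holds.

    Assumes `t_env` is a PyFlink 1.11+ TableEnvironment and `table` is a
    Table created from it; both are supplied by the caller.
    """
    statement_set = t_env.create_statement_set()
    # The third argument is the `overwrite` flag of the interface quoted above.
    statement_set.add_insert(target_path, table, True)
    return statement_set.execute()


# Usage, assuming tables named "source" and "output" are already registered:
#   insert_overwrite(t_env, "output", t_env.from_path("source"))
```

Whether the flag takes effect depends on the sink; a sink that does not
support overwrite is expected to reject the statement.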



[jira] [Commented] (FLINK-17883) Unable to configure write mode for FileSystem() connector in PyFlink

2020-05-22 Thread Jingsong Lee (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17114089#comment-17114089
 ] 

Jingsong Lee commented on FLINK-17883:
--

Hi [~jark], you are right, we can use the new filesystem connector via DDL.

However, PyFlink maps the Descriptor API to the Python API, and we will support 
descriptors for the new connector properties (defined in FLIP-122) in Flink 1.12.
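
As a rough sketch of the DDL route, the statement for a filesystem-backed
table can be composed as a string and handed to the table environment. The
helper name `filesystem_sink_ddl`, the table name, the columns, and the path
below are illustrative assumptions; the option keys follow the FLIP-122 style
('connector', 'path', 'format') as I understand it for 1.11.

```python
def filesystem_sink_ddl(table_name, columns, path, fmt="csv"):
    """Build a CREATE TABLE statement for the filesystem connector.

    `columns` is a list of (name, sql_type) pairs. The names and path used
    by callers are up to them; nothing here is validated.
    """
    cols = ",\n".join(
        "  `%s` %s" % (name, sql_type) for name, sql_type in columns
    )
    return (
        "CREATE TABLE `%s` (\n%s\n) WITH (\n"
        "  'connector' = 'filesystem',\n"
        "  'path' = '%s',\n"
        "  'format' = '%s'\n"
        ")" % (table_name, cols, path, fmt)
    )


ddl = filesystem_sink_ddl(
    "output", [("word", "STRING"), ("cnt", "BIGINT")], "file:///tmp/output"
)
print(ddl)
# With a PyFlink 1.11 TableEnvironment this could then be run as:
#   t_env.execute_sql(ddl)
```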


[jira] [Commented] (FLINK-17883) Unable to configure write mode for FileSystem() connector in PyFlink

2020-05-22 Thread Jark Wu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17114077#comment-17114077
 ] 

Jark Wu commented on FLINK-17883:
-

We had a similar discussion during the last release, and the preference was to 
solve this by using the {{INSERT OVERWRITE}} statement instead of adding a 
property to the connector. AFAIK, the new filesystem connector (FLIP-115) 
already supports the {{INSERT OVERWRITE}} statement, so maybe you can give it a 
try. Please correct me if I'm wrong, [~lzljs3620320].
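
A small sketch of the statement in question, composed in Python so that it
can be passed to a table environment. The helper name `insert_overwrite_sql`
and the table and column names are hypothetical; only the
`INSERT OVERWRITE ... SELECT ...` form itself is taken from Flink SQL.

```python
def insert_overwrite_sql(sink, source, columns=None):
    """Compose an INSERT OVERWRITE statement copying `source` into `sink`.

    `columns` optionally restricts the projection; by default all columns
    are selected.
    """
    projection = ", ".join("`%s`" % c for c in columns) if columns else "*"
    return "INSERT OVERWRITE `%s` SELECT %s FROM `%s`" % (
        sink, projection, source
    )


stmt = insert_overwrite_sql("output", "source", ["word", "cnt"])
print(stmt)
# With a PyFlink 1.11 TableEnvironment: t_env.execute_sql(stmt)
```

Expressing the overwrite in the statement keeps the table definition free of
write-mode settings, which is the direction suggested above.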


[jira] [Commented] (FLINK-17883) Unable to configure write mode for FileSystem() connector in PyFlink

2020-05-22 Thread Dian Fu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17113962#comment-17113962
 ] 

Dian Fu commented on FLINK-17883:
-

I took a quick look at the code and I guess the reason is that the Java Csv 
descriptor doesn't provide an interface to set the write mode. Adding such an 
interface on the Python side alone is feasible; however, it would be great if 
we could align the Python API with the Java API.

cc [~docete] I'm assuming you are more familiar with this, as I noticed that 
you added the write mode property to the csv table sink. Could you kindly share 
some thoughts on this? Is there any reason for not allowing the write mode to 
be set in the Csv descriptor?


[jira] [Commented] (FLINK-17883) Unable to configure write mode for FileSystem() connector in PyFlink

2020-05-22 Thread Dian Fu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17113944#comment-17113944
 ] 

Dian Fu commented on FLINK-17883:
-

Thanks [~rmetzger] for reporting this issue. Also thanks [~nicholasjiang] for 
the contribution. I have assigned this issue to you.


[jira] [Commented] (FLINK-17883) Unable to configure write mode for FileSystem() connector in PyFlink

2020-05-22 Thread Nicholas Jiang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17113880#comment-17113880
 ] 

Nicholas Jiang commented on FLINK-17883:


[~rmetzger], I couldn't agree with you more. I previously needed to overwrite 
files on the filesystem, but the Python API doesn't support this. I would like 
to work on this issue for my business case.
