[GitHub] [hudi] yangzhiyue opened a new issue, #5658: [SUPPORT]

GitBox Sun, 22 May 2022 20:22:36 -0700


yangzhiyue opened a new issue, #5658:
URL: https://github.com/apache/hudi/issues/5658

**_Tips before filing an issue_**

- Have you gone through our [FAQs](https://hudi.apache.org/learn/faq/)? yes

- Join the mailing list to engage in conversations and get faster support at
[email protected].

- If you have triaged this as a bug, then file an
[issue](https://issues.apache.org/jira/projects/HUDI/issues) directly.

**Describe the problem you faced**
Data written to a Hudi table by Flink is not updated in place when Spark
later modifies it by primary key. Instead, Spark inserts a new record, so
the same primary key ends up with two records. The reverse also happens:
when the Hudi table is created by Spark and Flink then writes new data, an
extra record is inserted for a primary key that already exists.
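One way to confirm the symptom is to count how many records share each primary key after both engines have written. The snippet below is only a minimal sketch in plain Python: the `duplicated_keys` helper, the `id` key field, and the sample rows are all hypothetical and stand in for rows read back from the table.

```python
from collections import Counter


def duplicated_keys(rows, key_field="id"):
    """Return {key: count} for every key that appears in more than one row.

    `rows` is any iterable of dicts; `key_field` is an assumed column name.
    """
    counts = Counter(row[key_field] for row in rows)
    return {key: n for key, n in counts.items() if n > 1}


# Hypothetical rows as they might look after a Flink write followed by a
# Spark update on the same primary key.
rows = [
    {"id": 1, "name": "a", "writer": "flink"},
    {"id": 1, "name": "a2", "writer": "spark"},
    {"id": 2, "name": "b", "writer": "flink"},
]

print(duplicated_keys(rows))
```

An empty result would mean every primary key maps to a single record, which is what an update by key is normally expected to produce.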

**To Reproduce**

Steps to reproduce the behavior:

1. Write data into a Hudi table with Flink.
2. Update that data by primary key with Spark.
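The issue does not include the actual write code, so the following is only a sketch of what step 2 commonly looks like on the Spark side. The `hudi_upsert_options` helper, the table name `demo_table`, and the column names `id` and `ts` are assumptions for illustration; the option keys are standard Hudi datasource write options.

```python
def hudi_upsert_options(table_name, record_key, precombine_field):
    """Build Hudi datasource options for an upsert keyed on `record_key`.

    All argument values used below are hypothetical; the issue does not
    state the real table name or column names.
    """
    return {
        "hoodie.table.name": table_name,
        "hoodie.datasource.write.operation": "upsert",
        "hoodie.datasource.write.recordkey.field": record_key,
        "hoodie.datasource.write.precombine.field": precombine_field,
    }


opts = hudi_upsert_options("demo_table", "id", "ts")

# With a live SparkSession and a DataFrame `df` (not created here), the
# options would typically be applied like this:
# df.write.format("hudi").options(**opts).mode("append").save("/path/to/demo_table")
```

Sharing the equivalent Flink table definition alongside these options would make it easier to compare how each engine is configured to derive the record key.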

**Expected behavior**

A clear and concise description of what you expected to happen.

**Environment Description**

* Hudi version : 0.10.0

* Spark version : 3.0.1

* Flink version : 1.13.1

* Hadoop version : 3.1.0

* Storage (HDFS/S3/GCS..) :

* Running on Docker? (yes/no) : no

**Additional context**

Add any other context about the problem here.

**Stacktrace**

```Add the stacktrace of the error.```

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

