mithalee commented on issue #3336:
URL: https://github.com/apache/hudi/issues/3336#issuecomment-888699402
> @mithalee Can you try with the latest master branch? I built the master code and tried to reproduce the scenario in a local Docker environment, and it runs fine. For example, after the first ingest you can see that `_hoodie_is_deleted` is false for both timestamps; after the second ingest (in which I set `_hoodie_is_deleted` to true for one timestamp), only one timestamp remains.
>
> ```
> // after first ingest
> scala> spark.sql("select symbol, ts, _hoodie_is_deleted from stock_ticks_cow WHERE symbol = 'MSFT'").show(100, false)
> +------+-------------------+------------------+
> |symbol|ts |_hoodie_is_deleted|
> +------+-------------------+------------------+
> |MSFT |2018-08-31 09:59:00|false |
> |MSFT |2018-08-31 10:29:00|false |
> +------+-------------------+------------------+
>
> // after second ingest
> scala> spark.sql("select symbol, ts, _hoodie_is_deleted from stock_ticks_cow WHERE symbol = 'MSFT'").show(100, false)
> +------+-------------------+------------------+
> |symbol|ts |_hoodie_is_deleted|
> +------+-------------------+------------------+
> |MSFT |2018-08-31 09:59:00|false |
> +------+-------------------+------------------+
> ```
>
> My schema is similar to [this one](https://github.com/apache/hudi/blob/master/docker/demo/config/schema.avsc), except that I added a `_hoodie_is_deleted` field with a default of false.
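>
> For concreteness, the added field in the Avro record looks roughly like this (a sketch of my addition; the rest of the schema matches the linked `schema.avsc`):
>
> ```json
> { "name": "_hoodie_is_deleted", "type": "boolean", "default": false }
> ```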
>
> FYI, my spark-submit command is the same as the one [mentioned here](https://hudi.apache.org/docs/docker_demo.html#step-2-incrementally-ingest-data-from-kafka-topic).
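
For my own notes, the soft-delete merge behavior described above can be simulated in plain Python (a sketch of the semantics only; `upsert` is a hypothetical helper, not Hudi's actual code path):

```python
# Simulate Hudi soft-delete semantics: during an upsert, an incoming record
# carrying _hoodie_is_deleted=True removes the matching row from the table.

def upsert(table, batch, key_fields=("symbol", "ts")):
    """Merge `batch` into `table`, keyed by `key_fields`.

    A record with _hoodie_is_deleted=True deletes the matching row;
    any other record inserts or replaces the row for its key.
    """
    merged = {tuple(r[k] for k in key_fields): r for r in table}
    for rec in batch:
        key = tuple(rec[k] for k in key_fields)
        if rec.get("_hoodie_is_deleted", False):
            merged.pop(key, None)   # soft delete: drop the row if present
        else:
            merged[key] = rec       # insert or update
    return list(merged.values())

# First ingest: both MSFT timestamps present, _hoodie_is_deleted false.
table = upsert([], [
    {"symbol": "MSFT", "ts": "2018-08-31 09:59:00", "_hoodie_is_deleted": False},
    {"symbol": "MSFT", "ts": "2018-08-31 10:29:00", "_hoodie_is_deleted": False},
])

# Second ingest: flag one timestamp as deleted; only the other survives.
table = upsert(table, [
    {"symbol": "MSFT", "ts": "2018-08-31 10:29:00", "_hoodie_is_deleted": True},
])
```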

Sure. I will try and get back to you.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]