fedsp commented on issue #4502:
URL: https://github.com/apache/hudi/issues/4502#issuecomment-1013708796
I have a similar business need:
Imagine that you have a fact table containing the sales of a given period.
So you have the salesman id, the product id, the price at which the product
was sold, and a timestamp of the sale.
| salesman_id | product_id | price_sold | sales_timestamp  |
|-------------|------------|------------|------------------|
| fedsp       | 10001      | 0.99       | 2022-01-15 13:10 |
At the end of each month, the department calculates a bonus for each
salesman based on the sales data.
After 3 months, a salesman complained that in October 2021 the bonus was not
calculated properly, but here comes the issue: the sales table can be updated
at any time. So we need to see not the current data for October 2021, but **how**
the data looked on November 1st, the day the bonus was calculated.
There are several ways to do it, but they often involve hacky workarounds
that lead to more maintenance, non-optimized data storage (lots of duplicated
data) and extra columns in the data that were not originally needed.
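
To make the requirement concrete, here is a minimal sketch of the "as of"
semantics in plain Python. It is not Hudi's API: the commit log, the record
keys, the sample prices and the `snapshot_as_of` helper are all made up for
illustration. The idea is that every upsert carries a commit time, and a
point-in-time read replays only the commits made at or before the requested
instant.

```python
from datetime import datetime

# Hypothetical commit log: (commit_time, record_key, row).
# A later commit for the same key overwrites the earlier one (an upsert).
commits = [
    (datetime(2021, 10, 20, 9, 0), "sale-1",
     {"salesman_id": "fedsp", "product_id": 10001, "price_sold": 0.99}),
    # The same sale is corrected after the bonus was already calculated.
    (datetime(2021, 11, 15, 9, 0), "sale-1",
     {"salesman_id": "fedsp", "product_id": 10001, "price_sold": 1.49}),
]


def snapshot_as_of(commits, instant):
    """Rebuild the table by replaying upserts committed at or before `instant`."""
    state = {}
    for commit_time, key, row in sorted(commits, key=lambda c: c[0]):
        if commit_time <= instant:
            state[key] = row
    return state


# What the bonus calculation saw on November 1st vs. what the table holds later.
bonus_day = snapshot_as_of(commits, datetime(2021, 11, 1))
current = snapshot_as_of(commits, datetime(2022, 1, 15))
```

With this kind of read, the original rows need no extra validity columns; the
history lives in the commit timeline instead of in the data itself.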