minwoo-jung edited a comment on issue #7100:
URL: https://github.com/apache/incubator-pinot/issues/7100#issuecomment-871093574
@mcvsubbu
Thanks for the reply. I understand what you are saying, and I know this is not an easy problem to solve.
I'd like to ask you a few more questions.
### 1
**Is there any development in progress to solve this problem?**
This issue is important to me because, when the situation above occurs, I want to keep the existing data and minimize changes to the business code.
### 2
Suppose I follow the approach of creating a new table. The existing data must still be preserved, so I create the new table and start ingesting data into it.
Assume that the data in the existing table (Table A) must be preserved and stay queryable, and that the newly added table (Table B) is created under a different name and starts receiving the new data.
In the business code, I would then have to modify the code to access both tables (Table_A, Table_B).
For example, to keep using the data in both tables, the application code would have to change roughly as shown below (queryTable stands in for whatever query path the application uses).
```JAVA
// Route each query to the table that covers the requested time.
if (searchTime < tableBCreationTime) {
    queryTable("Table_A");  // data from before Table_B was created
} else {
    queryTable("Table_B");  // data from Table_B's creation time onward
}
```
That is, what I want to ask is this:
instead of modifying the code as above every time the table name or the number of tables changes,
**is there a way, even if it takes time, to push the data from the existing Table_A into the newly added Table_B?**
If this were possible, the amount of code modification and conditional logic would be significantly reduced.
If so, it would be good if you could give me some advice. :)
I've looked for a way to do this in the manual, but I haven't found one yet.
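To make the idea concrete, here is a rough sketch of the kind of backfill I have in mind; it is not something I found in the Pinot manual, just my own illustration. It reads rows out of Table_A through the broker with the pinot-java-client and replays them into the Kafka topic that feeds Table_B. The host/port values, the topic name `table_b_topic`, and the single unpaged SELECT are all simplifying assumptions; a real backfill would have to page through the data, e.g. by time range.

```JAVA
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.pinot.client.Connection;
import org.apache.pinot.client.ConnectionFactory;
import org.apache.pinot.client.ResultSet;
import org.apache.pinot.client.ResultSetGroup;

import java.util.Properties;

public class TableABackfill {
  public static void main(String[] args) {
    // Read rows out of the old table through the Pinot broker.
    Connection pinot = ConnectionFactory.fromHostList("localhost:8099");
    ResultSetGroup result = pinot.execute("SELECT * FROM Table_A LIMIT 100000");
    ResultSet rows = result.getResultSet(0);

    // Replay each row, as a flat JSON object, into the topic feeding Table_B.
    // (All values are serialized as strings here, which is a simplification.)
    Properties props = new Properties();
    props.put("bootstrap.servers", "localhost:9092");
    props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
    props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
    try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
      for (int row = 0; row < rows.getRowCount(); row++) {
        StringBuilder json = new StringBuilder("{");
        for (int col = 0; col < rows.getColumnCount(); col++) {
          if (col > 0) {
            json.append(',');
          }
          json.append('"').append(rows.getColumnName(col)).append("\":\"")
              .append(rows.getString(row, col)).append('"');
        }
        json.append('}');
        producer.send(new ProducerRecord<>("table_b_topic", json.toString()));
      }
    }
  }
}
```

Of course this double-ingests the data instead of moving segments, so I don't know whether it is a reasonable approach at scale; that is exactly what I'd like advice on.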
### 3
I've seen this issue: https://github.com/apache/incubator-pinot/issues/6555
I couldn't understand it exactly, but my reading is that it gives up data integrity (either duplicating rows or missing rows altogether).
**If the table-recreate issue can be solved by supporting the earliest/latest offsets, that would be a good workaround for us.**
For reference,
**in the Kafka consumer configuration, the accepted auto.offset.reset values changed from smallest/largest to earliest/latest/none.** (I believe this happened with the new consumer API; I may not have the exact version right.)
However, I have confirmed that only smallest and largest are supported in the KafkaStreamMetadataProvider class:
https://github.com/apache/incubator-pinot/blob/47a75e5093129cc280de4c118434ccb337cd3da1/pinot-plugins/pinot-stream-ingestion/pinot-kafka-2.0/src/main/java/org/apache/pinot/plugin/stream/kafka20/KafkaStreamMetadataProvider.java#L55
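For what it's worth, the Kafka 2.x consumer already exposes both ends of a partition directly, so the earliest/latest semantics look reachable from the client side. Below is a minimal sketch of my own (the topic name and broker address are placeholders; this is not Pinot code) showing how the new consumer reports the two offsets:

```JAVA
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

import java.util.Collections;
import java.util.Map;
import java.util.Properties;

public class OffsetProbe {
  public static void main(String[] args) {
    Properties props = new Properties();
    props.put("bootstrap.servers", "localhost:9092");
    props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
    props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
    try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
      TopicPartition partition = new TopicPartition("my_topic", 0);
      // "earliest" is the first offset still retained in the log...
      Map<TopicPartition, Long> begin = consumer.beginningOffsets(Collections.singleton(partition));
      // ...and "latest" is the offset one past the most recent record.
      Map<TopicPartition, Long> end = consumer.endOffsets(Collections.singleton(partition));
      System.out.println("earliest=" + begin.get(partition) + ", latest=" + end.get(partition));
    }
  }
}
```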
### 4
Our difficulty is that we want to keep customer data in Pinot long-term (up to 6 months to 1 year), rather than ingesting, verifying, and erasing user data on a one-off basis.
Is it not a recommended use case for Pinot to continuously ingest user data from a stream and retain it? I want to keep the data the way an RDB is used in typical web development; whether that is done with a TTL or an expiry time does not matter to us.
**Does Pinot only recommend storing and analyzing data on a one-off basis?**
Or is it also recommended to continuously ingest and analyze data over a long period (up to 6 months to 1 year) via streaming?
We know you are busy, but we look forward to your answers.
Thank you, and have a nice day.