Re: Handling Duplicate Timestamps

2024-05-19 Thread Jialin Qiao
Hi Trevor,

Yes, IoTDB cannot handle this scenario currently because our primary
key is Path + Timestamp.

This year we will focus on the table model, so there is a lot of work to do :-)

Jialin Qiao


Re: Handling Duplicate Timestamps

2024-05-19 Thread Trevor Hart
Hi Jialin



Yes the values would be different.



As an example, these are from a web server log. The device is openzweb01, an
IIS web server that may handle multiple requests at the same time. The rows
are unique in their own right, but the timestamp is the same in the logging.



2024-05-20 00:00:14 W3SVC1 openzweb01 192.168.3.69 POST /portal/sharing/rest/community/users/Meriadoc 200 0 0 3339 503 7

2024-05-20 00:00:14 W3SVC1 openzweb01 192.168.3.69 POST /portal/sharing/rest/community/users/Peregrin 200 0 0 3327 503 6

2024-05-20 00:00:14 W3SVC1 openzweb01 192.168.3.69 POST /portal/sharing/rest/community/users/Samwise 200 0 0 3325 502 6

2024-05-20 00:00:14 W3SVC1 openzweb01 192.168.3.69 POST /portal/sharing/rest/community/users/siteadmin 200 0 0 15279 504 5

2024-05-20 00:00:15 W3SVC1 openzweb01 192.168.3.69 POST /portal/sharing/rest/community/users/testuser 200 0 0 1794 503 6

2024-05-20 00:00:15 W3SVC1 openzweb01 192.168.3.69 POST /portal/sharing/rest/community/users/testuser2 200 0 0 1794 506 6



This particular log file only records to the second, so what I am doing with
these rows at the moment is adding an artificial millisecond offset to enforce
uniqueness.




2024-05-20 00:00:14.000 W3SVC1 openzweb01 192.168.3.69 POST /portal/sharing/rest/community/users/Meriadoc 200 0 0 3339 503 7

2024-05-20 00:00:14.001 W3SVC1 openzweb01 192.168.3.69 POST /portal/sharing/rest/community/users/Peregrin 200 0 0 3327 503 6

2024-05-20 00:00:14.002 W3SVC1 openzweb01 192.168.3.69 POST /portal/sharing/rest/community/users/Samwise 200 0 0 3325 502 6

2024-05-20 00:00:14.003 W3SVC1 openzweb01 192.168.3.69 POST /portal/sharing/rest/community/users/siteadmin 200 0 0 15279 504 5

2024-05-20 00:00:15.000 W3SVC1 openzweb01 192.168.3.69 POST /portal/sharing/rest/community/users/testuser 200 0 0 1794 503 6

2024-05-20 00:00:15.001 W3SVC1 openzweb01 192.168.3.69 POST /portal/sharing/rest/community/users/testuser2 200 0 0 1794 506 6
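
A minimal sketch of this offset scheme, in Python for illustration only. The
thread does not show how it is actually implemented; the function name
`disambiguate` is hypothetical, and the sketch assumes rows arrive in log
order and that fewer than 1000 rows share any one second.

```python
from datetime import datetime, timedelta


def disambiguate(timestamps):
    """Add an artificial millisecond offset to repeated second-resolution timestamps.

    Hypothetical helper: assumes rows are in log order and that fewer than
    1000 rows share the same second.
    """
    result = []
    previous = None
    offset = 0
    for ts in timestamps:
        # Count up while the second repeats; restart at 0 when it changes.
        offset = offset + 1 if ts == previous else 0
        if offset >= 1000:
            raise ValueError(f"more than 1000 rows at {ts}")
        result.append(ts + timedelta(milliseconds=offset))
        previous = ts
    return result


rows = [
    datetime(2024, 5, 20, 0, 0, 14),
    datetime(2024, 5, 20, 0, 0, 14),
    datetime(2024, 5, 20, 0, 0, 15),
]
unique = disambiguate(rows)
```

The offsets carry no real timing information; they only preserve the order in
which the rows appeared in the log.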



Some other log files that I am processing are already in milliseconds, but
there is still a (small) chance of data loss if multiple requests happen to be
processed at exactly the same time.



I have been thinking about this some more, and I think that rather than break
the IoTDB CRUD model I should handle this on the client side. In my use case
the log data is actually staged in an H2 database before it is sent to IoTDB,
so I can enforce PK validation there. That way it is less expensive than
checking the timestamp in IoTDB for each record.
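
One way to picture the client-side check is a staging table whose primary key
is (device, timestamp), so a colliding row is caught before anything is sent
on. The sketch below uses Python's built-in sqlite3 module purely as a
stand-in for the H2 staging database. The table name, column names, the
`stage` helper, and the "bump by 1 ms and retry" remedy are all assumptions
made for illustration, and the timestamp value is arbitrary.

```python
import sqlite3

# In-memory database standing in for the H2 staging database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE staged_log ("
    " device TEXT NOT NULL,"
    " ts_ms INTEGER NOT NULL,"
    " uri TEXT NOT NULL,"
    " PRIMARY KEY (device, ts_ms))"
)


def stage(device, ts_ms, uri):
    """Insert a row; on a primary-key collision, bump the timestamp by 1 ms and retry."""
    while True:
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute(
                    "INSERT INTO staged_log VALUES (?, ?, ?)",
                    (device, ts_ms, uri),
                )
            return ts_ms
        except sqlite3.IntegrityError:
            ts_ms += 1


a = stage("openzweb01", 14000, "/portal/sharing/rest/community/users/Meriadoc")
b = stage("openzweb01", 14000, "/portal/sharing/rest/community/users/Peregrin")
count = conn.execute("SELECT COUNT(*) FROM staged_log").fetchone()[0]
```

Here the second row keeps its data and is staged one millisecond later instead
of replacing the first.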



Thanks 

Trevor Hart









Re: Handling Duplicate Timestamps

2024-05-17 Thread Jialin Qiao
Hi Trevor,

Will the multiple values that share a timestamp be identical, or will they
differ?

1. Same
Time, Value
1, 1
1, 1
1, 1

2. Different
Time, Value
1, 1
1, 2
1, 1


Jialin Qiao


Re: Handling Duplicate Timestamps

2024-05-13 Thread Trevor Hart
Thank you! I will implement a workaround for now.


I would appreciate some consideration for this option in the future.


Thanks 

Trevor Hart

Ope Limited

w: http://www.ope.nz/

m: +64212728039









Re: Handling Duplicate Timestamps

2024-05-13 Thread Xiangdong Huang
> 1. Checking before insert if the timestamp already exists and remedy on the 
> client before resend
> 2. Moving to Nanosecond and introducing some insignificant time value to keep 
> timestamp values unique.
Yes, these may be the best solutions for a specific application.


Analysis for IoTDB:
- Rejecting a write when it carries an existing timestamp is time-costly
(IoTDB needs to check historical data). I think we will not add this check
until we find a low-latency method.
- Allowing multiple value versions for a timestamp may set off a chain
reaction: a lot of code would need to be modified, which is a huge amount
of work.

There is a new idea (but I have no time to implement it...)
- Add a parameter in IoTDB: replace_strategy: first, last, avg, etc.
- When a write with an existing timestamp arrives, IoTDB accepts it.
- When IoTDB runs the LSM merge and meets multiple values for a timestamp,
it handles them according to replace_strategy.
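
The merge step of this idea can be sketched in a few lines of Python. This is
only a toy model of the proposal, not IoTDB code: `replace_strategy` is the
proposed (unimplemented) parameter, and `merge_points` and `STRATEGIES` are
hypothetical names.

```python
from itertools import groupby
from statistics import mean

# Hypothetical resolvers named after the proposed replace_strategy values.
STRATEGIES = {
    "first": lambda values: values[0],
    "last": lambda values: values[-1],
    "avg": mean,
}


def merge_points(points, replace_strategy="last"):
    """Collapse (timestamp, value) points that share a timestamp into one point.

    `points` must be in arrival order; sorted() is stable, so arrival order
    is preserved within each timestamp.
    """
    resolve = STRATEGIES[replace_strategy]
    ordered = sorted(points, key=lambda p: p[0])
    return [
        (ts, resolve([value for _, value in group]))
        for ts, group in groupby(ordered, key=lambda p: p[0])
    ]


# Three writes land on timestamp 1, one on timestamp 2.
points = [(1, 1), (1, 2), (2, 5), (1, 3)]
merged_first = merge_points(points, "first")
merged_last = merge_points(points, "last")
merged_avg = merge_points(points, "avg")
```

Note that until the merge runs, a query would still see several values for one
timestamp, which is the query-side impact that needs careful thought.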

This solution may also introduce some work... and we need to think carefully
about its impact on the query process.
We also need to survey whether this is a common requirement.

Best,
---
Xiangdong Huang



Re: Handling Duplicate Timestamps

2024-05-13 Thread Trevor Hart
Hello Yuan



Correct, the first timestamp and values should be retained.



I realise this does not align with the current design. I was just asking
whether there was an existing option to block duplicates.



In a normal RDBMS, if you try to insert a duplicate, the insert will fail
with a PK violation. It would be great in some circumstances if IoTDB at least
had the option to fail this way.



I am considering some options, such as:



1. Checking before insert if the timestamp already exists and remedy on the 
client before resend

2. Moving to Nanosecond and introducing some insignificant time value to keep 
timestamp values unique.



I have already done something similar to #2 when storing IIS web log files, as
they are recorded in seconds and not milliseconds.



Thanks 

Trevor Hart





Re: Handling Duplicate Timestamps

2024-05-13 Thread Yuan Tian
Hi Trevor,

By "rejects duplicates", do you mean you want to keep the first occurrence of
a duplicated timestamp and its corresponding values? (The later duplicates
would then be rejected.)

Best regards,

Yuan Tian



Re: Handling Duplicate Timestamps

2024-05-13 Thread Trevor Hart




Correct. I’m not disputing that. What I’m asking is that it would be good to
have a configuration that either allows overwrites or rejects duplicates.

My scenario is request log data from a server (the device). As it may be
processing multiple requests at once, there is a chance that there could be
colliding timestamps.

As it stands now, I would need to check whether the timestamp exists before
inserting the data, which obviously affects throughput.

Thanks

Trevor Hart

 On Fri, 10 May 2024 00:33:40 +1200 Jialin Qiao wrote ---

Hi,

In IoT or IIoT scenarios, we assume each data point represents a metric at
a single timestamp. In which case would you need to store duplicated values?

Take this as an example:
Time, root.sg1.car1.speed
1, 1
1, 2

Could a car have different speeds at time 1?

Jialin Qiao

On Thu, May 9, 2024 at 18:51, Yuan Tian wrote:
>
> Hi Trevor,
>
> Currently, a write with a duplicate timestamp overwrites the existing
> values with the newer ones. There is nothing we can do about that for now.
>
> Best regards,
> ---
> Yuan Tian
>
> On Wed, May 8, 2024 at 5:31 PM Trevor Hart wrote:
>
> > Hello
> >
> > I’m aware that when inserting a duplicate timestamp the values will be
> > overwritten. This will obviously result in data loss.
> >
> > Is there a config/setting to reject or throw an error on duplicate
> > inserts? Although highly unlikely I would prefer to be alerted to the
> > situation rather than lose data.
> >
> > I read through the documentation but couldn’t find anything.
> >
> > Thanks
> >
> > Trevor Hart








Re: Handling Duplicate Timestamps

2024-05-09 Thread Jialin Qiao
Hi,

In IoT or IIoT scenarios, we assume each data point represents a metric at
a single timestamp. In which case would you need to store duplicated
values?

Take this as an example:
Time, root.sg1.car1.speed
1, 1
1, 2

Could a car have different speeds at time 1?
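
The overwrite behaviour under discussion can be modelled with a toy example.
This is not IoTDB code: a plain Python dict keyed by (path, timestamp) stands
in for the store, and the `write` helper is hypothetical.

```python
# Toy model only: a dict keyed by (path, timestamp) stands in for the store.
store = {}


def write(path, timestamp, value):
    """Write a point; a later write to the same key replaces the earlier value."""
    store[(path, timestamp)] = value


write("root.sg1.car1.speed", 1, 1)
write("root.sg1.car1.speed", 1, 2)  # same path and time: replaces the first value
```

After both writes the store holds a single point for time 1, with value 2; the
first value is gone without any error being raised.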


Jialin Qiao

On Thu, May 9, 2024 at 18:51, Yuan Tian wrote:
>
> Hi Trevor,
>
> Currently, a write with a duplicate timestamp overwrites the existing
> values with the newer ones. There is nothing we can do about that for now.
>
> Best regards,
> ---
> Yuan Tian
>
> On Wed, May 8, 2024 at 5:31 PM Trevor Hart  wrote:
>
> > Hello
> >
> >
> >
> > I’m aware that when inserting a duplicate timestamp the values will be
> > overwritten. This will obviously result in data loss.
> >
> >
> >
> > Is there a config/setting to reject or throw an error on duplicate
> > inserts? Although highly unlikely I would prefer to be alerted to the
> > situation rather than lose data.
> >
> >
> >
> > I read through the documentation but couldn’t find anything.
> >
> >
> >
> > Thanks
> >
> > Trevor Hart


Re: Handling Duplicate Timestamps

2024-05-09 Thread Yuan Tian
Hi Trevor,

Currently, a write with a duplicate timestamp overwrites the existing values
with the newer ones. There is nothing we can do about that for now.

Best regards,
---
Yuan Tian

On Wed, May 8, 2024 at 5:31 PM Trevor Hart  wrote:

> Hello
>
>
>
> I’m aware that when inserting a duplicate timestamp the values will be
> overwritten. This will obviously result in data loss.
>
>
>
> Is there a config/setting to reject or throw an error on duplicate
> inserts? Although highly unlikely I would prefer to be alerted to the
> situation rather than lose data.
>
>
>
> I read through the documentation but couldn’t find anything.
>
>
>
> Thanks
>
> Trevor Hart


Handling Duplicate Timestamps

2024-05-08 Thread Trevor Hart
Hello



I’m aware that when inserting a duplicate timestamp the values will be 
overwritten. This will obviously result in data loss. 



Is there a config/setting to reject or throw an error on duplicate inserts? 
Although highly unlikely I would prefer to be alerted to the situation rather 
than lose data.



I read through the documentation but couldn’t find anything. 



Thanks 

Trevor Hart

Handling Duplicate Timestamps

2024-05-02 Thread Trevor Hart
Hello



I’m aware that when inserting a duplicate timestamp the values will be 
overwritten. This can obviously result in data loss. 



Is there a config/setting to reject or throw an error on duplicate inserts? 
Although highly unlikely I would prefer to be alerted to the situation rather 
than lose data.



I read through the documentation but couldn’t find anything. 



Thanks 

Trevor Hart