Hi Trevor,

Yes, IoTDB cannot handle this scenario currently because our primary
key is Path + Timestamp.

This year we will focus on the table model; there is a lot of work to do :-)

Jialin Qiao

Trevor Hart <tre...@ope.nz> wrote on Mon, May 20, 2024 at 09:45:
>
> Hi Jialin
>
>
>
> Yes the values would be different.
>
>
>
> As an example, these are from a web server log. The device is openzweb01 
> which is an IIS web server which may handle multiple requests at the same 
> time. The rows are unique in their own right but the timestamp is the same in 
> the logging.
>
>
>
> 2024-05-20 00:00:14 W3SVC1 openzweb01 192.168.3.69 POST 
> /portal/sharing/rest/community/users/Meriadoc 200 0 0 3339 503 7
>
>
> 2024-05-20 00:00:14 W3SVC1 openzweb01 192.168.3.69 POST 
> /portal/sharing/rest/community/users/Peregrin 200 0 0 3327 503 6
>
>
> 2024-05-20 00:00:14 W3SVC1 openzweb01 192.168.3.69 POST 
> /portal/sharing/rest/community/users/Samwise 200 0 0 3325 502 6
>
> 2024-05-20 00:00:14 W3SVC1 openzweb01 192.168.3.69 POST 
> /portal/sharing/rest/community/users/siteadmin 200 0 0 15279 504 5
>
>
> 2024-05-20 00:00:15 W3SVC1 openzweb01 192.168.3.69 POST 
> /portal/sharing/rest/community/users/testuser 200 0 0 1794 503 6
>
> 2024-05-20 00:00:15 W3SVC1 openzweb01 192.168.3.69 POST 
> /portal/sharing/rest/community/users/testuser2 200 0 0 1794 506 6
>
>
>
> This particular log file only records in seconds. So what I am doing with 
> these rows at the moment is to add an artificial millisecond to enforce 
> uniqueness.
>
>
>
>
> 2024-05-20 00:00:14.000 W3SVC1 openzweb01 192.168.3.69 POST 
> /portal/sharing/rest/community/users/Meriadoc 200 0 0 3339 503 7
>
> 2024-05-20 00:00:14.001 W3SVC1 openzweb01 192.168.3.69 POST 
> /portal/sharing/rest/community/users/Peregrin 200 0 0 3327 503 6
>
> 2024-05-20 00:00:14.002 W3SVC1 openzweb01 192.168.3.69 POST 
> /portal/sharing/rest/community/users/Samwise 200 0 0 3325 502 6
>
> 2024-05-20 00:00:14.003 W3SVC1 openzweb01 192.168.3.69 POST 
> /portal/sharing/rest/community/users/siteadmin 200 0 0 15279 504 5
>
> 2024-05-20 00:00:15.000 W3SVC1 openzweb01 192.168.3.69 POST 
> /portal/sharing/rest/community/users/testuser 200 0 0 1794 503 6
>
> 2024-05-20 00:00:15.001 W3SVC1 openzweb01 192.168.3.69 POST 
> /portal/sharing/rest/community/users/testuser2 200 0 0 1794 506 6
>
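The millisecond workaround described above can be sketched in a few lines of
Python. This is only an illustration under assumptions: the function name
add_millisecond_offsets and the (timestamp, payload) row shape are
hypothetical, and rows are assumed to arrive in log order with
second-resolution timestamps.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def add_millisecond_offsets(rows):
    """Give rows that share a second-resolution timestamp distinct milliseconds.

    rows: iterable of (timestamp_text, payload) in log order (hypothetical shape).
    Returns a list of (datetime, payload) whose timestamps are unique.
    """
    seen = defaultdict(int)  # second -> number of rows already emitted for it
    result = []
    for ts_text, payload in rows:
        second = datetime.strptime(ts_text, "%Y-%m-%d %H:%M:%S")
        offset = seen[second]
        if offset >= 1000:
            # Refuse to spill into the next second instead of silently colliding.
            raise ValueError(f"more than 1000 rows share {ts_text}")
        seen[second] += 1
        result.append((second + timedelta(milliseconds=offset), payload))
    return result
```

Rows within the same second receive .000, .001, .002 and so on in arrival
order. The sketch raises an error rather than guessing if more than 1000 rows
land in one second.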
>
>
> Some other log files that I am processing are already in milliseconds, 
> but there is a (small) chance of data loss if multiple requests happen 
> to be processed at the exact same time.
>
>
>
> I have been thinking about this some more and I think that rather than break 
> the IoTDB CRUD model I should handle this on the client side. In my use case 
> the log data is actually staged in an H2 database before it is sent to IoTDB 
> so I can enforce PK validation there. That way it is less expensive than 
> checking the timestamp in IoTDB for each record.
>
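Client-side primary-key validation in a staging database could look roughly
like this. Python's built-in sqlite3 stands in for H2 here purely so the
sketch is self-contained; the table staged_log, its columns, and the function
stage_rows are hypothetical names, not anything from the original setup.

```python
import sqlite3


def stage_rows(conn, rows):
    """Insert (device, ts_ms, line) rows; return the rows rejected as duplicates.

    The composite primary key (device, ts_ms) mirrors the Path + Timestamp key,
    so a colliding row fails in the staging table instead of overwriting data.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS staged_log ("
        " device TEXT NOT NULL,"
        " ts_ms INTEGER NOT NULL,"
        " line TEXT NOT NULL,"
        " PRIMARY KEY (device, ts_ms))"
    )
    rejected = []
    for row in rows:
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute("INSERT INTO staged_log VALUES (?, ?, ?)", row)
        except sqlite3.IntegrityError:
            # Primary-key violation: keep the row so the caller can remedy it.
            rejected.append(row)
    return rejected
```

The rejected rows can then be given a new timestamp or reported, before
anything is sent on to IoTDB.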
>
>
> Thanks
>
> Trevor Hart
>
>
>
>
>
>
>
>
> ---- On Fri, 17 May 2024 19:11:13 +1200 Jialin Qiao <qiaojia...@apache.org> 
> wrote ---
>
>
>
> Hi Trevor,
>
> Will the multiple values for the same timestamp be identical?
>
> 1. Same
> Time, Value
> 1, 1
> 1, 1
> 1, 1
>
> 2. Different
> Time, Value
> 1, 1
> 1, 2
> 1, 1
>
>
> Jialin Qiao
>
> Trevor Hart <tre...@ope.nz> wrote on Tue, May 14, 2024 at 11:20:
> >
> > Thank you! I will implement a workaround for now.
> >
> >
> > I would appreciate some consideration for this option in the future.
> >
> >
> > Thanks
> >
> > Trevor Hart
> >
> > Ope Limited
> >
> > w: http://www.ope.nz/
> >
> > m: +64212728039
> >
> >
> >
> >
> >
> >
> >
> >
> > ---- On Tue, 14 May 2024 15:17:47 +1200 Xiangdong Huang 
> > <saint...@gmail.com> wrote ---
> >
> >
> >
> > > 1. Checking before insert if the timestamp already exists and remedy on 
> > > the client before resend
> > > 2. Moving to Nanosecond and introducing some insignificant time value to 
> > > keep timestamp values unique.
> > Yes, these may be the best solutions for a specific application.
> >
> >
> > Analysis for IoTDB:
> > - Rejecting the write when receiving an existing timestamp in IoTDB is
> > time-costly (IoTDB needs to check historical data). I think we will
> > not check it until we find a low-latency method.
> > - Allowing multiple value versions for a timestamp may introduce a
> > chain reaction, and a lot of code may need to be modified, which is
> > a huge amount of work.
> >
> > There is a new idea (but I have no time to implement it...)
> > - Add a parameter in IoTDB: replace_strategy: first, last, avg, etc.
> > - When an existing timestamp arrives, IoTDB accepts it.
> > - When IoTDB runs an LSM merge and meets multiple values for a
> > timestamp, it handles them according to the replace_strategy.
> >
> > The solution may also introduce some work to do... and we need to
> > think carefully about the impact on the query process.
> > We need to survey whether this is a common requirement.
> >
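The replace_strategy idea can be illustrated outside IoTDB with a small
Python sketch. Nothing here is IoTDB code or an existing IoTDB setting: the
function merge_points is hypothetical and only shows how "first", "last", and
"avg" would resolve several values that arrived for one timestamp at merge
time.

```python
from statistics import mean


def merge_points(points, replace_strategy="last"):
    """Collapse (timestamp, value) pairs so each timestamp keeps one value.

    points are given in arrival order; replace_strategy is "first", "last",
    or "avg" (the strategy names proposed in the message above).
    """
    pickers = {
        "first": lambda values: values[0],
        "last": lambda values: values[-1],
        "avg": mean,
    }
    if replace_strategy not in pickers:
        raise ValueError(f"unknown replace_strategy: {replace_strategy}")
    grouped = {}
    for timestamp, value in points:
        # Accept every write; duplicates are only resolved at merge time.
        grouped.setdefault(timestamp, []).append(value)
    pick = pickers[replace_strategy]
    return [(ts, pick(values)) for ts, values in sorted(grouped.items())]
```

With "last" the sketch matches the current overwrite behaviour, while "first"
corresponds to keeping the original value and ignoring later duplicates.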
> > Best,
> > -----------------------------------
> > Xiangdong Huang
> >
> > Trevor Hart <tre...@ope.nz> wrote on Tue, May 14, 2024 at 09:55:
> > >
> > > Hello Yuan
> > >
> > >
> > >
> > > Correct, the first timestamp and values should be retained.
> > >
> > >
> > >
> > > I realise this does not align with the current design. I was just 
> > > asking whether there was an existing option to block 
> > > duplicates.
> > >
> > >
> > >
> > > In a normal RDBMS if you try to insert with a duplicate the insert will 
> > > fail with a PK violation. It would be great in some circumstances if 
> > > IoTDB at least had the option to fail this way.
> > >
> > >
> > >
> > > I am considering some options, such as:
> > >
> > >
> > >
> > > 1. Checking before insert if the timestamp already exists and remedy on 
> > > the client before resend
> > >
> > > 2. Moving to Nanosecond and introducing some insignificant time value to 
> > > keep timestamp values unique.
> > >
> > >
> > >
> > > I have already done something similar to #2 with storing IIS web log 
> > > files as they are recorded in seconds and not milliseconds.
> > >
> > >
> > >
> > > Thanks
> > >
> > > Trevor Hart
> > >
> > >
> > >
> > >
> > > ---- On Tue, 14 May 2024 13:29:02 +1200 Yuan Tian 
> > > <jackietie...@gmail.com> wrote ---
> > >
> > >
> > >
> > > Hi Trevor,
> > >
> > > By "rejects duplicates", you mean you want to keep the first duplicate
> > > timestamp and its corresponding values? (because the following duplicated
> > > ones will be rejected)
> > >
> > > Best regards,
> > > --------------------
> > > Yuan Tian
> > >
> > > On Mon, May 13, 2024 at 6:24 PM Trevor Hart 
> > > <tre...@ope.nz> wrote:
> > >
> > > >
> > > >
> > > >
> > > >
> > > > > Correct. I’m not disputing that. What I’m asking is that it would be
> > > > > good to have a configuration that either allows overwrites or rejects
> > > > > duplicates.
> > > > >
> > > > > My scenario is request log data from a server (the device). As it may
> > > > > be processing multiple requests at once, there is a chance that there
> > > > > could be colliding timestamps.
> > > > >
> > > > > As it stands now I would need to check if the timestamp exists before
> > > > > inserting the data, which obviously affects throughput.
> > > > >
> > > > > Thanks
> > > > > Trevor Hart
> > > > >
> > > > > ---- On Fri, 10 May 2024 00:33:40 +1200 Jialin Qiao
> > > > > <qiaojia...@apache.org> wrote ----
> > > > >
> > > > > Hi,
> > > > >
> > > > > In IoT or IIoT scenarios, we thought each data point represents a
> > > > > metric at a timestamp. In which case do you need to store duplicated
> > > > > values?
> > > > >
> > > > > Take this as an example:
> > > > >
> > > > > Time, root.sg1.car1.speed
> > > > > 1, 1
> > > > > 1, 2
> > > > >
> > > > > Could a car have different speeds at time 1?
> > > > >
> > > > > Jialin Qiao
> > > > >
> > > > > Yuan Tian <jackietie...@gmail.com> wrote on Thu, May 9, 2024 at 18:51:
> > > > > >
> > > > > > Hi Trevor,
> > > > > >
> > > > > > Right now we overwrite the duplicate timestamp with the newer one.
> > > > > > There is nothing we can do about it now.
> > > > > >
> > > > > > Best regards,
> > > > > > -------------------
> > > > > > Yuan Tian
> > > > > >
> > > > > > On Wed, May 8, 2024 at 5:31 PM Trevor Hart <tre...@ope.nz> wrote:
> > > > > >
> > > > > > > Hello
> > > > > > >
> > > > > > > I’m aware that when inserting a duplicate timestamp the values
> > > > > > > will be overwritten. This will obviously result in data loss.
> > > > > > >
> > > > > > > Is there a config/setting to reject or throw an error on
> > > > > > > duplicate inserts? Although highly unlikely I would prefer to be
> > > > > > > alerted to the situation rather than lose data.
> > > > > > >
> > > > > > > I read through the documentation but couldn’t find anything.
> > > > > > >
> > > > > > > Thanks
> > > > > > > Trevor Hart
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
