Hello Quan,

On 20/07/2021 17:24, Quan tran hong wrote:
> [...]
>
> SELECT threadId FROM threadtable WHERE username = 'quan' AND baseSubject =
> 'baseSubject1' AND mimeMessageId IN ('MimeMessageID2', 'MimeMessageID3')
> LIMIT 1 ALLOW FILTERING;

ALLOW FILTERING should not be used as it will result in a full scan and is thus a performance disaster.
If you need it, this means you do not have the right table structure and likely should rework the CREATE TABLE statement.

>
> => This new message should have this threadId.
> New unrelated message
>
> Assume that we do a query for a new unrelated message.
>
> SELECT threadId FROM threadtable WHERE username = 'quan' AND baseSubject =
> 'unrelatedBaseSubject' AND mimeMessageId IN ('MimeMessageID2',
> 'MimeMessageID3') LIMIT 1 ALLOW FILTERING;
>
> => This new message should have a new threadId.
> Insert new message data
>
> After having a threadId, we need to insert new message data into the thread
> table.
>
> insert into ThreadTable (messageId, threadId, username, mimeMessageId,
> baseSubject) values (now(), 02294fe1-e941-11eb-a8ee-77de5498f1fa, 'quan',
> 'MimeMessageID2', 'baseSubject1');
>
> insert into ThreadTable (messageId, threadId, username, mimeMessageId,
> baseSubject) values (now(), 02294fe1-e941-11eb-a8ee-77de5498f1fa, 'quan',
> 'MimeMessageID3', 'baseSubject1');
> Conclusion
>
> I think this data model complies with the needed request for the guessing
> algorithm problem, but it looks like there may still be room for
> improvement.

What Cassandra request do we use to delete the data in there?

>
>
> Best Regards,
>
> Quan
>
>
> On Mon, 19 Jul 2021 at 18:23, btell...@apache.org <
> btell...@apache.org> wrote:
>
>> Hello Quan,
>>
>> On 19/07/2021 17:59, Quan tran hong wrote:
>>> Hi,
>>> I am starting to implement ThreadIdGuessingAlgorithm for the distributed
>>> module. Because this is a breaking change and I am also new to Cassandra,
>>> I want to have some discussion with you about how to do this.
>> As long as we introduce a new table there is no reason for it to create a
>> breaking change, but getting the format right will ease our life down
>> the line.
>>> For the ones who did not catch up with this work, please have a look at
>>> JMAP Threads specs [1] and my work related to this [2].
>>>
>>> So my ideas on how to do this:
>>> - Add a Cassandra table holding the inputs needed by the threadId
>>> guessing algorithm. Maybe a table like:
>>> CREATE TABLE ThreadRelatedTable (
>>> threadId timeuuid,
>>> messageId timeuuid,
>>> mimeMessageIds SET<text>,
>>> subject text,
>>> PRIMARY KEY (mimeMessageIds, subject)
>>> );
>>> - Whenever we guess threadId for a new message, we access this table and do
>>> the matching query to get the related threadId (if there is one) or decide
>>> the new message should have a new threadId.
>>> - Whenever we save a new message, we save the thread-related data to this
>>> table.
>>>
>>> This is my first idea. Please express your thoughts about this.
>> Collections are an advanced data modeling tool that should be used with
>> caution. I am not sure using one in a PRIMARY KEY is a good idea. I am
>> not sure that does what you want (the full primary key should be
>> specified to know which node holds the data).
>>
>> Also, once you have found the messages related to a thread, you want to
>> validate that the subject matches. This can be done on the application
>> side (James), and avoids a complicated data model.
>>
>> I encourage you to validate your data model using a Cassandra in docker
>> and executing CQL commands locally with the CQLSH tool to simulate the
>> queries you wish to do, and learn about your data model before even
>> starting to implement it. IMO sharing CQL commands for creating the
>> table, inserting data in it, and retrieving data from it would be a
>> great follow up to this email.
>>
>> How would you populate the data of that table?
>>
>> Best regards,
>>
>> Benoit
>>> Best regards,
>>>
>>> Quan
>>>
>>> [1] https://jmap.io/spec-mail.html#threads
>>> [2] https://issues.apache.org/jira/browse/JAMES-3516
>>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: server-dev-unsubscr...@james.apache.org
>> For additional commands, e-mail: server-dev-h...@james.apache.org
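
To illustrate the remark about reworking the CREATE TABLE statement, here is a minimal sketch, not a validated design. The table name (thread_lookup_sketch) and the key layout are assumptions made for illustration: partitioning by (username, mimeMessageId) lets the lookup name its partitions directly, so ALLOW FILTERING is not needed. The columns are the ones from the quoted INSERT statements, and the baseSubject comparison is assumed to happen on the application side (James).

```sql
-- Hypothetical table name and key layout, for illustration only.
-- Partition key: (username, mimeMessageId); clustering column: messageId.
CREATE TABLE IF NOT EXISTS thread_lookup_sketch (
    username text,
    mimeMessageId text,
    messageId timeuuid,
    threadId timeuuid,
    baseSubject text,
    PRIMARY KEY ((username, mimeMessageId), messageId)
);

-- One row per (message, mimeMessageId) pair, as in the quoted inserts.
INSERT INTO thread_lookup_sketch (username, mimeMessageId, messageId, threadId, baseSubject)
VALUES ('quan', 'MimeMessageID2', now(), 02294fe1-e941-11eb-a8ee-77de5498f1fa, 'baseSubject1');

-- The partition key is fully specified (IN on its last column),
-- so this reads only the listed partitions and needs no ALLOW FILTERING.
-- baseSubject is returned so the application can compare it itself.
SELECT threadId, baseSubject FROM thread_lookup_sketch
WHERE username = 'quan' AND mimeMessageId IN ('MimeMessageID2', 'MimeMessageID3');
```

This is worth trying in cqlsh against a local Cassandra before relying on it. How rows would be deleted with this layout is still an open question.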