Re: [VOTE] CEP-8 Datastax Drivers Donation

2023-06-13 Thread DuyHai Doan
+1 nb On Tue, Jun 13, 2023 at 8:00 PM C. Scott Andreas wrote: > +1nb > > On Jun 13, 2023, at 10:25 AM, German Eichberger via dev < > dev@cassandra.apache.org> wrote: > > > + 1 > > Great to see this moving forward! > -- > *From:* Abe Ratnofsky > *Sent:* Tuesday,

Re: Vector search demo, and query syntax

2023-05-24 Thread DuyHai Doan
Hello all Sorry to disturb the discussion but there is an official announcement from Microsoft about CosmosDB supporting Vector Search https://devblogs.microsoft.com/cosmosdb/introducing-vector-search-in-azure-cosmos-db-for-mongodb-vcore/ Looks like Jonathan is spot on about this feature, it's

Re: Welcome Patrick McFadin as Cassandra Committer

2023-02-09 Thread DuyHai Doan
Congratulations to Patrick! After all those years serving the community, it is very well deserved! On Wed, Feb 8, 2023, 18:43, Mick Semb Wever wrote: > > Long overdue with so much you have done for so many years. Congrats! > > On Thu, 2 Feb 2023 at 23:26, Molly Monroy wrote: > >> Congrats,

Re: [VOTE] CEP-7: Storage Attached Index

2022-02-17 Thread DuyHai Doan
+1 nb On Fri, Feb 18, 2022 at 7:41 AM Berenguer Blasi wrote: > +1 > On 18/2/22 2:15, Jasonstack Zhao Yang wrote: > > +1 > > On Fri, 18 Feb 2022 at 08:15, Jeremy Hanna > wrote: > >> +1 nb. Thanks Caleb, Mike, Jason, and everyone involved with the effort. >> >> On Feb 17, 2022, at 4:23 PM, Caleb

Re: Implementing a secondary index

2021-11-17 Thread DuyHai Doan
Hello Claude, I wrote a blog post about 2nd index architecture a long time ago, but most of the content should still be relevant, so it is worth checking: https://www.doanduyhai.com/blog/?p=13191 Regards, Duy Hai DOAN On Wed, Nov 17, 2021 at 10:17, Claude Warren wrote: > Greetings, > > I am

Re: [DISCUSS] CEP-7 Storage Attached Index

2021-09-16 Thread DuyHai Doan
> operator) > tend to start with a strong, deeply thought out point of view >

Re: [RELEASE] Apache Cassandra 4.0.0 released

2021-07-26 Thread DuyHai Doan
Congrats!!! 3 years' worth of waiting and finally released!!! On Tue, Jul 27, 2021 at 12:02 AM Ben Slater wrote: > Congratulations and thank you to everyone involved in getting 4.0 released! > It has been a very impressive community effort. > > --- > > > *Ben Slater* *Chief Product Officer*

Re: [DISCUSS] CEP-7 Storage Attached Index

2020-08-18 Thread DuyHai Doan
Last but not least 4) Are collections, static columns, composite partition key components and UDT indexing (at any depth) on the roadmap of SAI? I strongly believe that those features are the bare minimum to make SAI an interesting replacement for the native 2nd index as well as SASI. SASI

Re: [DISCUSS] CEP-7 Storage Attached Index

2020-08-18 Thread DuyHai Doan
Thank you Zhao Yang for starting this topic. After reading the short design doc, I have a few questions: 1) SASI was pretty inefficient at indexing wide partitions because the index structure only retains the partition token, not the clustering columns. As per design doc SAI has row id mapping to

Re: [proposal] Introduce AssertJ in test framework

2020-03-10 Thread DuyHai Doan
Definitely +1. Coding in Java every day, I can't write tests without AssertJ. Once you get to know AssertJ, it's impossible to go back to the basic assertions of JUnit, which feel really boilerplate. On Tue, Mar 10, 2020 at 8:53 PM Jordan West wrote: > If it encourages more and higher quality test

Re: time for a release?

2019-10-04 Thread DuyHai Doan
+1 too (non binding) On Fri, Oct 4, 2019 at 10:33 PM Scott Andreas wrote: > > +1nb for me for the 3.x releases. > > The user-facing issues resolved in 2.2 are slimmer and relatively minor (just > 15225, 15050, 15045, 15041), but if it makes sense to release all three > together, sounds good to

Re: ACNA 2019 Schedule is out

2019-06-10 Thread DuyHai Doan
So sad, at least let's hope that the slides will be available online so people who can't attend will catch up Thanks On Sun, Jun 9, 2019 at 10:27 PM Nate McCall wrote: > The topics look really interesting. Do you know if it will be recorded for > > people to catch up later? > > > > Regards > >

Re: ACNA 2019 Schedule is out

2019-06-08 Thread DuyHai Doan
The topics look really interesting. Do you know if it will be recorded for people to catch up later? Regards On Sat, Jun 8, 2019 at 1:15 AM Nate McCall wrote: > Also, quick note that if you got an acceptance email from the ASF, but don't > see your track on the schedule yet, it's probably

Level N atomic updates for UDT necessary

2019-02-17 Thread DuyHai Doan
Hello devs, After CASSANDRA-7423 (Cassandra 3.6), it is possible to declare a non-frozen UDT at the 1st level and, more importantly, to atomically update individual fields of a UDT (without the need to rewrite the UDT entirely). This JIRA opens tremendous new opportunities in term

Re: Using Cassandra as local db without cluster

2018-10-18 Thread DuyHai Doan
I have an application whose purpose is to store a dynamic number of columns on each row (something that I cannot do with a classical relational database), ---> PostgreSQL allows you to use an array type or map type with a dynamic number of records, provided of course that the cardinality of the collection

Re: Roadmap for 4.0

2018-04-10 Thread DuyHai Doan
> I'd like to see pluggable storage and transient replica tickets land, for > starters. So after all the fuss and scandal about incremental repair and MV not stable and being downgraded to experimental, I would like to suggest that those new features are also flagged as experimental for some time

Re: Roadmap for 4.0

2018-04-02 Thread DuyHai Doan
My wish list:
* Add support for arithmetic operators (CASSANDRA-11935)
* Allow IN restrictions on column families with collections (CASSANDRA-12654)
* Add support for + and - operations on dates (CASSANDRA-11936)
* Add the currentTimestamp, currentDate, currentTime and currentTimeUUID functions

Re: Paying off tech debt and correctly naming things

2018-03-21 Thread DuyHai Doan
+10 There are 2 hard problems in CS: naming things and cache invalidation. On Mar 20, 2018, 23:04, "Jon Haddad" wrote: > Whenever I hop around in the codebase, one thing that always manages to > slow me down is needing to understand the context of the variable names > that

Re: Cassandra Needs to Grow Up by Version Five!

2018-02-21 Thread DuyHai Doan
So before buying any marketing claims from Microsoft or whoever, maybe you should try to use it extensively? And talking about backup, have a look at DynamoDB: http://i68.tinypic.com/n1b6yr.jpg From my POV, if a multi-billion company like Amazon doesn't get it right or can't make it easy for

Re: Cassandra 2017 Wrapup

2017-12-22 Thread DuyHai Doan
Thanks Jeff for the very comprehensive list of actions taken this year. Can't wait to put my hands on 4.0 once it's released On Fri, Dec 22, 2017 at 10:20 PM, Jeff Jirsa wrote: > Happy holidays all, > > I imagine most people are about to disappear to celebrate holidays, so I

Re: CASSANDRA-8527

2017-12-21 Thread DuyHai Doan
+1 to reporting range tombstones. This one is quite tricky to track indeed. +1 to Mockito too, with the caveat that it should be used wisely. On Thu, Dec 21, 2017 at 9:11 PM, Jon Haddad wrote: > I had suggested to Alex we kick this discussion over to the ML because the > change

Re: Cassandra pluggable storage engine (update)

2017-10-04 Thread DuyHai Doan
Excellent docs, thanks for the update Dikang. A question about a design choice, is there any technical reason to specify the storage engine at keyspace level rather than table level ? It's not overly complicated to move all tables sharing the same storage engine into the same keyspace but then

Re: Proposal to retroactively mark materialized views experimental

2017-10-02 Thread DuyHai Doan
to people and say “feature X is ready” > when it’s not. That’s a great way to get a reputation as “unstable” or > “not fit for production." > > Jon > > > > On Oct 2, 2017, at 11:54 AM, DuyHai Doan <doanduy...@gmail.com> wrote: > > > > "I wo

Re: Proposal to retroactively mark materialized views experimental

2017-10-02 Thread DuyHai Doan
"I would (in a patch release) disable MV CREATE statements, and emit warnings for ALTER statements and on schema load if they’re not explicitly enabled" --> I find this pretty extreme. Now we have an existing feature sitting there in the base code but forbidden from version xxx onward. Since

Re: Proposal to retroactively mark materialized views experimental

2017-10-01 Thread DuyHai Doan
parts of it, behind an experimental flag would have been > the right thing to do. It was a huge change that we're still finding issues > with 2 years later. > > On October 1, 2017 at 6:08:50 AM, DuyHai Doan (doanduy...@gmail.com) > wrote: > > How should we transition one fea

Re: Proposal to retroactively mark materialized views experimental

2017-10-01 Thread DuyHai Doan
How should we transition one feature from the "experimental" state to "production ready" state ? On which criteria ? On Sun, Oct 1, 2017 at 12:12 PM, Marcus Eriksson wrote: > I was just thinking that we should try really hard to avoid adding > experimental features - they

Re: Does partition size limitation still exists in Cassandra 3.10 given there is a B-tree implementation?

2017-05-11 Thread DuyHai Doan
Yes, the recommendation still applies. Wide partitions have a huge impact on repair (over streaming), compaction and bootstrap. On May 10, 2017, 23:54, "Kant Kodali" wrote: Hi All, Cassandra community had always been recommending 100MB per partition as a sweet spot however does

Re: Cassandra on RocksDB experiment result

2017-04-19 Thread DuyHai Doan
"I have no clue what it would take to accomplish a pluggable storage engine, but I love this idea." This was a long and old debate we had several times in the past. One of the difficulties of a pluggable storage engine is that we need to manage the differences between the LSMT of native C* and RocksDB

Re: Code quality, principles and rules

2017-03-16 Thread DuyHai Doan
"Otherwise it'll be difficult to write unit test cases." Having to pull in a dependency injection framework to make unit testing easier is generally a sign of a code design issue. With constructor injection & other techniques, there is more than enough to unit test your code without having to pull

Re: Hopefully Weekly Apache Cassandra JIRA Topics of Interest

2017-03-12 Thread DuyHai Doan
Static columns can already be indexed by custom 2nd index impl, for example SASI : https://issues.apache.org/jira/browse/CASSANDRA-11183 On Sun, Mar 12, 2017 at 10:40 PM, Jeff Jirsa wrote: > Hi folks, > > Cassandra JIRA is huge, active, and ever-changing. It's easy to miss >

Re: committing performance patches without performance numbers

2017-03-09 Thread DuyHai Doan
After looking at the patch, my thoughts (beware, it's getting very technical):

Original code:

-t = new ListType(elements, isMultiCell);
-ListType t2 = internMap.putIfAbsent(elements, t);
-t = (t2 == null) ? t : t2;

Optimized code:

+t =
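For readers unfamiliar with the idiom in the "Original code" lines, the following is a minimal, self-contained sketch of interning with `ConcurrentMap.putIfAbsent`. It is not Cassandra's `ListType` code: `InternDemo`, `Type`, and the `String` key are hypothetical stand-ins chosen for illustration, and the truncated "Optimized code" from the message is deliberately not reconstructed here.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical illustration of the putIfAbsent interning idiom; not Cassandra code.
final class InternDemo {
    // Hypothetical value type standing in for ListType.
    static final class Type {
        final String elements;
        Type(String elements) { this.elements = elements; }
    }

    // One canonical instance per key, shared across threads.
    private static final ConcurrentMap<String, Type> INTERN_MAP = new ConcurrentHashMap<>();

    static Type intern(String elements) {
        // Always allocate a candidate, as in the quoted original lines.
        Type t = new Type(elements);
        // putIfAbsent returns the previously mapped value, or null if ours was stored.
        Type t2 = INTERN_MAP.putIfAbsent(elements, t);
        // Keep our candidate only if no other instance was registered first.
        return (t2 == null) ? t : t2;
    }
}
```

In this form every call allocates a candidate object even when the key is already cached, which is the kind of cost a performance patch on such a code path would presumably target; whether that matches the actual patch is an assumption, since the message is cut off.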

Re: State of triggers

2017-03-05 Thread DuyHai Doan
h" <brs...@gmail.com>: > > > No. You just change the partitioner. That's all > > > > On 05.03.2017 09:15, "DuyHai Doan" <doanduy...@gmail.com> wrote: > > > >> "How can that be achieved? I haven't done "scientific researches" yet &

Re: State of triggers

2017-03-05 Thread DuyHai Doan
"How can that be achieved? I haven't done "scientific researches" yet but I guess a "MV partitioner" could do the trick. Instead of applying the regular partitioner, an MV partitioner would calculate the PK of the base table (which is always possible) and then apply the regular partitioner." The

Re: If reading from materialized view with a consistency level of quorum am I guaranteed to have the most recent view?

2017-02-10 Thread DuyHai Doan
See my blog post to understand how MV is implemented: http://www.doanduyhai.com/blog/?p=1930 On Fri, Feb 10, 2017 at 7:48 PM, Benjamin Roth wrote: > Same partition key: > > PRIMARY KEY ((a, b), c, d) and > PRIMARY KEY ((a, b), d, c) > > PRIMARY KEY ((a), b, c) and >

Re: Why does CockroachDB github website say Cassandra has no Availability on datacenter failure?

2017-02-07 Thread DuyHai Doan
The link you posted doesn't say anything about Cassandra. On Feb 7, 2017, 11:41, "Kant Kodali" wrote: > Why does CockroachDB github website say Cassandra has no Availability on > datacenter failure? > > https://github.com/cockroachdb/cockroach >

Re: Rough roadmap for 4.0

2016-11-17 Thread DuyHai Doan
Be very careful, there is a serious bug about AND/OR semantics, not solved yet and not going to be solved any time soon: https://issues.apache.org/jira/browse/CASSANDRA-12674 On Thu, Nov 17, 2016 at 7:32 PM, Jeff Jirsa wrote: > > We’ll be voting in the very near future on

Re: Is there a way to do Read and Set at Cassandra level?

2016-11-05 Thread DuyHai Doan
2 AM, Kant Kodali <k...@peernova.com> wrote: > But then don't I need to evict for every batch of writes? I thought cache > would make sense when reads/writes > 1 per say. What do you think? > > On Sat, Nov 5, 2016 at 3:33 AM, DuyHai Doan <doanduy...@gmail.com> wrote: > >&

Re: Is there a way to do Read and Set at Cassandra level?

2016-11-05 Thread DuyHai Doan
"I have a requirement where I need to know last value that is written successfully so I could read that value and do some computation and include it in the subsequent write" Maybe keeping the last written value in a distributed cache is cheaper than doing a read before write in Cassandra ? On

Re: Is SASI index in Cassandra efficient for high cardinality columns?

2016-10-21 Thread DuyHai Doan
need to scan the whole > cluster atleast for exact matches. I understand if it is a substring search > then there will 2^n substrings which equates to 2^n hashes/tokens which can > be a lot! > > On Sat, Oct 15, 2016 at 4:35 AM, DuyHai Doan <doanduy...@gmail.com> wrote: > > &

Re: Is SASI index in Cassandra efficient for high cardinality columns?

2016-10-15 Thread DuyHai Doan
it is #2 and it is just one matching row in my case. > > > > On Sat, Oct 15, 2016 at 2:40 AM, DuyHai Doan <doanduy...@gmail.com> wrote: > > > Define precisely what you mean by "high cardinality columns". Do you > mean: > > > > 1) a single indexe

Re: Is SASI index in Cassandra efficient for high cardinality columns?

2016-10-15 Thread DuyHai Doan
Define precisely what you mean by "high cardinality columns". Do you mean: 1) a single indexed value is present in a lot of rows 2) a single indexed value has only a few (if not just one) matching row On Sat, Oct 15, 2016 at 8:37 AM, Kant Kodali wrote: > I understand

Re: Question regd CDC in cassandra 3.7+

2016-10-08 Thread DuyHai Doan
You need to use the CommitLogReader, there is no way to access CDC data using CQL. I'm not even sure it will be possible one day On Fri, Oct 7, 2016 at 11:19 PM, sridhar nemani wrote: > Hi, > > > > I am fairly new to Cassandra. I have a requirement to be able to read

Re: [VOTE] Release Apache Cassandra 3.9

2016-10-01 Thread DuyHai Doan
Java driver 3.1.0 for support of the latest features of Cassandra 3.x (PER PARTITION LIMIT, JSON support ...) Java driver 3.0.4 is also compatible with Cassandra 3.x but has less support for bleeding edge features On Sat, Oct 1, 2016 at 4:29 PM, Jonathan Ekwempu wrote: >

Re: Question on assert

2016-09-21 Thread DuyHai Doan
I found that SO question very interesting to fuel the discussion about assert vs exception : http://stackoverflow.com/questions/1957645/when-to-use-an-assertion-and-when-to-use-an-exception On Wed, Sep 21, 2016 at 8:20 PM, Michael Kjellman < mkjell...@internalcircle.com> wrote: > Yeah, I

Re: Github pull requests

2016-08-28 Thread DuyHai Doan
Just a question before moving forward, do we want to trigger CI build for each proposed pull request on Github ? And if yes, on which infra ? I'm asking because opening PR to the crowd is a good thing but we may have tons of PRs coming and the infra may be heavily loaded. Apache Zeppelin project

Re: Distributed masterless architecture

2016-08-24 Thread DuyHai Doan
You can read this blog post, there are a handful of interesting links: http://the-paper-trail.org/blog/distributed-systems-theory-for-the-distributed-systems-engineer/ On Wed, Aug 24, 2016 at 1:45 PM, Salih Gedik wrote: > Hi everyone, > I am an undergrad student and working on a

Re: Using writetime in CAS Lightweight transactions

2016-05-11 Thread DuyHai Doan
It is not (yet) possible to use functions in LWT predicates. LWT only supports = and != plus IF (NOT) EXISTS right now. On May 11, 2016, 16:50, "Bhuvan Rawal" wrote: > Hi, > > I was working on maintaining counters in cassandra, I read that it is not > 100% accurate, I have

Re: Datastax including Cassandra 3.0

2016-05-10 Thread DuyHai Doan
The next major release of Datastax Enterprise 5.0 will include C* 3.x. It is expected to be released somewhere around June/July, no exact date has been announced yet. On May 10, 2016, 09:08, "Joly, Josselin" wrote: > Hello, > > I am wondering if the version of Cassandra 3.0