Distributed masterless architecture
Hi everyone,

I am an undergrad student working on a simple distributed database for learning purposes. I was wondering if you could give me some tips about designing and coding a distributed system with no master nodes. For instance, which classes should I be looking at in the source code? I am sorry if this is not the right place.

Thank you so much!

Best regards
--
Salih Gedik
Re: CASSANDRA-9143
Agreed, I’d rather discuss the details on JIRA. It might be nice to send another email describing whatever conclusion we come to, after we have everything hashed out.

> On Aug 24, 2016, at 4:09 PM, Paulo Motta wrote:
>
> Thanks for sharing this! I added some comments/suggestions on the ticket
> for those interested.
>
> On a side note, it's still not clear if we should do the discussion here on
> the dev list or just call attention to a particular issue/ticket and then
> continue discussion on JIRA, but I find the latter more appropriate to
> avoid spamming those not interested, and only update here if there are new
> developments in the ticket direction.
>
> 2016-08-24 18:35 GMT-03:00 Blake Eggleston:
>
>> Hi everyone,
>>
>> I just posted a proposed solution to some issues with incremental repair
>> in CASSANDRA-9143. The solution involves non-trivial changes to the way
>> incremental repair works, so I’m giving it a shout out on the dev list in
>> the spirit of increasing the flow of information here.
>>
>> Summary of problem:
>>
>> Anticompaction excludes sstables that have been, or are, compacting.
>> Anticompactions can also fail on a single machine due to any number of
>> reasons. In either of these scenarios, a potentially large amount of data
>> will be marked as unrepaired on one machine that’s marked as repaired on
>> the others. During the next incremental repair, this potentially large
>> amount of data will be unnecessarily streamed out to the other nodes,
>> because it won’t be in their unrepaired data.
>>
>> Proposed solution:
>>
>> Add a ‘pending repair’ bucket to the existing repaired and unrepaired
>> sstable buckets. We do the anticompaction up front, but put the
>> anticompacted data into the pending bucket. From here, the repair proceeds
>> normally against the pending sstables, with the streamed sstables also
>> going into the pending buckets. Once all nodes have completed streaming,
>> the pending sstables are moved into the repaired bucket, or back into
>> unrepaired if there’s a failure.
>>
>> - Blake
Re: CASSANDRA-9143
Thanks for sharing this! I added some comments/suggestions on the ticket for those interested.

On a side note, it's still not clear if we should do the discussion here on the dev list or just call attention to a particular issue/ticket and then continue discussion on JIRA, but I find the latter more appropriate to avoid spamming those not interested, and only update here if there are new developments in the ticket direction.

2016-08-24 18:35 GMT-03:00 Blake Eggleston:

> Hi everyone,
>
> I just posted a proposed solution to some issues with incremental repair
> in CASSANDRA-9143. The solution involves non-trivial changes to the way
> incremental repair works, so I’m giving it a shout out on the dev list in
> the spirit of increasing the flow of information here.
>
> Summary of problem:
>
> Anticompaction excludes sstables that have been, or are, compacting.
> Anticompactions can also fail on a single machine due to any number of
> reasons. In either of these scenarios, a potentially large amount of data
> will be marked as unrepaired on one machine that’s marked as repaired on
> the others. During the next incremental repair, this potentially large
> amount of data will be unnecessarily streamed out to the other nodes,
> because it won’t be in their unrepaired data.
>
> Proposed solution:
>
> Add a ‘pending repair’ bucket to the existing repaired and unrepaired
> sstable buckets. We do the anticompaction up front, but put the
> anticompacted data into the pending bucket. From here, the repair proceeds
> normally against the pending sstables, with the streamed sstables also
> going into the pending buckets. Once all nodes have completed streaming,
> the pending sstables are moved into the repaired bucket, or back into
> unrepaired if there’s a failure.
>
> - Blake
CASSANDRA-9143
Hi everyone,

I just posted a proposed solution to some issues with incremental repair in CASSANDRA-9143. The solution involves non-trivial changes to the way incremental repair works, so I’m giving it a shout out on the dev list in the spirit of increasing the flow of information here.

Summary of problem:

Anticompaction excludes sstables that have been, or are, compacting. Anticompactions can also fail on a single machine for any number of reasons. In either of these scenarios, a potentially large amount of data will be marked as unrepaired on one machine while being marked as repaired on the others. During the next incremental repair, this potentially large amount of data will be unnecessarily streamed out to the other nodes, because it won’t be in their unrepaired data.

Proposed solution:

Add a ‘pending repair’ bucket to the existing repaired and unrepaired sstable buckets. We do the anticompaction up front, but put the anticompacted data into the pending bucket. From here, the repair proceeds normally against the pending sstables, with the streamed sstables also going into the pending bucket. Once all nodes have completed streaming, the pending sstables are moved into the repaired bucket, or back into unrepaired if there’s a failure.

- Blake
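The three-bucket lifecycle described above can be sketched as a small state machine. To be clear, the class and method names below are illustrative only and are not Cassandra's actual types or APIs; the sketch just models the transitions: anticompacted and streamed sstables go into a pending bucket, and on completion everything pending is promoted to repaired, or demoted back to unrepaired if any node failed.

```java
import java.util.*;

// Illustrative model only -- not Cassandra's real classes.
enum RepairBucket { UNREPAIRED, PENDING, REPAIRED }

class SSTable {
    RepairBucket bucket = RepairBucket.UNREPAIRED;
}

class IncrementalRepairSession {
    private final List<SSTable> pending = new ArrayList<>();

    // Anticompaction happens up front; the anticompacted sstables are
    // parked in the pending bucket rather than marked repaired immediately.
    void anticompact(Collection<SSTable> candidates) {
        for (SSTable t : candidates) {
            t.bucket = RepairBucket.PENDING;
            pending.add(t);
        }
    }

    // SSTables streamed in during the repair also land in the pending bucket.
    void receiveStreamed(SSTable t) {
        t.bucket = RepairBucket.PENDING;
        pending.add(t);
    }

    // Once every node has finished streaming, promote pending -> repaired;
    // if anything failed, demote everything back to unrepaired so no node
    // ends up with data marked repaired that its peers consider unrepaired.
    void finish(boolean allNodesSucceeded) {
        RepairBucket target = allNodesSucceeded ? RepairBucket.REPAIRED
                                                : RepairBucket.UNREPAIRED;
        for (SSTable t : pending)
            t.bucket = target;
        pending.clear();
    }
}
```

The point of the pending bucket is that the failure path is symmetric across nodes: either all participants promote to repaired together, or all fall back to unrepaired, avoiding the mismatch that causes the over-streaming described above.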
Failing tests 2016-08-24 [cassandra-3.9]
=== testall: All passed!

=== dtest: 2 failures

scrub_test.TestScrubIndexes.test_standalone_scrub
CASSANDRA-12337. I've root-caused this; the failure is cosmetic but user-facing, so I plan on fixing this soon.

commitlog_test.TestCommitLog.test_commitlog_replay_on_startup
CASSANDRA-12213. This is still being analyzed.

=== novnode: All passed!

=== upgrade: All passed!

While this is somewhat due to the stars aligning such that none of our flaky tests failed this run, it is very exciting to see an upgrade test run with 0 failures. This is 50+ fewer failures than two weeks ago.
Failing tests 2016-08-23 [cassandra-3.9]
=== testall: All passed!

=== dtest: 1 failure

materialized_views_test.TestMaterializedViews.add_dc_after_mv_network_replication_test
CASSANDRA-12140. Known issue, still needs to be solved.

=== novnode: All passed!

=== upgrade: 1 failure

upgrade_tests.paging_test.TestPagingDataNodes2RF1_Upgrade_current_2_2_x_To_indev_3_x.static_columns_paging_test
CASSANDRA-11195. This issue still needs to be analyzed and fixed.

Overall, today looked very good. We're seeing a fairly static long tail of challenging issues that are still in progress. I opened CASSANDRA-12528 to fix the outstanding eclipse-warning problems that are presently failing testall jobs on 2.2, 3.0, 3.9, and trunk. If you are interested, feel free to assign the issue to yourself.
3.8/3.9 releases/branch freeze, current merge order
TL;DR: the cassandra-3.8 branch is dead; cassandra-3.9 is frozen unless you are committing the fix for #12140 or #12528. For everything else, go cassandra-3.0 -> trunk.

There has been some confusion regarding the current branch merge order that I’d like to clarify.

As you’ve seen from Joel’s last email, we are close to full Code Green status on the cassandra-3.9 branch, with one dtest and one upgrade test failing. As soon as those two issues are resolved, we’ll be starting off the long-delayed 3.8+3.9 votes.

What does this mean for the merge order? It means that unless you are committing the fix for CASSANDRA-12140 (the one failing dtest), or the fix for CASSANDRA-12528 (the one failing upgrade test), you should skip the cassandra-3.9 branch altogether and merge directly into trunk (to become 3.10 eventually). For all other tickets, consider the branch to be frozen.

On a related note, the cassandra-3.8 branch is dead and should be skipped altogether.

--
AY
Re: 3.8/3.9 releases/branch freeze, current merge order
Correction: s/12528/11195/g. I’m an idiot who cannot copy-paste.

Also, the cassandra-3.8 branch was removed from the repo, to further minimise confusion.

--
AY

On 24 August 2016 at 16:25:21, Aleksey Yeschenko (alek...@apache.org) wrote:

> TL;DR: cassandra-3.8 branch is dead; cassandra-3.9 is frozen, unless you
> are committing the fix for #12140 or #12528. For everything else go
> cassandra-3.0 -> trunk.
>
> There has been some confusion regarding the current branch merge order that
> I’d like to clarify.
>
> As you’ve seen from Joel’s last email, we are close to full Code Green
> status on casandra-3.9 branch, with one dtest and one upgrade test failing.
> As soon as those two issues are resolved, we’ll be starting off the long
> delayed 3.8+3.9 votes.
>
> What does it mean for the merge order? It means that unless you are
> committing the fix for CASSANDRA-12140 (the one failing dtest), or the fix
> for CASSANDRA-12528 (the one failing upgrade test), you should skip
> cassandra-3.9 branch altogether and merge directly into trunk (to become
> 3.10 eventually). For all other tickets consider the branch to be frozen.
>
> On a related note, cassandra-3.8 branch is dead, and should be skipped
> altogether.
>
> --
> AY
Re: Distributed masterless architecture
You can read this blog post; it contains a handful of interesting links:

http://the-paper-trail.org/blog/distributed-systems-theory-for-the-distributed-systems-engineer/

On Wed, Aug 24, 2016 at 1:45 PM, Salih Gedik wrote:

> Hi everyone,
> I am an undergrad student and working on a simple distributed database for
> learning purposes. I was wondering if you guys can give me tips about
> designing and coding distributed no master nodes. For instance what classes
> should I be looking for in source code? I am so sorry if this is not the
> right place.
> Thank you so much!
>
> Best regards
> --
> Salih Gedik
Re: 3.8/3.9 releases/branch freeze, current merge order
No. Removing a dead branch is just mindless admin work.

As for 3.8/3.9 plans, look up the previous quite lengthy vote discussion on 3.8, on dev.

--
AY

On 24 August 2016 at 20:23:04, Mark Thomas (ma...@apache.org) wrote:

> On 24/08/2016 16:44, Aleksey Yeschenko wrote:
>
>> Also, cassandra-3.8 branch was removed from the repo, to further minimise
>> confusion.
>
> That is the sort of thing I'd expect to see discussed on the dev list
> first. Where is that discussion?
>
> Mark
Re: 3.8/3.9 releases/branch freeze, current merge order
On 24/08/2016 20:26, Aleksey Yeschenko wrote:
> No. Removing a dead branch is just mindless admin work.
>
> As for 3.8/3.9 plans, look up the previous quite lengthy vote discussion
> on 3.8, on dev.

Thanks. Found it. Just need to go back a little further in the archive.

Mark

> --
> AY
>
> On 24 August 2016 at 20:23:04, Mark Thomas (ma...@apache.org) wrote:
>
>> On 24/08/2016 16:44, Aleksey Yeschenko wrote:
>>
>>> Also, cassandra-3.8 branch was removed from the repo, to further
>>> minimise confusion.
>>
>> That is the sort of thing I'd expect to see discussed on the dev list
>> first. Where is that discussion?
>>
>> Mark
Re: 3.8/3.9 releases/branch freeze, current merge order
No worries. It was a somewhat… messy thread. And it’s taken us a while to get the tests to this level, so it’s now fairly far back in the archives.

--
AY

On 24 August 2016 at 20:43:39, Mark Thomas (ma...@apache.org) wrote:

> On 24/08/2016 20:26, Aleksey Yeschenko wrote:
>> No. Removing a dead branch is just mindless admin work.
>>
>> As for 3.8/3.9 plans, look up the previous quite lengthy vote discussion
>> on 3.8, on dev.
>
> Thanks. Found it. Just need to go back a little further in the archive.
>
> Mark
>
>> --
>> AY
>>
>> On 24 August 2016 at 20:23:04, Mark Thomas (ma...@apache.org) wrote:
>>
>>> On 24/08/2016 16:44, Aleksey Yeschenko wrote:
>>>
>>>> Also, cassandra-3.8 branch was removed from the repo, to further
>>>> minimise confusion.
>>>
>>> That is the sort of thing I'd expect to see discussed on the dev list
>>> first. Where is that discussion?
>>>
>>> Mark
Re: 3.8/3.9 releases/branch freeze, current merge order
On 24/08/2016 16:44, Aleksey Yeschenko wrote:
> Also, cassandra-3.8 branch was removed from the repo, to further minimise
> confusion.

That is the sort of thing I'd expect to see discussed on the dev list first. Where is that discussion?

Mark
Re: 3.8/3.9 releases/branch freeze, current merge order
It's basically just removing a tag, nothing more. Completely trivial.

---

On 2016-08-24 15:22, Mark Thomas wrote:
> On 24/08/2016 16:44, Aleksey Yeschenko wrote:
>> Also, cassandra-3.8 branch was removed from the repo, to further minimise
>> confusion.
>
> That is the sort of thing I'd expect to see discussed on the dev list
> first. Where is that discussion?
>
> Mark
Re: Distributed masterless architecture
Thanks for the resources!

On 24.08.2016 21:27, DuyHai Doan wrote:
> You can read this blog post, there are a handful of interesting links:
> http://the-paper-trail.org/blog/distributed-systems-theory-for-the-distributed-systems-engineer/
>
> On Wed, Aug 24, 2016 at 1:45 PM, Salih Gedik wrote:
>> Hi everyone,
>> I am an undergrad student and working on a simple distributed database for
>> learning purposes. I was wondering if you guys can give me tips about
>> designing and coding distributed no master nodes. For instance what
>> classes should I be looking for in source code? I am so sorry if this is
>> not the right place.
>> Thank you so much!
>>
>> Best regards
>> --
>> Salih Gedik

--
Salih Gedik