Distributed masterless architecture

2016-08-24 Thread Salih Gedik
Hi everyone, 
I am an undergrad student working on a simple distributed database for 
learning purposes. I was wondering if you could give me some tips on designing 
and coding a distributed database with no master nodes. For instance, what 
classes should I be looking at in the source code? I'm sorry if this is not 
the right place. 
Thank you so much!

 Best regards
-- 
Salih Gedik


Re: CASSANDRA-9143

2016-08-24 Thread Blake Eggleston
Agreed, I’d rather discuss the details on JIRA. It might be nice to send 
another email describing whatever conclusion we come to, after we have 
everything hashed out.


> On Aug 24, 2016, at 4:09 PM, Paulo Motta  wrote:
> 
> Thanks for sharing this! I added some comments/suggestions on the ticket
> for those interested.
> 
> On a side note, it's still not clear if we should have the discussion here
> on the dev list or just call attention to a particular issue/ticket and then
> continue the discussion on JIRA, but I find the latter more appropriate to
> avoid spamming those not interested, and only post updates here if there are
> new developments on the ticket.
> 
> 2016-08-24 18:35 GMT-03:00 Blake Eggleston :
> 
>> Hi everyone,
>> 
>> I just posted a proposed solution to some issues with incremental repair
>> in CASSANDRA-9143. The solution involves non-trivial changes to the way
>> incremental repair works, so I’m giving it a shout out on the dev list in
>> the spirit of increasing the flow of information here.
>> 
>> Summary of problem:
>> 
>> Anticompaction excludes sstables that have been, or are, compacting.
>> Anticompactions can also fail on a single machine due to any number of
>> reasons. In either of these scenarios, a potentially large amount of data
>> will be marked as unrepaired on one machine that’s marked as repaired on
>> the others. During the next incremental repair, this potentially large
>> amount of data will be unnecessarily streamed out to the other nodes,
>> because it won’t be in their unrepaired data.
>> 
>> Proposed solution:
>> 
>> Add a ‘pending repair’ bucket to the existing repaired and unrepaired
>> sstable buckets. We do the anticompaction up front, but put the
>> anticompacted data into the pending bucket. From here, the repair proceeds
>> normally against the pending sstables, with the streamed sstables also
>> going into the pending buckets. Once all nodes have completed streaming,
>> the pending sstables are moved into the repaired bucket, or back into
>> unrepaired if there’s a failure.
>> 
>> - Blake



Re: CASSANDRA-9143

2016-08-24 Thread Paulo Motta
Thanks for sharing this! I added some comments/suggestions on the ticket
for those interested.

On a side note, it's still not clear if we should have the discussion here on
the dev list or just call attention to a particular issue/ticket and then
continue the discussion on JIRA, but I find the latter more appropriate to
avoid spamming those not interested, and only post updates here if there are
new developments on the ticket.

2016-08-24 18:35 GMT-03:00 Blake Eggleston :

> Hi everyone,
>
> I just posted a proposed solution to some issues with incremental repair
> in CASSANDRA-9143. The solution involves non-trivial changes to the way
> incremental repair works, so I’m giving it a shout out on the dev list in
> the spirit of increasing the flow of information here.
>
> Summary of problem:
>
> Anticompaction excludes sstables that have been, or are, compacting.
> Anticompactions can also fail on a single machine due to any number of
> reasons. In either of these scenarios, a potentially large amount of data
> will be marked as unrepaired on one machine that’s marked as repaired on
> the others. During the next incremental repair, this potentially large
> amount of data will be unnecessarily streamed out to the other nodes,
> because it won’t be in their unrepaired data.
>
> Proposed solution:
>
> Add a ‘pending repair’ bucket to the existing repaired and unrepaired
> sstable buckets. We do the anticompaction up front, but put the
> anticompacted data into the pending bucket. From here, the repair proceeds
> normally against the pending sstables, with the streamed sstables also
> going into the pending buckets. Once all nodes have completed streaming,
> the pending sstables are moved into the repaired bucket, or back into
> unrepaired if there’s a failure.
>
> - Blake


CASSANDRA-9143

2016-08-24 Thread Blake Eggleston
Hi everyone,

I just posted a proposed solution to some issues with incremental repair in 
CASSANDRA-9143. The solution involves non-trivial changes to the way 
incremental repair works, so I’m giving it a shout out on the dev list in the 
spirit of increasing the flow of information here.

Summary of problem:

Anticompaction excludes sstables that have been, or are, compacting. 
Anticompactions can also fail on a single machine due to any number of reasons. 
In either of these scenarios, a potentially large amount of data will be marked 
as unrepaired on one machine that’s marked as repaired on the others. During 
the next incremental repair, this potentially large amount of data will be 
unnecessarily streamed out to the other nodes, because it won’t be in their 
unrepaired data.

Proposed solution:

Add a ‘pending repair’ bucket to the existing repaired and unrepaired sstable 
buckets. We do the anticompaction up front, but put the anticompacted data into 
the pending bucket. From here, the repair proceeds normally against the pending 
sstables, with the streamed sstables also going into the pending buckets. Once 
all nodes have completed streaming, the pending sstables are moved into the 
repaired bucket, or back into unrepaired if there’s a failure.

- Blake
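The bucket lifecycle proposed above amounts to a small state machine. Here is a minimal illustrative sketch in Python; the names (`Bucket`, `anticompact`, `finish_repair`) are invented for this sketch and are not Cassandra's actual classes or APIs:

```python
from enum import Enum

class Bucket(Enum):
    UNREPAIRED = "unrepaired"
    PENDING = "pending repair"
    REPAIRED = "repaired"

class SSTable:
    def __init__(self, name):
        self.name = name
        self.bucket = Bucket.UNREPAIRED

def anticompact(sstables):
    # Anticompaction happens up front: candidate sstables move into the
    # pending bucket before any streaming starts.
    for s in sstables:
        s.bucket = Bucket.PENDING

def finish_repair(sstables, all_nodes_streamed_ok):
    # Once every node has finished streaming, pending sstables are promoted
    # to repaired; on any failure they fall back to unrepaired.
    target = Bucket.REPAIRED if all_nodes_streamed_ok else Bucket.UNREPAIRED
    for s in sstables:
        if s.bucket is Bucket.PENDING:
            s.bucket = target

tables = [SSTable("a"), SSTable("b")]
anticompact(tables)
finish_repair(tables, all_nodes_streamed_ok=True)
print([t.bucket.name for t in tables])  # ['REPAIRED', 'REPAIRED']

failed = [SSTable("c")]
anticompact(failed)
finish_repair(failed, all_nodes_streamed_ok=False)
print(failed[0].bucket.name)  # UNREPAIRED
```

The point of the pending bucket is that a failure no longer leaves one node's repaired set out of sync with the others, so the next incremental repair does not needlessly re-stream that data.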

Failing tests 2016-08-24 [cassandra-3.9]

2016-08-24 Thread Joel Knighton
===
testall: All passed!

===
dtest: 2 failures
  scrub_test.TestScrubIndexes.test_standalone_scrub
CASSANDRA-12337. I've root-caused this; the failure is cosmetic
but user-facing, so I plan on fixing this soon.

  commitlog_test.TestCommitLog.test_commitlog_replay_on_startup
CASSANDRA-12213. This is still being analyzed.

===
novnode: All passed!

===
upgrade: All passed!

While this is partly the stars aligning, in that none of our flaky tests
failed this run, it is very exciting to see an upgrade test run with 0
failures. This is 50+ fewer failures than two weeks ago.


Failing tests 2016-08-23 [cassandra-3.9]

2016-08-24 Thread Joel Knighton
===
testall: All passed!

===
dtest: 1 failure
  materialized_views_test.TestMaterializedViews
  .add_dc_after_mv_network_replication_test
CASSANDRA-12140. Known issue, still needs to be solved.

===
novnode: All passed!

===
upgrade: 1 failure
  upgrade_tests.paging_test
  .TestPagingDataNodes2RF1_Upgrade_current_2_2_x_To_indev_3_x
  .static_columns_paging_test
  CASSANDRA-11195. This issue still needs to be analyzed and
  fixed.


Overall, today looked very good. We're seeing a fairly static long tail
of challenging issues that are still in progress. I opened
CASSANDRA-12528 to fix the outstanding eclipse-warning
problems that are presently failing testall jobs on 2.2, 3.0, 3.9, and
trunk. If you are interested, feel free to assign the issue to yourself.


3.8/3.9 releases/branch freeze, current merge order

2016-08-24 Thread Aleksey Yeschenko
TL;DR: cassandra-3.8 branch is dead; cassandra-3.9 is frozen, unless you are 
committing the fix for #12140 or #12528.
For everything else go cassandra-3.0 -> trunk.

There has been some confusion regarding the current branch merge order that I’d 
like to clarify.

As you’ve seen from Joel’s last email, we are close to full Code Green status 
on the cassandra-3.9 branch, with one dtest and one upgrade test failing.

As soon as those two issues are resolved, we’ll be starting off the 
long-delayed 3.8+3.9 votes.

What does it mean for the merge order? It means that unless you are committing 
the fix for CASSANDRA-12140 (the one failing dtest),
or the fix for CASSANDRA-12528 (the one failing upgrade test), you should skip 
cassandra-3.9 branch altogether and merge directly
into trunk (to become 3.10 eventually).

For all other tickets consider the branch to be frozen. On a related note, 
cassandra-3.8 branch is dead, and should be skipped altogether.

-- 
AY
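The merge path described above can be demonstrated end to end in a scratch repository. The commands are plain git; the repository, commit messages, and file names are invented for illustration, and only the branch names mirror the email:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "dev@example.com"
git config user.name "dev"

# Shared history that both branches start from.
echo base > file && git add file && git commit -qm "base"
git branch trunk                      # will eventually become 3.10
git checkout -q -b cassandra-3.0

# A fix for an ordinary (non-freeze-exempt) ticket lands on cassandra-3.0.
echo fix >> file && git commit -qam "some non-freeze bug fix"

# Freeze rule: skip cassandra-3.8 (deleted) and cassandra-3.9 (frozen);
# merge cassandra-3.0 directly into trunk.
git checkout -q trunk
git merge -q cassandra-3.0
git log --oneline
```

Only fixes for the two freeze-exempt tickets would take the longer path through cassandra-3.9 before reaching trunk.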

Re: 3.8/3.9 releases/branch freeze, current merge order

2016-08-24 Thread Aleksey Yeschenko
Correction: s/12528/11195/g. I’m an idiot who cannot copy-paste.

Also, cassandra-3.8 branch was removed from the repo, to further minimise 
confusion.

-- 
AY

On 24 August 2016 at 16:25:21, Aleksey Yeschenko (alek...@apache.org) wrote:

TL;DR: cassandra-3.8 branch is dead; cassandra-3.9 is frozen, unless you are 
committing the fix for #12140 or #12528.
For everything else go cassandra-3.0 -> trunk.

There has been some confusion regarding the current branch merge order that I’d 
like to clarify.

As you’ve seen from Joel’s last email, we are close to full Code Green status 
on the cassandra-3.9 branch, with one dtest and one upgrade test failing.

As soon as those two issues are resolved, we’ll be starting off the 
long-delayed 3.8+3.9 votes.

What does it mean for the merge order? It means that unless you are committing 
the fix for CASSANDRA-12140 (the one failing dtest),
or the fix for CASSANDRA-12528 (the one failing upgrade test), you should skip 
cassandra-3.9 branch altogether and merge directly
into trunk (to become 3.10 eventually).

For all other tickets consider the branch to be frozen. On a related note, 
cassandra-3.8 branch is dead, and should be skipped altogether.

-- 
AY

Re: Distributed masterless architecture

2016-08-24 Thread DuyHai Doan
You can read this blog post, there are a handful of interesting links:
http://the-paper-trail.org/blog/distributed-systems-theory-for-the-distributed-systems-engineer/

On Wed, Aug 24, 2016 at 1:45 PM, Salih Gedik  wrote:

> Hi everyone,
> I am an undergrad student working on a simple distributed database for
> learning purposes. I was wondering if you could give me some tips on
> designing and coding a distributed database with no master nodes. For
> instance, what classes should I be looking at in the source code? I'm sorry
> if this is not the right place.
> Thank you so much!
>
>  Best regards
> --
> Salih Gedik
>
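Since the question concerns Cassandra-like masterless designs: the core idea is that every node is a peer, and data placement is computed locally from a consistent-hash ring, so no master is needed to route requests. A toy sketch follows; the class and node names are invented, and md5 stands in for Cassandra's Murmur3 partitioner purely for brevity:

```python
import bisect
import hashlib

def token(key: str) -> int:
    # Hash a key onto the ring (Cassandra uses Murmur3; md5 here for brevity).
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class Ring:
    """Masterless placement: any node can compute replica ownership locally."""

    def __init__(self, nodes, replication_factor=3):
        self.rf = replication_factor
        # Every node derives the same sorted token ring from the member list.
        self.tokens = sorted((token(n), n) for n in nodes)

    def replicas(self, key):
        # Walk clockwise from the key's token, taking the next rf distinct nodes.
        idx = bisect.bisect(self.tokens, (token(key),))
        out = []
        for i in range(len(self.tokens)):
            node = self.tokens[(idx + i) % len(self.tokens)][1]
            if node not in out:
                out.append(node)
            if len(out) == self.rf:
                break
        return out

ring = Ring(["node-a", "node-b", "node-c", "node-d"], replication_factor=3)
print(ring.replicas("user:42"))  # any node computes the same 3 replicas
```

Because every node sorts the same tokens, any node answers `replicas(key)` identically, which is what lets a read or write start at any coordinator without consulting a master.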


Re: 3.8/3.9 releases/branch freeze, current merge order

2016-08-24 Thread Aleksey Yeschenko
No. Removing a dead branch is just mindless admin work.

As for 3.8/3.9 plans, look up the previous quite lengthy vote discussion on 
3.8, on dev.

-- 
AY

On 24 August 2016 at 20:23:04, Mark Thomas (ma...@apache.org) wrote:

On 24/08/2016 16:44, Aleksey Yeschenko wrote:  

  

> Also, cassandra-3.8 branch was removed from the repo, to further minimise 
> confusion.  

That is the sort of thing I'd expect to see discussed on the dev list  
first. Where is that discussion?  

Mark  




Re: 3.8/3.9 releases/branch freeze, current merge order

2016-08-24 Thread Mark Thomas
On 24/08/2016 20:26, Aleksey Yeschenko wrote:
> No. Removing a dead branch is just mindless admin work.
> 
> As for 3.8/3.9 plans, look up the previous quite lengthy vote discussion on 
> 3.8, on dev.

Thanks. Found it. Just need to go back a little further in the archive.

Mark

> 
> -- 
> AY
> 
> On 24 August 2016 at 20:23:04, Mark Thomas (ma...@apache.org) wrote:
> 
> On 24/08/2016 16:44, Aleksey Yeschenko wrote:  
> 
>   
> 
>> Also, cassandra-3.8 branch was removed from the repo, to further minimise 
>> confusion.  
> 
> That is the sort of thing I'd expect to see discussed on the dev list  
> first. Where is that discussion?  
> 
> Mark  
> 
> 
> 



Re: 3.8/3.9 releases/branch freeze, current merge order

2016-08-24 Thread Aleksey Yeschenko
No worries. It was a somewhat… messy thread.

And it’s taken us a while to get the tests to this level, so that discussion 
is fairly far back in the archives.

-- 
AY

On 24 August 2016 at 20:43:39, Mark Thomas (ma...@apache.org) wrote:

On 24/08/2016 20:26, Aleksey Yeschenko wrote:  
> No. Removing a dead branch is just mindless admin work.  
>  
> As for 3.8/3.9 plans, look up the previous quite lengthy vote discussion on 
> 3.8, on dev.  

Thanks. Found it. Just need to go back a little further in the archive.  

Mark  

>  
> --  
> AY  
>  
> On 24 August 2016 at 20:23:04, Mark Thomas (ma...@apache.org) wrote:  
>  
> On 24/08/2016 16:44, Aleksey Yeschenko wrote:  
>  
>   
>  
>> Also, cassandra-3.8 branch was removed from the repo, to further minimise 
>> confusion.  
>  
> That is the sort of thing I'd expect to see discussed on the dev list  
> first. Where is that discussion?  
>  
> Mark  
>  
>  
>  



Re: 3.8/3.9 releases/branch freeze, current merge order

2016-08-24 Thread Mark Thomas
On 24/08/2016 16:44, Aleksey Yeschenko wrote:



> Also, cassandra-3.8 branch was removed from the repo, to further minimise 
> confusion.

That is the sort of thing I'd expect to see discussed on the dev list
first. Where is that discussion?

Mark




Re: 3.8/3.9 releases/branch freeze, current merge order

2016-08-24 Thread Dave Brosius

It's basically just removing a tag, nothing more. Completely trivial.

---


On 2016-08-24 15:22, Mark Thomas wrote:

> On 24/08/2016 16:44, Aleksey Yeschenko wrote:
>
>> Also, cassandra-3.8 branch was removed from the repo, to further
>> minimise confusion.
>
> That is the sort of thing I'd expect to see discussed on the dev list
> first. Where is that discussion?
>
> Mark


Re: Distributed masterless architecture

2016-08-24 Thread Salih Gedik

Thanks for the resources!


On 24.08.2016 21:27, DuyHai Doan wrote:

> You can read this blog post, there are a handful of interesting links:
> http://the-paper-trail.org/blog/distributed-systems-theory-for-the-distributed-systems-engineer/
>
> On Wed, Aug 24, 2016 at 1:45 PM, Salih Gedik  wrote:
>
>> Hi everyone,
>> I am an undergrad student working on a simple distributed database for
>> learning purposes. I was wondering if you could give me some tips on
>> designing and coding a distributed database with no master nodes. For
>> instance, what classes should I be looking at in the source code? I'm
>> sorry if this is not the right place.
>> Thank you so much!
>>
>>  Best regards
>> --
>> Salih Gedik



--
Salih Gedik