[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-24 Thread Adar Dembo (Code Review)
Adar Dembo has submitted this change and it was merged.

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..


KUDU-1943: Add BlockTransaction to Block Manager

This adds a new layer of abstraction 'BlockTransaction' to
Block Manager, to accumulate new blocks in a logical operation,
e.g. flush/compaction. Avoid using fsync() per block can
potentionaly reduce a lot I/O overhead.

This patch also add a new API, Finalize(), to Block Manager,
so that an LBM container can be reused without flushing
in-flight writable blocks to disk. This can bring down the
log block container number and disk space consumption for
first few flush/compaction operations when using log block
manager.

I performed manual test which creates a table with two tablets,
'kudu perf loadgen localhost --num-rows-per-thread=2000
 --num-threads=10 --keep-auto-table --table_num_buckets=2'.
With this patch, 'log_block_manager_containers' dropped from 56
to 7.

A design doc can be found here:
https://docs.google.com/document/d/1cB8pDs-Ho7p2HlzkWo1d-wmXr90YntIjdvdOv7u7gn0/edit?usp=sharing

Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Reviewed-on: http://gerrit.cloudera.org:8080/7207
Reviewed-by: Adar Dembo 
Tested-by: Adar Dembo 
---
M src/kudu/cfile/bloomfile.cc
M src/kudu/cfile/bloomfile.h
M src/kudu/cfile/cfile-test.cc
M src/kudu/cfile/cfile_writer.cc
M src/kudu/cfile/cfile_writer.h
M src/kudu/fs/block_manager-stress-test.cc
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/block_manager.cc
M src/kudu/fs/block_manager.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/file_block_manager.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/tablet/deltafile.cc
M src/kudu/tablet/deltafile.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/multi_column_writer.cc
M src/kudu/tablet/multi_column_writer.h
20 files changed, 615 insertions(+), 453 deletions(-)

Approvals:
  Adar Dembo: Looks good to me, approved; Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 35
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-24 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..


Patch Set 34: Verified+1

We can't let IWYU get in the way of progress.

-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 34
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: No


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-24 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change.

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..


Patch Set 34:

Not sure why IWYU is complaining about including ext/alloc_traits.h for 
log_block_manager.cc, etc?

-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 34
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: No


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-24 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..


Patch Set 34: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 34
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: No


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-24 Thread Hao Hao (Code Review)
Hello Adar Dembo, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7207

to look at the new patch set (#34).

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..

KUDU-1943: Add BlockTransaction to Block Manager

This adds a new layer of abstraction 'BlockTransaction' to
Block Manager, to accumulate new blocks in a logical operation,
e.g. flush/compaction. Avoid using fsync() per block can
potentionaly reduce a lot I/O overhead.

This patch also add a new API, Finalize(), to Block Manager,
so that an LBM container can be reused without flushing
in-flight writable blocks to disk. This can bring down the
log block container number and disk space consumption for
first few flush/compaction operations when using log block
manager.

I performed manual test which creates a table with two tablets,
'kudu perf loadgen localhost --num-rows-per-thread=2000
 --num-threads=10 --keep-auto-table --table_num_buckets=2'.
With this patch, 'log_block_manager_containers' dropped from 56
to 7.

A design doc can be found here:
https://docs.google.com/document/d/1cB8pDs-Ho7p2HlzkWo1d-wmXr90YntIjdvdOv7u7gn0/edit?usp=sharing

Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
---
M src/kudu/cfile/bloomfile.cc
M src/kudu/cfile/bloomfile.h
M src/kudu/cfile/cfile-test.cc
M src/kudu/cfile/cfile_writer.cc
M src/kudu/cfile/cfile_writer.h
M src/kudu/fs/block_manager-stress-test.cc
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/block_manager.cc
M src/kudu/fs/block_manager.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/file_block_manager.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/tablet/deltafile.cc
M src/kudu/tablet/deltafile.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/multi_column_writer.cc
M src/kudu/tablet/multi_column_writer.h
20 files changed, 615 insertions(+), 453 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/07/7207/34
-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 34
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-24 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..


Patch Set 33: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 33
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: No


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-24 Thread Hao Hao (Code Review)
Hello Adar Dembo, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7207

to look at the new patch set (#33).

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..

KUDU-1943: Add BlockTransaction to Block Manager

This adds a new layer of abstraction 'BlockTransaction' to
Block Manager, to accumulate new blocks in a logical operation,
e.g. flush/compaction. Avoid using fsync() per block can
potentionaly reduce a lot I/O overhead.

This patch also add a new API, Finalize(), to Block Manager,
so that an LBM container can be reused without flushing
in-flight writable blocks to disk. This can bring down the
log block container number and disk space consumption for
first few flush/compaction operations when using log block
manager.

I performed manual test which creates a table with two tablets,
'kudu perf loadgen localhost --num-rows-per-thread=2000
 --num-threads=10 --keep-auto-table --table_num_buckets=2'.
With this patch, 'log_block_manager_containers' dropped from 56
to 7.

A design doc can be found here:
https://docs.google.com/document/d/1cB8pDs-Ho7p2HlzkWo1d-wmXr90YntIjdvdOv7u7gn0/edit?usp=sharing

Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
---
M src/kudu/cfile/bloomfile.cc
M src/kudu/cfile/bloomfile.h
M src/kudu/cfile/cfile-test.cc
M src/kudu/cfile/cfile_writer.cc
M src/kudu/cfile/cfile_writer.h
M src/kudu/fs/block_manager-stress-test.cc
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/block_manager.cc
M src/kudu/fs/block_manager.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/file_block_manager.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/tablet/deltafile.cc
M src/kudu/tablet/deltafile.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/multi_column_writer.cc
M src/kudu/tablet/multi_column_writer.h
20 files changed, 602 insertions(+), 437 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/07/7207/33
-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 33
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-24 Thread Dan Burkert (Code Review)
Dan Burkert has posted comments on this change.

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..


Patch Set 32:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7207/32/src/kudu/fs/block_manager.h
File src/kudu/fs/block_manager.h:

Line 132
> There's no avoiding a downcast in LogBlockManager::CloseBlocks; we need it 
OK sounds good to me, it can stay as is.


-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 32
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-24 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..


Patch Set 32:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/7207/32/src/kudu/fs/block_manager.h
File src/kudu/fs/block_manager.h:

Line 132
> Adar earlier made the point "it shouldn't be part of the WritableBlock inte
There's no avoiding a downcast in LogBlockManager::CloseBlocks; we need it in 
order to call other non-interface methods on LogWritableBlock.

But, if you prefer to reduce the number of downcasts from two to one, some 
light refactoring could do that.


http://gerrit.cloudera.org:8080/#/c/7207/32/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

PS32, Line 1217:   if (container_->read_only()) {
   : state_ = CLOSED;
   : if (container_->metrics()) {
   :   
container_->metrics()->generic_metrics.blocks_open_writing->Decrement();
   :   
container_->metrics()->generic_metrics.total_bytes_written->IncrementBy(
   :   block_length_);
   : }
   : return Status::Aborted("container $0 is read-only", 
container_->ToString());
   :   }
> Makes sense. I don't see a case we would want to Abort() the block if it is
I would prefer we didn't change API semantics at this point. Multiple Abort() 
calls in succession are legal; let's keep them that way (and ensure they no-op).


-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 32
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-24 Thread Dan Burkert (Code Review)
Dan Burkert has posted comments on this change.

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..


Patch Set 32:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7207/32/src/kudu/fs/block_manager.h
File src/kudu/fs/block_manager.h:

Line 132
Looks like this is being retained as a method on both implementations, but to 
call it requires a downcast.  Should we keep it in the interface so that the 
downcast isn't necessary?


-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 32
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-24 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change.

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..


Patch Set 32:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7207/32/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

PS32, Line 1217:   if (container_->read_only()) {
   : state_ = CLOSED;
   : if (container_->metrics()) {
   :   
container_->metrics()->generic_metrics.blocks_open_writing->Decrement();
   :   
container_->metrics()->generic_metrics.total_bytes_written->IncrementBy(
   :   block_length_);
   : }
   : return Status::Aborted("container $0 is read-only", 
container_->ToString());
   :   }
> It's hard to tell whether it's possible to wind up calling Abort() twice on
Yeah, it does not hurt to add the checking. But looking at the flow, Abort() is 
only called when the state_ != CLOSED now. So instead maybe I can add a DCHECK 
of the state_ at the beginning of Abort()?


-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 32
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-24 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..


Patch Set 32:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7207/32/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

PS32, Line 1217:   if (container_->read_only()) {
   : state_ = CLOSED;
   : if (container_->metrics()) {
   :   
container_->metrics()->generic_metrics.blocks_open_writing->Decrement();
   :   
container_->metrics()->generic_metrics.total_bytes_written->IncrementBy(
   :   block_length_);
   : }
   : return Status::Aborted("container $0 is read-only", 
container_->ToString());
   :   }
> Do we need to check if state_ is already CLOSED and if so make this a no-op
It's hard to tell whether it's possible to wind up calling Abort() twice on a 
read-only container given the constraints required to get a container to become 
read-only, but Mike's suggestion certainly won't hurt.


-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 32
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-24 Thread Mike Percy (Code Review)
Mike Percy has posted comments on this change.

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..


Patch Set 32:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7207/32/src/kudu/fs/block_manager.cc
File src/kudu/fs/block_manager.cc:

PS32, Line 50: close
closed


-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 32
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-24 Thread Mike Percy (Code Review)
Mike Percy has posted comments on this change.

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..


Patch Set 32:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7207/32/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

PS32, Line 1217:   if (container_->read_only()) {
   : state_ = CLOSED;
   : if (container_->metrics()) {
   :   
container_->metrics()->generic_metrics.blocks_open_writing->Decrement();
   :   
container_->metrics()->generic_metrics.total_bytes_written->IncrementBy(
   :   block_length_);
   : }
   : return Status::Aborted("container $0 is read-only", 
container_->ToString());
   :   }
Do we need to check if state_ is already CLOSED and if so make this a no-op and 
avoid changing the metrics, like we do in DoClose()? i.e. Close(); Abort();


-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 32
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-23 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..


Patch Set 30: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 30
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: No


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-23 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change.

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..


Patch Set 30:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7207/29/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

PS29, Line 1229:  it would
> Seems like 'it' accidentally got dropped. This sentence should read as:
Oops, my bad...Thanks for catching it. Updated.


-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 30
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-23 Thread Hao Hao (Code Review)
Hello Adar Dembo, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7207

to look at the new patch set (#30).

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..

KUDU-1943: Add BlockTransaction to Block Manager

This adds a new layer of abstraction 'BlockTransaction' to
Block Manager, to accumulate new blocks in a logical operation,
e.g. flush/compaction. Avoid using fsync() per block can
potentionaly reduce a lot I/O overhead.

This patch also add a new API, Finalize(), to Block Manager,
so that an LBM container can be reused without flushing
in-flight writable blocks to disk. This can bring down the
log block container number and disk space consumption for
first few flush/compaction operations when using log block
manager.

I performed manual test which creates a table with two tablets,
'kudu perf loadgen localhost --num-rows-per-thread=2000
 --num-threads=10 --keep-auto-table --table_num_buckets=2'.
With this patch, 'log_block_manager_containers' dropped from 56
to 7.

A design doc can be found here:
https://docs.google.com/document/d/1cB8pDs-Ho7p2HlzkWo1d-wmXr90YntIjdvdOv7u7gn0/edit?usp=sharing

Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
---
M src/kudu/cfile/bloomfile.cc
M src/kudu/cfile/bloomfile.h
M src/kudu/cfile/cfile-test.cc
M src/kudu/cfile/cfile_writer.cc
M src/kudu/cfile/cfile_writer.h
M src/kudu/fs/block_manager-stress-test.cc
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/block_manager.cc
M src/kudu/fs/block_manager.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/file_block_manager.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/tablet/deltafile.cc
M src/kudu/tablet/deltafile.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/multi_column_writer.cc
M src/kudu/tablet/multi_column_writer.h
20 files changed, 599 insertions(+), 434 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/07/7207/30
-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 30
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-23 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..


Patch Set 29:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7207/29/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

PS29, Line 1229:  would be
Seems like 'it' accidentally got dropped. This sentence should read as:

  Here is the reasoning as to why it would be safe to do so.


-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 29
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-23 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change.

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..


Patch Set 29:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/7207/28/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

PS28, Line 352: possibly sync
> possibly synchronizing
Done


PS28, Line 1227: cou
> could
Done


PS28, Line 1229: saf
> would be
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 29
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-23 Thread Hao Hao (Code Review)
Hello Adar Dembo, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7207

to look at the new patch set (#29).

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..

KUDU-1943: Add BlockTransaction to Block Manager

This adds a new layer of abstraction 'BlockTransaction' to
Block Manager, to accumulate new blocks in a logical operation,
e.g. flush/compaction. Avoid using fsync() per block can
potentionaly reduce a lot I/O overhead.

This patch also add a new API, Finalize(), to Block Manager,
so that an LBM container can be reused without flushing
in-flight writable blocks to disk. This can bring down the
log block container number and disk space consumption for
first few flush/compaction operations when using log block
manager.

I performed manual test which creates a table with two tablets,
'kudu perf loadgen localhost --num-rows-per-thread=2000
 --num-threads=10 --keep-auto-table --table_num_buckets=2'.
With this patch, 'log_block_manager_containers' dropped from 56
to 7.

A design doc can be found here:
https://docs.google.com/document/d/1cB8pDs-Ho7p2HlzkWo1d-wmXr90YntIjdvdOv7u7gn0/edit?usp=sharing

Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
---
M src/kudu/cfile/bloomfile.cc
M src/kudu/cfile/bloomfile.h
M src/kudu/cfile/cfile-test.cc
M src/kudu/cfile/cfile_writer.cc
M src/kudu/cfile/cfile_writer.h
M src/kudu/fs/block_manager-stress-test.cc
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/block_manager.cc
M src/kudu/fs/block_manager.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/file_block_manager.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/tablet/deltafile.cc
M src/kudu/tablet/deltafile.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/multi_column_writer.cc
M src/kudu/tablet/multi_column_writer.h
20 files changed, 599 insertions(+), 434 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/07/7207/29
-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 29
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-23 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..


Patch Set 28:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/7207/28/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

PS28, Line 352: synchronizing
possibly synchronizing


PS28, Line 1227: can
could


PS28, Line 1229:  is
would be


-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 28
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-23 Thread Hao Hao (Code Review)
Hello Adar Dembo, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7207

to look at the new patch set (#28).

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..

KUDU-1943: Add BlockTransaction to Block Manager

This adds a new layer of abstraction 'BlockTransaction' to
Block Manager, to accumulate new blocks in a logical operation,
e.g. flush/compaction. Avoid using fsync() per block can
potentionaly reduce a lot I/O overhead.

This patch also add a new API, Finalize(), to Block Manager,
so that an LBM container can be reused without flushing
in-flight writable blocks to disk. This can bring down the
log block container number and disk space consumption for
first few flush/compaction operations when using log block
manager.

I performed manual test which creates a table with two tablets,
'kudu perf loadgen localhost --num-rows-per-thread=2000
 --num-threads=10 --keep-auto-table --table_num_buckets=2'.
With this patch, 'log_block_manager_containers' dropped from 56
to 7.

A design doc can be found here:
https://docs.google.com/document/d/1cB8pDs-Ho7p2HlzkWo1d-wmXr90YntIjdvdOv7u7gn0/edit?usp=sharing

Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
---
M src/kudu/cfile/bloomfile.cc
M src/kudu/cfile/bloomfile.h
M src/kudu/cfile/cfile-test.cc
M src/kudu/cfile/cfile_writer.cc
M src/kudu/cfile/cfile_writer.h
M src/kudu/fs/block_manager-stress-test.cc
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/block_manager.cc
M src/kudu/fs/block_manager.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/file_block_manager.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/tablet/deltafile.cc
M src/kudu/tablet/deltafile.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/multi_column_writer.cc
M src/kudu/tablet/multi_column_writer.h
20 files changed, 599 insertions(+), 434 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/07/7207/28
-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 28
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-23 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change.

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..


Patch Set 27:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7207/25/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

PS25, Line 1206:   // Do nothing for Abort() other than updating metrics and 
block state.
   :   // Here is the reasoning why it is safe to do so. Currently, 
failures in block
   :   // creation can happen at various stages:
   :   /
> Shouldn't this still update metrics? And maybe change state to CLOSED?
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 27
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-23 Thread Hao Hao (Code Review)
Hello Adar Dembo, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7207

to look at the new patch set (#27).

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..

KUDU-1943: Add BlockTransaction to Block Manager

This adds a new layer of abstraction 'BlockTransaction' to
Block Manager, to accumulate new blocks in a logical operation,
e.g. flush/compaction. Avoid using fsync() per block can
potentionaly reduce a lot I/O overhead.

This patch also add a new API, Finalize(), to Block Manager,
so that an LBM container can be reused without flushing
in-flight writable blocks to disk. This can bring down the
log block container number and disk space consumption for
first few flush/compaction operations when using log block
manager.

I performed manual test which creates a table with two tablets,
'kudu perf loadgen localhost --num-rows-per-thread=2000
 --num-threads=10 --keep-auto-table --table_num_buckets=2'.
With this patch, 'log_block_manager_containers' dropped from 56
to 7.

A design doc can be found here:
https://docs.google.com/document/d/1cB8pDs-Ho7p2HlzkWo1d-wmXr90YntIjdvdOv7u7gn0/edit?usp=sharing

Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
---
M src/kudu/cfile/bloomfile.cc
M src/kudu/cfile/bloomfile.h
M src/kudu/cfile/cfile-test.cc
M src/kudu/cfile/cfile_writer.cc
M src/kudu/cfile/cfile_writer.h
M src/kudu/fs/block_manager-stress-test.cc
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/block_manager.cc
M src/kudu/fs/block_manager.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/file_block_manager.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/tablet/deltafile.cc
M src/kudu/tablet/deltafile.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/multi_column_writer.cc
M src/kudu/tablet/multi_column_writer.h
20 files changed, 590 insertions(+), 439 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/07/7207/27
-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 27
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-23 Thread Hao Hao (Code Review)
Hello Adar Dembo, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7207

to look at the new patch set (#26).

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..

KUDU-1943: Add BlockTransaction to Block Manager

This adds a new layer of abstraction 'BlockTransaction' to
Block Manager, to accumulate new blocks in a logical operation,
e.g. flush/compaction. Avoid using fsync() per block can
potentionaly reduce a lot I/O overhead.

This patch also add a new API, Finalize(), to Block Manager,
so that an LBM container can be reused without flushing
in-flight writable blocks to disk. This can bring down the
log block container number and disk space consumption for
first few flush/compaction operations when using log block
manager.

I performed manual test which creates a table with two tablets,
'kudu perf loadgen localhost --num-rows-per-thread=2000
 --num-threads=10 --keep-auto-table --table_num_buckets=2'.
With this patch, 'log_block_manager_containers' dropped from 56
to 7.

A design doc can be found here:
https://docs.google.com/document/d/1cB8pDs-Ho7p2HlzkWo1d-wmXr90YntIjdvdOv7u7gn0/edit?usp=sharing

Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
---
M src/kudu/cfile/bloomfile.cc
M src/kudu/cfile/bloomfile.h
M src/kudu/cfile/cfile-test.cc
M src/kudu/cfile/cfile_writer.cc
M src/kudu/cfile/cfile_writer.h
M src/kudu/fs/block_manager-stress-test.cc
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/block_manager.cc
M src/kudu/fs/block_manager.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/file_block_manager.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/tablet/deltafile.cc
M src/kudu/tablet/deltafile.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/multi_column_writer.cc
M src/kudu/tablet/multi_column_writer.h
20 files changed, 588 insertions(+), 438 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/07/7207/26
-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 26
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-23 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..


Patch Set 25:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7207/25/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

PS25, Line 1206:   // Abort the operation for read-only container.
   :   if (container_->read_only()) {
   : return Status::Aborted("container $0 is read-only", 
container_->ToString());
   :   }
Shouldn't this still update metrics? And maybe change state to CLOSED?


-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 25
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-23 Thread Hao Hao (Code Review)
Hello Adar Dembo, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7207

to look at the new patch set (#25).

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..

KUDU-1943: Add BlockTransaction to Block Manager

This adds a new layer of abstraction 'BlockTransaction' to
Block Manager, to accumulate new blocks in a logical operation,
e.g. flush/compaction. Avoid using fsync() per block can
potentionaly reduce a lot I/O overhead.

This patch also add a new API, Finalize(), to Block Manager,
so that an LBM container can be reused without flushing
in-flight writable blocks to disk. This can bring down the
log block container number and disk space consumption for
first few flush/compaction operations when using log block
manager.

I performed manual test which creates a table with two tablets,
'kudu perf loadgen localhost --num-rows-per-thread=2000
 --num-threads=10 --keep-auto-table --table_num_buckets=2'.
With this patch, 'log_block_manager_containers' dropped from 56
to 7.

A design doc can be found here:
https://docs.google.com/document/d/1cB8pDs-Ho7p2HlzkWo1d-wmXr90YntIjdvdOv7u7gn0/edit?usp=sharing

Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
---
M src/kudu/cfile/bloomfile.cc
M src/kudu/cfile/bloomfile.h
M src/kudu/cfile/cfile-test.cc
M src/kudu/cfile/cfile_writer.cc
M src/kudu/cfile/cfile_writer.h
M src/kudu/fs/block_manager-stress-test.cc
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/block_manager.cc
M src/kudu/fs/block_manager.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/file_block_manager.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/tablet/deltafile.cc
M src/kudu/tablet/deltafile.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/multi_column_writer.cc
M src/kudu/tablet/multi_column_writer.h
20 files changed, 587 insertions(+), 437 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/07/7207/25
-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 25
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-23 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change.

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..


Patch Set 24:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/7207/22/src/kudu/fs/block_manager.h
File src/kudu/fs/block_manager.h:

PS22, Line 283:   return Status::OK();
  : }
  : 
> nit: It seems superfluous to specify = default here, particularly since the
Done


http://gerrit.cloudera.org:8080/#/c/7207/23/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

PS23, Line 325:   static const char* kMagic;
  : 
  :   // Creates a new block container in 'dir'.
  :   st
> No longer needed?
Done


PS23, Line 1219: Do
> Doing
Done


PS23, Line 1219: lt
> in
Done


Line 1220:   //data-consuming gaps in containers, but these gaps can be 
cleaned up
> "unmanaged" isn't really an existing concept. How about "Doing nothing can 
Done


Line 1223:   //orphaned blocks if the metadata is durable. But orphaned 
blocks can be
> orphaned
Done


PS23, Line 1226: abort ha
> abort
Done


PS23, Line 1227: abort ha
> abort
Done


Line 1228:   // avoid large chunks of data-consuming gaps and orphaned blocks. A
> "chunks" and "data-consuming gaps".
Done


PS23, Line 1230: abort, i
> abort
Done


PS23, Line 1232: abort, D
> abort
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 24
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-23 Thread Hao Hao (Code Review)
Hello Adar Dembo, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7207

to look at the new patch set (#24).

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..

KUDU-1943: Add BlockTransaction to Block Manager

This adds a new layer of abstraction 'BlockTransaction' to
Block Manager, to accumulate new blocks in a logical operation,
e.g. flush/compaction. Avoid using fsync() per block can
potentionaly reduce a lot I/O overhead.

This patch also add a new API, Finalize(), to Block Manager,
so that an LBM container can be reused without flushing
in-flight writable blocks to disk. This can bring down the
log block container number and disk space consumption for
first few flush/compaction operations when using log block
manager.

I performed manual test which creates a table with two tablets,
'kudu perf loadgen localhost --num-rows-per-thread=2000
 --num-threads=10 --keep-auto-table --table_num_buckets=2'.
With this patch, 'log_block_manager_containers' dropped from 56
to 7.

A design doc can be found here:
https://docs.google.com/document/d/1cB8pDs-Ho7p2HlzkWo1d-wmXr90YntIjdvdOv7u7gn0/edit?usp=sharing

Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
---
M src/kudu/cfile/bloomfile.cc
M src/kudu/cfile/bloomfile.h
M src/kudu/cfile/cfile-test.cc
M src/kudu/cfile/cfile_writer.cc
M src/kudu/cfile/cfile_writer.h
M src/kudu/fs/block_manager-stress-test.cc
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/block_manager.cc
M src/kudu/fs/block_manager.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/file_block_manager.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/tablet/deltafile.cc
M src/kudu/tablet/deltafile.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/multi_column_writer.cc
M src/kudu/tablet/multi_column_writer.h
20 files changed, 590 insertions(+), 436 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/07/7207/24
-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 24
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-22 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..


Patch Set 23:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/7207/23/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

PS23, Line 325:   enum SyncMode {
  : SYNC,
  : NO_SYNC
  :   };
No longer needed?


PS23, Line 1219: Do
Doing


PS23, Line 1219: it
in


Line 1220:   //unmanaged blocks which leaves gaps in data files. But the 
unmanaged
"unmanaged" isn't really an existing concept. How about "Doing nothing can 
result in data-consuming gaps in containers, but these gaps can be cleaned up 
by hole repunching at start up"?


Line 1223:   //orphan blocks if the metadata is durable. But orphan blocks 
can be
orphaned


PS23, Line 1226: abortion
abort


PS23, Line 1227: abortion
abort


Line 1228:   // avoid large chunk of useless consumed space and unmanaged 
blocks. A
"chunks" and "data-consuming gaps".


PS23, Line 1230: abortion
abort


PS23, Line 1232: abortion
abort


-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 23
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-22 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change.

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..


Patch Set 22:

(10 comments)

> (1 comment)
 > 
 > Looks good!  I'm very pleased how this came together, net-net I
 > think the code is easier to understand with this patch.

Thank you! Really appreciate yours and Adar's thorough code reviews!

http://gerrit.cloudera.org:8080/#/c/7207/22//COMMIT_MSG
Commit Message:

PS22, Line 15: for
> an
Done


http://gerrit.cloudera.org:8080/#/c/7207/22/src/kudu/fs/block_manager.cc
File src/kudu/fs/block_manager.cc:

PS22, Line 34: blocks
> block
Done


Line 45: DEFINE_string(block_manager_flush_control, "finalize",
> It sounds scarier than it really is. We should at least clarify that this f
Done


PS22, Line 46: when to flush
> More like when to pre-flush?
Done


http://gerrit.cloudera.org:8080/#/c/7207/22/src/kudu/fs/block_manager.h
File src/kudu/fs/block_manager.h:

PS22, Line 70: is
> is set to
Done


PS22, Line 127: is
> is set to
Done


PS22, Line 128: asynchronous flush of dirty block data to disk
> Sounds good for performance. What is the crash consistency contract? Can th
Yes, in the worst case we could have 'gaps' in the data file, but this should 
be handled by hole repunching. Also, we only flush data in this case, and 
always append metadata after data is durable. This ensures metadata never 
points to garbage data.


http://gerrit.cloudera.org:8080/#/c/7207/22/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

Line 276:   // This is called after synchronization of dirty data and metadata
> doesn't finalization have to happen before the data is synced to disk?
DoClose() is called by DoCloseBlocks(). If closing a group of blocks, the 
blocks should be finalized. But for a single block it can be called without 
being finalized. There is a DCHECK at L904.


PS22, Line 458: rounds up
> "rounds up"?
Done


PS22, Line 459: it
> clarify: is "it" the block or the container?
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 22
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-22 Thread Hao Hao (Code Review)
Hello Adar Dembo, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7207

to look at the new patch set (#23).

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..

KUDU-1943: Add BlockTransaction to Block Manager

This adds a new layer of abstraction 'BlockTransaction' to
Block Manager, to accumulate new blocks in a logical operation,
e.g. flush/compaction. Avoid using fsync() per block can
potentionaly reduce a lot I/O overhead.

This patch also add a new API, Finalize(), to Block Manager,
so that an LBM container can be reused without flushing
in-flight writable blocks to disk. This can bring down the
log block container number and disk space consumption for
first few flush/compaction operations when using log block
manager.

I performed manual test which creates a table with two tablets,
'kudu perf loadgen localhost --num-rows-per-thread=2000
 --num-threads=10 --keep-auto-table --table_num_buckets=2'.
With this patch, 'log_block_manager_containers' dropped from 56
to 7.

A design doc can be found here:
https://docs.google.com/document/d/1cB8pDs-Ho7p2HlzkWo1d-wmXr90YntIjdvdOv7u7gn0/edit?usp=sharing

Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
---
M src/kudu/cfile/bloomfile.cc
M src/kudu/cfile/bloomfile.h
M src/kudu/cfile/cfile-test.cc
M src/kudu/cfile/cfile_writer.cc
M src/kudu/cfile/cfile_writer.h
M src/kudu/fs/block_manager-stress-test.cc
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/block_manager.cc
M src/kudu/fs/block_manager.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/file_block_manager.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/tablet/deltafile.cc
M src/kudu/tablet/deltafile.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/multi_column_writer.cc
M src/kudu/tablet/multi_column_writer.h
20 files changed, 574 insertions(+), 440 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/07/7207/23
-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 23
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-22 Thread Mike Percy (Code Review)
Mike Percy has posted comments on this change.

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..


Patch Set 22:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/7207/22/src/kudu/fs/block_manager.cc
File src/kudu/fs/block_manager.cc:

Line 45: DEFINE_string(block_manager_flush_control, "finalize",
> Would prefer pre-flush to pre-sync.
It sounds scarier than it really is. We should at least clarify that this flag 
doesn't have any affect on durability, which is controlled by other flags.


http://gerrit.cloudera.org:8080/#/c/7207/22/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

Line 355:   // If successful, adds all blocks to the block manager's in-memory 
maps.
> Don't think so; the blocks are just manipulated.
ah, you're right.


-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 22
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-22 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change.

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..


Patch Set 22:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7207/22/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

Line 1228:   RETURN_NOT_OK(container_->DoCloseBlocks({ this }, 
LogBlockContainer::SyncMode::NO_SYNC));
> As I brought up on slack I think there may be an opportunity here to simpli
Discussed with Dan and Adar offline. We came to the conclusion that it is safe 
to do nothing in Abort() for now. Because by doing this could either result in 
'gaps' in data file or orphan blocks, and both of them can be handled today by 
hole repunching and blocks garbage collection.

I will add a TODO in Abort() so that we can add a fine-grained handing of 
abortion in future.


-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 22
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-22 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..


Patch Set 22:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/7207/22/src/kudu/fs/block_manager.cc
File src/kudu/fs/block_manager.cc:

Line 45: DEFINE_string(block_manager_flush_control, "finalize",
> I don't like the term flush because it doesn't distinguish between a memory
Would prefer pre-flush to pre-sync.

Also note that this is an experimental flag that largely exists for our own 
benchmarking, so I think the bar for documenting it is low.


http://gerrit.cloudera.org:8080/#/c/7207/22/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

Line 355:   // If successful, adds all blocks to the block manager's in-memory 
maps.
> and takes ownership of the LogWritableBlock instances, right?
Don't think so; the blocks are just manipulated.


PS22, Line 1119: due to KUDU-1793
> That issue is marked as RESOLVED. Is this still an issue?
Yes, because KUDU-1793 existed in the wild in at least one release, so we may 
load on-disk deployments with misaligned blocks.


-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 22
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-22 Thread Mike Percy (Code Review)
Mike Percy has posted comments on this change.

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..


Patch Set 22:

(13 comments)

Looks pretty good to me, I mainly have nits.

http://gerrit.cloudera.org:8080/#/c/7207/22//COMMIT_MSG
Commit Message:

PS22, Line 15: for
an


http://gerrit.cloudera.org:8080/#/c/7207/22/src/kudu/fs/block_manager.cc
File src/kudu/fs/block_manager.cc:

PS22, Line 34: blocks
block


Line 45: DEFINE_string(block_manager_flush_control, "finalize",
I don't like the term flush because it doesn't distinguish between a memory 
buffer flush vs. an fsync. Also, from the description here it sounds like 
'never' means the blocks will never be fsynced.

We should clarify that this is an implicit pre-flush or pre-sync, that the 
global settings for syncing will still be used at close time.

Perhaps we should call this flag block_manager_preflush_control instead, or 
block_manager_early_sync_control.


PS22, Line 46: when to flush
More like when to pre-flush?


http://gerrit.cloudera.org:8080/#/c/7207/22/src/kudu/fs/block_manager.h
File src/kudu/fs/block_manager.h:

PS22, Line 70: is
is set to


PS22, Line 127: is
is set to


PS22, Line 128: asynchronous flush of dirty block data to disk
Sounds good for performance. What is the crash consistency contract? Can this 
leak disk space?


PS22, Line 283:   BlockTransaction() = default;
  : 
  :   ~BlockTransaction() = default;
nit: It seems superfluous to specify = default here, particularly since these 
are not something special like copy or move constructors. Am I missing 
something?


http://gerrit.cloudera.org:8080/#/c/7207/22/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

Line 276:   // This is called after synchronization of dirty data and metadata
doesn't finalization have to happen before the data is synced to disk?


Line 355:   // If successful, adds all blocks to the block manager's in-memory 
maps.
and takes ownership of the LogWritableBlock instances, right?


PS22, Line 458: rounds up
"rounds up"?


PS22, Line 459: it
clarify: is "it" the block or the container?


PS22, Line 1119: due to KUDU-1793
That issue is marked as RESOLVED. Is this still an issue?


-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 22
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-22 Thread Dan Burkert (Code Review)
Dan Burkert has posted comments on this change.

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..


Patch Set 22:

(1 comment)

Looks good!  I'm very pleased how this came together, net-net I think the code 
is easier to understand with this patch.

http://gerrit.cloudera.org:8080/#/c/7207/22/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

Line 1228:   RETURN_NOT_OK(container_->DoCloseBlocks({ this }, 
LogBlockContainer::SyncMode::NO_SYNC));
As I brought up on slack I think there may be an opportunity here to simplify 
DoClose and make Abort more efficient by doing nothing during abort calls 
(instead of writing a create + delete metadata entry).


-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 22
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-22 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..


Patch Set 22: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 22
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: No


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-22 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change.

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..


Patch Set 22:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7207/20/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

PS20, Line 883: 
> Should be with a pointer (*), actually.
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 22
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-22 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change.

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..


Patch Set 22:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/7207/21/src/kudu/fs/block_manager-test.cc
File src/kudu/fs/block_manager-test.cc:

PS21, Line 522:   Random rand(SeedRandom());
  :   vector dirty_bl
> This is no longer true and can be removed, right?
Right. Removed.


http://gerrit.cloudera.org:8080/#/c/7207/21/src/kudu/fs/file_block_manager.cc
File src/kudu/fs/file_block_manager.cc:

Line 63: DECLARE_bool(block_manager_lock_dirs);
> Not needed anymore?
Done


http://gerrit.cloudera.org:8080/#/c/7207/21/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

Line 79: DECLARE_bool(enable_data_block_fsync);
> Can be removed.
Done


PS21, Line 528: ck is ful
> , marking
Done


PS21, Line 905: 
> Use DCHECK_EQ
Done


PS21, Line 1054: e conta
> depending
Done


PS21, Line 1054: n isn'
> place
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 22
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-22 Thread Hao Hao (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7207

to look at the new patch set (#22).

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..

KUDU-1943: Add BlockTransaction to Block Manager

This adds a new layer of abstraction 'BlockTransaction' to
Block Manager, to accumulate new blocks in a logical operation,
e.g. flush/compaction. Avoid using fsync() per block can
potentionaly reduce a lot I/O overhead.

This patch also add a new API, Finalize(), to Block Manager,
so that for LBM container can be reused without flushing
in-flight writable blocks to disk. This can bring down the
log block container number and disk space consumption for
first few flush/compaction operations when using log block
manager.

I performed manual test which creates a table with two tablets,
'kudu perf loadgen localhost --num-rows-per-thread=2000
 --num-threads=10 --keep-auto-table --table_num_buckets=2'.
With this patch, 'log_block_manager_containers' dropped from 56
to 7.

A design doc can be found here:
https://docs.google.com/document/d/1cB8pDs-Ho7p2HlzkWo1d-wmXr90YntIjdvdOv7u7gn0/edit?usp=sharing

Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
---
M src/kudu/cfile/bloomfile.cc
M src/kudu/cfile/bloomfile.h
M src/kudu/cfile/cfile-test.cc
M src/kudu/cfile/cfile_writer.cc
M src/kudu/cfile/cfile_writer.h
M src/kudu/fs/block_manager-stress-test.cc
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/block_manager.cc
M src/kudu/fs/block_manager.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/file_block_manager.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/tablet/deltafile.cc
M src/kudu/tablet/deltafile.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/multi_column_writer.cc
M src/kudu/tablet/multi_column_writer.h
20 files changed, 575 insertions(+), 449 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/07/7207/22
-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 22
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-22 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..


Patch Set 21:

(8 comments)

Almost there, just a few nits left.

Dan, since the implementation has changed quite a bit, do you want to make 
another pass?

http://gerrit.cloudera.org:8080/#/c/7207/21/src/kudu/fs/block_manager-test.cc
File src/kudu/fs/block_manager-test.cc:

PS21, Line 522:   // Disable preallocation for this test as it creates many 
containers.
  :   FLAGS_log_container_preallocate_bytes = 0;
This is no longer true and can be removed, right?


http://gerrit.cloudera.org:8080/#/c/7207/21/src/kudu/fs/file_block_manager.cc
File src/kudu/fs/file_block_manager.cc:

Line 63: DECLARE_bool(block_coalesce_close);
Not needed anymore?


http://gerrit.cloudera.org:8080/#/c/7207/20/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

PS20, Line 883: 
> Done
Should be with a pointer (*), actually.


http://gerrit.cloudera.org:8080/#/c/7207/21/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

Line 79: DECLARE_bool(block_coalesce_close);
Can be removed.


PS21, Line 528: . Marking
, marking


PS21, Line 905: DCHECK
Use DCHECK_EQ


PS21, Line 1054: places
place


PS21, Line 1054: depends
depending


-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 21
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-21 Thread Hao Hao (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7207

to look at the new patch set (#21).

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..

KUDU-1943: Add BlockTransaction to Block Manager

This adds a new layer of abstraction 'BlockTransaction' to
Block Manager, to accumulate new blocks in a logical operation,
e.g. flush/compaction. Avoid using fsync() per block can
potentionaly reduce a lot I/O overhead.

This patch also add a new API, Finalize(), to Block Manager,
so that for LBM container can be reused without flushing
in-flight writable blocks to disk. This can bring down the
log block container number and disk space consumption for
first few flush/compaction operations when using log block
manager.

I performed manual test which creates a table with two tablets,
'kudu perf loadgen localhost --num-rows-per-thread=2000
 --num-threads=10 --keep-auto-table --table_num_buckets=2'.
With this patch, 'log_block_manager_containers' dropped from 56
to 7.

A design doc can be found here:
https://docs.google.com/document/d/1cB8pDs-Ho7p2HlzkWo1d-wmXr90YntIjdvdOv7u7gn0/edit?usp=sharing

Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
---
M src/kudu/cfile/bloomfile.cc
M src/kudu/cfile/bloomfile.h
M src/kudu/cfile/cfile-test.cc
M src/kudu/cfile/cfile_writer.cc
M src/kudu/cfile/cfile_writer.h
M src/kudu/fs/block_manager-stress-test.cc
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/block_manager.cc
M src/kudu/fs/block_manager.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/file_block_manager.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/tablet/deltafile.cc
M src/kudu/tablet/deltafile.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/multi_column_writer.cc
M src/kudu/tablet/multi_column_writer.h
20 files changed, 575 insertions(+), 444 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/07/7207/21
-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 21
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-21 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change.

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..


Patch Set 21:

(30 comments)

http://gerrit.cloudera.org:8080/#/c/7207/19//COMMIT_MSG
Commit Message:

PS19, Line 22:  --nu
> You still need 'localhost' though; that's not an optional parameter I think
Done


http://gerrit.cloudera.org:8080/#/c/7207/20/src/kudu/fs/block_manager-stress-test.cc
File src/kudu/fs/block_manager-stress-test.cc:

Line 282:   AtomicInt total_bytes_written_;
> Should be 
Done


PS20, Line 297: vector all_dirty_blocks;
  : for (int i = 0; i < FLAGS_block_group_number; i++) {
> You changed the pointer/ref style here.
Done


Line 310: 
> Update.
Done


Line 311: dirty_blocks.emplace_back(std::move(block));
> const auto&
Done


Line 318:   // random, and write a smaller chunk of data to it.
> auto&
Done


Line 326: // Write a small chunk of data.
> const auto&
Done


http://gerrit.cloudera.org:8080/#/c/7207/20/src/kudu/fs/block_manager.cc
File src/kudu/fs/block_manager.cc:

PS20, Line 44: //   throughput but may improve write latency.
 : DEFINE_string(block_manager_flush_control, "finalize",
 :   "Controls when to flush a block. Valid values are 
'finalize', "
 :   "'close', or 'never'. If 'finalize', blocks will 
be flushed "
 :   "when writing is finished. If 'close', blocks will 
be 
> Rewrite:
Done


http://gerrit.cloudera.org:8080/#/c/7207/17/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

Line 1410: 
> Okay, but why bother maintaining FlushDataAsync() as a separate method? It 
Makes sense. Updated. But still kept the per-impl methods in 
LogWritableBlock/FileWritableBlock since calling Flush() from 
LogBlockManager/FileBlockManager will result in more code if not.


http://gerrit.cloudera.org:8080/#/c/7207/20/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

PS20, Line 260: onst OVERRIDE;
> finalizing it if it has not yet been finalized.
Done


PS20, Line 260: 
> "Also updates"
Done


PS20, Line 336: c
> ,
Done


PS20, Line 337: 
> the
Done


PS20, Line 434:   // Malformed records and other container inconsistencies are 
written to
  :   // 'report'. Healthy blocks are written either to 
'live_blocks' or
  :   // 'dead_blocks'. Live records are written to 
'live_block_recor
> Nit: would prefer to keep BlockCreated and BlockDeleted next to each other 
Done


PS20, Line 543:   // Offset up to which we have preallocated bytes.
  :   int64_t preallocated_offset_ = 0;
> Since all counters are now atomic, these comments are no longer interesting
Done


PS20, Line 878: at as a d
> "data" (to be consistent with "Syncing metadata file")
Done


PS20, Line 882: 
> Replace with "is already made durable".
Done


PS20, Line 883: 
> const auto*
Done


PS20, Line 890: if (mode == SYNC) 
> I don't think this one needs to be conditioned on SYNC.
Done


Line 891: 
> Would be nice if we could DCHECK that if blocks.size() > 1, all of the bloc
Done


PS20, Line 1040: preallocated_offset_ = off + len;
   :   }
   : 
> This needs to be updated.
Done


Line 1068:   DCHECK_GE(block_offset, 0);
> Don't need the if statement anymore.
Done


PS20, Line 1071:   // means accounting for the new block should be as simple as 
adding its
   :   // aligned length to 'next
> Let's defer this to BlockCreated(). Then RoundUpContainer can be renamed to
Done


PS20, Line 1206: etrics
> Nit: { this }
Done


Line 1216: }
> Nit: { this }
Done


PS20, Line 1289: }
   : 
   : Status LogWritableBl
> Rewrite: "We do not mark the container as read-only if FlushDataAsync() fai
Done


Line 1317: }
> 'lbm' is only used once; don't bother storing it in a local variable.
Done


PS20, Line 1319: state() const {
> Just use block_length_ here; it's more clear.
Done


PS20, Line 1319: te L
> and block_id_ here.
Done


PS20, Line 1751: ol == "close") {
> Nit: 
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 21
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-21 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..


Patch Set 20:

(30 comments)

http://gerrit.cloudera.org:8080/#/c/7207/19//COMMIT_MSG
Commit Message:

PS19, Line 22: s-per
> Done
You still need 'localhost' though; that's not an optional parameter I think. I 
was just saying you could remove the port, which is optional.


http://gerrit.cloudera.org:8080/#/c/7207/20/src/kudu/fs/block_manager-stress-test.cc
File src/kudu/fs/block_manager-stress-test.cc:

Line 282: Slice seed_slice(reinterpret_cast(), 
sizeof(seed));
Should be 


PS20, Line 297: WritableBlock *block = 
dirty_blocks[next_block_idx].get();
  : Random  = dirty_block_rands[next_block_idx];
You changed the pointer/ref style here.


Line 310:   // Close all dirty blocks.
Update.


Line 311:   for (const auto _block : dirty_blocks) {
const auto&


Line 318:   for (auto _block : dirty_blocks) {
auto&


Line 326:   for (const auto  : all_dirty_blocks) {
const auto&


http://gerrit.cloudera.org:8080/#/c/7207/20/src/kudu/fs/block_manager.cc
File src/kudu/fs/block_manager.cc:

PS20, Line 44:   "Control when to flush a block, either at 
'finalize', 'close',"
 :   " or 'never'. 'finalize' indicates to flush when 
writing to "
 :   "the block is finished. 'close' indicates to defer 
the flushing "
 :   "to when the entire BlockTransaction, that the 
blocks belong "
 :   "to, is committed. And 'never' is not flush at 
all.");
Rewrite:

Controls when to flush a block. Valid values are 'finalize', 'close', or 
'never'. If 'finalize', blocks will be flushed when writing is finished. If 
'close', blocks will be flushed when their transaction is committed. If 
'never', blocks will never be flushed.


http://gerrit.cloudera.org:8080/#/c/7207/17/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

Line 1410: }
> Updated. To clarify, I did this because we were saying 'Finalize indicates 
Okay, but why bother maintaining FlushDataAsync() as a separate method? It 
certainly shouldn't be part of the WritableBlock interface because there's no 
need for anyone outside the block manager to know about the distinction.

I suppose you could keep it as per-impl methods in 
LogWritableBlock/FileWritableBlock, but it's just one line of code, so the 
decomposition doesn't seem that useful.


http://gerrit.cloudera.org:8080/#/c/7207/20/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

PS20, Line 260: finalize it if has not
finalizing it if it has not yet been finalized.


PS20, Line 260: And update
"Also updates"


PS20, Line 336: .
,


PS20, Line 337: th 
the


PS20, Line 434:   // Finalize a fully written block. It round up this 
container, truncate it if full
  :   // and mark the container as available.
  :   void FinalizeBlock(int64_t block_offset, int64_t 
block_length);
Nit: would prefer to keep BlockCreated and BlockDeleted next to each other 
since they're fairly symmetric.


PS20, Line 543:   //
  :   // Declared atomic because it is mutated from BlockDeleted().
Since all counters are now atomic, these comments are no longer interesting.


PS20, Line 878: container
"data" (to be consistent with "Syncing metadata file")


PS20, Line 882:  does so at any given point
Replace with "is already made durable".

Or rewrite as "Append metadata only after data is synced so that there's no 
chance of metadata landing on the disk before the data."


PS20, Line 883: const auto &
const auto*


PS20, Line 890: if (mode == SYNC) 
I don't think this one needs to be conditioned on SYNC.


Line 891: for (LogWritableBlock* block : blocks) {
Would be nice if we could DCHECK that if blocks.size() > 1, all of the blocks 
are FINALIZED. If they aren't, we have a big problem: we'll end up calling 
FinalizeBlock() on the same container multiple times from within DoClose() and 
could corrupt on-disk data via truncate calls.


PS20, Line 1040:   // Note that this take places _after_ the container has been 
synced to disk.
   :   // That's OK; truncation isn't needed for correctness, and 
in the event of a
   :   // crash or error, it will be retried at startup.
This needs to be updated.


Line 1068:   if (PREDICT_TRUE(new_next_block_offset >= next_block_offset())) {
Don't need the if statement anymore.


PS20, Line 1071:   total_bytes_ .IncrementBy(fs_aligned_length(block_offset, 
block_length));
   :   total_blocks_.Increment();
Let's defer this to BlockCreated(). Then RoundUpContainer can be renamed to the 
more clear UpdateNextBlockOffset.


PS20, Line 1206: {this}
Nit: { this }


Line 1216:   RETURN_NOT_OK(container_->DoCloseBlocks({this}, 
LogBlockContainer::SyncMode::NO_SYNC));
Nit: 

[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-21 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change.

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..


Patch Set 20:

(26 comments)

http://gerrit.cloudera.org:8080/#/c/7207/19//COMMIT_MSG
Commit Message:

PS19, Line 22: s-per
> Nit: can just omit this.
Done


http://gerrit.cloudera.org:8080/#/c/7207/17/src/kudu/cfile/cfile-test.cc
File src/kudu/cfile/cfile-test.cc:

Line 38: DECLARE_bool(cfile_write_checksums);
> warning: redundant 'FLAGS_block_time_to_flush' declaration [readability-red
Done


Line 892:   unique_ptr sink;
> What happened to parameterizing this test for all block_time_to_flush value
Refactored the code and the flag specific code in Cfile are removed.


http://gerrit.cloudera.org:8080/#/c/7207/17/src/kudu/cfile/cfile_writer.cc
File src/kudu/cfile/cfile_writer.cc:

PS17, Line 266:   RETURN_NOT_OK(block_->Finalize());
  :   transaction->AddCreatedBlock(std::move(block_));
  :   return Status::OK();
  : }
  : 
  : void CFileWriter::AddMetadataPair(const Slice , const Slice 
) {
  :   C
> This is incorrect. We should always Finalize() the block; whether or not we
Done


http://gerrit.cloudera.org:8080/#/c/7207/17/src/kudu/fs/block_manager.cc
File src/kudu/fs/block_manager.cc:

Line 43: DEFINE_string(block_manager_flush_control, "finalize",
> Let's rename to "block_manager_flush_control":
Done


Line 43: DEFINE_string(block_manager_flush_control, "finalize",
> warning: redundant 'FLAGS_block_time_to_flush' declaration [readability-red
Done


Line 44:   "Control when to flush a block, either at 'finalize', 
'close',"
> This should apply to any WritableBlock, not just those in a BlockTransactio
Done


PS17, Line 48: neve
> Let's use "never" instead, since we're talking about a point in time.
Done


http://gerrit.cloudera.org:8080/#/c/7207/17/src/kudu/fs/block_manager.h
File src/kudu/fs/block_manager.h:

Line 32: namespace kudu {
> This doesn't belong here (block_coalesce_close didn't belong either).
Done


http://gerrit.cloudera.org:8080/#/c/7207/17/src/kudu/fs/file_block_manager.cc
File src/kudu/fs/file_block_manager.cc:

Line 340: Status FileWritableBlock::Finalize() {
> This is where we should check the value of block_time_to_flush. We should o
Done


http://gerrit.cloudera.org:8080/#/c/7207/17/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

Line 1176:   deleted_ = true;
> I mentioned this to Dan in our impromptu meeting yesterday; perhaps you wer
Done


PS17, Line 1397: ReadableBlock::Close() {
> You can use 'lbm' here.
Done


Line 1410: }
> And here we also need to gate on block_time_to_flush == 'finalize'.
Updated. To clarify, I did this because we were saying 'Finalize indicates 
Flushing'. But after thought more, I think your suggestion makes more sense. So 
I separated the flushing (I kept FlushDataAsync()) from Finalize and use the 
flag to control when to call FlushDataAsync().


http://gerrit.cloudera.org:8080/#/c/7207/19/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

Line 245: 
> Don't need this; just change "enum CloseFlag" to "enum CloseFlags".
Done


Line 499:   // Note: 'record' may be swapped into 'report'; do not use it after 
calling
> Don't need this; it's only ever used internally (by LogBlockContainer).
Done


Line 582:   total_bytes_(0),
> Can we get by with a boolean? Seems like we could set it in RoundUpContaine
Removed as is not necessary.


PS19, Line 594: }
  : 
> I don't understand what protection the lock offers. Furthermore, PosixRWFil
Discussed this with Adar offline. I will remove this lock as 
PosixRWFile::Sync() already tries to keeps track of whether a sync is 
necessary. Meanwhile, fix KUDU-2102 in another commit.


Line 906: }
> You should add a note here about mutating result_status on failure, like in
Refactored the code and removed FinishBlock.


Line 939: 
RETURN_NOT_OK_HANDLE_ERROR(data_dir_->RefreshIsFull(DataDir::RefreshMode::ALWAYS));
> I don't understand this, and it's making me nervous.
Done


PS19, Line 955: Status LogBlockContainer::AppendMetadata(const BlockRecordPB& 
pb) {
  :   DCHECK(!read_only());
  :   // Note: We don't check for sufficient disk space for 
metadata writes in
  :   // order to allow for block deletion on full disks.
  :   RETURN_NOT_OK_HANDLE_ERROR(metadata_file_->Append(pb));
  :   return Status::OK();
  : }
  : 
> Here's another case where, even though FinishBlock() can be called concurre
Done


Line 1076: "Container $0 with size $1 is now full, max size is $2",
> These two accesses to next_sync_offset_ aren't atomic. Use StoreMax() inste
Removed.


PS19, Line 1173: 
   : void LogBlock::Delete() {
   :   DCHECK(!deleted_);
   :   deleted_ = true;
   : }
 

[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-21 Thread Hao Hao (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7207

to look at the new patch set (#20).

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..

KUDU-1943: Add BlockTransaction to Block Manager

This adds a new layer of abstraction 'BlockTransaction' to
Block Manager, to accumulate new blocks in a logical operation,
e.g. flush/compaction. Avoid using fsync() per block can
potentionaly reduce a lot I/O overhead.

This patch also add a new API, Finalize(), to Block Manager,
so that for LBM container can be reused without flushing
in-flight writable blocks to disk. This can bring down the
log block container number and disk space consumption for
first few flush/compaction operations when using log block
manager.

I performed manual test which creates a table with two tablets,
'kudu perf loadgen --num-rows-per-thread=2000
 --num-threads=10 --keep-auto-table --table_num_buckets=2'.
With this patch, 'log_block_manager_containers' dropped from 56
to 7.

A design doc can be found here:
https://docs.google.com/document/d/1cB8pDs-Ho7p2HlzkWo1d-wmXr90YntIjdvdOv7u7gn0/edit?usp=sharing

Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
---
M src/kudu/cfile/bloomfile.cc
M src/kudu/cfile/bloomfile.h
M src/kudu/cfile/cfile-test.cc
M src/kudu/cfile/cfile_writer.cc
M src/kudu/cfile/cfile_writer.h
M src/kudu/fs/block_manager-stress-test.cc
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/block_manager.cc
M src/kudu/fs/block_manager.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/file_block_manager.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/tablet/deltafile.cc
M src/kudu/tablet/deltafile.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/multi_column_writer.cc
M src/kudu/tablet/multi_column_writer.h
20 files changed, 563 insertions(+), 435 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/07/7207/20
-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 20
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-17 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..


Patch Set 19:

(14 comments)

I think I reviewed everything, but I'm curious to see how this will evolve 
following my suggestions.

http://gerrit.cloudera.org:8080/#/c/7207/19//COMMIT_MSG
Commit Message:

PS19, Line 22: :
Nit: can just omit this.


http://gerrit.cloudera.org:8080/#/c/7207/19/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

Line 245:   typedef int CloseFlags;
Don't need this; just change "enum CloseFlag" to "enum CloseFlags".


Line 499:   int64_t next_sync_offset() const { return next_sync_offset_.Load(); 
}
Don't need this; it's only ever used internally (by LogBlockContainer).


Line 582:   // The offset of the container that next sync should start with.
Can we get by with a boolean? Seems like we could set it in RoundUpContainer 
when the next_block_offset_ increases and clear it when we sync.


PS19, Line 594:   // Protects sync data operation to avoid unnecessary I/O.
  :   Mutex lock_;
I don't understand what protection the lock offers. Furthermore, 
PosixRWFile::Sync() already keeps track of whether a sync is actually necessary.

Dan and I had a conversation about PosixRWFile::Sync. The use of pending_sync_ 
as an optimization is unsafe as it can cause the "losing" thread in a race to 
return early under the assumption that the data has been made durable when it 
hasn't (yet). I filed KUDU-2102 for this; feel free to tackle that next if you 
like.


Line 906:   auto cleanup = MakeScopedCleanup([&]() {
You should add a note here about mutating result_status on failure, like in 
LogWritableBlock::DoClose().


Line 939:   int64_t offset = mode == ROUND_UP ? next_block_offset() : 
block_offset;
I don't understand this, and it's making me nervous.

Let's simplify by exposing and using LogWritableBlock's offset. Hopefully we 
can do it without modifying WritableBlock itself (may need to down_cast earlier 
and deal with LogWritableBlocks through and through).


PS19, Line 955:   if (mode == ROUND_UP) {
  : WARN_NOT_OK(TruncateDataToNextBlockOffset(),
  : "could not truncate excess preallocated space");
  : 
  : if (full() && block_manager()->metrics()) {
  :   block_manager()->metrics()->full_containers->Increment();
  : }
  :   }
Here's another case where, even though FinishBlock() can be called 
concurrently, we would never want to call TruncateDataToNextBlockOffset() 
concurrently, and the ROUND_UP check prevents us from doing that. But it's 
tricky to follow 'mode' through the various call chains to prove this guarantee.


Line 1076: if (next_sync_offset() < cur_next_block_offset) {
These two accesses to next_sync_offset_ aren't atomic. Use StoreMax() instead.

But since you can make this a bool, a simple CAS should do the trick. Look at 
how PosixRWFile uses pending_sync_.


PS19, Line 1173:   int64_t new_next_block_offset = KUDU_ALIGN_UP(
   :   block_offset + block_length,
   :   instance()->filesystem_block_size_bytes());
   :   if (PREDICT_FALSE(new_next_block_offset < 
next_block_offset())) {
   : LOG(WARNING) << Substitute(
   : "Container $0 unexpectedly tried to lower its next 
block offset "
   : "(from $1 to $2), ignoring",
   : ToString(), next_block_offset(), 
new_next_block_offset);
   :   } else {
   : next_block_offset_.Store(new_next_block_offset);
   :   }
I keep circling back to this as a potential synchronization issue. The 
comparison and subsequent assignment to next_block_offset_ is not atomic. It 
didn't need to be prior to your change because BlockCreated() was guaranteed to 
be called by just one thread at a time. It may not need to be now, but I'm 
finding it really tough (based on tracing through the code paths) to guarantee 
that, due to the proliferation of RoundMode in various deep methods.

I think this would be safer if we used StoreMax() instead. And maybe 
total_bytes_ and total_blocks_ should be atomic too.


Line 1188: int64_t LogBlockContainer::fs_aligned_length(int64_t block_offset,
Surely we needn't duplicate this from LogBlock; if you need it at the container 
level (for operations where LogBlocks may not exist), can't we move it instead?

In general please try a little harder to avoid duplicating code, especially 
code with large comments like this.


PS19, Line 1464: if ((flags & SYNC) &&
   : (state_ == CLEAN || state_ == DIRTY || state_ == 
FINALIZED)) {
   :   VLOG(3) << "Syncing block " << id();
   : 
   :   // TODO(unknown): Sync just this block's dirty data.
   

[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-17 Thread Hao Hao (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7207

to look at the new patch set (#19).

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..

KUDU-1943: Add BlockTransaction to Block Manager

This adds a new layer of abstraction 'BlockTransaction' to
Block Manager, to accumulate new blocks in a logical operation,
e.g. flush/compaction. Avoid using fsync() per block can
potentionaly reduce a lot I/O overhead.

This patch also add a new API, Finalize(), to Block Manager,
so that for LBM container can be reused without flushing
in-flight writable blocks to disk. This can bring down the
log block container number and disk space consumption for
first few flush/compaction operations when using log block
manager.

I performed manual test which creates a table with two tablets,
'kudu perf loadgen localhost: --num-rows-per-thread=2000
 --num-threads=10 --keep-auto-table --table_num_buckets=2'.
With this patch, 'log_block_manager_containers' dropped from 56
to 7.

A design doc can be found here:
https://docs.google.com/document/d/1cB8pDs-Ho7p2HlzkWo1d-wmXr90YntIjdvdOv7u7gn0/edit?usp=sharing

Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
---
M src/kudu/cfile/bloomfile.cc
M src/kudu/cfile/bloomfile.h
M src/kudu/cfile/cfile-test.cc
M src/kudu/cfile/cfile_writer.cc
M src/kudu/cfile/cfile_writer.h
M src/kudu/fs/block_manager-stress-test.cc
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/block_manager.cc
M src/kudu/fs/block_manager.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/file_block_manager.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/tablet/deltafile.cc
M src/kudu/tablet/deltafile.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/multi_column_writer.cc
M src/kudu/tablet/multi_column_writer.h
20 files changed, 594 insertions(+), 310 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/07/7207/19
-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 19
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-16 Thread Hao Hao (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7207

to look at the new patch set (#18).

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..

KUDU-1943: Add BlockTransaction to Block Manager

This adds a new layer of abstraction 'BlockTransaction' to
Block Manager, to accumulate new blocks in a logical operation,
e.g. flush/compaction. Avoid using fsync() per block can
potentionaly reduce a lot I/O overhead.

This patch also add a new API, Finalize(), to Block Manager,
so that for LBM container can be reused without flushing
in-flight writable blocks to disk. This can bring down the
log block container number and disk space consumption for
first few flush/compaction operations when using log block
manager.

I performed manual test which creates a table with two tablets,
'kudu perf loadgen localhost: --num-rows-per-thread=2000
 --num-threads=10 --keep-auto-table --table_num_buckets=2'.
With this patch, 'log_block_manager_containers' dropped from 56
to 7.

A design doc can be found here:
https://docs.google.com/document/d/1cB8pDs-Ho7p2HlzkWo1d-wmXr90YntIjdvdOv7u7gn0/edit?usp=sharing

Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
---
M src/kudu/cfile/bloomfile.cc
M src/kudu/cfile/bloomfile.h
M src/kudu/cfile/cfile-test.cc
M src/kudu/cfile/cfile_writer.cc
M src/kudu/cfile/cfile_writer.h
M src/kudu/fs/block_manager-stress-test.cc
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/block_manager.cc
M src/kudu/fs/block_manager.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/file_block_manager.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/tablet/deltafile.cc
M src/kudu/tablet/deltafile.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/multi_column_writer.cc
M src/kudu/tablet/multi_column_writer.h
20 files changed, 591 insertions(+), 310 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/07/7207/18
-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 18
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-16 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..


Patch Set 17:

(10 comments)

I think I'm done reviewing everything outside log_block_manager.cc. Still a lot 
to take in there. I've been making my way through it, but thought I'd publish 
my comments so far.

http://gerrit.cloudera.org:8080/#/c/7207/17/src/kudu/cfile/cfile-test.cc
File src/kudu/cfile/cfile-test.cc:

Line 892: TEST_P(TestCFileBothCacheTypes, TestReleaseBlock) {
What happened to parameterizing this test for all block_time_to_flush values? 
In my earlier comment I suggested that doing so would be moot only if there was 
no block_time_to_flush-specific code in here. But there is such code, so we 
should do the parameterization so as to avoid bit-rot.


http://gerrit.cloudera.org:8080/#/c/7207/17/src/kudu/cfile/cfile_writer.cc
File src/kudu/cfile/cfile_writer.cc:

PS17, Line 266:   if (FLAGS_block_time_to_flush == "finalize") {
  : RETURN_NOT_OK(block_->Finalize());
  :   } else if (FLAGS_block_time_to_flush != "close" &&
  :  FLAGS_block_time_to_flush != "none") {
  : LOG(FATAL) << "Unknown value for block_time_to_flush: "
  :<< FLAGS_block_time_to_flush;
  :   }
This is incorrect. We should always Finalize() the block; whether or not we 
flush inside Finalize() is determined by the block manager (and the value of 
this gflag).


http://gerrit.cloudera.org:8080/#/c/7207/17/src/kudu/fs/block_manager.cc
File src/kudu/fs/block_manager.cc:

Line 43: DEFINE_string(block_time_to_flush, "finalize",
> warning: redundant 'FLAGS_block_time_to_flush' declaration [readability-red
Let's rename to "block_manager_flush_control":
1. "block_manager_" is the prefix we've been using for other BM-related flags.
2. "time_to_flush" sounds conspicuously like time to live and other such 
numerical values.


Line 44:   "When to flush blocks registered in a BlockTransaction. "
This should apply to any WritableBlock, not just those in a BlockTransaction.


PS17, Line 48: none
Let's use "never" instead, since we're talking about a point in time.


http://gerrit.cloudera.org:8080/#/c/7207/17/src/kudu/fs/block_manager.h
File src/kudu/fs/block_manager.h:

Line 32: DECLARE_string(block_time_to_flush);
This doesn't belong here (block_coalesce_close didn't belong either).


http://gerrit.cloudera.org:8080/#/c/7207/17/src/kudu/fs/file_block_manager.cc
File src/kudu/fs/file_block_manager.cc:

Line 340: 
RETURN_NOT_OK_HANDLE_ERROR(writer_->Flush(WritableFile::FLUSH_ASYNC));
This is where we should check the value of block_time_to_flush. We should only 
flush if it's 'finalize'.


http://gerrit.cloudera.org:8080/#/c/7207/17/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

Line 1176: LOG(WARNING) << Substitute(
I mentioned this to Dan in our impromptu meeting yesterday; perhaps you weren't 
yet there: we should remove this WARNING because with out-of-order block 
metadata on disk it'll fire more frequently.


PS17, Line 1397: container_->block_manager()
You can use 'lbm' here.


Line 1410: RETURN_NOT_OK(container_->FlushData(block_offset_, 
block_length_));
And here we also need to gate on block_time_to_flush == 'finalize'.


-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 17
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-16 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change.

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..


Patch Set 16:

> (5 comments)
 > 
 > Hao, Dan, and I had a long discussion about this patch, and I
 > wanted to reproduce our conclusions here for everyone else:
 > 
 > 1. The cfile and block performance-related knobs are useful for
 > testing, but are not as important as having a good API and a robust
 > implementation. IIUC, the resplitting of FlushDataAsync and
 > Finalize was due to retaining all the existing flags; let's not do
 > this. Let's stick to the earlier approach (where Finalize implies a
 > flush), and adjust the flags if need be.
 > 
 > 2. What flags are worth preserving? We identified the need for just
 > one, which dictates when a block's data should be flushed. It has
 > three values: "finalize" (default value), "close" (defers the flush
 > to when the entire transaction is committed), and "none" (doesn't
 > flush at all, "flushing" will happen when the transaction commits
 > and files are fsynced).
 > 
 > 3. When should AppendMetadata be called? One option is to do it at
 > Finalize time. Another is to defer it to Close. The problem with
 > doing it during Finalize is that it exacerbates the "1. AppendData,
 > 2. AppendMetadata, 3. SyncData, 4. SyncMetadata" race: if we crash
 > between #2 and #3, we may end up with just the metadata on disk,
 > and if we AppendMetadata during Finalize, we have many more blocks'
 > worth of metadata that can land on disk without corresponding data.
 > 
 > 4. But AppendMetadata is called during Close, we may end up with
 > out-of-order metadata entries on disk, since transactions could
 > commit out of order. Do we care? We don't think so; the LBM
 > implementation should already be robust to this. An alternative is
 > to use an in-memory buffer to retain metadata order, but this
 > didn't seem worth the complexity.
 > 
 > 5. What to do about zero length blocks? Should their metadata be
 > written, or skipped? The answer is: it must be written, or attempts
 > to Close such blocks should fail. If for whatever reason we skipped
 > the metadata append, we could end up with block IDs whose blocks
 > can't be found on startup.

Thanks a lot Adar for the summary!

-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 16
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: No


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-16 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change.

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..


Patch Set 16:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/7207/16/src/kudu/cfile/cfile-test.cc
File src/kudu/cfile/cfile-test.cc:

Line 423: class TestCFileBothCacheTypesCloseTypes : public TestCFile,
> You should be able to use this to replace TestCFileBothCacheTypes. Here's w
Done


Line 449: if (desc.close_type == CLOSE) FLAGS_cfile_close_blocks_on_finish 
= true;
> If the default value of this gflag changes to true, we'll never test the fa
Done


http://gerrit.cloudera.org:8080/#/c/7207/16/src/kudu/cfile/cfile_writer.cc
File src/kudu/cfile/cfile_writer.cc:

PS16, Line 59: // - users could always change this to "close", which slows down 
throughput
 : //   but may improve write latency.
> This part isn't quite right anymore. Should the entire comment be rewritten
Done


PS16, Line 63:  
> Nit: got an extra space here.
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 16
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-16 Thread Hao Hao (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7207

to look at the new patch set (#17).

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..

KUDU-1943: Add BlockTransaction to Block Manager

This adds a new layer of abstraction 'BlockTransaction' to
Block Manager, to accumulate new blocks in a logical operation,
e.g. flush/compaction. Avoid using fsync() per block can
potentionaly reduce a lot I/O overhead.

This patch also add a new API, Finalize(), to Block Manager,
so that for LBM container can be reused without flushing
in-flight writable blocks to disk.

A design doc can be found here:
https://docs.google.com/document/d/1cB8pDs-Ho7p2HlzkWo1d-wmXr90YntIjdvdOv7u7gn0/edit?usp=sharing

Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
---
M src/kudu/cfile/bloomfile.cc
M src/kudu/cfile/bloomfile.h
M src/kudu/cfile/cfile-test.cc
M src/kudu/cfile/cfile_writer.cc
M src/kudu/cfile/cfile_writer.h
M src/kudu/fs/block_manager-stress-test.cc
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/block_manager.cc
M src/kudu/fs/block_manager.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/file_block_manager.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/tablet/deltafile.cc
M src/kudu/tablet/deltafile.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/multi_column_writer.cc
M src/kudu/tablet/multi_column_writer.h
20 files changed, 591 insertions(+), 309 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/07/7207/17
-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 17
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-15 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..


Patch Set 10:

(5 comments)

Hao, Dan, and I had a long discussion about this patch, and I wanted to 
reproduce our conclusions here for everyone else:

1. The cfile and block performance-related knobs are useful for testing, but 
are not as important as having a good API and a robust implementation. IIUC, 
the resplitting of FlushDataAsync and Finalize was due to retaining all the 
existing flags; let's not do this. Let's stick to the earlier approach (where 
Finalize implies a flush), and adjust the flags if need be.

2. What flags are worth preserving? We identified the need for just one, which 
dictates when a block's data should be flushed. It has three values: "finalize" 
(default value), "close" (defers the flush to when the entire transaction is 
committed), and "none" (doesn't flush at all, "flushing" will happen when the 
transaction commits and files are fsynced).

3. When should AppendMetadata be called? One option is to do it at Finalize 
time. Another is to defer it to Close. The problem with doing it during 
Finalize is that it exacerbates the "1. AppendData, 2. AppendMetadata, 3. 
SyncData, 4. SyncMetadata" race: if we crash between #2 and #3, we may end up 
with just the metadata on disk, and if we AppendMetadata during Finalize, we 
have many more blocks' worth of metadata that can land on disk without 
corresponding data.

4. But AppendMetadata is called during Close, we may end up with out-of-order 
metadata entries on disk, since transactions could commit out of order. Do we 
care? We don't think so; the LBM implementation should already be robust to 
this. An alternative is to use an in-memory buffer to retain metadata order, 
but this didn't seem worth the complexity.

5. What to do about zero length blocks? Should their metadata be written, or 
skipped? The answer is: it must be written, or attempts to Close such blocks 
should fail. If for whatever reason we skipped the metadata append, we could 
end up with block IDs whose blocks can't be found on startup.

http://gerrit.cloudera.org:8080/#/c/7207/16/src/kudu/cfile/cfile-test.cc
File src/kudu/cfile/cfile-test.cc:

Line 423: TEST_P(TestCFileBothCacheTypes, TestWrite100MFileInts) {
You should be able to use this to replace TestCFileBothCacheTypes. Here's what 
I think you'll need to do:

1. Rewrite TestCFileBothCacheTypes to be an empty subclass of 
TestCFileBothCacheCloseTypes.
2. The first call to INSTANTIATE_TEST_CASE_P should be for 
TestCFileBothCacheTypes and its values should be two CFileDescriptors that vary 
the CacheType but both use the CLOSE CloseType.
3. The second call to INSTANTIATE_TEST_CASE_P should be for 
TestCFileBothCacheCloseTypes and should pass all four CFileDescriptors.

After our long in-person discussion about other changes to this patch, it's 
possible all of this becomes moot because there's no longer a cfile flag to 
test.


Line 449: 
If the default value of this gflag changes to true, we'll never test the false 
case. You should use a switch statement and explicitly set it to either true or 
false.

After our long in-person discussion about other changes to this patch, it's 
possible all of this becomes moot because there's no longer a cfile flag to 
test.


http://gerrit.cloudera.org:8080/#/c/7207/16/src/kudu/cfile/cfile_writer.cc
File src/kudu/cfile/cfile_writer.cc:

PS16, Line 59: // - users could always change this to "close", which slows down 
throughput
 : //   but may improve write latency.
This part isn't quite right anymore. Should the entire comment be rewritten in 
light of the split into two gflags?

After our long in-person discussion about other changes to this patch, it's 
possible this becomes moot because there's no longer a cfile flag. Though you 
may still need to move this explanation (and rewrite it).


PS16, Line 63:  
Nit: got an extra space here.


http://gerrit.cloudera.org:8080/#/c/7207/10/src/kudu/fs/log_block_manager-test.cc
File src/kudu/fs/log_block_manager-test.cc:

Line 1288:   // Ensure the same container has not been marked as available 
twice.
> I was trying to check in the available_containers_by_data_dir_ map, there i
Ah, I forgot that available_containers_by_data_dir_::value_type is a vector. So 
indeed, if we marked the container as available twice, you'd see two entries in 
that vector. Makes sense.


-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 10
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 

[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-15 Thread Hao Hao (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7207

to look at the new patch set (#16).

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..

KUDU-1943: Add BlockTransaction to Block Manager

This adds a new layer of abstraction 'BlockTransaction' to
Block Manager, to accumulate new blocks in a logical operation,
e.g. flush/compaction. Avoid using fsync() per block can
potentionaly reduce a lot I/O overhead.

This patch also add a new API, Finalize(), to Block Manager,
so that for LBM container can be reused without flushing
in-flight writable blocks to disk.

A design doc can be found here:
https://docs.google.com/document/d/1cB8pDs-Ho7p2HlzkWo1d-wmXr90YntIjdvdOv7u7gn0/edit?usp=sharing

Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
---
M src/kudu/cfile/bloomfile.cc
M src/kudu/cfile/bloomfile.h
M src/kudu/cfile/cfile-test.cc
M src/kudu/cfile/cfile_writer.cc
M src/kudu/cfile/cfile_writer.h
M src/kudu/fs/block_manager-stress-test.cc
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/block_manager.cc
M src/kudu/fs/block_manager.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/file_block_manager.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/tablet/deltafile.cc
M src/kudu/tablet/deltafile.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/multi_column_writer.cc
M src/kudu/tablet/multi_column_writer.h
20 files changed, 734 insertions(+), 304 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/07/7207/16
-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 16
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-14 Thread Dan Burkert (Code Review)
Dan Burkert has posted comments on this change.

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..


Patch Set 14:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/7207/14/src/kudu/fs/file_block_manager.cc
File src/kudu/fs/file_block_manager.cc:

Line 396: Status FileWritableBlock::FlushAndFinalize(FlushMode mode) {
> Good catch, thanks! Do you think it is better to remove 'FLAGS_block_manage
Good question, it does seem a bit confused to have the two flags.  I'm not 
entirely sure what we're using the two for at the moment, I'd want Adar to 
weigh in.


http://gerrit.cloudera.org:8080/#/c/7207/13/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

Line 278:   Status FinishBlock(const Status& s, WritableBlock* block,
> Thanks for pointing out!  I personally prefer to keep it as the way it is, 
OK sounds good to me.


-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 14
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-11 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change.

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..


Patch Set 15:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/7207/14/src/kudu/fs/block_manager-stress-test.cc
File src/kudu/fs/block_manager-stress-test.cc:

Line 282:   dirty_blocks.emplace_back(std::move(block));
> use emplace_back with r-value references.
Done


http://gerrit.cloudera.org:8080/#/c/7207/13/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

Line 278:   Status FinishBlock(const Status& s, WritableBlock* block,
> Yah I didn't realize it was being called in a closure.  Still, you could re
Thanks for pointing out!  I personally prefer to keep it as the way it is, 
because inside FinishBlock(), if the input status is not ok, we need to mark 
the container as read-only. Or maybe you have other thought?


http://gerrit.cloudera.org:8080/#/c/7207/14/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

Line 1524: RETURN_NOT_OK_PREPEND(s, "unable to append block metadata");
> here and abve, start the message with a lowercase ('unable', 'container').
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 15
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-11 Thread Hao Hao (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7207

to look at the new patch set (#15).

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..

KUDU-1943: Add BlockTransaction to Block Manager

This adds a new layer of abstraction 'BlockTransaction' to
Block Manager, to accumulate new blocks in a logical operation,
e.g. flush/compaction. Avoid using fsync() per block can
potentionaly reduce a lot I/O overhead.

This patch also add a new API, Finalize(), to Block Manager,
so that for LBM container can be reused without flushing
in-flight writable blocks to disk.

A design doc can be found here:
https://docs.google.com/document/d/1cB8pDs-Ho7p2HlzkWo1d-wmXr90YntIjdvdOv7u7gn0/edit?usp=sharing

Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
---
M src/kudu/cfile/bloomfile.cc
M src/kudu/cfile/bloomfile.h
M src/kudu/cfile/cfile-test.cc
M src/kudu/cfile/cfile_writer.cc
M src/kudu/cfile/cfile_writer.h
M src/kudu/fs/block_manager-stress-test.cc
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/block_manager.cc
M src/kudu/fs/block_manager.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/file_block_manager.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/tablet/deltafile.cc
M src/kudu/tablet/deltafile.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/multi_column_writer.cc
M src/kudu/tablet/multi_column_writer.h
20 files changed, 734 insertions(+), 304 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/07/7207/15
-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 15
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-11 Thread Dan Burkert (Code Review)
Dan Burkert has posted comments on this change.

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..


Patch Set 14:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/7207/14/src/kudu/fs/block_manager-stress-test.cc
File src/kudu/fs/block_manager-stress-test.cc:

Line 282:   dirty_blocks.push_back(std::move(block));
use emplace_back with r-value references.


http://gerrit.cloudera.org:8080/#/c/7207/14/src/kudu/fs/file_block_manager.cc
File src/kudu/fs/file_block_manager.cc:

Line 396: Status FileWritableBlock::FlushAndFinalize(FlushMode mode) {
Hmm I'm not sure this is doing what we want when 
block_manager_flush_on_finalize = false.  Consider this sequence of events:

  FLAGS_block_manager_flush_on_finalize = false;
  FLAGS_block_coalesce_close = true;
  block.Finalize(); // Finalizes the block, but doesn't flush it
  block_manager.CloseBlocks( { move(block) } ); // Doesn't flush the block, 
since it's already finalized.

I think in that case CloseBlocks actually should async flush the block.


http://gerrit.cloudera.org:8080/#/c/7207/13/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

Line 278:   Status FinishBlock(const Status& s, WritableBlock* block,
> Sorry, I do not quite follow your comment. You mean FinishBlock is always c
Yah I didn't realize it was being called in a closure.  Still, you could remove 
the Status param and change line 1431 to be:

s = s.AndThen([&] { return container_->FinishBlock(this, round_mode, 
block_offset_); });


http://gerrit.cloudera.org:8080/#/c/7207/14/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

Line 1524: RETURN_NOT_OK_PREPEND(s, "Unable to append block metadata");
here and abve, start the message with a lowercase ('unable', 'container').


-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 14
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-10 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change.

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..


Patch Set 14:

(17 comments)

http://gerrit.cloudera.org:8080/#/c/7207/13//COMMIT_MSG
Commit Message:

PS13, Line 10: new blo
> s/creates which happen/new blocks
Done


PS13, Line 15:  
> no comma here
Done


PS13, Line 15: reused
> reused
Done


Line 19: 
https://docs.google.com/a/cloudera.com/document/d/15zZaKeLENRGC_AdjF2N3hxtM_6_OuydXA1reWvDbbK0/edit?usp=sharing
> I think you may need to make this viewable from anyone (I tried in an incog
Done


http://gerrit.cloudera.org:8080/#/c/7207/13/src/kudu/fs/block_manager.h
File src/kudu/fs/block_manager.h:

PS13, Line 74: resulti
> s/results/resulting
Done


Line 88: // No more data may be written to the block, but it is not yet 
guaranteed
> 'not yet closed' doesn't really distinguish this state from CLOSED.  I thin
Done


Line 281: // Group a set of block writes together in a transaction. This has two
> prefer 'BlockTransaction() = default;'
Done


Line 283: // synchronization for a batch of blocks if possible to achieve better
> Can't this just be set to the default dtor?
Done


PS13, Line 288: 
> emplace_back
Done


PS13, Line 293: 
> Might be more clear as 'CommitCreatedBlocks'.
Done


Line 305: Status s = bm->CloseBlocks(created_blocks_);
> Would it be difficult to change BlockManager::CloseBlocks to take a 'const 
Updated it as you suggested.


http://gerrit.cloudera.org:8080/#/c/7207/13/src/kudu/fs/file_block_manager.cc
File src/kudu/fs/file_block_manager.cc:

Line 727: 
> I think this should be doing an async flush regardless of the block_manager
Done


http://gerrit.cloudera.org:8080/#/c/7207/13/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

Line 278:   Status FinishBlock(const Status& s, WritableBlock* block,
> It looks like the only place that calls FinishBlock calls it with an OK sta
Sorry, I do not quite follow your comment. You mean FinishBlock is always 
called with 's' as OK status? It does not seem to be the case, either Sync() or 
AppendMetadata() can return bad status, right?


PS13, Line 522: unnecessary I
> unnecessary
Done


Line 1234:   // Actually close the block, possibly synchronizing its dirty data 
and
> Could DoClose and CloseFlags be private?
Hmm, they are used in other class such as LogBlockManager.


Line 1911:   record.set_timestamp_us(GetCurrentTimeMicros());
> Same thing here - I think we want to do this depending only on the block_co
Done


PS13, Line 1917:  that 
> s/belong/belonging
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 14
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-10 Thread Hao Hao (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7207

to look at the new patch set (#14).

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..

KUDU-1943: Add BlockTransaction to Block Manager

This adds a new layer of abstraction 'BlockTransaction' to
Block Manager, to accumulate new blocks in a logical operation,
e.g. flush/compaction. Avoid using fsync() per block can
potentionaly reduce a lot I/O overhead.

This patch also add a new API, Finalize(), to Block Manager,
so that for LBM container can be reused without flushing
in-flight writable blocks to disk.

A design doc can be found here:
https://docs.google.com/a/cloudera.com/document/d/15zZaKeLENRGC_AdjF2N3hxtM_6_OuydXA1reWvDbbK0/edit?usp=sharing

Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
---
M src/kudu/cfile/bloomfile.cc
M src/kudu/cfile/bloomfile.h
M src/kudu/cfile/cfile-test.cc
M src/kudu/cfile/cfile_writer.cc
M src/kudu/cfile/cfile_writer.h
M src/kudu/fs/block_manager-stress-test.cc
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/block_manager.cc
M src/kudu/fs/block_manager.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/file_block_manager.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/tablet/deltafile.cc
M src/kudu/tablet/deltafile.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/multi_column_writer.cc
M src/kudu/tablet/multi_column_writer.h
20 files changed, 734 insertions(+), 304 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/07/7207/14
-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 14
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-08 Thread Dan Burkert (Code Review)
Dan Burkert has posted comments on this change.

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..


Patch Set 13:

(19 comments)

http://gerrit.cloudera.org:8080/#/c/7207/5//COMMIT_MSG
Commit Message:

PS5, Line 12: potentionaly
typo: potentially


http://gerrit.cloudera.org:8080/#/c/7207/13//COMMIT_MSG
Commit Message:

PS13, Line 10: creates
s/creates which happen/new blocks


PS13, Line 14: I FinalizeWrite() t
add commas (API, FinalizeWrite(), to...)


PS13, Line 15: resued
reused


PS13, Line 15: ,
no comma here


Line 19: 
https://docs.google.com/a/cloudera.com/document/d/15zZaKeLENRGC_AdjF2N3hxtM_6_OuydXA1reWvDbbK0/edit?usp=sharing
I think you may need to make this viewable from anyone (I tried in an incognito 
window and it asked me to log into my cloudera gmail account).


http://gerrit.cloudera.org:8080/#/c/7207/13/src/kudu/fs/block_manager.h
File src/kudu/fs/block_manager.h:

PS13, Line 74: results
s/results/resulting


Line 88: // No more data may be written to the block, but it is not yet 
closed.
'not yet closed' doesn't really distinguish this state from CLOSED.  I think 
this would be more clear as 'but it is not yet guaranteed to be durably stored 
on disk'.


Line 281:   BlockTransaction() {}
prefer 'BlockTransaction() = default;'


Line 283:   ~BlockTransaction() {
Can't this just be set to the default dtor?

   ~BlockTransaction() = default;


PS13, Line 288: push_back(
emplace_back


PS13, Line 293: CommitCreation
Might be more clear as 'CommitCreatedBlocks'.


Line 305: Status s = bm->CloseBlocks(blocks);
Would it be difficult to change BlockManager::CloseBlocks to take a 'const 
vector&' so you don't have to make a vector copy 
here?  If that would require changing a bunch of other call sites, feel free to 
ignore.


http://gerrit.cloudera.org:8080/#/c/7207/13/src/kudu/fs/file_block_manager.cc
File src/kudu/fs/file_block_manager.cc:

Line 727: // dirty pages if 'block_manager_flush_on_finalize' is true.
I think this should be doing an async flush regardless of the 
block_manager_flush_on_finalize flag value.  Unfortunately I think that means 
you will need to copy most of the body of Finalize and remove the check (maybe 
abstract it out?), but I think we definitely want to parallel flushing to 
happen here.  block_coalesce_close shouldn't be tied to the flush on finalize 
flag.


http://gerrit.cloudera.org:8080/#/c/7207/13/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

Line 278:   Status FinishBlock(const Status& s, WritableBlock* block,
It looks like the only place that calls FinishBlock calls it with an OK status, 
so I think the s param can be removed.


PS13, Line 522: unnecessarily
unnecessary


Line 1234:   Status DoClose(CloseFlags flags);
Could DoClose and CloseFlags be private?


Line 1911: // dirty pages if 'block_manager_flush_on_finalize' is true.
Same thing here - I think we want to do this depending only on the 
block_coalesce_close flag.


PS13, Line 1917: belong
s/belong/belonging


-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 13
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-04 Thread Hao Hao (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7207

to look at the new patch set (#13).

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..

KUDU-1943: Add BlockTransaction to Block Manager

This adds a new layer of abstraction 'BlockTransaction' to
Block Manager, to accumulate creates which happen in a
logical operation, e.g. flush/compaction. Avoid using
fsync() per block can potentionaly reduce a lot I/O overhead.

This patch also add a new API FinalizeWrite() to Block Manager,
so that for LBM container can be resued, without flushing
in-flight writable blocks to disk.

A design doc can be found here:
https://docs.google.com/a/cloudera.com/document/d/15zZaKeLENRGC_AdjF2N3hxtM_6_OuydXA1reWvDbbK0/edit?usp=sharing

Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
---
M src/kudu/cfile/bloomfile.cc
M src/kudu/cfile/bloomfile.h
M src/kudu/cfile/cfile-test.cc
M src/kudu/cfile/cfile_writer.cc
M src/kudu/cfile/cfile_writer.h
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/block_manager.cc
M src/kudu/fs/block_manager.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/tablet/deltafile.cc
M src/kudu/tablet/deltafile.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/multi_column_writer.cc
M src/kudu/tablet/multi_column_writer.h
18 files changed, 657 insertions(+), 262 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/07/7207/13
-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 13
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-04 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change.

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..


Patch Set 12:

(38 comments)

http://gerrit.cloudera.org:8080/#/c/7207/10/src/kudu/cfile/cfile-test.cc
File src/kudu/cfile/cfile-test.cc:

Line 900: }
> Could you further parameterize the test so that we run with both values of 
Done


http://gerrit.cloudera.org:8080/#/c/7207/10/src/kudu/cfile/cfile_writer.cc
File src/kudu/cfile/cfile_writer.cc:

PS10, Line 62: "Whether to finalize or close cfile blocks when 
writing "
 : "is finished. ");
> "Whether to finalize or close cfile blocks when writing is finished."
Done


http://gerrit.cloudera.org:8080/#/c/7207/10/src/kudu/fs/block_manager-test.cc
File src/kudu/fs/block_manager-test.cc:

Line 998: TYPED_TEST(BlockManagerTest, 
ConcurrentCloseFinalizedWritableBlockTest) {
> kTestData for this. Above and below too.
Done


Line 1002:   // written, and then Close() it.
> Since these operations happen on other threads, you should use CHECK_OK ins
Done


Line 1007:   CHECK_OK(writer->Append(kTestData));
> Maybe inject a tiny (random) bit of sleeping between Finalize() and Close()
Done


PS10, Line 1010:   CHECK_OK(writer->Close());
   : }
   :   };
   : 
   :   vector threads;
   :   for (int i = 0; i < 100; i++) {
   : threads.emplace_back(write_data);
   :   }
   :   for (auto& t : threads) {
   :
> Use std::thread; they're less LOC and thus better suited for tests.
Done


Line 1032: ASSERT_OK(block->Read(0, ));
> ASSERT_EQ
Done


PS10, Line 1043: a 
> "some" instead, so that if 20 changes to 30 or 40, we don't have to update 
Done


Line 1053: ASSERT_OK(writer->Append(kTestData));
> What's this about?
My bad, comment it out for testing, but forget to uncomment back.


http://gerrit.cloudera.org:8080/#/c/7207/10/src/kudu/fs/block_manager.h
File src/kudu/fs/block_manager.h:

Line 22: #include 
> What happened here? The older style (and location) was correct.
Done


PS10, Line 73: 1) flushing of dirty
 : //blocks a
> Not sure what this is saying; please reword.
Done


Line 88: // No more data may be written to the block, but it is not yet 
closed.
> "No more data may be written to the block, but it is not yet closed."
Done


PS10, Line 123:   // Signals that the block will no longer receive writes. Does 
not guarantee
  :   // durability; Close() must still be called for t
> "Signals that the block will no longer receive writes. Does not guarantee d
Done


PS10, Line 275: block 
> block
Done


PS10, Line 276: 1) the underlying block manager can optimize
> Expand on this a bit (by describing in vagaries that the underlying block m
Done


PS10, Line 277: possi
> logical
Done


Line 279: class BlockTransaction {
> Nit: indent by one char.
Done


Line 312:   }
> Nit: indent by one char.
Done


http://gerrit.cloudera.org:8080/#/c/7207/10/src/kudu/fs/file_block_manager.cc
File src/kudu/fs/file_block_manager.cc:

PS10, Line 335: 
  :   if (state_ == FINALIZED) {
  : return Status::OK();
  :   }
  :   VLOG(3) << "Finalizing block " << id();
  :   if (state_ == DIRTY) {
  : if (FLAGS_block_manager_flush_on_finalize) {
  :
> How about inverting this:
Done


http://gerrit.cloudera.org:8080/#/c/7207/10/src/kudu/fs/fs_manager.cc
File src/kudu/fs/fs_manager.cc:

Line 38: #include "kudu/gutil/stringprintf.h"
> Why is this include needed?
Done


http://gerrit.cloudera.org:8080/#/c/7207/10/src/kudu/fs/log_block_manager-test.cc
File src/kudu/fs/log_block_manager-test.cc:

Line 1274:   }
> I'd also close all the blocks in 'blocks' and check all_containers_by_name_
Done


Line 1277:   for (const auto& block : blocks) {
> Isn't this state transition already tested by BlockManagerTest.WritableBloc
I combined this test with the test case above. Let me know if you still think 
it is redundant. Thanks!


Line 1288: } // namespace fs
> How does L1289 check this?
I was trying to check in the available_containers_by_data_dir_ map, there is 
only one container for the corresponding datadir(should be the first element of 
the map). Maybe there is a more elegant way to do so?


http://gerrit.cloudera.org:8080/#/c/7207/10/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

Line 129: using pb_util::WritablePBContainerFile;
> There's an existing internal::LogWritableBlock reference in the file; pleas
Done


Line 513:   // The metrics. Not owned by the log container; it has the same 
lifespan
> Please use AtomicInt rather than std::atomic, since that's what's already b
Done


Line 984:   // is we sync the data/metadata more often than needed.
> We shouldn't hold spinlocks while doing IO, because IO is 

[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-04 Thread Hao Hao (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7207

to look at the new patch set (#12).

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..

KUDU-1943: Add BlockTransaction to Block Manager

This adds a new layer of abstraction 'BlockTransaction' to
Block Manager, to accumulate creates which happen in a
logical operation, e.g. flush/compaction. Avoid using
fsync() per block can potentionaly reduce a lot I/O overhead.

This patch also add a new API FinalizeWrite() to Block Manager,
so that for LBM container can be resued, without flushing
in-flight writable blocks to disk.

A design doc can be found here:
https://docs.google.com/a/cloudera.com/document/d/15zZaKeLENRGC_AdjF2N3hxtM_6_OuydXA1reWvDbbK0/edit?usp=sharing

Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
---
M src/kudu/cfile/bloomfile.cc
M src/kudu/cfile/bloomfile.h
M src/kudu/cfile/cfile-test.cc
M src/kudu/cfile/cfile_writer.cc
M src/kudu/cfile/cfile_writer.h
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/block_manager.cc
M src/kudu/fs/block_manager.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/tablet/deltafile.cc
M src/kudu/tablet/deltafile.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/multi_column_writer.cc
M src/kudu/tablet/multi_column_writer.h
18 files changed, 662 insertions(+), 267 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/07/7207/12
-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 12
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-04 Thread Hao Hao (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7207

to look at the new patch set (#11).

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..

KUDU-1943: Add BlockTransaction to Block Manager

This adds a new layer of abstraction 'BlockTransaction' to
Block Manager, to accumulate creates which happen in a
logical operation, e.g. flush/compaction. Avoid using
fsync() per block can potentionaly reduce a lot I/O overhead.

This patch also add a new API FinalizeWrite() to Block Manager,
so that for LBM container can be resued, without flushing
in-flight writable blocks to disk.

A design doc can be found here:
https://docs.google.com/a/cloudera.com/document/d/15zZaKeLENRGC_AdjF2N3hxtM_6_OuydXA1reWvDbbK0/edit?usp=sharing

Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
---
M src/kudu/cfile/bloomfile.cc
M src/kudu/cfile/bloomfile.h
M src/kudu/cfile/cfile-test.cc
M src/kudu/cfile/cfile_writer.cc
M src/kudu/cfile/cfile_writer.h
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/block_manager.cc
M src/kudu/fs/block_manager.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/tablet/deltafile.cc
M src/kudu/tablet/deltafile.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/multi_column_writer.cc
M src/kudu/tablet/multi_column_writer.h
M src/kudu/util/scoped_cleanup-test.cc
19 files changed, 627 insertions(+), 268 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/07/7207/11
-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 11
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-02 Thread Hao Hao (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7207

to look at the new patch set (#10).

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..

KUDU-1943: Add BlockTransaction to Block Manager

This adds a new layer of abstraction 'BlockTransaction' to
Block Manager, to accumulate creates which happen in a
logical operation, e.g. flush/compaction. Avoid using
fsync() per block can potentionaly reduce a lot I/O overhead.

This patch also add a new API FinalizeWrite() to Block Manager,
so that for LBM container can be resued, without flushing
in-flight writable blocks to disk.

A design doc can be found here:
https://docs.google.com/a/cloudera.com/document/d/15zZaKeLENRGC_AdjF2N3hxtM_6_OuydXA1reWvDbbK0/edit?usp=sharing

Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
---
M src/kudu/cfile/bloomfile.cc
M src/kudu/cfile/bloomfile.h
M src/kudu/cfile/cfile-test.cc
M src/kudu/cfile/cfile_writer.cc
M src/kudu/cfile/cfile_writer.h
M src/kudu/fs/block_manager-test.cc
M src/kudu/fs/block_manager.cc
M src/kudu/fs/block_manager.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/fs_manager.cc
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/tablet/deltafile.cc
M src/kudu/tablet/deltafile.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/multi_column_writer.cc
M src/kudu/tablet/multi_column_writer.h
19 files changed, 548 insertions(+), 244 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/07/7207/10
-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 10
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot


[kudu-CR] KUDU-1943: Add BlockTransaction to Block Manager

2017-08-02 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change.

Change subject: KUDU-1943: Add BlockTransaction to Block Manager
..


Patch Set 10:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/7207/9/src/kudu/cfile/cfile-test.cc
File src/kudu/cfile/cfile-test.cc:

Line 38: DECLARE_bool(cfile_close_block_on_finish);
> warning: redundant 'FLAGS_block_cache_type' declaration [readability-redund
Done


http://gerrit.cloudera.org:8080/#/c/7207/9/src/kudu/fs/block_manager.h
File src/kudu/fs/block_manager.h:

Line 22: #include 
> warning: #includes are not sorted properly [llvm-include-order]
Done


http://gerrit.cloudera.org:8080/#/c/7207/9/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

Line 1229:   // metadata to disk.
> warning: function 'kudu::fs::internal::LogWritableBlock::DoClose' has a def
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I01f3ed830cc1f86d7bb8a66b2f780914a4bd
Gerrit-PatchSet: 10
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: Yes