Grant Henke has posted comments on this change. ( http://gerrit.cloudera.org:8080/12119 )

Change subject: [blog] a blogpost about location awareness in Kudu
......................................................................


Patch Set 8:

(28 comments)

http://gerrit.cloudera.org:8080/#/c/12119/8/_posts/2019-03-25-location-awareness.md
File _posts/2019-03-25-location-awareness.md:

http://gerrit.cloudera.org:8080/#/c/12119/8/_posts/2019-03-25-location-awareness.md@1
PS8, Line 1: ---
Can you push this to the gh-pages branch in your GitHub fork so a rendered 
version can be proofed?


http://gerrit.cloudera.org:8080/#/c/12119/8/_posts/2019-03-25-location-awareness.md@15
PS8, Line 15: <!--TODO(aserbin) rename the file to reflect the date when published -->
Should this be removed?


http://gerrit.cloudera.org:8080/#/c/12119/8/_posts/2019-03-25-location-awareness.md@19
PS8, Line 19: first cut
initial implementation?


http://gerrit.cloudera.org:8080/#/c/12119/8/_posts/2019-03-25-location-awareness.md@19
PS8, Line 19: starting 1.9.0
...starting *with the* 1.9.0...


http://gerrit.cloudera.org:8080/#/c/12119/8/_posts/2019-03-25-location-awareness.md@20
PS8, Line 20: is built for the following use case:
I am not sure this is a "use case" per se; it reads more like a description of 
what the term "location awareness" currently means in Kudu. Maybe say something like:

"In the initial implementation of location awareness in Kudu, when a Kudu 
cluster consists of multiple servers spread across several racks, Kudu will 
place the replicas of a tablet in such a way that the tablet stays available 
even if all the servers in a single rack become unavailable."


http://gerrit.cloudera.org:8080/#/c/12119/8/_posts/2019-03-25-location-awareness.md@26
PS8, Line 26: A rack failure might happen because of a failure of a hardware component shared
            : among servers in the rack: network switch, power supply, etc.
A rack failure can occur when a hardware component shared
among servers in the rack, such as a network switch or power supply, fails.


http://gerrit.cloudera.org:8080/#/c/12119/8/_posts/2019-03-25-location-awareness.md@31
PS8, Line 31: network latency between datacenters is low.
This is a good opportunity to explicitly mention that this is why we call the 
feature location awareness and not rack awareness.


http://gerrit.cloudera.org:8080/#/c/12119/8/_posts/2019-03-25-location-awareness.md@37
PS8, Line 37: are
            : supposed to
should


http://gerrit.cloudera.org:8080/#/c/12119/8/_posts/2019-03-25-location-awareness.md@38
PS8, Line 38: physical or cloud-defined hierarchy of the
            : deployed cluster
I am not sure I understand what this means in relation to the utility of 
location awareness. I suspect it's saying that the components should map to the 
hierarchical levels of "failure domains".

You could then give a private data center example:
`/data-center-0/rack-09`

And a cloud example:
`/region-0/availability-zone-01`


http://gerrit.cloudera.org:8080/#/c/12119/8/_posts/2019-03-25-location-awareness.md@41
PS8, Line 41: However, we want to keep the hierarchy
            : there to make it possible to exploit it later
However, we plan to leverage the hierarchical structure in future releases.


http://gerrit.cloudera.org:8080/#/c/12119/8/_posts/2019-03-25-location-awareness.md@43
PS8, Line 43: compatibility with HDFS
Perhaps this should be moved up and described in a bit more detail as a design 
choice? It's useful to know that you can use the same locations as your HDFS 
nodes, because it's common to deploy Kudu alongside HDFS.


http://gerrit.cloudera.org:8080/#/c/12119/8/_posts/2019-03-25-location-awareness.md@52
PS8, Line 52: etc
What is the "etc"? What else does it use it for?


http://gerrit.cloudera.org:8080/#/c/12119/8/_posts/2019-03-25-location-awareness.md@55
PS8, Line 55: location string for the specified IP address/hostname.
The script below specifically shows an IP address. How do I use a hostname?
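
To make the question concrete, here is a minimal sketch of a mapping function that accepts either form of input. It is not the script from the post: the subnet and hostname tables, the `map_location` name, and the `/other` fallback are all hypothetical and only illustrate one way the two cases could be handled.

```python
import ipaddress

# Hypothetical lookup tables; a real deployment would load these from its
# own inventory rather than hard-coding them.
SUBNET_TO_LOCATION = {
    ipaddress.ip_network("10.0.1.0/24"): "/data-center-0/rack-01",
    ipaddress.ip_network("10.0.2.0/24"): "/data-center-0/rack-02",
}
HOSTNAME_TO_LOCATION = {
    "ts-09.example.com": "/data-center-0/rack-09",
}
# Assumption: unknown inputs fall back to a catch-all location.
DEFAULT_LOCATION = "/other"


def map_location(identifier: str) -> str:
    """Return a location string for an IP address or a hostname."""
    try:
        address = ipaddress.ip_address(identifier)
    except ValueError:
        # Not an IP literal, so treat the input as a hostname.
        return HOSTNAME_TO_LOCATION.get(identifier.lower(), DEFAULT_LOCATION)
    # IP literal: return the location of the first subnet containing it.
    for network, location in SUBNET_TO_LOCATION.items():
        if address in network:
            return location
    return DEFAULT_LOCATION
```

A wrapper script would pass its single command-line argument to `map_location` and print the result; whether the fallback should be `/other` or an error is exactly the kind of choice the post should spell out.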


http://gerrit.cloudera.org:8080/#/c/12119/8/_posts/2019-03-25-location-awareness.md@59
PS8, Line 59: tablet server restarts
Is this dependent on `--follower_unavailable_considered_failed_sec`? Or will a 
"quick" restart cause the location to be reset?


http://gerrit.cloudera.org:8080/#/c/12119/8/_posts/2019-03-25-location-awareness.md@59
PS8, Line 59: Kudu tablet servers are location
            : agnostic, at least for now, so the assigned location is not reported back
            : to the tablet server.
Maybe this paragraph would flow better if you moved this part to the bottom. 
That way you first describe how the master uses the location configuration, 
and then note at the end that the tablet servers do not need or use it.


http://gerrit.cloudera.org:8080/#/c/12119/8/_posts/2019-03-25-location-awareness.md@64
PS8, Line 64: masters provide connected clients
How do they do this?


http://gerrit.cloudera.org:8080/#/c/12119/8/_posts/2019-03-25-location-awareness.md@62
PS8, Line 62: to try to place replicas evenly across
            : locations and to keep tablets available in case all tablet servers in a single
            : location fail.
This last part is somewhat duplicated from the Introduction section above. 
Perhaps it's not needed here.


http://gerrit.cloudera.org:8080/#/c/12119/8/_posts/2019-03-25-location-awareness.md@75
PS8, Line 75: Essentially, that's about having
This results in...


http://gerrit.cloudera.org:8080/#/c/12119/8/_posts/2019-03-25-location-awareness.md@81
PS8, Line 81: The error handling and the input validation are minimalistic. Also, the
            : #   network topology choice, supportability and capacity planning aspects of
            : #   this script might be sub-optimal if applied as-is for real-world use cases.
Is there anywhere else a reader can get a "good", production-worthy example? If 
not from us, from whom? This leaves the reader with a lot of concerning 
questions.


http://gerrit.cloudera.org:8080/#/c/12119/8/_posts/2019-03-25-location-awareness.md@104
PS8, Line 104:   echo "ERROR: '$ip_address' is not a valid IPv4 address"
Should errors map to "/other"? How does Kudu handle this script exiting with a 
non-zero exit code?


http://gerrit.cloudera.org:8080/#/c/12119/8/_posts/2019-03-25-location-awareness.md@142
PS8, Line 142: The reasoning is simple: with
I try to stay away from saying something is "simple". People have widely varying 
levels of experience with distributed systems. Maybe something like:

"It's recommended to have at least three locations defined in a Kudu
cluster so that no location contains a majority of replicas of a tablet."

Then below you can mention the replication factor of 3 in your example.
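
The arithmetic behind that recommendation could be shown with a small worked check. This is only an illustrative sketch: the function name and the rack labels are hypothetical, and it models nothing about Kudu beyond counting replicas per location.

```python
from collections import Counter


def location_holding_majority(replica_locations):
    """Return the location that holds a strict majority of replicas, or None."""
    majority = len(replica_locations) // 2 + 1
    for location, count in Counter(replica_locations).items():
        if count >= majority:
            return location
    return None


# Replication factor 3 spread over three locations: no single location
# holds 2 of the 3 replicas, so losing any one location leaves a majority.
three_locations = ["/rack-0", "/rack-1", "/rack-2"]

# Replication factor 3 with only two locations: one location must hold
# at least 2 replicas, so losing it takes out the majority.
two_locations = ["/rack-0", "/rack-0", "/rack-1"]
```

With only two locations and a replication factor of 3, some location always ends up with two replicas, which is why three locations is the smallest number that avoids the problem.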


http://gerrit.cloudera.org:8080/#/c/12119/8/_posts/2019-03-25-location-awareness.md@151
PS8, Line 151: The location-aware placement policy for tablet replicas in Kudu
This seems more appropriate for earlier sections. When reading the blog post I 
got the idea that the structure was:

- What it is
- How it works
- How to use it
- Future work

We are now in the "How to use it" part, but this is more about how it works.

Can users configure these policies? Is there more than one?


http://gerrit.cloudera.org:8080/#/c/12119/8/_posts/2019-03-25-location-awareness.md@162
PS8, Line 162: Automatic re-replication and placement policy
Per my earlier comment, this is also more about "How it works".


http://gerrit.cloudera.org:8080/#/c/12119/8/_posts/2019-03-25-location-awareness.md@177
PS8, Line 177: Reinstating location-aware policy in Kudu cluster
I think this is "How to use it" and makes sense here.


http://gerrit.cloudera.org:8080/#/c/12119/8/_posts/2019-03-25-location-awareness.md@208
PS8, Line 208: Examples
Per my earlier comment, this is also more about "How it works".


http://gerrit.cloudera.org:8080/#/c/12119/8/_posts/2019-03-25-location-awareness.md@337
PS8, Line 337: roadmap
What roadmap? Does Apache Kudu have a roadmap?

Maybe we should open JIRAs and link them for any future work/ideas.


http://gerrit.cloudera.org:8080/#/c/12119/8/_posts/2019-03-25-location-awareness.md@342
PS8, Line 342: see [2]
Any reason not to link inline instead of using reference style?


http://gerrit.cloudera.org:8080/#/c/12119/8/_posts/2019-03-25-location-awareness.md@346
PS8, Line 346: [[1]] [Location awareness in Kudu, design document](https://s.apache.org/location-awareness-design)
Can we check this design doc into 
https://github.com/apache/kudu/tree/master/docs/design-docs and link there?



--
To view, visit http://gerrit.cloudera.org:8080/12119
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: gh-pages
Gerrit-MessageType: comment
Gerrit-Change-Id: I10b30a80d5661fb889a11285b8118cdea1a87cd2
Gerrit-Change-Number: 12119
Gerrit-PatchSet: 8
Gerrit-Owner: Alexey Serbin <aser...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com>
Gerrit-Reviewer: Grant Henke <granthe...@apache.org>
Gerrit-Reviewer: Greg Solovyev <gsolov...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
Gerrit-Reviewer: Will Berkeley <wdberke...@gmail.com>
Gerrit-Comment-Date: Tue, 26 Mar 2019 04:05:31 +0000
Gerrit-HasComments: Yes
