These look nice! Technical question: is "layout unknown" the truth in terms of how processing is assigned to the NCs? If so, there is a (future) opportunity to do somewhat better, if desired: the HDFS name node could provide info that the compiler could use to try to set location constraints for the Hyracks operators, so that the latter two figures behave more like the first one (instead of being location-unaware).
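
To make the idea concrete, here is a minimal sketch of location-aware assignment. It is not VXQuery or Hyracks code. The function name assign_blocks, the block ids, and the host names are all hypothetical. It assumes the block-to-host map has already been obtained from the name node (in Hadoop this kind of information is exposed through FileSystem.getFileBlockLocations) and that the NC host list is known. Each block goes to the least-loaded NC that stores a replica, with a fallback to any NC when no replica is local.

```python
from collections import defaultdict


def assign_blocks(block_hosts, nc_hosts):
    """Map each block id to an NC host, preferring hosts that hold a replica.

    block_hosts: dict of block id -> list of hosts storing that block
                 (hypothetical stand-in for what the name node reports).
    nc_hosts:    list of hosts that run a NodeController.
    """
    load = defaultdict(int)  # number of blocks assigned to each NC so far
    assignment = {}
    for block_id in sorted(block_hosts):
        # NCs that are co-located with a replica of this block.
        local = [h for h in block_hosts[block_id] if h in nc_hosts]
        # Fall back to every NC when no replica is local (a remote read).
        candidates = local if local else list(nc_hosts)
        # Pick the least-loaded candidate; break ties by host name.
        chosen = min(candidates, key=lambda h: (load[h], h))
        assignment[block_id] = chosen
        load[chosen] += 1
    return assignment


# Hypothetical example data, not taken from a real cluster.
block_hosts = {
    "blk_1": ["node1", "node2"],
    "blk_2": ["node1", "node3"],
    "blk_3": ["node9"],  # no NC runs on node9, so this read is remote
}
nc_hosts = ["node1", "node2", "node3"]
placement = assign_blocks(block_hosts, nc_hosts)
```

The resulting map could then be turned into location constraints on the scan operators. How that would be wired into the compiler is left open here.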

Cheers,
Mike


On 8/18/15 4:13 PM, Preston Carman wrote:
The figures have been updated based on Till's feedback. I also noticed I
did not include the Yarn figure link.

- Full names of processes
- Legend added
- Added outline to represent cluster
- Standardized the process

The figures seem to express the logical and physical layout better
now. Ready for the next round of suggestions.

Preston


VXQuery Cluster:
https://docs.google.com/drawings/d/1PZbvJk-G0J3hQffd-fFr2n893bXSNg3xfXFexM5c2A8/edit?usp=sharing

VXQuery Cluster using HDFS:
https://docs.google.com/drawings/d/1ge-0h8wa0Epio42Wor-SeBoafQdLSZxfKZFFQtcN1w0/edit?usp=sharing

VXQuery Yarn Cluster using HDFS:
https://docs.google.com/drawings/d/13_kP4Yt1ze_pgqQcbVLmlBOxE6aX0Pmjg3FT2q4XX2k/edit?usp=sharing

On Mon, Aug 17, 2015 at 4:08 PM, Till Westmann <[email protected]> wrote:

Hi Preston,

Thanks for creating those diagrams!

A few comments/proposals:
1) I think that it would be good to clarify the meaning of the shapes and
lines. For the first diagram I read regular rectangles as machines, round
rectangles as processes and the rectangle with the wavy bottom as files.
On the second one I'm not sure if the rounded rectangle around HDFS is a
process. Maybe we could add a legend for the diagrams?
2) When naming the machines I would replace "laptop" with "client" as
that's more generic and potentially fix the spelling of controller.
However, I think that the naming of the "Hyracks machines" doesn't add a
lot. Maybe we could just expand on the name of the processes to
NodeController and ClusterController and not have names for the individual
cluster nodes. Having the long process names would also ease the connection
between the diagrams and the code.

Does this make sense?

Cheers,
Till


On 17 Aug 2015, at 12:05, Eldon Carman wrote:

The following diagrams are intended to be used on our documentation site
(as images in the HTML). I think they will be helpful in discussing the
actual architecture of the VXQuery cluster, especially in Yarn.

Please post questions or suggestions on how to clarify or improve the
diagrams or cluster architecture.


VXQuery Cluster:

https://docs.google.com/drawings/d/1PZbvJk-G0J3hQffd-fFr2n893bXSNg3xfXFexM5c2A8/edit?usp=sharing

VXQuery Cluster using HDFS:

https://docs.google.com/drawings/d/1ge-0h8wa0Epio42Wor-SeBoafQdLSZxfKZFFQtcN1w0/edit?usp=sharing

VXQuery Yarn Cluster using HDFS:

https://docs.google.com/drawings/d/13_kP4Yt1ze_pgqQcbVLmlBOxE6aX0Pmjg3FT2q4XX2k/edit?usp=sharing

