2018-01-04 16:24:57 UTC - Daniel Ferreira Jorge: What is the proxy component? I 
cannot find any kind of documentation for it 
(<https://github.com/apache/incubator-pulsar/blob/master/kubernetes/generic/proxy.yaml>).
 Why is it only used in the generic Kubernetes deployment and not in the GKE 
version?
----
2018-01-04 16:27:13 UTC - Matteo Merli: It’s a component that was introduced 
recently. Essentially it’s a stateless proxy that speaks the Pulsar binary 
protocol. The motivation is to avoid the need for direct connections between 
clients and brokers (or to work around cases where they are impossible).
----
2018-01-04 16:28:35 UTC - Matteo Merli: Being a stateless service, it can be 
exposed through a regular load balancer (e.g. ElasticLoadBalancer or 
clusterIP/nodePort in Kubernetes)
----
2018-01-04 16:29:09 UTC - Matteo Merli: (the documentation for it is not really 
“complete”)
----
2018-01-04 16:30:52 UTC - Matteo Merli: the way it works is that you point the 
clients to the proxy rather than to the brokers, and the proxy makes sure that 
all the connections are routed through itself
----
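To illustrate the point above, here is a minimal toy model of the difference between connecting directly and connecting through the proxy. It is only a sketch and not Pulsar's actual lookup logic: `Broker`, `owner_of`, and `connect_target` are hypothetical names, the addresses are made up, and the topic-to-broker assignment is a stand-in.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Broker:
    # Address this broker advertises to clients (made-up example value).
    advertised_address: str


def owner_of(topic: str, brokers: Sequence[Broker]) -> Broker:
    """Toy stand-in for topic lookup: pick a broker deterministically."""
    return brokers[sum(topic.encode()) % len(brokers)]


def connect_target(
    topic: str,
    brokers: Sequence[Broker],
    proxy_address: Optional[str] = None,
) -> Tuple[str, str]:
    """Return (address the client dials, address of the broker serving the topic)."""
    broker = owner_of(topic, brokers)
    if proxy_address is None:
        # Direct mode: the client must be able to reach the advertised address.
        return broker.advertised_address, broker.advertised_address
    # Proxy mode: the client only ever dials the proxy, which forwards
    # the connection to the owning broker (the extra network hop).
    return proxy_address, broker.advertised_address
```

With a proxy address set, the first element of the result is always the proxy, whichever broker owns the topic. Without it, the client dials the broker's advertised address itself.
----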
2018-01-04 16:33:57 UTC - Daniel Ferreira Jorge: so instead of having the 
clients connect directly to the brokers, they should connect to the proxies, 
right?
----
2018-01-04 16:34:52 UTC - Matteo Merli: correct, there’s the overhead of the 
extra network hop, but it can simplify deployments, especially in terms of 
network ACLs
----
2018-01-04 16:35:11 UTC - Matteo Merli: or to expose the service outside of a 
Kubernetes cluster
----
2018-01-04 16:36:56 UTC - Matteo Merli: because normally, brokers advertise 
their own address to clients. That could be either the `podIP` or the `nodeIP`, 
but it needs to be accessible from clients (in the absence of a proxy)
----
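The reachability requirement above can be sketched as a simple check: if any advertised broker address falls outside the networks a client can route to, that client needs to go through a proxy. This is an illustrative sketch only. `reachable` and `needs_proxy` are hypothetical helpers, and the IP ranges in the usage note are made-up examples.

```python
import ipaddress
from typing import Iterable


def reachable(address: str, client_networks: Iterable[str]) -> bool:
    """True if the address lies inside any network the client can route to."""
    ip = ipaddress.ip_address(address)
    return any(ip in ipaddress.ip_network(net) for net in client_networks)


def needs_proxy(advertised_ips: Iterable[str], client_networks: Iterable[str]) -> bool:
    """A proxy is needed if at least one advertised broker address is unreachable."""
    networks = list(client_networks)
    return not all(reachable(ip, networks) for ip in advertised_ips)
```

For example, `needs_proxy(["10.244.1.5"], ["192.168.0.0/16"])` is true for a client that can only route to `192.168.0.0/16` while the broker advertises a pod-internal address.
----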
2018-01-04 16:38:40 UTC - Daniel Ferreira Jorge: great, thanks
----
2018-01-05 01:27:25 UTC - Daniel Ferreira Jorge: I think it would be really 
nice if Pulsar implemented a multi-tiered storage system like Pravega. Pravega 
uses BookKeeper for fresher data and some sort of slower, more cost-efficient 
storage like HDFS, S3 or GCS for older data. From the perspective of clients 
everything is the same, but historical data comes from cold storage. This is 
just an idea for the future... 
<http://pravega.io/docs/pravega-concepts/#a-note-on-tiered-storage>
----
