Re: [kubernetes-users] Cluster DNS: bottleneck with ~1000 outbound connections per second

2018-03-20 Thread Evan Jones
The downside that I am aware of is that you don't get the Kubernetes DNS
magic, where names automatically point to your services. For the particular
use case where I ran into this, it worked perfectly!
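For anyone wanting to try the same workaround, a minimal sketch of a pod spec follows (the pod name and image are placeholders, not from this thread). `dnsPolicy: Default` makes the pod inherit the node's resolv.conf, bypassing kube-dns entirely:

```yaml
# Sketch only: a pod that uses the node's DNS configuration instead of
# kube-dns. Note that "Default" confusingly means "inherit the node's DNS";
# the actual default policy is ClusterFirst.
apiVersion: v1
kind: Pod
metadata:
  name: load-generator            # placeholder name
spec:
  dnsPolicy: Default              # bypass cluster DNS
  containers:
  - name: app
    image: example/load-generator:latest   # placeholder image
```

With this policy the pod cannot resolve cluster service names like `my-service.my-namespace.svc`, which is the "Kubernetes DNS magic" mentioned above.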

I was also going to attempt to add an alias so we could eventually migrate
to dnsPolicy: Host instead of the confusingly named Default, but it seemed
challenging enough that I never got around to it.

Evan


On Tue, Mar 20, 2018 at 1:55 AM, <m...@percy.io> wrote:

> On Thursday, October 5, 2017 at 1:29:28 PM UTC-7, Evan Jones wrote:
> > The sustained 1000 qps comes from an application making that many
> > outbound connections. I agree that the application is very inefficient and
> > shouldn't be doing a DNS lookup for every request it sends, but it's a
> > Python program that uses urllib2.urlopen, so it creates a new connection
> > each time. I suspect this isn't that unusual? This could be a server that
> > hits an external service for every user request, for example. Given the
> > activity on the GitHub issues I linked, it appears I'm not the only person
> > to have run into this.
> >
> > Thanks for the response though, since that answers my question: there are
> > currently no plans to change how this works. Hopefully if anyone else hits
> > this they might find this email so they can solve it faster than I did.
> >
> > Finally, the fact that dnsPolicy: Default is *not* the default is also
> > surprising. It should probably be called dnsPolicy: Host or something
> > instead.
> >
> > On Oct 5, 2017 13:54, "'Tim Hockin' via Kubernetes user discussion and
> > Q" <kubernet...@googlegroups.com> wrote:
> > We had a proposal to avoid conntrack for DNS, but no real movement on it.
> >
> > We have flags to adjust the conntrack table size.
> >
> > The kernel has params to tweak timeouts, which users can tweak.
> >
> > Sustained 1000 QPS DNS seems artificial.
> >
> > On Thu, Oct 5, 2017 at 10:47 AM, Evan Jones <evan@bluecore.com> wrote:
> >
> > > TL;DR: Kubernetes dnsPolicy: ClusterFirst can become a bottleneck with a
> > > high rate of outbound connections. It seems like the problem is filling
> > > the nf_conntrack table, causing client applications to fail to do DNS
> > > lookups. I resolved this problem by switching my application to
> > > dnsPolicy: Default, which provided much better performance for my
> > > application that does not need cluster DNS.
> > >
> > > It seems like this is probably a "known" problem (see issues below), but
> > > I can't tell: Is there a solution being worked on for this?
> > >
> > > Thanks!
> > >
> > > Details:
> > >
> > > We were running a load generator, and were surprised to find that the
> > > aggregate rate did not increase as we added more instances and nodes to
> > > our cluster (GKE 1.7.6-gke.1). Eventually the application started
> > > getting errors like "Name or service not known" at surprisingly low
> > > rates, like ~1000 requests/second. Switching the application to
> > > dnsPolicy: Default resolved the issue.
> > >
> > > I spent some time digging into this, and the problem is not the CPU
> > > utilization of kube-dns / dnsmasq itself. On my small cluster of ~10
> > > n1-standard-1 instances, I can get about 8 cached DNS queries/second. I
> > > *think* the issue is that when there are enough machines talking to this
> > > single DNS server, it fills the nf_conntrack table, causing packets to
> > > get dropped, which I believe ends up rate limiting the clients. dmesg on
> > > the node that is running kube-dns shows a constant stream of:
> > >
> > > [1124553.016331] nf_conntrack: table full, dropping packet
> > > [1124553.021680] nf_conntrack: table full, dropping packet
> > > [1124553.027024] nf_conntrack: table full, dropping packet
> > > [1124553.032807] nf_conntrack: table full, dropping packet
> > >
> > > It seems to me that this is a bottleneck for Kubernetes clusters, since
> > > by default all queries ar

[kubernetes-users] Re: Connecting to Cloud SQL from container app on kubernetes built via google container builder.

2018-01-04 Thread Evan Jones
I recommend using the Google Cloud SQL proxy container so you don't need to 
mess with IP whitelists. I don't quite get what you mean about "manual work 
to edit my pod deployment file": You just need to copy and paste this 
"sidecar" definition into your .yaml and leave it there. We have been using 
this with a variety of clients without problems:

https://cloud.google.com/sql/docs/mysql/connect-kubernetes-engine
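For reference, the sidecar addition is roughly the following sketch (the instance connection name, image tag, and secret name here are placeholders — the linked documentation has the authoritative version):

```yaml
# Hypothetical sidecar sketch; copy the real spec from the Google docs above.
# Added alongside your existing app container in the pod spec.
- name: cloudsql-proxy
  image: gcr.io/cloudsql-docker/gce-proxy:1.11     # tag is an assumption
  command: ["/cloud_sql_proxy",
            "-instances=my-project:us-central1:my-instance=tcp:3306",
            "-credential_file=/secrets/cloudsql/credentials.json"]
  volumeMounts:
  - name: cloudsql-instance-credentials            # secret with a service
    mountPath: /secrets/cloudsql                   # account key
    readOnly: true
```

The application container then connects to 127.0.0.1:3306 as if MySQL were local, and authorization is handled by the service account rather than IP whitelists.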



On Wednesday, January 3, 2018 at 1:49:50 PM UTC-5, ftd dba wrote:
>
> Hi ALL,
>
> I am using a google container build method to do a continuous deployment 
> of my java application which needs to connect to Cloud SQL instance.
>
> Everything is going fine except the connection to the database (Cloud SQL).
> I tried whitelisting the GKE cluster IP and the Java application's exposed
> IP in the Cloud SQL instance's authorization tab, but nothing works.
>
> When I connect to my container and try telnet  3306, it does not
> work; neither does my Java application when I send a request that should
> connect to the Cloud SQL database.
>
> I found the Cloud SQL Proxy, but that seems like a lot while doing a Google
> Container Builder deployment, as I don't want any manual work to edit my
> pod deployment file.
>
> Can someone please point me in the right direction? That would be of great 
> help.
>
> Thanks
> Khaleeq.
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"Kubernetes user discussion and Q" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to kubernetes-users+unsubscr...@googlegroups.com.
To post to this group, send email to kubernetes-users@googlegroups.com.
Visit this group at https://groups.google.com/group/kubernetes-users.
For more options, visit https://groups.google.com/d/optout.


Re: [kubernetes-users] Cluster DNS: bottleneck with ~1000 outbound connections per second

2017-10-05 Thread Evan Jones
My script *is* always looking up the same domain, and I believe it is
cached by dnsmasq. I *think* the limit is the kernel NAT connection
tracking, because each DNS query comes from a new ephemeral port, so it
ends up using up all NAT mappings on the node running kube-dns. This is why
dnsPolicy: Default fixes the problem: It uses the host's DNS configuration
which avoids the NAT connection limits.
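Since each lookup from a new ephemeral port consumes a conntrack entry, one process-level mitigation (a sketch of my own, not something proposed in this thread) is to cache resolver results inside the client, so a connection-per-request client at least stops issuing a DNS query per request:

```python
import functools
import socket

# Keep a reference to the real resolver before monkeypatching it.
_real_getaddrinfo = socket.getaddrinfo


@functools.lru_cache(maxsize=256)
def _cached(host, port):
    # Resolve once per (host, port); later calls hit the in-process cache.
    # Sketch only: it ignores family/type/proto and never expires entries,
    # so it disregards DNS TTLs -- fine for a load generator, not for
    # production code.
    return tuple(_real_getaddrinfo(host, port))


def cached_getaddrinfo(host, port, *args, **kwargs):
    return list(_cached(host, port))


# urllib/httplib resolve names via socket.getaddrinfo, so patching it
# makes every HTTP client in the process use the cache.
socket.getaddrinfo = cached_getaddrinfo
```

This does not fix the cluster-wide conntrack pressure, but it removes the per-request UDP query from each client process.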

Details including the Python code and configs to reproduce it on a brand
new GKE cluster are at the bottom of
https://github.com/kubernetes/kubernetes/issues/45976

I did a separate test, using a Go DNS query generator, which was able to do
8 DNS queries per second, so dnsmasq does not appear to be the limit.

Thanks!

Evan


On Thu, Oct 5, 2017 at 5:26 PM, Rodrigo Campos <rodr...@sdfg.com.ar> wrote:

> On Thu, Oct 05, 2017 at 04:29:21PM -0400, Evan Jones wrote:
> > The sustained 1000 qps comes from an application making that many
> outbound
> > connections. I agree that the application is very inefficient and
> shouldn't
> > be doing a DNS lookup for every request it sends, but it's a python
> program
> > that uses urllib2.urlopen so it creates a new connection each time. I
> > suspect this isn't that unusual? This could be a server that hits an
> > external service for every user request, for example. Given the activity
> on
> > the GitHub issues I linked, it appears I'm not the only person to have
> run
> > into this.
>
> But is it always on different domains? If not, it can probably be cached
> (as long as the TTL allows) by the DNS server and, even if your app makes
> so many requests, it should be answered quite fast.
>



[kubernetes-users] Cluster DNS: bottleneck with ~1000 outbound connections per second

2017-10-05 Thread Evan Jones
*TL;DR*: Kubernetes dnsPolicy: ClusterFirst can become a bottleneck with a 
high rate of outbound connections. It seems like the problem is filling the 
nf_conntrack table, causing client applications to fail to do DNS lookups. 
I resolved this problem by switching my application to dnsPolicy: Default, 
which provided much better performance for my application that does not 
need cluster DNS.

It seems like this is probably a "known" problem (see issues below), but I 
can't tell: Is there a solution being worked on for this? 

Thanks!


*Details*:

We were running a load generator, and were surprised to find that the 
aggregate rate did not increase as we added more instances and nodes to our 
cluster (GKE 1.7.6-gke.1). Eventually the application started getting 
errors like "Name or service not known" at surprisingly low rates, like 
~1000 requests/second. Switching the application to dnsPolicy: Default 
resolved the issue.

I spent some time digging into this, and the problem is not the CPU 
utilization of kube-dns / dnsmasq itself. On my small cluster of ~10 
n1-standard-1 instances, I can get about 8 cached DNS queries/second. I 
*think* the issue is that when there are enough machines talking to this 
single DNS server, it fills the nf_conntrack table, causing packets to get 
dropped, which I believe ends up rate limiting the clients. dmesg on the 
node that is running kube-dns shows a constant stream of:

[1124553.016331] nf_conntrack: table full, dropping packet
[1124553.021680] nf_conntrack: table full, dropping packet
[1124553.027024] nf_conntrack: table full, dropping packet
[1124553.032807] nf_conntrack: table full, dropping packet
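As a quick way to quantify how bad the dropping is, one could count the drop events in a dmesg capture — a trivial sketch of my own, not part of the original investigation:

```python
import re

# Match the exact kernel message seen in the dmesg output above.
DROP_RE = re.compile(r"nf_conntrack: table full, dropping packet")


def count_conntrack_drops(dmesg_output: str) -> int:
    """Count 'table full' drop events in a captured dmesg log."""
    return len(DROP_RE.findall(dmesg_output))
```

Running this over successive dmesg snapshots (or `dmesg -w` output) shows whether tuning the conntrack table size actually reduces the drop rate.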

It seems to me that this is a bottleneck for Kubernetes clusters, since by 
default all queries are directed to a small number of machines, which will 
then fill the connection tracking tables.

Is there a planned solution to this bottleneck? I was very surprised that 
*DNS* would be my bottleneck on a Kubernetes cluster, and at shockingly low 
rates.


*Related Github issues*

The following Github issues may be related to this problem. They all have a 
bunch of discussion but no clear resolution:

Run kube-dns on each node:
https://github.com/kubernetes/kubernetes/issues/45363
Run dnsmasq on each node (mentions conntrack):
https://github.com/kubernetes/kubernetes/issues/32749
kube-dns should be a DaemonSet / run on each node:
https://github.com/kubernetes/kubernetes/issues/26707
dnsmasq intermittent connection refused:
https://github.com/kubernetes/kubernetes/issues/45976
Intermittent DNS to external name:
https://github.com/kubernetes/kubernetes/issues/47142
kube-aws seems to already do something to run a local DNS resolver on each
node? https://github.com/kubernetes-incubator/kube-aws/pull/792/



Re: [kubernetes-users] Need some guidance/help: howto diagnose an oomkill

2017-09-27 Thread Evan Jones
It's been a while since I've dealt with this sort of issue, but there are 
various libraries that use "native" memory outside the Java heap. The -Xmx 
flag only limits the Java heap, so it isn't surprising that some processes 
may need a much higher container memory limit than the Java GC heap limit.

However, if the memory usage increases over time without limit, you might 
have some sort of native memory leak due to not closing things (e.g. direct 
ByteBuffers, GZIP streams, many others). You can watch the container memory 
usage of the pod over time, and if it seems to increase without bound this 
may be what is happening. The JVM's native memory tracking summary 
statistics can also be 
useful: 
https://docs.oracle.com/javase/8/docs/technotes/guides/troubleshoot/tooldescr007.html

I've had success tracking down native memory leaks using jemalloc's 
profiling: http://www.evanjones.ca/java-native-leak-bug.html

Hope this helps, good luck!

Evan


On Tuesday, September 26, 2017 at 8:50:30 PM UTC-4, John VanRyn wrote:
>
> helps some...  we made the kube pods have almost twice as much memory as 
> we are allocating the jvm.. and it seems to get us out of the woods  
> but it totally means we need to look into a jdk upgrade from 8.
>
> Thanks
>
> On Tue, Sep 26, 2017 at 7:50 AM, Davanum Srinivas wrote:
>
>> John,
>>
>> Does this help?
>> https://developers.redhat.com/blog/2017/03/14/java-inside-docker/
>>
>> There are some details here as well:
>> https://github.com/moby/moby/issues/15020
>>
>> Thanks,
>> Dims
>>
>> On Tue, Sep 26, 2017 at 7:37 AM, John VanRyn wrote:
>> > I have a kube cluster running on n1-highmem-16 (16 vCPUs, 104 GB 
>> memory),
>> > using the unmodified cos-stable-60-9592-84-0 image.
>> >
>> > I have a java app running under wildfly
>> >
>> > 
>> > apiVersion: extensions/v1beta1
>> > kind: Deployment
>> > metadata:
>> >   name: cas-unicas-ws
>> >   labels:
>> > name: cas-unicas-ws
>> > model: cas
>> > spec:
>> >   replicas: 1
>> >   template:
>> > metadata:
>> >   labels:
>> > name: cas-unicas-ws
>> > model: cas
>> > spec:
>> >   containers:
>> >   - name: cas-unicas-ws
>> > image: liaisonintl/cas-unicas-ws:__CAS_TAG__
>> > imagePullPolicy: Always
>> > ports:
>> >   - containerPort: 8080
>> > readinessProbe:
>> >   periodSeconds: 20
>> >   timeoutSeconds: 5
>> >   successThreshold: 1
>> >   failureThreshold: 3
>> >   httpGet:
>> > path: /services/getPdfServiceConfig
>> > port: 8080
>> > resources:
>> >   limits:
>> > memory: "1M"
>> >   requests:
>> > memory: "1M"
>> > env:
>> >   - name: JAVA_MEM
>> > value: -Xms9000m -Xmx9000m -XX:+UseG1GC
>> > -XX:+UseStringDeduplication -XX:+AlwaysPreTouch
>> >   - name: SPRING_PROFILE
>> > value: __SPRING_PROFILE__
>> > command: ["/bin/bash","-ic"]
>> > args:
>> >   - "set -xeo pipefail ; source /interpolate ; exec
>> > /opt/jboss/wildfly/bin/standalone.sh -b 0.0.0.0"
>> > 
>> >
>> > Here is the important parts of the dockerFile
>> >
>> > 
>> > FROM liaisonintl/docker-cas-base:master
>> > MAINTAINER John VanRyn 
>> >
>> > EXPOSE 8080
>> > EXPOSE 9990
>> >
>> > LABEL "GITHASH"="__GIT_HASH__"
>> > ENV WILDFLY_HOME /opt/jboss/wildfly
>> > ENV PATH $WILDFLY_HOME/bin:$PATH
>> >
>> > ADD *.war ${WILDFLY_HOME}/standalone/deployments/
>> >
>> > ## App config
>> > #
>> > ADD config/ ${WILDFLY_HOME}/appConfigTemplate/
>> >
>> > ## Temporary fix just to see things working
>> > ADD config/gen.unicas-ws.docker 
>> ${WILDFLY_HOME}/appConfig/unicas-ws.docker
>> >
>> > USER root
>> > ENV CAS_CONFIGS ${WILDFLY_HOME}/appConfig
>> > ENV SPRING_PROFILE QA
>> >
>> > ENV JAVA_OPTS="${JAVA_OPTS} ${JAVA_MEM} -XX:+UseG1GC
>> > -XX:+UseStringDeduplication -DCAS_CONFIGS=${CAS_CONFIGS}
>> > -Dspring.profiles.active=${SPRING_PROFILE}"
>> >
>> > RUN \
>> > mkdir -p $CAS_CONFIGS && \
>> > chmod 777 ${WILDFLY_HOME}/appConfig && \
>> > chmod 777 ${WILDFLY_HOME}/appConfigTemplate && \
>> > /opt/jboss/wildfly/bin/add-user.sh admin REDACTED --silent
>> >
>> > # Add REVISION FILE FOR GITHASH Reporting
>> > ADD config/REVISION REVISION
>> >
>> > CMD ["/opt/jboss/wildfly/bin/standalone.sh", "-b", "0.0.0.0"]
>> > 
>> >
>> > Log looks like this..
>> > 
>> > + exec /opt/jboss/wildfly/bin/standalone.sh -b 0.0.0.0
>> > JAVA_OPTS already set in environment; overriding default settings with
>> > values:   -XX:+UseG1GC -XX:+UseStringDeduplication
>> > -DCAS_CONFIGS=/opt/jboss/wildfly/appConfig -Dspring.profiles.active=QA
>> > -Xms9000m -Xmx9000m -XX:+UseG1GC -XX:+UseStringDeduplication
>> > -XX:+AlwaysPreTouch -XX:+UseG1GC -XX:+UseStringDeduplication
>> > 
>> =
>> >
>> >   JBoss Bootstrap 

Re: [kubernetes-users] Getting Google's Cloud SQL Proxy to work

2017-06-28 Thread Evan Jones
I believe we only grant the service account the "Cloud SQL Client" role.

I'm just a user of CloudSQL, but my understanding of the advantage of using
the proxy is that it lets you control access via Google Cloud roles.
Without it, you are trusting everything that uses some set of IP addresses
with permission to connect to MySQL. If you have tools around managing
Google Cloud permissions already, it's nice to control access to MySQL using
those same methods. The disadvantage, as you noted, is that you have to
configure this weird Cloud SQL thing.



On Wed, Jun 28, 2017 at 3:20 AM, Traiano Welcome <trai...@gmail.com> wrote:

>
>
> On Thursday, 22 June 2017 18:16:22 UTC+4, Evan Jones wrote:
>>
>> The Cloud SQL Proxy logs suggest to me that it may not be using the right
>> credentials? It is possible that it is trying to use the cluster's "default
>> service account"?
>>
>>
> It seems to have been a permissions issue. It started working when I gave
> the cloud-sql service account "Project Owner" rights. The problem with this
> is that it's too permissive though.
>
> Another question though: Do I really have to use cloudsql proxy to access
> it from GKE? It seems that whitelisting the IP addresses of the GKE cluster
> works just as well - In which case I wonder why the documentation
> recommends the overcomplicated route of setting up cloud sql proxy.
>
>
>
>> If you use "kubectl exec ... -ti /bin/sh" you should be able to examine
>> the contents of the credentials file that is being passed to ensure that
>> the contents match what you expect. You can also try to interactively start
>> /cloud_sql_proxy with different flags to see if you can get it to start and
>> print the "expected" log message.
>>
>> Good luck I hope that helps!
>>
>>
>>
>> On Thursday, June 22, 2017 at 9:46:52 AM UTC-4, Traiano Welcome wrote:
>>>
>>> Hi Evan
>>>
>>>
>>> On Thu, Jun 22, 2017 at 5:34 PM, Evan Jones <evan@triggermail.io>
>>> wrote:
>>>
>>>> I know nothing about wordpress, but for what it is worth, we are using
>>>> this Cloud SQL Proxy container with success. A few notes about the config
>>>> you posted:
>>>>
>>>>
>>>> * I'm assuming that where you have 
>>>> "-instances=[INSTANCE_CONNECTION_NAME]=tcp:[PORT]"
>>>> you've replaced this with your Cloud SQL instance name
>>>> (project:region:cloud_sql_name), and port with 3306, right?
>>>>
>>>
>>>
>>> I have this:
>>>
>>>   command: ["/cloud_sql_proxy", "--dir=/cloudsql",
>>> "-instances=lol-staging:europe-west2:lol-staging-001=tcp:3306",
>>> "-credential_file=/secrets/cloudsql/credentials.json"]
>>>
>>> I've redeployed and can see logs now, the wordpress container is failing
>>> because it's failing connection to SQL via the proxy:
>>>
>>>
>>>  MySQL Connection Error: (2006) MySQL server has gone away
>>>  Warning: mysqli::mysqli(): MySQL server has gone away in - on line
>>> 10
>>>  Warning: mysqli::mysqli(): Error while reading greeting packet.
>>> PID=167 in - on line 10
>>>  Warning: mysqli::mysqli(): (HY000/2006): MySQL server has gone away
>>> in - on line 10
>>>
>>> The cloud sql proxy container complains about an account not having
>>> correct permissions (however  I have given it editor permissions):
>>>
>>> kubectl logs pods/wordpress-2608214628-9bc9b cloudsql-proxy
>>>
>>>
>>> 2017/06/22 13:40:15 Throttling 
>>> refreshCfg(lol-staging:europe-west2:lol-staging-001):
>>> it was only called 24.005689746s ago
>>> 2017/06/22 13:40:15 couldn't connect to 
>>> "lol-staging:europe-west2:lol-staging-001":
>>> ensure that the account has access to 
>>> "lol-staging:europe-west2:lol-staging-001"
>>> (and make sure there's no typo in that name). Error during createEphemeral
>>> for lol-staging:europe-west2:lol-staging-001: googleapi: Error 403: The
>>> client is not authorized to make this request., notAuthorized
>>> 2017/06/22 13:40:18 New connection for "lol-staging:europe-west2:lol-
>>> staging-001"
>>>
>>>
>>> So I'm now wondering how to better validate which account is being
>>> referred to here a

Re: [kubernetes-users] Getting Google's Cloud SQL Proxy to work

2017-06-22 Thread Evan Jones
The Cloud SQL Proxy logs suggest to me that it may not be using the right 
credentials? It is possible that it is trying to use the cluster's "default 
service account"?

If you use "kubectl exec ... -ti /bin/sh" you should be able to examine the 
contents of the credentials file that is being passed to ensure that the 
contents match what you expect. You can also try to interactively start 
/cloud_sql_proxy with different flags to see if you can get it to start and 
print the "expected" log message.

Good luck I hope that helps!



On Thursday, June 22, 2017 at 9:46:52 AM UTC-4, Traiano Welcome wrote:
>
> Hi Evan
>
>
> On Thu, Jun 22, 2017 at 5:34 PM, Evan Jones <evan@triggermail.io 
> > wrote:
>
>> I know nothing about wordpress, but for what it is worth, we are using 
>> this Cloud SQL Proxy container with success. A few notes about the config 
>> you posted:
>>
>>
>> * I'm assuming that where you have 
>> "-instances=[INSTANCE_CONNECTION_NAME]=tcp:[PORT]" you've replaced this 
>> with your Cloud SQL instance name (project:region:cloud_sql_name), and port 
>> with 3306, right?
>>
>
>
> I have this:
>
>   command: ["/cloud_sql_proxy", "--dir=/cloudsql",
> 
> "-instances=lol-staging:europe-west2:lol-staging-001=tcp:3306",
> "-credential_file=/secrets/cloudsql/credentials.json"]
>
> I've redeployed and can see logs now, the wordpress container is failing 
> because it's failing connection to SQL via the proxy:
>
>
>  MySQL Connection Error: (2006) MySQL server has gone away
>  Warning: mysqli::mysqli(): MySQL server has gone away in - on line 10
>  Warning: mysqli::mysqli(): Error while reading greeting packet. 
> PID=167 in - on line 10
>  Warning: mysqli::mysqli(): (HY000/2006): MySQL server has gone away 
> in - on line 10
>
> The cloud sql proxy container complains about an account not having 
> correct permissions (however  I have given it editor permissions):
>
> kubectl logs pods/wordpress-2608214628-9bc9b cloudsql-proxy
>
>
> 2017/06/22 13:40:15 Throttling 
> refreshCfg(lol-staging:europe-west2:lol-staging-001): it was only called 
> 24.005689746s ago
> 2017/06/22 13:40:15 couldn't connect to 
> "lol-staging:europe-west2:lol-staging-001": ensure that the account has 
> access to "lol-staging:europe-west2:lol-staging-001" (and make sure there's 
> no typo in that name). Error during createEphemeral for 
> lol-staging:europe-west2:lol-staging-001: googleapi: Error 403: The client 
> is not authorized to make this request., notAuthorized
> 2017/06/22 13:40:18 New connection for 
> "lol-staging:europe-west2:lol-staging-001"
>
>
> So I'm now wondering how to better validate which account is being 
> referred to here and whether the permissions are correct.
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>  
>
>>
>> * If you run kubectl logs pod cloudsql-proxy you should see something 
>> like the following:
>>
>> 2017/06/21 20:45:06 Listening on (whatever) for (your instance name)
>> 2017/06/21 20:45:06 Ready for new connections
>>
>> * If the cloudsql-proxy container is running, you can run kubectl exec 
>> (pod) -c cloudsql-proxy -ti /bin/sh to get a shell and try to verify the 
>> cloudsql-proxy configuration, or to start a new instance and make sure it 
>> is configured correctly.
>>
>> * We are using this with local Unix socket connections in the /cloudsql 
>> directory, but localhost sockets should also work.
>>
>>
>> I'm surprised you don't see any logs from your wordpress container, but I 
>> can't help with that part.
>>
>> Good luck!
>>
>>
>> On Thursday, June 22, 2017 at 3:15:26 AM UTC-4, Traiano Welcome wrote:
>>>
>>> Hi Ahmet
>>>
>>> On Thursday, 22 June 2017 02:25:05 UTC+4, Ahmet Alp Balkan wrote:
>>>>
>>>> Can you run "kubectl logs -l app=wordpress"? I am assuming there will 
>>>> be some logs from the crashing mysql container.
>>>>
>>>>
>>> Thanks for your response. I get no output from running that command (I 
>>> suppose no logs are being generated). I tried both the command, and running 
>>> it in a loop, just in case:
>>>
>>>  for i in `seq 1 100`;do echo "checking $i " - $(kubectl logs -l app=web 
>>> );done
>>>
>>> No result.
>>>
>>> Just to be clear, wordpress container is in a pod, and the cloud sql 
>

Re: [kubernetes-users] Authentication on GKE

2017-06-09 Thread Evan Jones
On the cluster details page on https://console.cloud.google.com/kubernetes 
, if you have upgraded to 1.6 (I think?), you should see the following drop 
down to edit an existing cluster. I haven't yet attempted this personally:
[inline screenshot of the cluster edit drop-down not preserved in the archive]


On Friday, June 9, 2017 at 7:48:58 AM UTC-4, Matt Brown wrote:
>
> https://cloud.google.com/container-engine/docs/role-based-access-control 
> mentions that you can create a cluster with 
> the --no-enable-legacy-authorization flag.
>
>
> On Thursday, June 8, 2017 at 12:53:41 PM UTC-4, timo.r...@holidaycheck.com 
> wrote:
>>
>> Any update on this one? IAM seems to be supported by now; however, I 
>> can't find a way to disable legacy authentication. 
>>
>>
>> On Wednesday, August 10, 2016 at 11:47:31 PM UTC+2, CJ Cullen wrote: 
>> > There isn't currently a way to turn off the legacy authentication 
>> systems on GKE. 
>> > 
>> > 
>> > GKE will soon support IAM roles. This will allow users to have full 
>> access to GKE resources without being allowed to retrieve the legacy 
>> credentials for the cluster. The credentials will still work though. 
>> > 
>> > 
>> > 
>> > On Wed, Aug 10, 2016 at 1:30 PM Romain Vrignaud  
>> wrote: 
>> > 
>> > Hello, 
>> > 
>> > 
>> > I just succeed in using new IAM support on GKE and wanted to report 
>> back. 
>> > 
>> > 
>> > I had a problem when I switched to use_client_certificate False (there 
>> is a typo on documentation with the '='). 
>> > 
>> > 
>> > I had to make a `gcloud auth login` otherwise I had an error: 
>> > ``` 
>> > 
>> > Unable to connect to the server: oauth2: cannot fetch token: 400 Bad 
>> Request 
>> > Response: { 
>> >   "error" : "invalid_grant" 
>> > } ``` 
>> > 
>> > 
>> > Now that I'm able to use IAM integration, is there any way to disable 
>> legacy admin authentication ? 
>> > 
>> > 
>> > Regards, 
>> > 
>> > 
>> > 
>> > 
>> > 
>> > 
>> > -- 
>> > 
>> > You received this message because you are subscribed to the Google 
>> Groups "Kubernetes user discussion and Q" group. 
>> > 
>> > To unsubscribe from this group and stop receiving emails from it, send 
>> an email to kubernetes-use...@googlegroups.com. 
>> > 
>> > To post to this group, send email to kubernet...@googlegroups.com. 
>> > 
>> > Visit this group at https://groups.google.com/group/kubernetes-users. 
>> > 
>> > For more options, visit https://groups.google.com/d/optout. 
>>
>>



[kubernetes-users] Re: Google Compute Engine is unusable, repeated "Failed: Create VM"

2017-06-01 Thread Evan Jones
A friend of mine ran into something that sounds suspiciously similar to 
this. I don't recall the details about it, but he did tweet about 
it: https://twitter.com/nicksantos/status/86997848164864

I seem to recall after the free trial expired, they literally had to delete 
everything and recreate it from scratch. I'll point him at this thread and 
see if he can add details. Hope that helps,

Evan



On Thursday, June 1, 2017 at 5:26:40 AM UTC-4, Trifork AB wrote:
>
> Hi,
>
> We had a few projects running with our free trial, however our trial 
> expired and upgraded the account. Now all of our clusters became unusable 
> and we can see in our activity that our virtual machines are failing to be 
> created. The error message is
>
> 400: The resource 'projects//global/networks/default' is 
> not ready
>
> This error has appeared since the upgrade and about 6 times a minute. No 
> one recalls changing any network configurations before or after the free 
> trial expired. We have tried deleting and recreating clusters but often it 
> takes 30+ minutes to delete or just fails.
>
> Any suggestions how to fix this problem?
>
> Regards
>



[kubernetes-users] Re: Help me understand Kubernetes/Google LB options and architectures

2017-05-15 Thread Evan Jones
This won't directly help answer your questions, since I don't know the 
answers. However, I found this talk about Kubernetes networking to be 
extremely helpful to understand the basics. Whenever I'm running into 
weirdness I end up reviewing it: https://www.youtube.com/watch?v=y2bhV81MfKQ

Hopefully it will help with the basics. For example, I *think* one of the 
reasons an "external" load balancer may not work correctly is that it may 
not see the actual state of services inside the cluster. E.g. it doesn't 
know what nodes are running the actual pods. According to what I seem to 
recall from this talk: one of the ways services can work is that external 
processes connect to any node in the cluster, and that node forwards it to 
a pod that is actually running the service.

However, this may be completely inaccurate since I am far from an expert 
here, so I'm looking forward to seeing the real answers :)

Evan




On Sunday, May 14, 2017 at 1:28:45 PM UTC-4, Joe Auty wrote:
>
> Sorry for such a vague subject, but I think I need some help breaking 
> things down here.
>
> I think I understand how the Google layer 7 LBs work (this diagram helped 
> me: 
> https://storage.googleapis.com/static.ianlewis.org/prod/img/750/gcp-lb-objects2.png)
>  
> , I understand NGinx and HAProxy LBs independently, and I believe I also 
> understand the concepts of NodePort, Ingress controllers, services, etc.
>
> What I don't understand is why when I research things like socket.io 
> architectures in Kubernetes (for example), or features like IP 
> whitelisting, session affinity, etc. I see people putting NGinx or HAProxy 
> into their clusters. It is hard for me to keep straight all of the 
> different levels of load balancing and their controls:
>
>
>- Google backend services (i.e. Google LB)
>- Kubernetes service LB
>- HAProxy/NGinx
>
>
> The rationale for HAProxy and NGinx seems to involve compensating for 
> missing features and/or bugs (kube-proxy, etc.), and it is hard to keep 
> straight what is a reality today and what the best path forward is. 
>
> Google's LBs support session affinity, and there are session affinity 
> Kubernetes service settings, so for starters, when and why is NGinx or 
> HAProxy necessary, and are there outstanding issues with tracking source 
> IPs and setting/respecting proper headers?
>
> I'm happy to get into what sort of features I need if this will help steer 
> the discussion, but at this point I'm thinking maybe it is best to start at 
> a more basic level where you treat me like I'm 6 years old :)
>
> Thanks in advance!
>

-- 
You received this message because you are subscribed to the Google Groups 
"Kubernetes user discussion and Q" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to kubernetes-users+unsubscr...@googlegroups.com.
To post to this group, send email to kubernetes-users@googlegroups.com.
Visit this group at https://groups.google.com/group/kubernetes-users.
For more options, visit https://groups.google.com/d/optout.


Re: [kubernetes-users] Finding a way to get stable public IP for outbound connections

2017-05-03 Thread Evan Jones
Correct, but at least at the moment we aren't using auto-resizing, and I've
never seen nodes get removed without us manually taking some action (e.g.
upgrading Kubernetes releases or similar). Are there automated events that
can delete a VM and remove it, without us having done something? Certainly
I've observed machines rebooting, but that also preserves dedicated IPs. I
can live with having to take some manual configuration action periodically,
if we are changing something with our cluster, but I would like to know if
there is something I've overlooked. Thanks!


On Wed, May 3, 2017 at 12:20 PM, Paul Tiplady <p...@qwil.co> wrote:

> The public IP is not stable in GKE. You can manually assign a static IP to
> a GKE node, but then if the node goes away (e.g. your cluster was resized)
> the IP will be detached, and you'll have to manually reassign. I'd guess
> this is also true on an AWS managed equivalent like CoreOS's CloudFormation
> scripts.
>
> On Wed, May 3, 2017 at 8:52 AM, Evan Jones <evan.jo...@triggermail.io>
> wrote:
>
>> As Rodrigo described, we are using Container Engine. I haven't fully
>> tested this yet, but my plan is to assign "dedicated IPs" to a set of
>> nodes, probably in their own Node Pool as part of the cluster. Those are
>> the IPs used by outbound connections from pods running on those nodes, if I
>> recall correctly from a previous experiment. Then I will use Rodrigo's
>> taint suggestion to schedule Pods on those nodes.
>>
>> If for whatever reason we need to remove those nodes from that pool, or
>> delete and recreate them, we can move the dedicated IP and taints to new
>> nodes, and the jobs should end up in the right place again.
>>
>> In short: I'm pretty sure this is going to solve our problem.
>>
>> Thanks!
>>
>>
>



Re: [kubernetes-users] Finding a way to get stable public IP for outbound connections

2017-05-03 Thread Evan Jones
As Rodrigo described, we are using Container Engine. I haven't fully tested 
this yet, but my plan is to assign "dedicated IPs" to a set of nodes, 
probably in their own Node Pool as part of the cluster. Those are the IPs 
used by outbound connections from pods running on those nodes, if I recall 
correctly from a previous experiment. Then I will use Rodrigo's taint 
suggestion to schedule Pods on those nodes.
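
To sketch what I have in mind (label and image names are made up): after 
tainting and labeling the dedicated-IP nodes with something like 
`kubectl taint nodes <node> dedicated-ip=true:NoSchedule` and 
`kubectl label nodes <node> dedicated-ip=true`, only pods that opt in 
would land there:

```yaml
# Hypothetical pod spec: the toleration lets this pod schedule onto the
# tainted dedicated-IP nodes, and the nodeSelector ensures it schedules
# only there.
apiVersion: v1
kind: Pod
metadata:
  name: outbound-worker            # made-up name
spec:
  nodeSelector:
    dedicated-ip: "true"           # only nodes carrying this label
  tolerations:
    - key: "dedicated-ip"
      operator: "Equal"
      value: "true"
      effect: "NoSchedule"         # matches the taint on those nodes
  containers:
    - name: worker
      image: example/worker:latest # made-up image
```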

If for whatever reason we need to remove those nodes from that pool, or 
delete and recreate them, we can move the dedicated IP and taints to new 
nodes, and the jobs should end up in the right place again.

In short: I'm pretty sure this is going to solve our problem.

Thanks!



Re: [kubernetes-users] Finding a way to get stable public IP for outbound connections

2017-05-02 Thread Evan Jones
Thank you! I had forgotten about that feature, since we previously have not 
needed it. That will absolutely solve our problem, and be much better than 
needing an "exceptional" thing outside of Kubernetes.

You are correct about what we need: We have a small number of services 
where their outbound requests need to come from known IPs. This will solve 
the issue for us.

Thanks again.



Re: [kubernetes-users] Finding a way to get stable public IP for outbound connections

2017-05-01 Thread Evan Jones
It turns out I've just run into a requirement to have a stable outbound IP 
address as well. In looking into this, I think we will likely need some 
kind of proxy server running outside of Kubernetes. This will allow 
services to "opt in" to this special handling, rather than doing it for 
everything in the cluster. It seems like the simplest way to make this work.
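
As a sketch of the "opt in" idea (the proxy address is a made-up example): 
services that need a stable source IP would set the standard proxy 
environment variables, and everything else in the cluster stays untouched:

```yaml
# Hypothetical deployment fragment: route this service's outbound HTTP(S)
# traffic through a proxy running outside the cluster on a static IP.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: needs-stable-ip            # made-up name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: needs-stable-ip
  template:
    metadata:
      labels:
        app: needs-stable-ip
    spec:
      containers:
        - name: app
          image: example/app:latest            # made-up image
          env:
            - name: HTTP_PROXY                 # honored by most HTTP clients
              value: "http://203.0.113.10:3128"  # proxy's static IP (example)
            - name: HTTPS_PROXY
              value: "http://203.0.113.10:3128"
```

This only covers clients that respect the proxy environment variables; 
anything else would need its own opt-in mechanism.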

Honestly, this seems like enough of a rare case that I'm not sure 
Kubernetes should really support anything "natively" to solve this problem 
(at least not at the moment when there are more common things that still 
need work).

