The restart policy only governs "local" restarts, i.e. the kubelet restarting containers in place on the same node. If the node fails (e.g. a hardware failure), or the pod is evicted or preempted, or your process crashes, the Job controller will create a replacement pod. .spec.template.spec.restartPolicy does not prevent this.
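For illustration, here is a minimal Job manifest with the fields from the question set (the name and image are placeholders, not from the original post):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: hdfs-to-queue        # placeholder name
spec:
  parallelism: 1
  completions: 1
  template:
    spec:
      # "Never" means the kubelet will not restart the container in place,
      # but the Job controller can still create a replacement pod elsewhere.
      restartPolicy: Never
      containers:
      - name: worker
        image: example/worker:latest   # placeholder image
```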
Note that this is not a Kubernetes-specific problem; it is a general distributed-systems problem. For example: https://twitter.com/mathiasverraes/status/632260618599403520?lang=en

Some options are:
- Accept some duplicates, and get automatic restarts on hardware and software failures (use a Job).
- Accept lost data and downtime when hardware fails (use bare Pods -- possible but not recommended).
- Make your message-queue inserts idempotent somehow.
- Use transactions somehow, which requires some means to clean up hanging transactions.

On Thursday, July 6, 2017 at 9:38:48 PM UTC-7, sbsh...@gmail.com wrote:
>
> Hi
> As per the Kubernetes Job controller documentation [1]: *even if you specify
> .spec.parallelism = 1 and .spec.completions = 1 and
> .spec.template.spec.restartPolicy = “Never”, the same program may sometimes
> be started twice.*
>
> I want to know under what circumstances that can happen?
>
> My app is reading data from HDFS and writing it to a message queue. It
> terminates after processing all the files. I want to minimize the possibility
> of writing duplicate records.
>
> Thanks.
>
> References:
> 1) https://kubernetes.io/docs/concepts/workloads/controllers/jobs-run-to-completion/
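A sketch of the idempotent-insert option, using a toy in-memory queue (the IdempotentQueue class and message_id helper are hypothetical, not part of any real queue API): derive a deterministic ID from each record, so a restarted Job produces the same IDs and duplicates can be dropped at the consumer or broker.

```python
import hashlib

def message_id(record: str) -> str:
    # Deterministic ID derived from the record itself, so a retried
    # Job run produces the same ID for the same record.
    return hashlib.sha256(record.encode("utf-8")).hexdigest()

class IdempotentQueue:
    """Toy in-memory queue that drops messages whose ID was already seen.

    A real system would keep the seen-ID set in durable storage, e.g. a
    database unique-key constraint or the broker's own dedup feature.
    """
    def __init__(self):
        self._seen = set()
        self.messages = []

    def publish(self, record: str) -> bool:
        mid = message_id(record)
        if mid in self._seen:
            return False          # replay from a restarted Job; ignore
        self._seen.add(mid)
        self.messages.append(record)
        return True

q = IdempotentQueue()
assert q.publish("row-1") is True
assert q.publish("row-1") is False    # same record published twice
assert q.messages == ["row-1"]        # only one copy is kept
```

The key design choice is that the ID depends only on the record's content, never on the run that produced it; that is what makes a second Job execution harmless.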