I’m guessing the kafka service did not start properly? Can you verify the kafka service is usable?
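One quick first check is a plain TCP probe of each `host:port` in KAFKA_HOSTS, run from the host where the controller is scheduled. The sketch below is only an illustration under stated assumptions: the helper names `parse_hosts` and `reachable` are made up for this mail, and an open port only shows that DNS resolves and something is listening; it does not prove the broker can assign nodes or create topics.

```python
import socket


def parse_hosts(kafka_hosts):
    """Split a KAFKA_HOSTS-style string ("host:port,host:port") into (host, port) pairs."""
    pairs = []
    for entry in kafka_hosts.split(","):
        entry = entry.strip()
        if not entry:
            continue
        # Assumes every entry has the form host:port.
        host, _, port = entry.rpartition(":")
        pairs.append((host, int(port)))
    return pairs


def reachable(host, port, timeout=3.0):
    """Return True if the name resolves and a TCP connection can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures (socket.gaierror), refused connections, and timeouts.
        return False
```

For example, `[reachable(h, p) for h, p in parse_hosts("mykafka.marathon.mesos:9092")]` uses the value from your earlier mail; substitute whatever KAFKA_HOSTS is actually set to.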
> On Mar 5, 2018, at 3:00 PM, Kumar Subramanian <[email protected]> wrote:
>
> After I increased the timeout on the health checks I get the following:
> [2018-03-05T22:58:34.042Z] [ERROR] [??] [KafkaMessagingProvider] ensureTopic for completed0 failed due to java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.TimeoutException: Timed out waiting for a node assignment.
> [2018-03-05T22:58:34.044Z] [ERROR] [??] [Controller] failure during msgProvider.ensureTopic for topic completed0
> [INFO] [03/05/2018 22:58:34.070] [kamon-shutdown-hook-1] [CoordinatedShutdown(akka://kamon)] Starting coordinated shutdown from JVM shutdown hook
>
> On 3/5/18, 2:58 PM, "Kumar Subramanian" <[email protected]> wrote:
>
> Hi Carlos/Tyson/Chetan,
> Any suggestions?
>
> Note: I just tried to increase the timeout on the health checks … no luck.
>
> Thanks,
> Kumar.
>
> On 3/5/18, 12:28 PM, "Kumar Subramanian" <[email protected]> wrote:
>
> This is the output log:
> [2018-03-05T20:24:10.016Z] [INFO] Initializing Kamon...
> [INFO] [03/05/2018 20:24:10.301] [main] [StatsDExtension(akka://kamon)] Starting the Kamon(StatsD) extension
> [2018-03-05T20:24:10.352Z] [INFO] Slf4jLogger started
> [2018-03-05T20:24:10.706Z] [INFO] [??] [Config] environment set value for db.whisk.actions
> [2018-03-05T20:24:10.707Z] [INFO] [??] [Config] environment set value for db.protocol
> [2018-03-05T20:24:10.708Z] [INFO] [??] [Config] environment set value for limits.actions.sequence.maxLength
> [2018-03-05T20:24:10.708Z] [INFO] [??] [Config] environment set value for limits.triggers.fires.perMinute
> [2018-03-05T20:24:10.708Z] [INFO] [??] [Config] environment set value for akka.cluster.seed.nodes
> [2018-03-05T20:24:10.709Z] [INFO] [??] [Config] environment set value for limits.actions.invokes.concurrent
> [2018-03-05T20:24:10.709Z] [INFO] [??] [Config] environment set value for controller.instances
> [2018-03-05T20:24:10.710Z] [INFO] [??] [Config] environment set value for controller.localBookkeeping
> [2018-03-05T20:24:10.710Z] [INFO] [??] [Config] environment set value for whisk.version.date
> [2018-03-05T20:24:10.710Z] [INFO] [??] [Config] environment set value for db.port
> [2018-03-05T20:24:10.711Z] [INFO] [??] [Config] environment set value for whisk.version.buildno
> [2018-03-05T20:24:10.711Z] [INFO] [??] [Config] environment set value for db.whisk.activations
> [2018-03-05T20:24:10.711Z] [INFO] [??] [Config] environment set value for db.username
> [2018-03-05T20:24:10.712Z] [INFO] [??] [Config] environment set value for limits.actions.invokes.perMinute
> [2018-03-05T20:24:10.712Z] [INFO] [??] [Config] environment set value for db.whisk.auths
> [2018-03-05T20:24:10.712Z] [INFO] [??] [Config] environment set value for limits.actions.invokes.concurrentInSystem
> [2018-03-05T20:24:10.712Z] [INFO] [??] [Config] environment set value for runtimes.manifest
> [2018-03-05T20:24:10.713Z] [INFO] [??] [Config] environment set value for kafka.hosts
> [2018-03-05T20:24:10.713Z] [INFO] [??] [Config] environment set value for db.host
> [2018-03-05T20:24:10.713Z] [INFO] [??] [Config] environment set value for port
> [2018-03-05T20:24:10.714Z] [INFO] [??] [Config] environment set value for db.password
> [2018-03-05T20:24:10.714Z] [INFO] [??] [Config] environment set value for db.provider
> Received killTask for task whisk-controller.28d25d91-20b3-11e8-8754-3afdc003616b
> [INFO] [03/05/2018 20:25:38.974] [kamon-shutdown-hook-1] [CoordinatedShutdown(akka://kamon)] Starting coordinated shutdown from JVM shutdown hook
> [2018-03-05T20:25:38.975Z] [INFO] Starting coordinated shutdown from JVM shutdown hook
> [2018-03-05T20:25:38.982Z] [INFO] [??] [Controller] Shutting down Kamon with coordinated shutdown
>
> ERROR_LOG
> I0305 20:24:09.220711 1337 exec.cpp:162] Version: 1.2.3
> I0305 20:24:09.227144 1338 exec.cpp:237] Executor registered on agent 995020e0-5129-44a3-8cf4-65900838b3af-S7
> W0305 20:24:09.227144 1341 logging.cpp:91] RAW: Received signal SIGTERM from process 10243 of user 0; exiting
>
> On 3/5/18, 12:24 PM, "Kumar Subramanian" <[email protected]> wrote:
>
> I get the following error while in the deploying state (it then kills the
> task automatically, re-installs, and goes on…)
> Task was killed since health check failed. Reason: ConnectionAttemptFailedException: Connection attempt to <whisk_controller_ip>:8888 failed
>
> On 3/5/18, 12:21 PM, "Kumar Subramanian" <[email protected]> wrote:
>
> Gave the value as mykafka.marathon.mesos:9092… it seems to be going forward
> now with the deployment… hope it succeeds
>
> On 3/5/18, 12:13 PM, "Kumar Subramanian" <[email protected]> wrote:
>
> Hi Chetan,
> I resolved all the settings (not sure about some of those values set)… now
> I’m getting the following error:
> I0305 20:07:00.341622 21216 exec.cpp:162] Version: 1.2.3
> I0305 20:07:00.347102 21224 exec.cpp:237] Executor registered on agent 995020e0-5129-44a3-8cf4-65900838b3af-S4
> Exception in thread "main" org.apache.kafka.common.KafkaException: Failed create new KafkaAdminClient
>     at org.apache.kafka.clients.admin.KafkaAdminClient.createInternal(KafkaAdminClient.java:322)
>     at org.apache.kafka.clients.admin.AdminClient.create(AdminClient.java:50)
>     at whisk.connector.kafka.KafkaMessagingProvider$.ensureTopic(KafkaMessagingProvider.scala:70)
>     at whisk.core.controller.Controller$.main(Controller.scala:217)
>     at whisk.core.controller.Controller.main(Controller.scala)
> Caused by: org.apache.kafka.common.config.ConfigException: No resolvable bootstrap urls given in bootstrap.servers
>     at org.apache.kafka.clients.ClientUtils.parseAndValidateAddresses(ClientUtils.java:64)
>     at org.apache.kafka.clients.admin.KafkaAdminClient.<init>(KafkaAdminClient.java:345)
>     at org.apache.kafka.clients.admin.KafkaAdminClient.createInternal(KafkaAdminClient.java:315)
>     ... 4 more
>
> <<WHISK_CONTROLLER ENVIRONMENT SETTINGS>>
> The config environment values are:
> "KAFKA_HOST": "broker-0.kafka.mesos"
> "KAFKA_HOSTS": "mykafka.docker:9092"
>
> Here is the Kafka service in my dcos services
> <<KAFKA_SERVICE>>
> ID: mykafka.654fa8cd-1e56-11e8-8754-3afdc003616b
> Name: mykafka
> Address: <<internal_ip>>
> Status: Running
>
> On 3/5/18, 12:01 PM, "Kumar Subramanian" <[email protected]> wrote:
>
> Also, what is DB_WHISK_ACTIVATIONS=local_activations? What does it mean by
> “local” activations? This setting is also needed; what should the value be
> in my dcos env? Should it still be local_activations?
>
> On 3/5/18, 11:48 AM, "Kumar Subramanian" <[email protected]> wrote:
>
> Thanks Chetan, I added that; now I’m getting the following:
>
> [2018-03-05T19:43:45.025Z] [ERROR] [??] [Config] required property kafka.hosts still not set
>
> What is kafka.hosts? Is that the Kafka host name (as in mykafka… that’s the
> Kafka name I gave when I installed Kafka)? Should it be just the name or the
> FQDN?
>
> On 3/5/18, 11:32 AM, "Chetan Mehrotra" <[email protected]> wrote:
>
>> required property controller.instances still not set
>
> Looks like some configs are missing. You would need to set this and a few
> more props. The configs are generally managed via Ansible for the default
> setup. For dcos you may need to configure them explicitly. You can see
> various configs and their values as an example at [1]
> (controller.instances becomes CONTROLLER_INSTANCES). They would need to be
> tweaked as per your setup though.
>
> Chetan Mehrotra
> [1] https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Furldefense.proofpoint.com%2Fv2%2Furl%3Fu%3Dhttps-3A__github.com_apache_incubator-2Dopenwhisk-2Ddevtools_blob_master_docker-2Dcompose_docker-2Dwhisk-2Dcontroller.env%26d%3DDwIFaQ%26c%3DuilaK90D4TOVoH58JNXRgQ%26r%3DF5C8fYlpBJ270qrdwLq2iRQrPd1CLap8zItxk8laWpo%26m%3DK8Rzl5BrVqOWFx5b1fYG8EjdY-6JVyi-x_eMD0thaKY%26s%3DVfbixlUG4tFbgBNxiMr-KCNJOakmlGuNcnXLlbQMrVY%26e&data=02%7C01%7Ctnorris%40adobe.com%7Cb6c350b230924b71a74608d582ecfe6d%7Cfa7b1b5a7b34438794aed2c178decee1%7C0%7C0%7C636558876747512963&sdata=0Ay%2FfOtiAyIPC6qqjTIUfoAnXT8%2BBQ11zTQKOgLtLM4%3D&reserved=0=
>
> On Tue, Mar 6, 2018 at 12:09 AM, Kumar Subramanian <[email protected]> wrote:
>> Hi,
>> I was able to successfully do the following:
>> 1) Build the Controller image
>> 2) Push the image
>>
>> However, when I installed the controller package it gives me the following
>> error in the output; then it shuts down and retries the installation (goes
>> on…)
>>
>> Registered docker executor on 10.0.6.13
>> Starting task whisk-controller.a46ae612-20a2-11e8-8754-3afdc003616b
>> [2018-03-05T18:25:55.853Z] [INFO] Initializing Kamon...
>> [INFO] [03/05/2018 18:25:56.151] [main] [StatsDExtension(akka://kamon)] Starting the Kamon(StatsD) extension
>> [2018-03-05T18:25:56.193Z] [INFO] Slf4jLogger started
>> [2018-03-05T18:25:56.552Z] [INFO] [??] [Config] environment set value for db.whisk.actions
>> [2018-03-05T18:25:56.554Z] [INFO] [??] [Config] environment set value for db.protocol
>> [2018-03-05T18:25:56.554Z] [INFO] [??] [Config] environment set value for limits.triggers.fires.perMinute
>> [2018-03-05T18:25:56.554Z] [INFO] [??] [Config] environment set value for limits.actions.invokes.concurrent
>> [2018-03-05T18:25:56.555Z] [INFO] [??] [Config] environment set value for whisk.version.date
>> [2018-03-05T18:25:56.555Z] [INFO] [??] [Config] environment set value for db.port
>> [2018-03-05T18:25:56.555Z] [INFO] [??] [Config] environment set value for whisk.version.buildno
>> [2018-03-05T18:25:56.556Z] [INFO] [??] [Config] environment set value for db.username
>> [2018-03-05T18:25:56.556Z] [INFO] [??] [Config] environment set value for limits.actions.invokes.perMinute
>> [2018-03-05T18:25:56.556Z] [INFO] [??] [Config] environment set value for db.whisk.auths
>> [2018-03-05T18:25:56.556Z] [INFO] [??] [Config] environment set value for limits.actions.invokes.concurrentInSystem
>> [2018-03-05T18:25:56.557Z] [INFO] [??] [Config] environment set value for runtimes.manifest
>> [2018-03-05T18:25:56.557Z] [INFO] [??] [Config] environment set value for db.host
>> [2018-03-05T18:25:56.558Z] [INFO] [??] [Config] environment set value for port
>> [2018-03-05T18:25:56.558Z] [INFO] [??] [Config] environment set value for db.password
>> [2018-03-05T18:25:56.558Z] [INFO] [??] [Config] environment set value for db.provider
>> [2018-03-05T18:25:56.561Z] [ERROR] [??] [Config] required property controller.instances still not set
>> [2018-03-05T18:25:56.561Z] [ERROR] [??] [Controller] Bad configuration, cannot start.
>>
>> Any suggestions?
>>
>> On 3/2/18, 4:05 PM, "Kumar Subramanian" <[email protected]> wrote:
>>
>> This is the error I get when I did docker build (for Controller):
>>
>> Step 1/7 : FROM scala
>> repository scala not found: does not exist or no pull access
>>
>> Any suggestions?
>>
>> On 3/2/18, 3:34 PM, "Carlos Santana" <[email protected]> wrote:
>>
>> No, it’s still in PR.
>>
>> Just pull the changes locally and build.
>>
>> - Carlos Santana
>> @csantanapr
>>
>>> On Mar 2, 2018, at 6:20 PM, Kumar Subramanian <[email protected]> wrote:
>>>
>>> Is that change at
>>> https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Furldefense.proofpoint.com%2Fv2%2Furl%3Fu%3Dhttps-3A__github.com_chetanmeh_incubator-2Dopenwhisk_blob_fa302249f4f9b4e6b3084956f18bda987674f46f_core_controller_Dockerfile%26d%3DDwIFaQ%26c%3DuilaK90D4TOVoH58JNXRgQ%26r%3DF5C8fYlpBJ270qrdwLq2iRQrPd1CLap8zItxk8laWpo%26m%3DnLubLAFijdQ4pOPqIydDI_wguMgbdmdmoMXcP7g-m8k%26s%3D0_zv4jTDip5Uk9oBB5-6Ka_Iug3KYWIhy7qzSDryqM0%26e&data=02%7C01%7Ctnorris%40adobe.com%7Cb6c350b230924b71a74608d582ecfe6d%7Cfa7b1b5a7b34438794aed2c178decee1%7C0%7C0%7C636558876747512963&sdata=TSMtdsDvFmEZNOwcKHPk%2BmIMse4pgHnt3nlF4uL%2BqyY%3D&reserved=0=
>>> not merged? I don’t see the change in master:
>>> https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Furldefense.proofpoint.com%2Fv2%2Furl%3Fu%3Dhttps-3A__github.com_apache_incubator-2Dopenwhisk_blob_master_core_controller_Dockerfile%26d%3DDwIFaQ%26c%3DuilaK90D4TOVoH58JNXRgQ%26r%3DF5C8fYlpBJ270qrdwLq2iRQrPd1CLap8zItxk8laWpo%26m%3DnLubLAFijdQ4pOPqIydDI_wguMgbdmdmoMXcP7g-m8k%26s%3DTdB-IjihM3-0dqBF029dSMkoWbZHBAUXXqeDjfQMlhg%26e&data=02%7C01%7Ctnorris%40adobe.com%7Cb6c350b230924b71a74608d582ecfe6d%7Cfa7b1b5a7b34438794aed2c178decee1%7C0%7C0%7C636558876747512963&sdata=%2BYiogEyVeaOaF8brNypAf7NZSfhoK77Q5yFG0261O3k%3D&reserved=0=
>>>
>>> Thanks,
>>> Kumar.
>>>
>>> On 3/2/18, 2:59 PM, "Kumar Subramanian" <[email protected]> wrote:
>>>
>>> Ok, I will try to build the controller image and see. Will keep you
>>> posted.
>>>
>>> On 3/2/18, 2:39 PM, "Tyson Norris" <[email protected]> wrote:
>>>
>>> Thanks Carlos - I think you’re right.
>>>
>>> Kumar you can either build the controller image with that PR, or else
>>> you should be able to manually set the docker cmd on the dcos service
>>> for controller, e.g.
>>> /bin/sh -c \"exec /init.sh 0 >> /dev/stdout\"
>>>
>>> I think you will have a similar issue with invoker, mostly because this
>>> universe is far out of date from current openwhisk images.
>>>
>>> For invoker, can you use the docker cmd as
>>> /bin/sh -c \"exec /init.sh --name $LIBPROCESS_IP >> /dev/stdout\"
>>>
>>> Additionally, the env vars (both invoker and controller) have changed
>>> substantially, so I would expect a few hiccups there as well.
>>>
>>> We are working on getting updates to the universe so that our
>>> internal deployment details are not included, and it will actually work
>>> with recent openwhisk images (and stay working) but haven’t gotten
>>> everything set just yet.
>>>
>>> Hope that helps
>>> Tyson
>>>
>>>>> On Mar 2, 2018, at 2:21 PM, Carlos Santana <[email protected]> wrote:
>>>>
>>>> Maybe for the init.sh this PR is related
>>>> https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Furldefense.proofpoint.com%2Fv2%2Furl%3Fu%3Dhttps-3A__na01.safelinks.protection.outlook.com_-3Furl-3Dhttps-253A-252F-252Fgithub.com-252Fapache-252Fincubator-2Dopenwhisk-252Fpull-252F3374-252Ffiles-2523diff-2D8f445fbdf6253dd176975ff6c629def4R18-26data-3D02-257C01-257Ctnorris-2540adobe.com-257C605cc50c7a484ccb833708d5808bfe64-257Cfa7b1b5a7b34438794aed2c178decee1-257C0-257C0-257C636556261107863657-26sdata-3Dtk8Des10hubLs7FNSgzDlsk1ibxDTIqSlXti-252FcAUyz0-253D-26reserved-3D0%26d%3DDwIGaQ%26c%3DuilaK90D4TOVoH58JNXRgQ%26r%3DF5C8fYlpBJ270qrdwLq2iRQrPd1CLap8zItxk8laWpo%26m%3DLUthdew4Dt10vSAZSYRBbREqgwWk2PUWc4KDBJtt0uU%26s%3D3UqWljTQjItMnzWhPrsfD1AF2IX6abtc9dYRfrxb2_M%26e&data=02%7C01%7Ctnorris%40adobe.com%7Cb6c350b230924b71a74608d582ecfe6d%7Cfa7b1b5a7b34438794aed2c178decee1%7C0%7C0%7C636558876747512963&sdata=O8hVxmtrLr5SL%2B2hnc25H0efls7esb5FER8HxWxR6RU%3D&reserved=0=
>>>>
>>>> On Fri, Mar 2, 2018 at 5:09 PM Tyson Norris <[email protected]> wrote:
>>>>
>>>>> Check your marathon/dcos service config to verify what image is used, and
>>>>> that you have the latest image pulled?
>>>>>
>>>>> The default should be openwhisk/controller - but I see that universe
>>>>> package marathon config is not set to force pull, so if you are using that
>>>>> image, make sure you have pulled the latest manually (or change the config
>>>>> to force pull in dcos/marathon ui).
>>>>>
>>>>> Tyson
>>>>>
>>>>>> On Mar 2, 2018, at 12:45 PM, Kumar Subramanian <[email protected]> wrote:
>>>>>>
>>>>>> Hi,
>>>>>> I have installed the following in DCOS successfully:
>>>>>> 1. Apigateway
>>>>>> 2. Exhibitor-dcos
>>>>>> 3. Kafka (name given is mykafka at the time of installation)
>>>>>> 4. Whisk-couchdb
>>>>>> 5. Consul
>>>>>> 6. Registrator
>>>>>>
>>>>>> Error when deploying Whisk-Controller in DCOS:
>>>>>> When I tried to deploy whisk-controller with default settings, the
>>>>>> service fails to deploy (it just kills and redeploys the service
>>>>>> continuously on its own when deploying)
>>>>>>
>>>>>> Here is the content in the Error and Output
>>>>>>
>>>>>> STDERR:
>>>>>> (AT BEGINNING OF FILE)
>>>>>> I0302 20:38:35.176177 19822 exec.cpp:162] Version: 1.2.3
>>>>>> I0302 20:38:35.180703 19824 exec.cpp:237] Executor registered on agent 995020e0-5129-44a3-8cf4-65900838b3af-S6
>>>>>> docker: Error response from daemon: Container command 'init.sh' not found or does not exist..
>>>>>>
>>>>>> OUTPUT:
>>>>>> (AT BEGINNING OF FILE)
>>>>>> Registered docker executor on 10.0.6.55
>>>>>> Starting task whisk-controller.adb62c44-1e59-11e8-8754-3afdc003616b
>>>>>>
>>>>>> Can you please provide your valuable inputs on how to get
>>>>>> whisk-controller deployed in dcos?
>>>>>>
>>>>>> Thanks,
>>>>>> Kumar.
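
For reference, the naming convention Chetan mentions in the thread (controller.instances becomes CONTROLLER_INSTANCES) can be sketched as a small helper for checking a dcos service environment before deploying. This is an illustration only: `env_name` and `missing_properties` are hypothetical names, and the rule (dots to underscores, then upper-case) is inferred from the pairs that appear in this thread (kafka.hosts / KAFKA_HOSTS, db.whisk.activations / DB_WHISK_ACTIVATIONS), not taken from the OpenWhisk source.

```python
def env_name(prop):
    """Map a dotted property name to the env var name, e.g. kafka.hosts -> KAFKA_HOSTS.

    Assumption: dots become underscores and the result is upper-cased,
    as suggested by the examples quoted in the thread.
    """
    return prop.replace(".", "_").upper()


def missing_properties(required, env):
    """Return the required properties whose env var is absent or empty in `env`."""
    return [prop for prop in required if not env.get(env_name(prop))]
```

Calling `missing_properties(["controller.instances", "kafka.hosts"], service_env)`, where `service_env` is the dict of env vars from the marathon/dcos service definition, would list the properties the controller is likely to report as "still not set".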
