Hiya folks! I’ve spent the past few weeks prototyping a new data cluster with Mesos, Kafka, and Flume delivering data to HDFS, which we plan to interact with via Spark. In the prototype environment, I had a fairly high volume of test data flowing for some weeks with few if any major issues, apart from learning how to tune Kafka and Flume.
I’m launching Kafka with the github.com/mesos/kafka project, and Flume is run via Marathon. Yesterday morning, I came in and my Flume jobs had disappeared from the task list in Mesos, though the actual processes were still running when I searched the 'ps' output across the cluster. Later in the day, the same thing happened to my Kafka brokers. In some cases, the only way I’ve found to recover is to shut everything down and clear the ZooKeeper data, which would be fairly drastic in production, particularly if many tasks / frameworks were fine and only one or two had disappeared.

I’d appreciate any help sorting through this. I’m using the latest Mesos and CDH5, installed via community Chef cookbooks.

TIA,
Justin Alan Ryan
Sr. Systems Engineer
ZipRealty

