sergioferragut commented on code in PR #14742:
URL: https://github.com/apache/druid/pull/14742#discussion_r1289179395


##########
examples/quickstart/jupyter-notebooks/notebooks/02-ingestion/01-streaming-from-kafka.ipynb:
##########
@@ -126,81 +105,70 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "from kafka import KafkaProducer\n",
-    "from kafka import KafkaConsumer\n",
+    "# Use kafka_host variable when connecting to kafka \n",
+    "if 'KAFKA_HOST' not in os.environ.keys():\n",
+    "   kafka_host=f\"http://localhost:9092\"\n";,
+    "else:\n",
+    "    kafka_host=f\"{os.environ['KAFKA_HOST']}:9092\"\n",
     "\n",
-    "# Kafka runs on kafka:9092 in multi-container tutorial application\n",
-    "producer = KafkaProducer(bootstrap_servers='kafka:9092')\n",
+    "# this is the kafka topic we will be working with:\n",
     "topic_name = \"social_media\""
    ]
   },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "Create the `social_media` topic and send a sample event. The `send()` 
command returns a metadata descriptor for the record."
-   ]
-  },
   {
    "cell_type": "code",
    "execution_count": null,
    "metadata": {},
    "outputs": [],
    "source": [
-    "event = {\n",
-    "    \"__time\": \"2023-01-03T16:40:21.501\",\n",
-    "    \"username\": \"willow\",\n",
-    "    \"post_title\": \"This title is required\",\n",
-    "    \"views\": 15284,\n",
-    "    \"upvotes\": 124,\n",
-    "    \"comments\": 21,\n",
-    "    \"edited\": \"True\"\n",
-    "}\n",
     "\n",
-    "producer.send(topic_name, json.dumps(event).encode('utf-8'))"
+    "import json\n",
+    "\n",
+    "# shortcuts for display and sql api's\n",
+    "display = druid.display\n",
+    "sql_client = druid.sql\n",
+    "\n",
+    "\n",
+    "# client for Data Generator API\n",
+    "datagen = druidapi.rest.DruidRestClient(\"http://datagen:9999\";)\n",
+    "\n",
+    "# client for Druid API\n",
+    "rest_client = druid.rest\n"
    ]
   },
   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "To verify that the Kafka topic stored the event, create a consumer client 
to read records from the Kafka cluster, and get the next (only) message:"
+    "## Publish generated data directly to Kafka topic"
    ]
   },
   {
-   "cell_type": "code",
-   "execution_count": null,
+   "cell_type": "markdown",
    "metadata": {},
-   "outputs": [],
    "source": [
-    "consumer = KafkaConsumer(topic_name, bootstrap_servers=['kafka:9092'], 
auto_offset_reset='earliest',\n",
-    "     enable_auto_commit=True)\n",
-    "\n",
-    "print(next(consumer).value.decode('utf-8'))"
+    "We are using the datagen that is part of this cluster to generate a 
stream of messages. The data generator will create the topic if it does not 
exist. This data generator configuration output messages to topic 
`social_media`."

Review Comment:
   done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to