weishiuntsai commented on a change in pull request #9766:
URL: https://github.com/apache/druid/pull/9766#discussion_r414965988
##########
File path: docs/tutorials/index.md
##########
@@ -99,96 +91,173 @@ $ ./bin/start-micro-quickstart
[Fri May 3 11:40:50 2019] Running command[middleManager], logging to[/apache-druid-{{DRUIDVERSION}}/var/sv/middleManager.log]: bin/run-druid middleManager conf/druid/single-server/micro-quickstart
```
-All persistent state such as the cluster metadata store and segments for the services will be kept in the `var` directory under the apache-druid-{{DRUIDVERSION}} package root. Logs for the services are located at `var/sv`.
+All persistent state, such as the cluster metadata store and segments for the services, is kept in the `var` directory under the Druid root directory, apache-druid-{{DRUIDVERSION}}. Each service writes to a log file under `var/sv`, as noted in the startup script output above.
+
+At any time, you can revert Druid to its original, post-installation state by deleting the entire `var` directory. You may want to do this, for example, between Druid tutorials or after experimentation, to start with a fresh instance.
+
+To stop Druid at any time, use CTRL-C in the terminal. This exits the `bin/start-micro-quickstart` script and terminates all Druid processes.
+
-Later on, if you'd like to stop the services, CTRL-C to exit the `bin/start-micro-quickstart` script, which will terminate the Druid processes.
+## Step 3. Open the Druid console
-Once the cluster has started, you can navigate to [http://localhost:8888](http://localhost:8888).
-The [Druid router process](../design/router.md), which serves the [Druid console](../operations/druid-console.md), resides at this address.
+After the Druid services finish startup, open the [Druid console](../operations/druid-console.md) at [http://localhost:8888](http://localhost:8888).

-It takes a few seconds for all the Druid processes to fully start up. If you open the console immediately after starting the services, you may see some errors that you can safely ignore.
-
-
-## Loading data
-
-### Tutorial dataset
-
-For the following data loading tutorials, we have included a sample data file containing Wikipedia page edit events that occurred on 2015-09-12.
-
-This sample data is located at `quickstart/tutorial/wikiticker-2015-09-12-sampled.json.gz` from the Druid package root.
-The page edit events are stored as JSON objects in a text file.
-
-The sample data has the following columns, and an example event is shown below:
-
- * added
- * channel
- * cityName
- * comment
- * countryIsoCode
- * countryName
- * deleted
- * delta
- * isAnonymous
- * isMinor
- * isNew
- * isRobot
- * isUnpatrolled
- * metroCode
- * namespace
- * page
- * regionIsoCode
- * regionName
- * user
-
-```json
-{
- "timestamp":"2015-09-12T20:03:45.018Z",
- "channel":"#en.wikipedia",
- "namespace":"Main",
- "page":"Spider-Man's powers and equipment",
- "user":"foobar",
- "comment":"/* Artificial web-shooters */",
- "cityName":"New York",
- "regionName":"New York",
- "regionIsoCode":"NY",
- "countryName":"United States",
- "countryIsoCode":"US",
- "isAnonymous":false,
- "isNew":false,
- "isMinor":false,
- "isRobot":false,
- "isUnpatrolled":false,
- "added":99,
- "delta":99,
- "deleted":0,
-}
-```
+It may take a few seconds for all Druid services to finish starting, including the [Druid router](../design/router.md), which serves the console. If you attempt to open the Druid console before startup is complete, you may see errors in the browser. Wait a few moments and try again.
-### Data loading tutorials
+## Step 4. Load data
-The following tutorials demonstrate various methods of loading data into Druid, including both batch and streaming use cases.
-All tutorials assume that you are using the `micro-quickstart` single-machine configuration mentioned above.
-- [Loading a file](./tutorial-batch.md) - this tutorial demonstrates how to perform a batch file load, using Druid's native batch ingestion.
-- [Loading stream data from Apache Kafka](./tutorial-kafka.md) - this tutorial demonstrates how to load streaming data from a Kafka topic.
-- [Loading a file using Apache Hadoop](./tutorial-batch-hadoop.md) - this tutorial demonstrates how to perform a batch file load, using a remote Hadoop cluster.
-- [Writing your own ingestion spec](./tutorial-ingestion-spec.md) - this tutorial demonstrates how to write a new ingestion spec and use it to load data.
+Ingestion specs define the schema of the data Druid reads and stores. You can write ingestion specs by hand or using the _data loader_,
Review comment:
We might want to mention that the tutorial here is to do the batch file load using Druid's native batch ingestion. The original page separates the data loading into 4 different links, "Loading a file", "Loading stream data from Apache Kafka", "Loading a file using Apache Hadoop", and "Writing your own ingestion spec", with an explanation after each link. That makes it clear what the user is getting when a link is clicked. But once we move part of the content from "Loading a file" here, it becomes less clear that we will be doing a batch file load with native batch ingestion here.
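
To illustrate the reviewer's point, the method being taught here is a native batch (`index_parallel`) task. A spec for the bundled sample file would look roughly like the sketch below; the dimension list, granularities, and datasource name are illustrative assumptions, not necessarily what the tutorial text ends up using:

```json
{
  "type": "index_parallel",
  "spec": {
    "ioConfig": {
      "type": "index_parallel",
      "inputSource": {
        "type": "local",
        "baseDir": "quickstart/tutorial/",
        "filter": "wikiticker-2015-09-12-sampled.json.gz"
      },
      "inputFormat": { "type": "json" }
    },
    "dataSchema": {
      "dataSource": "wikipedia",
      "timestampSpec": { "column": "timestamp", "format": "iso" },
      "dimensionsSpec": { "dimensions": ["channel", "page", "user", "countryName"] },
      "granularitySpec": { "segmentGranularity": "day", "queryGranularity": "none", "rollup": false }
    },
    "tuningConfig": { "type": "index_parallel" }
  }
}
```

Naming the method explicitly ("this step performs a batch file load using Druid's native batch ingestion") next to a spec like this would preserve the clarity the four separate links used to provide.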
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]