fjy commented on a change in pull request #7863: Added the web console to the 
quickstart tutorials and docs
URL: https://github.com/apache/incubator-druid/pull/7863#discussion_r294500799
 
 

 ##########
 File path: docs/content/tutorials/tutorial-batch.md
 ##########
 @@ -24,18 +24,98 @@ title: "Tutorial: Loading a file"
 
 # Tutorial: Loading a file
 
-## Getting started
-
 This tutorial demonstrates how to perform a batch file load, using Apache 
Druid (incubating)'s native batch ingestion.
 
 For this tutorial, we'll assume you've already downloaded Druid as described 
in 
 the [quickstart](index.html) using the `micro-quickstart` single-machine 
configuration and have it
 running on your local machine. You don't need to have loaded any data yet.
 
-## Preparing the data and the ingestion task spec
-
 A data load is initiated by submitting an *ingestion task* spec to the Druid 
Overlord. For this tutorial, we'll be loading the sample Wikipedia page edits 
data.
 
+You can write an ingestion spec by hand, or you can use the "Data loader" 
built into the Druid console, which samples your data and helps you build a 
spec iteratively.
+The data loader currently only supports native batch ingestion (streaming 
support coming soon).
+
+We've included a sample of Wikipedia edits from September 12, 2015 to get you 
started.
+
+
+## Loading data with the data loader
+
+Navigate to [localhost:8888](http://localhost:8888), click `Load data` in 
the console header, and select `Local disk`.
+
+![Data loader init](../tutorials/img/tutorial-batch-data-loader-01.png "Data 
loader init")
+
+Enter `quickstart/tutorial/` as the base directory and 
`wikiticker-2015-09-12-sampled.json.gz` as a filter.
+The base directory and the [wildcard file 
filter](https://commons.apache.org/proper/commons-io/apidocs/org/apache/commons/io/filefilter/WildcardFileFilter.html)
 are separate fields so that you can ingest data from multiple files.
+
+Click `Preview` and make sure that the data you are seeing is correct.
+
+![Data loader sample](../tutorials/img/tutorial-batch-data-loader-02.png "Data 
loader sample")
+
+Once the data is located, you can click `Next: Parse data` to go to the next 
step.
+The data loader will try to automatically determine the correct parser for the 
data.
+In this case it will successfully determine `json`.
+Feel free to play around with different parser options to get a preview of how 
Druid will parse your data.
+
+![Data loader parse data](../tutorials/img/tutorial-batch-data-loader-03.png 
"Data loader parse data")
+
+With the `json` parser selected, click `Next: Parse time` to get to the step 
centered around determining your primary timestamp column.
+Druid's architecture requires a primary timestamp column, which is stored 
internally as `__time`. If your data has no timestamp, you can select `Constant value` instead.
+In this case the data loader will guess the `time` column as the primary time 
column, as it is the only one with values that look like timestamps.
+
+![Data loader parse time](../tutorials/img/tutorial-batch-data-loader-04.png 
"Data loader parse time")
+
+Click `Next: ...` twice to go past the `Transform` and `Filter` steps. You do 
not need to enter anything there; applying ingestion-time transforms and 
filters is out of scope for this tutorial.
+
+In the `Configure schema` step, you can configure which dimensions (and 
metrics) will be ingested into Druid.
+The preview shows exactly how the data will appear in Druid once it is ingested.
+Since our dataset is very small, go ahead and turn off `Rollup` by clicking 
the switch and confirming the change.
+
+![Data loader schema](../tutorials/img/tutorial-batch-data-loader-05.png "Data 
loader schema")
+
+Once you are satisfied with the schema, click `Next` to go to the `Partition` 
step.
+Here you can fine-tune how the data will be split up into segments in Druid.
+Since this is such a small dataset, no adjustments are needed in this step.
+
+![Data loader partition](../tutorials/img/tutorial-batch-data-loader-06.png 
"Data loader partition")
+
+Clicking past the `Tune` step, we get to the `Publish` step, where we can 
specify what the datasource will be called in Druid.
+Let's name this datasource `wikipedia`.  
+
+![Data loader publish](../tutorials/img/tutorial-batch-data-loader-07.png 
"Data loader publish")
+
+Finally click `Next` to review your spec.
 
 Review comment:
   Finally**,** click

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
