[ 
https://issues.apache.org/jira/browse/DRILL-6249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16468435#comment-16468435
 ] 

ASF GitHub Bot commented on DRILL-6249:
---------------------------------------

paul-rogers commented on a change in pull request #1251: DRILL-6249: Adding more unit testing documentation.
URL: https://github.com/apache/drill/pull/1251#discussion_r186933018
 
 

 ##########
 File path: docs/dev/TestDataSets.md
 ##########
 @@ -0,0 +1,146 @@
+# Data Sets
+
+Drill includes several data sets for testing and also provides tools for generating test data sets.
+
+## Bundled Data Sets
+
+There are three primary data sets bundled with Drill for testing:
+
+  - **Sample Data:** These are Parquet files in the [sample-data](../sample-data) folder.
+  - **Resource Data:** These are data files in the [exec/java-exec/src/test/resources](../exec/java-exec/src/test/resources) folder.
+  - **TPCH Data:** These are trimmed-down versions of the TPC-H data sets. They are retrieved and bundled
+  in the [contrib/data](../contrib/data) Maven submodule. They are also accessible on [Apache Drill's S3 bucket](http://apache-drill.s3.amazonaws.com/files/sf-0.01_tpc-h_parquet.tgz).
+  When unit tests are running, all of the files in these data sets are available from the classpath storage plugin. The TPC-H
+  files include:
+    - **customer.parquet**
+    - **lineitem.parquet**
+    - **nation.parquet**
+    - **orders.parquet**
+    - **part.parquet**
+    - **partsupp.parquet**
+    - **region.parquet**
+    - **supplier.parquet**
+  
+### Using Sample Data in Unit Tests
+
+When using the [BaseDirTestWatcher](../exec/java-exec/src/test/java/org/apache/drill/test/BaseDirTestWatcher.java), you
+can make [sample-data](../sample-data) accessible from the `dfs` storage plugin by doing the following:
+
+```
+public class TestMyClass {
+  @ClassRule
+  public static final BaseDirTestWatcher dirTestWatcher = new BaseDirTestWatcher();
+  
+  @BeforeClass
+  public static void setupFiles() {
+    dirTestWatcher.copyFileToRoot(Paths.get("sample-data", "region.parquet"));
+  }
+  
+  @Test
+  public void simpleTest() {
+     // dfs.root.`sample-data/region.parquet` will be accessible from my test
+  }
+}
+```
+
+Or, if you are extending [BaseTestQuery](../exec/java-exec/src/test/java/org/apache/drill/test/BaseTestQuery.java), use the `dirTestWatcher` it already provides:
+
+```
+public class TestMyClass extends BaseTestQuery {
+  @BeforeClass
+  public static void setupFiles() {
+    dirTestWatcher.copyFileToRoot(Paths.get("sample-data", "region.parquet"));
+  }
+  
+  @Test
+  public void simpleTest() {
+     // dfs.root.`sample-data/region.parquet` will be accessible from my test
+  }
+}
 
 Review comment:
   This can be done very simply with `ClusterFixture`. See 
`ClusterFixture.defineWorkspace(...)`
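
   Whichever fixture is used, the setup step in the examples above appears to boil down to placing a file under the test root at the same relative path, so that `sample-data/region.parquet` resolves inside the workspace. A minimal plain-Java sketch of that idea follows. It uses no Drill classes: `StageFileSketch`, `stageFile`, and the temporary directories are hypothetical names for illustration, and this is not Drill's implementation of `copyFileToRoot` or `defineWorkspace`.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

public class StageFileSketch {

  /**
   * Copies base/relative to root/relative, creating any missing parent
   * directories under root. Returns the path of the staged copy.
   * Hypothetical helper; not part of Drill.
   */
  static Path stageFile(Path base, Path relative, Path root) throws IOException {
    Path target = root.resolve(relative);
    Files.createDirectories(target.getParent());
    return Files.copy(base.resolve(relative), target, StandardCopyOption.REPLACE_EXISTING);
  }

  public static void main(String[] args) throws IOException {
    // Stand-ins for the source tree and the per-test root directory.
    Path base = Files.createTempDirectory("repo");
    Path root = Files.createTempDirectory("root");
    Path relative = Paths.get("sample-data", "region.parquet");

    // Create a placeholder source file so the sketch runs anywhere.
    Files.createDirectories(base.resolve(relative).getParent());
    Files.write(base.resolve(relative), new byte[] {1, 2, 3});

    Path staged = stageFile(base, relative, root);
    // The staged file keeps its path relative to the root.
    System.out.println(root.relativize(staged));
  }
}
```

   With `ClusterFixture.defineWorkspace(...)`, the copy may be unnecessary, since a workspace can presumably point at the existing directory instead.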



> Add Markdown Docs for Unit Testing and Link to it in README.md
> --------------------------------------------------------------
>
>                 Key: DRILL-6249
>                 URL: https://issues.apache.org/jira/browse/DRILL-6249
>             Project: Apache Drill
>          Issue Type: Improvement
>            Reporter: Timothy Farkas
>            Assignee: Timothy Farkas
>            Priority: Major
>              Labels: ready-to-commit
>             Fix For: 1.14.0
>
>
> I am working on a presentation about how to use the unit testing utilities in 
> Drill. Instead of writing the doc and having it be lost in Google Drive 
> somewhere, I am going to add a Markdown doc to the Drill repo and link to it 
> in the README.md. This is appropriate since these docs will only be used by 
> developers, and the way we unit test will change as the code changes. The 
> unit testing docs should therefore be kept in the same repo as the code so 
> they can be updated and kept in sync with the rest of Drill.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
