[ 
https://issues.apache.org/jira/browse/DRILL-4709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15319065#comment-15319065
 ] 

Paul Rogers commented on DRILL-4709:
------------------------------------

Thanks Julian! Every BI tool I've ever used or worked on struggles to create 
its own sample data. Great to see you are maintaining a good data set.

It seems that the key piece of information for Drill users, the schema diagram, 
is broken on the GitHub site. Is there another copy somewhere?

Julian's link above is a good resource. Perhaps Drill's documentation could 
include the following in the Sample Datasets section:

* A pointer to Julian's page above. Also, the JSON version: 
https://github.com/julianhyde/foodmart-data-json
* Instructions for accessing the data using the "cp" data source.
* If the image/documentation link is fixed on Julian's site, include that as a 
link.
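
For the second bullet, a rough sketch of how the doc could show readers which 
files are available through "cp". It reads jar entry names on stdin and prints 
one sample query per JSON file. The helper name cp_queries and the LIMIT 5 are 
made up for illustration, and it assumes each JSON file is addressed by its 
entry path inside the jar, which I have not checked for every file.

```shell
# Hypothetical helper: turn jar entry names (one per line on stdin)
# into sample Drill queries against the "cp" storage plugin.
# Assumption: a JSON entry is queried as cp.`<entry path in the jar>`.
cp_queries() {
  grep '\.json$' | while IFS= read -r entry; do
    # Single quotes keep the backticks literal; LIMIT 5 is arbitrary.
    printf 'SELECT * FROM cp.`%s` LIMIT 5;\n' "$entry"
  done
}

# Usage, with the jar path from the issue description
# (unzip -Z1 lists entry names only, without extracting):
#   unzip -Z1 "$DRILL_HOME/jars/3rdparty/foodmart-data-json-0.4.jar" | cp_queries
```

The printed statements can then be pasted into a Drill shell one at a time to 
peek at each table.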

> Document the included Foodmart sample data
> ------------------------------------------
>
>                 Key: DRILL-4709
>                 URL: https://issues.apache.org/jira/browse/DRILL-4709
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Documentation
>    Affects Versions: 1.6.0
>            Reporter: Paul Rogers
>            Priority: Minor
>
> Drill includes a JSON version of the Mondrian FoodMart sample data. This data 
> appears in the $DRILL_HOME/jars/3rdparty/foodmart-data-json-0.4.jar jar file, 
> accessible using the class path storage plugin.
> The documentation mentions using the cp plugin to access customers.json. 
> However, the FoodMart data set is quite rich, with many example files.
> As it is, unless someone is a curious developer, and good with Google, they 
> won't be able to find the other data sets or the source of the FoodMart data.
> The data appears to be a JSON version of the SQL sample data for the Mondrian 
> project. A schema description is here: 
> https://github.com/pentaho/mondrian/blob/master/demo/FoodMart.xml
> The Mondrian data appears to have originated at Microsoft to highlight their 
> circa 2000 OLAP projects, but has since been discontinued. See
> * http://sqlmag.com/development/dts-2000-action
> * https://technet.microsoft.com/en-us/library/aa217032(v=sql.80).aspx
> * http://sqlmag.com/sql-server/desperately-seeking-samples
> Or do a Google search for "microsoft foodmart database".
> The request is to:
> 1. Credit MS and Mondrian for the data.
> 2. Either explain the data (which is quite a bit of work), or
> 3. Explain how to extract the files from the jar file to explore manually.
> 4. Provide a pointer to a description of the schema (if such can be found.)
> For option 3:
>
>     cd $DRILL_HOME/jars/3rdparty
>     unzip foodmart-data-json-0.4.jar -d ~/foodmart
>     cd ~/foodmart
>     ls
>
> Looking at the data, it is clear that SOME description is needed to 
> understand the many tables and how they might work with Drill.



