That is correct. It looks like vimeo just finished the conversion. Try the link again.
On Thu, Aug 14, 2014 at 3:02 PM, Jason Altekruse <[email protected]> wrote: > Interestingly it appears to be re-directing to a URL that looks like it has > a video ID on it. I haven't used vimeo, but is it possible it was uploaded > but not yet made public? > > -Jason > > > On Thu, Aug 14, 2014 at 12:57 PM, Timothy Chen <[email protected]> wrote: > > > Hi Jim, > > > > The vimeo link gave me 404 though. > > > > Tim > > > > On Thu, Aug 14, 2014 at 12:31 PM, Jim Scott <[email protected]> wrote: > > > I recently gave a presentation on how to use Apache Drill with some > > > demonstrations. > > > > > > The video for the presentation is available here: > > > http://vimeo.com/chug/using-apache-drill > > > The slides are available here: > > > > > > http://www.slideshare.net/ChicagoHUG/using-apache-drill-chug-august-2014-jim-scott > > > > > > *The data used in this presentation can be found and downloaded (1.3GB > > CSV > > > file and a 100KB CSV file) * > > > *here:* > > > > > > https://data.cityofchicago.org/Public-Safety/Crimes-2001-to-present/ijzp-q8t2 > > > https://data.cityofchicago.org/Public-Safety/Sex-Offenders/vc9r-bqvy > > > > > > Once you start your drill session you can connect to it via: > > > http://localhost:8047/ > > > > > > > > > *Queries (pay attention to back ticks) * > > > To see a full stack trace when errors occur: > > > alter session set `exec.errors.verbose`=true; > > > > > > > > > *Looking through the information schema:*SELECT * from > > > INFORMATION_SCHEMA.SCHEMATA; > > > SELECT * from INFORMATION_SCHEMA.`TABLES`; > > > SELECT * from INFORMATION_SCHEMA.`COLUMNS`; > > > SELECT COLUMN_NAME, DATA_TYPE FROM INFORMATION_SCHEMA.COLUMNS WHERE > > > TABLE_SCHEMA = 'sys' and TABLE_NAME = 'options'; > > > > > > > > > *How many of each type of incident:*select count(*) as incidents, > > > columns[5] as category from > > > dfs.`/home/jscott/Downloads/chicago/Crimes_-_2001_to_present.csv` group > > by > > > columns[5] order by incidents desc limit 10; > > > > > > > > > *How many of each type of incident and subtype:*select count(*) as > > > incidents, columns[5] as type, columns[6] as subtype from > > > dfs.`/home/jscott/Downloads/chicago/Crimes_-_2001_to_present.csv` group > > by > > > columns[6], columns[5] order by incidents desc limit 10; > > > > > > > > > *CREATE A TABLE in PARQUET format:* > > > > > > *Please note that I created a workspace "db" at my path of > /opt/drill/db > > > and I gave it a default storage format of parquet. * > > > > > > create table dfs.db.`crime_data` as select columns[0] as REFID, > > columns[1] > > > as CaseNumber, columns[2] as DateTime, columns[3] as Block, columns[4] > as > > > IUCR, columns[5] as PrimaryType, columns[6] as Description, columns[7] > as > > > LocationDescription, columns[8] as Arrest, columns[9] as Domestic, > > > columns[10] as Beat, columns[11] as District, columns[12] as Ward, > > > columns[13] as CommunityArea, columns[14] as FBICode, columns[15] as > > > XCoordinate, columns[16] as YCoordinate, columns[17] as YEAR_DATE, > > > columns[18] as UpdatedOn, columns[19] as Latitude, columns[20] as > > > Longitude, columns[21] as Location from > > > dfs.`/home/jscott/Downloads/chicago/Crimes_-_2001_to_present.csv`; > > > create table dfs.db.`sex_offenders` as select columns[0] as > > LAST,columns[1] > > > as FIRST,columns[2] as BLOCK,columns[3] as GENDER,columns[4] as > > > RACE,columns[5] as BIRTHDATE,columns[6] as AGE,columns[7] as > > > HEIGHT,columns[8] as WEIGHT,columns[9] as VICTIM_MINOR from > > > dfs.`/home/jscott/Downloads/chicago/Sex_Offenders.csv`; > > > > > > > > > *Run the previous query on the newly created table:*select count(*) as > > > incidents, primarytype, description from dfs.db.`crime_data` group by > > > description, primarytype order by incidents desc limit 10; > > > > > > > > > *How many crimes occurred on a block with a registered sex > > offender?*SELECT > > > count(*) as incidents FROM dfs.db.`crime_data` crimes JOIN > > > dfs.db.`sex_offenders` offenders ON crimes.block = offenders.block > ORDER > > BY > > > incidents DESC; > > > > > > > > > *What types of incidents occurred?*SELECT count(*) as incidents, > > > crimes.primarytype as type, crimes.description as subtype FROM > > > dfs.db.`crime_data` crimes JOIN dfs.db.`sex_offenders` offenders ON > > > crimes.block = offenders.block GROUP BY crimes.description, > > > crimes.primarytype ORDER BY incidents DESC LIMIT 20; > > > > > > > > > *What all is in the BATTERY category?*SELECT count(*) as incidents, > > > crimes.description as subtype FROM dfs.db.`crime_data` crimes JOIN > > > dfs.db.`sex_offenders` offenders ON crimes.block = offenders.block > WHERE > > > crimes.primarytype = 'BATTERY' GROUP BY crimes.description ORDER BY > > > incidents DESC; > > > > > > > > > *How many total BATTERY incidents?*SELECT count(*) as incidents FROM > > > dfs.db.`crime_data` crimes JOIN dfs.db.`sex_offenders` offenders ON > > > crimes.block = offenders.block WHERE crimes.primarytype = 'BATTERY'; > > > > > > > > > *How many "BATTERY" incidents occurred on those blocks by year (has it > > > gotten better or worse over time)?*SELECT count(*) as incidents, > > > extract(year from to_date(crimes.datetime, 'MM/DD/YYYY hh:mm:ss a')) as > > > `year` FROM dfs.db.`crime_data` crimes JOIN dfs.db.`sex_offenders` > > > offenders ON crimes.block = offenders.block WHERE crimes.primarytype = > > > 'BATTERY' and crimes.datetime not like '%Date%' GROUP BY extract(year > > from > > > to_date(crimes.datetime, 'MM/DD/YYYY hh:mm:ss a')) ORDER BY `year` ASC; > > > > > > > > > Do you have queries against this data set that you have written that > you > > > would like to share with others? Reply to the thread... > > > > > > > > > -- > > > *Jim Scott* > > > Director, Enterprise Strategy & Architecture > > > > > > <http://www.mapr.com/> > > > [image: MapR Technologies] <http://www.mapr.com> > > > -- *Jim Scott* Director, Enterprise Strategy & Architecture +1 (347) 746-9281 <http://www.mapr.com/> [image: MapR Technologies] <http://www.mapr.com>
