Actually I spoke too soon. Running what Jacques suggested returned a single column for all the data, with all the headers smashed together:
0: jdbc:drill:> select * from TABLE(maprfs.cmatta.`weather/data/PHL/2012/1/data.csv` (type => 'text', extractHeader => true)) limit 10; +---------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | TimeEST,TemperatureF,DewPointF,Humidity,SeaLevelPressureIn,VisibilityMPH,WindDirection,WindSpeedMPH,GustSpeedMPH,PrecipitationIn,Events,Conditions,WindDirDegrees,DateUTC | +---------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | 12:54 AM,46.9,35.1,63,30.09,10.0,WNW,5.8,-,N/A,,Clear,290,2012-01-01 05:54:00 | | 1:54 AM,42.1,34.0,73,30.09,10.0,Calm,Calm,-,N/A,,Clear,0,2012-01-01 06:54:00 | | 2:54 AM,39.9,34.0,79,30.11,10.0,West,9.2,-,N/A,,Clear,260,2012-01-01 07:54:00 | | 3:54 AM,37.9,33.1,83,30.11,10.0,SW,4.6,-,N/A,,Clear,230,2012-01-01 08:54:00 | | 4:54 AM,37.9,34.0,86,30.11,10.0,SW,4.6,-,N/A,,Clear,230,2012-01-01 09:54:00 | | 5:54 AM,36.0,32.0,86,30.13,10.0,WSW,5.8,-,N/A,,Clear,240,2012-01-01 10:54:00 | | 6:54 AM,37.9,34.0,86,30.13,10.0,WSW,6.9,-,N/A,,Clear,250,2012-01-01 11:54:00 | | 7:54 AM,39.0,35.1,86,30.13,10.0,SW,3.5,-,N/A,,Clear,220,2012-01-01 12:54:00 | | 8:54 AM,42.1,37.0,82,30.13,10.0,WSW,3.5,-,N/A,,Clear,240,2012-01-01 13:54:00 | | 9:54 AM,45.0,37.0,74,30.11,10.0,South,4.6,-,N/A,,Partly Cloudy,180,2012-01-01 14:54:00 | +---------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ 10 rows selected (0.343 seconds) A regular select of this data shows the columns: 0: jdbc:drill:> select * from maprfs.cmatta.`weather/data/PHL/2012/1/data.csv` limit 10; +---------+ | columns | +---------+ | ["TimeEST","TemperatureF","DewPointF","Humidity","SeaLevelPressureIn","VisibilityMPH","WindDirection","WindSpeedMPH","GustSpeedMPH","PrecipitationIn","Events","Conditions","WindDirDegrees","DateUTC"] | | ["12:54 AM","46.9","35.1","63","30.09","10.0","WNW","5.8","-","N/A","","Clear","290","2012-01-01 05:54:00"] | | ["1:54 AM","42.1","34.0","73","30.09","10.0","Calm","Calm","-","N/A","","Clear","0","2012-01-01 06:54:00"] | | ["2:54 AM","39.9","34.0","79","30.11","10.0","West","9.2","-","N/A","","Clear","260","2012-01-01 07:54:00"] | | ["3:54 AM","37.9","33.1","83","30.11","10.0","SW","4.6","-","N/A","","Clear","230","2012-01-01 08:54:00"] | | ["4:54 AM","37.9","34.0","86","30.11","10.0","SW","4.6","-","N/A","","Clear","230","2012-01-01 09:54:00"] | | ["5:54 AM","36.0","32.0","86","30.13","10.0","WSW","5.8","-","N/A","","Clear","240","2012-01-01 10:54:00"] | | ["6:54 AM","37.9","34.0","86","30.13","10.0","WSW","6.9","-","N/A","","Clear","250","2012-01-01 11:54:00"] | | ["7:54 AM","39.0","35.1","86","30.13","10.0","SW","3.5","-","N/A","","Clear","220","2012-01-01 12:54:00"] | | ["8:54 AM","42.1","37.0","82","30.13","10.0","WSW","3.5","-","N/A","","Clear","240","2012-01-01 13:54:00"] | +---------+ 10 rows selected (0.387 seconds) The above behavior isn’t expected is it? Chris Matta [email protected] 215-701-3146 On Mon, Feb 29, 2016 at 2:39 PM, Christopher Matta <[email protected]> wrote: > Thanks Jacques, that worked. > > Chris Matta > [email protected] > 215-701-3146 > > On Sat, Feb 27, 2016 at 1:01 PM, Jacques Nadeau <[email protected]> > wrote: > >> You should use select with options. >> >> Examples are here: >> https://drill.apache.org/blog/2015/12/14/drill-1.4-released/ >> >> E.g.: >> >> SELECT * FROM TABLE(dfs.`/path/to/CO.dat`(type => 'text', extractHeader => >> true)); >> >> -- >> Jacques Nadeau >> CTO and Co-Founder, Dremio >> >> On Fri, Feb 26, 2016 at 12:38 PM, Christopher Matta <[email protected]> >> wrote: >> >> > Is it possible to set the extractHeader option for CSV/TSV in a session >> > variable? Doing it on the format type is just too broad sometimes and >> I'd >> > like to be able to set it based on the files I'm querying. >> > >> > >> > Chris Matta >> > [email protected] >> > 215-701-3146 >> > >> > >
