Re: extractHeader in session variable?

Christopher Matta Mon, 29 Feb 2016 12:28:11 -0800

Ah, thanks Jason, that worked:

0: jdbc:drill:> select * from
TABLE(maprfs.cmatta.`weather/data/PHL/2012/1/data.csv` (type =>
'text', extractHeader => true, fieldDelimiter => ',')) limit 10;
+-----------+---------------+------------+-----------+---------------------+----------------+----------------+---------------+---------------+------------------+---------+----------------+-----------------+----------------------+
|  TimeEST  | TemperatureF  | DewPointF  | Humidity  |
SeaLevelPressureIn  | VisibilityMPH  | WindDirection  | WindSpeedMPH
| GustSpeedMPH  | PrecipitationIn  | Events  |   Conditions   |
WindDirDegrees  |       DateUTC        |
+-----------+---------------+------------+-----------+---------------------+----------------+----------------+---------------+---------------+------------------+---------+----------------+-----------------+----------------------+
| 12:54 AM  | 46.9          | 35.1       | 63        | 30.09
    | 10.0           | WNW            | 5.8           | -
| N/A              |         | Clear          | 290             |
2012-01-01 05:54:00  |
| 1:54 AM   | 42.1          | 34.0       | 73        | 30.09
    | 10.0           | Calm           | Calm          | -
| N/A              |         | Clear          | 0               |
2012-01-01 06:54:00  |
| 2:54 AM   | 39.9          | 34.0       | 79        | 30.11
    | 10.0           | West           | 9.2           | -
| N/A              |         | Clear          | 260             |
2012-01-01 07:54:00  |
| 3:54 AM   | 37.9          | 33.1       | 83        | 30.11
    | 10.0           | SW             | 4.6           | -
| N/A              |         | Clear          | 230             |
2012-01-01 08:54:00  |
| 4:54 AM   | 37.9          | 34.0       | 86        | 30.11
    | 10.0           | SW             | 4.6           | -
| N/A              |         | Clear          | 230             |
2012-01-01 09:54:00  |
| 5:54 AM   | 36.0          | 32.0       | 86        | 30.13
    | 10.0           | WSW            | 5.8           | -
| N/A              |         | Clear          | 240             |
2012-01-01 10:54:00  |
| 6:54 AM   | 37.9          | 34.0       | 86        | 30.13
    | 10.0           | WSW            | 6.9           | -
| N/A              |         | Clear          | 250             |
2012-01-01 11:54:00  |
| 7:54 AM   | 39.0          | 35.1       | 86        | 30.13
    | 10.0           | SW             | 3.5           | -
| N/A              |         | Clear          | 220             |
2012-01-01 12:54:00  |
| 8:54 AM   | 42.1          | 37.0       | 82        | 30.13
    | 10.0           | WSW            | 3.5           | -
| N/A              |         | Clear          | 240             |
2012-01-01 13:54:00  |
| 9:54 AM   | 45.0          | 37.0       | 74        | 30.11
    | 10.0           | South          | 4.6           | -
| N/A              |         | Partly Cloudy  | 180             |
2012-01-01 14:54:00  |
+-----------+---------------+------------+-----------+---------------------+----------------+----------------+---------------+---------------+------------------+---------+----------------+-----------------+----------------------+
10 rows selected (0.381 seconds)




Chris Matta
[email protected]
215-701-3146

On Mon, Feb 29, 2016 at 2:51 PM, Jason Altekruse <[email protected]>
wrote:

> You need to specify the delimiter, it doesn't seem to default to comma as
> the field delimiter.
>
> On Mon, Feb 29, 2016 at 11:46 AM, Christopher Matta <[email protected]>
> wrote:
>
> > Actually I spoke too soon.
> >
> > Running what Jacques suggested returned a single column for all the data,
> > with all the headers smashed together:
> >
> > 0: jdbc:drill:> select * from
> > TABLE(maprfs.cmatta.`weather/data/PHL/2012/1/data.csv` (type =>
> > 'text', extractHeader => true)) limit 10;
> >
> >
> +---------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
> > |
> >
> TimeEST,TemperatureF,DewPointF,Humidity,SeaLevelPressureIn,VisibilityMPH,WindDirection,WindSpeedMPH,GustSpeedMPH,PrecipitationIn,Events,Conditions,WindDirDegrees,DateUTC
> > |
> >
> >
> +---------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
> > | 12:54 AM,46.9,35.1,63,30.09,10.0,WNW,5.8,-,N/A,,Clear,290,2012-01-01
> > 05:54:00 |
> > | 1:54 AM,42.1,34.0,73,30.09,10.0,Calm,Calm,-,N/A,,Clear,0,2012-01-01
> > 06:54:00 |
> > | 2:54 AM,39.9,34.0,79,30.11,10.0,West,9.2,-,N/A,,Clear,260,2012-01-01
> > 07:54:00 |
> > | 3:54 AM,37.9,33.1,83,30.11,10.0,SW,4.6,-,N/A,,Clear,230,2012-01-01
> > 08:54:00 |
> > | 4:54 AM,37.9,34.0,86,30.11,10.0,SW,4.6,-,N/A,,Clear,230,2012-01-01
> > 09:54:00 |
> > | 5:54 AM,36.0,32.0,86,30.13,10.0,WSW,5.8,-,N/A,,Clear,240,2012-01-01
> > 10:54:00 |
> > | 6:54 AM,37.9,34.0,86,30.13,10.0,WSW,6.9,-,N/A,,Clear,250,2012-01-01
> > 11:54:00 |
> > | 7:54 AM,39.0,35.1,86,30.13,10.0,SW,3.5,-,N/A,,Clear,220,2012-01-01
> > 12:54:00 |
> > | 8:54 AM,42.1,37.0,82,30.13,10.0,WSW,3.5,-,N/A,,Clear,240,2012-01-01
> > 13:54:00 |
> > | 9:54 AM,45.0,37.0,74,30.11,10.0,South,4.6,-,N/A,,Partly
> > Cloudy,180,2012-01-01 14:54:00 |
> >
> >
> +---------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
> > 10 rows selected (0.343 seconds)
> >
> > A regular select of this data shows the columns:
> >
> > 0: jdbc:drill:> select * from
> > maprfs.cmatta.`weather/data/PHL/2012/1/data.csv` limit 10;
> > +---------+
> > | columns |
> > +---------+
> > |
> >
> ["TimeEST","TemperatureF","DewPointF","Humidity","SeaLevelPressureIn","VisibilityMPH","WindDirection","WindSpeedMPH","GustSpeedMPH","PrecipitationIn","Events","Conditions","WindDirDegrees","DateUTC"]
> > |
> > | ["12:54
> >
> AM","46.9","35.1","63","30.09","10.0","WNW","5.8","-","N/A","","Clear","290","2012-01-01
> > 05:54:00"] |
> > | ["1:54
> >
> AM","42.1","34.0","73","30.09","10.0","Calm","Calm","-","N/A","","Clear","0","2012-01-01
> > 06:54:00"] |
> > | ["2:54
> >
> AM","39.9","34.0","79","30.11","10.0","West","9.2","-","N/A","","Clear","260","2012-01-01
> > 07:54:00"] |
> > | ["3:54
> >
> AM","37.9","33.1","83","30.11","10.0","SW","4.6","-","N/A","","Clear","230","2012-01-01
> > 08:54:00"] |
> > | ["4:54
> >
> AM","37.9","34.0","86","30.11","10.0","SW","4.6","-","N/A","","Clear","230","2012-01-01
> > 09:54:00"] |
> > | ["5:54
> >
> AM","36.0","32.0","86","30.13","10.0","WSW","5.8","-","N/A","","Clear","240","2012-01-01
> > 10:54:00"] |
> > | ["6:54
> >
> AM","37.9","34.0","86","30.13","10.0","WSW","6.9","-","N/A","","Clear","250","2012-01-01
> > 11:54:00"] |
> > | ["7:54
> >
> AM","39.0","35.1","86","30.13","10.0","SW","3.5","-","N/A","","Clear","220","2012-01-01
> > 12:54:00"] |
> > | ["8:54
> >
> AM","42.1","37.0","82","30.13","10.0","WSW","3.5","-","N/A","","Clear","240","2012-01-01
> > 13:54:00"] |
> > +---------+
> > 10 rows selected (0.387 seconds)
> >
> > The above behavior isn’t expected is it?
> > 
> >
> > Chris Matta
> > [email protected]
> > 215-701-3146
> >
> > On Mon, Feb 29, 2016 at 2:39 PM, Christopher Matta <[email protected]>
> > wrote:
> >
> > > Thanks Jacques, that worked.
> > >
> > > Chris Matta
> > > [email protected]
> > > 215-701-3146
> > >
> > > On Sat, Feb 27, 2016 at 1:01 PM, Jacques Nadeau <[email protected]>
> > > wrote:
> > >
> > >> You should use select with options.
> > >>
> > >> Examples are here:
> > >> https://drill.apache.org/blog/2015/12/14/drill-1.4-released/
> > >>
> > >> E.g.:
> > >>
> > >> SELECT * FROM TABLE(dfs.`/path/to/CO.dat`(type => 'text',
> extractHeader
> > =>
> > >> true));
> > >>
> > >> --
> > >> Jacques Nadeau
> > >> CTO and Co-Founder, Dremio
> > >>
> > >> On Fri, Feb 26, 2016 at 12:38 PM, Christopher Matta <[email protected]>
> > >> wrote:
> > >>
> > >> > Is it possible to set the extractHeader option for CSV/TSV in a
> > session
> > >> > variable? Doing it on the format type is just too broad sometimes
> and
> > >> I'd
> > >> > like to be able to set it based on the files I'm querying.
> > >> >
> > >> >
> > >> > Chris Matta
> > >> > [email protected]
> > >> > 215-701-3146
> > >> >
> > >>
> > >
> > >
> >
>

Re: extractHeader in session variable?

Reply via email to