Github user dilipbiswal commented on the issue:
https://github.com/apache/spark/pull/15334
@gatorsmile Hi Sean, I tried Apache Drill after looking through their
documentation, and it is able to encode interval data into Parquet.
```
0: jdbc:drill:zk=local> CREATE TABLE dfs.tmp.parquet_intervals AS
. . . . . . . . . . . > (SELECT CAST( INTERVALYEAR_col as INTERVAL YEAR) INTERVALYEAR_col,
. . . . . . . . . . . > CAST( INTERVALDAY_col as INTERVAL DAY) INTERVALDAY_col,
. . . . . . . . . . . > CAST( INTERVAL_col as INTERVAL SECOND) INTERVAL_col
. . . . . . . . . . . > FROM dfs.`/tmp/intervals.json`);
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
+-----------+----------------------------+
| Fragment  | Number of records written  |
+-----------+----------------------------+
| 0_0       | 3                          |
+-----------+----------------------------+
```
Here is the schema of the written Parquet file:
```
message root {
  optional fixed_len_byte_array(12) INTERVALYEAR_col (INTERVAL);
  optional fixed_len_byte_array(12) INTERVALDAY_col (INTERVAL);
  optional fixed_len_byte_array(12) INTERVAL_col (INTERVAL);
}
```
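For context on what those `fixed_len_byte_array(12) (INTERVAL)` columns hold: the Parquet format spec defines the INTERVAL logical type as 12 bytes, three little-endian unsigned 32-bit integers storing months, days, and milliseconds. A minimal sketch of decoding one such value (the function name and sample bytes are illustrative, not from Drill's code):

```python
import struct

def decode_parquet_interval(raw: bytes):
    """Decode Parquet's INTERVAL logical type: a fixed_len_byte_array(12)
    containing three little-endian uint32 values (months, days, millis)."""
    if len(raw) != 12:
        raise ValueError("Parquet INTERVAL values are exactly 12 bytes")
    months, days, millis = struct.unpack("<III", raw)
    return months, days, millis

# Hypothetical value: 1 year 2 months, 3 days, 4.5 seconds
raw = struct.pack("<III", 14, 3, 4500)
print(decode_parquet_interval(raw))  # (14, 3, 4500)
```

This single encoding covers all three Drill columns above, which is why the schema shows the same physical type for INTERVAL YEAR, INTERVAL DAY, and INTERVAL SECOND.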
From Presto's documentation, it seems like it may also be able to encode
interval data, but I haven't tried it.
FYI - I also tried Hive; it is not possible to encode interval data in
Parquet format through Hive.