Mariano Dominguez created HIVE-7631:
---------------------------------------

             Summary: Add support for NOT operator in column regex
                 Key: HIVE-7631
                 URL: https://issues.apache.org/jira/browse/HIVE-7631
             Project: Hive
          Issue Type: Improvement
    Affects Versions: 0.12.0
            Reporter: Mariano Dominguez


Given the following table:
0: jdbc:hive2://localhost:10000/default> DESCRIBE regex_column_tb;          
+-----------------------+-----------------------+-----------------------+
|       col_name        |       data_type       |        comment        |
+-----------------------+-----------------------+-----------------------+
| stage_c1              | int                   | None                  |
| stage_c2              | int                   | None                  |
| c1                    | int                   | None                  |
| c2                    | int                   | None                  |
+-----------------------+-----------------------+-----------------------+

A simple regex can be used to select certain columns:
0: jdbc:hive2://localhost:10000/default> SELECT `stage_.*` FROM regex_column_tb;
+-----------+-----------+
| stage_c1  | stage_c2  |
+-----------+-----------+
+-----------+-----------+

The regex “(?!stage_).*” using the NOT operator is a valid Java regex, but it 
does not seem to be supported in Hive:
0: jdbc:hive2://localhost:10000/default> SELECT `(?!stage_).*` FROM 
regex_column_tb;
Error: Error while compiling statement: FAILED: ParseException line 1:17 
mismatched input ')' expecting FROM near 'stage_' in from clause 
(state=42000,code=40000)

The following regex is supported (HIVE-420), but it is not as intuitive:
0: jdbc:hive2://localhost:10000/default> SELECT `(stage_.*)?+.+` FROM 
regex_column_tb;                             
+-----+-----+
| c1  | c2  |
+-----+-----+
+-----+-----+




--
This message was sent by Atlassian JIRA
(v6.2#6252)

Reply via email to