[jira] [Commented] (IMPALA-3777) SqlParser parsed error for unicode

Quanlong Huang (Jira) Wed, 27 Jan 2021 17:21:07 -0800


    [ 
https://issues.apache.org/jira/browse/IMPALA-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17273245#comment-17273245
 ]


Quanlong Huang commented on IMPALA-3777:
----------------------------------------

I can't reproduce the issue now. It's unclear to me why the delimiter in the 
description can be multi-byte. Such a delimiter now causes an AnalysisException:
{code:java}
$ bin/impala-shell.sh 
Starting Impala Shell with no authentication using Python 2.7.16
Opened TCP connection to localhost:21050
Connected to localhost:21050
Server version: impalad version 4.0.0-SNAPSHOT DEBUG (build 08367e91f04508b54f77b56e0d211dd167b0116f)
***********************************************************************************
Welcome to the Impala shell.
(impala shell build version not available)

To see live updates on a query's progress, run 'set LIVE_SUMMARY=1;'.
***********************************************************************************
[localhost:21050] default> create table unicode_parse_error(id int) row format delimited fields terminated by '\u0023##';
Query: create table unicode_parse_error(id int) row format delimited fields terminated by '\u0023##'
ERROR: AnalysisException: ESCAPED BY values and LINE/FIELD terminators must be specified as a single character or as a decimal value in the range [-128:127]: ###
{code}
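The trailing `###` in the error suggests that `\u0023` was decoded to `#` before the check, which leaves a three-character terminator. Below is a minimal Python sketch of that rule as the error message states it. It is not Impala's frontend code: the names `unescape_unicode` and `check_terminator` are hypothetical, and limiting the single-character case to ASCII is my assumption.

```python
import re

def unescape_unicode(literal: str) -> str:
    """Decode \\uXXXX escapes (exactly four hex digits) in a literal's body."""
    return re.sub(
        r"\\u([0-9a-fA-F]{4})",
        lambda m: chr(int(m.group(1), 16)),
        literal,
    )

def check_terminator(literal: str) -> int:
    """Return the terminator as a signed byte value, or raise ValueError."""
    value = unescape_unicode(literal)
    # Case 1: a single character (assumed here to be limited to ASCII).
    if len(value) == 1 and ord(value) <= 127:
        return ord(value)
    # Case 2: a decimal value in the range [-128:127].
    if re.fullmatch(r"-?[0-9]+", value) and -128 <= int(value) <= 127:
        return int(value)
    raise ValueError(f"not a single character or a decimal in [-128:127]: {value}")

print(check_terminator(r"\u0023"))   # 35, the code point of '#'
try:
    check_terminator(r"\u0023##")
except ValueError as err:
    print(err)                        # message ends with '###'
```

Under these assumptions `'\u0023'` passes as the single character `#`, while `'\u0023##'` is rejected because it decodes to three characters.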
Using '\u0023' alone as the delimiter is OK and works as expected:
{code:java}
[localhost:21050] default> create table unicode_parse_error(id int) row format delimited fields terminated by '\u0023';
Query: create table unicode_parse_error(id int) row format delimited fields terminated by '\u0023'
+-------------------------+
| summary                 |
+-------------------------+
| Table has been created. |
+-------------------------+
Fetched 1 row(s) in 0.14s
[localhost:21050] default> describe extended unicode_parse_error;
Query: describe extended unicode_parse_error
+------------------------------+------------------------------------------------------------+----------------------+
| name                         | type                                                       | comment              |
+------------------------------+------------------------------------------------------------+----------------------+
| # col_name                   | data_type                                                  | comment              |
|                              | NULL                                                       | NULL                 |
| id                           | int                                                        | NULL                 |
|                              | NULL                                                       | NULL                 |
| # Detailed Table Information | NULL                                                       | NULL                 |
| Database:                    | default                                                    | NULL                 |
| OwnerType:                   | USER                                                       | NULL                 |
| Owner:                       | quanlong                                                   | NULL                 |
| CreateTime:                  | Thu Jan 28 09:18:52 CST 2021                               | NULL                 |
| LastAccessTime:              | UNKNOWN                                                    | NULL                 |
| Retention:                   | 0                                                          | NULL                 |
| Location:                    | hdfs://localhost:20500/test-warehouse/unicode_parse_error  | NULL                 |
| Table Type:                  | EXTERNAL_TABLE                                             | NULL                 |
| Table Parameters:            | NULL                                                       | NULL                 |
|                              | EXTERNAL                                                   | TRUE                 |
|                              | OBJCAPABILITIES                                            | EXTREAD,EXTWRITE     |
|                              | TRANSLATED_TO_EXTERNAL                                     | TRUE                 |
|                              | external.table.purge                                       | TRUE                 |
|                              | transient_lastDdlTime                                      | 1611796732           |
|                              | NULL                                                       | NULL                 |
| # Storage Information        | NULL                                                       | NULL                 |
| SerDe Library:               | org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe         | NULL                 |
| InputFormat:                 | org.apache.hadoop.mapred.TextInputFormat                   | NULL                 |
| OutputFormat:                | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | NULL                 |
| Compressed:                  | No                                                         | NULL                 |
| Num Buckets:                 | 0                                                          | NULL                 |
| Bucket Columns:              | []                                                         | NULL                 |
| Sort Columns:                | []                                                         | NULL                 |
| Storage Desc Params:         | NULL                                                       | NULL                 |
|                              | field.delim                                                | #                    |
|                              | serialization.format                                       | #                    |
|                              | NULL                                                       | NULL                 |
| # Constraints                | NULL                                                       | NULL                 |
+------------------------------+------------------------------------------------------------+----------------------+
Fetched 33 row(s) in 4.54s
{code}
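For reference, `\u0023` is code point 0x23 (decimal 35), i.e. `#`, which matches the `field.delim` shown above. The `\u0017` in the original description is 0x17, i.e. decimal 23. One possible reading, which is only my assumption and not something confirmed in this issue, is that the old parser read the digits `0023` as decimal instead of hexadecimal. The arithmetic can be checked with a small Python sketch (both helper names are hypothetical):

```python
def decode_hex(digits: str) -> str:
    """Interpret the escape digits as hexadecimal (the expected behaviour)."""
    return chr(int(digits, 16))

def decode_decimal(digits: str) -> str:
    """Interpret the same digits as decimal (the suspected misreading)."""
    return chr(int(digits, 10))

print(repr(decode_hex("0023")))      # '#'
print(repr(decode_decimal("0023")))  # '\x17'
```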

> SqlParser parsed error for unicode
> ----------------------------------
>
>                 Key: IMPALA-3777
>                 URL: https://issues.apache.org/jira/browse/IMPALA-3777
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Frontend
>    Affects Versions: Impala 2.2.4
>         Environment: CentOS 6.7 64 bit. impalad version 2.7.0-cdh5-INTERNAL 
> DEBUG
>            Reporter: Yuanhao Luo
>            Priority: Minor
>              Labels: correctness, downgraded
>         Attachments: After calling SqlParser.parse.JPG, Before calling 
> SqlParser.parse.JPG
>
>
> When I run the query: create table unicode_parse_error(id int) row format 
> delimited fields terminated by '\u0023##'; the field delimiter becomes 
> '\u0017##'.
> Logs:
> {noformat}
> [nobida147:21000] > create table unicode_parse_error(id int) row format delimited fields terminated by '\u0023##';
> Query: create table unicode_parse_error(id int) row format delimited fields terminated by '\u0023##'
> Fetched 0 row(s) in 242.44s
> [nobida147:21000] > describe extended unicode_parse_error;
> Query: describe extended unicode_parse_error
> +------------------------------+------------------------------------------------------------------+----------------------+
> | name                         | type                                                             | comment              |
> +------------------------------+------------------------------------------------------------------+----------------------+
> | # col_name                   | data_type                                                        | comment              |
> |                              | NULL                                                             | NULL                 |
> | id                           | int                                                              | NULL                 |
> |                              | NULL                                                             | NULL                 |
> | # Detailed Table Information | NULL                                                             | NULL                 |
> | Database:                    | db1                                                              | NULL                 |
> | Owner:                       | root                                                             | NULL                 |
> | CreateTime:                  | Thu Jun 23 15:54:20 CST 2016                                     | NULL                 |
> | LastAccessTime:              | UNKNOWN                                                          | NULL                 |
> | Protect Mode:                | None                                                             | NULL                 |
> | Retention:                   | 0                                                                | NULL                 |
> | Location:                    | hdfs://localhost:20500/test-warehouse/db1.db/unicode_parse_error | NULL                 |
> | Table Type:                  | MANAGED_TABLE                                                    | NULL                 |
> | Table Parameters:            | NULL                                                             | NULL                 |
> |                              | transient_lastDdlTime                                            | 1466668460           |
> |                              | NULL                                                             | NULL                 |
> | # Storage Information        | NULL                                                             | NULL                 |
> | SerDe Library:               | org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe               | NULL                 |
> | InputFormat:                 | org.apache.hadoop.mapred.TextInputFormat                         | NULL                 |
> | OutputFormat:                | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat       | NULL                 |
> | Compressed:                  | No                                                               | NULL                 |
> | Num Buckets:                 | 0                                                                | NULL                 |
> | Bucket Columns:              | []                                                               | NULL                 |
> | Sort Columns:                | []                                                               | NULL                 |
> | Storage Desc Params:         | NULL                                                             | NULL                 |
> |                              | field.delim                                                      | \u0017##             |
> |                              | serialization.format                                             | \u0017##             |
> +------------------------------+------------------------------------------------------------------+----------------------+
> Fetched 27 row(s) in 4.77s
> {noformat}
> After debugging, it seems that SqlParser.parse() goes wrong. As the attachments 
> show, before calling SqlParser.parse() the statement is: fields terminated by 
> '\u0023##', but after parsing it becomes '\u0017##'.


