[ 
https://issues.apache.org/jira/browse/TRAFODION-1424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Subbiah closed TRAFODION-1424.
-------------------------------------
       Resolution: Fixed
    Fix Version/s: 2.0-incubating

> Enable CIF (compressed internal format) for Trafodion scan operator
> -------------------------------------------------------------------
>
>                 Key: TRAFODION-1424
>                 URL: https://issues.apache.org/jira/browse/TRAFODION-1424
>             Project: Apache Trafodion
>          Issue Type: Improvement
>          Components: sql-cmp, sql-exe
>    Affects Versions: 0.6 (pre-incubation)
>            Reporter: Suresh Subbiah
>            Assignee: Suresh Subbiah
>             Fix For: 2.0-incubating
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> When varchar data is read from a Trafodion table by the scan operator, it is 
> converted to exploded format by blank-padding the data to its maximum declared 
> length. A change contributed by Khaled Bouaziz extends Trafodion's compressed 
> internal format (CIF) feature to include the Traf scan operator.
> Here is an email exchange with Khaled on this subject:
> Hi Anoop:
> I made a few changes to use CIF with traf scan.
> The changes are mostly for the convert expression:
> -     Changed the row format of the convert row to aligned format
> -     Added code to use CIF so that we only use the needed space
> If we can store the CIF row length with the row when we insert/update, then we 
> can bypass the convert expression, I think.
> >>select * from testcif;   
> A            B
> -----------  ----------------------------------------------------------------------------------------------------
>           1  aaa
>           1  bbbbb
>           1  ccccccccccccc
>           1  eeeeeeeeeeeeeeeeeeeeeeeeeeeeee
> --- 4 row(s) selected.
> >>fc                  
> Expression: Convert Expr
> Expr Len: 632, Consts Len: 8
> flags_ = 0000000010001000   
>   Clause #1: ex_function_clause
>     OperatorTypeEnum = ITM_HEADER(2375), NumOperands = 1
>     ex_clause::flags_ = 0000000010000000                
>     ex_function_clause::flags_ = 0000000000000000       
>     PCODE  = supported                                  
>     Operand #0 (result):                                
>       Datatype = REC_BYTE_F_ASCII(0), Length = 16, Null Flag = 0
>       Precision = 0, Scale = 1, Collation = 1, flags_ = 0000001000001000
>       Tuple Data Format = SQLMX_ALIGNED_FORMAT                          
>       Atp = 1, AtpIndex = 2                                             
>       Offset = 0, NullIndOffset = -1, VClenIndOffset = -1               
>       RelOffset = 0, VoaOffset = -1, NullBitIdx = -1                    
>       NullIndLength = 0, VClenIndLength = 0                             
>       ValueId = 0                                                       
>       Text = Hdr                                                        
>   Clause #2: ex_conv_clause
>     OperatorTypeEnum = ITM_CAST(2452), NumOperands = 2
>     ex_clause::flags_ = 0000000010000000              
>     ex_conv_clause::flags_ = 0000000000000000         
>     PCODE  = supported                                
>     Operand #0 (result):                              
>       Datatype = REC_BIN64_SIGNED(134), Length = 8, Null Flag = 0
>       Precision = 0, Scale = 0, Collation = 1, flags_ = 0000001000001001
>       Tuple Data Format = SQLMX_ALIGNED_FORMAT                          
>       Atp = 1, AtpIndex = 2                                             
>       Offset = 16, NullIndOffset = -1, VClenIndOffset = -1              
>       RelOffset = 0, VoaOffset = -1, NullBitIdx = -1                    
>       NullIndLength = 0, VClenIndLength = 0                             
>       ValueId = 30                                                      
>       Text = cast                                                       
>     Operand #1:
>       Datatype = REC_BIN64_SIGNED(134), Length = 8, Null Flag = 0
>       Precision = 0, Scale = 0, Collation = 1, flags_ = 0000001000001001
>       Tuple Data Format = SQLMX_ALIGNED_FORMAT                          
>       Atp = 1, AtpIndex = 4                                             
>       Offset = 16, NullIndOffset = -1, VClenIndOffset = -1              
>       RelOffset = 0, VoaOffset = -1, NullBitIdx = -1                    
>       NullIndLength = 0, VClenIndLength = 0                             
>       ValueId = 29                                                      
>       Text = LARGEINT                                                   
>   Clause #3: ex_conv_clause
>     OperatorTypeEnum = ITM_CAST(2452), NumOperands = 2
>     ex_clause::flags_ = 0000000010000110              
>     ex_conv_clause::flags_ = 0000000000000000         
>     PCODE  = supported                                
>     Operand #0 (result):                              
>       Datatype = REC_BIN32_SIGNED(132), Length = 4, Null Flag = 1
>       Precision = 0, Scale = 0, Collation = 1, flags_ = 0000001000001001
>       Tuple Data Format = SQLMX_ALIGNED_FORMAT                          
>       Atp = 1, AtpIndex = 2                                             
>       Offset = 24, NullIndOffset = 12, VClenIndOffset = -1              
>       RelOffset = 8, VoaOffset = -1, NullBitIdx = 0                     
>       NullIndLength = 0, VClenIndLength = 0                             
>       ValueId = 32                                                      
>       Text = cast                                                       
>     Operand #1:
>       Datatype = REC_BIN32_SIGNED(132), Length = 4, Null Flag = 1
>       Precision = 0, Scale = 0, Collation = 1, flags_ = 0000001000001001
>       Tuple Data Format = SQLMX_ALIGNED_FORMAT                          
>       Atp = 1, AtpIndex = 4                                             
>       Offset = 24, NullIndOffset = 12, VClenIndOffset = -1              
>       RelOffset = 8, VoaOffset = -1, NullBitIdx = 0                     
>       NullIndLength = 0, VClenIndLength = 0                             
>       ValueId = 31                                                      
>       Text = INTEGER SIGNED                                             
>   Clause #4: ex_conv_clause
>     OperatorTypeEnum = ITM_CAST(2452), NumOperands = 2
>     ex_clause::flags_ = 0000000010000110              
>     ex_conv_clause::flags_ = 0000000000000000         
>     PCODE  = supported                                
>     Operand #0 (result):                              
>       Datatype = REC_BYTE_V_ASCII(64), Length = 100, Null Flag = 1
>       Precision = 0, Scale = 1, Collation = 1, flags_ = 0000001000001000
>       Tuple Data Format = SQLMX_ALIGNED_FORMAT                          
>       Atp = 1, AtpIndex = 2                                             
>       Offset = 32, NullIndOffset = 12, VClenIndOffset = 28              
>       RelOffset = 0, VoaOffset = 8, NullBitIdx = 1                      
>       NullIndLength = 0, VClenIndLength = 4                             
>       ValueId = 34                                                      
>       Text = cast                                                       
>     Operand #1:
>       Datatype = REC_BYTE_V_ASCII(64), Length = 100, Null Flag = 1
>       Precision = 0, Scale = 1, Collation = 1, flags_ = 0000001000001000
>       Tuple Data Format = SQLMX_ALIGNED_FORMAT                          
>       Atp = 1, AtpIndex = 4                                             
>       Offset = 32, NullIndOffset = 12, VClenIndOffset = 28              
>       RelOffset = 0, VoaOffset = 8, NullBitIdx = 1                      
>       NullIndLength = 0, VClenIndLength = 4                             
>       ValueId = 33                                                      
>       Text = VARCHAR(100) CHARACTER SET ISO88591                        
>   PCode:
> PCode Expr Length: 332
>     [1]               
>     HDR_MPTR32_IBIN32S_IBIN32S_IBIN32S_IBIN32S_IBIN32S (303) 4 0 16 16 4 12 4
>     MOVE_MBIN64S_MBIN64S (203) 4 16 5 16                                     
>     NOT_NULL_BRANCH_MBIN32S_MBIN32S_IATTR3_IBIN32S (248) 4 12 5 12 134219270 0 0 14  (Tgt: 3)
>     [2]
>     MOVE_MBIN32S_IBIN32S (3) 4 24 0
>     BRANCH (95) 6  (Tgt: 4)        
>     [3]  (Preds: 1 )
>     MOVE_MBIN32U_MBIN32U (202) 4 24 5 24
>     [4]  (Preds: 2 )
>     NOT_NULL_BRANCH_MBIN32S_MBIN32S_IATTR3_IBIN32S (248) 4 12 5 12 134219270 1 1 24  (Tgt: 6)
>     [5]
>     FILL_MEM_BYTES_VARIABLE (317) 4 32 8 100 1024 0 0
>     UPDATE_ROWLEN3_MATTR5_IBIN32S (316) 4 32 8 -1 1024 4
>     RETURN (264)                                        
>     [6]  (Preds: 4 )
>     MOVE_MATTR5_MATTR5 (284) 4 32 8 100 1024 5 32 8 100 1024
>     UPDATE_ROWLEN3_MATTR5_IBIN32S (316) 4 32 8 -1 1024 4    
>     RETURN (264)                                            
> Expression: ScanExpr is NULL
> Expression: RowIdExpr
> Expr Len: 384, Consts Len: 8
> flags_ = 0000000010001000   
>   Clause #1: ex_conv_clause 
>     OperatorTypeEnum = ITM_CAST(2452), NumOperands = 2
>     ex_clause::flags_ = 0000000010000000              
>     ex_conv_clause::flags_ = 0000000000010000         
>     PCODE  = supported                                
>     Operand #0 (result):                              
>       Datatype = REC_BIN64_SIGNED(134), Length = 8, Null Flag = 0
>       Precision = 0, Scale = 0, Collation = 1, flags_ = 0000001000001000
>       Tuple Data Format = SQLARK_EXPLODED_FORMAT                        
>       Atp = 0, AtpIndex = 1 (Temporary)                                 
>       Offset = 0, NullIndOffset = -1, VClenIndOffset = -1               
>       NullIndLength = 0, VClenIndLength = 0                             
>       ValueId = 36                                                      
>       Text = cast                                                       
>     Operand #1:
>       Datatype = REC_BYTE_V_ASCII(64), Length = 8, Null Flag = 0
>       Precision = 0, Scale = 1, Collation = 1, flags_ = 0000001000001000
>       Tuple Data Format = SQLARK_EXPLODED_FORMAT                        
>       Atp = 1, AtpIndex = 5                                             
>       Offset = 2, NullIndOffset = -1, VClenIndOffset = 0                
>       NullIndLength = 0, VClenIndLength = 2                             
>       ValueId = 35                                                      
>       Text = VARCHAR(8) CHARACTER SET ISO88591                          
>   Clause #2: ex_function_clause
>     OperatorTypeEnum = ITM_COMP_ENCODE(2114), NumOperands = 2
>     ex_clause::flags_ = 0000000010000000                     
>     ex_function_clause::flags_ = 0000000000000000            
>     PCODE  = supported                                       
>     Operand #0 (result):                                     
>       Datatype = REC_BYTE_F_ASCII(0), Length = 8, Null Flag = 0
>       Precision = 0, Scale = 1, Collation = 1, flags_ = 0000001000001000
>       Tuple Data Format = SQLARK_EXPLODED_FORMAT                        
>       Atp = 1, AtpIndex = 3                                             
>       Offset = 0, NullIndOffset = -1, VClenIndOffset = -1               
>       NullIndLength = 0, VClenIndLength = 0                             
>       ValueId = 37                                                      
>       Text = comp_encode                                                
>     Operand #1:
>       Datatype = REC_BIN64_SIGNED(134), Length = 8, Null Flag = 0
>       Precision = 0, Scale = 0, Collation = 1, flags_ = 0000001000001000
>       Tuple Data Format = SQLARK_EXPLODED_FORMAT                        
>       Atp = 0, AtpIndex = 1 (Temporary)                                 
>       Offset = 0, NullIndOffset = -1, VClenIndOffset = -1               
>       NullIndLength = 0, VClenIndLength = 0                             
>       ValueId = 36                                                      
>       Text = cast                                                       
>   PCode:
> PCode Expr Length: 88
>     [1]              
>     CONVVCPTR_MBIN64S_MATTR5_IBIN32S (330) 2 0 5 2 -1 8 512 8
>     ENCODE_MASCII_MBIN64S_IBIN32S (91) 4 0 2 0 0             
>     RETURN (264)                                             
> Expression: UpdateExpr is NULL
> Expression: MergeInsertExpr is NULL
> Expression: LowKeyExpr is NULL
> Expression: HighKeyExpr is NULL
> Expression: ReturnFetchExpr is NULL
> Expression: ReturnUpdateExpr is NULL
> Expression: ReturnMergeInsertExpr is NULL
> Expression: mergeUpdScanExpr is NULL
> Expression: mergeInsertRowIdExpr is NULL
> Expression: encodedKeyExpr is NULL
> Expression: keyColValExpr is NULL
> Expression: hbaseFilterExpr is NULL
> From: Sharma, Anoop 
> Sent: Tuesday, April 21, 2015 10:04 AM
> To: Bouaziz, Khaled; Subbiah, Suresh
> Subject: CIF question
> hi
>   Do we use CIF when selecting out of the traf scan operator?
> For example:
>   create table t (a int not null primary key, b varchar(1000), c varchar(1000));
> Would “select * from t” use CIF for the output row from the scan operator?
> Right now I see exploded format in HbaseAccess even if I set cqd 
> compressed_internal_format to ON.
> But if I do a join, then the hash join operator uses CIF while the scan 
> operator uses exploded.
> When is the conversion from exploded to aligned (CIF) format done?
> anoop
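
The space saving discussed in the issue can be illustrated with a rough sketch. It is a simplified model, not Trafodion code: it assumes a row's length is the varchar data offset plus either the declared maximum length (blank-padded) or the actual value length (CIF-style), and it ignores any alignment padding. The offset 32 and length 100 are read from the Convert Expr dump for the VARCHAR(100) column; all function and constant names are hypothetical.

```python
# Simplified model only. The constants come from the Convert Expr dump quoted
# in the issue; the row-length rule itself is an assumption for illustration.
VARCHAR_DATA_OFFSET = 32   # "Offset = 32" of the VARCHAR(100) result operand
DECLARED_MAX_LEN = 100     # "Length = 100" of the same operand


def blank_pad(value: str, max_len: int = DECLARED_MAX_LEN) -> str:
    """Pad a varchar value with blanks up to its declared maximum length."""
    return value.ljust(max_len)


def padded_row_len(value: str) -> int:
    """Assumed row length when the varchar is blank-padded to its maximum."""
    return VARCHAR_DATA_OFFSET + len(blank_pad(value))


def compact_row_len(value: str) -> int:
    """Assumed row length when only the bytes actually used are kept."""
    return VARCHAR_DATA_OFFSET + len(value)


# Column B values from the "select * from testcif" output in the issue.
ROWS = ["aaa", "bbbbb", "ccccccccccccc", "eeeeeeeeeeeeeeeeeeeeeeeeeeeeee"]


def summarize(rows):
    """Return (value length, padded row length, compact row length) per row."""
    return [(len(v), padded_row_len(v), compact_row_len(v)) for v in rows]


if __name__ == "__main__":
    for value_len, padded, compact in summarize(ROWS):
        print(f"value={value_len:3d}  padded={padded:3d}  compact={compact:3d}")
```

Under these assumptions the row holding 'aaa' shrinks from 132 bytes to 35. The real layout may differ, since the dump does not show how the exploded row is laid out.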



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
