Lorand Bendig created PIG-3662:
----------------------------------

             Summary: Static loadcaster in BinStorage can cause exception
                 Key: PIG-3662
                 URL: https://issues.apache.org/jira/browse/PIG-3662
             Project: Pig
          Issue Type: Bug
            Reporter: Lorand Bendig
            Assignee: Lorand Bendig
             Fix For: 0.13.0


I came across this issue when testing PIG-3642. Consider the following two 
testcases from {{TestEvalPipeline2}} executed in local mode:
{{testBinStorageByteCast}}:
{code}
A = load 'table_testBinStorageByteCast' as (a0, a1, a2);
store A into 'table_testBinStorageByteCast.temp' using BinStorage();
A = load 'table_testBinStorageByteCast.temp' using BinStorage() as (a0, a1, a2);
B = foreach A generate (long)a0;
dump B;
{code}

{{testBinStorageByteArrayCastsSimple}}:
{code}
A = load 'table_bs_ac';
store A into 'TestEvalPipeline2_BinStorageByteArrayCasts' using
  org.apache.pig.builtin.BinStorage();
B = load 'TestEvalPipeline2_BinStorageByteArrayCasts' using
  BinStorage('Utf8StorageConverter') as (name: int, age: int, gpa: float,
  lage: long, dgpa: double);
dump B;
{code}

The first testcase should fail (same example at: 
http://pig.apache.org/docs/r0.12.0/func.html#binstorage) while the second one 
should pass.
When I run *only* _testBinStorageByteArrayCastsSimple_ there's no problem, but 
when it runs *after* _testBinStorageByteCast_, it fails with the same exception 
as _testBinStorageByteCast_:
{{java.lang.Exception: org.apache.pig.backend.executionengine.ExecException: 
ERROR 1118: Cannot cast bytes loaded from BinStorage. Please provide a custom 
converter.}}

Reason:
When 'table_testBinStorageByteCast.temp' is loaded, casterString is not 
defined, so _BinStorage#getLoadCaster()_ sets the caster to 
_UnImplementedLoadCaster_. The exception is then thrown when 
_BinStorage#bytesToLong()_ is called during the cast of a0, which is the 
expected behavior. In the next testcase, 
_'TestEvalPipeline2_BinStorageByteArrayCasts'_ is loaded. We expect 
BinStorage to use _Utf8StorageConverter_ as its loadcaster, but it uses 
_UnImplementedLoadCaster_ instead, which results in the exception. The cause is 
that caster and casterString are *static* variables in BinStorage, yet they are 
set and accessed like instance variables. Therefore, when 
BinStorage('Utf8StorageConverter') gets instantiated, it sees the caster 
already initialized by the previous run. _BinStorage#getLoadCaster()_ just 
returns it instead of instantiating Utf8StorageConverter from the provided 
constructor parameter.
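
To illustrate the mechanism, here is a minimal, self-contained Java sketch. It does not use the actual Pig classes: Caster, UnimplementedCaster, Utf8Caster, BuggyStorage and FixedStorage are hypothetical stand-ins modeled only on the description above, and the attached patch may differ in detail.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo driver; all classes here are simplified stand-ins, not Pig source.
class StaticCasterDemo {
    public static void main(String[] args) {
        // First "testcase": no converter given, so the unimplemented caster is cached.
        new BuggyStorage().getLoadCaster();
        // Second "testcase": a converter is requested, but the stale cached caster is returned.
        Caster stale = new BuggyStorage("Utf8StorageConverter").getLoadCaster();
        System.out.println(stale.getClass().getSimpleName());

        // Same sequence with instance fields: each object resolves its own caster.
        new FixedStorage().getLoadCaster();
        Caster fresh = new FixedStorage("Utf8StorageConverter").getLoadCaster();
        System.out.println(fresh.getClass().getSimpleName());
        System.out.println(fresh.bytesToLong("42".getBytes(StandardCharsets.UTF_8)));
    }
}

interface Caster {
    long bytesToLong(byte[] bytes);
}

// Stand-in for a caster that refuses every cast.
class UnimplementedCaster implements Caster {
    public long bytesToLong(byte[] bytes) {
        throw new UnsupportedOperationException(
                "Cannot cast bytes. Please provide a custom converter.");
    }
}

// Stand-in for a converter that parses UTF-8 text.
class Utf8Caster implements Caster {
    public long bytesToLong(byte[] bytes) {
        return Long.parseLong(new String(bytes, StandardCharsets.UTF_8));
    }
}

// Models the reported problem: static fields written and read like instance fields.
class BuggyStorage {
    private static String casterString;
    private static Caster caster;

    BuggyStorage() {}

    BuggyStorage(String casterString) {
        BuggyStorage.casterString = casterString;
    }

    Caster getLoadCaster() {
        if (caster == null) { // only true for the first instance in the JVM
            caster = (casterString == null) ? new UnimplementedCaster() : new Utf8Caster();
        }
        return caster;
    }
}

// Models one possible fix (an assumption): make both fields per-instance.
class FixedStorage {
    private final String casterString;
    private Caster caster;

    FixedStorage() {
        this(null);
    }

    FixedStorage(String casterString) {
        this.casterString = casterString;
    }

    Caster getLoadCaster() {
        if (caster == null) {
            caster = (casterString == null) ? new UnimplementedCaster() : new Utf8Caster();
        }
        return caster;
    }
}
```

In this sketch the order of instantiation decides which caster BuggyStorage hands out, which matches the observation that the second testcase only fails when it runs after the first one in the same JVM.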

Are caster and casterString static only by accident? If so, I'd address this 
issue with the attached patch.






--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
