Sandhya Sundaresan created TRAFODION-3263:
---------------------------------------------

             Summary: Disable LOB locking and refactor order of LOB iid 
expression evaluation
                 Key: TRAFODION-3263
                 URL: https://issues.apache.org/jira/browse/TRAFODION-3263
             Project: Apache Trafodion
          Issue Type: Improvement
          Components: sql-general
    Affects Versions: 2.2.0
            Reporter: Sandhya Sundaresan
            Assignee: Sandhya Sundaresan


 The change to use JNI for HDFS writes improved the interface by returning 
more useful information to the caller. TRAFODION-2946 describes the need 
for LOB locking because of a condition where multiple threads writing to the 
same LOB column could interleave and cause problems. With the new JNI 
interface, an HDFS write now returns the offset where the data was written, 
so we can store this returned offset in the descriptor tables. Prior to 
this, while using the libhdfs API, we would not get back the "written offset".

 

So the order of operations before this change used to be:
 # Get the EOD for the LOB data file in HDFS.
 # Store this offset into the LOB descriptor tables so we know where to 
retrieve the data from during a read. 
 # Call hdfsWrite to write to the LOB data file, and hope that the offset where 
hdfsWrite writes is the same as the EOD calculated in 1.

HDFS being an "append only" file system, this is usually how it works. But if 
another process comes in and does an insert into the LOB column between 2 and 
3, then we have an incorrect offset stored in the descriptor tables. Hence we 
added a LOB lock to make steps 1, 2 and 3 atomic as part of TRAFODION-2946 to 
address this issue.
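
The interleaving described above can be illustrated with a small simulation. 
This is only a sketch: AppendOnlyFile and the descriptors dictionary are 
hypothetical in-memory stand-ins, not Trafodion or HDFS APIs, and the two 
writers are interleaved by hand rather than with real threads.

```python
class AppendOnlyFile:
    """Toy in-memory stand-in for an append-only LOB data file (not real HDFS)."""

    def __init__(self):
        self.data = bytearray()

    def end_of_data(self):
        # Old step 1: ask for the current EOD.
        return len(self.data)

    def write(self, chunk):
        # Models a write that appends but reports no offset back to the caller.
        self.data += chunk


lob_file = AppendOnlyFile()
descriptors = {}  # hypothetical descriptor table: writer -> (offset, length)

# Writer A performs steps 1 and 2.
descriptors["A"] = (lob_file.end_of_data(), 4)

# Writer B performs steps 1, 2 and 3 before A reaches step 3.
descriptors["B"] = (lob_file.end_of_data(), 4)
lob_file.write(b"bbbb")

# Writer A finally performs step 3; its data lands after B's.
lob_file.write(b"aaaa")

# Reading A's LOB through its descriptor returns the wrong bytes.
offset, length = descriptors["A"]
read_for_a = bytes(lob_file.data[offset:offset + length])
print(descriptors, read_for_a)
```

In this sketch both descriptors record offset 0, so a read of A's LOB returns 
B's bytes. That is the situation the LOB lock was added to prevent.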

The order of operations with this change is as follows:
 # Call the JNI HDFS write API to write the LOB data to HDFS. 
 # Use the data offset returned by the JNI HDFS write API in 1. as the offset 
to store in the LOB descriptor tables. 
 # If there are multiple chunks to write, do it in a loop and append to the 
first chunk. This way each chunk can be anywhere in HDFS and is not necessarily 
contiguous. But we are guaranteed that whatever we wrote will be recorded in 
our internal LOB descriptor files.
 # If any failure or TM error occurs while writing to the LOB descriptor 
tables, the transaction gets rolled back and the chunk of HDFS data written 
becomes "dead data". It doesn't harm the next operation. 
 # The GC check is now done before an update or insert. Earlier it was done as 
part of the ::allocateDesc operation to get the EOD of the file. 
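
The new order can be sketched in the same toy style. Everything here is an 
assumption made for illustration: AppendOnlyFile, insert_lob, read_lob and the 
fail_on_descriptor_write flag are hypothetical names, the file is an in-memory 
buffer rather than HDFS, and a transaction rollback is modeled simply by never 
publishing the descriptor entries.

```python
class AppendOnlyFile:
    """Toy in-memory stand-in for an append-only LOB data file (not real HDFS)."""

    def __init__(self):
        self.data = bytearray()

    def write(self, chunk):
        # Models the new write call: append and return the offset written at.
        offset = len(self.data)
        self.data += chunk
        return offset


def insert_lob(lob_file, descriptors, lob_id, chunks,
               fail_on_descriptor_write=False):
    """Write each chunk first, then record the offsets the writes returned."""
    entries = []
    for chunk in chunks:
        offset = lob_file.write(chunk)        # step 1: write the data
        entries.append((offset, len(chunk)))  # step 2: keep the returned offset
    if fail_on_descriptor_write:
        # Step 4: rollback. Nothing is published; the bytes stay as dead data.
        return False
    descriptors[lob_id] = entries
    return True


def read_lob(lob_file, descriptors, lob_id):
    """Reassemble a LOB from its recorded (offset, length) entries."""
    return b"".join(bytes(lob_file.data[o:o + n])
                    for o, n in descriptors[lob_id])


lob_file = AppendOnlyFile()
descriptors = {}  # hypothetical descriptor table: lob id -> [(offset, length)]

failed = insert_lob(lob_file, descriptors, "X", [b"dead"],
                    fail_on_descriptor_write=True)
ok = insert_lob(lob_file, descriptors, "A", [b"aa", b"AA"])
data_a = read_lob(lob_file, descriptors, "A")
print(failed, ok, descriptors, data_a)
```

Because the offsets come from the writes themselves, the failed insert leaves 
four unreferenced bytes in the file and the later insert is still read back 
correctly, without any lock around the sequence.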

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
