[ 
https://issues.apache.org/jira/browse/TRAFODION-3263?focusedWorklogId=271856&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-271856
 ]

ASF GitHub Bot logged work on TRAFODION-3263:
---------------------------------------------

                Author: ASF GitHub Bot
            Created on: 03/Jul/19 20:06
            Start Date: 03/Jul/19 20:06
    Worklog Time Spent: 10m 
      Work Description: Traf-Jenkins commented on issue #1845: [TRAFODION-3263] 
and other misc fixes for lob locking and refactoring 
URL: https://github.com/apache/trafodion/pull/1845#issuecomment-508237062
 
 
   Test Passed.  https://jenkins.esgyn.com/job/Check-PR-master/3239/
   
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 271856)
    Time Spent: 3h 20m  (was: 3h 10m)

> Disable LOB locking and refactor order of  LOB iud expression evaluation
> ------------------------------------------------------------------------
>
>                 Key: TRAFODION-3263
>                 URL: https://issues.apache.org/jira/browse/TRAFODION-3263
>             Project: Apache Trafodion
>          Issue Type: Improvement
>          Components: sql-general
>    Affects Versions: 2.2.0
>            Reporter: Sandhya Sundaresan
>            Assignee: Sandhya Sundaresan
>            Priority: Major
>          Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> The change to use JNI for HDFS writes improved the interface by returning
> more useful information to the caller. In TRAFODION-2946, we described the
> need for LOB locking because of a condition where multiple threads writing to
> the same LOB column could interleave and cause problems. With the new JNI
> interface, an HDFS write now returns the offset where the data was
> written, so we can store this returned offset in the descriptor tables.
> Prior to this, while using the libhdfs API, we would not get back the "written
> offset".
>  
> So the order of operations before this change used to be:
>  # Get the EOD for the LOB data file in HDFS.
>  # Store this offset into the LOB descriptor tables so we know where to
> retrieve the data from during a read.
>  # Call hdfsWrite to write to the LOB data file, and hope that the offset
> where hdfsWrite writes is the same as the EOD calculated in 1. HDFS being
> an "append only" file system, this is usually how it works. But if another
> process comes in and does an insert into the LOB column between 2 and 3, then
> we have an incorrect offset stored in the descriptor tables. Hence we added a
> LOB lock to make steps 1, 2 and 3 atomic as part of TRAFODION-2946 to address
> this issue.
> The order of operations with this change is as follows:
>  # Call the JNI HDFS write API to write the LOB data to HDFS.
>  # Use the data offset returned by the JNI HDFS write API in 1 as the offset
> to store in the LOB descriptor tables.
>  # If there are multiple chunks to write, do it in a loop and append to the
> first chunk. This way each chunk can be anywhere in HDFS and not necessarily
> contiguous. But we are guaranteed that whatever we wrote will be recorded in
> our internal LOB descriptor files.
>  # If any failure or TM error occurs while writing to the LOB descriptor
> tables, the transaction gets rolled back and the chunk of HDFS data written
> becomes "dead data". It doesn't harm the next operation.
>  # The GC check is now done before an update or insert. Earlier it was done as
> part of the ::allocateDesc operation to get the EOD of the file.
>  
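
The two orderings described above can be contrasted with a small in-memory
sketch. This is only an illustration under stated assumptions, not Trafodion
code: AppendOnlyFile, insert_old_order, insert_new_order, read_lob and the
"interloper" callback are all hypothetical names, a Python dict stands in for
the LOB descriptor tables, and a bytearray stands in for the HDFS data file.
No real HDFS, JNI or libhdfs call is used.

```python
class AppendOnlyFile:
    """In-memory stand-in (assumption) for an append-only LOB data file."""

    def __init__(self):
        self._data = bytearray()

    def end_of_data(self):
        # Current end of the file, as the old code would query it.
        return len(self._data)

    def append(self, chunk):
        # Append the chunk and return the offset where it actually landed,
        # mimicking a write call that reports the "written offset".
        offset = len(self._data)
        self._data += chunk
        return offset

    def read(self, offset, size):
        return bytes(self._data[offset:offset + size])


def insert_old_order(lob_file, descriptors, key, chunk, interloper=None):
    # Old order: 1. read the EOD, 2. record it, 3. write and hope.
    offset = lob_file.end_of_data()
    descriptors[key] = [(offset, len(chunk))]
    if interloper is not None:
        interloper()  # another writer appends before our write happens
    lob_file.append(chunk)


def insert_new_order(lob_file, descriptors, key, chunks, interloper=None):
    # New order: write each chunk first, then record the returned offset.
    entries = []
    for chunk in chunks:
        offset = lob_file.append(chunk)
        entries.append((offset, len(chunk)))
        if interloper is not None:
            interloper()  # interleaved writers no longer matter
    # In the real system this step would be transactional; if it failed,
    # the bytes already appended would simply be unreferenced dead data.
    descriptors[key] = entries


def read_lob(lob_file, descriptors, key):
    # Reassemble the LOB from its (offset, size) descriptor entries.
    return b"".join(lob_file.read(off, size) for off, size in descriptors[key])


old_file, old_desc = AppendOnlyFile(), {}
insert_old_order(old_file, old_desc, "lob1", b"hello",
                 interloper=lambda: old_file.append(b"XXX"))
print(read_lob(old_file, old_desc, "lob1"))  # prints b'XXXhe' (wrong offset)

new_file, new_desc = AppendOnlyFile(), {}
insert_new_order(new_file, new_desc, "lob1", [b"hel", b"lo"],
                 interloper=lambda: new_file.append(b"XXX"))
print(read_lob(new_file, new_desc, "lob1"))  # prints b'hello'
```

In the old ordering the recorded offset goes stale as soon as another writer
appends, which is the window the LOB lock was protecting. In the new ordering
each chunk's descriptor entry comes from the write itself, so the chunks need
not be contiguous and no lock is needed around the sequence.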



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
